What is Malware Analysis?
You download what looks like a normal PDF. You open it. Nothing happens — or so you think. Behind the scenes, a process just spawned, a connection just opened to a server in Eastern Europe, and your keystrokes are now being logged.
How do you figure out what just happened? How do you take that suspicious file, rip it apart, and understand exactly what it does — without getting infected yourself?
That’s malware analysis. And it’s one of the most fascinating disciplines in cybersecurity.
What is Malware Analysis?
Malware analysis is the process of examining malicious software to understand:
- How it works — What does the code actually do?
- How it spreads — Does it exploit a vulnerability? Social engineering? USB drives?
- What it targets — Files? Credentials? Network resources?
- What damage it causes — Data theft? Encryption? Persistence?
- How to detect and prevent it — What signatures, indicators, or behaviors can we look for?
The end goal is to extract Indicators of Compromise (IoCs): artifacts such as file hashes, IP addresses, domain names, registry keys, or mutex names that point to the malware or its infrastructure. These IoCs feed into antivirus signatures, firewall rules, intrusion detection systems, and threat intelligence platforms.
Think of it like forensic science, but for software. A detective examines a crime scene to understand what happened, who did it, and how to catch them. A malware analyst does the same thing — but the crime scene is a binary file, and the evidence is hidden in assembly code, network traffic, and system calls.
Types of Malware — Know Your Enemy
Before we analyze malware, let’s understand what we’re dealing with. Malware comes in many flavors, each with different objectives and behaviors.
| Type | What It Does | Example |
|---|---|---|
| Virus | Attaches itself to legitimate programs and spreads when those programs run | CIH (Chernobyl virus) |
| Worm | Self-replicating — spreads across networks without user interaction | WannaCry, Conficker |
| Trojan | Disguises itself as legitimate software to trick users into running it | Zeus, Emotet |
| Ransomware | Encrypts your files and demands payment for the decryption key | WannaCry, LockBit, Ryuk |
| Spyware | Silently monitors user activity — keystrokes, screenshots, browsing history | Pegasus, FinFisher |
| Rootkit | Hides deep in the OS to maintain persistent, undetectable access | Sony BMG rootkit, ZeroAccess |
| RAT (Remote Access Trojan) | Gives the attacker full remote control over the infected machine | DarkComet, njRAT |
| Botnet Agent | Turns the infected machine into a “zombie” controlled by a command server | Mirai, Emotet (also a botnet) |
Most modern malware doesn’t fit neatly into one category. WannaCry, for instance, was a worm (self-propagating via the EternalBlue exploit) that delivered ransomware (it encrypted files and demanded Bitcoin). Emotet started as a banking trojan, evolved into a botnet, and became a delivery platform for other malware. The lines are blurry.
The Three Pillars of Malware Analysis
There are three main approaches to analyzing malware, each with different depth, difficulty, and trade-offs.
1. Static Analysis — Look, Don’t Touch
Static analysis means examining the malware without executing it. You’re looking at the file itself — its structure, metadata, strings, imports, and code — but never actually running it.
This is the safest approach. Since the malware never executes, there’s no risk of infection. It’s also the fastest way to get a first impression.
What you do in static analysis:
- File identification — What type of file is it? PE executable? ELF binary? Office document with macros?
- Hashing — Generate SHA256/MD5 hashes to fingerprint the sample and check if it’s already known
- String extraction — Pull readable text from the binary. You’d be surprised how often malware contains URLs, IP addresses, error messages, or even the author’s username in plain text
- PE header analysis — For Windows executables, examine the imports (what Windows APIs does it call?), exports, sections, and timestamps
- Packer detection — Is the binary packed or obfuscated? Packers like UPX compress the binary to hide its true contents
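To make the PE header step concrete, here is a minimal sketch that reads a few header fields with nothing but the Python standard library. The function name parse_coff_header and the synthetic header bytes are made up for illustration; in practice a library such as pefile does this far more thoroughly, and the compile timestamp is easy for an author to forge.

```python
import struct
from datetime import datetime, timezone


def parse_coff_header(data: bytes) -> dict:
    """Read machine type, section count, and compile time from a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The COFF header follows the signature: Machine, NumberOfSections, TimeDateStamp
    machine, num_sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return {
        "machine": hex(machine),
        "sections": num_sections,
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }


# Synthetic header built for the demo, not taken from a real sample
fake = bytearray(0x98)
fake[0:2] = b"MZ"
struct.pack_into("<I", fake, 0x3C, 0x80)
fake[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<HHI", fake, 0x84, 0x14C, 3, 1_600_000_000)  # 0x14C = 32-bit x86
print(parse_coff_header(bytes(fake)))
```

On a real sample you would pass in the first few kilobytes of the file instead of the synthetic buffer.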
Example — extracting strings from a suspicious binary:
$ strings suspicious.exe | head -20
!This program cannot be run in DOS mode.
.text
.rdata
.data
kernel32.dll
CreateFileA
WriteFile
WinExec
http://evil-c2-server.com/beacon
cmd.exe /c whoami
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run
Look at that. Without running a single line of code, we already know this binary:
- Calls CreateFileA and WriteFile (file system operations)
- Uses WinExec (executing commands)
- Reaches out to http://evil-c2-server.com/beacon (command & control server)
- Runs cmd.exe /c whoami (reconnaissance)
- Writes to the Run registry key (persistence: it wants to survive reboots)
That’s a lot of intelligence from just strings. Of course, sophisticated malware rarely leaves its strings in plaintext; the authors encrypt or obfuscate them. Still, plenty of samples are sloppy.
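If you want to script this step, the idea behind strings is simple enough to sketch in Python: pull out runs of printable characters, then grep them for network indicators. The helper names and the sample bytes below are invented for the example, and the regular expressions are deliberately loose (the IPv4 pattern does not validate octet ranges).

```python
import re

# Runs of at least 4 printable ASCII characters, similar to the default of `strings`
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_strings(data: bytes) -> list[str]:
    """Return every printable ASCII run found in the raw bytes."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(data)]


def find_network_indicators(strings: list[str]) -> dict[str, list[str]]:
    """Collect anything that looks like a URL or an IPv4 address."""
    urls, ips = [], []
    for s in strings:
        urls.extend(URL_PATTERN.findall(s))
        ips.extend(IPV4_PATTERN.findall(s))
    return {"urls": urls, "ips": ips}


# Made-up bytes standing in for a slice of a binary
sample = (
    b"\x00\x01MZ\x90\x00kernel32.dll\x00WinExec\x00\xff\xfe"
    b"http://evil-c2-server.com/beacon\x00\x03cmd.exe /c whoami\x00"
)
found = extract_strings(sample)
print(found)
print(find_network_indicators(found))
```

This only handles ASCII. Windows binaries often store text as UTF-16, which the real tool covers with its encoding option.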
Example — checking a hash on VirusTotal:
$ sha256sum suspicious.exe
a1b2c3d4e5f6... suspicious.exe
Take that hash, search for it on VirusTotal, and you’ll see at a glance how many of its 70+ antivirus engines already flag the file. If it’s a known sample, someone has probably already written a full report on it.
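When you are processing many samples, it helps to compute the fingerprints in a script. The following is a small sketch using hashlib from the standard library; hash_file is a name chosen for this example, and the demo runs on a harmless temporary file.

```python
import hashlib
import os
import tempfile


def hash_file(path: str, chunk_size: int = 65536) -> dict[str, str]:
    """Compute MD5 and SHA-256 in one pass without loading the whole file."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}


# Demo on a harmless temporary file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"not actually malware")
    demo_path = tmp.name

digests = hash_file(demo_path)
os.remove(demo_path)
print(digests)
```

Reading in chunks keeps memory use flat, which matters once samples run to hundreds of megabytes.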
2. Dynamic Analysis — Let It Run (Safely)
Dynamic analysis means executing the malware in a controlled environment and observing what it does. Instead of reading the code, you watch its behavior — what files it creates, what network connections it makes, what registry keys it modifies.
This is more dangerous than static analysis because the malware is actually running. That’s why you always do this inside a sandbox — an isolated virtual machine that you can snapshot before execution and revert after.
What you observe in dynamic analysis:
- Process activity — What processes does it spawn? Does it inject into other processes?
- File system changes — Does it create, modify, or delete files?
- Registry changes — Does it add persistence keys? Modify security settings?
- Network traffic — What domains does it contact? What data does it send?
- API calls — What system calls does the malware make at runtime?
Setting up a safe analysis environment:
The golden rule — never analyze malware on your host machine. Here’s a typical setup:
- Create a VM — Use VirtualBox or VMware. Windows 10 is the most common target OS
- Take a clean snapshot — Before you introduce any malware
- Disable networking (or route through an isolated network) — You don’t want the malware calling home from your network
- Install analysis tools inside the VM — Process Monitor, Wireshark, Regshot, etc.
- Copy the sample into the VM and execute it
- Observe everything — Capture network traffic, file changes, registry modifications
- Revert to snapshot when done — The VM is clean again, as if nothing happened
Example — watching malware behavior with Process Monitor:
After detonating the sample in your sandbox, Process Monitor (Procmon) shows you every file operation, registry access, and network call in real time:
Operation    Path                                                          Result
CreateFile   C:\Users\Admin\AppData\Local\Temp\svchost.exe                 SUCCESS
RegSetValue  HKCU\Software\Microsoft\Windows\CurrentVersion\Run\Update     SUCCESS
TCP Connect  185.141.27.98:443                                             SUCCESS
CreateFile   C:\Users\Admin\Documents\*.docx                               SUCCESS
WriteFile    C:\Users\Admin\Documents\README_DECRYPT.txt                   SUCCESS
From this we can see the malware:
- Drops a copy of itself as svchost.exe in the Temp folder (disguised as a system process)
- Adds itself to the Run key (persistence)
- Connects to 185.141.27.98 on port 443 (C2 communication)
- Accesses .docx files (likely encrypting them)
- Drops a ransom note (README_DECRYPT.txt)
That’s ransomware behavior — and we figured it out just by watching.
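Once you have exported events like these, the same reasoning can be turned into a first-pass triage script. The sketch below is only an illustration: the Event class and the RULES table are hypothetical, the matching is plain substring comparison, and real Procmon exports carry many more columns.

```python
from dataclasses import dataclass


@dataclass
class Event:
    operation: str
    path: str


# Hypothetical triage rules: (label, operation, substring expected in the path)
RULES = [
    ("persistence", "RegSetValue", r"\CurrentVersion\Run"),
    ("dropped executable", "CreateFile", r"\AppData\Local\Temp"),
    ("possible ransom note", "WriteFile", "DECRYPT"),
    ("network connection", "TCP Connect", ""),
]


def triage(events: list[Event]) -> list[tuple[str, str]]:
    """Label each event that matches a rule; unmatched events are skipped."""
    findings = []
    for ev in events:
        for label, operation, needle in RULES:
            if ev.operation == operation and needle.lower() in ev.path.lower():
                findings.append((label, ev.path))
    return findings


# The events from the Procmon listing above
events = [
    Event("CreateFile", r"C:\Users\Admin\AppData\Local\Temp\svchost.exe"),
    Event("RegSetValue", r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run\Update"),
    Event("TCP Connect", "185.141.27.98:443"),
    Event("CreateFile", r"C:\Users\Admin\Documents\*.docx"),
    Event("WriteFile", r"C:\Users\Admin\Documents\README_DECRYPT.txt"),
]
for label, path in triage(events):
    print(f"{label:22} {path}")
```

Rules this crude will produce false positives on a busy system, so treat the output as a list of places to look, not a verdict.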
Automated sandboxes:
If you don’t want to set up everything manually, automated sandboxes do the heavy lifting for you:
- Cuckoo Sandbox — Open-source, self-hosted. Submit a file, and it runs it in a VM, captures everything, and generates a full report
- Any.Run — Interactive online sandbox. You can watch the malware execute in real time in your browser
- Hybrid Analysis — Free online sandbox by CrowdStrike. Upload a sample and get a detailed behavior report
- Joe Sandbox — Commercial-grade, incredibly detailed analysis
3. Reverse Engineering — Reading the Blueprint
Reverse engineering is the deepest level of analysis. You take the compiled binary, disassemble it back into assembly code (or decompile it into pseudo-C), and read through the logic line by line.
This is where you go from “this malware contacts a C2 server” to “this malware uses a custom XOR-based encryption algorithm with key 0x4F to encode its C2 communications, and it falls back to a DGA (Domain Generation Algorithm) if the primary C2 is unreachable.”
This is the hard part. It requires:
- Understanding of x86/x64 assembly language
- Knowledge of OS internals (Windows API, PE format, memory management)
- Patience — a lot of patience
- Familiarity with disassemblers and debuggers
Key tools for reverse engineering:
| Tool | Type | Notes |
|---|---|---|
| Ghidra | Disassembler/Decompiler | Free, open-source (by NSA). Excellent decompiler |
| IDA Pro | Disassembler/Decompiler | Industry standard. Expensive, but the best |
| x64dbg | Debugger | Free, great for dynamic debugging on Windows |
| Radare2/Cutter | Disassembler/Debugger | Free, open-source. Steep learning curve |
| Binary Ninja | Disassembler | Clean UI, good API for scripting |
Example — decrypting hardcoded strings with Ghidra:
You load the malware into Ghidra, and the decompiler shows you something like this:
void decrypt_string(char *encrypted, int len) {
    for (int i = 0; i < len; i++) {
        encrypted[i] = encrypted[i] ^ 0x4F;
    }
}
void contact_c2(void) {
    // The first eight bytes are "evil-c2-" XORed with 0x4F
    char c2_domain[] = {0x2A, 0x39, 0x26, 0x23, 0x62, 0x2C, 0x7D, 0x62, ...};
    decrypt_string(c2_domain, sizeof(c2_domain));
    // c2_domain is now "evil-c2-server.com"
    connect_to_server(c2_domain);
}
The malware stored its C2 domain as XOR-encrypted bytes — strings wouldn’t have caught this. But by reverse engineering the decryption function, we recovered the actual domain. That’s an IoC we can now feed into our detection systems.
This is why basic static analysis alone isn’t always enough. Sophisticated malware obfuscates everything: strings, API calls, control flow. Reverse engineering peels back those layers.
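Once you have identified a single-byte XOR routine like the one in the decompiled listing, replaying it outside the malware takes a few lines of Python. The sketch below also shows a brute-force variant for when you have not found the key yet: try all 256 values and keep the outputs that contain a known fragment (a "crib"). The function names are invented for this example.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key; the same call encodes and decodes."""
    return bytes(b ^ key for b in data)


def brute_force_xor(data: bytes, crib: bytes) -> list[tuple[int, bytes]]:
    """Try all 256 keys and keep the ones whose output contains the crib."""
    hits = []
    for key in range(256):
        candidate = xor_bytes(data, key)
        if crib in candidate:
            hits.append((key, candidate))
    return hits


# Encode the example domain the way the sample would store it
encoded = xor_bytes(b"evil-c2-server.com", 0x4F)
print(encoded.hex(" "))

# Decode with the known key, then pretend the key is unknown and search for it
print(xor_bytes(encoded, 0x4F))
print(brute_force_xor(encoded, b".com"))
```

Brute force works here only because the key space is tiny. Multi-byte keys or real ciphers need the reverse-engineered routine itself.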
A Complete Analysis Workflow
Let’s put it all together. Here’s a realistic workflow for analyzing a suspicious sample from start to finish.
Step 1: Sample Collection and Safety
You receive a suspicious file — maybe from a honeypot, an email attachment, or an incident response engagement.
# First, hash it for identification
$ sha256sum sample.exe
e3b0c44298fc1c149afbf4c8996fb924...
# Check the hash on VirusTotal
# If it's already known, read existing reports first
# Identify the file type
$ file sample.exe
PE32 executable (GUI) Intel 80386, for MS Windows
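The file command works largely by checking magic bytes at the start of the file, and you can do a rough version of that check yourself. This sketch covers only a handful of well-known signatures; the table and the identify function are illustrative, not a replacement for the real tool.

```python
# A few well-known file signatures (magic bytes) and what they usually mean
MAGIC_SIGNATURES = [
    (b"MZ", "DOS/PE executable"),
    (b"\x7fELF", "ELF binary"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP container (also .docx, .xlsx, .jar)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (legacy Office)"),
]


def identify(header: bytes) -> str:
    """Match the first bytes of a file against the signature table."""
    for magic, description in MAGIC_SIGNATURES:
        if header.startswith(magic):
            return description
    return "unknown"


def identify_file(path: str) -> str:
    """Read just enough of the file to compare against every signature."""
    with open(path, "rb") as fh:
        return identify(fh.read(16))


print(identify(b"MZ\x90\x00\x03\x00"))
print(identify(b"%PDF-1.7"))
```

Checking content instead of the extension is the point: a file named invoice.pdf that starts with MZ is an executable.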
Step 2: Static Analysis
# Extract strings
$ strings -n 8 sample.exe > strings_output.txt
# Check for packers
$ diec sample.exe # Detect It Easy (console version)
# Result: UPX 3.96 detected
# Unpack if needed
$ upx -d sample.exe -o sample_unpacked.exe
# Analyze PE headers with pefile (Python)
$ python3 -c "
import pefile
pe = pefile.PE('sample_unpacked.exe')
for entry in pe.DIRECTORY_ENTRY_IMPORT:
    print(entry.dll.decode())
    for func in entry.imports:
        print(f'  {func.name.decode() if func.name else hex(func.ordinal)}')
"
Suspicious imports to watch for:
| API Call | Why It’s Suspicious |
|---|---|
| VirtualAlloc / VirtualProtect | Allocating executable memory (shellcode injection) |
| CreateRemoteThread | Injecting code into another process |
| WriteProcessMemory | Writing into another process’s memory |
| URLDownloadToFile | Downloading files from the internet |
| RegSetValueEx | Modifying the registry (persistence) |
| CryptEncrypt | Encrypting data (ransomware indicator) |
| IsDebuggerPresent | Anti-analysis: checking if it’s being debugged |
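A watch list like this is easy to apply automatically once you have the import names in a list. Here is a minimal sketch that assumes you already extracted them (for example with pefile); SUSPICIOUS_APIS and flag_imports are names made up for the example. Many Windows APIs come in A and W variants, so the lookup strips that suffix when the bare name is on the list.

```python
# Watch list taken from the table above
SUSPICIOUS_APIS = {
    "VirtualAlloc": "allocating executable memory",
    "VirtualProtect": "allocating executable memory",
    "CreateRemoteThread": "injecting code into another process",
    "WriteProcessMemory": "writing into another process's memory",
    "URLDownloadToFile": "downloading files from the internet",
    "RegSetValueEx": "modifying the registry (persistence)",
    "CryptEncrypt": "encrypting data",
    "IsDebuggerPresent": "anti-analysis check",
}


def base_name(name: str) -> str:
    """Map ANSI/Unicode variants such as RegSetValueExW to the bare API name."""
    if name not in SUSPICIOUS_APIS and name[-1:] in ("A", "W") and name[:-1] in SUSPICIOUS_APIS:
        return name[:-1]
    return name


def flag_imports(imports: list[str]) -> dict[str, str]:
    """Return the imports that appear on the watch list, with the reason."""
    hits = {}
    for name in imports:
        base = base_name(name)
        if base in SUSPICIOUS_APIS:
            hits[name] = SUSPICIOUS_APIS[base]
    return hits


# Hypothetical import list for demonstration
imports = ["CreateFileA", "URLDownloadToFileW", "IsDebuggerPresent", "WriteProcessMemory"]
for api, reason in flag_imports(imports).items():
    print(f"{api}: {reason}")
```

Keep in mind that plenty of legitimate software imports these functions too. A hit tells you where to look next, nothing more.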
Step 3: Dynamic Analysis
Fire up your sandbox VM, take a snapshot, and detonate the sample.
Run these tools simultaneously:
- Process Monitor — Capture all file, registry, and network operations
- Wireshark — Capture all network traffic
- Regshot — Take a registry snapshot before and after execution to see what changed
- Process Hacker — Monitor running processes and their memory
Execute the sample and let it run for a few minutes. Some malware has sleep timers or checks for user activity before activating.
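The before-and-after comparison that a tool like Regshot performs boils down to diffing two snapshots. As a rough sketch, assume each snapshot has been loaded into a dictionary mapping a registry value path to its data. That dictionary shape, the diff_snapshots helper, and the OneDrive entry are all assumptions for illustration, not Regshot’s actual format.

```python
def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, dict]:
    """Compare two {value path: data} snapshots and report what changed."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


RUN_KEY = r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run"

# Hypothetical snapshot taken before detonation
before = {RUN_KEY + r"\OneDrive": r"C:\Program Files\OneDrive\OneDrive.exe"}

# Snapshot after detonation: the sample added its own Run entry
after = dict(before)
after[RUN_KEY + r"\WindowsUpdate"] = r"C:\Users\Admin\AppData\Local\Temp\updater.exe"

result = diff_snapshots(before, after)
for path, data in result["added"].items():
    print(f"ADDED   {path} = {data}")
```

The same function works for file listings: snapshot paths and hashes before and after, then diff.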
Step 4: Analyze the Results
After execution, you review:
=== Network Connections ===
DNS query: update-service.malwaredomain.com → 185.141.27.98
HTTPS connection to 185.141.27.98:443
POST /gate.php with encrypted body (likely exfiltrated data)
=== File System Changes ===
Created: C:\Users\Admin\AppData\Local\Temp\updater.exe (copy of itself)
Created: C:\Users\Admin\AppData\Roaming\config.dat (encrypted config)
Modified: Multiple .docx and .xlsx files in Documents folder
=== Registry Changes ===
Added: HKCU\Software\Microsoft\Windows\CurrentVersion\Run\WindowsUpdate
Value: C:\Users\Admin\AppData\Local\Temp\updater.exe
=== Process Activity ===
sample.exe → spawned cmd.exe → executed "whoami" and "ipconfig"
sample.exe → injected into explorer.exe via CreateRemoteThread
Step 5: Extract IoCs
From all of this analysis, we compile our Indicators of Compromise:
Indicators of Compromise:
File Hashes (placeholders: these are the digests of empty input, so substitute the values from your sample):
- SHA256: e3b0c44298fc1c149afbf4c8996fb924...
- MD5: d41d8cd98f00b204e9800998ecf8427e
Network:
- Domain: update-service.malwaredomain.com
- IP: 185.141.27.98
- URL: https://185.141.27.98/gate.php
- User-Agent: "Mozilla/5.0 (compatible; MSIE 10.0)"
Host:
- File: C:\Users\*\AppData\Local\Temp\updater.exe
- Registry: HKCU\Software\Microsoft\Windows\CurrentVersion\Run\WindowsUpdate
- Mutex: Global\XYZ_MUTEX_001
Behavior:
- Enumerates documents (.docx, .xlsx, .pdf)
- Process injection via CreateRemoteThread into explorer.exe
- Exfiltrates data via HTTPS POST to /gate.php
These IoCs are now actionable. They can be shared with your security team, fed into your SIEM, used to write firewall rules, or published in a threat intelligence report.
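One practical habit when sharing network IoCs in reports, chats, or emails is to "defang" them so nobody clicks a live malicious link by accident. The sketch below uses the common hxxp and [.] conventions; the helper names are made up, and it simply rewrites every dot, which is cruder than what dedicated tools do.

```python
import re


def defang(indicator: str) -> str:
    """Make a URL, domain, or IP non-clickable for safe sharing."""
    out = re.sub(r"^http", "hxxp", indicator, flags=re.IGNORECASE)
    return out.replace(".", "[.]")


def refang(indicator: str) -> str:
    """Reverse defang() before feeding the value into a detection tool."""
    out = indicator.replace("[.]", ".")
    return re.sub(r"^hxxp", "http", out, flags=re.IGNORECASE)


# Network IoCs from the list above
network_iocs = [
    "update-service.malwaredomain.com",
    "185.141.27.98",
    "https://185.141.27.98/gate.php",
]
for ioc in network_iocs:
    print(defang(ioc))
```

Remember to refang before loading the values into a firewall or SIEM, which expect the real strings.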
Anti-Analysis Techniques — Malware Fights Back
Here’s where it gets interesting. Malware authors know their creations will be analyzed, so they build in defenses. Some common anti-analysis techniques:
VM Detection
The malware checks if it’s running inside a virtual machine. If yes, it assumes it’s being analyzed and either does nothing or self-destructs.
// Check for VMware registry key
RegOpenKeyEx(HKLM, "SOFTWARE\\VMware, Inc.\\VMware Tools", ...);
// Check for VirtualBox artifacts
if (GetModuleHandle("VBoxHook.dll") != NULL) {
    ExitProcess(0); // We're being analyzed, bail out
}
// Check CPU instruction timing
// CPUID is normally intercepted by the hypervisor, so it takes measurably longer inside a VM
Counter: Use bare-metal analysis machines for sophisticated samples, or patch VM artifacts to make the environment look more like a real machine.
Anti-Debugging
The malware detects if a debugger is attached and changes its behavior.
if (IsDebuggerPresent()) {
    // Run benign code instead of malicious payload
    MessageBox(NULL, "Hello World!", "Legit App", MB_OK);
    return;
}
Counter: Patch IsDebuggerPresent to always return false, or use kernel-level debuggers that are harder to detect.
Packing and Obfuscation
The malware’s real code is compressed or encrypted. At runtime, it unpacks itself into memory. Static analysis sees only the packer stub, not the actual malicious code.
Counter: Let the malware unpack itself in a debugger, then dump the unpacked code from memory.
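Before reaching for a debugger, a quick way to guess whether a file is packed at all is to measure its byte entropy: compressed or encrypted data looks close to random, while ordinary code and text do not. The sketch below computes Shannon entropy in bits per byte. The 7.2 cutoff is an assumed rule of thumb for this example, not a standard, and it is usually applied per section instead of to the whole file.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Assumed cutoff for this sketch; tune it against samples you know
PACKED_THRESHOLD = 7.2


def looks_packed(data: bytes) -> bool:
    """Flag data whose entropy is close to that of random bytes."""
    return shannon_entropy(data) > PACKED_THRESHOLD


plain = b"This program cannot be run in DOS mode. " * 200
random_like = os.urandom(8192)  # stands in for a compressed or encrypted section
print(f"plain text : {shannon_entropy(plain):.2f} bits/byte")
print(f"random data: {shannon_entropy(random_like):.2f} bits/byte")
```

High entropy is a hint, not proof. Installers, embedded images, and legitimately compressed resources score high as well.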
Time Bombs
The malware waits — sometimes minutes, sometimes days — before activating. If a sandbox only runs for 60 seconds, it’ll see nothing malicious.
Counter: Fast-forward the system clock in the sandbox, or use sandboxes with extended execution times.
Essential Tools — Your Analysis Toolkit
Here’s a summary of the tools we’ve mentioned, organized by purpose.
| Category | Tool | Platform | Cost |
|---|---|---|---|
| Static Analysis | strings, file, xxd | Linux/Mac | Free |
| Static Analysis | Detect It Easy (DIE) | Windows/Linux | Free |
| Static Analysis | pefile (Python library) | Cross-platform | Free |
| Static Analysis | PEStudio | Windows | Free |
| Dynamic Analysis | Process Monitor (Procmon) | Windows | Free |
| Dynamic Analysis | Process Hacker | Windows | Free |
| Dynamic Analysis | Regshot | Windows | Free |
| Dynamic Analysis | Wireshark | Cross-platform | Free |
| Sandboxes | Cuckoo Sandbox | Self-hosted | Free |
| Sandboxes | Any.Run | Online | Free tier |
| Sandboxes | Hybrid Analysis | Online | Free |
| Reverse Engineering | Ghidra | Cross-platform | Free |
| Reverse Engineering | IDA Pro / IDA Free | Cross-platform | Paid / Free |
| Reverse Engineering | x64dbg | Windows | Free |
| Reverse Engineering | Radare2 / Cutter | Cross-platform | Free |
| Online Analysis | VirusTotal | Web | Free |
| Online Analysis | MalwareBazaar | Web | Free |
| Online Analysis | URLhaus | Web | Free |
You don’t need all of these to get started. A solid beginner setup would be: a Windows VM + Process Monitor + Wireshark + Ghidra + VirusTotal. That’s enough to perform basic static and dynamic analysis on most samples.
Getting Started — Where to Practice
You can’t learn malware analysis without samples to analyze. Here are some safe, legal resources:
- MalwareBazaar (bazaar.abuse.ch) — A repository of malware samples shared by the security community. Free to download (you’ll need an account)
- theZoo (github.com/ytisf/theZoo) — A curated collection of malware samples for research purposes
- Any.Run — Watch other people’s analysis sessions to learn what to look for
- Practical Malware Analysis (book by Sikorski & Honig) — The gold standard for learning. Comes with lab exercises and sample files
- Malware Unicorn’s workshops — Free, hands-on reverse engineering workshops
Start with static analysis — it’s the safest and gives you quick wins. Once you’re comfortable, move to dynamic analysis in a VM. And when you’re ready for the deep end, pick up Ghidra and start reverse engineering.
Final Thoughts
Malware analysis is one of those fields that sits at the intersection of programming, security, and detective work. Every sample is a puzzle — the author is trying to hide what it does, and your job is to figure it out anyway.
What I find most fascinating is that it forces you to understand how computers actually work — at the OS level, at the network level, at the assembly level. You can’t analyze a rootkit without understanding how the kernel manages processes. You can’t spot a C2 beacon without understanding how DNS and HTTP work. You can’t reverse engineer an encryption routine without understanding the math behind it.
If you’re interested in cybersecurity and want to understand the offensive side from a defensive perspective — malware analysis is an incredible place to start.
Stay curious, and happy hunting!