What is Malware Analysis?
Thilan Dissanayaka Malware February 29, 2020

You download what looks like a normal PDF. You open it. Nothing happens — or so you think. Behind the scenes, a process just spawned, a connection just opened to a server in Eastern Europe, and your keystrokes are now being logged.

How do you figure out what just happened? How do you take that suspicious file, rip it apart, and understand exactly what it does — without getting infected yourself?

That’s malware analysis. And it’s one of the most fascinating disciplines in cybersecurity.

What is Malware Analysis?

Malware analysis is the process of examining malicious software to understand:

  • How it works — What does the code actually do?
  • How it spreads — Does it exploit a vulnerability? Social engineering? USB drives?
  • What it targets — Files? Credentials? Network resources?
  • What damage it causes — Data theft? Encryption? Persistence?
  • How to detect and prevent it — What signatures, indicators, or behaviors can we look for?

The end goal is to extract Indicators of Compromise (IoCs) — things like file hashes, IP addresses, domain names, registry keys, or mutex names that uniquely identify the malware. These IoCs feed into antivirus signatures, firewall rules, intrusion detection systems, and threat intelligence platforms.

Think of it like forensic science, but for software. A detective examines a crime scene to understand what happened, who did it, and how to catch them. A malware analyst does the same thing — but the crime scene is a binary file, and the evidence is hidden in assembly code, network traffic, and system calls.

Types of Malware — Know Your Enemy

Before we analyze malware, let’s understand what we’re dealing with. Malware comes in many flavors, each with different objectives and behaviors.

| Type | What It Does | Example |
|---|---|---|
| Virus | Attaches itself to legitimate programs and spreads when those programs run | CIH (Chernobyl virus) |
| Worm | Self-replicating; spreads across networks without user interaction | WannaCry, Conficker |
| Trojan | Disguises itself as legitimate software to trick users into running it | Zeus, Emotet |
| Ransomware | Encrypts your files and demands payment for the decryption key | WannaCry, LockBit, Ryuk |
| Spyware | Silently monitors user activity: keystrokes, screenshots, browsing history | Pegasus, FinFisher |
| Rootkit | Hides deep in the OS to maintain persistent, undetectable access | Sony BMG rootkit, ZeroAccess |
| RAT (Remote Access Trojan) | Gives the attacker full remote control over the infected machine | DarkComet, njRAT |
| Botnet Agent | Turns the infected machine into a “zombie” controlled by a command server | Mirai, Emotet (also a botnet) |

Most modern malware doesn’t fit neatly into one category. WannaCry, for instance, was a worm (self-propagating via EternalBlue exploit) that delivered ransomware (encrypted files and demanded Bitcoin). Emotet started as a banking trojan, evolved into a botnet, and became a delivery platform for other malware. The lines are blurry.

The Three Pillars of Malware Analysis

There are three main approaches to analyzing malware, each with different depth, difficulty, and trade-offs.

1. Static Analysis — Look, Don’t Touch

Static analysis means examining the malware without executing it. You’re looking at the file itself — its structure, metadata, strings, imports, and code — but never actually running it.

This is the safest approach. Since the malware never executes, there’s no risk of infection. It’s also the fastest way to get a first impression.

What you do in static analysis:

  • File identification — What type of file is it? PE executable? ELF binary? Office document with macros?
  • Hashing — Generate SHA256/MD5 hashes to fingerprint the sample and check if it’s already known
  • String extraction — Pull readable text from the binary. You’d be surprised how often malware contains URLs, IP addresses, error messages, or even the author’s username in plain text
  • PE header analysis — For Windows executables, examine the imports (what Windows APIs does it call?), exports, sections, and timestamps
  • Packer detection — Is the binary packed or obfuscated? Packers like UPX compress the binary to hide its true contents

Example — extracting strings from a suspicious binary:

$ strings suspicious.exe | head -20
!This program cannot be run in DOS mode.
.text
.rdata
.data
kernel32.dll
CreateFileA
WriteFile
WinExec
http://evil-c2-server.com/beacon
cmd.exe /c whoami
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run

Look at that. Without running a single line of code, we already have strong hints that this binary:

  • Calls CreateFileA and WriteFile (file system operations)
  • Uses WinExec (executing commands)
  • Reaches out to http://evil-c2-server.com/beacon (command & control server)
  • Runs cmd.exe /c whoami (reconnaissance)
  • Writes to the Run registry key (persistence — it wants to survive reboots)

That’s a lot of intelligence from just strings. Of course, sophisticated malware won’t leave strings in plaintext; its authors will encrypt or obfuscate them. But you’d be surprised how many samples are sloppy.
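If you want the same string extraction inside a script, here is a minimal Python sketch of what strings does for ASCII text. The function name and the sample bytes are made up for illustration, and it ignores UTF-16 strings, which Windows binaries also use heavily.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.decode("ascii") for m in pattern.findall(data)]

# Hypothetical bytes standing in for a slice of a binary
blob = (b"\x00\x01MZ\x90\x00kernel32.dll\x00\xff\xfe"
        b"http://evil-c2-server.com/beacon\x00ab\x00")

for s in extract_strings(blob):
    print(s)
```

Short runs such as "MZ" and "ab" fall below the minimum length and are dropped, which is the same noise filter that strings -n applies.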

Example — checking a hash on VirusTotal:

$ sha256sum suspicious.exe
a1b2c3d4e5f6... suspicious.exe

Take that hash, search it on VirusTotal, and you’ll instantly see how many of its 70+ antivirus engines have already flagged it. If it’s a known sample, someone has probably already written a full report on it.
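When you are triaging many samples, it can be handy to compute the hashes in a script instead. The following is a small sketch using Python's standard hashlib; the helper names hash_stream and hash_file are my own, and the file is read in chunks so large samples do not need to fit in memory.

```python
import hashlib
import io

def hash_stream(stream, chunk_size: int = 65536) -> dict[str, str]:
    """Compute MD5 and SHA-256 of a binary stream in a single pass."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        md5.update(chunk)
        sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}

def hash_file(path: str) -> dict[str, str]:
    """Convenience wrapper that hashes a file on disk."""
    with open(path, "rb") as f:
        return hash_stream(f)

# Demo on in-memory bytes; for a real sample call hash_file("suspicious.exe")
print(hash_stream(io.BytesIO(b"abc")))
```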

2. Dynamic Analysis — Let It Run (Safely)

Dynamic analysis means executing the malware in a controlled environment and observing what it does. Instead of reading the code, you watch its behavior — what files it creates, what network connections it makes, what registry keys it modifies.

This is more dangerous than static analysis because the malware is actually running. That’s why you always do this inside a sandbox — an isolated virtual machine that you can snapshot before execution and revert after.

What you observe in dynamic analysis:

  • Process activity — What processes does it spawn? Does it inject into other processes?
  • File system changes — Does it create, modify, or delete files?
  • Registry changes — Does it add persistence keys? Modify security settings?
  • Network traffic — What domains does it contact? What data does it send?
  • API calls — What system calls does the malware make at runtime?

Setting up a safe analysis environment:

The golden rule — never analyze malware on your host machine. Here’s a typical setup:

  1. Create a VM — Use VirtualBox or VMware. Windows 10 is the most common target OS
  2. Take a clean snapshot — Before you introduce any malware
  3. Disable networking (or route through an isolated network) — You don’t want the malware calling home from your network
  4. Install analysis tools inside the VM — Process Monitor, Wireshark, Regshot, etc.
  5. Copy the sample into the VM and execute it
  6. Observe everything — Capture network traffic, file changes, registry modifications
  7. Revert to snapshot when done — The VM is clean again, as if nothing happened
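The snapshot and revert steps are easy to forget, so it can help to script them. Here is a rough sketch that assumes VirtualBox and its VBoxManage command-line tool; the VM name "analysis-win10" and the snapshot name "clean" are placeholders. It only prints the commands unless you turn off dry_run.

```python
import subprocess

def take_snapshot_cmd(vm: str, name: str) -> list[str]:
    """Command that saves the current VM state under a snapshot name."""
    return ["VBoxManage", "snapshot", vm, "take", name]

def restore_snapshot_cmd(vm: str, name: str) -> list[str]:
    """Command that rolls the VM back to a named snapshot."""
    return ["VBoxManage", "snapshot", vm, "restore", name]

def poweroff_cmd(vm: str) -> list[str]:
    """Command that hard-stops the VM (required before restoring)."""
    return ["VBoxManage", "controlvm", vm, "poweroff"]

def run(cmd: list[str], dry_run: bool = True) -> None:
    """Print the command, and execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)

VM, SNAPSHOT = "analysis-win10", "clean"  # hypothetical names

run(take_snapshot_cmd(VM, SNAPSHOT))     # before introducing the sample
run(poweroff_cmd(VM))                    # after the analysis session
run(restore_snapshot_cmd(VM, SNAPSHOT))  # back to the clean state
```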

Example — watching malware behavior with Process Monitor:

After detonating the sample in your sandbox, Process Monitor (Procmon) shows you every file operation, registry access, and network call in real time:

Operation       Path                                        Result
CreateFile      C:\Users\Admin\AppData\Local\Temp\svchost.exe    SUCCESS
RegSetValue     HKCU\Software\Microsoft\Windows\CurrentVersion\Run\Update    SUCCESS
TCP Connect     185.141.27.98:443                           SUCCESS
CreateFile      C:\Users\Admin\Documents\*.docx             SUCCESS
WriteFile       C:\Users\Admin\Documents\README_DECRYPT.txt SUCCESS

From this we can see the malware:

  1. Drops a copy of itself as svchost.exe in the Temp folder (disguised as a system process)
  2. Adds itself to the Run key (persistence)
  3. Connects to 185.141.27.98 on port 443 (C2 communication)
  4. Accesses .docx files (likely encrypting them)
  5. Drops a ransom note (README_DECRYPT.txt)

That’s ransomware behavior — and we figured it out just by watching.
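Procmon can export its capture to CSV, which makes this kind of triage scriptable. The sketch below assumes the export contains "Operation" and "Path" columns, as in the listing above; the three rules and the sample rows are simplified illustrations, not a complete detection set.

```python
import csv
import io

def flag_events(csv_text: str) -> list[tuple[str, str]]:
    """Label Procmon rows that match a few simple suspicious patterns."""
    findings = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        op = row.get("Operation", "")
        path = row.get("Path", "")
        lowered = path.lower()
        if op == "RegSetValue" and "\\currentversion\\run" in lowered:
            findings.append(("persistence", path))
        elif op == "TCP Connect":
            findings.append(("network", path))
        elif op == "CreateFile" and "\\temp\\" in lowered and lowered.endswith(".exe"):
            findings.append(("dropped executable", path))
    return findings

# Hypothetical export, trimmed to the columns this sketch uses
SAMPLE = r'''"Process Name","Operation","Path","Result"
"sample.exe","CreateFile","C:\Users\Admin\AppData\Local\Temp\svchost.exe","SUCCESS"
"sample.exe","RegSetValue","HKCU\Software\Microsoft\Windows\CurrentVersion\Run\Update","SUCCESS"
"sample.exe","TCP Connect","185.141.27.98:443","SUCCESS"
"sample.exe","ReadFile","C:\Users\Admin\Documents\report.docx","SUCCESS"
'''

for label, path in flag_events(SAMPLE):
    print(f"{label:20} {path}")
```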

Automated sandboxes:

If you don’t want to set up everything manually, automated sandboxes do the heavy lifting for you:

  • Cuckoo Sandbox — Open-source, self-hosted. Submit a file, and it runs it in a VM, captures everything, and generates a full report
  • Any.Run — Interactive online sandbox. You can watch the malware execute in real time in your browser
  • Hybrid Analysis — Free online sandbox by CrowdStrike. Upload a sample and get a detailed behavior report
  • Joe Sandbox — Commercial-grade, incredibly detailed analysis

3. Reverse Engineering — Reading the Blueprint

Reverse engineering is the deepest level of analysis. You take the compiled binary, disassemble it back into assembly code (or decompile it into pseudo-C), and read through the logic line by line.

This is where you go from “this malware contacts a C2 server” to “this malware uses a custom XOR-based encryption algorithm with key 0x4F to encode its C2 communications, and it falls back to a DGA (Domain Generation Algorithm) if the primary C2 is unreachable.”

This is the hard part. It requires:

  • Understanding of x86/x64 assembly language
  • Knowledge of OS internals (Windows API, PE format, memory management)
  • Patience — a lot of patience
  • Familiarity with disassemblers and debuggers

Key tools for reverse engineering:

| Tool | Type | Notes |
|---|---|---|
| Ghidra | Disassembler/Decompiler | Free, open-source (by NSA). Excellent decompiler |
| IDA Pro | Disassembler/Decompiler | Industry standard. Expensive, but the best |
| x64dbg | Debugger | Free, great for dynamic debugging on Windows |
| Radare2/Cutter | Disassembler/Debugger | Free, open-source. Steep learning curve |
| Binary Ninja | Disassembler | Clean UI, good API for scripting |

Example — decrypting hardcoded strings with Ghidra:

You load the malware into Ghidra, and the decompiler shows you something like this:

void decrypt_string(char *encrypted, int len) {
    for (int i = 0; i < len; i++) {
        encrypted[i] = encrypted[i] ^ 0x4F;
    }
}

void contact_c2(void) {
    char c2_domain[] = {0x2A, 0x39, 0x26, 0x23, 0x62, 0x2C, 0x7D, 0x62, ...};
    decrypt_string(c2_domain, sizeof(c2_domain));
    // c2_domain is now "evil-c2-server.com"
    connect_to_server(c2_domain);
}

The malware stored its C2 domain as XOR-encrypted bytes — strings wouldn’t have caught this. But by reverse engineering the decryption function, we recovered the actual domain. That’s an IoC we can now feed into our detection systems.
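Once you know the routine is a single-byte XOR, you can reproduce it outside the binary. Here is a minimal Python sketch; the function names are mine, and the brute-force helper assumes you have a guess about some text the decoded string should contain (".com" in this case).

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key (encoding and decoding are identical)."""
    return bytes(b ^ key for b in data)

def brute_force_xor(data: bytes, needle: bytes = b".com") -> list[tuple[int, bytes]]:
    """Try all 256 keys and keep the outputs that contain the expected text."""
    hits = []
    for key in range(256):
        candidate = xor_bytes(data, key)
        if needle in candidate:
            hits.append((key, candidate))
    return hits

# Encode the example domain the same way the sample does, then recover it
encoded = xor_bytes(b"evil-c2-server.com", 0x4F)
print(encoded.hex(" "))

for key, plaintext in brute_force_xor(encoded):
    print(hex(key), plaintext.decode("ascii"))
```

With only 256 possible keys, brute force is often quicker than reading the decryption routine, as long as you can recognize the right output when you see it.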

This is why basic static analysis alone isn’t always enough. Sophisticated malware obfuscates everything: strings, API calls, control flow. Reverse engineering peels back those layers.

A Complete Analysis Workflow

Let’s put it all together. Here’s a realistic workflow for analyzing a suspicious sample from start to finish.

Step 1: Sample Collection and Safety

You receive a suspicious file — maybe from a honeypot, an email attachment, or an incident response engagement.

# First, hash it for identification
$ sha256sum sample.exe
e3b0c44298fc1c149afbf4c8996fb924...

# Check the hash on VirusTotal
# If it's already known, read existing reports first

# Identify the file type
$ file sample.exe
sample.exe: PE32 executable (GUI) Intel 80386, for MS Windows

Step 2: Static Analysis

# Extract strings
$ strings -n 8 sample.exe > strings_output.txt

# Check for packers
$ diec sample.exe    # Detect It Easy (diec is its console version)
# Result: UPX 3.96 detected

# Unpack if needed
$ upx -d sample.exe -o sample_unpacked.exe

# Analyze PE headers with pefile (Python)
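$ python3 -c "
import pefile
pe = pefile.PE('sample_unpacked.exe')
# DIRECTORY_ENTRY_IMPORT is missing when the file has no import table
for entry in getattr(pe, 'DIRECTORY_ENTRY_IMPORT', []):
    print(entry.dll.decode())
    for func in entry.imports:
        # Functions imported by ordinal have no name
        print(f'  {func.name.decode() if func.name else hex(func.ordinal)}')
"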

Suspicious imports to watch for:

| API Call | Why It’s Suspicious |
|---|---|
| VirtualAlloc / VirtualProtect | Allocating executable memory (shellcode injection) |
| CreateRemoteThread | Injecting code into another process |
| WriteProcessMemory | Writing into another process’s memory |
| URLDownloadToFile | Downloading files from the internet |
| RegSetValueEx | Modifying the registry (persistence) |
| CryptEncrypt | Encrypting data (ransomware indicator) |
| IsDebuggerPresent | Anti-analysis: checking if it’s being debugged |
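To apply this watchlist automatically, you could feed the import names from the pefile loop into a small filter. The following is only a sketch: the dictionary mirrors the table above, the helper names are my own, and a hit is a reason to look closer rather than proof of malice, since plenty of legitimate software calls these APIs.

```python
SUSPICIOUS = {
    "VirtualAlloc": "allocating executable memory",
    "VirtualProtect": "changing memory protection",
    "CreateRemoteThread": "injecting code into another process",
    "WriteProcessMemory": "writing into another process's memory",
    "URLDownloadToFile": "downloading files from the internet",
    "RegSetValueEx": "modifying the registry",
    "CryptEncrypt": "encrypting data",
    "IsDebuggerPresent": "checking for a debugger",
}

def normalize(name: str) -> str:
    """Map ANSI/Unicode variants (trailing A or W) onto the watchlist name."""
    if name not in SUSPICIOUS and name[-1:] in ("A", "W") and name[:-1] in SUSPICIOUS:
        return name[:-1]
    return name

def flag_imports(names: list[str]) -> dict[str, str]:
    """Return the watchlisted APIs found among the imported function names."""
    flagged = {}
    for name in names:
        base = normalize(name)
        if base in SUSPICIOUS:
            flagged[base] = SUSPICIOUS[base]
    return flagged

# Hypothetical import list, as printed by the pefile loop
imports = ["CreateFileA", "RegSetValueExW", "URLDownloadToFileA", "IsDebuggerPresent"]
for api, reason in flag_imports(imports).items():
    print(f"{api}: {reason}")
```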

Step 3: Dynamic Analysis

Fire up your sandbox VM, take a snapshot, and detonate the sample.

Run these tools simultaneously:

  • Process Monitor — Capture all file, registry, and network operations
  • Wireshark — Capture all network traffic
  • Regshot — Take a registry snapshot before and after execution to see what changed
  • Process Hacker — Monitor running processes and their memory

Execute the sample and let it run for a few minutes. Some malware has sleep timers or checks for user activity before activating.
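The before-and-after comparison that Regshot performs boils down to diffing two snapshots. Here is a toy Python sketch of that idea, assuming each snapshot is a plain dictionary of key paths to values; the OneDrive entry is a made-up baseline value.

```python
def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots and report added, removed, and modified keys."""
    return {
        "added": sorted(k for k in after if k not in before),
        "removed": sorted(k for k in before if k not in after),
        "modified": sorted(k for k in before if k in after and before[k] != after[k]),
    }

RUN = r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run"

before = {
    RUN + r"\OneDrive": r"C:\Program Files\OneDrive\OneDrive.exe",
}
after = {
    RUN + r"\OneDrive": r"C:\Program Files\OneDrive\OneDrive.exe",
    RUN + r"\WindowsUpdate": r"C:\Users\Admin\AppData\Local\Temp\updater.exe",
}

for change, keys in diff_snapshots(before, after).items():
    print(change, keys)
```

The same function works for file system snapshots if you map file paths to hashes instead of registry keys to values.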

Step 4: Analyze the Results

After execution, you review:

=== Network Connections ===
DNS query: update-service.malwaredomain.com → 185.141.27.98
HTTPS connection to 185.141.27.98:443
POST /gate.php with encrypted body (likely exfiltrated data)

=== File System Changes ===
Created: C:\Users\Admin\AppData\Local\Temp\updater.exe (copy of itself)
Created: C:\Users\Admin\AppData\Roaming\config.dat (encrypted config)
Modified: Multiple .docx and .xlsx files in Documents folder

=== Registry Changes ===
Added: HKCU\Software\Microsoft\Windows\CurrentVersion\Run\WindowsUpdate
  Value: C:\Users\Admin\AppData\Local\Temp\updater.exe

=== Process Activity ===
sample.exe → spawned cmd.exe → executed "whoami" and "ipconfig"
sample.exe → injected into explorer.exe via CreateRemoteThread

Step 5: Extract IoCs

From all of this analysis, we compile our Indicators of Compromise:

Indicators of Compromise:
  File Hashes:
    # Placeholder values: these are the hashes of an empty file, not of a real sample
    - SHA256: e3b0c44298fc1c149afbf4c8996fb924...
    - MD5: d41d8cd98f00b204e9800998ecf8427e

  Network:
    - Domain: update-service.malwaredomain.com
    - IP: 185.141.27.98
    - URL: https://185.141.27.98/gate.php
    - User-Agent: "Mozilla/5.0 (compatible; MSIE 10.0)"

  Host:
    - File: C:\Users\*\AppData\Local\Temp\updater.exe
    - Registry: HKCU\Software\Microsoft\Windows\CurrentVersion\Run\WindowsUpdate
    - Mutex: Global\XYZ_MUTEX_001

  Behavior:
    - Enumerates documents (.docx, .xlsx, .pdf)
    - Process injection via CreateRemoteThread into explorer.exe
    - Exfiltrates data via HTTPS POST to /gate.php

These IoCs are now actionable. They can be shared with your security team, fed into your SIEM, used to write firewall rules, or published in a threat intelligence report.
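As a small illustration of putting the network IoCs to work, here is a Python sketch that scans log lines for the domain and IP from the list above. The log format is invented for the example; a real proxy or DNS log would need its own parsing.

```python
import re

# Network indicators from the example IoC list
BAD_DOMAINS = {"update-service.malwaredomain.com"}
BAD_IPS = {"185.141.27.98"}

# Hostname-like or IP-like tokens: letters, digits, dots, and hyphens
TOKEN = re.compile(r"[A-Za-z0-9.-]+")

def match_iocs(log_line: str) -> list[str]:
    """Return the known indicators that appear as whole tokens in a log line."""
    hits = set()
    for token in TOKEN.findall(log_line):
        token = token.strip(".").lower()
        if token in BAD_DOMAINS or token in BAD_IPS:
            hits.add(token)
    return sorted(hits)

# Hypothetical log lines
LOG = [
    "10:15:02 10.0.0.5 CONNECT update-service.malwaredomain.com:443",
    "10:15:03 10.0.0.5 POST https://185.141.27.98/gate.php",
    "10:15:09 10.0.0.7 GET https://example.com/index.html",
]

for line in LOG:
    hits = match_iocs(line)
    if hits:
        print(f"ALERT {hits}: {line}")
```

Matching whole tokens rather than substrings avoids false hits on longer addresses that merely contain the indicator.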

Anti-Analysis Techniques — Malware Fights Back

Here’s where it gets interesting. Malware authors know their creations will be analyzed, so they build in defenses. Some common anti-analysis techniques:

VM Detection

The malware checks if it’s running inside a virtual machine. If yes, it assumes it’s being analyzed and either does nothing or self-destructs.

// Check for VMware registry key
RegOpenKeyEx(HKEY_LOCAL_MACHINE, "SOFTWARE\\VMware, Inc.\\VMware Tools", ...);

// Check for VirtualBox artifacts
if (GetModuleHandle("VBoxHook.dll") != NULL) {
    ExitProcess(0);  // We're being analyzed, bail out
}

// Check CPU instruction timing
// CPUID traps to the hypervisor under hardware virtualization,
// so it takes measurably longer inside a VM than on bare metal

Counter: Use bare-metal analysis machines for sophisticated samples, or patch VM artifacts to make the environment look more like a real machine.
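To see your sandbox the way a sample might, it helps to check it for the same artifacts yourself. One artifact not shown in the snippet above, but commonly checked, is the network adapter's MAC address prefix, since VMware and VirtualBox assign addresses from well-known vendor ranges by default. A minimal Python sketch, with function names of my own choosing:

```python
import uuid
from typing import Optional

# Default MAC address prefixes (OUIs) assigned by common hypervisors
VM_OUIS = {
    "00:05:69": "VMware",
    "00:0c:29": "VMware",
    "00:50:56": "VMware",
    "08:00:27": "VirtualBox",
}

def vm_vendor_from_mac(mac: str) -> Optional[str]:
    """Return the hypervisor vendor if the MAC prefix is a known VM range."""
    normalized = mac.lower().replace("-", ":")
    return VM_OUIS.get(normalized[:8])

def local_mac() -> str:
    """Format this machine's MAC address (as reported by uuid.getnode)."""
    node = uuid.getnode()
    return ":".join(f"{(node >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))

mac = local_mac()
print(mac, "->", vm_vendor_from_mac(mac) or "no known VM prefix")
```

If this reports a vendor inside your analysis VM, changing the adapter's MAC address in the VM settings removes one easy tell.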

Anti-Debugging

The malware detects if a debugger is attached and changes its behavior.

if (IsDebuggerPresent()) {
    // Run benign code instead of malicious payload
    MessageBox(NULL, "Hello World!", "Legit App", MB_OK);
    return;
}

Counter: Patch IsDebuggerPresent to always return false, or use kernel-level debuggers that are harder to detect.

Packing and Obfuscation

The malware’s real code is compressed or encrypted. At runtime, it unpacks itself into memory. Static analysis sees only the packer stub, not the actual malicious code.

Counter: Let the malware unpack itself in a debugger, then dump the unpacked code from memory.
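A common heuristic for spotting packed or encrypted content before you reach for a debugger is byte entropy: compressed and encrypted data looks nearly random, so its Shannon entropy approaches 8 bits per byte, while ordinary code and text sit noticeably lower. Here is a small sketch; the function name is mine, and any threshold you pick (values above roughly 7 are often treated as suspicious) is a rule of thumb, not a guarantee.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    return sum(-(count / total) * math.log2(count / total)
               for count in Counter(data).values())

# Constant padding, English-like text, and a uniform spread of all byte values
print(shannon_entropy(b"\x00" * 1024))
print(shannon_entropy(b"This program cannot be run in DOS mode."))
print(shannon_entropy(bytes(range(256)) * 4))
```

In practice you would compute this per PE section, since a packed binary typically has one high-entropy section holding the compressed payload.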

Time Bombs

The malware waits — sometimes minutes, sometimes days — before activating. If a sandbox only runs for 60 seconds, it’ll see nothing malicious.

Counter: Fast-forward the system clock in the sandbox, or use sandboxes with extended execution times.

Essential Tools — Your Analysis Toolkit

Here’s a summary of the tools we’ve mentioned, organized by purpose.

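| Category | Tool | Platform | Cost |
|---|---|---|---|
| Static Analysis | strings, file, xxd | Linux/Mac | Free |
| Static Analysis | Detect It Easy (DIE) | Windows/Linux | Free |
| Static Analysis | pefile (Python library) | Cross-platform | Free |
| Static Analysis | PEStudio | Windows | Free |
| Dynamic Analysis | Process Monitor (Procmon) | Windows | Free |
| Dynamic Analysis | Process Hacker | Windows | Free |
| Dynamic Analysis | Regshot | Windows | Free |
| Dynamic Analysis | Wireshark | Cross-platform | Free |
| Sandboxes | Cuckoo Sandbox | Self-hosted | Free |
| Sandboxes | Any.Run | Online | Free tier |
| Sandboxes | Hybrid Analysis | Online | Free |
| Reverse Engineering | Ghidra | Cross-platform | Free |
| Reverse Engineering | IDA Free / IDA Pro | Cross-platform | Free / Paid |
| Reverse Engineering | x64dbg | Windows | Free |
| Reverse Engineering | Radare2 / Cutter | Cross-platform | Free |
| Online Analysis | VirusTotal | Web | Free |
| Online Analysis | MalwareBazaar | Web | Free |
| Online Analysis | URLhaus | Web | Free |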

You don’t need all of these to get started. A solid beginner setup would be: a Windows VM + Process Monitor + Wireshark + Ghidra + VirusTotal. That’s enough to perform basic static and dynamic analysis on most samples.

Getting Started — Where to Practice

You can’t learn malware analysis without samples to analyze. Here are some safe, legal resources:

  • MalwareBazaar (bazaar.abuse.ch) — A repository of malware samples shared by the security community. Free to download (you’ll need an account)
  • theZoo (github.com/ytisf/theZoo) — A curated collection of malware samples for research purposes
  • Any.Run — Watch other people’s analysis sessions to learn what to look for
  • Practical Malware Analysis (book by Sikorski & Honig) — The gold standard for learning. Comes with lab exercises and sample files
  • Malware Unicorn’s workshops — Free, hands-on reverse engineering workshops

Start with static analysis — it’s the safest and gives you quick wins. Once you’re comfortable, move to dynamic analysis in a VM. And when you’re ready for the deep end, pick up Ghidra and start reverse engineering.

Final Thoughts

Malware analysis is one of those fields that sits at the intersection of programming, security, and detective work. Every sample is a puzzle — the author is trying to hide what it does, and your job is to figure it out anyway.

What I find most fascinating is that it forces you to understand how computers actually work — at the OS level, at the network level, at the assembly level. You can’t analyze a rootkit without understanding how the kernel manages processes. You can’t spot a C2 beacon without understanding how DNS and HTTP work. You can’t reverse engineer an encryption routine without understanding the math behind it.

If you’re interested in cybersecurity and want to understand the offensive side from a defensive perspective — malware analysis is an incredible place to start.

Stay curious, and happy hunting!
