Introduction to Malware Analysis

Disclaimer
• This stuff requires the analyst to dive
extremely deep into technical details
• This quick talk will attempt to give you a 1000
foot view of malware analysis
• I draw a careful distinction between Malware
Analysis and Reverse Engineering

Malware Analysis Overview
• Static Analysis: involves analyzing the code
without actually running the code
– File identification, header information, strings, etc.
– Disassembler – IDA Pro
• Dynamic Analysis: involves executing the code in
a controlled manner and monitoring system
changes
– Sysinternals, memory forensics, etc.
– Debuggers – Immunity Debugger, OllyDbg

Coding Terms
• Malware authors write code in a high-level programming
language: C/C++

Static Analysis: File Identification
• Linux “file” utility
• Python-magic module
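A rough sketch of signature-based file identification, using only the standard library rather than python-magic. It checks a handful of well-known magic bytes; the function names and the tiny signature table are illustrative assumptions, not the database the “file” utility actually uses.

```python
# Hypothetical, deliberately small signature table; real tools know thousands.
SIGNATURES = [
    (b"MZ", "DOS/Windows executable (MZ header)"),
    (b"\x7fELF", "ELF executable"),
    (b"%PDF", "PDF document"),
    (b"PK\x03\x04", "ZIP archive (also DOCX, JAR, APK)"),
]


def identify(data: bytes) -> str:
    """Return a coarse file type based on the leading magic bytes."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first bytes of a file and identify it."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))
```

The point is that identification relies on file content, not on the extension, so a sample renamed to .jpg would still be reported as an executable.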

Static Analysis: MD5 Hash
• Linux “md5sum” utility: md5sum <fileName>
• Python hashlib module:
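A minimal sketch of hashing a sample with hashlib. The helper names and the chunk size are assumptions chosen for illustration; reading in chunks simply avoids loading a large sample into memory at once.

```python
import hashlib


def md5_of_bytes(data: bytes) -> str:
    """Return the hex MD5 digest of an in-memory buffer."""
    return hashlib.md5(data).hexdigest()


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result of md5_of_file should match what md5sum prints for the same file.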

Static Analysis: Strings
• Can be a quick way to gain intelligence from
the file:
– Domains, IPs, URLs, function names, hardcoded
information
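A small sketch of what a strings pass does: pull runs of printable ASCII out of a binary, then look for indicators. The minimum length of 4 and the two regular expressions are assumptions for illustration; they are loose and will produce false positives.

```python
import re

# Loose, illustrative patterns; not a complete indicator grammar.
IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII characters at least min_length long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


def find_indicators(strings) -> dict:
    """Collect IP-like and URL-like substrings from extracted strings."""
    hits = {"ips": [], "urls": []}
    for text in strings:
        hits["ips"].extend(IP_PATTERN.findall(text))
        hits["urls"].extend(URL_PATTERN.findall(text))
    return hits
```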

Static Analysis: Packers
• Packers are used to obfuscate the code, which:
– Changes the file signature (MD5 hash)
– Obfuscates the file strings and code
– Compresses the file size (sometimes)
• Packed code can be identified by:
– Examining the PE sections and imports: a PE file that imports only
LoadLibrary/GetProcAddress is normally packed
– Strings: UPX0, UPX1, aspack, adata, NSP0, NSP1, WinRAR
SFX, PEC2, PECompact2, Themida, Orean.sys, NTkrnl,
Secure Suite
• Tools such as PEiD, LordPE, and the Python peutils module can help
identify packers
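The two checks above can be sketched with the standard library alone. This is a naive byte search for the marker strings listed on this slide plus a test for a loader-only import list; the function names are hypothetical, and short markers such as adata can match by coincidence, so treat hits as hints rather than proof.

```python
# Marker strings taken from the slide; a hit is a hint, not a verdict.
PACKER_MARKERS = [
    b"UPX0", b"UPX1", b"aspack", b"adata", b"NSP0", b"NSP1",
    b"WinRAR SFX", b"PEC2", b"PECompact2", b"Themida", b"Orean.sys", b"NTkrnl",
]

# Imports a packer stub typically needs to resolve everything else at runtime.
LOADER_ONLY = {"LoadLibraryA", "LoadLibraryW", "GetProcAddress"}


def find_packer_markers(data: bytes) -> list:
    """Return the marker strings present anywhere in the file bytes."""
    return [marker.decode("ascii") for marker in PACKER_MARKERS if marker in data]


def imports_look_packed(imported_names) -> bool:
    """True when the import list is non-empty and contains only loader functions."""
    names = set(imported_names)
    return bool(names) and names <= LOADER_ONLY
```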

Static Analysis: Packers Cont.
• Unpacked vs. Packed Strings:

Data Encoding
• Malware uses encoding for a number of reasons, such as
disguising internal workings, hiding C2 information, and
exfiltrating data
• Some simple encoding algorithms are:
– Character Substitution
– XOR – uses a static key to XOR with the original value
– Base64 – Can use default or custom character set
– Default Base64 character set: A-Z, a-z, 0-9, +, /
• We will examine two common data encoding
techniques used in malware: XOR and Base64

Data Encoding: XOR
• Strings often have to be stored in a program so they can be
passed as parameters to functions; XOR encoding hides them from a
simple strings dump
• XOR once = encoded
• XOR again with same key = plaintext
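A minimal sketch of the symmetric property described above. The function name and the example key are assumptions; a repeating multi-byte key is supported, and a single-byte key is just the special case of a one-byte key.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(byte ^ key[index % len(key)] for index, byte in enumerate(data))


# Applying the same key twice returns the original bytes.
encoded = xor_bytes(b"cmd.exe", b"\x41")
decoded = xor_bytes(encoded, b"\x41")
```

Because the operation is its own inverse, an analyst who recovers the key from the disassembly can reuse the same routine to decode every hidden string.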

Data Encoding: Base64
• Storing base64 strings as HTML comments is how the APT group
“Comment Crew” got their name. This technique is still leveraged today in
malware
• Base64 is a common encoding scheme because it is very easy to decode

Static Analysis: PE File Format
• The PE data structure contains all the information required for the
Windows OS loader to manage executable code. Common sections:
– .text – instructions the CPU executes
– .rdata – Imports and Exports
– .data – Global data
– .rsrc – Resources (icons, images, strings, etc.)
• Useful information in the PE header:
– Imports and Exports – give an idea of the malware’s functionality
– Compilation Time, Language Settings, and strings
– Section Names – Packed code can have non-standard section names
• Tools to analyze the PE header: pescanner.py, CFF Explorer, Python
pefile, Resource Hacker, Dependency Walker, LordPE, etc.
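As a rough illustration of where the compilation time and section names live, here is a sketch that reads them with the standard struct module instead of pefile. It follows the documented PE layout (MZ header, e_lfanew at offset 0x3C, 20-byte COFF header, 40-byte section entries); the function name and the returned dictionary are assumptions, and there is no handling of malformed or truncated files beyond the two signature checks.

```python
import struct
from datetime import datetime, timezone


def parse_pe_header(data: bytes) -> dict:
    """Extract machine type, compile time, and section names from PE bytes."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew: file offset of the PE signature, stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # COFF header: Machine, NumberOfSections, TimeDateStamp, two symbol fields,
    # SizeOfOptionalHeader, Characteristics.
    machine, section_count, timestamp, _, _, optional_size, _ = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    section_table = pe_offset + 4 + 20 + optional_size
    sections = []
    for index in range(section_count):
        start = section_table + 40 * index
        raw_name = data[start:start + 8]  # 8-byte, NUL-padded section name
        sections.append(raw_name.rstrip(b"\x00").decode("ascii", errors="replace"))
    return {
        "machine": hex(machine),
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "sections": sections,
    }
```

Keep in mind that the timestamp is written by the linker and can be forged, so it is a lead rather than a fact.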

Windows API Calls:
• When performing advanced static or dynamic analysis it’s
important to have a good understanding of Windows API calls
• By looking at the imported functions within the PE header you
can see which Windows API functions the PE file wants to
utilize
• By recognizing API calls you can quickly get an idea of the
malware’s functionality, both when reviewing strings output and
during advanced static analysis using a disassembler
• An excellent resource for Windows API calls is MSDN. Google
search “API_Function MSDN”
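One way to make this habit concrete is a small lookup from imported function names to the behavior they commonly suggest. The table below is a hypothetical, far-from-complete example built from widely documented API uses; a match only tells you where to look, since legitimate software imports the same functions.

```python
# Illustrative hints only; base names are stored without the A/W suffix.
API_HINTS = {
    "CreateRemoteThread": "process injection",
    "WriteProcessMemory": "process injection",
    "VirtualAllocEx": "process injection",
    "SetWindowsHookEx": "hooking / keylogging",
    "GetAsyncKeyState": "hooking / keylogging",
    "URLDownloadToFile": "network download",
    "InternetOpenUrl": "network download",
    "RegSetValueEx": "registry modification",
    "CreateService": "service installation",
    "IsDebuggerPresent": "anti-debugging",
}


def summarize_imports(imported_names) -> dict:
    """Group imported function names by the behavior they may indicate."""
    hints = {}
    for name in imported_names:
        base = name
        # Many Windows APIs come in ANSI (A) and wide (W) variants.
        if name not in API_HINTS and name[-1:] in ("A", "W"):
            base = name[:-1]
        if base in API_HINTS:
            hints.setdefault(API_HINTS[base], []).append(name)
    return hints
```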

Windows API: MSDN Example
• The parameters modify how the function will be used on the
system.
• The return type is what the function will return after it is
called in a program.

Windows API: Disassembly
• The stack is Last In First Out (LIFO), so parameters are pushed
in reverse order (last parameter first), which is why they appear
reversed in the disassembly
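A toy model of that ordering, assuming a 32-bit x86 convention such as stdcall or cdecl where the caller pushes arguments right to left. The function names and the argument labels are hypothetical; the stack is modeled as a list whose last element is the top.

```python
def push_order(function_name: str, arguments) -> list:
    """Return the instructions a caller would emit, in listing order."""
    listing = [f"push {argument}" for argument in reversed(arguments)]
    listing.append(f"call {function_name}")
    return listing


def stack_after_pushes(arguments) -> list:
    """Return the stack contents just before the call; the last item is the top."""
    stack = []
    for argument in reversed(arguments):
        stack.append(argument)
    return stack
```

Reading a disassembly listing bottom-up from the call therefore gives the parameters in the order the documentation lists them, and the first parameter ends up on top of the stack.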

Wake Up 
• Okay, that was likely starting to bore some
people – SORRY
• Let’s move to Dynamic analysis which is more
flashy

Getting Infected
• Double clicking the executable doesn’t always work
– Sometimes you need to register the malware as a service or load it as
a DLL (regsvr32.exe and rundll32.exe)
• Install the malware as a service
• Interact with the system like a normal user
– The malware may be waiting for a certain application to open
so it can inject code into it (Ex: Internet Explorer)
• It could require a CLI argument
– One sample required <filename> /install in order to actually
run the malware
– Static analysis is normally required to determine CLI
switches

SysInternals Tool Suite
• If I could pick just one tool, I’d pick the 50+ in
the Sysinternals tool suite
• Tools put out by Mark Russinovich, who now
works for Microsoft
• Process Explorer, Process Monitor, Autoruns,
etc.

Process Monitor
• Very verbose tool that generates a lot of events
• Filtering is required to make sense of the data

Process Monitor Cont.
• Press Ctrl+L to bring up the filtering dialog box
– Useful quick filters are:
• Operation is WriteFile
• Category is Write
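The same filtering idea can be applied offline to a saved capture. This sketch assumes the events were exported to CSV with columns named Process Name, Operation, and Path; check those names against your own export, and open the file with encoding="utf-8-sig" if it starts with a byte order mark. The function name is hypothetical.

```python
import csv


def write_events(csv_lines, process_name=None) -> list:
    """Return (process, path) pairs for WriteFile events in a CSV export.

    csv_lines is any iterable of text lines, such as an open file object.
    """
    hits = []
    for row in csv.DictReader(csv_lines):
        if row.get("Operation") != "WriteFile":
            continue
        if process_name and row.get("Process Name", "").lower() != process_name.lower():
            continue
        hits.append((row.get("Process Name"), row.get("Path")))
    return hits
```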

Malware Persistence - Autoruns
• Persistence really is the key to identifying malware: how does it
survive a reboot?
• Autoruns can help enumerate persistence mechanisms:

Monitoring Network Activity
• Some interesting network indicators of malware are:
– SYNs out to an IP or domain
– UDP traffic to IP or domain
– HTTP GET/POST requests
– DNS Queries
– Connection attempt timing is important: regular intervals (every 1 min, 30 mins, etc.) suggest beaconing
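A simple sketch of checking connection timing for regularity. It takes connection timestamps in seconds, computes the gaps between them, and reports the average gap when the spread is small relative to that average. The function name and the 10% tolerance are assumptions; real beacons often add jitter, so the threshold would need tuning.

```python
from statistics import mean, pstdev


def beacon_interval(timestamps, tolerance: float = 0.1):
    """Return the average gap in seconds if connections look periodic, else None."""
    times = sorted(timestamps)
    if len(times) < 3:
        return None  # too few connections to judge regularity
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    average = mean(gaps)
    if average == 0:
        return None
    # Coefficient of variation: spread of the gaps relative to their mean.
    if pstdev(gaps) / average <= tolerance:
        return average
    return None
```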

Automation? Sandboxes
• So far the basic dynamic analysis we have talked about
can be automated
• Sandboxes are a good tool in any malware analyst’s
toolbox, and they have pros and cons:
– Pros: Speeds up analysis, saves time
– Cons: Misses details, can be fooled
• Sandboxes can be open source or commercial:
– Really good free option is Cuckoo sandbox:
• Install Tutorial: http://www.primalsecurity.net/im-cuckoo-for-malware-with-a-spice-of-reverse-engineering/

Summary
• Malware analysis requires both static and
dynamic analysis techniques to accurately
enumerate indicators of compromise
• As with any automated tool, an analyst will
need to be able to validate findings manually
