The document discusses analyzing malicious PDF files. It describes decompressing PDFs with PDFTK to expose embedded JavaScript. The JavaScript is then run through a JavaScript emulator such as SpiderMonkey to deobfuscate it. Any shellcode is converted from its %u-encoded Unicode form to raw bytes, and its behavior is analyzed with sctest. The document walks through sample PDFs, covering JavaScript extraction, stream decompression, and shellcode payload inspection. The analysis steps and tools are explained to show how malicious PDF files work and how to discover embedded exploits.
2. Page § 2
Vulnerability, exploit code, shellcode
§ A vulnerability is a weakness that allows an attacker to reduce a
system's information assurance.
§ A vulnerability is also known as the attack surface. To use it, an
attacker must have at least one applicable tool or technique that can
connect to a system weakness.
§ An exploit is a piece of software, a chunk of data, or a sequence of
commands that takes advantage of a vulnerability to cause unintended or
unanticipated behavior and gain control.
§ Shellcode is a small piece of code used as the payload of the exploit
code. It is called "shellcode" because it typically starts a command
shell from which the attacker can control the compromised machine.
3. Page § 3
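MS09-002 Internet Explorer Remote Code Execution Vulnerability
§ Published Date: 2009-02-10
§ CVE(CAN) ID: CVE-2009-0075
§ Classification: Critical
(Windows XP Service Pack 2 and Windows XP Service Pack 3 + IE7)
§ Patch: Cumulative Security Update for Internet Explorer (kb961260)
§ Description:
This has become the most popular vulnerability used by drive-by download
web pages. The CFunctionPoint function in IE7 and IE8 does not correctly
check the parameter contents carried by the DOM, so a specially crafted
string can corrupt the memory protection mechanism and let an attacker
run "any program" with the privileges of the currently logged-in user.
The MS09-002 exploit code uses a JavaScript heap spray, which leaves the
user unable to browse the page normally while the malicious code executes.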
12. Page § 12
PDF Introduction
§ PDF: Portable Document Format
§ PDF features:
• Cross-platform
• Preserves the original document formatting (page layout)
• Open standard
§ December 2007: the PDF format became a standard
§ PDF readers: Adobe Reader, Foxit Reader, Sumatra PDF
§ ASCII-based document
13. Page § 13
PDF Format
§ File layout, top to bottom:
• PDF Start (Version)
• PDF Object
• Cross Reference Table
• Trailer
• End of File
§ Object (obj … endobj): stream data element
§ JavaScript object starts with /JS
To learn more about the PDF format:
http://www.4xpdf.com/2008/10/download-iso-32000-1-document-for-free-from-adobecom/
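The obj … endobj layout described on this slide can be walked with a few lines of Python. The following is a minimal sketch, not a full PDF parser: it assumes the objects are stored uncompressed in the file body (not hidden in an object stream), and the function name `find_js_objects` is hypothetical.

```python
import re

# Matches "<num> <gen> obj ... endobj" and captures the object number and body.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)

# /JS must not be followed by another name character (so /JSFoo is ignored).
JS_RE = re.compile(rb"/JS(?![A-Za-z0-9])|/JavaScript(?![A-Za-z0-9])")


def find_js_objects(data: bytes) -> list:
    """Return the numbers of the objects whose body mentions /JS or /JavaScript."""
    hits = []
    for match in OBJ_RE.finditer(data):
        obj_number = int(match.group(1))
        body = match.group(3)
        if JS_RE.search(body):
            hits.append(obj_number)
    return hits


if __name__ == "__main__":
    sample = (
        b"%PDF-1.4\n"
        b"1 0 obj\n<< /Type /Catalog /OpenAction 7 0 R >>\nendobj\n"
        b"7 0 obj\n<< /S /JavaScript /JS (app.alert(1)) >>\nendobj\n"
        b"%%EOF\n"
    )
    print(find_js_objects(sample))
```

Names written with hex escapes (for example /J#53) would slip past this pattern, so a negative result does not prove the file is clean.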
14. Page § 14
PDF Format -- PDF Objects
Object Name                             Description
/OpenAction, /AA (Additional Action)    Run scripts or actions automatically
/Names, /AcroForm, /Action              Run and launch scripts or actions
/JavaScript, /JS                        Run JavaScript
/GoTo*                                  Change view to a specified destination within the PDF or another PDF
/Launch                                 Launch a program or open a document
/URI                                    Access a URL
/SubmitForm, /GoToR                     Send data to a URL
/RichMedia                              Embed Flash in the PDF
/Filter                                 Compression or encoding format for a binary stream
/ObjStm                                 Hide objects inside an object stream
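A quick triage pass can count how often these object names occur in a file, which is roughly the idea behind keyword-scanning tools. The sketch below is an illustration under stated assumptions: the name list is a subset chosen from the table, `count_pdf_names` is a hypothetical helper, and names inside compressed streams or written with hex escapes are not seen.

```python
import re

# Subset of the names from the table above; extend as needed.
SUSPICIOUS_NAMES = [
    "/OpenAction", "/AA", "/AcroForm", "/JavaScript", "/JS",
    "/Launch", "/URI", "/SubmitForm", "/RichMedia", "/ObjStm",
]


def count_pdf_names(data: bytes) -> dict:
    """Count raw occurrences of each suspicious name in the PDF bytes."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # The lookahead stops /AA from matching inside a longer name such as /AAB.
        pattern = re.escape(name.encode("ascii")) + rb"(?![A-Za-z0-9])"
        counts[name] = len(re.findall(pattern, data))
    return counts


if __name__ == "__main__":
    sample = (
        b"1 0 obj\n<< /Type /Catalog /OpenAction 7 0 R >>\nendobj\n"
        b"7 0 obj\n<< /S /JavaScript /JS (app.alert(1)) >>\nendobj\n"
    )
    for name, count in count_pdf_names(sample).items():
        print(f"{name:<12} {count}")
```

A non-zero count for /OpenAction or /AA together with /JS is a reason to look closer, not a verdict.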
15. Page § 15
Malicious PDF attacking: How it works!!!
1. Hackers craft a malicious PDF.
2. They distribute it:
• via email attachments
• embedded in a malicious web page
• by other means: upload, USB, P2P
3. The victim opens the file with a vulnerable PDF reader or browser PDF plugin.
4. The malicious payloads execute.
16. Page § 16
Malicious PDF attacking: why PDF is targeted
§ PDF is widely used in IT
§ Popular PDF readers and creators have bugs
(Acrobat Reader 8.1.1 has several vulnerabilities)
§ Easy to combine with spam mail
§ Easy to combine with client-side attacks
(PDF plugins in web browsers)
17. Page § 17
Malicious PDF Analysis Methodology
1. Observe PDF structure: observe the PDF structure and content such
as shellcode, VBA macros, or JavaScript.
2. Extract code segments: decrypt, locate and extract suspicious
embedded code segments.
3. Disassemble malicious code: disassemble, deobfuscate and examine
JavaScript, ActionScript or VB macro code.
4. Find next steps: understand the next infection step.
19. Page § 19
Malicious PDF Analysis -- Tools
§ Wepawet: analyzes web-based malware
– http://wepawet.iseclab.org/
§ Jsunpack: A Generic JavaScript Unpacker
– http://jsunpack.jeek.org/dec/go
§ PDFTK : PDF stream decoder
§ SpiderMonkey: JavaScript emulator
§ Sctest of Libemu: shellcode analysis
§ PDF StructAzer: displays structure and raw contents of the PDF file
§ PDFiD: identify strings with scripts and actions in PDF files
§ PDF-parser: identify key elements of PDF file without rendering it
20. Page § 20
Malicious PDF Analysis -- Tools (Cont.)
§ Origami Walker: examine the structure of PDF files.
§ Origami pdfscan: identify strings with scripts and actions
§ Origami extractjs: extract JavaScript from PDF files
§ Jsunpack-n’s pdf.py: extract JavaScript from PDF files.
§ Malzilla: help deobfuscate JavaScript
21. Page § 21
Build a PDF Analysis Lab
§ Step 1: choose text editor, vim.tiny or gedit
§ Step 2: apt-get install python
§ Step 3: apt-get install pdftk
§ Step 4: Install libemu
– wget http://ncu.dl.sourceforge.net/project/nepenthes/libemu%20development/0.2.0/libemu-0.2.0.tar.gz
– tar zxvf libemu*.tar.gz
– cd libemu-0.2.0
– ./configure --prefix=/opt/libemu/
– make && make install
22. Page § 22
Build a PDF Analysis Lab (Cont.)
§ Step 5: Install SpiderMonkey (the modified js-1.7.0 build)
– wget http://www.didierstevens.com/files/software/js-1.7.0-mod.tar.gz
– tar zxvf js-1.7.0-mod.tar.gz
– cd js/src
– export CFLAGS="-DJS_C_STRINGS_ARE_UTF8"
– make -f Makefile.ref
– JS_DIST=/opt/js make -f Makefile.ref export
23. Page § 23
Before we start …..
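1. Open the file and look for /JS or /JavaScript
2. Check whether the content is compressed; if so, decompress it with PDFTK
3. Check whether the JavaScript is obfuscated; if so, run it through JS (SpiderMonkey)
4. Analyze the shellcode with sctest (libemu)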
24. Page § 24
Analysis Sample 1 : Observe /JS
§ $ vim.tiny sample1.pdf to find /JS or /JavaScript
§ Look at Object 7 (it includes the heap spray code)
§ util.printf() triggers the malicious exploit code
§ This is the Adobe Reader 'util.printf()' JavaScript Function Stack
Buffer Overflow Vulnerability
( http://www.securityfocus.com/bid/30035)
26. Page § 26
Analysis Sample1 : Shellcode Inspection
§ This is a malicious PDF file
§ Copy everything from var payload to the end and save it as shellcode.txt
§ Remove stray characters using vim.tiny
– Remove the + signs (together with quotes and dots): %s/["+.]//g
– Remove the line breaks: %j
§ Convert the %u-encoded Unicode escapes to raw bytes with perl
– cat shellcode.txt | perl -pe 's/%u(..)(..)/chr(hex($2)).chr(hex($1))/ge' | hexdump -C
– cat shellcode.txt | perl -pe 's/%u(..)(..)/chr(hex($2)).chr(hex($1))/ge' > shellcode.bin
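For readers who prefer Python to perl, the same byte swap can be sketched as follows. Each %uXXYY escape holds a 16-bit little-endian value, so the low byte YY is emitted before the high byte XX. The function name `unescape_percent_u` is hypothetical, and unlike the perl one-liner, which passes other characters through unchanged, this sketch silently drops anything that is not a %u escape.

```python
import re

# One %uXXYY escape: XX is the high byte, YY is the low byte.
PERCENT_U_RE = re.compile(r"%u([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})")


def unescape_percent_u(text: str) -> bytes:
    """Convert %uXXYY escapes into raw bytes, low byte first."""
    out = bytearray()
    for high, low in PERCENT_U_RE.findall(text):
        out.append(int(low, 16))   # low byte comes first in memory
        out.append(int(high, 16))  # then the high byte
    return bytes(out)


if __name__ == "__main__":
    # Two NOPs followed by the bytes 0x41 0x42.
    print(unescape_percent_u("%u9090%u4241").hex())
```

Writing the result to a file in binary mode gives a shellcode.bin equivalent for further analysis.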
27. Page § 27
Analysis Sample1: Shellcode Analysis
§ Feed the shellcode.bin to sctest
– /opt/libemu/bin/sctest -Ss 100000 < shellcode.bin
§ Result: a reverse shell connecting to IP 202.190.85.36 on port 7777
28. Page § 28
Analysis Sample2: Observe /JS
§ $ vim.tiny sample2.pdf to find /JS or /JavaScript
§ Look at Object 1 and search for the 'Z0pEA5PLzPyyw()' function
§ Found /Filter: this PDF is compressed
(FlateDecode + ASCIIHexDecode; the stream objects also carry a stream length)
• FlateDecode: standard zlib compression
• ASCIIHexDecode: hex character conversion
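The two filters can also be undone by hand with Python's standard library. This is a minimal sketch: it assumes the stream bytes have already been cut out of the file, that the filters are applied in the order listed in the /Filter array, and that the helper names (`ascii_hex_decode`, `flate_decode`, `decode_stream`) are hypothetical.

```python
import binascii
import zlib


def ascii_hex_decode(data: bytes) -> bytes:
    """Undo ASCIIHexDecode: hex digits, optional whitespace, '>' ends the data."""
    hex_part = data.split(b">", 1)[0]
    digits = b"".join(hex_part.split())  # drop all ASCII whitespace
    if len(digits) % 2:
        digits += b"0"  # an odd final digit is treated as if followed by 0
    return binascii.unhexlify(digits)


def flate_decode(data: bytes) -> bytes:
    """Undo FlateDecode: the stream is plain zlib-compressed data."""
    return zlib.decompress(data)


DECODERS = {
    "/ASCIIHexDecode": ascii_hex_decode,
    "/FlateDecode": flate_decode,
}


def decode_stream(data: bytes, filters: list) -> bytes:
    """Apply each filter in order, feeding the output of one into the next."""
    for name in filters:
        data = DECODERS[name](data)
    return data


if __name__ == "__main__":
    original = b"app.alert(1);"
    encoded = binascii.hexlify(zlib.compress(original)) + b">"
    print(decode_stream(encoded, ["/ASCIIHexDecode", "/FlateDecode"]))
```

Swapping the two names in the filter list handles a stream that was encoded in the opposite order.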
29. Page § 29
Analysis Sample2: Decompression
§ Decompression: use PDFTK
– pdftk sample2.pdf output dc-sample2.pdf uncompress
§ Observe /JS again in decompressed PDF
30. Page § 30
Analysis Sample3: Obfuscated JavaScript Analysis
§ PDF readers have their own JavaScript engines, and those engines have bugs
§ A malicious PDF can execute JavaScript that triggers exploit
code targeting those bugs
§ Use a JS emulator to analyze the JavaScript
31. Page § 31
Analysis Sample3: Observe and decompress
§ Decompress
– pdftk sample3.pdf output dc-sample3.pdf uncompress
§ Look for /JS and find Object 13
§ Object 13 contains JavaScript code
§ Copy the JavaScript to sample3.js
§ Remove the carriage returns (^M) generated by PDFTK
%s/^M//g using vi
32. Page § 32
Analysis Sample3: JavaScript Analysis
§ JavaScript Emulator
– /opt/js/bin/js sample3.js
§ Analyze the eval.001.log file
35. Page § 35
Analyze sample 4 : PDF syntax Obfuscation
§ PDF uses its own syntax for document handling
– This.Info.Title
– getPageNum
Analysis Steps:
– Observe and decompress
– Extract JavaScript
– JS analysis
– Shellcode execution and analysis
How many exploits?
List the download URLs and the malware