This document discusses malware collection and analysis conducted at the DSNSLab at NCTU. It introduces the lab director, Professor Xie Zhiping, and outlines the lab's research areas including malware analysis, virtual machines, digital forensics, and network security. It then provides an overview of the Secmap platform for automated malware analysis and collection. Methods of malware collection discussed include disk forensics, web crawling, shared repositories, email, and honeypots.
3. Outline
• Rapid Increase of Malware
• Secmap
– Automatic Malware Analysis Cycle
– High Performance and Fault Tolerance
– Modulation
• Malware Collection
– Disk Forensics
– Email Attachment
– Web Crawler
– Malware Sharing Repository
– Honeypot
• Malware Attributes
Note: Some parts of these slides have been removed because the research is still in progress.
4. Rapid Increase of Malware
• Malware is increasing (source: McAfee Labs Threat Report, Fourth Quarter 2013)
5. Malware Life Cycle
• Malware Life Cycle and Response Window
http://www.fireeye.com/blog/corporate/2014/05/ghost-hunting-with-anti-virus.html
11. Malware Collection
• Malware samples help us construct detection models and design signatures
• Therefore, we collect samples in the following ways
– Honeypot
– Web Crawler
– Shared Repository
– Email
– Disk Forensics
– User Upload
12. Disk Forensics
• When hosts are infected, disk forensics is needed to discover malware that has been
– Deleted
– Hidden
• Deleting files is an important malware behavior
– About half of malware samples delete some files during execution
– Malware often deletes log files and the binaries it created, or removes itself, to hinder forensics
• It is useful if we can recover files deleted by malware
14. Recovery Mechanism
• Software approach
– The basic method needs the file system's metadata to recover files
– File carving was proposed to recover files without the file system's metadata
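To make the idea of carving concrete, here is a minimal sketch that recovers JPEG candidates from a raw byte dump using only the well-known JPEG start and end markers. It is an illustration under simplifying assumptions, not the lab's tool: the function name `carve_jpegs` and the `max_size` limit are hypothetical, and the sketch assumes each file is stored contiguously, so fragmented files or embedded thumbnails would be carved incorrectly.

```python
from typing import List

JPEG_HEADER = b"\xff\xd8\xff"  # start-of-image marker plus the first byte of the next marker
JPEG_FOOTER = b"\xff\xd9"      # end-of-image marker


def carve_jpegs(raw: bytes, max_size: int = 10 * 1024 * 1024) -> List[bytes]:
    """Return byte ranges that look like JPEG files, found by signature only.

    No file system metadata is consulted. Assumes contiguous storage
    (an assumption of this sketch).
    """
    carved = []
    pos = 0
    while True:
        start = raw.find(JPEG_HEADER, pos)
        if start == -1:
            break  # no further header in the dump
        end = raw.find(JPEG_FOOTER, start + len(JPEG_HEADER))
        if end == -1:
            break  # no footer after this header, so nothing more to carve
        end += len(JPEG_FOOTER)
        if end - start <= max_size:
            carved.append(raw[start:end])
            pos = end          # continue after the carved range
        else:
            pos = start + 1    # implausibly large, try the next header
    return carved
```

In practice the input would be read from a disk image, for example `carve_jpegs(open("disk.img", "rb").read())`, where `disk.img` is a placeholder path.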
15. File System Data Structure
Directory Entry
Filename       Start cluster
Recover.jpg    Cluster 50
Hello.txt      Cluster 53
File Allocation Table
Cluster number    Next cluster
50                51
51                52
52                EOF
53                57
Disk Data Area
Cluster 50    Recover.jpg content
Cluster 51    Recover.jpg content
Cluster 52    Recover.jpg content
Cluster 53    Hello.txt content
Cluster 54    Unknown
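The three structures on this slide can be walked in a few lines of code. The sketch below uses the cluster numbers from the slide; the cluster contents are placeholder bytes, and the names `cluster_chain` and `read_file` are hypothetical. It shows how metadata-based recovery starts at the directory entry and follows the File Allocation Table until it reaches EOF.

```python
from typing import Dict, List, Union

EOF = "EOF"  # end-of-chain marker, as written on the slide

# Values copied from the slide's Directory Entry and File Allocation Table.
directory: Dict[str, int] = {"Recover.jpg": 50, "Hello.txt": 53}
fat: Dict[int, Union[int, str]] = {50: 51, 51: 52, 52: EOF, 53: 57}

# Placeholder bytes standing in for real cluster contents (hypothetical).
data_area: Dict[int, bytes] = {
    50: b"JPG-part-1|",
    51: b"JPG-part-2|",
    52: b"JPG-part-3",
    53: b"hello",
}


def cluster_chain(start: int, table: Dict[int, Union[int, str]]) -> List[int]:
    """Follow next-cluster links from `start` until EOF or an unknown entry."""
    chain: List[int] = []
    cluster: Union[int, str] = start
    while cluster != EOF:
        if cluster in chain:
            raise ValueError("loop in allocation table")
        chain.append(cluster)
        if cluster not in table:
            break  # the chain leaves the part of the table we know about
        cluster = table[cluster]
    return chain


def read_file(name: str) -> bytes:
    """Concatenate the data of every cluster in the file's chain."""
    chain = cluster_chain(directory[name], fat)
    return b"".join(data_area.get(c, b"") for c in chain)
```

For `Recover.jpg` the chain is 50, 51, 52. For `Hello.txt` the chain continues to cluster 57, which the slide does not show, so the sketch stops there and returns only the data it has.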
20. Web Crawler
• To collect malware across the web, we use a crawler to automatically download files from the Internet
– Nutch + Hadoop
– Collects about 10,000 files/day
• The collected files are rarely malicious
– The crawler does not run JavaScript
– No vulnerability
– Password
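As a rough illustration of static link collection, the sketch below extracts download candidates from a page's HTML without executing any script. It is not Nutch and not the lab's crawler: the class `LinkCollector`, the function `candidate_downloads`, and the extension list are all assumptions made for this example. A collector like this only sees links present in the static markup, which is one plausible reason a crawler that does not run JavaScript finds few malicious files.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlparse

# Hypothetical list of file types worth downloading for analysis.
CANDIDATE_EXTENSIONS = (".exe", ".dll", ".zip", ".pdf", ".doc")


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags in static HTML."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def candidate_downloads(html: str, base_url: str) -> List[str]:
    """Return links whose URL path ends in one of the candidate extensions."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [
        url
        for url in collector.links
        if urlparse(url).path.lower().endswith(CANDIDATE_EXTENSIONS)
    ]
```

Matching on the URL path rather than the whole URL keeps query strings such as `?x=1` from hiding the extension.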
21. Malware Sharing Repository
• Many websites provide free malware sharing
– Attack Response
• Malc0de
• Malware Black List
• Malware Domain List
– Malware Sharing
• VXHeaven
• Malware Dump
• VirusSign
• …
22. Malware Profile
File Metadata
– File Name: “setup.exe” (original file name)
– MD5 (SHA1) Hash: ccffcb94e4058ed22a94881ba2d26f35
– File Size: 65024
– File Type: PE32 executable for MS Windows (GUI) Intel 80386 32-bit
– IsMalicious: True (some of our sources may upload benign files)
File Source
– Collection Date: 2013-11-21
– Collection Source: Email (one of Email/Disk/Crawler/Honeypot)
– Collection Location: bletchley@dsns.cs.nctu.edu.tw (email address, disk ID, URL, or IP of honeypot)
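A profile like the one above could be assembled when a sample arrives. The following is a minimal sketch, not Secmap's actual schema: the function `build_profile` and its dictionary keys are hypothetical names chosen to mirror the slide's fields. The File Type field is omitted because detecting it would need an external tool such as `file`.

```python
import hashlib
from datetime import date
from typing import Any, Dict, Optional

# The four collection sources listed on the slide.
COLLECTION_SOURCES = {"Email", "Disk", "Crawler", "Honeypot"}


def build_profile(
    data: bytes,
    file_name: str,
    source: str,
    location: str,
    collected_on: date,
    is_malicious: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build a profile record for one collected sample (hypothetical schema)."""
    if source not in COLLECTION_SOURCES:
        raise ValueError(f"unknown collection source: {source}")
    return {
        # File metadata
        "file_name": file_name,
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "file_size": len(data),
        "is_malicious": is_malicious,  # None until analysis decides
        # File source
        "collection_date": collected_on.isoformat(),
        "collection_source": source,
        "collection_location": location,
    }
```

Leaving `is_malicious` unset by default reflects the slide's remark that some sources may upload benign files, so a sample should not be labelled before it is analyzed.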
23. Executable Related Attributes
Behavior
– Network Trace Log: all communication flows
– Instruction Trace Log: all instructions executed
– Function Trace Log: all API function code
– Modified Files: all modified files
– Shellcode: shellcode identified in files (documents only)
– Modified Registry: all modified registry entries
– SSDT Hook: whether the SSDT was changed by this sample
– MBR Modified: whether this sample modified the MBR
– Screenshots
25. Conclusion
• Secmap is an infrastructure to automatically collect, analyze, and store malware samples
• Different ways to collect a wide range of samples
– Honeypot
– Disk
– Email
– Web