
Mastering The Super Timeline Log2timeline Style


Traditional timeline analysis can be extremely useful, yet it sometimes misses
important events that are stored inside files on the suspect system (log files, OS
artifacts). By depending solely on the traditional filesystem timeline, the investigator
misses context that is necessary to get a complete and accurate description of the
events that took place. To achieve this goal of enlightenment we need to dig deeper
and incorporate information found inside artifacts or log files into our timeline
analysis and create some sort of super timeline. These artifacts or log files could
reside on the suspect system itself or on another device, such as a firewall or a proxy.
This talk will focus on the tool log2timeline, which is a framework built to parse
different log files and artifacts to produce a super timeline in an easy, automatic
fashion, designed to assist investigators in their timeline analysis.

Quick introduction of who I am.

First of all I would like to talk about timeline analysis in general. Why would you bother
to do such a thing?
Well... first of all, knowing what happened in a particular timeframe, and more importantly
what happened around that time, can both greatly enhance your analysis and shorten
your investigation time, perhaps considerably.
Besides the obvious cases where the investigation revolves around a specific timeframe,
perhaps to analyse what was done on the system at a given time, timeline analysis has many
uses.
One such use is temporal proximity... let's say in a malware analysis where you've located a
piece of malware on a computer and you need to figure out the whole story. Temporal proximity,
or locating events that took place on the system around the time the malware was
downloaded or executed, can provide the investigator with tons of new leads, and possibly
provide the story of what happened on the system (and quite possibly how the malware
got there in the first place). It also points the investigator to files that he might not
otherwise have discovered.
One thing that has been pointed out a few times, by Harlan Carvey and others, is that if
you are sending a junior investigator out in the field to perform acquisition, he/she can
quickly extract a timeline from the system and ship it to the senior investigator while doing
the acquisition, so that by the time the actual image file arrives you will have a very good
picture of what happened and which files you need to analyse (a few KB or MB of text file
ships a whole lot faster than the typical run-of-the-mill image file).

Traditionally, timeline analysis meant pulling the timestamps from the filesystem itself.
Depending on the filesystem, you got timestamps that represented the last time a file
was accessed, modified, deleted or created.
Pulling all these available timestamps and putting them together in a timeline could
provide the investigator with valuable information about what happened on the
system... but it doesn't really tell the whole story...

What does this tell you?

OK, we probably know that the user Reed Richards most likely logged into this machine for
the first time on July the 9th, 2008, since his registry file was created then. OK, I'm making
some assumptions, but that is generally the case here... and then we can see that the last
time the file was modified is the 18th of June, 2009... which most likely corresponds to the
last time the user logged off the machine... but what happened in between? OK, I know the
filesystem timeline does contain a whole bunch of other entries that can tell at least part of
the story, but this is the only indication you will get from the user registry file.
Then we've got the last time the Event Log files were modified... what does that tell you?
Perhaps the last time the machine was shut down, since the system is nearly always writing
some events to the file, and it does so when the machine is shut down (at least normally). It
could also be the last time the power button was held down on the machine or the power plug
was removed from the machine.
If we are examining a filesystem timeline we can gain a lot of information about the system,
approximately what happened, and when... but we really need to open these files and examine
them further before really knowing what happened on the system.
But what if we extend the timeline? Think about it for a second... a lot of these files that we
are examining contain timestamps embedded in them: event logs contain timestamps of
when each entry is written, we've got the last write time of registry keys, and some of the data
contains timestamps as well, and the setupapi log has timestamps, as do so many other files that
we are routinely examining during our exams... what if we would just parse the timestamp
information from these files and include it in our timeline?

So to recap for a bit... although traditional timeline analysis provides us with valuable
information about the system, and it can be used to find files in temporal proximity, it
has its problems... we know that it doesn't really tell the whole story, but there are
other problems as well.
Filesystem timestamps can be easily changed using readily available tools such as
touch, or timestomp from the Metasploit anti-forensics project. They are highly
sensitive to changes, since only the most recent value is recorded (for example, only the
last access time), and last but certainly not least, they are not always updated. Some
operating systems do not update all of the available timestamps, either by default, like
Vista and Windows 7, which do not update the last access time of files, or because of
changed settings in fstab or in the registry. This is usually done for performance reasons
(it takes time to update timestamps).
So we cannot always trust the timestamps from the filesystem, but there must be
other solutions, other timestamps that are perhaps more resilient...

So why not extend the timeline so it includes timestamps extracted from other
sources?
Include information from Event Log files, syslog, the registry, metadata from various
documents... whatever source of timestamps is available.
Some timestamp sources are very difficult to alter, others easier. But combining
timestamps from multiple sources makes changing them all considerably more
difficult and can provide a method to verify that timestamps haven't been modified.
Perhaps we could also represent the timeline visually to make it easier to understand and
analyse. This certainly can help in some situations, perhaps not in others.
And finally we could of course create some sort of magic tool that can just do the
analysis for us... perhaps using the forensicator pro?

There are basically two approaches to extending the timeline... either manually
adding timestamps to the timeline or using some sort of tool to do it.
The manual approach is not very efficient and requires the investigator to know both
the location and the format of the file itself... so the manual approach is quite time
consuming.
Then of course you have the possibility of using tools... most of the tools out there
have been created to extract timestamps from one single file format... there have
been a few attempts at making a more general tool, but most of them are designed to
extract timestamps from a single source, requiring the investigator to know the
location of the files as well as to know and use several tools to build the
timeline.
Harlan Carvey has created several tools that can be used to extract timestamps from
files. Then you've got a few other tools, such as Ex-Tip by Mike Cloppert, CFTL by
Jens Olsson, System Combo Timeline by Don Weber and finally Aftertime by the
Netherlands Forensic Institute.

There are of course some problems with adding the timestamps, both using tools and
manually... not all files are stored in an ASCII format; some are in a binary format that
is often not easily understandable.
Another problem is that timestamps are not always stored in the same format: they
might be stored as seconds since January 1st, 1970 or in 100 ns steps since January
1st, 1601.
Then you've got some timestamps that are stored using a fixed timezone, like UTC,
while others use the local timezone of the computer.
So there are many variations that the investigator needs to know about before he adds the
timestamps to his timeline.
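
To make the format problem concrete, here is a minimal sketch in Python (chosen for
illustration only; log2timeline itself is written in Perl) that normalizes the two epoch
styles mentioned above, plus a local-time value, to UTC. The function names are
hypothetical, and the local-time helper assumes a fixed UTC offset with no daylight saving
handling.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def from_unix(seconds):
    """Seconds since 1970-01-01 00:00:00 UTC -> timezone-aware datetime."""
    return UNIX_EPOCH + timedelta(seconds=seconds)


def from_filetime(ticks):
    """100 ns steps since 1601-01-01 00:00:00 UTC -> timezone-aware datetime."""
    # Integer division down to microseconds avoids float rounding on large values.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


def local_to_utc(naive_local, utc_offset_hours):
    """Attach a fixed offset to a local wall-clock time and convert it to UTC."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return naive_local.replace(tzinfo=tz).astimezone(timezone.utc)
```

Once every source is reduced to the same UTC representation, entries from different
artifacts can be compared and sorted directly.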

So... what can be done?


Enter log2timeline... which is a tool written to address this problem.
log2timeline is essentially a framework written to extract and display timestamp data
from various sources. It then outputs the timeline in various formats as well,
depending on the needs of the investigator.
The tool is written on Mac OS X (my workstation) and tested on both Mac OS X
and Linux, more specifically Ubuntu. The tool has been successfully used on most
*NIX variants and there have even been some who have used it on Windows, although
not everything functions correctly there. Slight changes to the scripts are needed so
that they can fully work in the Windows environment.
So to sum things up, log2timeline is basically a framework written to extend the
timeline into a super timeline, and the best part... it is capable of doing so
automatically...

The tool is basically built around four main modules: a front-end that is the
interface to work with the framework, an input module that parses a given
file, an output module that prints the output and finally shared libraries that
contain code that is shared between the modules.
The front-end takes care of parsing parameters passed to the tool, reading
files and directories and calling other modules. It also makes some
modifications to the timestamp object, which contains the actual timestamp
and other information about it.
The input module starts by verifying that it can really parse the given file, and
then proceeds with parsing each available timestamp within it, producing a
timestamp object that is then passed on to the front-end for further
processing.
The output modules take the timestamp object and produce an output from
it that is then printed to a file, standard out or a database.
The shared libraries mostly exist to reduce code repetition; that is, if a piece of code
is used by more than one module it is better to store it in a shared source that
can be used by every module.

Currently the tool provides three front-ends.

log2timeline: the original front-end, and perhaps the main one. This
is a CLI front-end of the tool, capable of parsing a single artifact.
glog2timeline: an extremely simple GUI written in Perl GTK, more a proof of concept
than anything else.
timescanner: a CLI front-end that recursively goes through a directory, parsing every
file it is capable of. This is essentially the automatic portion of the framework.

> log2timeline -f list

-------------------------------------------------------------------------
Name            Version   Description
-------------------------------------------------------------------------
chrome          0.2       Parse the content of a Chrome history file
evt             0.2       Parse the content of a Windows 2k/XP/2k3 Event Log
evtx            0.3       Parse the content of a Windows Event Log File (EVTX)
exif            0.4       Extract metadata information from files using ExifTool
ff_bookmark     0.2       Parse the content of a Firefox bookmark file
firefox2        0.2       Parse the content of a Firefox 2 browser history
firefox3        0.8       Parse the content of a Firefox 3 history file
iehistory       0.5       Parse the content of an index.dat file containg IE history
iis             0.4       Parse the content of a IIS W3C log file
isatxt          0.3       Parse the content of a ISA text export log file
mactime         0.4       Parse the content of a body file in the mactime format
mcafee          0.2       Parse the content of a log file
opera           0.1       Parse the content of an Opera's global history file
oxml            0.4       Parse the content of an OpenXML document (Office 2007 documents)
pcap            0.4       Parse the content of a PCAP file
pdf             0.2       Parse some of the available PDF document metadata
prefetch        0.7       Parse the content of the Prefetch directory
recycler        0.5       Parse the content of the recycle bin directory
restore         0.8       Parse the content of the restore point directory
setupapi        0.4       Parse the content of the SetupAPI log file in Windows XP
sol             0.4       Parse the content of a .sol (LSO) or a Flash cookie file
squid           0.4       Parse the content of a Squid access log (http_emulate off)
tln             0.4       Parse the content of a body file in the TLN format
userassist      0.7       Parses the UserAssist Active Desktop key (part of NTUSER.DAT file)
win_link        0.6       Parse the content of a Windows shortcut file (or a link file)
xpfirewall      0.3       Parse the content of a XP Firewall log

> log2timeline -o list

-------------------------------------------------------------------------
Name            Version   Description
-------------------------------------------------------------------------
beedocs         0.1       Output timeline using tab-delimited file to import into BeeDocs
cef             0.2       Output timeline using the ArcSight Commen Event Format (CEF)
cftl            0.6       Output timeline in a XML format that can be read by CFTL
csv             0.5       Output timeline using CSV (Comma Separated Value) file
mactime         0.5       Output timeline using mactime format
mactime_l       0.6       Output timeline using legacy version of the mactime format (version 1.x and 2.x)
simile          0.4       Output timeline in a XML format that can be read by a SIMILE widget
sqlite          0.6       Output timeline into a SQLite database
tln             0.5       Output timeline using H. Carvey's TLN format
tlnx            0.1       Output timeline using H. Carvey's TLN format in XML

The first version of the tool was published last July; it consisted of a very simple
framework built around the mactime output. That is, the timestamp object, which we will
discuss further in a short while, was built around the mactime output.
This has caused some problems for the tool, most notably when using other outputs that
contain different fields than are defined in the mactime output. Information was either
repeated or, quite simply, unnecessary information was added to the output.
This has been changed in version 0.50, which will be published after this talk (hopefully, if I
manage to complete it before the summit).
The largest change in the tool is mostly in the internal structure: changes to the timestamp
object as well as to how the output modules are built up. So the output of every output module
is slightly to considerably changed in the new version.
Another problem with the tool so far was the speed. It took from an hour to a few hours to
scan through an image file. When each operation of the tool was examined further, it turned
out that the tool spent the vast majority of its execution time in the verification phase. So
the focus was on minimizing the time spent in the verification phase, leading to considerable
speed improvements in the tool.
The first optimization pass reduced the run time on a test data set from 57 minutes to 23 (a
~60% reduction). This makes the tool even more useful, at least in my opinion.

So, just a brief overview of the structure of log2timeline.

I've briefly mentioned the structure before, but let's explore it in a bit more detail.
The front-end takes care of initialising all the work, loading the appropriate input and output
modules as well as parsing parameters. It then accepts the timestamp object from the input
module, modifies it according to the parameters, and passes it on to the output module.
The input module does most of the work. It has a verification phase that is built to properly
identify the file in question. That is, the verification phase should not return a positive match
unless it is determined that the input module can properly parse and extract timestamps
from that source.
After verifying the structure of the file, the front-end calls a function called load_line in the
input module to check if there are more lines or timestamps in the file that haven't been
parsed. If there are, the front-end invokes the parsing function of the input module, which
creates the timestamp object and returns a reference to it. The front-end then adds some
information to the timestamp object, along with possible modifications to it, before passing it
on to the output module for processing.
So the flow of execution is handed to the output module for each timestamp or line in the
file that is being parsed. That way the tool does not store all lines in memory: it reads a single
line or a timestamp, outputs it, and then moves on to the next one.
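
The control flow described above can be sketched as follows. This is an illustrative
Python analogue, not the tool's Perl code: only the name load_line comes from the talk,
while LineInput, verify, parse_line, run and the toy "epoch message" line format are
assumptions made for the example.

```python
class LineInput:
    """Toy input module: one 'epoch message' pair per text line (hypothetical format)."""

    def __init__(self, handle):
        self.handle = handle
        self.current = None

    def verify(self):
        # Accept the source only if its first line starts with an integer.
        first = self.handle.readline()
        self.handle.seek(0)
        return first.split(" ", 1)[0].isdigit()

    def load_line(self):
        # Advance to the next line; False means nothing is left to parse.
        self.current = self.handle.readline()
        return bool(self.current)

    def parse_line(self):
        # Turn the current line into a small timestamp record.
        epoch, desc = self.current.rstrip("\n").split(" ", 1)
        return {"time": int(epoch), "desc": desc}


def run(input_module, emit, source_name):
    """Front-end loop: verify once, then parse and output one timestamp at a time."""
    if not input_module.verify():
        return 0
    count = 0
    while input_module.load_line():
        t_line = input_module.parse_line()
        t_line["extra"] = {"filename": source_name}  # the front-end adds context
        emit(t_line)  # handed to the output side immediately, nothing is buffered
        count += 1
    return count
```

Passing the output function in as emit keeps the loop independent of the output format,
which mirrors the separation between input and output modules.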

The timestamp object is perhaps the core of the framework.

It mainly consists of a hash value in Perl that contains all the necessary information to
describe a timestamp.
It contains values that are used by the output modules to properly build an output.
The basic building block is the time value, which is a hash in itself. Each key in the hash
contains a timestamp or a value, a type, which is a description of the timestamp (e.g. last
modified, created, time written, etc.) and a legacy value, which is a reference to the MACB
value.
The timestamp object then has two distinct description fields, desc and short, the latter
being essentially a shorter version of the desc field. The desc field contains the text that is
the main part of the printed output, the actual parsed information from the log file.
The source field is a reference to the TLN output format; it is a short, distinct field describing
the source of the timestamp (e.g. FILE, REG, EVT, WEBHIST, ...). There is also a source type,
which is a more descriptive field of the source (e.g. Event Log, System Registry, ...).
The notes field is optional and contains additional information that can be added to the event
if the output module supports a notes field.
The extra field is a hash value itself, and contains all additional information that can describe
the timestamp. These are mostly optional fields, although some are always filled out, like the
filename field and inode.
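
As a rough picture of that layout, here is a Python dictionary standing in for the Perl
hash. The field names desc, short, notes and extra follow the description above; the key
spellings source and sourcetype, the integer keys of the time hash, and the use of a MACB
letter string for the legacy value are assumptions made for illustration.

```python
def make_timestamp(value, type_, legacy, desc, short, source, sourcetype,
                   filename, inode, notes=None):
    """Build one timestamp record as nested dicts (hypothetical layout)."""
    return {
        # Each entry of "time" holds one timestamp with its type and MACB hint.
        "time": {0: {"value": value, "type": type_, "legacy": legacy}},
        "desc": desc,
        "short": short,
        "source": source,          # short source label, e.g. FILE, REG, EVT
        "sourcetype": sourcetype,  # longer description of the source
        "notes": notes,            # optional
        "extra": {"filename": filename, "inode": inode},
    }


def describe(t_line):
    """Render one text line per time value, roughly as an output module might."""
    lines = []
    for key in sorted(t_line["time"]):
        t = t_line["time"][key]
        lines.append("{}|{}|{}|{} ({})".format(
            t["value"], t_line["source"], t_line["sourcetype"],
            t_line["desc"], t["type"]))
    return lines
```

Because one record can carry several time values, a single parsed artifact entry can
expand into more than one line of the final timeline.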

Now I'm going to demonstrate how log2timeline actually parses an example file to
extract the timestamps from it.
Let's examine the OpenXML standard, the new Microsoft Office standard for Word
and other Office documents.
An OpenXML file is nothing more than a ZIP archive that stores several XML files,
among others.
The basic file structure for a Word document is the following three folders:
+ _rels: a folder that contains the .rels file, describing the structure of the document,
that is, all the relationships.
+ docProps: the metadata of the document, or the document properties.
+ word: the actual Word document, or the content of the document.

log2timeline starts by verifying that the file is actually a ZIP file by examining the header
value.
ZIP files have a magic value of 0x04034b50, so every OpenXML file should have that
too. But this only verifies that we are dealing with a ZIP file, and although all
OpenXML files are ZIP files, not all ZIP files are OpenXML.
So we need to do further validation of the file before proceeding. log2timeline then
reads the header of the ZIP file and examines the file name variable inside the
header.
If this is an OpenXML file, the file name in the header equals [Content_Types].xml.
Only after verifying that the file name is correct will log2timeline proceed with the
parsing.
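
The two checks can be sketched in a few lines of Python. This is an illustration of the
idea, not the module's Perl code: the function name looks_like_openxml is hypothetical,
and the field layout is the standard 30-byte ZIP local file header, in which the magic
value is stored little-endian and the file name length sits just before the extra-field
length.

```python
import struct

ZIP_MAGIC = 0x04034B50
# Local file header: signature, version, flags, method, time, date,
# crc32, compressed size, uncompressed size, name length, extra length.
HEADER = struct.Struct("<IHHHHHIIIHH")


def looks_like_openxml(handle):
    """Return True if the stream starts with a ZIP entry named [Content_Types].xml."""
    raw = handle.read(HEADER.size)
    if len(raw) < HEADER.size:
        return False  # too short to hold a ZIP local file header
    fields = HEADER.unpack(raw)
    if fields[0] != ZIP_MAGIC:
        return False  # not a ZIP file at all
    name = handle.read(fields[9])  # fields[9] is the file name length
    return name == b"[Content_Types].xml"
```

Only the first few dozen bytes are read, which keeps verification cheap when the check is
run against every file on a mounted image.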

The framework is built around several steps. After verification we have the preparation
phase, whose purpose varies greatly between modules. The purpose of the
preparation phase is either to simply open the file for parsing or to actually go
through the entire file, find every available timestamp in it and put them in an
array or a hash for further processing.
In the case of an OpenXML file, not every line in the metadata is actually a date field,
so the preparation phase takes care of parsing through the entire metadata and
storing it inside a hash value.
The input module starts by extracting the relationship file located in _rels/.rels and
parses it to find all available document properties files.

After finding each available document property file, the input module parses through
each one of them and inserts each piece of metadata information into a hash value.
When the module has completed its parsing the next phase begins, where a function
called load_line is called to get each timestamp line that is available from the
document.
All timestamps are stored using the ISO-8601 format inside the document properties files,
so the load_line function simply examines each value of the metadata and
determines if it contains a valid date. If it finds one, it proceeds with parsing the
timestamp and creating a timestamp object.
The timestamp object contains some information generally found inside a metadata
document file, such as the author's name, etc.
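
A compact Python sketch of the preparation and date-detection steps follows. It is an
approximation under stated assumptions: the function names are hypothetical, namespaces
are stripped for brevity, and only ISO-8601 values of the form YYYY-MM-DDTHH:MM:SSZ are
recognized as dates.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def property_files(archive):
    """Targets listed in _rels/.rels that point into docProps/."""
    root = ET.fromstring(archive.read("_rels/.rels"))
    targets = [el.get("Target", "").lstrip("/") for el in root.iter()
               if el.tag.endswith("}Relationship") or el.tag == "Relationship"]
    return [t for t in targets if t.startswith("docProps/")]


def load_metadata(archive):
    """Preparation phase: flatten every property element into one dict."""
    meta = {}
    for name in property_files(archive):
        for el in ET.fromstring(archive.read(name)).iter():
            text = (el.text or "").strip()
            if text:
                meta[el.tag.split("}")[-1]] = text  # drop the XML namespace
    return meta


def date_entries(meta):
    """Yield (field, datetime) for each value that parses as an ISO-8601 UTC stamp."""
    for field, text in meta.items():
        try:
            stamp = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
        except ValueError:
            continue  # not a date field, e.g. the author's name
        yield field, stamp.replace(tzinfo=timezone.utc)
```

Here archive is expected to be an open zipfile.ZipFile. Non-date values such as the
author stay in the metadata dict, so they can still be attached to each timestamp that is
emitted.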

The final step involves actually printing the information that was found. After parsing each
available timestamp entry, a timestamp object is created and sent to the output
module, which forms an appropriate output and prints it out before control is
sent back to the input module for further processing.
This is an example output line.

So how to install the tool...

There are basically three possibilities.
First of all, just use a distro that comes with the tool pre-installed... the only one that I know
of is the SIFT workstation, that is, the new version, 2.0.
The second method is to install from the source code. This used to be the only method of
installing the tool and has its ups and downs really. The tool is written in Perl and depends upon
several Perl libraries, some of which do not have a dedicated package in most repositories. So the
use of the CPAN shell in Perl is often needed to install some dependencies.
And the third, and really the preferred, method of installation (if you are not using SIFT) is to
use repositories to install the tool. I recently created an apt-get repository for Ubuntu machines
to easily install the tool and all its dependencies. For those Perl libraries that do not have a
package in the default repository, I've built them and included them there, so there shouldn't be
any need to install additional libraries, just issue
apt-get install log2timeline-perl
after installing the repository (see instructions on the web site).
CERT.org has also created a Fedora repository, so it should be easy to issue
yum install log2timeline
The reason why this is the preferred method is that when you do an upgrade of your
system, the tool gets updated too. This makes sure you are running the latest version at all times.

I've briefly discussed the tool timescanner, but not really gone into any details about
it. So to begin with I would like to mention the differences between timescanner and
log2timeline.
log2timeline, the main front-end, is designed to parse and extract timestamps from a
single file only. So you need to know both the format of the file and its
location to be able to use the tool.
timescanner, on the other hand, is a recursive scanner that uses all of the other
modules of the framework. timescanner is designed to parse through a mount point
(or any other directory) and extract timestamps from all available files within it.
timescanner works in such a way that you can either let it use all of the input modules
at once, so each file gets tested against every input module available, or you can
select which modules you would like to be used.

So how does timescanner work?

Basically it begins by working out which input modules should be used;
it then loads all the selected modules into a hash.
It then goes recursively through a directory that is passed to the tool as a parameter
and tries to verify each file/directory within it against all the loaded input modules.
If a file is successfully verified, the tool will proceed with parsing it and then move on to
the next one. That is, if a file is verified, it will not be passed on to the other modules that
haven't been tested yet (this is new; the tool used to test each file against every input
module, irrespective of whether it had already been parsed).
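
The scanning loop with its "first verified module wins" rule can be sketched like this in
Python. The function scan and the (verify, parse) pair convention are hypothetical, and
for brevity the sketch only visits files, whereas timescanner also tests directories
(for example the Prefetch directory).

```python
import os


def scan(root, modules):
    """Walk root and hand each file to the first module that verifies it.

    modules maps a module name to a (verify, parse) pair of callables that
    each take a file path; parse returns an iterable of records.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            for name, (verify, parse) in modules.items():
                if verify(path):
                    for record in parse(path):
                        yield name, path, record
                    break  # verified and parsed: skip the remaining modules
```

Breaking out after the first match is what saves the verification time for every module
that comes later in the list.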

The default behaviour of timescanner is the same as it was before, which is to test each file
and directory against every available input module.
But through parameters it can be changed to use only selected modules to test against. The
parameter -f is used to select which modules should be used, which should not be, or
which list of modules is chosen.
To make it easier to select which modules to use, a list file was designed that can be edited
or created at will. The list files that come with the tool are: web, winvista and winxp (in this
version).
Each list file is simply a text file that contains the names of the modules that are chosen. To
get a list of all available modules and list files, the option -f list can be issued.
Other options are
timescanner -z local -d . -f chrome,firefox2
That is, use -f list, where list is the names of the modules to use, separated by commas.
Another option is to prepend the list with a -, indicating that you want every module
used except the ones on the list
timescanner -z local -d . -f -exif,firefox2
This will use all available input modules except the firefox2 and exif modules.
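
The three ways of choosing modules can be captured in one small resolver. This Python
sketch is an illustration of the selection rules as described above, not timescanner's
own logic; the function name, its arguments and the example contents of a list file are
assumptions.

```python
def select_modules(spec, available, lists):
    """Resolve a -f style value to the module names to load (a sketch).

    spec      -- None (use everything), a list-file name, "a,b" or "-a,b"
    available -- every installed input module name
    lists     -- list-file name -> module names it contains
    """
    if not spec:
        return sorted(available)
    if spec in lists:
        # A named list file: keep only the modules that are actually installed.
        return sorted(set(lists[spec]) & set(available))
    exclude = spec.startswith("-")
    names = {n for n in spec.lstrip("-").split(",") if n}
    unknown = names - set(available)
    if unknown:
        raise ValueError("unknown module(s): " + ", ".join(sorted(unknown)))
    chosen = set(available) - names if exclude else names
    return sorted(chosen)
```

For example, with chrome, exif, firefox2 and firefox3 installed, the value
"-exif,firefox2" resolves to chrome and firefox3.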

Then to go through the process of creating a super timeline.

To begin with, the image file has to be mounted. Let's assume that we are dealing
with an NTFS image file in raw mode (a la dd). So to mount it we issue the command:
sudo mount -t ntfs-3g -o ro,loop,show_sys_files,noexec,nodev /cases/vista/vista_ntfs.dd /mnt/windows_vista_mount
After mounting the image file we need to run timescanner against it. Let's run the
tool so that it uses all available input modules.
timescanner -z EST5EDT -d /mnt/windows_vista_mount -w /cases/vista/bodyfile --log /cases/vista/timescanner.log
Then to add the filesystem timestamps to the timeline we use the fls tool from the
Sleuthkit.
fls -r -m C: /images/windowsforensics/vista_ntfs.dd >> /cases/vista/bodyfile
Now we can add information to the timeline from any other tool that we would like
to, such as regdump.pl or any other tool capable of outputting a timeline in the
mactime body format (or really any other format that can then be parsed again using
log2timeline).
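
Since the combined body file is plain text, it is easy to inspect with a few lines of
code. The Python sketch below assumes the pipe-delimited, eleven-field body layout used
by The Sleuth Kit 3.x (MD5, name, inode, mode, UID, GID, size, atime, mtime, ctime,
crtime); the function names are hypothetical, and a file name that itself contains a pipe
character would need extra handling.

```python
FIELDS = ("md5", "name", "inode", "mode", "uid", "gid", "size",
          "atime", "mtime", "ctime", "crtime")


def parse_body_line(line):
    """Split one pipe-delimited body line into a dict; the four times become ints."""
    parts = line.rstrip("\n").split("|")
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    record = dict(zip(FIELDS, parts))
    for key in ("atime", "mtime", "ctime", "crtime"):
        record[key] = int(record[key])
    return record


def events(record):
    """Expand one record into (epoch, label, name) tuples, skipping zero times."""
    labels = (("mtime", "M"), ("atime", "A"), ("ctime", "C"), ("crtime", "B"))
    return [(record[k], tag, record["name"]) for k, tag in labels if record[k]]
```

Sorting the tuples from every line gives a quick chronological view without running
mactime, which is handy for sanity-checking what the different tools appended.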

I wanted to quickly go over some of the different output modules to give a quick
overview of what the tool is capable of doing.
So far the tool has three different output modules for visualizing the timeline. It
has six different modules for either plain ASCII (line by line) or XML output.
It then has one module for outputting the timeline directly to a database, a SQLite
database to be more precise. We will not be going into the structure of the SQLite
database here, since there are currently no tools that can read and work with the
output.

So let's go over the visualization part a bit more. As I said before, there are currently
three available modules for visual output.
They are: SIMILE widgets, CyberForensics TimeLab and BeeDocs. I will show
examples of each one.

To begin with we have SIMILE widgets, which are essentially web widgets for visualizing
temporal data.
The output module creates either an XML file or a JSON file that can be read with a
SIMILE widget.
The problem with this method is that although log2timeline does output the
timeline into an XML file or a JSON file, the HTML file still has to be created and
properly tuned before the timeline can actually be visualized.
An example HTML file has been included with the tool to make that process easier,
but it still requires quite a manual approach to set it up correctly.

The CyberForensics TimeLab is another project designed with the same goal as
log2timeline. The tool differs from log2timeline in that it is commercial, not
yet available, and the current beta version does not support as many file formats as
log2timeline does.
So to extend the CFTL tool, log2timeline can output an XML file, which is the default
storage of temporal data that CFTL uses. The XML file can be directly opened by CFTL
and read just like any other data that the tool produces.

BeeDocs is another visualization tool, written to work on Mac OS X. The tool is
designed to visually represent timelines, both in 2D and 3D.
The tool can import timelines using a tab-delimited file, such as the files created by
the beedocs output module of log2timeline.

There are certainly some benefits to visually representing the timeline, but there are
flaws as well.
The main benefits might be that a visual representation is often easier to understand,
given that it is done properly, and it is easier to explain things to non-technical people using
visual timelines. Therefore visual timelines can be great to include in reports.
The problems, however, are that tools like that are often extremely slow when dealing
with the magnitude of events that are generally present in timelines like the ones
we deal with. And it can be difficult to find events of interest, since visual
representation is often more suitable for finding spikes or high volumes of entries... but
the problem is that in most cases there are only a handful of entries that are
of real interest. That is to say, we are usually looking for a few straws in a very large
haystack, instead of searching for spikes in the timeline (the rule of Least Frequency
of Occurrence, or LFO, as described by Peter Silberman).
So it can be difficult to represent the timeline visually in a way that properly assists the
investigator in analysing it, but visualization can be of great value in
reports after the analysis. Including a limited set of events in a visualization tool can
therefore be of great value... which is not to say that visualization, if done properly, could
not greatly enhance timeline analysis itself...

So perhaps the most common method of analysing timelines is simply to use the
good old spreadsheet application.
There are two options for exporting the timeline into a spreadsheet. One is to use the
mactime output format and then use the tool mactime (part of the Sleuthkit) to
convert the mactime output to a CSV file, which can be easily opened by any
spreadsheet application.
The other possibility is to use the CSV output module directly from log2timeline and
open the result using any spreadsheet application.
The only problem with using the CSV output module is that its output is not currently sorted
by date (since we print each line as it is parsed, making sorting more
difficult).
The benefit of using a spreadsheet application is that simple filters can be
easily created, and it is easy to hide certain columns or rows that are not of
interest to the investigation.
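
Sorting the CSV after the fact is straightforward as long as the whole file fits in
memory. The Python sketch below is a generic illustration: the column name "date" and the
MM/DD/YYYY HH:MM:SS format are assumptions, not the documented layout of the csv output
module, so both are parameters to adjust.

```python
import csv
import io
from datetime import datetime


def sort_csv(text, column="date", fmt="%m/%d/%Y %H:%M:%S"):
    """Return CSV text with the rows ordered by the datetime found in `column`."""
    reader = csv.DictReader(io.StringIO(text))
    # Sorting needs every row at once, unlike the line-by-line output of the tool.
    rows = sorted(reader, key=lambda r: datetime.strptime(r[column], fmt))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames or [],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Parsing the dates before comparing them matters here, because a plain text sort of
month-first dates would not be chronological across years.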

There are of course other methods to analyse the timeline. Perhaps use the
Mandiant Highlighter, or just the simple method of combining vim, less and grep to
analyse it, using perhaps a CSV file.
Or really use whatever method suits you.

Remember these timestamps from the beginning of the presentation?

Well... if we now add the information that we've discussed so far... how
would this timeline look?

Well... it consists of several new lines; it went from four lines to 2568.
So this picture will only show you a small portion of it, but from this output you can
still see that the super timeline
