Splunk Quick Reference Guide PDF
CONCEPTS
Index-time and Search-time
During index-time processing, data is read from a source on a host and is
classified into a source type. Timestamps are extracted, and the data is parsed
into individual events. Line-breaking rules are applied to segment the events for
display in search results. Each event is written to an index on disk, where it is
later retrieved with a search request.
When a search starts, indexed events are retrieved from disk. Fields are extracted
from the event's raw text. These events can then be transformed using the
Splunk Enterprise search processing language to build reports and visualizations
that can be added to dashboards.
Indexes
When data is added, Splunk Enterprise parses it into individual events, extracts
the timestamp, applies line-breaking rules, and stores the events in an index.
You can create new indexes for different inputs. By default, data is stored in the
"main" index. Events are retrieved from one or more indexes during a search.
Events
An event is a set of values associated with a timestamp. It is a single entry of
data and can have one or multiple lines. An event can be a text document, a
configuration file, an entire stack trace, and so on. A single entry in a web
activity log is a typical example of an event.
Host
A host is the name of the physical or virtual device where an event originates. The
host field provides an easy way to find all data originating from a specific device.
Fields
Fields are searchable name and value pairings that distinguish one event from
another because not all events have the same fields and field values. Using fields,
you can write tailored searches to retrieve the specific events that you want and
use the search commands. As Splunk Enterprise processes events at index-time
and search-time, it extracts fields based on configuration file definitions and
user-defined patterns.
Tags
Tags are aliases to particular field values. You can assign one or more tags
to any field name/value combination, including event types, hosts, sources,
and source types. Use tags to group related field values together or track
abstract field values such as IP addresses or ID numbers by giving them more
descriptive names.
Data model
A data model is a hierarchically structured search-time mapping of semantic
knowledge about one or more datasets. It encodes the domain knowledge
necessary to build a variety of specialized searches of those datasets. These
specialized searches are in turn used by Splunk Enterprise to generate reports for
Pivot users. Data model objects represent different datasets within the larger set
of data indexed by Splunk Enterprise.
Pivot
Pivot refers to the table, chart, or data visualization you create using the Pivot
Editor. The Pivot Editor enables users to map attributes defined by data model
objects to a table or chart data visualization without having to write the searches
to generate them. Pivots can be saved as reports and used to power dashboards.
Search
Search is the primary way users navigate data in Splunk Enterprise. You can write
a search to retrieve events from an index, use statistical commands to calculate
metrics and generate reports, search for specific conditions within a rolling time
window, identify patterns in your data, predict future trends, and so on. Searches
can be saved as reports and used to power dashboards.
Reports
Reports are saved searches and pivots. You can run reports on an ad hoc basis,
schedule them to run at regular intervals, or set a scheduled report to generate
alerts when the results of its runs meet particular conditions. Reports can be
added to dashboards as dashboard panels.
Dashboards
Dashboards are made up of panels that contain modules such as search boxes,
fields, charts, tables, forms, and so on. Dashboard panels are usually hooked up
to saved searches or pivots. They can display the results of completed searches
as well as data from backgrounded real-time searches.
Indexer
An indexer is the Splunk Enterprise instance that indexes data. The indexer
transforms the raw data into events and stores the events into an index. The
indexer also searches the indexed data in response to search requests.
Subsearches
A subsearch is an argument to a command. A subsearch runs its own search and
returns those results to the parent command as the argument value. A subsearch
is contained in square brackets. For example, the following search uses a
subsearch to find all syslog events from the user that had the last login error:
Time Modifiers
Instead of using the custom time ranges in Splunk Web, you can specify a time
range to retrieve events inline with your search by using the latest and earliest
search modifiers. The relative times are specified with a string of characters that
indicate the amount of time (integer and unit) and, optionally, a "snap to" time
unit. The syntax for time modifiers is:
[+|-]<integer><unit>@<snap_time_unit>
The search "error earliest=-1d@d latest=-h@h" retrieves events containing
"error" that occurred between midnight at the start of yesterday and one hour
ago, snapped down to the start of that hour.
Time units are specified as second (s), minute (m), hour (h), day (d), week
(w), month (mon), quarter (q), and year (y). The time integer defaults to 1. For
example, "m" is the same as "1m".
Snapping rounds the time amount down to the latest time not after the specified
time. For example, if it is 11:59:00 and you "snap to" hours (@h), the time will be
11:00:00, not 12:00:00. You can also "snap to" specific days of the week using
@w0 for Sunday, @w1 for Monday, and so on.
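The snapping rule can be illustrated outside Splunk. The Python below is only a rough sketch of the behavior described above, not how Splunk Enterprise implements time modifiers. The helper name snap and the sample clock time are assumptions made for this example, and only the h, d, and w0 snap units are handled.

```python
from datetime import datetime, timedelta

def snap(t: datetime, unit: str) -> datetime:
    """Round t down to the start of its hour ("h"), day ("d"), or week ("w0", Sunday)."""
    if unit == "h":
        return t.replace(minute=0, second=0, microsecond=0)
    midnight = t.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "d":
        return midnight
    if unit == "w0":
        # Python counts Monday as 0 and Sunday as 6; step back to the latest Sunday.
        days_since_sunday = (t.weekday() + 1) % 7
        return midnight - timedelta(days=days_since_sunday)
    raise ValueError(f"unsupported snap unit: {unit}")

# Hypothetical "now": Tuesday 2024-05-14 at 11:59:00.
now = datetime(2024, 5, 14, 11, 59, 0)

snapped_hour = snap(now, "h")                   # @h
earliest = snap(now - timedelta(days=1), "d")   # -1d@d
latest = snap(now - timedelta(hours=1), "h")    # -h@h
week_start = snap(now, "w0")                    # @w0

print(snapped_hour, earliest, latest, week_start, sep="\n")
```

With this sample time, the offset is applied first and the result is then rounded down, so -h@h lands on 10:00:00 rather than 11:00:00.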
DESCRIPTION
chart/timechart
dedup
eval
fields
head/tail
lookup
rename
replace
rex
search
sort
stats
top/rare
transaction
Optimizing Searches
The key to fast searching is to limit the data that needs to be pulled off disk to an
absolute minimum, and then to filter that data as early as possible in the search
so that processing is done on the minimum data necessary.
Partition data into separate indexes if you will rarely perform searches across
multiple types of data. For example, put web data in one index and firewall
data in another.
EVAL FUNCTIONS
The eval command calculates an expression and puts the resulting value into a field (e.g., "... | eval force = mass * acceleration"). The following
table lists the functions eval understands, in addition to basic arithmetic operators (+ - * / %), string concatenation (e.g., '... | eval name = last . ",
" . first'), and boolean operations (AND OR NOT XOR < > <= >= != = == LIKE).
FUNCTION            DESCRIPTION                                 EXAMPLES
abs(X)                                                          abs(number)
case(X,"Y",...)
ceil(X)             Ceiling of a number X.                      ceil(1.9)
cidrmatch("X",Y)                                                cidrmatch("123.132.32.0/25",ip)
coalesce(X,...)                                                 coalesce(null(), "Returned val", null())
exact(X)                                                        exact(3.14*num)
exp(X)              Returns e^X.                                exp(3)
floor(X)                                                        floor(1.9)
if(X,Y,Z)
isbool(X)                                                       isbool(field)
isint(X)                                                        isint(field)
isnotnull(X)                                                    isnotnull(field)
isnull(X)                                                       isnull(field)
isnum(X)                                                        isnum(field)
isstr(X)                                                        isstr(field)
len(X)                                                          len(field)
like(X,"Y")                                                     like(field, "foo%")
ln(X)                                                           ln(bytes)
log(X,Y)            Returns the log of the first argument X     log(number,2)
                    using the second argument Y as the base.
                    Y defaults to 10.
lower(X)                                                        lower(username)
ltrim(X,Y)
match(X,Y)                                                      match(field, "^\d{1,3}\.\d$")
max(X,...)                                                      max(delay, mydelay)
md5(X)                                                          md5(field)
min(X,...)                                                      min(delay, mydelay)
mvcount(X)                                                      mvcount(multifield)
mvfilter(X)                                                     mvfilter(match(email, "net$"))
mvindex(X,Y,Z)      Returns a subset of the multivalued field   mvindex(multifield, 2)
                    X from start position (zero-based) Y to Z
                    (optional).
mvjoin(X,Y)                                                     mvjoin(foo, ";")
now()                                                           now()
null()                                                          null()
nullif(X,Y)                                                     nullif(fieldA, fieldB)
pi()                                                            pi()
pow(X,Y)            Returns X^Y.                                pow(2,10)
random()                                                        random()
relative_time(X,Y)                                              relative_time(now(),"-1d@d")
replace(X,Y,Z)
round(X,Y)                                                      round(3.5)
rtrim(X,Y)
searchmatch(X)
split(X,"Y")                                                    split(foo, ";")
sqrt(X)                                                         sqrt(9)
strftime(X,Y)                                                   strftime(_time, "%H:%M")
strptime(X,Y)                                                   strptime(timeStr, "%H:%M")
substr(X,Y,Z)                                                   substr("string", 1, 3) + substr("string", -3)
time()                                                          time()
tonumber(X,Y)                                                   tonumber("0A4",16)
tostring(X,Y)
trim(X,Y)
typeof(X)
upper(X)                                                        upper(username)
urldecode(X)                                                    urldecode("http%3A%2F%2Fwww.splunk.com%2Fdownload%3Fr%3Dheader")
validate(X,Y,...)
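To make a few of the eval examples concrete, here is a rough Python sketch of what they compute. It is an approximation for illustration, not Splunk's implementation. The helper functions coalesce, substr, and cidrmatch are hypothetical stand-ins written for this example, None stands in for null(), the sample IP addresses and epoch value are made up, and the strftime line assumes UTC rather than the searching user's time zone.

```python
import ipaddress
import math
from datetime import datetime, timezone
from urllib.parse import unquote

def coalesce(*values):
    """Return the first argument that is not None (None stands in for null())."""
    for value in values:
        if value is not None:
            return value
    return None

def substr(s, start, length=None):
    """Substring with a 1-based start; a negative start counts from the end."""
    begin = start - 1 if start > 0 else len(s) + start
    end = len(s) if length is None else begin + length
    return s[begin:end]

def cidrmatch(cidr, ip):
    """True when the address ip falls inside the CIDR block."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)

results = {
    "ceil": math.ceil(1.9),
    "floor": math.floor(1.9),
    "pow": pow(2, 10),
    "coalesce": coalesce(None, "Returned val", None),
    "substr": substr("string", 1, 3) + substr("string", -3),
    "tonumber": int("0A4", 16),
    "urldecode": unquote("http%3A%2F%2Fwww.splunk.com%2Fdownload%3Fr%3Dheader"),
    "cidr_inside": cidrmatch("123.132.32.0/25", "123.132.32.100"),
    "cidr_outside": cidrmatch("123.132.32.0/25", "123.132.32.200"),
    # 50700 seconds after the epoch is 14:05 UTC.
    "strftime": datetime.fromtimestamp(50700, tz=timezone.utc).strftime("%H:%M"),
}

for name, value in results.items():
    print(f"{name}: {value}")
```

The substr helper shows why the two-part example in the table yields the whole word: positions 1 to 3 give "str" and the last three characters give "ing".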
Common statistical functions used with the chart, stats, and timechart commands. Field names can
be wildcarded, so avg(*delay) might calculate the average of the delay and xdelay fields.
FUNCTION    DESCRIPTION
avg(X)      Returns the average of the values of field X.
count(X)    Returns the number of occurrences of the field X. To indicate a specific field value to match, format X as eval(field="value").
dc(X)       Returns the count of distinct values of the field X.
first(X)    Returns the first seen value of the field X. In general, the first seen value of the field is the chronologically most recent instance of field.
last(X)     Returns the last seen value of the field X.
list(X)     Returns the list of all values of the field X as a multi-value entry. The order of the values reflects the order of input events.
max(X)      Returns the maximum value of the field X. If the values of X are non-numeric, the max is found from lexicographic ordering.
median(X)   Returns the middle-most value of the field X.
min(X)      Returns the minimum value of the field X. If the values of X are non-numeric, the min is found from lexicographic ordering.
mode(X)     Returns the most frequent value of the field X.
perc<X>(Y)  Returns the X-th percentile value of the field Y. For example, perc5(total) returns the 5th percentile value of a field "total".
range(X)    Returns the difference between the max and min values of the field X.
stdev(X)    Returns the sample standard deviation of the field X.
stdevp(X)   Returns the population standard deviation of the field X.
sum(X)      Returns the sum of the values of the field X.
sumsq(X)    Returns the sum of the squares of the values of the field X.
values(X)   Returns the list of all distinct values of the field X as a multi-value entry. The order of the values is lexicographical.
var(X)      Returns the sample variance of the field X.
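The difference between the sample and population variants (stdev versus stdevp) is easy to miss. The Python sketch below computes several of the listed statistics with the standard statistics module on a made-up list of "delay" values. It is meant only to illustrate the definitions, not to reproduce Splunk's output formatting or its handling of non-numeric values.

```python
import statistics

# Hypothetical values of a "delay" field across eight events.
delays = [2, 4, 4, 4, 5, 5, 7, 9]

summary = {
    "avg": statistics.mean(delays),
    "count": len(delays),
    "dc": len(set(delays)),                  # distinct count
    "mode": statistics.mode(delays),
    "range": max(delays) - min(delays),
    "stdev": statistics.stdev(delays),       # sample: divides by n - 1
    "stdevp": statistics.pstdev(delays),     # population: divides by n
    "sum": sum(delays),
    "sumsq": sum(d * d for d in delays),
    "var": statistics.variance(delays),      # sample variance
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

On this data the population standard deviation is 2, while the sample standard deviation is slightly larger because it divides by n - 1 instead of n.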
SEARCH EXAMPLES
Filter Results
    | dedup host
    | regex _raw="(?<!\d)10\.\d{1,3}\.\d{1,3}\.\d{1,3}(?!\d)"
Add Fields
    | eval velocity=distance/time
    | delta count as countdiff
Group Results
Cluster results together, sort by their "cluster_count" values, and then return
the 20 largest clusters (in data size).
    | cluster t=0.9 showcount=true | sort limit=20 -cluster_count
    | transaction host cookie maxspan=30s maxpause=5s
    | transaction clientip startswith="signon" endswith="purchase"
Filter Fields
Keep the "host" and "ip" fields, and display them in the order: "host", "ip".
    | fields + host, ip
    | fields - host, ip
Modify Fields
Rename the "_ip" field as "IPAddress".
    | rename _ip as IPAddress
    | replace *localhost with mylocalhost in host
Order Results
Return the first 20 results.
    | head 20
    | reverse
    | tail 20
Multi-Valued Fields
    | nomv recipients
    | makemv delim="," recipients | top recipients
    | mvexpand recipients
    | fields EventCode, Category, RecordNumber | mvcombine delim="," RecordNumber
    | eval to_count = mvcount(recipients)
    | eval recipient_first = mvindex(recipient,0)
    | eval netorg_recipients = mvfilter(match(recipient, "\.net$") OR match(recipient, "\.org$"))
    | eval newval = mvappend(foo, "bar", baz)
    | eval orgindex = mvfind(recipient, "\.org$")
Reporting
Return events with uncommon values.
    | anomalousvalue action=filter pthresh=0.02
    | chart max(delay) by size bins=10
    | outlier
    | stats dc(host)
    | stats avg(*lay) by date_hour
    | timechart span=1m avg(CPU) by host
    | timechart count by host
    | rare url
Lookup Tables
    | lookup usertogroup user output group
    | outputlookup users.csv
    | inputlookup users.csv
REGULAR EXPRESSIONS
Regular Expressions are useful in multiple areas: search commands regex and
rex; eval functions match() and replace(); and in field extraction.
REGEX          NOTE                        EXAMPLE                        EXPLANATION
\s             white space                 \d\s\d                         digit space digit
\S             not white space             \d\S\d                         digit non-whitespace digit
\d             digit                       \d\d\d-\d\d-\d\d\d\d           SSN
\D             not digit                   \D\D\D                         three non-digits
\w             word character              \w\w\w                         three word chars
\W             not a word character        \W\W\W                         three non-word chars
[...]          any included character      [a-z0-9#]                      any char that is a thru z, 0 thru 9, or #
[^...]         no included character       [^xyz]                         any char but x, y, or z
*              zero or more                \w*                            zero or more word chars
+              one or more                 \d+                            integer
?              zero or one                 \d\d\d-?\d\d-?\d\d\d\d         SSN with dashes being optional
|              or                          \w|\d                          word or digit character
(?P<var> ...)  named extraction            (?P<ssn>\d\d\d-\d\d-\d\d\d\d)  pull out a SSN and assign to 'ssn' field
(?: ... )      logical or atomic grouping  (?:[a-zA-Z]|\d)                alphabetic character OR a digit
^              start of line               ^\d+                           line begins with at least one digit
$              end of line                 \d+$                           line ends with at least one digit
{...}          number of repetitions       \d{3,5}                        between 3 and 5 digits
\              escape                      \[                             escape the [ char
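Several of these patterns can be tried outside Splunk. The sketch below uses Python's re module, which accepts the same (?P<name> ...) named-group syntax for the constructs shown here. Splunk uses its own regular expression engine, so treat this only as a way to experiment with the patterns. The sample log line and values are made up for the example.

```python
import re

# Named extraction: capture the SSN into a group called "ssn".
ssn_pattern = re.compile(r"(?P<ssn>\d\d\d-\d\d-\d\d\d\d)")
# Same shape, but the "?" quantifier makes each dash optional.
optional_dashes = re.compile(r"\d\d\d-?\d\d-?\d\d\d\d")

line = "user=alice ssn=123-45-6789 status=ok"  # hypothetical event text

match = ssn_pattern.search(line)
extracted = match.group("ssn") if match else None

checks = {
    "starts_with_digit": re.search(r"^\d+", "404 not found") is not None,
    "ends_with_digit": re.search(r"\d+$", "retries=3") is not None,
    "five_digits_ok": re.fullmatch(r"\d{3,5}", "12345") is not None,
    "six_digits_ok": re.fullmatch(r"\d{3,5}", "123456") is not None,
    "ssn_without_dashes": optional_dashes.fullmatch("123456789") is not None,
    "escaped_bracket": re.search(r"\[", "[01/Jan/2024]") is not None,
}

print(extracted)
for name, value in checks.items():
    print(f"{name}: {value}")
```

Only the six-digit check fails, because {3,5} allows at most five repetitions when the whole string must match.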
Time
%H
%I
%M
%S
%N
%p          AM or PM
%Z
%z
%s
Days
%d
%j
%w
%a
%A
Months
%b
%B
%m
Years
%y
%Y          Year (2008)
Examples
%Y-%m-%d    1998-12-31
%y-%m-%d    98-12-31
%b %d, %Y
%B %d, %Y
q|%d %b '%y = %Y-%m-%d|
Splunk Inc.
250 Brannan Street
San Francisco, CA 94107
www.splunk.com