Splunk Interview Questions and Answers Complete Material
Ananya Technologies
Splunk Questions and Answers
Q. Tell me about yourself?
My name is Prasad (say your own name). I have a total of 6.4 years of experience in the IT industry: 4.4 years in Splunk administration and development, and the remaining 2 years in middleware technologies such as WebLogic Server and WebSphere Application Server. Presently I am working at CDK Global (say your own company), located in Hyderabad.
I have good experience with Splunk components such as the Forwarder, Indexer, Search Head, Deployment Server, Cluster Master, License Master and Deployer.
I have good experience with the data-age concept: hot, warm, cold, frozen and thawed buckets.
I have good experience in onboarding data from different data sources, such as agent-based collection, rsyslog feeds, HTTP Event Collector, the DB Connect app, REST APIs and SFTP.
For onboarding a new source I first collect the server IP, server name, index name and sourcetype, then go to the deployment apps directory:
cd /opt/splunk/etc/deployment-apps/
In the inputs.conf monitor stanza we open a square bracket, write monitor, then a colon and slashes, then the source file path and file name, and close the bracket.
After that I mention index = test: first I validate the data in the test index, and once everything looks fine I move it from the test index to the production index.
Other settings that can go in the stanza include _TCP_ROUTING (to route the data to a particular indexer group defined in outputs.conf) and disabled = 1 (to temporarily disable the input).
After that we need to add the inputs app to the server class and check the "Restart SplunkD" checkbox.
After that we need to reload the server class through this command:
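One common form of the reload, run from $SPLUNK_HOME/bin on the deployment server (the server class name is a placeholder):
./splunk reload deploy-server -class my_inputs_serverclass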
In props.conf we set the time prefix (TIME_PREFIX) and time format (TIME_FORMAT) for the sourcetype, and we define the line breaker using regex tokens (for example \w for a word character or \s for whitespace) in the LINE_BREAKER setting.
I push this props.conf code to the heavy forwarder and then reload the heavy forwarder server class.
Then I check the events again. Once everything is fine, I move the data from the test index to the prod index and reload the inputs server class once more.
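As a rough sketch, the props.conf stanza pushed to the heavy forwarder could look like this (the sourcetype name, timestamp format and break pattern are examples, not the real ones for this source):
[my_sourcetype]
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# break events on newlines and do not merge lines back together
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false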
KV store port: 8191
Log in to the Deployment Server and check for the deployment client, i.e. the Universal Forwarder, and look at its phone-home interval. If the phone-home interval is much longer than usual, e.g. 24 hours ago or 3 days ago, it means the machine is no longer reporting to Splunk.
Look through splunkd.log for diagnostic and error metrics. We can also go to the Monitoring Console app and check the resource utilization (CPU, memory, etc.) of the different server components.
We can also install the Splunk-on-Splunk app from splunkbase.com and monitor the health of the different Splunk instances.
There are a lot of techniques: base searches for dashboards, filtering as early as possible, avoiding wildcards, and remembering that inclusion is always better than exclusion. For example, search specifically for status=50* rather than using | search NOT status=50*.
Use data models, which can be reused within a lot of other saved searches, dashboards and reports.
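For instance, a search that filters as early as possible (index and sourcetype names are illustrative):
index=web sourcetype=access_combined status=50* | stats count by status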
After that, data rolls from the hot bucket to the warm bucket; warm buckets are also searchable. Later, data rolls from the warm bucket to the cold bucket, which is still searchable. After that, data rolls from cold to frozen; frozen data is not searchable. The indexer deletes frozen data by default, but you can also archive it. Archived frozen data can later be restored into a thawed bucket.
✔ outputs.conf
✔ transforms.conf
✔ props.conf
✔ indexes.conf
✔ web.conf
✔ limits.conf
✔ authentication.conf
✔ authorization.conf
✔ collections.conf
props.conf
[WinEventLog:System]
TRANSFORMS-null = null_queue_filter
transforms.conf
[null_queue_filter]
REGEX = (?i)^EventCode=(592|593)
DEST_KEY = queue
FORMAT = nullQueue
Then click New Token. Here we can specify the Name, Source name override and Description, and check "Enable indexer acknowledgement". Once the token is generated, we can also configure the token on the back end:
cd /opt/splunk/etc/deployment-apps/splunk_httpinputs/local
[http://bs_pcf_lab_usw2]
disabled = 0
a. stanza name
b. index name
c. token
5. We need to provide the source team with the URL, the token and the index name; for example, the Kafka team will set up the token on their end.
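Putting those pieces together, a sketch of the HEC stanza in that app's inputs.conf (the index name and token value are placeholders; the real token is the GUID Splunk generates):
[http://bs_pcf_lab_usw2]
disabled = 0
index = pcf_lab
token = <generated-token-guid>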
_time – The timestamp of the event (when no timestamp can be extracted, Splunk falls back to the time the data was indexed).
Host – The host (machine or device) from which the event originates.
Sourcetype – The format/classification of the data ingested into Splunk. Example, csv, json, xml, txt etc.
Then I will increase initCrcLength from 1024 to 2048, reload the server class and check the data again. If the data is coming, fine; if not, I will increase initCrcLength from 2048 to 4096, and keep checking like this. Finally, I will use crcSalt = <SOURCE>.
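A sketch of how those settings sit in the monitor stanza (path and index are placeholders):
[monitor:///var/log/myapp/app.log]
index = test
# read more of the file header when computing the CRC
initCrcLength = 2048
# include the source path in the CRC so renamed/rotated files are not skipped
crcSalt = <SOURCE>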
Then we need to check in the monitor stanza whether the source file path is correct or not, then
Or
3. Also, you may need to check for skipped searches. Maybe during the skipped-search window you were running into your maxconcurrent limit, which is why this search was skipped multiple times and why you did not receive the alert.
INFO SavedSplunker -
savedsearch_id="nobody;SystemManage;SVaccount-authfail-emailsend", user="abcd",
app="", savedsearch_name="", priority=, status=skipped, reason="maxconcurrent limit
reached", scheduled_time=1498555860, window_time=0
If you see the above info message in the logs, you should increase the limit for the maximum number of concurrent searches in limits.conf.
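For example, the relevant settings in limits.conf look like this (the values shown are purely illustrative, not recommendations):
[search]
# base number of concurrent historical searches, plus extra searches allowed per CPU
base_max_searches = 10
max_searches_per_cpu = 2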
Use data models, which can be reused within a lot of other saved searches, dashboards and reports.
Normally, when an indexer cluster member holding searchable copies goes down, the remaining _raw copies of the data get converted into searchable (tsidx) copies. The master node takes care of this bucket fixing, i.e. it tries to keep the cluster in line with the replication and search factors you have set up.
Streamstats gives a running calculation for any field specified, while also keeping the original value of that field unchanged. For example, the price field has the values 20, 30, 40, 50. After you run | streamstats sum(price) as Sum_Price by Product_Name, the first row of Sum_Price will be 20, the second will be 50 (20+30), the third will be 90 (50+40) and so on. If you later run | table price Product_Name, you can still see the actual values of the price field.
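A sketch of that search (the index name is illustrative):
index=sales | streamstats sum(price) as Sum_Price by Product_Name | table Product_Name price Sum_Price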
A lookup is a knowledge object in Splunk. Within our SPL code, if we need to reference an external file we can do that using a lookup. Lookup files can be added to Splunk by going to Settings > Lookups > Lookup table files.
Lookups are also useful for performing several types of joins, such as inner and outer joins.
Q.fillnull command?
Replaces null values with a specified value.
Example:
For the current search results, fill all empty fields with NULL.
| fillnull value=NULL
OR
NOT
To identify which indexer is down we can run a simple search: index=_internal source="*splunkd.log*" "*Connection failure*". By running this you will get the indexer IP that is having connection failures.
- Apps are full-fledged packages within Splunk Enterprise. They contain options for creating dashboards, reports, alerts, lookups, eventtypes, tags and all other kinds of knowledge objects. Add-ons, on the other hand, perform a limited set of functions; for example, the Windows add-on can only get data from Windows-based systems, Unix-based add-ons can only get data from specific Unix-based servers, and so on.
We have installed apps like DB Connect and add-ons like the Windows app for Infrastructure and the Unix add-on for getting data from Unix-based systems.
The knowledge bundle is the bundle of knowledge objects (lookups, props, transforms and so on) that a search head distributes to its search peers (the indexers) so that distributed searches run correctly. In a search head cluster, the captain coordinates this: it replicates configuration changes to the other members and pushes the knowledge bundle to the peers whenever a change takes place on one or more search heads.
Transforms.conf
[discardingdata]
REGEX = (?i)error
DEST_KEY = queue
FORMAT = nullQueue
Props.conf
[Sourcetype]
TRANSFORMS-abc = discardingdata
Disk usage = compressed raw data (roughly 10% of the original data) + index files (roughly 10% to 110% of the raw data size), so indexed data typically occupies much less space on disk than the original data volume.
repFactor = auto
repFactor = 0 (to exclude an index from replication)
Recently I created three alerts: when Windows CPU usage is greater than 95% we need to trigger an alert; when Windows free memory is less than 10% we need to trigger an alert; and when Windows free disk space is less than 10% we need to trigger an alert.
Second phase: data moves from the forwarder to the indexer. The indexer is the heart of Splunk; it parses, sorts and stores the data for analysis.
Third phase: the end user runs a query on the search head, but the data itself comes from the indexers.
⮚ inputlookup is used to read data from a lookup file to enrich search results, and outputlookup is used to write search results out to a lookup file.
⮚ For example: | inputlookup IDC_FORD_Trend.CSV
⮚ stats – This command produces summary statistics of the existing fields in your search results and stores them as values in new fields.
⮚ eventstats – It is the same as the stats command except that the aggregation results are added inline to every event, and only if the aggregation is applicable to that event. It computes the requested statistics like stats does, but aggregates them onto the original raw data.
⮚ When you use the timechart command, the x-axis represents time. The y-axis can
be any other field value, count of values, or statistical calculation of a field value
⮚ Example :-
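⮚ For instance (index and sourcetype names are illustrative):
index=web sourcetype=access_combined | timechart span=1h count by status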
Data models: you have a large amount of unstructured data which is critical to your business, and you want an easy way to access that information without writing complex search queries. This can be done using data models, as they present your data in a sorted and hierarchical manner. The key benefits of data models are:
⮚ Step 3: Specify a ‘Title’ to your data model. You can use any character in the title,
except an asterisk. The data model ‘ID’ field will get filled automatically as it is a
unique identifier. It can only contain letters, numbers, and underscores. Spaces
between characters are not allowed.
⮚ Step 4: Choose the ‘App’ you are working on currently. By default, it will be
‘home’.
⮚ Step 5: Add a ‘Description’ to your data model.
⮚ Step 6: Click ‘Create’ and open the new data model in the Data Model Editor.
⮚ Index-time processing is the processing of data that happens before the event
is actually indexed. Examples of this are data fields which get extracted as and
when the data comes into the index like source, host and timestamp.
⮚ Following are the processes that occur during index time:
● 1.Default field Source type customization
● 2.Index-time field extraction
● 3.Event timestamping
● 4.Event line breaking
● 5.Event segmentation
1. Search – Search command in splunk is used to search for data which is stored in
Key-Value pairs.
2. Stats –We use stats command to gather statistics about any field or set of fields.
The output will always be shown in a tabular format.
3. Rename – The rename command is used to give a field or set of fields another name.
4. Table – The table command is used to show the fields in a tabular format.
Commands –
1. Table – The table command is used for displaying multiple field name(s).
Example, index="ajeet" sourcetype="csv" |table host, sourcetype, source.
2. Dedup – Dedup command is used for removing duplicate field values.
Example, index="ajeet" sourcetype="csv"
| dedup host, sourcetype, source
| table host, sourcetype, source
Operators –
18 –May - 2019
29-MAY-2019
Chart – The chart command is used to visualise the data in 2-D. Using the chart command we can group by using only 2 fields.
Example, index=raja1
Aggregate Functions –
Commands –
Addtotals col=t – This command will give you both the row and the column total.
Example, index=raja1
price=*
| table price
| addtotals col=t
Addcoltotals – Addcoltotals will give you the column total of a particular field.
Example, index=raja1
price=*
| table price
| addcoltotals
Addcoltotals row=t – Will give both the row and column total.
Example, index=raja1
price=*
| table price
| addcoltotals row=t
## Addtotals ##
The addtotals command computes the arithmetic sum of all numeric fields for each
search result
## Addcoltotals ##
The addcoltotals command appends a new result to the end of the search result set.
The result contains the sum of each numeric field or you can specify which fields to
summarize.
Appendcols
- Appends the fields of the subsearch results with the input search results.
Appendpipe
- Appends the result of a subpipeline, applied to the current result set, to the end of the search results.
latest: Specify the latest time for the time range of your search.
Example:
Q.transaction?
transaction command finds transactions based on events that meet various constraints.
example:-
Top Command:- Returns the most common values of a field, for example the 10 most common values.
Q.fillnull command?
Replaces null values with a specified value.
Example:
For the current search results, fill all empty fields with NULL.
| fillnull value=NULL
Q.Head command?
Example:
|head limit=10
Q.geostats command?
generate statistics to display geographic data and summarize the data on maps.
Example:
Q.iplocation command?
Extracts location information from IP addresses by using 3rd-party databases.
Example:
Q.transpose command?
Returns the specified number of rows (search results) as columns.
Example:
Q.Join?
Combines the results of two queries. If I have one query A and another query B and I want to combine both A and B, we can use the join command.
$splunk_home/var/log/splunk/searches.log
An app is a way of localising your data and preventing people from other application teams from making use of it.
Apps in Splunk can be created directly using the Splunk GUI.
Add-ons perform a specific set of functions, like the Windows and Linux add-ons for getting data from Windows and Linux servers respectively.
Add-ons are normally imported from splunkbase.com, a repository for Splunk-supported apps and add-ons.
Compared to apps, add-ons only exhibit limited functionality and they can be grouped for one-to-many use. For example, a Linux add-on installed on a universal forwarder can only receive data from Linux servers, not Windows servers.
Search-time extracted fields are the fields which show no results when you try to group by them using the tstats command.
=========================
1.Explain about your roles and responsibilities in your
organization?
- Taking care of scheduled maintenance activities
- Creating user based knowledge objects like, Dashboards, reports, alerts, static and
dynamic lookups, eventtypes, doing field extractions.
Pointer: Explain types of data we can ingest in splunk . Common ones they expect us to
answer is flat files(logfile, textfile etc) and syslog onboarding. You can also talk about
Database and csv onboarding.
The universal forwarder doesn't have any GUI, so everything we need to configure is done by logging in to the UF with admin credentials. Once you are in the /opt/splunkforwarder/bin path, you need to run the following command to add the indexer IP/hostname it will forward data to: ./splunk add forward-server indexerIP:9997
Once this is done, we need to add the source file names that need to be monitored. An example has already been explained in #4: ./splunk add monitor /var/log/introspection/resource_usage.log -index Ashu
Outputs.conf - all the indexer names added with ./splunk add forward-server indexerIP:9997 will be visible under outputs.conf.
Pointers: inputs.conf , outputs.conf
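Put together, the sequence on the forwarder looks roughly like this (the indexer host, port and index are the example values used above):
cd /opt/splunkforwarder/bin
# point the forwarder at the indexer
./splunk add forward-server indexerIP:9997
# monitor a file and send it to the Ashu index
./splunk add monitor /var/log/introspection/resource_usage.log -index Ashu
./splunk restart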
When you open transforms.conf, these are the CONFIG parameters which are
configurable -
DEST_KEY
REGEX
FORMAT
7. Have you worked on data transformation? For example, can you achieve the scenarios below?
a. How will you mask the sensitive data before it is indexed?
Using a SEDCMD class in props.conf with the replacement parameter - replacement is a string that replaces the regular expression match.
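A minimal sketch of such a masking rule in props.conf, assuming a hypothetical sourcetype and an SSN-like pattern:
[my_sourcetype]
# replace anything that looks like 123-45-6789 with XXX-XX-XXXX before indexing
SEDCMD-mask_ssn = s/\d{3}-\d{2}-\d{4}/XXX-XX-XXXX/g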
c. How can you filter out unwanted data from a data source and drop it before it gets indexed, so that I save on licensing cost?
DEST_KEY
REGEX
FORMAT
Yes, using the unencrypted SYSLOG service and a universal forwarder. Alternatively, we
can also use daemon processes like, Collectd and Statsd to transmit data using UDP.
Sourcetype is used as a data classifier whereas source contains the exact path from
where the data needs to be onboarded
Login to any particular indexer - Go to settings > under System > Licensing > Add
License File (Mainly an XML based licensing file is added)
Licensing cost is calculated on the entire data volume that is ingested. Compressing has nothing to do with license usage; compression is done to save disk space.
When we set up server classes and assign them an app or set of apps, we need to restart splunkd, a system-level process. Once this is done, any new app updates will be sent automatically to all servers.
_audit - All search related information - Scheduled searches as well as adhoc searches
_introspection - All system wide data, including memory and CPU data
CIM is common information model used by splunk. CIM acts as a common standard used
by data coming from different sources.
Data normalization is already explained above
- Dispatch directory is for running all scheduled saved searches and adhoc searches.
● System Requirements -
● Updates to dashboards, reports, new saved searches created are always subject
to Captain - The captain takes care of all of this
Before we Configure search head clustering, we need to configure a deployer because
Deployer IP is required to create a search head cluster
DEPLOYER - (in server.conf)
[shclustering]
pass4SymmKey = password
shcluster_label = cluster1
Restart the Splunk since change has been done in .conf files
While setting up Search head clustering we first have to create a Deployer as above -
When it comes to setting up a SH clustering, the first thing we need to do is login to that
particular Search head and run the command by going to bin as follows -
Scheduled saved searches owned by users who are no longer part of the Splunk ecosystem or have left the company are called orphaned searches. This happens because there is no role associated within Splunk for that particular user.
With recent upgrade of Splunk to 8.0.1 the problem with orphaned searches has almost
resolved. But still if you see the orphaned searches warning appearing under Messages
in your search head you can follow this guideline on how to resolve.
https://docs.splunk.com/Documentation/Splunk/8.0.2/Knowledge/Resolveorphanedsearc
hes
TSIDX files are time-series index files. When raw data comes into an index, Splunk builds tsidx files from it; the tsidx files are what make the data searchable from the search head.
- Hot bucket - Contains newly incoming data. Once the hot bucket reaches a particular size or age threshold, it rolls to a warm bucket. This bucket is searchable from the search head.
- Warm bucket - Data rolled from the hot bucket comes to this bucket. This bucket is not actively written to but is still searchable. Once the indexer reaches the maximum number of warm buckets it maintains, the oldest warm buckets roll to cold.
- Cold - Contains data rolled from warm buckets. The data is still searchable from the search head; cold buckets are typically kept on cheaper storage. After cold buckets reach their retention threshold, they roll to frozen.
Each user authenticated to Splunk has a limited search quota - normal users have around 25 MB whereas power users have around 50-125 MB. Once this threshold is exceeded for a particular time, the user's searches will start getting queued.
The phone-home interval is the time interval at which a particular deployment client polls your deployment server, e.g. every 2 seconds, every 10 seconds, etc.
A server class is a group of servers of the same flavour or from the same geographic location. For example, to group all Windows-based servers we create a Windows-based server class; similarly, to group all Unix-based servers we create a Unix-based server class.
Token is a placeholder for a set of values for a particular variable. Example, Name =
$Token1$. Now here Name field can have multiple values like, Naveen, Ashu, Ajeet etc.
The value that a particular token will hold completely depends upon the selection. Tokens
are always enclosed between $$, like the example above.
Check if the Forwarder host name/Ip Address is not under the blacklist panel in
Deployment server.
Dashboard is a kind of view which contains different panels and panel shows up different
metrices.
Indexes.conf
This is the done with this command and already explained above -
./splunk init shcluster-config -auth admin:password -mgmt_uri
IPaddressofSHinHTTPSFollowedByMGMTPort -replication_port 9000 -replication_factor 3
-conf_deploy_fetch_url DeployerIpaddress:8089 -secret passwordofdeployer
-shcluster_label clusterName
With maxDataSize = auto_high_volume, Splunk sets the maximum bucket size to 10 GB on a 64-bit OS (the plain auto setting is 750 MB).
The third layer of buckets refers to the cold bucket, which is still searchable.
we need to go to settings > Access controls > Roles > YourUserRole > Indexes and
check if the user has read access to index.
Universal forwarders are basically agents which are installed on the client, i.e., servers
from where we are getting the data. They don't have any pre-processing capability.
Heavy forwarders in turn have pre-processing, routing and filtering capabilities.
SV Reddy Answers
1.Explain about your roles and responsibilities in your
organization -
- Taking care of scheduled maintenance activities
- Creating user based knowledge objects like, Dashboards, reports, alerts, static and
dynamic lookups, eventtypes, doing field extractions.
- Troubleshooting issues related to production environment like Dashboard not
showing up the data - In this case we basically check from the raw logs if the format
of the data has changed or not.
- Been part of mass password update activities for database-related inputs, because if the database password changes we need to change the connection password created in our DB Connect application.
Once this is done, we need to add the source file names that need to be monitored. An example has already been explained in #4: ./splunk add monitor /var/log/introspection/resource_usage.log -index Ashu
Outputs.conf - All the indexers name added using this path, /splunk add forward-server
indexerIP:9997 will be visible under outputs.conf
Pointers: inputs.conf , outputs.conf
When you open transforms.conf, these are the CONFIG parameters which are
configurable -
DEST_KEY
REGEX
FORMAT
Using a SEDCMD class in props.conf with the replacement parameter - replacement is a string that replaces the regular expression match.
c. How can you filter out unwanted data from a data source and drop it before it gets indexed, so that I save on licensing cost?
DEST_KEY
REGEX
FORMAT
Login to any particular indexer - Go to settings > under System > Licensing > Add
License File (Mainly an XML based licensing file is added)
We have installed apps like, DBCONNECT and Add ons like, Windows app for
infrastructure and Unix apps for getting data from Unix based systems
Pointer: Talk about Splunk App for DB connect and Splunk app/addon for linux/unix .
These are 2 common apps/addons that you should know.
Would be a good deal if you can talk about Splunk app for AWS (cloud integration)
15. Questions on regex will be asked – A common one would be – could you tell me the
regex for IP address.?
Regular expressions in Splunk are handled with the help of the rex, regex and erex commands. No one will ever ask you to recite the regex for IP addresses.
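For reference, a typical field extraction with rex looks like this (the field name is illustrative, and the pattern is deliberately simple rather than a strict IPv4 validator):
... | rex field=_raw "(?<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})" | stats count by src_ip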
When we set up server classes and assign them an app or set of apps, we need to restart splunkd, a system-level process. Once this is done, any new app updates will be sent automatically to all servers.
_audit - All search related information - Scheduled searches as well as adhoc searches
_introspection - All system wide data, including memory and CPU data
● System Requirements -
● Updates to dashboards, reports, new saved searches created are always subject
to Captain - The captain takes care of all of this
DEPLOYER -
Distributes apps and other configurations to SH cluster members
Can be colocated with deployment server if no. of deployment clients < 50
Can be colocated with Master node
Can be colocated with Monitoring console
Can service only one SH cluster
The cluster uses security keys to communicate/authenticate with SH members
[shclustering]
pass4SymmKey = password
shcluster_label = cluster1
Restart the Splunk since change has been done in .conf files
While setting up Search head clustering we first have to create a Deployer as above -
When it comes to setting up a SH clustering, the first thing we need to do is login to that
particular Search head and run the command by going to bin as follows -
Scheduled saved searches with invalid owners are considered "orphaned". They cannot
be run because Splunk cannot determine the roles to use for the search context.
- User - Can only read from splunk artifacts. Example, Reports, dashboards, alerts and
so on. Don't have edit permissions.
- Power user - Can create dashboard, alerts, reports and have Edit permissions
- Admin- Have access to all production servers, can do server restarts, take care of
maintenance activities and so on. Power user and normal user role are subsets of Admin
role
stats
chart
timechart
rare
top etc
TSIDX files are time-series index files. When raw data comes into an index, Splunk builds tsidx files from it; the tsidx files are what make the data searchable from the search head.
-> Hot bucket - Contains newly incoming data. Once the hot bucket reaches a particular size or age threshold, it rolls to a warm bucket. This bucket is searchable from the search head.
-> Warm bucket - Data rolled from the hot bucket comes to this bucket. This bucket is not actively written to but is still searchable. Once the indexer reaches the maximum number of warm buckets it maintains, the oldest warm buckets roll to cold.
-> Cold - Contains data rolled from warm buckets. The data is still searchable from the search head; cold buckets are typically kept on cheaper storage. After cold buckets reach their retention threshold, they roll to frozen.
-> Frozen - Once the data is in frozen buckets, it is either archived or deleted. In this stage the data is no longer searchable.
To switch to a static captain, reconfigure each cluster member to use a static captain:
1. On the member that you want to designate as captain, run this CLI command:
splunk edit shcluster-config -mode captain -captain_uri <URI>:<management_port>
-election false
2. On each non-captain member, run this CLI command:
splunk edit shcluster-config -mode member -captain_uri <URI>:<management_port>
-election false
Note the following:
● Single-site cluster with loss of majority, where you converted the remaining
members to use static captain. Once the cluster regains a majority, you should
convert the members back to dynamic.
● Two-site cluster, where the majority site went down and you converted the members
on the minority site to use static captain. Once the majority site returns, you should
convert all members to dynamic.
In the scenario of a single-site cluster with loss of majority, you should revert to dynamic
mode once the cluster regains its majority:
A dynamic captain is one which keeps changing with the passage of time. To influence dynamic captain election we edit server.conf and set the parameter preferred_captain = true on the member we prefer.
Each user authenticated to Splunk has a limited search quota - normal users have around 25 MB whereas power users have around 50-125 MB. Once this threshold is exceeded for a particular time, the user's searches will start getting queued.
InputQueue
Parsing Queue
Merging Queue
Typing Queue
Indexing Queue
Null Queue
Server class are group of servers coming from the same flavour or same geographic
location. Ex, to combine all windows based servers we will create a windows based
server class. Similarly, to combine all Unix based servers we will create a unix based
server class.
Token is a placeholder for a set of values for a particular variable. Example, Name =
$Token1$. Now here Name field can have multiple values like, Naveen, Ashu, Ajeet etc.
Dashboard is a kind of view which contains one or more rows, each row contains one or
more panels each panel shows up different metrices.
52. How will you make an indexer not searchable for a user?
(Question wrong)
I don't know how to do it but I will ask someone
OR
Via configuration files → indexes.conf
With maxDataSize = auto_high_volume, Splunk sets the maximum bucket size to 10 GB on a 64-bit OS (the plain auto setting is 750 MB).
The third layer of buckets refers to the cold bucket, which is still searchable.
Universal forwarder agents are installed on the clients, i.e. the servers from where we are getting the data. They consume very little CPU and memory and they don't have a web interface.
Heavy forwarders run the full enterprise version of the Splunk software; they can do parsing (masking, index routing, sourcetype routing and dropping garbage data), i.e. they have pre-processing, routing and filtering capabilities, and they do have a web interface.
To identify which indexer is down we can run a simple search: index=_internal source="*splunkd.log*" "*Connection failure*". By running this you will get the indexer IP that is having connection failures.
====================================================================================
- Creating user based knowledge objects like, Dashboards, reports, alerts, static and
dynamic lookups, eventtypes, doing field extractions.
- Been part of mass password update activities for database-related inputs, because if the database password changes we need to change the connection password created in our DB Connect application.
- We have a multisite cluster across Rochelle and Hudson. Each site contains 40 indexers. The environment has 1 cluster master, 1 deployment server (hosting 3 kinds of apps) managing more than 10,000 forwarders installed on clients, 7 search heads in a cluster, and 1 deployer.
1. Log into Splunk Web as the admin user.
2. Click the Server settings link in the System section of the screen.
3. Click General settings.
4. Change the value for either Management port or Web port, and click Save.
Alternatively, you can also go to the /bin folder and run the following command:
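For example, to change the web port from the CLI (the port value is illustrative):
./splunk set web-port 8080
./splunk restart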
Yes. We need to log in to the machine, basically the client. If the universal forwarder is not already installed there, we need to install one. Later on we need to go to ./bin and run the commands already shown earlier (./splunk add forward-server and ./splunk add monitor).
Yes, I have worked on data normalization. Data normalization, as the name states, is the process of removing redundant/duplicate data; it also comprises logically grouping data together.
In Splunk we make use of tags at search time to normalize data. There is one thing we need to take care of while normalizing data: it should only be done at search time, not at index time. It is a technique adopted for faster data retrieval and shorter search execution time, so it is better to do it once the data is already stored on the indexers.
Pointer: Explain types of data we can ingest in splunk . Common ones they expect us to
answer is flat files(logfile, textfile etc) and syslog onboarding. You can also talk about
Database and csv onboarding.
The universal forwarder doesn't have any GUI, so everything we need to configure is done by logging in to the UF with admin credentials. Once you are in the /opt/splunkforwarder/bin path, you need to run the following command to add the indexer IP/hostname it will forward data to: ./splunk add forward-server indexerIP:9997
Once this is done, we need to add the source file names that need to be monitored. An example has already been explained in #4: ./splunk add monitor /var/log/introspection/resource_usage.log -index Ashu
Outputs.conf - all the indexer names added with ./splunk add forward-server indexerIP:9997 will be visible under outputs.conf.
- Heavy forwarders are used for data pre-processing meaning it is used for selective data
forwarding and removing unwanted values as well.
When you open transforms.conf, these are the CONFIG parameters which are
configurable -
DEST_KEY
REGEX
FORMAT
SEDCMD-<class> = s/<regex>/<replacement>/flags
● regex is a Perl-language regular expression.
● replacement is a string to replace the regular expression match.
● flags can be either the letter g, to replace all matches, or a number, to replace a specified match.
b. Can you change/replace hostname with new host ?
Using the SEDCMD class above with the replacement parameter - replacement is a string that replaces the regular expression match.
c. How can you filter out unwanted data from a data source and drop it before it gets
indexed so that I will save on licensing cost?
DEST_KEY
REGEX
FORMAT
Yes, using the unencrypted SYSLOG service and a universal forwarder. Alternatively, we
can also use daemon processes like, Collectd and Statsd to transmit data using UDP.
Sourcetype is used as a data classifier whereas source contains the exact path from
where the data needs to be onboarded
License master is a splunk instance which is used for monitoring splunk data volume on
a daily basis. This is how we configure a license master -
Login to any particular indexer - Go to settings > under System > Licensing > Add
License File (Mainly an XML based licensing file is added)
Licensing cost is calculated on the entire data volume that is ingested. Compressing has nothing to do with license usage; compression is done to save disk space.
- Apps are full-fledged packages within Splunk Enterprise. They contain options for creating dashboards, reports, alerts, lookups, eventtypes, tags and all other kinds of knowledge objects. Add-ons, on the other hand, perform a limited set of functions; for example, the Windows add-on can only get data from Windows-based systems, Unix-based add-ons can only get data from specific Unix-based servers, and so on.
We have installed apps like, DBCONNECT and Add ons like, Windows app for
infrastructure and Unix apps for getting data from Unix based systems
Pointer: Talk about Splunk App for DB connect and Splunk app/addon for linux/unix .
These are 2 common apps/addons that you should know.
Would be a good deal if you can talk about Splunk app for AWS (cloud integration)
Props.conf is a configuration file used for selective indexing, mainly for data pre-processing; we need to mention the sourcetype (or source) stanza in props.conf. Transforms.conf is for specifying which events/parameters/fields need to be excluded or rewritten. Example,
DEST_KEY
REGEX
FORMAT
Regular expressions in Splunk are handled with the help of the rex, regex and erex commands. No one will ever ask you to recite the regex for IP addresses.
Deployment server is a splunk instance which is used for polling from different
deployment clients like, indexer, Universal forwarder, Heavy forwarder etc.
Server classes is used for grouping different servers based on the classes - like if I have
to group all the UNIX based servers i can create a class called - UNIX_BASED_SERVERS
and group all servers under this class. Similarly, for Windows based servers I can create
a WINDOWS_BASED_SERVERS class and group all servers under this class.
Apps are basically a set of stanzas which are deployed to different members of a server
class.
When we set up server classes and assign them an app or set of apps, we need to restart splunkd, a system-level process. Once this is done, any new app updates will be sent automatically to all servers.
Look through splunkd.log for diagnostic and error metrics. We can also go to the Monitoring Console app and check the resource utilization (CPU, memory, etc.) of the different server components.
_audit - All search related information - Scheduled searches as well as adhoc searches
_introspection - All system wide data, including memory and CPU data
Epoch time is UNIX based time in splunk. Epoch time is converted to Standard time
using this function - |eval Time = strftime(EpochTimeField, "%y-%m-%d %H:%M:%S")
CIM is common information model used by splunk. CIM acts as a common standard used
by data coming from different sources.
- Dispatch directory is for running all scheduled saved searches and adhoc searches.
The btool command shouldn't be needed in most cases; it is a troubleshooting tool that shows the merged, on-disk configuration settings that Splunk will use, and its output is not always a perfect reflection of the running configuration.
However, if we still need to debug configuration issues we can use this command -
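For example, to see the merged inputs configuration and which file each setting comes from:
./splunk btool inputs list --debug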
● System Requirements -
● Updates to dashboards, reports, new saved searches created are always subject
to Captain - The captain takes care of all of this
Before we Configure search head clustering, we need to configure a deployer because
Deployer IP is required to create a search head cluster
DEPLOYER -
Distributes apps and other configurations to SH cluster members
Can be colocated with deployment server if no. of deployment clients < 50
Can be colocated with Master node
Can be colocated with Monitoring console
Can service only one SH cluster
The cluster uses security keys to communicate/authenticate with SH members
[shclustering]
pass4SymmKey = password
shcluster_label = cluster1
Restart the Splunk since change has been done in .conf files
While setting up Search head clustering we first have to create a Deployer as above -
When it comes to setting up a SH clustering, the first thing we need to do is login to that
particular Search head and run the command by going to bin as follows -
With recent upgrade of Splunk to 8.0.1 the problem with orphaned searches has almost
resolved. But still if you see the orphaned searches warning appearing under Messages
in your search head you can follow this guideline on how to resolve.
https://docs.splunk.com/Documentation/Splunk/8.0.2/Knowledge/Resolveorphanedsearc
hes
- User - Can only read from splunk artifacts. Example, Reports, dashboards, alerts and
so on. Don't have edit permissions.
- Power user - Can create dashboard, alerts, reports and have Edit permissions
- Admin- Have access to all production servers, can do server restarts, take care of
maintenance activities and so on. Power user and normal user role are subsets of Admin
role
A lookup is a knowledge object in Splunk. Within our SPL code, if we need to reference an external file we can do that using a lookup. Lookup files can be added to Splunk by going to Settings > Lookups > Lookup table files.
Lookups are also useful for performing several types of joins, such as inner and outer joins.
Transforming commands are used for transforming event data into a different format,
this may include converting it to Chart, table, etc.
chart
timechart
rare
top etc
The tstats command works only on indexed fields. Like the stats command it shows the data in a tabular format. It is very fast compared to stats, but with tstats you can only group by indexed fields (such as index, sourcetype, source and host), not by search-time fields created with the eval command.
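A quick illustration against the internal index:
| tstats count where index=_internal by sourcetype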
TSIDX files are time-series index files. When raw data comes into an index, Splunk builds tsidx files from it; the tsidx files are what make the data searchable from the search head.
- Hot bucket - Contains newly incoming data. Once the hot bucket reaches a particular size or age threshold, it rolls to a warm bucket. This bucket is searchable from the search head.
- Warm bucket - Data rolled from the hot bucket comes to this bucket. This bucket is not actively written to but is still searchable. Once the indexer reaches the maximum number of warm buckets it maintains, the oldest warm buckets roll to cold.
- Cold - Contains data rolled from warm buckets. The data is still searchable from the search head; cold buckets are typically kept on cheaper storage. After cold buckets reach their retention threshold, they roll to frozen.
Cluster master or master node is for maintaining a particular cluster. This is how it is
configured -
● Replication Factor. The replication factor determines how many copies of data the cluster maintains. The default is 3. For more information on the replication factor, see Replication factor. Be sure to choose the right replication factor now; it is inadvisable to increase the replication factor later, after the cluster contains significant amounts of data.
● Search Factor. The search factor determines how many immediately searchable copies of data the cluster maintains. The default is 2. For more information on the search factor, see Search factor. Be sure to choose the right search factor now; it is inadvisable to increase the search factor later, once the cluster has significant amounts of data.
● Security Key. This is the key that authenticates communication between the master and the peers and search heads. The key must be the same across all cluster nodes. The value that you set here must be the same that you subsequently set on the peers and search heads as well.
● Cluster Label. You can label the cluster here. The label is useful for identifying the cluster in the monitoring console. See Set cluster labels in Monitoring Splunk Enterprise.
6. Click Enable master node.
The replication factor tells how many copies of the data the cluster maintains across indexers (the search factor tells how many of those copies are kept searchable).
Workflow actions are for automating low level implementation details and getting things
automated.
We can create workflow actions by going to Settings > Fields > Workflow actions
Each user authenticated to Splunk has a limited search quota - normal users have around 25 MB whereas power users have around 50-125 MB. Once this threshold is exceeded for a particular time, the user's searches will start getting queued.
Server class are group of servers coming from the same flavour or same geographic
location. Ex, to combine all windows based servers we will create a windows based
server class. Similarly, to combine all Unix based servers we will create a unix based
server class.
Token is a placeholder for a set of values for a particular variable. Example, Name =
$Token1$. Now here Name field can have multiple values like, Naveen, Ashu, Ajeet etc.
The value that a particular token will hold completely depends upon the selection. Tokens
are always enclosed between $$, like the example above.
Dashboard is a kind of view which contains different panels and panel shows up different
metrices.
Data models are a hierarchical representation of data; they show the data in a more structured and organised format. Pivots are built on top of a data model: an interface where users can create reports and alerts without much involvement with the SPL language.
48. Default indexes created during indexer installation?
Default indexes are main, history, summary, _internal, _introspection and _audit.
52. How will you make an indexer not searchable for a user?
I don't know how to do it but I will ask someone
This is the done with this command and already explained above -
./splunk init shcluster-config -auth admin:password -mgmt_uri
IPaddressofSHinHTTPSFollowedByMGMTPort -replication_port 9000 -replication_factor 3
-conf_deploy_fetch_url DeployerIpaddress:8089 -secret passwordofdeployer
-shcluster_label clusterName
With maxDataSize = auto_high_volume, Splunk sets the maximum bucket size to 10 GB on a 64-bit OS (the plain auto setting is 750 MB).
The third layer of buckets refers to the cold bucket, which is still searchable.
Universal forwarders are basically agents which are installed on the client, i.e., servers
from where we are getting the data. They don't have any pre-processing capability.
Heavy forwarders in turn have pre-processing, routing and filtering capabilities.
To identify which indexer is down we can run a simple search: index=_internal source="*splunkd.log*" "*Connection failure*". By running this you will get the indexer IP that is having connection failures.
Ashutosh Answers in 2021
How to add new indexer to cluster? -------- Go to Settings > Indexer Clustering > Add
peer node - Give master URI. Since it is a new cluster member, you need to run this
command so that all data is synced with this cluster as well. The command is - splunk
apply cluster-bundle.
Normally, when an indexer cluster member holding searchable copies goes down, the remaining _raw copies of the data get converted into searchable (tsidx) copies. The master node takes care of this bucket fixing, i.e. it tries to keep the cluster in line with the replication and search factors you have set up.
What is search affinity in Splunk? - In a multisite cluster, search affinity refers to setting up search heads so that they only query for results from their local site, that is the site nearest to them. For example, if you have a multisite cluster across two sites, Rochelle and Hudson, and a user searches for data from Rochelle, all the search requests go to the indexer cluster peers in the Rochelle zone, and similarly for the Hudson site. Setting up search affinity helps reduce latency within networks.
What is maintenance mode? - Also called halt mode because it prevents bucket replication within the indexer cluster; for example, you enable it when you are upgrading your Splunk deployment from one version to another.
What does maintenance mode do? - Maintenance mode halts all bucket fix-ups, meaning that if there is any corrupt or incomplete bucket it will not be fixed back to normal. Also, maintenance mode will not check for conditions like "replication factor not met" or "search factor not met". It also prevents the timely rolling of hot buckets to warm buckets.
50% of searchheads are down ? what will happen? How to resolve? ------ Run, splunk
show shcluster-status to see if the captain is also down. In this case you need to setup a
static captain as follows - ./splunk edit shcluster-config -mode captain
-captain_uri https://SHURL:8089 -election false. In case you have 4 SH members and 2
went down that means your default replication factor which is set to 3 will not be met. In
this case you can reinstantiate a SH cluster with following command as follows by setting
the RF to 2. Here is the command, ./splunk init shcluster-config -auth
username:password -mgmt_uri https://shheadURI:8089 -replication_port 9000
-replication_factor 2 -conf_deploy_fetch_url http://DeployerURL:8089 -secret
deployerencyptedpassword -shcluster_label labelName.
what are the Challenges you are faced? ------- HERE YOU CAN GIVE ANY EXAMPLE.
How to upgrade the version from scratch? ----------- Enable Maintenance Mode. take a
backup of all splunk artifacts to some repository or to some backup server. Install the
newer package using wget utility on linux machines and getting the windows installer on
windows. Keep using Monitoring console and Health card report manager to check the
status of your Splunk instances.
What is a base search and child search (post-process search)? - Base searches and post-process searches are used for optimising the run time of Splunk dashboards. One search is executed once, and the same search is then reused by multiple panels of a dashboard. To create a base search, define <search id="basesearchID"> and then use <search base="basesearchID"> in all the panels that will use the base search.
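A rough Simple XML sketch of that pattern (the id, index, sourcetype and field names are placeholders):
<search id="basesearchID">
  <query>index=web sourcetype=access_combined | stats count by status</query>
</search>
<panel>
  <table>
    <search base="basesearchID">
      <query>search status>=500 | sort - count</query>
    </search>
  </table>
</panel>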
What is the average time taken to ingest the 500gb data ? - Depends on how the
ingestion is happening.
If deployment server went down ? how to resolve? What is the impact? - The main
purpose why we used DS is to distribute apps and updates to a group of
non-clustered Splunk instances. In case DS went down all the Deployment clients
polling to DS will not get the latest set of apps and updates.
If the cluster master goes down, how do you resolve it and what is the impact? - The cluster master is responsible for managing the entire set of indexer cluster members. If the CM goes down, replication between the indexer members will not happen; a user search will randomly land on one of the indexer cluster members and coordination will break. As a remedy, restart splunkd on the cluster master and look for its internal logs on the other cluster members. To resync equal data between all members, run the splunk apply cluster-config and splunk resync cluster-config commands individually on all cluster members so that all the members have the same set of data.
How many servers do we need to ingest tha 300gb data? How can you segregate the
data? - You can make use of https://splunk-sizing.appspot.com/ website to make
selection on amount of bandwidth you may require.
At a time 10 persons searching for same query but only 6 members getting the query
remaining not why? - Depends on no. of VCPUs that your infrastructure supports. Let's
say if you are having 3 Search head members with 2 VCPUs each that means only 2*3 =
6 Concurrent searches can run at a time. You need to increase your throughput by
adding more CPUs for concurrent processing.
Hot to warm rolling conditions? - Based on the retention policy of the hot bucket and the maximum size of each bucket. You can use the | dbinspect index=<indexname> command to get bucket information about any index. Alternatively, you can also open indexes.conf (vi indexes.conf) to get info about the hot and warm bucket settings.
What is the background process to ingest data into Splunk? - Install a UF on the machine using the wget utility against the splunk.com downloads. Once this is done, check the /opt/splunkforwarder directory; if you can navigate into it, the UF package was installed successfully. Run splunk enable boot-start so that the UF always starts at boot time.
What is the role of captain ?how can we define captain ? - Captain takes care of
replication and managing searches efficiently between different search head members.
Captain can be defined as follows - ./splunk bootstrap shcluster-captain -servers_list
“https://shmemberURI:8089, otherSHmembersURI”
How to onboard data through a Splunk add-on? - Settings > Data inputs > Continuously monitor > filename > ingest using the add-on (select the add-on name from the dropdown).
How to onboard syslog data from scratch? - Point the devices' syslog output at a syslog-ng (or rsyslog) server that writes the events to files; install a UF package on the syslog-ng server and monitor those files to get the data into the Splunk indexers. Alternatively, an HTTP Event Collector or REST API based feed can be used.
How to ingest data from routers into Splunk? - Same approach as above.
Event times are showing a future timestamp? What is the reason and how do you fix this issue? - Look at the timestamp column being used from the log files to ingest the data. The column might contain future timestamps, for example dates of planned migrations or DR activities. Fix the timestamp extraction in props.conf (TIME_PREFIX and TIME_FORMAT, and MAX_DAYS_HENCE to reject timestamps too far in the future).
====
How to optimize a Splunk query in real time? - There are a lot of techniques: base searches for dashboards, filtering as early as possible, avoiding wildcards, and remembering that inclusion is always better than exclusion. For example, search specifically for status=50* rather than using | search NOT status=50*.
Use data models, which can be reused within a lot of other saved searches, dashboards and reports.
Alert didn’t trigger ? reason ?How to troubleshoot? - Run the following command - |rest
/services/search/jobs "Alert Name". This will tell you when the alert has last ran. You can
also run the following command if you have admin permissions - index=_audit "Alert
Name" - This will tell you what time the alert took to run and when it was last executed.
Run, index=_internal to get the diagnostics metrices for the same alert name.
You can also run | rest /servicesNS/-/-/saved/searches | search cron_schedule="0 *" (give the wildcard cron schedule for the alert) and check whether a lot of concurrent saved searches are running at the same interval. Try moving the schedules of other alerts and reports 1-2 minutes ahead or behind.
What is the difference between top and head? - top gives you the list of the most common field values along with the percentage of how frequently each appears compared to the other field values. The head command just gives the first few results of the query. For example, there is a field called price which has the values 20, 30, 40, 50, 60, 70, 80, 90, 20, 30, 40, 20. When you run | top price, the first row will show the price value 20, because 20 appears the maximum number of times (3) among all price values, and it will also show the percentage of events in which 20 appears. Similarly, if you run | head 5, it will give you the first 5 results, e.g. 20, 30, 40, 50, 60.
What is the REST API? - REST API endpoints are paths to specific locations or directories for accessing different types of knowledge objects. For example, by using | rest /servicesNS/-/-/saved/searches you can get the list of all reports and alerts. Similarly, by running | rest /servicesNS/-/-/data/ui/views you can get the list of all dashboards, and so on.
Have you migrated any knowledge objects from one environment to another
environment? - Yes, you can do them with the help of REST APIs as explained above.
In an indexer cluster, if one peer goes down another peer comes into the picture and serves the data to the end user.
2. What are the Replication Factor and Search Factor?
RF :: the number of copies of raw data; it is equal to or less than the number of peers in the cluster. The RF depends on how much node-failure tolerance you need.
SF :: the number of searchable (index-file) copies; it is equal to or less than the RF.
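As a sketch, these factors are set on the master node in server.conf (the values and key are placeholders):
[clustering]
mode = master
replication_factor = 3
search_factor = 2
pass4SymmKey = yourSecretKey
cluster_label = cluster1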
3. What will happen if a peer goes down?
The master and peers exchange heartbeats roughly every 60 seconds; if a peer stops responding to the heartbeat, the master marks it as down and starts bucket fixing so that the replication and search factors are met again.
4. What will happen if the master goes down?
When a peer receives data from a UF it tries to contact the master; if the master does not respond, the peer waits 60 seconds and tries again, and it repeats this about 3 times. After that the peer keeps using the last peer list the master gave it, for up to 24 hours. After 24 hours that previous history is also discarded and the peers act as standalone indexers.
5. Difference between a valid and a complete cluster?
Valid means every bucket has a primary (searchable) copy, so all the data is searchable.
Complete means the cluster meets both the replication factor and the search factor.
11. Difference between a standalone search head and a clustered search head
SH :: a standalone search head does not replicate Splunk knowledge objects.
SHC :: in a search head cluster the Splunk knowledge objects are replicated between members.
##
Via configuration ::
We have to list the indexers in the distsearch.conf file:
cd /opt/splunk/etc/system/local
vi distsearch.conf
[distributedSearch]
servers = 1.1.1.1:8089,2.2.2.2:8089
:wq!
Then we have to copy the trusted.pem file of the search head and paste it on the indexer, under:
/opt/splunk/etc/auth/distServerKeys
Note :: we have to do this after installation and before starting the Splunk services.
Via configuration ::
Go to web.conf and add this setting under the [settings] stanza:
startwebserver = 0
0 means the web interface is disabled; 1 means it is enabled.
16. What is the dispatch directory and can we take control over it?
The dispatch directory is where the artifacts of every search run from the search bar are stored:
/opt/splunk/var/run/splunk/dispatch
To clean it up, from /opt/splunk/bin:
./splunk cmd splunkd clean-dispatch /tmp -24h@h
Using limits.conf we can control these settings
Adhoc search - 10 minutes
## Global behaviour via limits.conf
limits.conf
[search]
ttl = 600
# default - 10 mins
[subsearch]
ttl = 300
# default - 5 mins
19. What are the basic troubleshooting steps if you do not receive your data at the indexer end?
First we need to check the communication between the UF and the indexer.
Then we need to check the monitor stanza: is the path correct, and is the index available or not?
Check the splunkd logs to know the exact issue.
Splunk will not automatically create an index; if the index is not present it will throw an error. (If you do not mention any index name in the monitor stanza, the data goes to the default index, main.)
Splunk will create the sourcetype automatically; it strips the last part from the source.
##
● Splunk is a platform which allows people to get visibility into machine data,
that is generated from hardware devices, networks, servers, IoT devices and
other sources
● Splunk is used for analyzing machine data because it can give insights into
application management, IT operations, security, compliance, fraud detection,
threat visibility etc
You can find more details about the working of Splunk here: Splunk Architecture:
Tutorial On Forwarder, Indexer And Search Head.
● Act like an antivirus policy server for setting up Exceptions and Groups, so
that you can map and create different set of data collection policies each for
either a windows based server or a linux based server or a solaris based
server
● Can be used to control different applications running in different operating
systems from a central location
● Can be used to deploy the configurations and set policies for different
applications from a central location.
Q4. Why use only Splunk? Why can’t I go for something that is
open source?
This kind of question is asked to understand the scope of your knowledge. You can
answer that question by saying that Splunk has a lot of competition in the market
for analyzing machine logs, doing business intelligence, for performing IT operations
and providing security. But there is no single tool other than Splunk that can do
all of these operations, and that is where Splunk stands out and makes a
difference. With Splunk you can easily scale up your infrastructure and get
professional support from a company backing the platform. Some of its competitors
are Sumo Logic in the cloud space of log management and ELK in the open source
category. You can refer to the below table to understand how Splunk fares against
other popular tools feature-wise. The detailed differences between these tools are
covered in this blog: Splunk vs ELK vs Sumo Logic.
Q6. What are the unique benefits of getting data into a Splunk
instance via Forwarders?
You can say that the benefits of getting data into Splunk via forwarders are
bandwidth throttling, TCP connection and an encrypted SSL connection for
transferring data from a forwarder to an indexer. The data forwarded to the indexer
is also load balanced by default and even if one indexer is down due to network
outage or maintenance purpose, that data can always be routed to another indexer
instance in a very short time. Also, the forwarder caches the events locally before
forwarding them, thus creating a temporary backup of that data.
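A sketch of the forwarder-side configuration behind these benefits (the indexer addresses,
certificate paths and values are example assumptions):
# outputs.conf on the forwarder
[tcpout]
defaultGroup = primary_indexers
[tcpout:primary_indexers]
server = idx1.example.com:9997, idx2.example.com:9997
# the server list is auto load-balanced
useACK = true
# indexer acknowledgement, so the forwarder resends on failure
maxQueueSize = 7MB
# local caching of events before forwarding
sslRootCAPath = /opt/splunkforwarder/etc/auth/cacert.pem
clientCert = /opt/splunkforwarder/etc/auth/server.pem
# limits.conf on the forwarder - bandwidth throttling
[thruput]
maxKBps = 512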
● You can create a webhook, so that Splunk can write to HipChat or GitHub. You
can also send an email to a group of recipients with your subject, priority,
and message body (a savedsearches.conf sketch follows below)
You can find more details about this topic in this blog: Splunk alerts.
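As an illustration, a webhook and an email action can be attached to an alert in
savedsearches.conf roughly like this (the search, URL, schedule and addresses are
assumptions):
# savedsearches.conf
[High 5xx errors]
search = index=web sourcetype=access_combined status=5* | stats count
enableSched = 1
cron_schedule = */15 * * * *
alert_type = number of events
alert_comparator = greater than
alert_threshold = 100
action.webhook = 1
action.webhook.param.url = https://hooks.example.com/splunk-alerts
action.email = 1
action.email.to = oncall@example.com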
● You can do a double click, which will perform a drill down into a particular list
containing user names and their IP addresses and you can perform further
search into that list
● You can do a double click to retrieve a user name from a report and then pass
that as a parameter to the next report
● You can use the workflow actions to retrieve some data and also send some
data to other fields. A use case of that is, you can pass latitude and longitude
details to google maps and then you can find where an IP address or location
exists.
In Splunk Web, workflow actions are configured under Settings > Fields > Workflow actions.
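A rough sketch of a link-type workflow action that passes latitude/longitude field values to
Google Maps (the field names lat/lon and the URL format are assumptions):
# workflow_actions.conf
[show_on_map]
type = link
label = Show $lat$,$lon$ on a map
link.method = get
link.uri = https://maps.google.com/?q=$lat$,$lon$
link.target = blank
fields = lat, lon
display_location = both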
● Create Sales Reports: If you have a sales report, then you can easily create
the total number of successful purchases, below that you can create a child
object containing the list of failed purchases and other views
● Set Access Levels: If you want a structured view of users and their various
access levels, you can use a data model
● Enable Authentication: If you want structure in the authentication, you can
create a model around VPN, root access, admin access, non-root admin
access, authentication on various different applications to create a structure
around it in a way that normalizes the way you look at data.
So when you look at a data model called authentication, it will not matter to
Splunk what the source is, and from a user perspective it becomes extremely
simple because as and when new data sources are added or old ones
are deprecated, you do not have to rewrite all your searches, and that is the
biggest benefit of using data models and pivots.
On the other hand, with pivots you have the flexibility to create the front views of
your results and then pick and choose the most appropriate filter for a better view of
the results. Both these options are useful for managers from a non-technical or
semi-technical background.
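For example, once an Authentication data model exists, a search can be written against the
model instead of against each individual source (the model and field names below follow
the usual CIM naming, which is an assumption here):
| tstats count from datamodel=Authentication where Authentication.action=failure
  by Authentication.user, Authentication.src
| rename Authentication.* as *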
● At times ‘eval’ and ‘stats’ are used interchangeably however, there is a subtle
difference between the two. While ‘stats‘ command is used for computing
statistics on a set of events, ‘eval’ command allows you to create a new field
altogether and then use that field in subsequent parts for searching the data.
● Another frequently asked question is the difference between ‘stats’, ‘charts’
and ‘timecharts’ commands. The difference between them is mentioned in the
table below.
In the stats command, you can use multiple fields to build a table.
In chart, it takes only 2 fields, one field each on the X and Y axis respectively.
In timechart, it takes only 1 field, since the X-axis is fixed as the time field.
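A quick illustration of the difference (the index, sourcetype and fields are assumptions):
# eval creates a new field, stats then aggregates over multiple fields
index=web sourcetype=access_combined
| eval response_secs = response_time/1000
| stats avg(response_secs) AS avg_resp, count BY host, status
# chart takes two fields, one over the rows and one to split the columns
index=web sourcetype=access_combined | chart count over host by status
# timechart always puts _time on the X axis, so only one split field is given
index=web sourcetype=access_combined | timechart span=1h count by status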
● The obvious and the easiest way would be by using files and directories as
input
Q20. What are the default fields for every event in Splunk?
There are about 5 default fields that are attached to every event indexed into
Splunk.
They are host, source, sourcetype, index and timestamp.
To determine the priority among copies of a configuration file, Splunk software first
determines the directory scheme. The directory schemes are either a) Global or b)
App/user.
When the context is global (that is, where there is no app/user context), directory
priority descends in this order: system local, app local, app default, and finally
system default.
When the context is app/user, directory priority descends from user to app to
system.
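btool can be used to see which copy of a setting actually wins after this layering (a
sketch, using props.conf as the example file):
# show the merged props.conf and the file each setting came from
/opt/splunk/bin/splunk btool props list --debug
# limit the output to a single stanza, e.g. a sourcetype
/opt/splunk/bin/splunk btool props list access_combined --debug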
Q23. What is the difference between Search time and Index time
field extractions?
As the name suggests, Search time field extraction refers to the fields extracted
while performing searches whereas, fields extracted when the data comes to the
indexer are referred to as Index time field extraction. You can set up index-time
field extraction either at the heavy forwarder level or at the indexer level.
Another difference is that Search time field extraction’s extracted fields are not part
of the metadata, so they do not consume disk space. Whereas index time field
extraction’s extracted fields are a part of metadata and hence consume disk space.
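A hedged sketch of both styles (the sourcetype, field name and regex are assumptions):
# Search-time extraction - props.conf on the search head
[my_sourcetype]
EXTRACT-client_ip = (?<client_ip>\d{1,3}(?:\.\d{1,3}){3})
# Index-time extraction - props.conf plus transforms.conf where parsing happens
# (heavy forwarder or indexer)
# props.conf
[my_sourcetype]
TRANSFORMS-extract_ip = add_client_ip
# transforms.conf
[add_client_ip]
REGEX = (\d{1,3}(?:\.\d{1,3}){3})
FORMAT = client_ip::$1
WRITE_META = true
# fields.conf on the search head should also mark client_ip as INDEXED = true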
● The first time when data gets indexed, it goes into a hot bucket. Hot buckets
are both searchable and are actively being written to. An index can have
several hot buckets open at a time
● When certain conditions occur (for example, the hot bucket reaches a certain
size or splunkd gets restarted), the hot bucket becomes a warm bucket
(“rolls to warm”), and a new hot bucket is created in its place. Warm buckets
are searchable, but are not actively written to. There can be many warm
buckets
● Once further conditions are met (for example, the index reaches some
maximum number of warm buckets), the indexer begins to roll the warm
buckets to cold based on their age. It always selects the oldest warm bucket
to roll to cold. Buckets continue to roll to cold as they age in this manner
The bucket aging policy, which determines when a bucket moves from one stage to
the next, can be modified by editing the attributes in indexes.conf.
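Typical attributes that drive this aging, shown as a sketch in indexes.conf (the index name,
sizes and paths are example values only):
# indexes.conf
[web]
homePath = $SPLUNK_DB/web/db
# hot and warm buckets
coldPath = $SPLUNK_DB/web/colddb
# cold buckets
thawedPath = $SPLUNK_DB/web/thaweddb
# restored (thawed) buckets
maxHotBuckets = 3
maxDataSize = auto
# hot bucket rolls to warm at this size
maxWarmDBCount = 300
# oldest warm bucket rolls to cold beyond this count
frozenTimePeriodInSecs = 15552000
# about 180 days, then roll to frozen
coldToFrozenDir = /archive/splunk/web
# archive frozen buckets here instead of deleting them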
● Assume that your data retention policy is only 6 months, but your data
has aged out and is older than that. If you still want to do your own
calculation or dig out some statistical value, then the summary
index is useful (see the collect sketch below)
● For example, you can store the summary and statistics of the percentage
growth of sale that took place in each of the last 6 months and you can pull
the average revenue from that. That average value is stored inside summary
index.
That is the use of Summary indexing and in an interview, you are expected to
answer both these aspects of benefit and limitation.
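As an illustration, a scheduled search can write its statistics into a summary index with
the collect command, and later searches read from that index instead of the raw, aged-out
data (the index and field names are assumptions):
# scheduled search, run once per month
index=sales earliest=-1mon@mon latest=@mon
| stats sum(amount) AS monthly_revenue, count AS purchases
| collect index=summary_sales marker="report=monthly_revenue"
# later, pull the average from the summary index
index=summary_sales report=monthly_revenue
| stats avg(monthly_revenue)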
( Lockdown Period )
==================================
6. After pushing the server class, how does the forwarder know which apps in that server
class it needs? Which configuration file does the forwarder use to check the deployment
server for the apps it needs for this application?
9. What is the difference between a NAS storage path and a local storage path?
4. What is DB Connect?
9. If you need to migrate data, how do you do it (data join)?
10. What are lookups?
4. How does the universal forwarder know that all the events have reached the
indexer, and which attribute do we use in the configuration?
5. Regular expressions: they gave scenarios to extract fields, to be done on the spot.
7. Bucket concepts
9. Suppose there are 4 indexers and we don't know which one is down. Find out which one
is down and bring the server up. (Scenario based)
3. Is it a single-site or multi-site cluster, and can you explain multi-site briefly?
11. What are RF and SF?
14. Where is it replicated?
18. What is the role of the license in clustering, is a license applicable for clustering, and
where do we have to configure it?
19. In index clustering, if one of the indexers is down, how do we know that and where do
we have to check?
23. What are macros?
36. How do you do onboarding?
38. How do you push files to the search head and what is the command?
9. What is a bloom filter?
10. How do you change the password and create a password from the CLI?
=======================================================
8. Retention period?
12. What is the difference between power user, admin and user?
38. What is the difference between fast mode, smart mode and verbose mode?
SHYAM
1. How many servers do we need to ingest 300 GB of data? How can you
segregate the data?
WIPRO
GOURAV ( WIPRO )
RAVI
SONY
NVISH
1. Components of Splunk
5. File precedence
7. rex command
10. Timezone
8K MILES
1.Daily Activities
6.Masking
7.Architecture
9.Troubleshooting
11.UF installation
12. What configuration file is used to connect from the forwarder to the deployment server?
13.Dashboards explanation
16.Aws Addons
17.DB-connect
4. Have you worked on regular expressions for parsing data? For example, firewall logs
are not in the correct format and you need to extract the IP address and domain
name.
6. What are the common performance issues on the Splunk admin side and development
side?
7. I have 1 TB of data to ingest. What are the recommendations for the Splunk
architecture (how many indexers, search heads)?
13. After troubleshooting, a Universal Forwarder agent restart is done, but the issue is still
not solved. What is the next step we must take?
15. Is there any physical difference between hot, warm, cold, frozen and thawed buckets?
16. Why do we need a small hot bucket, a medium warm bucket and large cold/frozen/
thawed buckets? Do you have any brief idea behind that?
19. Can we search the data in a frozen bucket? (No.) Do you know the reason?
20. The data is archived, but for investigation purposes we need that data after 30
days. How will that data be restored?
21. A dashboard is to be viewed only by a particular user. How will that happen?
27. The license master server is not reachable. What will be the impact on the Splunk instance?
Check for attempts to gain access to a system by using multiple accounts with
Metmox(06-10-2020)Akshay
Panel-Vijay
-------------
6. Explain an alert you created in your firm (you explained the gim fire exception)
Metmox(Akshay)Panel-Bhargavi
---------------
7. If your client does not provide an SOP, are you capable of doing onboarding?
Sony(30/10/2020)
Panel:NALLAMATHU RAVI KUMAR
1. Can you brief me about the recent project that you have done and your roles and responsibilities?
2. Can you brief me about the project architecture they are using in Splunk?
4. Components of Splunk?
10. Can you explain the background process, i.e., how Splunk ingests data into Splunk?
11. How is the raw data backed up and where does that process happen?
Value Labs-12-11-2020
Panel- Vamsi Krishna Konjati
----------------------------
1. Can you explain the current business followed by your company with regard to your duties?
5. What is Splunk?
9. If I gave you a file that contains a password and you have to ingest the data into Splunk, what is the
process for that? I am asking from scratch.
11. What is the difference between a Splunk app and a Splunk add-on?
6. If your server is running, have you gone through the vulnerability part? Did you fix any vulnerabilities in
your current project?
11. Just help me with simple steps to set up our environment: 3 indexers, 2 search heads, 50 UFs and 50
Windows servers, to bring this data into a Splunk index…