Lesson 4
Collecting and Querying Security
Monitoring Data
Lesson Introduction
Security monitoring depends to a great extent on the use of data captured in network
traces, log files, and host-based scanners. Collecting this data into a single repository
for analysis—a security information and event management (SIEM) system—will be a
core part of your role as a cybersecurity analyst.
Lesson Objectives
In this lesson you will:
• Configure log review and SIEM tools.
LICENSED FOR USE ONLY BY: AFENDEY JINIR · 14899581 · JAN 18 2022
138 | The Official CompTIA CySA+ Student Guide (Exam CS0-002)
Topic 4A
Configure Log Review and SIEM Tools
Log review is a critical part of security assurance. Only referring to the logs following a
major incident is missing the opportunity to identify threats and vulnerabilities early
and to respond proactively. There are many types of logs and log formats, however,
so you must be able to configure systems that can aggregate and correlate data from
these different log sources and produce actionable intelligence.
• Develop use cases to define exactly what you do and do not consider a threat.
• Have a plan about what should be done in the event that you are alerted to a threat.
• Schedule regular threat hunting so you don't miss any important events that have
escaped alerts.
• Provide auditors and forensics analysts with a trail of evidence to support their duties.
The following represents some of the major commercial and open-source products
available in the SIEM marketplace.
Splunk
Splunk (splunk.com) is one of the market-leading big data information gathering and
analysis tools. Splunk can import machine-generated data via a connector or visibility
add-on. Connectors exist for most NOS and application platforms. The data is indexed
as it is retrieved and written to a data store. The historical or real-time data captured
by Splunk can then be analyzed using searches, written in Splunk's Search Processing
Language (SPL). The results of searches can be presented using visualization tools in
custom dashboards and reports, or configured as triggers for alerts and notifications.
Splunk can be installed as local enterprise software or used as a cloud solution. There
is also a Splunk Light product for smaller networks and a dedicated Enterprise Security
module. The security module includes pre-configured dashboards, security intelligence
searches, and incident response workflows.
ELK/Elastic Stack
The ELK Stack (elastic.co), now the Elastic Stack with the addition of Beats, is a collection
of tools providing SIEM functionality:
• Elasticsearch—The query and analytics tool.
The ELK Stack can be implemented locally or it can be invoked as a cloud service.
ArcSight
ArcSight (microfocus.com/en-us/products/siem-security-information-event-management/overview)
is a vendor of SIEM log management and analytics software,
originally acquired by HP and now part of Micro Focus. As well as cybersecurity
intelligence and response, one of the crucial functions of enterprise SIEMs like ArcSight
is the ability to provide compliance reporting for legislation and regulations such as
HIPAA, SOX, and PCI DSS.
QRadar
QRadar (ibm.com/security/security-intelligence/qradar) is IBM's SIEM log management,
analytics, and compliance reporting platform.
Graylog
Graylog (graylog.org) is an open-source SIEM with an enterprise version focused on
compliance and supporting IT operations and DevOps.
• What happened, with specific detail to distinguish the nature of the event from
other events.
• Where it happened—on which host, file system, network port, and so forth.
• Where the event originated (for example, a session initiated from an outside IP
address over a VPN connection).
To learn more, watch the video “Configuring SIEM Agents” on the CompTIA Learning Center.
• Sensor—As well as log data, the SIEM might collect packet captures and traffic flow
data from sniffers. Often, the SIEM software can be configured in sensor mode and
deployed to different points on the network. The sensor instances then forward
network traffic information back to the main management instance.
In this network, the SIEM aggregates network traffic data from SPAN (port mirroring) and TAP
sensors placed at strategic locations. Data from client workstations is collected from IDS/EDR
agents running on hosts. Security data is collected from application and access logs using either
agents or the Syslog protocol. (Images © 123rf.com)
Date/Time Synchronization
Another processing challenge is the timestamps used in each log. Hosts might use
incorrect internal clock settings, or settings that are correct for a different time zone, or
record the timestamp in a non-standard way (tools.ietf.org/html/rfc3339). These issues
can make it difficult to correlate events and reconstruct time sequences. Try to ensure
that all logging sources are synchronized to the same time source, using Network Time
Protocol (NTP), for instance. The system also needs to deal with varying time zones
and daylight saving time changes consistently. If the SIEM cannot correct for these
variations, one option is to ensure that all logging sources record timestamps in the
UTC time zone. For example, an ISO 8601/RFC 3339 date/timestamp uses the following
format:
2020-01-01T00:00:01Z
This is one second past midnight on New Year's Day 2020 at the Greenwich Meridian.
The Z indicates that there is no time zone offset, so the date/time value represents
Coordinated Universal Time (UTC). At the same time New Year's Day is being celebrated
in Greenwich, the local time in New York (UTC-5) would be recorded as:
2019-12-31T19:00:01-05:00
Coordinated Universal Time (UTC) is a time standard, not a time zone, but it always
corresponds to the current time in the Greenwich Mean Time (GMT) time zone. A date stamp
in GMT should be recorded as 2020-01-01T00:00:01+00:00. RFC 3339
allows the use of -00:00 to indicate that the offset from UTC to local time is unknown.
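The conversion described above can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular SIEM normalizes time. The function name to_utc is an assumption made for this example, and the two inputs are the Greenwich and New York values discussed above, written with hours, minutes, and seconds separated by colons.

```python
from datetime import datetime, timezone


def to_utc(stamp: str) -> datetime:
    """Parse an RFC 3339 timestamp and return it as an aware UTC datetime."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat(),
    # so rewrite it as an explicit zero offset first.
    if stamp.endswith(("Z", "z")):
        stamp = stamp[:-1] + "+00:00"
    parsed = datetime.fromisoformat(stamp)
    if parsed.tzinfo is None:
        # Without an offset the event cannot be placed on a shared timeline.
        raise ValueError("timestamp has no UTC offset: " + stamp)
    return parsed.astimezone(timezone.utc)


greenwich = to_utc("2020-01-01T00:00:01Z")
new_york = to_utc("2019-12-31T19:00:01-05:00")

# Both values describe the same instant once normalized to UTC.
print(greenwich == new_york)
print(greenwich.isoformat())
```

Rejecting timestamps that carry no offset is a design choice for this sketch. A production collector might instead apply a per-source default time zone.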
Secure Logging
Logging requires sufficient IT resources because it can be both disk- and network-
intensive. Large organizations can generate gigabytes or even terabytes of log data
every hour. Analyzing such large volumes of data requires substantial CPU and system
memory resources. It is also important to configure a secure channel so that an
attacker cannot tamper with the logs being sent to the SIEM. The data store itself must
have the CIA triad properties of confidentiality, integrity, and availability.
Event Log
One source of security information is the event log from each network server or client.
Systems such as Microsoft Windows, Apple macOS, and Linux keep a variety of logs to
record events as users and software interact with the system. The format of the logs
varies depending on the system. Information contained within the logs also varies by
system, and in many cases, the type of information that is captured can be configured.
When events are generated, they are placed into log categories. These categories
describe the general nature of the events or what areas of the OS they affect. The five
main categories of Windows event logs are:
• Application—Events generated by applications and services, such as when a service
cannot start.
• System—Events generated by the operating system and its services, such as storage
volume health checks.
• Forwarded Events—Events that are sent to the local host from other computers.
Each event is also assigned a severity level, such as:
• Warning—Events that are not necessarily a problem but may be in the future.
• Error—Events that are significant problems and may result in reduced functionality.
Beyond general category and severity, each log entry includes fields for the subject of
the entry, details of the error (if there is one), the event's ID, the source of the event,
and a description of what a warning or error might mean.
Prior to Windows Vista and Windows 7, one limitation of Windows logs was that they
only logged local events; that is, each computer handled logging its own events. This
meant that third-party tools were needed to gain an overall view of messaging for
the entire network. The development of event subscriptions in the latest versions of
Windows and Windows Server allows logging to be configured to forward all events to
a single host, enabling a holistic view of network events. The updated log format (.evtx)
uses XML formatting, making export to third-party applications more straightforward.
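Because events are represented as XML, a third-party tool can read the common fields with a standard XML parser. The following is a rough sketch using Python's standard library. The sample record is hand-written for illustration (the computer name and event are assumptions, not output from a real host), and summarize_event is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# XML namespace used by Windows event records.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Hypothetical, hand-written sample record (not exported from a real system).
SAMPLE_EVENT = """
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Service Control Manager"/>
    <EventID>7000</EventID>
    <Level>2</Level>
    <TimeCreated SystemTime="2020-01-01T00:00:01Z"/>
    <Channel>System</Channel>
    <Computer>MS1</Computer>
  </System>
</Event>
"""


def summarize_event(xml_text: str) -> dict:
    """Extract the header fields from one event rendered as XML."""
    system = ET.fromstring(xml_text.strip()).find("e:System", NS)
    return {
        "provider": system.find("e:Provider", NS).get("Name"),
        "event_id": int(system.find("e:EventID", NS).text),
        "level": int(system.find("e:Level", NS).text),
        "time": system.find("e:TimeCreated", NS).get("SystemTime"),
        "channel": system.find("e:Channel", NS).text,
        "computer": system.find("e:Computer", NS).text,
    }


print(summarize_event(SAMPLE_EVENT))
```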
Using the Elastic Stack running in Security Onion to view a summary of logs collected from winlogbeat
agents running on Windows servers. (Screenshot Security Onion securityonion.net)
syslog
For non-Windows hosts, events are usually managed by syslog (tools.ietf.org/html/
rfc3164). This was designed to follow a client-server model and so allows for centralized
collection of events from multiple sources. It also provides an open format for event
logging messages, and as such has become a de facto standard for logging of events
from distributed systems. For example, syslog messages can be generated by Cisco
routers and switches, as well as servers and workstations, and collected in a central
database for viewing and analysis. Syslog is a TCP/IP protocol and can run on most
operating systems. It usually uses UDP port 514.
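As a rough illustration of the client-server model, the sketch below uses Python's standard SysLogHandler to send one message over UDP. To keep it self-contained, the "collector" is just a socket on an ephemeral loopback port standing in for a real syslog server on UDP port 514. The logger name, message text, and the send_and_capture helper are assumptions made for this example.

```python
import logging
import logging.handlers
import socket


def send_and_capture(message: str) -> bytes:
    """Send one syslog message over UDP to a loopback listener; return the raw datagram."""
    # Stand-in collector: a UDP socket on an ephemeral loopback port.
    # A real collector would normally listen on UDP port 514.
    collector = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    collector.bind(("127.0.0.1", 0))
    collector.settimeout(2.0)
    port = collector.getsockname()[1]

    # A tuple address makes SysLogHandler use UDP by default.
    handler = logging.handlers.SysLogHandler(
        address=("127.0.0.1", port),
        facility=logging.handlers.SysLogHandler.LOG_MAIL,
    )
    logger = logging.getLogger("syslog-demo")
    logger.propagate = False
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    try:
        logger.error(message)
        datagram, _sender = collector.recvfrom(4096)
    finally:
        logger.removeHandler(handler)
        handler.close()
        collector.close()
    return datagram


# The datagram starts with a bracketed priority value, followed by the message.
print(send_and_capture("delivery failed for queue 42"))
```

Note that the sender receives no acknowledgment. If the listener were not running, the message would simply be lost.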
Configuring the pfSense UTM to send log events to a remote syslog server at 10.1.0.248 over the default
UDP port 514. (Screenshot Netgate pfSense pfsense.org.)
A syslog message comprises a PRI code, a header containing a timestamp and host
name, and a message part. The PRI code is calculated from the facility and a severity
level:
• Facility identifies the affected system by using a numeric value from 0 to 23. On
most systems, the values can be interpreted by a short keyword such as "kern"
(operating system kernel), "mail" (mail system), or "auth" (authentication or security).
The facility is multiplied by 8.
• Severity values are a number from 0 (most critical) to 7 (not critical). The severity
value is added to the facility value to derive the PRI.
The PRI code is used by the logging daemon to determine where to write the event or
print an alert. For example, a PRI code of <19> represents the mail facility (19/8=2.xxx
[ignore the remainder]) plus an error-level severity (19-[2*8]=3), so the event would
be written to the mail log and possibly also printed to the administrator's terminal. An
event can be written to multiple logs.
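The PRI arithmetic above can be checked with a short Python sketch. The function names are assumptions made for illustration, and the facility table is deliberately partial (only the first few keywords are listed).

```python
# Partial keyword table; facilities run from 0 to 23.
FACILITY_KEYWORDS = {0: "kern", 1: "user", 2: "mail", 3: "daemon", 4: "auth"}
SEVERITY_KEYWORDS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def encode_pri(facility: int, severity: int) -> int:
    """PRI = facility * 8 + severity."""
    return facility * 8 + severity


def decode_pri(pri: int) -> tuple:
    """Split a PRI value back into (facility, severity)."""
    if not 0 <= pri <= 191:  # 23 * 8 + 7 is the largest valid value
        raise ValueError("PRI out of range: " + str(pri))
    # Integer division gives the facility; the remainder is the severity.
    return divmod(pri, 8)


def describe_pri(pri: int) -> str:
    facility, severity = decode_pri(pri)
    name = FACILITY_KEYWORDS.get(facility, "facility" + str(facility))
    return name + "." + SEVERITY_KEYWORDS[severity]


print(decode_pri(19))    # facility 2, severity 3
print(describe_pri(19))  # mail facility, error-level severity
```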
In a basic syslog implementation, the PRI code is not usually written to the log. On modern
implementations, it is possible to configure the template used by the logging daemon to add
the string representations to the header. Similarly, more information may be added to the
header than just the timestamp and host name.
The message part contains a tag showing the source process plus content. The format
of the content is application dependent. It might use space- or comma-delimited fields
or name/value pairs, such as JSON data.
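To make the PRI, header, and message parts concrete, here is a minimal parsing sketch in Python. It assumes the classic layout of PRI, timestamp, host name, tag, and content, and it only handles name=value content. The sample line, host name, and the parse_syslog helper are invented for illustration, and real messages vary more than this pattern allows.

```python
import re

# Pattern for the classic layout: <PRI>timestamp host tag[pid]: content
SYSLOG_LINE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^:\[\s]+)(?:\[(?P<pid>\d+)\])?: ?"
    r"(?P<content>.*)$"
)


def parse_syslog(line: str) -> dict:
    """Split one syslog line into its PRI, header, and message fields."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        raise ValueError("line does not match the expected syslog layout")
    fields = match.groupdict()
    fields["pri"] = int(fields["pri"])
    # Content format is application dependent; this sketch only picks out
    # space-delimited name=value pairs.
    fields["pairs"] = dict(
        item.split("=", 1) for item in fields["content"].split() if "=" in item
    )
    return fields


# Hypothetical sample message; the host name and content are invented.
sample = "<19>Jan  1 00:00:01 mx1 postfix/smtp[2301]: status=deferred relay=none"
print(parse_syslog(sample))
```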
The original syslog protocol has some drawbacks. Using UDP as the transport does
not guarantee delivery, so messages could be lost in a congested network. Also, it does
not supply basic security controls to ensure confidentiality, integrity, and availability of
log data. Messages are not encrypted in transit or in storage, and any host can send
data to the syslog server, so an attacker could cause a DoS by flooding the server with
misleading data. A man-in-the-middle attack could compromise the integrity of message
data. In response to these shortcomings, newer syslog implementations introduce
security features, many of which are captured in the standard proposal tools.ietf.org/
html/rfc3195, which includes:
• The ability to use TCP (port 1468) for acknowledged delivery, instead of
unacknowledged delivery over UDP (port 514).
• The ability to use Transport Layer Security (TLS) to encrypt message content in
transit.
Syslog implementations may also provide additional features beyond those specified
in RFC 3195, such as message filtering, automated log analysis capabilities, event
response scripting (so you can send alerts through email or text messages, for
example), and alternate message formats.
Note that syslog can refer to the protocol used to transfer log data, the server (daemon)
used to implement logging, or to the format of log entries. Most systems implement an
updated version of the daemon (syslog-ng or rsyslog).
Beyond OS event logs, various log formats have been developed for the specific purpose of
exchanging event data between security tools, such as from an IDS or firewall to a SIEM. You
can find an overview of these formats at secef.net/tutorials.
Review Activity
Log and SIEM Tools
Answer the following questions to test your understanding of the content covered in
this topic.
1. What options are there for ingesting data from a unified threat
management (UTM) appliance deployed on the network edge to a SIEM?
2. Which two factors do you need to account for when correlating an event
timeline using a SIEM?
3. True or false? Syslog uses a standard format for all message content.
4. Which default port do you need to allow on any internal firewalls to allow a
host to send messages by syslog to a SIEM management server?
Lab Activity
Configuring SIEM Agents and Collectors
Scenario
A security information and event management (SIEM) system assists security
monitoring and incident response by aggregating and correlating log and network
traffic data within a single management and reporting interface. In this lab, you will use
different methods of configuring data sources for shipping logs to the SIEM. We will use
the Security Onion (securityonion.net) appliance, which implements the Elastic Stack
(elastic.co) for SIEM functionality.
Lab Setup
If you are completing this lab using the CompTIA Labs hosted environment, access the
lab using the link provided. Note that you should follow the instructions presented in
the CompTIA Labs interface, NOT the steps below. If you are completing this lab using
a classroom computer, use the VMs installed to Hyper-V on your HOST computer, and
follow the steps below to complete the lab.
Start the VMs used in this lab in the following order, adjusting the memory allocation
first if necessary, and waiting at the ellipsis for the previous VMs to finish booting
before starting the next group. You do not need to connect to a VM until prompted to
do so in the activity steps.
1. UTM1 (512–1024 MB)
2. DC1 (1024–2048 MB)
3. SIEM1 (4096–6144 MB)
...
4. MS1 (1024–2048 MB)
...
5. PC1 (1024–2048 MB)
6. PC2 (512–1024 MB)
If you can allocate more than the minimum amounts of RAM, prioritize SIEM1.
Configure a Sensor Interface
The SIEM1 VM running Security Onion has two interfaces. eth0 is configured with the
IP address 10.1.0.246 and is used as a management interface. eth1 has no IP address
and is used only to sniff traffic from the local network. All the VMs are connected to the
vLOCAL switch implemented in Hyper-V. To enable eth1 to sniff traffic, it is configured
as a port mirroring destination interface. Run a script to configure the source interfaces
and test that the sensor can sniff traffic.
1. On the HOST, open a PowerShell prompt as administrator and run the following
script:
C:\COMPTIA-LABS\LABFILES\EnablePortMirroring.ps1
This script configures the Windows and UTM1 VMs as source interfaces for port
mirroring. Any traffic they process will be copied to the port that SIEM1's eth1
sensor interface is connected to.
Lab topology—The Hyper-V settings allow SIEM1 to sniff traffic passing over the vLOCAL switch. The
sniffing/sensor interface is separate from the management interface and has no IP address. It operates
as a passive sensor. (Images © 123rf.com)
3. Open a connection window for the SIEM1 VM and log on as siem with the
password Pa$$w0rd.
4. Right-click the desktop and select Open Terminal. Run the following command to
test port mirroring, entering Pa$$w0rd when prompted to confirm the use of
sudo:
sudo tcpdump -ni eth1 ip
The -n switch suppresses name resolution and the ip filter omits IPv6 traffic.
Make sure you can see unicast traffic between other hosts (10.1.0.1 to 10.1.0.2, for
instance).
If you don't see unicast traffic, use the Settings dialog for each VM to verify that the
adapter is set as Source under Network Adapter > Advanced Features > Mirroring
mode.
This traffic is being monitored by the Bro (now called Zeek) passive network
sniffer (zeek.org). Bro's rules reduce this traffic stream to "interesting" events.
These events are written to the SIEM logging engine, powered by the Elastic Stack
(Logstash, Elasticsearch, and Kibana).
The output should show that each service is OK. If there is a warning message
that Logstash is still initializing, you might not see immediate results as you
complete the activities.
7. From the desktop, right-click the Kibana icon and select Open. Log on with the
username siem and password Pa$$w0rd.
8. Under Bro Hunting, select Connections. Scroll down the page to verify that hosts
from the 10.1.0.0/24 network are present.
Bro/Zeek performs passive analysis on traffic received by the sensor and collates statistics and
generates alerts for packets or conversations that match a rule pattern. The Kibana dashboard
presents the data generated by Zeek as visualizations in one or more dashboards. (Screenshot Kibana
in the Elastic Stack elastic.co)
11. Switch back to the SIEM1 VM. In Kibana, under Alert Data, view the Bro Notices
and NIDS categories for scanning activity alerts. If there are no results, click the
Update button. You may need to be patient or check again after completing other
tasks in this lab.
The NIDS alert is generated by the Snort IDS engine and ruleset.
Viewing a summary of alerts produced by the NIDS sensor and ruleset (Snort) in Kibana. The
classification of the event as "Web Application Attack" is drawn from the classtype attribute in the Snort
rule. (Screenshot Kibana in the Elastic Stack elastic.co)
As you edit the file, be aware that YAML files are whitespace-sensitive. Settings are
grouped by indentation. You must not use tabs to indent, however. This file uses
two spaces per indentation level, which is the widely accepted convention.
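A quick way to catch stray tabs before testing a configuration file is to scan its indentation. The following Python sketch is one possible check. The helper name find_tab_indents is an assumption, and the two YAML fragments are illustrative strings written for this example, not the lab's actual file.

```python
def find_tab_indents(yaml_text: str) -> list:
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad_lines = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        # Isolate the leading run of spaces and tabs on this line.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad_lines.append(number)
    return bad_lines


# Illustrative fragments only: the first is indented with two spaces,
# the second with a tab character.
good = 'output.logstash:\n  hosts: "10.1.0.246:5044"\n'
bad = 'output.logstash:\n\thosts: "10.1.0.246:5044"\n'

print(find_tab_indents(good))
print(find_tab_indents(bad))
```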
hosts: "10.1.0.246:5044"
In the Elastic Stack, Elasticsearch provides the storage and query functionality.
Logstash is an engine for collecting different types of data from various sources
via a pipeline. The pipeline takes inputs—such as syslog or a Beats agent—and
filters the data to normalize it.
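The idea of normalization can be sketched in Python as mapping each source's field names onto one shared schema. This is a conceptual illustration only. It is not Logstash filter syntax, and the alias table, field names, and sample records are all assumptions made for the example.

```python
# Hypothetical source-specific field names mapped to one common schema.
FIELD_ALIASES = {
    "src": "source_ip",
    "src_ip": "source_ip",
    "SourceAddress": "source_ip",
    "dst": "destination_ip",
    "dest_ip": "destination_ip",
    "host": "hostname",
    "Computer": "hostname",
}


def normalize(record: dict, source_type: str) -> dict:
    """Rename known aliases so records from different inputs share field names."""
    normalized = {"source_type": source_type}
    for key, value in record.items():
        # Unknown fields are passed through unchanged.
        normalized[FIELD_ALIASES.get(key, key)] = value
    return normalized


# Two invented records describing similar activity from different inputs.
from_syslog = normalize({"src": "10.1.0.254", "host": "UTM1"}, "syslog")
from_beats = normalize({"SourceAddress": "10.1.0.1", "Computer": "DC1"}, "beats")

print(from_syslog)
print(from_beats)
```

Once both records use source_ip and hostname, a single query or correlation rule can cover events from either input.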
6. Open PowerShell as administrator and run the following commands to test the
configuration file:
7. Run the following two commands to install the agent as a service, and start the
service:
.\install-service-winlogbeat
start-service winlogbeat
8. Switch to the SIEM1 VM. At the terminal, run the following command:
sudo so-allow-view
The output shows that the firewall has already been configured to allow traffic
over the Beats port 5044. Note that Logstash and other components run in
Docker containers.
9. In the Kibana app, check the Bro Notices and NIDS dashboards if you have not
previously seen any alerts. Also check the Beats dashboard under Host Hunting. It
may take time for events from the DC1 VM to start appearing, however. Use the
Update or Refresh button to check for new alerts after you have finished other
tasks in the lab.
Configure Application Logging
The default Beats configuration for a Windows Server just captures the application,
system, and security logs. This will produce a lot of data, much of which will not really
be relevant to incident detection or threat hunting. You will often want to configure
application logs to send data to the SIEM. As an example, on MS1, configure IIS to send
access logs to Event Viewer and install the Beats agent to forward it to the SIEM.
1. Open a connection window for the MS1 VM and log on as 515support\
Administrator with the password Pa$$w0rd.
2. In Server Manager, select Tools > Internet Information Services (IIS) Manager.
3. In IIS Manager, select the MS1 server and double-click the Logging applet in the
middle pane.
Note the options for log format, but leave it set to W3C.
6. Open PowerShell as administrator and run the following command to check the
name of the event log capturing IIS access events (ignore any line break):
- name: Microsoft-IIS-Logging/Logs
9. Save and close the file, selecting Yes when prompted to switch to Administrator
mode. Switch back to the PowerShell prompt. Run the following commands to
test the configuration file:
10. Run the following two commands to install the agent as a service, and start the
service:
.\install-service-winlogbeat
start-service winlogbeat
11. Use the PC1 and PC2 VMs to generate some network activity, such as copying
files from the \\DC1\labfiles share, browsing the http://updates.corp.515support.com
website, and using Zenmap to scan 10.1.0.2.
3. Check the I accept box and click Install. Accept the UAC prompt. When setup
completes, check the Run Agent configuration interface box and click Finish.
Confirm the UAC prompt.
This associates the agent with the manager running on SIEM1. It is possible to
authenticate this connection, but we have skipped that for this lab.
5. Switch back to the Wazuh Agent Manager dialog. Click the Refresh button. A key
should be loaded into the Authentication key box.
Do not be concerned if the agent dialog still shows the status as "Stopped."
Perform a Query
To extract and aggregate records from the SIEM's database, you need to be able to
construct string search patterns as the basis for more complex queries.
1. Switch to the SIEM1 VM. In the Kibana app, check the dashboards for new alert
sources, including the OSSEC dashboard under Host Hunting. Use the Update or
Refresh button to check for new alerts.
2. Click the Management tab, select Index Patterns, and then click the Create
index pattern button.
3. In the Index pattern box, type logstash-ossec-* and then click Next step.
4. From the Time Filter field name list box, select I don't want to use the Time
Filter. Click the Create index pattern button.
5. Click the Discover tab. From the list box currently set to *:logstash-*, select
logstash-ossec-*.
6. In the Search box, type the following filter string and then click the Update button.
Querying source log files in Kibana. (Screenshot Kibana in the Elastic Stack elastic.co)
You can also build fields using the Available fields list. This will show you the top values for a
particular category.
Configure a syslog Source
Some hosts are not compatible with agents, or you may have a configuration or
security reason for not installing an agent. In this scenario, you can use syslog to
transfer event data from the host to the SIEM. To illustrate this, configure remote syslog
on the UTM1 VM, which is running the pfSense security appliance (pfsense.org).
1. Switch to the PC1 VM and open http://10.1.0.254 in the browser.
2. Log on to the web admin app using the username admin and password
Pa$$w0rd. Maximize the window.
3. Select Status > System Logs. Click the Settings tab.
4. Scroll down to the Remote Logging Options section. Check the Enable Remote
Logging box.
6. From Remote Syslog Contents, check only System Events and Firewall Events.
Click Save.
7. Switch to the SIEM1 VM. In the Kibana app, click the Management tab, select
Index Patterns, and then click the Create index pattern button.
8. In the Index pattern box, type logstash-syslog-* and then click Next step.
9. From the Time Filter field name list box, select I don't want to use the Time
Filter. Click the Create index pattern button.
10. Click the Discover tab. From the list box currently set to logstash-ossec-*, select
logstash-syslog-*.
11. In the Search box, type the following filter string and then click the Update button.
syslog-sourceip
12. You have explored some options for ingesting log and network traffic sources into
a SIEM. What will be the next step in configuring this SIEM deployment?
• For each VM that is running, right-click and select Revert to set the configuration
back to the saved checkpoint.
Topic 4B
Analyze and Query Logs and SIEM Data
Once you have a system for collecting and normalizing security information, the next
phase in the intelligence cycle is analysis and production. As a CySA+ professional, you
must be able to use query and scripting tools to facilitate analysis of large and complex
datasets.
SIEM Dashboards
A SIEM will help with most of the regular duties involved in staffing a SOC or CSIRT,
such as:
• Perform triage on alerts, escalating true positives to incident response and
dismissing false positives.
• Review security data sources to check that log collection and information feeds are
functioning as expected.
• Review CTI to identify priorities or potential impacts from events occurring at other
companies and all over the Internet.
You may interpret security incidents differently depending on your judgement of an overall
threat level. You should be alert to internal projects that increase risk—product development
that may entice competitors to try to spy on you or new and recent hires, for instance.
Externally measured threats will also change your overall threat level. For example, a
zero-day vulnerability such as the OpenSSL Heartbleed exploit raises the threat level for all
organizations.
• Identify opportunities for threat hunting, based on CTI and overall alert and incident
status.
Different visualizations in an Elastic Stack dashboard running on Security Onion. The line graph shows
changes in the volume of alerts over time. The pie graph shows the balance of severe-to-informational
alerts. The table shows the events reported most often. (Screenshot Security Onion securityonion.net)
Selecting the right metrics for the dashboard is a critical task. As space is limited, only
information that is directly actionable should be included. Each widget selected should
be designed to support an analyst workflow. Common security key performance
indicators (KPI) include:
• The number of vulnerabilities, by service type, that have been discovered and
remediated.
• The number of failed log-ons or unauthorized access attempts.
• The number of systems currently out of compliance with security requirements.
• The number of security incidents reported within the last month.
• The average response time for a security incident.
• The average time required to resolve a help-desk call.
• The current number of outstanding or unresolved technical issues in a project or
system.
• The number of employees who have completed security training.
• Percentage of test coverage on applications being developed in-house.
You may also configure multiple dashboards for different audiences. For example, the
metrics discussed above are relevant to the security team. A separate dashboard could
be configured for reporting to management.
To learn more, watch the video “Using SIEM Dashboards” on the CompTIA Learning Center.
Behavioral Analysis
Behavior-based detection (or statistical- or profile-based detection) means that the
engine is trained to recognize baseline traffic or expected events associated with a user
account or network device. Anything that deviates from this baseline (outside a defined
level of tolerance) generates an alert. The engine does not keep a record of everything
that has happened and then try to match new traffic to a precise record of what has
gone before. It uses heuristics to generate a statistical model of what the baseline looks
like. It may develop several profiles to model behavior at various times of the day. This
means that the system generates false positives and false negatives until it has had time
to improve its statistical model of what is normal.
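A minimal sketch of this idea in Python follows. It builds a separate statistical profile for each hour of the day and flags a new observation that falls outside a tolerance. The training data, the function names, and the choice of three standard deviations as the tolerance are assumptions for illustration, not how any particular engine works.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_profiles(history):
    """Build one (mean, spread) profile per hour from (hour, count) samples."""
    by_hour = defaultdict(list)
    for hour, count in history:
        by_hour[hour].append(count)
    return {hour: (mean(counts), pstdev(counts)) for hour, counts in by_hour.items()}


def is_deviation(profiles, hour, count, tolerance=3.0):
    """Return True when count falls outside the tolerance for that hour's profile."""
    if hour not in profiles:
        return True  # no baseline for this hour yet, so treat it as unexpected
    average, spread = profiles[hour]
    # Floor the spread so a very flat baseline does not alert on tiny changes.
    return abs(count - average) > tolerance * max(spread, 1.0)


# Invented training data: log-ons per hour for one account at 09:00 and 03:00.
history = [(9, 40), (9, 42), (9, 38), (9, 41), (3, 0), (3, 1), (3, 0), (3, 1)]
profiles = build_profiles(history)

print(is_deviation(profiles, 9, 43))  # close to the 09:00 baseline
print(is_deviation(profiles, 3, 25))  # far above the 03:00 baseline
```

With only four samples per hour, the model is crude. This is the training-period weakness described above.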
Anomaly Analysis
Anomaly analysis is the process of defining an expected outcome or pattern to
events, and then identifying any events that do not follow these patterns. This is useful
in tools and environments that enable you to set rules. If network traffic or host-based
events do not conform to the rules, then the system will see this as an anomalous
event. For example, the engine may check packet headers or the exchange of packets
in a session against RFC standards and generate an alert if they deviate from strict
RFC compliance. Anomaly analysis is useful because you don't need to rely on known
malicious signatures to identify something unwanted in your organization, as relying
on signatures alone can lead to false negatives.
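A rule-based check of this kind can be sketched in Python as follows. The three rules shown are illustrative examples of TCP flag combinations that do not occur in normal sessions. They are assumptions chosen for this sketch, not a complete statement of protocol compliance.

```python
def check_tcp_flags(flags: set) -> list:
    """Return a list of rule violations for one packet's TCP flag set."""
    findings = []
    if not flags:
        findings.append("no flags set")
    if {"SYN", "FIN"} <= flags:
        # A segment should not open and close a connection at once.
        findings.append("SYN and FIN set together")
    if {"SYN", "RST"} <= flags:
        findings.append("SYN and RST set together")
    return findings


print(check_tcp_flags({"SYN"}))         # normal connection request
print(check_tcp_flags({"SYN", "FIN"}))  # anomalous combination
print(check_tcp_flags(set()))           # anomalous: empty flag field
```

Because the rules describe what is expected rather than what is known to be malicious, a previously unseen tool that sends malformed packets would still be flagged.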
Behavioral analysis differs from anomaly analysis in that the latter prescribes the baseline
for expected patterns, and the former records expected patterns in relation to the entity
being monitored.
Trend Analysis
Trend analysis is the process of detecting patterns within a dataset over time and
using those patterns to make predictions about future events. Applied to security
intelligence, trend analysis can help you to judge that specific events over time are
related and possibly indicate that an attack is imminent. It can also help you avoid
unforeseen negative effects that result from an attack if you can't stop the attack
altogether. Aside from predicting future events, trend analysis also enables you to
review past events through a new lens. For example, when an incident happens, you'll
usually attribute it to one cause. However, after time has passed and you gather more
intelligence, you may gain a new perspective and realize that the nature of the cause is
different than you had originally thought.
A trend is difficult to spot by examining each event in a log file. Instead, you need
software to visualize the incidence of types of event and show how the number or
frequency of those events changes over time. Trend analysis can apply to frequency,
volume, or statistical deviation:
• Frequency-based trend analysis establishes a baseline for a metric, such as number
of NXDOMAIN (name error) DNS log events per hour of the day. If the frequency
exceeds (or in some cases undershoots) the threshold for the baseline, then an alert is raised.
• Statistical deviation analysis can show when a data point should be treated as
suspicious. Statistical analysis uses the concept of mean (the sum of all values
divided by the number of samples) and standard deviation. Standard deviation is a
measure of how close values in the set are to the mean. If most values are close to
the mean, standard deviation is low. Statistical techniques such as regression and
clustering can be used to determine whether a certain data point is not aligned with
the relationships that most data points share. For example, a cluster graph might
show activity by standard users and privileged users, based on behavioral metrics
such as which processes each type runs, which systems they access, and so on.
A data point that appears outside the two clusters for standard and administrative
users might indicate some suspicious activity by that account.
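The statistical deviation approach described in the list above can be sketched with awk. This is a minimal illustration under stated assumptions, not a production detector: the hourly counts are invented sample data, the function name flag_outliers is hypothetical, and the cutoff of two standard deviations is an arbitrary choice.

```bash
#!/bin/bash
# Hypothetical sample: DNS name-error events counted per hour (invented numbers).
counts="12 15 11 14 13 12 16 14 13 95"

# flag_outliers LIMIT: read whitespace-separated counts on stdin and print
# every value that lies more than LIMIT standard deviations from the mean.
flag_outliers() {
  tr -s ' ' '\n' | awk -v limit="$1" '
    NF { v[n++] = $1; sum += $1 }        # collect values and a running sum
    END {
      if (n == 0) exit
      mean = sum / n                     # sum of values / number of samples
      for (i = 0; i < n; i++) ss += (v[i] - mean) ^ 2
      sd = sqrt(ss / n)                  # population standard deviation
      if (sd == 0) exit                  # all values identical: nothing to flag
      for (i = 0; i < n; i++) {
        z = (v[i] - mean) / sd
        if (z > limit || z < -limit) print v[i]
      }
    }'
}

echo "$counts" | flag_outliers 2
```

With this sample only the value 95 is printed. A limit of 2 is used because, in a set of only ten values, no single point can lie more than three population standard deviations from the mean, so a higher cutoff would need a longer baseline.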
Trend analysis depends on the choice of metrics to baseline and measure. You should aim
to evaluate the effectiveness of each metric that you track, given that analyst hours
are a limited resource. Some areas for trend analysis include:
• Number of alerts and incidents and detection/response times—These types of
metrics show how well security operations are performing. You could potentially
also measure hours lost or impact in cost terms, though these things are hard to
measure and quantify.
• Network and host metrics—You can measure any number of network metrics
(volume of internal and external traffic, numbers of log-ons/log-on failures,
number of active ports, number of authorized or unauthorized devices, instances
Trend analysis can provide some defense against sparse attack techniques. The
problem with many monitoring systems is the profusion of false alarms. Each alert
takes a certain amount of analyst time to investigate. Where the system
is generating high numbers of alerts, a sizable proportion will go uninvestigated. A
sparse attack succeeds either because the sensitivity of the security software has been
turned down to try to reduce false positives or because the actions are buried within
the noise generated by the number of alerts. An attacker can also launch "blinding" or
diversionary attacks to disguise his or her actual target or intention.
In another sense, trend analysis can also refer to narrative-based threat awareness
and intelligence. For example, historically botnets used Internet Relay Chat (IRC) as a
command-and-control mechanism. Security researchers analyzed these techniques
and specified heuristic rulesets that were good at spotting IRC-based C&C mechanisms.
Consequently, the attackers stopped using IRC and started using SSL tunnels to bury
their communications amid the general HTTPS chatter of a regular network. It is vital to
keep up to date with the latest threat intelligence so that your security controls can be
configured and deployed appropriately.
One of the problems of this type of rule is that it must store persistent state data,
which takes up memory. The SIEM will only be able to store individual items of state
data for a limited period. If there are many correlation rules that use stateful data,
there will be a significant load on the host's processing resources.
Correlation rules depend on normalized data. For example, an IP address only has
value as data in context. It could be a source or destination IP address, it could
be statically or dynamically assigned, or it could be affected by a network address
translation (NAT) service. All these factors can affect whether indicators in one log,
such as a firewall log, can be correlated with those in another, such as
a web server's application log. Similarly, local time values can be affected by differences
in time zones or poor clock synchronization.
SIEM Queries
Where a correlation rule matches data as it is first ingested in the SIEM, a query
extracts records from among all the data stored for review or to show as a visualization.
The basic format of a query is:
Select (Some Fields) Where (Some Set of Conditions)
Sorted By (Some Fields)
Microsoft's blog introducing Azure log query language (azure.microsoft.com/en-us/blog/
azure-log-analytics-meet-our-new-query-language-2) provides a useful overview of query
syntax. Resources such as the Splunk documentation for Search Processing Language (docs.
splunk.com/Documentation/Splunk/8.0.0/Search/GetstartedwithSearch) will also help you to
understand the features and capabilities of SIEM search, query, and visualization tools.
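The basic select, filter, and sort shape of a query can be mimicked at the command line. The following is a rough sketch only: the CSV layout, the sample events, and the function name query_denied are all assumptions made for illustration, and a real SIEM query language offers far more than this.

```bash
#!/bin/bash
# Hypothetical firewall events as CSV: time,src_ip,dst_port,action
events=$(mktemp)
cat > "$events" <<'EOF'
2020-03-16T10:02:11Z,10.1.0.102,443,allow
2020-03-16T10:02:15Z,10.1.0.77,21,deny
2020-03-16T10:03:40Z,10.1.0.102,22,deny
2020-03-16T10:04:02Z,10.1.0.31,3389,deny
EOF

# query_denied FILE
#   Select  src_ip, dst_port     (fields 2 and 3)
#   Where   action is "deny"     (field 4)
#   Sorted  by dst_port, numerically
query_denied() {
  awk -F, '$4 == "deny" { print $2 "," $3 }' "$1" | sort -t, -k2,2n
}

query_denied "$events"
```

Here awk performs the selection and the condition test, and sort orders the result on the second output column.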
To learn more, watch the video “Reviewing Query Log” on the CompTIA Learning Center.
A complete description of regex syntax is beyond the scope of this course, but you can use
an online reference such as regexr.com or rexegg.com to learn it.
Option   Description
-i       By default, literal search strings in grep are case-sensitive. This option ignores case sensitivity.
-L       Like the behavior of the -v option, in that it returns the names of files without matching lines.
In Windows, you can use the find command for basic string matching. The findstr
command supports regex syntax.
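The grep options mentioned above (-i, -v, and -L) can be tried safely against throwaway files. This sketch assumes a grep that supports -L, as the GNU and BSD versions do; the file names and log contents are invented for the example.

```bash
#!/bin/bash
# Hypothetical sample logs created in a temporary directory.
logdir=$(mktemp -d)
printf 'Error: disk full\nsession opened for user root\n' > "$logdir/app1.log"
printf 'service started\nservice stopped\n' > "$logdir/app2.log"

# -i: case-insensitive match, so "error" also finds "Error".
grep -i 'error' "$logdir/app1.log"

# -v: invert the match and print only lines that do NOT contain the pattern.
grep -v 'root' "$logdir/app1.log"

# -L: print only the names of files that contain no matching line.
grep -L -i 'error' "$logdir"/*.log
```

The last command lists app2.log only, because app1.log contains a line that matches.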
Piping
The output of one command can be used as the input for another, a process
called piping. The pipe character (|) causes the command that follows it to take the
output of the preceding command as its input. For example, to return only the lines in
/var/log/syslog that deal with the NetworkManager process, while also cutting each line so
that only the date, time, source, and process display, you would enter:
grep "NetworkManager" /var/log/syslog | cut -d " " -f1-5 | sort -t " " -k3
In this example, the grep command feeds into the cut command, and then into the
sort command, producing a more focused output.
The head and tail Commands
The head and tail commands output the first and last 10 lines respectively of a
file you provide. You can also adjust this default value to output more or fewer lines.
The tail tool is useful for reviewing the most recent entries in a log file.
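As a brief illustration of adjusting the default line count, the following sketch uses the -n option on a generated file. The file and its contents are hypothetical.

```bash
#!/bin/bash
# Hypothetical 20-line log file created for illustration.
log=$(mktemp)
for i in $(seq 1 20); do echo "event $i"; done > "$log"

head -n 3 "$log"    # first three lines instead of the default ten
tail -n 2 "$log"    # last two lines, that is, the most recent entries
# tail -f "$log"    # would keep printing new lines as they are appended
```

The commented-out tail -f form is commonly used to watch a live log, but it does not return until interrupted, so it is left disabled here.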
To learn more, watch the video “Analyzing, Filtering, and Searching Event Log” on the
CompTIA Learning Center.
Scripting Tools
While issuing a search command sequence manually is useful for one-off analysis, in
many circumstances you might want to run searches on multiple files and according
to a schedule. To do this, you need to use the commands within the context of a script.
We will look at shell scripting languages for Linux (Bash) and Windows (PowerShell), but
be aware that languages such as Python and Ruby are also widely used for automation.
Bash
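Bash is a scripting language and command shell for Unix-like systems. It is the default
shell on most Linux distributions and was the default on macOS before version 10.15
(Catalina), which switched to zsh. Tools like grep, cut, and sort are separate utilities
that are typically run from the Bash shell. Beyond individual command entry, Bash can run complex scripts. Like standard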
programming languages, Bash supports elements such as variables, loops, conditional
statements, functions, and more. The following is an example of a simple Bash script
that uses the grep and cut commands:
#!/bin/bash
echo "Pulling NetMan entries..."
grep "NetworkManager" /var/log/syslog | cut -d " " -f1-5 > netman-log.txt
echo "NetMan log file created"
The first line of the script indicates which interpreter the system should use to run it, as
there are many different scripting languages. The echo lines simply print messages to
the console. The grep line pipes into cut to trim the syslog output as before, and
redirects (>) the results to a file called netman-log.txt.
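To run the same search across multiple files, a script can wrap the command in a loop. The following is a sketch under assumptions: the function name count_matches and the sample files (standing in for rotated syslog files) are hypothetical.

```bash
#!/bin/bash
# count_matches PATTERN FILE...: print "file: count" for each readable file.
count_matches() {
  pattern=$1
  shift
  for f in "$@"; do
    [ -r "$f" ] || continue
    # grep -c prints the number of matching lines (0 when there are none).
    echo "$f: $(grep -c -- "$pattern" "$f")"
  done
}

# Hypothetical sample files standing in for the current and rotated syslog.
dir=$(mktemp -d)
printf 'NetworkManager: up\ncron: job\nNetworkManager: down\n' > "$dir/syslog"
printf 'cron: job\n' > "$dir/syslog.1"

count_matches NetworkManager "$dir"/syslog*
```

A script like this could then be run on a schedule, for example from cron, so that the counts are collected without manual effort.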
Newer versions of Windows 10 include a Linux subsystem that supports the Bash shell.
awk
The awk tool is a scripting engine geared toward modifying and extracting data
from files or data streams, which can be useful in preparing data for analysis. Programs
and scripts run in awk are written in the AWK programming language. The awk
keyword is followed by the pattern, the action to be performed, and the file name. The
action is given within curly braces, and the pattern and action together are enclosed
in single quotes. If the pattern is not specified, the action is performed on all input
lines; if the action is not specified, each matching line is printed in full.
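The three forms described above (pattern with action, action only, and pattern only) can be compared side by side. The log lines in this sketch are invented sample data in a syslog-like layout.

```bash
#!/bin/bash
# Hypothetical syslog-style sample lines.
log=$(mktemp)
cat > "$log" <<'EOF'
Jan 11 05:33:59 lx1 sshd[811]: Failed password for root
Jan 11 05:34:02 lx1 cron[402]: job started
Jan 11 05:34:10 lx1 sshd[811]: Accepted password for admin
EOF

# Pattern and action: for lines matching sshd, print the time and process fields.
awk '/sshd/ { print $3, $5 }' "$log"

# Action only: no pattern, so the action runs on every line.
awk '{ print $4 }' "$log"

# Pattern only: no action, so each matching line is printed in full.
awk '/Failed/' "$log"
```

By default awk splits each line on whitespace, so $3 is the time and $5 is the process name in this layout.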
This will select all security event log entries whose events are type 5 (audit failure). It
will then output the source, the time the event was generated, and a brief message
about the event. This can be useful for finding specific events based on their details,
without being at the target computer and combing through Event Viewer.
Windows PowerShell
Windows administrators often use PowerShell to manage both local and remote hosts.
PowerShell offers much greater functionality than the traditional Windows command
prompt. PowerShell functions mainly through the use of cmdlets, which are specialized
.NET commands. These cmdlets typically take the syntax of Verb-Noun, such as
Set-Date, to change a system's date and time. Like other command shells, the cmdlet will
take whatever valid argument the user provides. PowerShell is also able to execute
scripts written in its language. Like Bash, the PowerShell scripting language supports a
wide variety of control structures.
The following is an example of a PowerShell script:
Write-Host "Retrieving logon failures..."
Get-EventLog -Newest 5 -LogName Security -InstanceId 4625 | select timewritten, message | Out-File C:\log-fail.txt
Write-Host "Log created!"
The Write-Host cmdlets function similarly to echo by printing the given text to the
PowerShell window. The Get-EventLog cmdlet line searches the security event log for
the latest five entries that match an instance ID of 4625, the log-on failure code. The
time the event was logged and a brief descriptive message are then output to the
log-fail.txt file.
Review Activity
Query Log and SIEM Data Analysis
Answer the following questions to test your understanding of the content covered in
this topic.
1. What type of visualization is most suitable for identifying traffic spikes?
2. You need to analyze the destination IP address and port number from some
firewall data. The data in the iptables file is in the following format:
DATE,FACILITY,CHAIN,IN,SRC,DST,LEN,TOS,PREC,TTL,ID,PROTO,SPT,DPT
Jan 11 05:33:59,lx1 kernel: iptables,INPUT,eth0,10.1.0.102,10.1.0.1,52,0x00,0x00,128,2242,TCP,2564,21
Write the command to select only the necessary data, and sort it by
destination port number.
3. Working with the same data file, write the command to show only the lines
where the destination IP address is 10.1.0.10 and the destination port is 21.
Lab Activity
Analyzing, Filtering, and Searching
Event Log and syslog Output
Scenario
When you have set up appropriate data sources for a security information and event
management (SIEM) system, the next challenge is to extract actionable intelligence
from it. A SIEM will be used to respond to incidents in real time and also to perform
threat hunting for incidents that might not have been detected. Both these use
cases will depend on effective queries, filters, and visualizations so that analysts are
presented with useful information and not overloaded by false positive alerts. To
demonstrate some of the use of these tools, we will continue to use the Security Onion
(securityonion.net) appliance.
Lab Setup
If you are completing this lab using the CompTIA Labs hosted environment, access the
lab using the link provided. Note that you should follow the instructions presented in
the CompTIA Labs interface, NOT the steps below. If you are completing this lab using
a classroom computer, use the VMs installed to Hyper-V on your HOST computer, and
follow the steps below to complete the lab.
Start the SIEM1 VM only to use in this lab, adjusting the memory allocation first, if
necessary.
Analyze a Dashboard
Where the Sguil tool is used to manage and categorize alerts, escalating or dismissing
them as appropriate, Squert (squertproject.org) provides an overview of current status.
Analysis at this operational or "big picture" level is just as important as at the tactical
level.
1. Open a connection window for the SIEM1 VM and log on as siem with the
password Pa$$w0rd.
2. From the desktop, right-click the Squert icon and select Open. In the browser,
if prompted, sign in to the app with the username siem and the password
Pa$$w0rd.
3. Click the INTERVAL link to open the date picker. If necessary, select 2020 and
then Mar and Mon16.
You are now looking at the alerts raised by the IDS (Snort) as a result of sample
packet captures that were replayed through the sensor. Note that there are
some other alerts from the OSSEC agent installed on Security Onion. If you were
to configure other sensors and agents, that data would also be collected and
summarized here.
The SC and DC columns show the number of source and destination IPs involved
for each event signature, while the activity column shows the number of events
of that signature per hour. This kind of dashboard is designed to provide overall
threat status reporting. The default view shows only queued events that have not
yet been analyzed and categorized by an incident handler.
4. At the top of the page, select the Summary tab. This tab shows you information
about which signatures, IP addresses, and ports are most active. It also displays
location information for each public IP and summarizes connections by source
and destination IPs and countries.
5. Select the Events tab again. Click the Queue box (with the value 24) for the ET
TROJAN Possible Windows executable sent event (2009897).
6. Click the first Queue box for the expanded event. The indicators comprising the
event are shown.
7. Click the first Event ID (3.47). The packet capture underlying the detected event is
shown in a new tab.
Note the GET request for a cryptically named php file with some parameters
whose functionality is opaque. As this is an old threat, we could look up the functionality
of this malware in a threat database. If it were an unknown threat, we could
isolate the host and use a sandbox to try to determine the code's functions.
9. On the Squert tab, click the source IP address starting 195. A number of
information sources is shown. Select Kibana. In Kibana, click the date picker and
select Last 5 years.
The current view has applied the IP address as a search term in the bar at the top.
You can see how many alerts the address is associated with and how many times
it appears as a source and destination address.
cd /var/log && ls
This is the primary log folder for Linux. Some of the log files are processed by
syslog, while others are written directly by the application. Most application-
written logs are stored in subdirectories. For example, the web server Apache
writes to /var/log/apache2 or /var/log/httpd.
2. At the terminal, enter man grep and note the options available with the grep
command. The grep command is an extremely useful tool for searching any file,
not just logs.
3. Scroll through the grep manual until you return to the prompt. Alternatively, enter
q to return to the prompt.
4. Run the following command, entering Pa$$w0rd when prompted:
sudo grep root syslog
This shows all instances of the string root in the syslog file. You can search for any
text string in any file this way. As with most things in Linux, these searches are
case-sensitive by default.
sudo grep root syslog*
This command searches for the word "root" in all files that start with syslog,
including syslog.1. The log rotation system usually backs up the last log as .1,
while older logs are gzipped.
sudo grep -i error syslog
The -i flag makes the search case-insensitive.
7. How would you use grep to look for a negative match for a pattern rather than a
positive match?
sudo cut -c1-32 syslog
This command displays the first 32 characters of each log item in the file.
Adjusting for the length of the host name, this sort of command can show the
date and time, user, process name, and process ID of each log item.
sudo cut -c31- syslog
This command displays from character 31 to the end of the line.
One complication with this sort of approach is that logs can be in different formats.
4. Run the following commands and compare the output to the auth log:
cd ~/Downloads
head conn-sample.log
This log file is generated by the Bro IDS. Bro usually logs in JSON format, but this
has been changed to tab-delimited in this sample log file. head displays the first
10 lines from the file. In the case of this Bro log, field definitions are included.
When you have a log file using standard delimiters, use cut to extract fields
from the source file. The -f flag enables you to select fields by number. The -d flag
enables you to specify what separates (delimits) each field. The tab is the default,
however, so you do not need to use -d with this file.
5. Use the output to work out the column numbers for source IP (id_orig_h),
destination port (id_resp_p), and orig_bytes (payload data sent by the originator of
the connection). Then, run the command.
6. Run clear to remove the previous output. Pipe the command to sort so that it
is shown in descending order of byte count.
7. Run clear to remove the previous output. Sort by port number in ascending
order and then by byte count in descending order.
8. Run clear to remove the previous output. See if you can construct a regular
expression to filter the output to IPv4 addresses only. You will need to use grep
-E or egrep.
• For each VM that is running, right-click and select Revert to set the configuration
back to the saved checkpoint.
Lesson 4
Summary
You should be able to explain the factors to consider when deploying a SIEM plus the
options for collecting network and log data from diverse sources. You should also be
able to use dashboards, string search, and queries to extract relevant information from
data sources.
• Ensure that logs are stored within secure architecture with appropriate access
permissions and tamper protection.
• Identify the analysis methods used to query and filter data to produce alerts,
including conditional, heuristic, behavioral, and anomaly-based. Evaluate methods
against the numbers of false positives and false negatives.
• Make command-line tools such as grep, cut, and sort available for manual analysis.
Consider the use of scripts to automate detection functions not supported by a
SIEM.