Context Driven Scalable SIEM Solution
Dr. Ertuğrul AKBAŞ
eakbas@gmail.com
Cyber-attacks have grown exponentially more frequent and sophisticated, demanding near real-
time, highly available, and automated responses to threats. The global cost of cybercrime has
already grown to $100 billion annually [1], not counting the intangible damage to enterprise
and government security. In addition to data loss, security breaches can cause
immeasurable, and sometimes irrevocable, damage to a brand.
Analyzing machine data from firewalls and perimeter devices in real time is vital to thwarting
and predicting threats. Every router, switch, firewall, intrusion prevention system (IPS), web
proxy, or other security element has a story to tell about the confidentiality, integrity, and
availability of the IT environment. Relevant data from across these systems is critical to
investigations as well as for continuous monitoring for situational awareness. However, the real
return on investment for security solutions lies in making them work together to provide a
comprehensive view of the enterprise security posture. This combined and chronological view
of all relevant data allows the security team to prioritize events and responses, and to effectively
engage with IT operations and other areas of the business.
The Methodology
SIEM solutions are usually used for real-time threat monitoring, incident forensics,
demonstrating regulatory compliance, and streamlining IT operations. In most organizations,
these functions are designed with the intent of leveraging them to protect sensitive data. In such
scenarios, SIEM can be effectively integrated with:
• Application Security Solutions
• DDoS Protection Solutions
• Firewalls
• Secure Mail & Web Gateways
• DLP Systems
• IPS
• End Point Security Solutions
• Database Security Systems
• OSs
The methodology presented in this paper is based on the ability to identify and understand the
flow of log streams. Understanding and decoding log flow is the first step. The output of this step
is a set of categorized event streams such as:
Malicious->DNS->Attack
Compromised->Virus->Attachment->Not Cleaned
Informational->VPN->Tunnel->Failed
Labeling, categorization and identification can be used interchangeably. This log identification
can be used for scenario-based correlation, but might also be used for any number of other
controls.
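The labeling step can be illustrated with a minimal sketch. It assumes that each signature is a regular expression paired with a category path; the patterns and sample log lines below are hypothetical and are not taken from SureLog's signature set.

```python
import re

# Hypothetical signature table: (pattern, category path from general to specific).
# A real deployment would load many more signatures from a maintained store.
SIGNATURES = [
    (re.compile(r"dns.*(tunnel|amplification)", re.I),
     ("Malicious", "DNS", "Attack")),
    (re.compile(r"virus.*attachment.*not cleaned", re.I),
     ("Compromised", "Virus", "Attachment", "Not Cleaned")),
    (re.compile(r"vpn.*tunnel.*failed", re.I),
     ("Informational", "VPN", "Tunnel", "Failed")),
]

def categorize(line):
    """Return the category label of the first matching signature, or None."""
    for pattern, path in SIGNATURES:
        if pattern.search(line):
            return "->".join(path)
    return None

if __name__ == "__main__":
    for sample in ("VPN tunnel to 192.0.2.10 failed",
                   "Virus found in attachment invoice.zip: not cleaned"):
        print(categorize(sample))
```

The first matching signature wins in this sketch, so the order of the table matters; a production engine would need an explicit priority scheme.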
This technology gives us the power to define human-readable correlation rules such as:
“Visit a website and suddenly make lots of connections”
Once a log or log stream is labeled, it is no longer just a log: it represents a process
in your network. These labels represent the state of each SIEM-integrated device or application, such as
Application Security Solutions, DDoS Protection Solutions, Firewalls, Secure Mail & Web
Gateways, DLP Systems, IPS, End Point Security Solutions, and Database Security Systems.
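A rule such as “Visit a website and suddenly make lots of connections” could be expressed over labeled events roughly as follows. This is a sketch under stated assumptions: the two category labels, the 60-second window and the threshold of 20 connections are hypothetical values chosen for illustration, not values from the product.

```python
WEB_VISIT = "Informational->Web->Visit"            # hypothetical label
CONNECTION = "Informational->Network->Connection"  # hypothetical label

def burst_after_visit(events, window=60, threshold=20):
    """Flag sources that open `threshold` connections within `window`
    seconds of a web visit.

    `events` is an iterable of (timestamp_seconds, source, category)
    tuples, assumed to be sorted by timestamp.
    """
    last_visit = {}   # source -> time of its most recent web visit
    count = {}        # source -> connections seen since that visit
    alerts = []
    for ts, src, category in events:
        if category == WEB_VISIT:
            last_visit[src] = ts
            count[src] = 0
        elif category == CONNECTION and src in last_visit:
            if ts - last_visit[src] <= window:
                count[src] += 1
                if count[src] == threshold:
                    alerts.append((src, last_visit[src], ts))
    return alerts
```

Because the rule refers only to category labels, it does not depend on which proxy or firewall produced the underlying logs.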
Some previous work also addresses log content analysis and proposes classifications such as [2]:
Authentication and Authorization Reports
Systems and Data Change Reports
Network Activity Reports
Resource Access Reports
Malware Activity Reports
Failure and Critical Error Reports
Our taxonomy has nearly 300 categories, with sub-categories.
Once the feeds are incorporated and the best possible coverage has been achieved, the detected
categories are ready for rule definition. Correlation rules can also correlate events via their
taxonomy, allowing the creation of device-independent correlation rules [3].
The Taxonomy Algorithm
No matter the source of the event, or the format it originated in, there are types of system and
network events common across many system types. A security analyst who wants to see all user
logins within a certain time period should not have to know the specific attributes of
each event type on each system type to retrieve that information. SureLog maintains a
taxonomy of event types to which normalized fields are matched and through which they can be
retrieved. Correlation directives can also correlate events via their taxonomy, allowing the
creation of device-independent correlation rules.
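The idea of device-independent retrieval can be sketched as follows. The device names, raw field names and taxonomy labels are illustrative assumptions rather than SureLog's actual schema; the point is only that the query at the end never mentions a device-specific attribute.

```python
# Hypothetical per-device mappings from raw field names to normalized fields.
FIELD_MAPS = {
    "linux_sshd": {"user": "acct", "src_ip": "rhost", "time": "ts"},
    "windows_security": {"user": "TargetUserName", "src_ip": "IpAddress",
                         "time": "TimeCreated"},
}

# Hypothetical mapping from (device type, device event id) to a taxonomy label.
TAXONOMY_MAP = {
    ("linux_sshd", "accepted_password"): "Authentication->Login->Success",
    ("windows_security", "4624"): "Authentication->Login->Success",
    ("windows_security", "4625"): "Authentication->Login->Failed",
}

def normalize(device, event_id, raw):
    """Translate a device-specific record into shared fields plus a taxonomy label."""
    event = {name: raw.get(src) for name, src in FIELD_MAPS[device].items()}
    event["device"] = device
    event["taxonomy"] = TAXONOMY_MAP.get((device, event_id), "Unclassified")
    return event

def logins_between(events, start, end):
    """All login events in [start, end], regardless of the originating device."""
    return [e for e in events
            if e["taxonomy"].split("->")[:2] == ["Authentication", "Login"]
            and start <= e["time"] <= end]
```

In this sketch timestamps are plain numbers in seconds; adding a new device type means adding one entry to each mapping table, while `logins_between` stays unchanged.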
A taxonomy aids in pattern recognition and also improves the scope and stability of correlation
rules. Our comprehensive log taxonomy is then applied in order to enable the cross-device,
cross-infrastructure correlation. This log taxonomy takes into account more than 400,000
distinct signatures to make sure that no matter the device, the message can be categorized.
Signatures are a way to match information in the log streams. Once the data are categorized,
the advanced correlation and alerting intelligence can be applied for prioritization of the logs.
The taxonomy is constructed of high-level, first-tier groups such as Access, Application,
Authentication, DoS, Exploit, Informational, Malware, Policy, Recon, Suspicious Activity,
System, etc. Each first-tier group is then broken down further into sub-groups and even further
as necessary, each lower tier representing more specific event classification. By referring to the
highest level of the Normalized Taxonomy, all lower-tier event classifications in that branch
are included in the selection. This allows the operator to select a more general event group, such
as Authentication, and all sub-group branches (Login, Logout, Password, etc.) and their
children (Admin Login, Database Login, Domain Login, etc.) of the Authentication parent will
also be included in the selection.
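Selecting a parent group together with all of its descendants amounts to a prefix match on the category path. The following is a minimal sketch, assuming each event carries its taxonomy as a tuple of tiers; the event records are hypothetical.

```python
def select(events, group):
    """Return events whose taxonomy path starts with `group`.

    `group` is written as 'Tier1->Tier2->...'. Comparing whole tiers
    (rather than raw string prefixes) avoids matching 'Login' against
    an unrelated tier such as 'LoginBanner'.
    """
    prefix = tuple(group.split("->"))
    return [e for e in events if e["taxonomy"][:len(prefix)] == prefix]

if __name__ == "__main__":
    events = [
        {"id": 1, "taxonomy": ("Authentication", "Login", "Admin Login")},
        {"id": 2, "taxonomy": ("Authentication", "Logout")},
        {"id": 3, "taxonomy": ("Malware", "Virus")},
    ]
    print([e["id"] for e in select(events, "Authentication")])
    print([e["id"] for e in select(events, "Authentication->Login")])
```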
Sample Execution:
To identify an abnormal health status on a load-balancing web switch, the identification
algorithm analyzes the logs from that log source, and an intelligent key search looks for
ERROR<vrrp>transmit-cannot-receive within the log streams.
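The key search in this example can be sketched as a literal substring scan over a stream of log lines. Only the key string comes from the text above; the source name, the category label and the surrounding log lines are hypothetical.

```python
HEALTH_SIGNATURE = "ERROR<vrrp>transmit-cannot-receive"

def health_alerts(stream, source="lb-switch"):
    """Yield one alert per log line that contains the health signature.

    `stream` can be any iterable of lines (an open file, a socket
    reader, a list), so the scan works on live feeds as well as files.
    """
    for line_no, line in enumerate(stream, start=1):
        if HEALTH_SIGNATURE in line:
            yield {
                "source": source,
                "line": line_no,
                "category": "System->Health->Abnormal",  # hypothetical label
                "raw": line.rstrip("\n"),
            }

if __name__ == "__main__":
    sample = [
        "INFO<vrrp>state master\n",                       # made-up line
        "ERROR<vrrp>transmit-cannot-receive vrid 1\n",    # made-up line
    ]
    for alert in health_alerts(sample):
        print(alert)
```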
The correlation engine has thousands of signatures covering most Application Security
Solutions, DDoS Protection Solutions, Firewalls, Secure Mail & Web Gateways, DLP Systems,
IPS, End Point Security Solutions, Database Security Systems, and OSs.
Attack Classification
Classifying attacks against log anonymization is an early step towards a comprehensive study
of the security of anonymization policies. If network owners can select classes of attacks that
they wish to prevent, they can then ensure that their anonymization policies meet their security
constraints, while allowing as much non-private information as possible to be revealed—thus
increasing a data set’s utility.
As described previously, we wish to provide network owners with a taxonomy of attacks, the
classes of which they can select to prevent, rather than having to focus on individual attacks.
We also wish to formally express relationships between attacks, allowing for expression of
attack groupings in a logic about anonymization. This taxonomy must be complete (every
known attack can be placed in at least one class) and mutually exclusive (no attack can be a
member of more than one class). The classes must be fine-grained enough for network owners
to select specific classes without seriously impacting the utility of a log. Finally, the classes
must be tied together in a more concrete way than a description in natural language.
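The completeness and mutual-exclusivity requirements can be checked mechanically once the classes are written down as sets. The sketch below assumes a taxonomy given as a mapping from class name to a set of attack names; the placeholder names in the usage example are hypothetical and carry no meaning.

```python
def check_taxonomy(known_attacks, classes):
    """Check a taxonomy for completeness and mutual exclusivity.

    Returns (unclassified, overlapping):
      unclassified - known attacks that belong to no class
                     (violates completeness)
      overlapping  - attacks mapped to the list of classes they fall in,
                     for attacks in more than one class
                     (violates mutual exclusivity)
    """
    membership = {
        attack: sorted(name for name, members in classes.items()
                       if attack in members)
        for attack in known_attacks
    }
    unclassified = sorted(a for a, found in membership.items() if not found)
    overlapping = {a: found for a, found in membership.items() if len(found) > 1}
    return unclassified, overlapping

if __name__ == "__main__":
    known = {"attack_a", "attack_b", "attack_c"}
    taxonomy = {"class_1": {"attack_a", "attack_b"}, "class_2": {"attack_b"}}
    print(check_taxonomy(known, taxonomy))
```

A taxonomy satisfies both requirements exactly when both returned collections are empty, that is, when the classes form a partition of the known attacks.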
References
1. “The Economic Impact of Cybercrime and Cyber Espionage”, July 2013, Center for Strategic
and International Studies
2. “Top 6 SANS Essential Categories of Log Reports 2013”, v 3.01
3. http://www.anetusa.net/surelog