Hunting for Attack Chains in Event Streams
Arctic Wolf Networks processes over 9 billion events a day across its customer base. These represent HTTP and DNS transactions from customer networks, log lines from customer infrastructure devices like firewalls and switches, Active Directory event logs, logs from cloud services such as Office 365, and more. It is critical for our security engineers to quickly pick out the small subset of these events that represent security threats facing our customers. Further, effective threat detection requires the ability to detect a sequence of related events. One example is detecting a certain threshold of login failures within a time period followed by a successful login, which might indicate a successful attempt to brute force a user account. Another is the download of an executable payload followed by an HTTP POST request to a suspicious site. In both these cases, the sequence of events taken together presents a stronger indicator of compromise than each of the individual events. Arctic Wolf Networks implemented this functionality by integrating Flink with EsperTech’s Esper Complex Event Processing streaming analytics engine. We greatly benefited from Flink’s ease of deployment and horizontal scalability, while its configurable thresholds for event lateness were crucial in enabling us to handle heterogeneous customer data sources that are not all in sync. Meanwhile, Esper offered a mature, highly expressive, and performant Complex Event Processing framework, a good fit for the flexibility required to express the logic desired by our security engineers. Together, Flink and Esper enhance our security engineers’ visibility into threats faced by our customers and reduce the time investment needed to identify these threats, allowing for more comprehensive and responsive customer service.
Strategies for Landing an Oracle DBA Job as a Fresher
Flink Forward San Francisco 2019: Hunting for Attack Chains in Event Streams - Ray Ruvinskiy & Jonathan Walsh
1. 2019 Arctic Wolf Networks, Inc. All rights reserved.1 AWN-Public
Ray Ruvinskiy and Jonathan Walsh
April 2, 2019
Hunting for Attack Chains
in Event Streams
2. 2019 Arctic Wolf Networks, Inc. All rights reserved.2 AWN-Public
Ray Ruvinskiy
• Lead of the Detection Automation Team. Our
team’s job is to help our Security Analysts
avoid the noise and zero in on the actionable.
Jonathan Walsh
• Member of the Threat Operations and
Analysis team. We utilize the tools provided
by the Detection Automation Team to
improve our threat detection.
About the Presenters
3. 2019 Arctic Wolf Networks, Inc. All rights reserved.3 AWN-Public
Security Operations Centres (SOCs)
deal with security issues on an
organizational level.
About Arctic Wolf Networks
What is a SOC?
4. 2019 Arctic Wolf Networks, Inc. All rights reserved.4 AWN-Public
AWN CyberSOC manages the security
detection for our customer, gathering
data from deployed network sensors
and logs from other security tools.
These logs are processed by our
system and the output is provided to
human experts to make decisions on.
About Arctic Wolf Networks
SOC-as-a-Service
5. 2019 Arctic Wolf Networks, Inc. All rights reserved.5 AWN-Public
Balance between being informative
and being noisy
Focus on what the client needs to deal
with immediately
Alert on actionable data
Rising above the noise
6. 2019 Arctic Wolf Networks, Inc. All rights reserved.6 AWN-Public
“
Crying wolf may have been the boy’s
undoing, but the true irony was that the
wolves were always lurking nearby.
Wes Fesler
American football player
7. 2019 Arctic Wolf Networks, Inc. All rights reserved.7 AWN-Public
Lifecycle of an attack
Attributes shared by network breaches
Source: https://www.lockheedmartin.com/en-us/capabilities/cyber/cyber-kill-chain.html
8. 2019 Arctic Wolf Networks, Inc. All rights reserved.8 AWN-Public
Applying Attack Chain to WannaCry Outbreaks
• Reconnaissance – social networking
• Weaponization – build ransomware binary
• Deliver/exploit – spearphishing
• Command and control – encrypted backchannel
• Actions on Objectives – encryption and lateral movement
Indicators at each of these steps can be noisy and prone to false positives. Correlation of events
focuses us on what is relevant.
9. 2019 Arctic Wolf Networks, Inc. All rights reserved.9 AWN-Public
Complex Event Processing
Over 13 billion events a day
10. 2019 Arctic Wolf Networks, Inc. All rights reserved.10 AWN-Public
Built for streaming
Out-of-order events
Scalability
Why Flink?
11. 2019 Arctic Wolf Networks, Inc. All rights reserved.11 AWN-Public
Our experience dates to 1.2.1
Difficulties with expressivity and scalability
• At the time, no counts
FlinkCEP
12. 2019 Arctic Wolf Networks, Inc. All rights reserved.12 AWN-Public
Mature and expressive streaming CEP framework
EPL – SQL-like Event Processing Language
Esper
13. 2019 Arctic Wolf Networks, Inc. All rights reserved.13 AWN-Public
select … from PsExecEvent#time(5 minutes)
#unique(target_ip)
#length_batch(10);
EPL
PsExec services created on at least 10 devices in 5 minutes
Typed event, with
properties
Stream composition
Property in
PsExecEvent
14. 2019 Arctic Wolf Networks, Inc. All rights reserved.14 AWN-Public
select … from pattern [
AntiVirusEvent -> IDSExploitEvent where timer:within(15 minutes)
];
EPL
Endpoint virus detection followed by network IDS exploit detection
”followed by” operator
15. 2019 Arctic Wolf Networks, Inc. All rights reserved.15 AWN-Public
create window LoginFailureWindow#time(5 minutes) as LoginEvent;
on LoginEvent(not login_success) as login_failure
merge LoginFailureWindow as failures
where login_failure.id = failures.id
when not matched then insert select *;
on LoginEvent(login_success) as login_success
select and delete … from LoginFailureWindow as failures
having count(*) >= 10;
EPL
A minimum number of login failures with no successes interspersed, then a success
16. 2019 Arctic Wolf Networks, Inc. All rights reserved.16 AWN-Public
A single, continuously-running Flink job
Complex Event Processing with Flink and Esper
Putting it all together
17. 2019 Arctic Wolf Networks, Inc. All rights reserved.17 AWN-Public
Events are ingested
Events are parsed and subjected to
sanity filtering
Timestamp extraction, watermark
assignment
SplitStream used to create a
stream per rule
Complex Event Processing with Flink and Esper
Putting it all together
Source
TimestampExtractor
FlatMap
Split
18. 2019 Arctic Wolf Networks, Inc. All rights reserved.18 AWN-Public
Each of the per-rule streams
aggressively filters events of interest
and converts to Tuples
• Reduce per-event memory footprint as much
as possible
keyBy rule-specific partition key to ensure
unrelated events are processed in isolation
• e.g., for a login failure rule we key by
customer ID and username
Complex Event Processing with Flink and Esper
Putting it all together
FlatMap
KeyBy
19. 2019 Arctic Wolf Networks, Inc. All rights reserved.19 AWN-Public
Esper operator
• Per-rule (potentially multi-statement) EPL that
defines the logic
• Priority queue to order events by timestamp
⎼ Same as FlinkCEP
• Send tuples to Esper engine
• Register subscriber for results
Complex Event Processing with Flink and Esper
Putting it all together
EsperOperator Esper Engine
events
results
20. 2019 Arctic Wolf Networks, Inc. All rights reserved.20 AWN-Public
Esper operator output formatted as
AWN events with a special type
These AWN events then undergo
another filter pass where some are
whitelisted depending on specific
rules defined by Security Analysts
Events that survive this pass then
brought to Security Analyst attention
for investigation
Complex Event Processing with Flink and Esper
Putting it all together
EsperOperator
Escalation
Service
21. 2019 Arctic Wolf Networks, Inc. All rights reserved.21 AWN-Public
AWS EMR (Elastic MapReduce)
• Managed Hadoop environment
• Easiest way to get started two years ago (Flink 1.2)
Flink 1.4.2
m5.2xlarge core nodes
• 8 virtual cores, 32 GB RAM
• 8 task slots per node
Flink/Esper at AWN
22. 2019 Arctic Wolf Networks, Inc. All rights reserved.22 AWN-Public
Over 30 rules
Over 13 billion events a day passing through Flink
• But only a small subset is evaluated by the rules
6 m5.2xlarge instances
Flink/Esper at AWN
23. 2019 Arctic Wolf Networks, Inc. All rights reserved.23 AWN-Public
24. 2019 Arctic Wolf Networks, Inc. All rights reserved.24 AWN-Public
Different data sources have different delays
• Events from cloud data sources (e.g., Office365) can in particular trickle in slowly
Time zone adjustments
• Normalize to UTC but a lot of data sources to keep track of
Flow of Time Difficulties
25. 2019 Arctic Wolf Networks, Inc. All rights reserved.25 AWN-Public
Together, Flink’s simple deployment patterns and scalability and Esper’s
powerful CEP engine enable us to pick out attack chains in streams of billions of
events.
In summary
Reconnaissance – determine entry point for social engineering attacks via LinkedIn, public emails
Weaponization – build ransomware binary
Deliver/exploit – spearphish a user, and trick user into triggering the binary
Command and control – establish links back to attacker servers for status
Lateral movement – once the first system is infected, scan for other accessible systems over tcp/445 (SMB) and use PowerShell to execute the EternalBlue vulnerability
Under 2 billion events a day when we started this work
Started looking at Flink in 1.2 days
Under 2 billion events a day when we started this work