The 3 Generations of Security Operations Centres
Follow the Bank of England’s journey with Splunk and discover how the UK’s central bank is transitioning its security operations centre towards a more automated future
6. Problems with an alert driven SOC
Alert fatigue: Expensive humans triaging alerts you have little control over
Known unknowns: How do you detect attacks your vendors have no knowledge of?
Vendor reliance: Reliant on your security vendor tools to detect attacks
9. [Figure: grid of attack types. Two categories: attacker groups, tactics, techniques or procedures known by the Bank; unknown attacks. Horizontal axis: high volume untargeted attacks to low volume targeted bespoke attacks. Vertical axis: low sophistication to high sophistication.]
10. [Figure: grid of attack types with the detection strategy overlaid. Two categories: attacker groups, tactics, techniques or procedures known by the Bank; unknown attacks. Horizontal axis: high volume untargeted attacks to low volume targeted bespoke attacks. Vertical axis: low sophistication to high sophistication. Detection approaches shown: log analysis; large scale data mining (attack discovery); threat intelligence matching; preventative controls.]
11. 1.0 Alert driven: Security tools detect your attacks
2.0 Discovery driven: Data and tools to discover unknown attacks
12. SOC 2.0 - Attack discovery
Enabling your analysts to discover unknown attacks through data mining
Data: Think beyond traditional security logs
Continual improvement: As attackers change techniques, so will you
Attacker knowledge: Consider how your analysts will learn about the latest attacker tactics, techniques and procedures
13. 2.0 Operating model
Threat intelligence (25% of analysts): Understanding our attackers’ tactics, techniques and procedures
Data analytics (50% of analysts): Continually developing new Splunk searches looking for latest attacker behaviour
Incident response (25% of analysts): Responding to suspicious behaviours
75% of SOC analysts focused on improving detection
14. 1.0 Alert driven: Security tools detect your attacks
2.0 Discovery driven: Data and tools to discover unknown attacks
3.0 Automation driven: Rise of the robots
15. SOC 3.0
Our automation aims
Automated triage and IR: Can we free up expensive human analysts?
Reducing silos: Can we bring together all SOC process and encourage silo reduction?
Infrastructure as code: Can we bring infrastructure as code principles to the SOC?
16. Threat intel: Why do you want to detect this attack?
Splunk search: How are you going to detect the attack?
Test criteria: How do you prove your detection works?
Triage actions: How will you triage results?
Response actions: How will you respond to the attack?
19. 4.0 Prediction
• Many years' worth of incident and attack data
• Can we identify (predict) the precursors of an attack?
• Intervene before the attack can occur using automation?
The future: Identifying precursors to an attack and proactively intervening
20. Key takeaways
1. Operating model first: Your operating model is just as important as your technology.
2. Don’t be driven by your technology: Only you can know your adversaries, your environment and your business. Your security vendors cannot. Invest in your people.
3. Continual improvement: If you're not constantly developing new ways of detecting attacks, your monitoring is getting worse every day.
4. Build a proportionate SOC: Every organisation’s threat profile is different. Build a Security Operations Centre proportionate to your threats.
Editor's Notes
Good morning everyone,
Thank you for attending our presentation, we hope you find it useful.
Before we start, just so we can get a feel for the room, please can you put your hand up if you work in or have an interest in cyber security?
JP: I’m Jonathan Pagett and I head up the cyber defence centre at the Bank of England.
Carly: Data Analytics Lead in the Cyber Defence Centre, responsible for our detection use cases. I spend most of my day in Splunk writing search code in order to achieve this.
JP:
I imagine most people will know who the Bank of England is, but just in case there are some international visitors in the audience, we are the UK’s central bank.
Of the normal services you would expect a central bank to provide, from a cyber security perspective our most important assets are our payment systems, in particular the UK’s real time gross settlement system which acts as the backbone of the UK financial system.
This is in addition to a large amount of sensitive information such as UK financial policy and information that other banks provide us as part of our regulatory requirements.
JP:
Everybody likes numbers, so I wanted to share some to describe our organisation.
We are an organisation of just over 4000 people and our cyber security division is made up of around 70 people. This includes 12 within the cyber defence centre.
In terms of the number of payments we process, it’s around a third of UK GDP every day.
JP:
Moving to the cyber defence centre, our mission is to detect and respond to cyber attacks against the Bank of England. It’s quite a simple mission statement, but those of you who work in security operations will know that this is far from easy to achieve.
We wanted to share with you some of the challenges, and our journey over the last 5 years in how we have evolved to meet them.
Carly:
We started our journey back in 2014. BoE had spent money on a wide variety of commercial security tools. The strategy was driven mainly by vendors.
At this time, the first iteration of the BoE SOC came about. We like to refer to this as SOC 1.0. We were a purely reactive SOC, responding to alerts from the commercial tools.
Our role, as analysts, was to verify that the tools had alerted correctly and to ensure that the measures taken by the tools were as expected. For example: anti-virus had correctly identified a malicious web redirect and blocked the traffic.
This verification would be documented, but we typically wouldn’t feed anything back into our processes. This was in part because we were using commercial tools, typically black boxes where analysts are not privy to the alert logic and cannot modify or amend it.
Furthermore, indicators of compromise we received from forums or vendors were manually checked by analysts.
Carly:
So, hopefully you can see where this is going… we started to identify some problems with this operating model.
Putting expensive analysts to work just to verify that something like an AV detection was correct did not seem like a good use of analyst time.
And again, analysts weren’t able to improve detections, because we were using black box solutions where we couldn’t view or modify the alert logic.
This gives us two problems:
Alert fatigue: as analysts we’d often respond to the same alerts on a daily basis, with no opportunity to improve detection.
Vendor reliance: we were dependent on vendors having adequate rulesets to detect attacks against us. Considering our quite unique threat profile as a central bank, it seemed unlikely we could rely completely on a broad and general set of signatures from vendors.
Granted, we did detect some things, but the big question regularly asked of senior tech staff was “What keeps you up at night?”, and the response wasn’t “Does everything have AV on it?” It was:
How could we detect the more targeted attacks - the unknown threat?
JP:
Faced with the need to evolve, we took a moment to pause and redesign our detection strategy.
Ignoring all the technology we had, what would our detection strategy be if we designed from scratch?
Do we have a coherent detection strategy covering all the different types of threat actors that might target the Bank?
When I ask other security operations centres what their detection strategy is, they always say “we have got technology X or Y” rather than describe an overarching strategy, so we decided to write one.
JP:
At its most simple level, we consider attackers to belong in two categories – attackers we know about and those we don’t.
When I say attackers we know about, that might be because we have in-house knowledge of how they conduct their attacks, or it might be knowledge of attacks that is baked into the security tools we have bought.
A simple example is our network intrusion prevention systems, which get updates from our vendor, leveraging the knowledge of their security teams.
JP:
We then break these two categories out along two axes.
Firstly, how sophisticated the attacker’s operation is. I’m talking about more than just how sophisticated the malware is, if malware is used. I’m talking about the attacker’s operation as a whole.
Secondly, how targeted the attack is. Is the attacker after a wide range of organisations, and fairly indiscriminate? Or are they after just one organisation? This roughly translates into the volume of attacks we can expect to see across different organisations.
We do this so we can start mapping different attacks on this grid. For example, the Bank of Bangladesh attack, where they tried to steal 1 Billion dollars, and ultimately managed to steal 81 million, would be in the top right corner. It was a sophisticated operation and extremely targeted at just one organisation.
Whereas, on the other hand, something like a widespread ransomware attack might be delivered to a range of organisations.
JP:
The reason we do this is to overlay our detection strategy on top.
Carly: At the core of this is our operating model. This brings together the vital operational functions of SOC 2.0.
1. The process of data driven detection begins with Threat Intelligence. This could come from intel vendors, open source, home-grown incident data, pentest reporting or discussions between different teams in the Bank. The purpose of intel gathering is to pull out the tactics, techniques and procedures (aka TTPs). We wanted to understand the typical trends of our attackers: who they normally target and how.
2. This information feeds into the Data Analytics section of the team, which is my part of the team. Analysts take TTPs and generalise them to behaviours, then code these up as Splunk searches.
For example....
- Looking at the entropy of DNS queries to detect data exfiltration
- Baselining typical NetFlow from groups of devices to try to identify lateral movement
- Clustering PowerShell commands to identify outliers that could indicate a malware infection
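To make the first of these behaviours concrete, here is a minimal Python sketch of the entropy idea. It is an illustration only, not one of our Splunk searches: the function names, the 3.5 bits-per-character threshold and the 16-character minimum length are all assumptions picked for the example, and real values would need tuning against your own DNS traffic.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def suspicious_queries(queries, threshold=3.5, min_length=16):
    """Return queries whose leftmost label is both long and high-entropy.

    `threshold` and `min_length` are illustrative values, not tuned ones.
    """
    flagged = []
    for query in queries:
        # Data smuggled out over DNS is typically encoded in the leftmost label.
        label = query.split(".")[0].lower()
        if len(label) >= min_length and shannon_entropy(label) > threshold:
            flagged.append(query)
    return flagged
```

Short, human-readable labels such as `www` fall below the length cut-off, while long encoded-looking labels score high on entropy and are returned for an analyst to review.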
These scheduled searches then generate events. Using a variety of methods, including risk scoring, these events trigger alerts that are triaged by the incident responders on duty.
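One simple way to picture risk scoring is shown below. This is a sketch under stated assumptions rather than a description of our implementation: the `(entity, search_name, score)` tuple layout, the function name and the threshold of 50 are all hypothetical.

```python
from collections import defaultdict


def entities_to_alert(events, threshold=50):
    """Sum risk scores per entity and return those at or above `threshold`.

    `events` is an iterable of (entity, search_name, score) tuples.
    The tuple layout and the threshold are hypothetical.
    """
    totals = defaultdict(int)
    for entity, _search_name, score in events:
        totals[entity] += score
    return {entity: total for entity, total in totals.items() if total >= threshold}
```

The point of the design is that no single search has to be conclusive: several weak signals against the same host or user add up to one alert worth a responder's time.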
3. As with all SOCs, incident response is at the heart of our purpose. The key improvement we made during the SOC 2.0 transition was to ensure incident data is fed back into the cycle via the threat intelligence function. By doing this we can learn from where attackers slipped through our security undetected, model specific threats to BoE, and capture their general behaviours next time, even if they switch up infrastructure or the specific fingerprints of their attack.
JP:
Following on from a few years of successfully running our SOC 2.0 model, our third generation is centred around automation.
JP:
Our automation aims centre around 3 outcomes:
Like everyone else we are keen to automate as much of our triage and incident response actions as possible, freeing up our expensive analysts to work on what people do best, the more creative tasks within the CDC.
We are also keen on taking infrastructure as code principles and applying them to a SOC, which I will come back to in a minute.
And lastly we want to help reduce some of the silos that have developed within the SOC, encouraging our analysts to think of the full end to end process rather than just the piece of the puzzle they are responsible for.
JP:
Within the SOC we have 5 main components, ranging from Splunk searches through to response actions.
These are all configured in different tools and generally considered in isolation, but this hinders our automation aims.
JP:
We have introduced the concept of a defence template: a single piece of code, similar in concept to an AWS CloudFormation template, that contains all the components that describe the end to end processes of the SOC.
As all components are defined in a single document, it requires our analysts to think of every component together rather than in isolation.
It also allows us to then automatically deploy the template against our SOC infrastructure.
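To give a feel for what a single document holding every component might look like, here is a rough Python sketch. The class, field and method names are hypothetical and only mirror the five components named on the slides (threat intel, Splunk search, test criteria, triage actions, response actions); the real template format is not shown in this talk.

```python
from dataclasses import dataclass, field


@dataclass
class DefenceTemplate:
    """One document describing a detection end to end (hypothetical schema)."""

    name: str
    threat_intel: str    # why we want to detect this attack
    splunk_search: str   # how we are going to detect it
    test_criteria: list[str] = field(default_factory=list)     # how we prove it works
    triage_actions: list[str] = field(default_factory=list)    # how we triage results
    response_actions: list[str] = field(default_factory=list)  # how we respond

    def missing_components(self) -> list[str]:
        """Return the names of components that have not been filled in yet."""
        components = {
            "threat_intel": self.threat_intel,
            "splunk_search": self.splunk_search,
            "test_criteria": self.test_criteria,
            "triage_actions": self.triage_actions,
            "response_actions": self.response_actions,
        }
        return [name for name, value in components.items() if not value]
```

A deployment step could refuse any template where `missing_components()` is non-empty, which is one way of making analysts think about every component together.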
JP:
Moving on from the automated deployment of defence templates, we also look at which templates are suitable for automated triage and response actions.
As part of defining our templates, we appreciate that not all triage and response actions can be fully automated.
To this end we have introduced 4 types of defence templates each with varying levels of automation.
Most new defence templates start life as a type 1 – where we still get humans to perform the triage and response actions. As we gain more confidence in the template, we then move it through the levels to a type 4, where the template is fully automated.
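The progression through the levels could be modelled as below. Only the two ends come from this talk: type 1 has humans doing triage and response, and type 4 is fully automated. Types 2 and 3 are left as unnamed intermediate steps, and the `AutomationType` and `promote` names are hypothetical.

```python
from enum import IntEnum


class AutomationType(IntEnum):
    TYPE_1 = 1  # humans perform the triage and response actions (from the talk)
    TYPE_2 = 2  # intermediate level, not described in the talk
    TYPE_3 = 3  # intermediate level, not described in the talk
    TYPE_4 = 4  # triage and response fully automated (from the talk)


def promote(current: AutomationType) -> AutomationType:
    """Move a template one level towards full automation, capped at type 4."""
    return AutomationType(min(current + 1, AutomationType.TYPE_4))
```

Promoting one level at a time reflects the idea of moving a template along only as confidence in it grows.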
Carly:
Moving into the more forward thinking space: as we've gone through the different evolutions of the SOC, we've gathered up a vast amount of contextual information. For example, incident data (consisting of true positive, real attack data and a whole lot of false positives). We've also grown a collection of TTPs and risk scores for assets, and we're working towards applying this to high value users.
So how can we use all this extra data, together with our raw telemetry, to our benefit? By trying to prevent and intervene in attacks before they can affect us.
Similar to the way police are able to use crime data and statistics to identify crime hotspots and predict when an area is at higher risk of a crime occurring, we want to apply a similar methodology to cyber security. This, tied in with the automation of SOC 3.0, could allow us to identify early warning triggers or flashpoints and predefine proactive measures to shut down a risky situation before anything could happen. For example:
- Identifying a user that has exhibited a specific pattern of risky behaviours that we've established from a previous attack, and applying isolation, enhanced monitoring etc. before anything bad can happen
- Identifying a newly registered domain that could pose a risk to the Bank via phishing or a watering hole attack, and blocking it before anything bad can occur.
Think: Minority Report for cyber attacks.
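As a rough sketch of the newly registered domain idea, the Python below flags domains that are both recently registered and close in spelling to a domain we care about. Everything here is an assumption for illustration: the registration data is a made-up mapping, the function name is hypothetical, and the 30 day and 0.8 similarity cut-offs are untuned. String similarity via `difflib` is just one simple choice, not the method used in the SOC.

```python
from datetime import date
from difflib import SequenceMatcher


def risky_new_domains(registrations, protected, today,
                      max_age_days=30, min_similarity=0.8):
    """Return recently registered domains that closely resemble `protected`.

    `registrations` maps a domain name to its registration date.
    The cut-off values are illustrative only.
    """
    flagged = []
    for domain, registered_on in registrations.items():
        if domain == protected:
            continue
        age_days = (today - registered_on).days
        similarity = SequenceMatcher(None, domain, protected).ratio()
        if 0 <= age_days <= max_age_days and similarity >= min_similarity:
            flagged.append(domain)
    return flagged
```

The output would be a candidate list for blocking or closer monitoring, which is where the automation described for SOC 3.0 could take over.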