One of Canada’s largest telecommunications companies is using Elastic to drive improved security analysis in their SOC. With a need to ingest all security logs, build threat detection models, and normalize many new types of logs, the Bell security team turned to Elastic. Learn how they’ve streamlined alerts, deepened log analysis, and addressed challenges unique to being an ISP.
Building One Piece at a Time
1. Logging
2. Data engineering
3. Log storage and long-term retention
4. Visualization and alerting
End to end solution
Where Our Data Comes From
• Bare metal servers
• Virtual machines
• Containers
Requirements For Our Log Shippers
• A simple way to ship logs
• Something that can buffer logs in case of an outage
• Something lightweight that still lets us perform light filtering at the source
• Something uniform throughout our fleet
• Automated deployment capability
Filebeat and Winlogbeat
• Generic beats configuration per service logged
• Simple installation and configuration
• Minimal impact on systems
• No loss of data in case of network outage
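A minimal Filebeat sketch along these lines could look as follows; the paths, service name, and Logstash endpoint are hypothetical placeholders, not our actual configuration:

```
# Hypothetical Filebeat configuration: lightweight shipping, light
# filtering at the source, and a uniform layout across the fleet.
filebeat.inputs:
  - type: log
    paths:
      - /var/log/myservice/*.log   # placeholder path
    fields:
      service: myservice           # placeholder service tag

# Light filtering at the source: drop noisy debug lines before shipping.
processors:
  - drop_event:
      when:
        regexp:
          message: "DEBUG"

# If the endpoint is unreachable, Filebeat keeps its place in each file
# and retries, so data is not lost during a network outage.
output.logstash:
  hosts: ["logstash.example.internal:5044"]
```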
Being an ISP
• Large quantity and variety of network devices
• Unique ISP applications
• Logs also come from security devices
• Network devices can be very chatty
Different data sources to consider that other businesses don’t
What If Beats Can’t Handle Special Cases?
• Most of the devices send logs only via syslog
• Losing data is not an option
• Need to receive data from geographically diverse locations
Rsyslog
• Adding Rsyslog servers close to data sources
• Acts as a buffer
• Basic parsing and JSON serialization of logs within Rsyslog
• Logs are sent to our security data center over TCP to minimize the risk of data loss
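A rough sketch of such a relay follows; exact rsyslog directives vary by version, and the target host, template fields, and queue name here are illustrative placeholders:

```
# Hypothetical rsyslog snippet: serialize incoming logs as JSON and relay
# them over TCP, with a disk-assisted queue acting as the buffer.
template(name="jsonfmt" type="list" option.jsonf="on") {
  property(outname="timestamp" name="timereported" dateFormat="rfc3339" format="jsonf")
  property(outname="host" name="hostname" format="jsonf")
  property(outname="severity" name="syslogseverity-text" format="jsonf")
  property(outname="message" name="msg" format="jsonf")
}

action(type="omfwd" target="logs.secdc.example.internal" port="514" protocol="tcp"
       template="jsonfmt"
       queue.type="LinkedList" queue.filename="relay_buffer"  # spill to disk if the link is down
       action.resumeRetryCount="-1")                          # retry forever, never drop
```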
Incoming Logs
• All logs are serialized in JSON
• Ability to sustain large spikes of traffic without over-provisioning
• Buffering data allows for higher availability
• Data accessible to multiple consumers
Our past experiences and requirements
Kafka as Our Message Queue
• Kafka allows us to handle spikes of logs
• Provides data buffering for potential downstream issues
• Provides controls to share data securely with other teams using open formats
• Kafka supports JSON out of the box
• Rsyslog and Beats can write to Kafka
Our past experiences and requirements
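As a sketch of the Beats side of this, the output section can point at Kafka instead of Logstash; the broker addresses and topic naming scheme below are placeholders:

```
# Hypothetical Beats output section writing JSON events to Kafka.
output.kafka:
  hosts: ["kafka1.example.internal:9092", "kafka2.example.internal:9092"]
  topic: "logs-%{[fields.service]}"   # one topic per logged service
  codec.json:
    pretty: false                     # events stay serialized as compact JSON
```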
Parsing and Normalizing
• Use resources efficiently by taking advantage of auto-scaling
• Every technology requires its own set of configurations for parsing and normalization
• Needs CI/CD integration for ease of testing and deployment
Our past experiences and requirements
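A per-technology pipeline might be sketched like this; the topic, grok pattern, and field names are illustrative only, not our production configuration:

```
# Hypothetical Logstash pipeline: read JSON events from Kafka, parse the
# embedded message for one technology, and normalize field names.
input {
  kafka {
    bootstrap_servers => "kafka1.example.internal:9092"
    topics => ["logs-firewall"]
    codec => "json"
  }
}

filter {
  grok {
    match => { "message" => "%{WORD:action} src=%{IP:source_ip} dst=%{IP:destination_ip}" }
  }
  mutate {
    rename => { "host" => "observer_hostname" }  # map onto a common schema
  }
}

output {
  elasticsearch {
    hosts => ["https://es.example.internal:9200"]
    index => "logs-firewall-%{+YYYY.MM.dd}"
  }
}
```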
Logstash on OpenShift
• We decided to run all our Logstash instances on OpenShift
• Containers consume fewer resources than multiple virtual machines
• We get auto-scaling through OpenShift
• We can scale quickly by adding more nodes to our OpenShift cluster if needed
Logstash containers
Logstash on OpenShift
• Centralize configurations in GitLab
• GitLab allows us to create CI pipelines quickly
• Run Logstash configurations through RSpec for testing
• Review and deploy to production on merge requests
• OpenShift provides the ability to build CD pipelines
Logstash CI/CD
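A pipeline of this shape could be sketched in `.gitlab-ci.yml` as follows; the deploy script name is a hypothetical placeholder:

```
# Hypothetical GitLab CI stages for Logstash configurations:
# every merge request is tested before it can be deployed.
stages:
  - test
  - deploy

test_configs:
  stage: test
  script:
    - logstash --config.test_and_exit -f pipelines/   # syntax check
    - rspec spec/                                     # behavioral tests

deploy_configs:
  stage: deploy
  script:
    - ./deploy_to_openshift.sh    # placeholder deployment step
  only:
    - master
```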
Log Storage
• Most of the searching is done on the same day the logs are ingested
• Documents need to be easily searchable for the previous 90 days
• Horizontal scalability
• Highly available and redundant data
Our past experiences and requirements
Log Storage
• No real surprise: we store our logs in Elasticsearch
• Implementing a hot-warm architecture best meets our requirements
• Our process allows for automated deployment of new nodes
• Elasticsearch provides the required HA and redundancy
Elasticsearch
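A hot-warm layout of this kind is commonly built on shard allocation attributes; the attribute and index names below are illustrative:

```
# Hypothetical hot-warm setup via shard allocation filtering.
# On hot nodes (elasticsearch.yml):
node.attr.box_type: hot
# On warm nodes:
node.attr.box_type: warm

# New daily indices are written to hot nodes:
PUT logs-firewall-2019.06.01
{
  "settings": {
    "index.routing.allocation.require.box_type": "hot"
  }
}

# After a few days, relocate the index to warm nodes:
PUT logs-firewall-2019.06.01/_settings
{
  "index.routing.allocation.require.box_type": "warm"
}
```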
Long-Term Data Retention
• For forensic and legal reasons, data needs to be stored for a minimum of 12 months
• Needs to be stored outside of the Elasticsearch cluster
• Fast retrieval of data back into the existing Elasticsearch cluster
• Minimize cost for long-term storage solution
Our past experiences and requirements
Long-Term Data Retention
• OpenStack Swift allows us to store our index snapshots in object storage
• Reuse of Elasticsearch's S3 snapshot repository plugin
• Acceptable retrieval times
• Use of Curator to automate snapshots
S3 object storage
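A Curator action file automating this could be sketched as follows; the repository name is a hypothetical placeholder for a snapshot repository already registered against the S3-compatible Swift endpoint:

```
# Hypothetical Curator action file: snapshot indices into object storage.
actions:
  1:
    action: snapshot
    options:
      repository: swift_s3_repo       # placeholder repository name
      name: "snap-%Y%m%d"
      wait_for_completion: True
    filters:
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 1
```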
Securing Data
• Control over who has access to the data
• Ease of RBAC management
• Add a layer of encryption for data in transit
• Use of existing and tested solutions
Our past experiences and requirements
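With X-Pack (mentioned later in the deck), transport encryption can be sketched roughly like this in `elasticsearch.yml`; the certificate paths are placeholders and exact setting names depend on the Elasticsearch version:

```
# Hypothetical elasticsearch.yml excerpt: X-Pack security with TLS on
# the transport layer between nodes.
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: certs/node.p12
xpack.security.transport.ssl.truststore.path: certs/node.p12
```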
Handling and Visualizing Our Data
• Easy front end to query logs
• Reusable queries
• Ability to meaningfully visualize data
• Front-end that’s used by a wide range of security specialists
‒ Analysts
‒ Threat hunters
‒ Data scientists
Our past experiences and requirements
Alerting on Security Events
• Need to filter on meaningful security events
• Ease of building and deploying detection rules
• Automate deployment
• Easily track life cycle of rules
Our past experiences and requirements
Alerting on Security Events
• Simple way of writing queries
• YAML text files plus version control tools solve maintainability issues
• Automated deployment through CI/CD tools tied to version control
Elastalert
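An Elastalert rule kept in version control might be sketched like this; the index pattern, field names, and recipient address are illustrative placeholders:

```
# Hypothetical Elastalert rule: alert when a single source IP fails to
# authenticate 10 times within 5 minutes.
name: ssh-bruteforce-detection
type: frequency
index: logs-auth-*
num_events: 10
timeframe:
  minutes: 5
filter:
  - term:
      event_outcome: "failure"        # placeholder normalized field
query_key: source_ip                  # count failures per source address
alert:
  - email
email:
  - "soc@example.internal"
```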
Smart Detection
• Data must be easily accessible
• Develop custom machine learning models
• Automated deployment of machine learning models
• Flexibility in using different algorithms
Our past experiences and requirements
Smart Detection
In-house machine learning
• Models developed with open source, ML-centric libraries
• Deployment pipeline from data scientists to production
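Purely as an illustration (not Bell's actual models), a toy detection over per-host event counts could score outliers with a robust statistic:

```python
import statistics

def anomalous_hosts(counts, threshold=3.5):
    """Flag hosts whose event count deviates strongly from the fleet.

    Uses the modified z-score (based on the median absolute deviation),
    which stays robust in the presence of the very outliers we hunt.
    `counts` maps host -> events per interval. A toy sketch only; real
    models use richer features and proper ML libraries.
    """
    values = list(counts.values())
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []          # no spread in the fleet, nothing stands out
    return sorted(host for host, n in counts.items()
                  if 0.6745 * abs(n - median) / mad > threshold)

# Example: one host is far noisier than the rest of the fleet.
counts = {"web1": 100, "web2": 110, "web3": 95, "db1": 105, "compromised": 900}
print(anomalous_hosts(counts))  # ['compromised']
```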
Security Event Correlation
• Ability to correlate security events
• Ability to write complex rules
• Simple front end to help our analysts
• Central point for alerting
Our past experiences and requirements
Security Event Correlation
• Provides one of the best correlation engines for security events
• Allows for aggregation, correlation, trending, and more
• ESM provides a GUI and is a well-known product throughout Bell's security teams
• Can receive and send data to multiple sources
ArcSight
Today’s Situation With Elastic
• Elastic allows for horizontal scaling to support a constant increase in log volume
• Elastic allows for simple integration with open security protocols
• Elastic's X-Pack solution provides a built-in secure data environment
• The new architecture using Elastic allows us to build more detection mechanisms using different techniques
Where we are today