This document discusses illuminating dark data from critical infrastructure systems. It defines critical infrastructure as facilities like dams, power plants, factories, and transportation. These systems often rely on industrial control systems that were historically isolated, hindering access to operational data. The document proposes using tools like Telegraf and OPC UA to securely collect data from these systems and export it to databases like InfluxDB for analysis. This enables using the data for key performance indicators, benchmarking, and informing engineering decisions through dashboards and notebooks. The goal is to modernize access to critical infrastructure data while maintaining security.
David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark Data of Critical Infrastructure | InfluxDays EMEA 2021
1. Illuminating the Dark Data of Critical Infrastructure
Deeply technical,
single-track free virtual
InfluxData event
3. What is Critical Infrastructure?
• Dams
• Power plants
• Factories
• Manufacturing
• Transportation
• Water facilities
• Chemical plants
• Nuclear facilities
• Food and Agriculture
• Healthcare
In the United States: the Cybersecurity and Infrastructure Security Agency
(CISA), under the Department of Homeland Security
9. Dark Data
• Immense volumes of
crucially valuable data
were literally locked away
with the control system
• The security methodology
ultimately hindered access
to insight
10. So…
How do we get our data out of our
protected networks efficiently yet
safely and securely?
Once we have the data, how do we
make informed decisions with it?
12. At Rose-Hulman
• We use Telegraf to
collect data from our
control systems
• We then send this
data out to InfluxDB
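As a rough illustration of what travels on that last hop: Telegraf's InfluxDB output writes points in InfluxDB line protocol (measurement, tags, fields, timestamp). The sketch below formats one reading that way. The measurement, tag, and field names are hypothetical, and it assumes float fields and names without spaces, commas, or equals signs, so no escaping is done.

```python
from typing import Dict


def to_line_protocol(
    measurement: str,
    tags: Dict[str, str],
    fields: Dict[str, float],
    timestamp_ns: int,
) -> str:
    """Format one point as InfluxDB line protocol (no escaping; see assumptions)."""
    # Tags are sorted by key, which InfluxDB recommends for write performance.
    tag_part = "".join(f",{key}={value}" for key, value in sorted(tags.items()))
    field_part = ",".join(f"{key}={value}" for key, value in fields.items())
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


# Hypothetical reading from a lab reactor
line = to_line_protocol(
    "reactor",
    {"unit": "R101", "area": "lab"},
    {"temperature": 351.2},
    1620000000000000000,
)
print(line)
```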
15. Levels 0/1
• Devices here speak any number of industrial protocols
– Modbus/TCP
– EtherNet/IP
– CAN bus
– Profibus/Profinet
– …
Telegraf can handle some of these!
16. But should we?
• Telegraf is capable of speaking many of these protocols, but:
• Level 0/1 devices should be left to their dedicated tasks as
much as possible
• These devices are not updated frequently
• Also, many of these protocols have NO security built into them
• The Telegraf agent would be buried deep in the lower network levels
• It would need to manage connections to many devices
18. What is OPC UA?
• Many modern control systems are implementing OPC UA
connectivity
– OPC originally stood for OLE for Process Control (initially MS Windows only)
• DA, HA, AE, etc.
– OPC now stands for Open Platform Communications
• https://opcfoundation.org
– UA = unified architecture
• Aggregates all the disparate sources (and protocols) into one
• Built with modern internet connectivity and security in mind
21. ANOTHER OPTION: Factry.io’s Node Implementation
• node-opcua-logger
• Standalone program written in Node.js
• Contains industry standard techniques for handling data:
– Periodic scans or subscriptions
– “Data compression” methods
• https://github.com/coussej/node-opcua-logger
23. Henthorn Lab Telegraf OPC UA plugin
• In production for ~10 months now
• Uses the GOPCUA library for communication
• Industry standard data compression techniques
• Heartbeat techniques
• Available on our GitHub (github.com/henthornlab)
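The slide does not say which compression and heartbeat techniques the plugin uses. As an illustration only, a common historian approach is a deadband filter (drop a sample unless it moved more than a threshold from the last stored value) combined with a heartbeat (store a sample anyway once a maximum interval has passed). The function and parameter names below are hypothetical and are not taken from the plugin.

```python
from typing import Iterable, List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, value)


def deadband_with_heartbeat(
    samples: Iterable[Sample],
    deadband: float,
    heartbeat_s: float,
) -> List[Sample]:
    """Keep a sample if it moved more than `deadband` from the last kept value,
    or if at least `heartbeat_s` seconds passed since the last kept sample."""
    kept: List[Sample] = []
    for ts, value in samples:
        if not kept:
            kept.append((ts, value))  # always keep the first sample
            continue
        last_ts, last_value = kept[-1]
        moved = abs(value - last_value) > deadband
        stale = (ts - last_ts) >= heartbeat_s
        if moved or stale:
            kept.append((ts, value))
    return kept


# Hypothetical temperature readings: mostly flat, one step change, then quiet
readings = [(0, 20.0), (1, 20.1), (2, 20.05), (3, 21.0), (4, 21.02), (13, 21.01)]
compressed = deadband_with_heartbeat(readings, deadband=0.5, heartbeat_s=10)
print(compressed)
```

The heartbeat sample lets a consumer tell "value unchanged" apart from "collector stopped reporting".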
28. Two major use cases of data from critical infrastructure
• What are the key
performance indicators
right now?
• What were the key
performance indicators
over some time range?
29. What are the values right now?
• Clients could query the OPC UA servers directly
– Security and network traffic concerns here
– Mission critical connections only
• Clients can query the Historian for the latest values
– Depending on location, there will be some latency to this
– For dashboards and some webapps, not a big deal
– But what about some other use cases?
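For the historian route above, a "latest values" request might be sketched as follows. This assumes the historian is InfluxDB 2.x queried with Flux. The bucket and measurement names are hypothetical, and the snippet only builds the query text; it does not contact a server.

```python
def latest_values_query(bucket: str, measurement: str, lookback: str = "5m") -> str:
    """Build a Flux query that returns the most recent point of each series."""
    return (
        f'from(bucket: "{bucket}")\n'
        f"  |> range(start: -{lookback})\n"
        f'  |> filter(fn: (r) => r._measurement == "{measurement}")\n'
        f"  |> last()"
    )


# Hypothetical bucket and measurement names
query = latest_values_query("plant", "opcua")
print(query)
```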
31. Telegraf exposing Prometheus-style Metrics
• Telegraf’s Prometheus output plugin allows metrics to be
exposed via an HTTP/HTTPS endpoint
– http://my-telegraf-host.domain/metrics
• Lightweight and low latency for on-premises clients due to
position in DMZ
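An on-premises client can read that endpoint with very little code, since the Prometheus text exposition format is line-oriented. The sketch below parses a made-up payload. The metric and label names are hypothetical, and the parser is deliberately minimal: it skips comment lines and ignores optional timestamps.

```python
import re
from typing import Dict, List, Tuple

Metric = Tuple[str, Dict[str, str], float]  # (name, labels, value)

# name{labels} value [timestamp]
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> List[Metric]:
    """Parse sample lines from Prometheus text format, skipping comments."""
    metrics: List[Metric] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a HELP/TYPE comment
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a sample line this sketch understands
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        metrics.append((name, labels, float(value)))
    return metrics


# Made-up payload in the shape a /metrics endpoint returns
SAMPLE = """\
# TYPE reactor_temperature untyped
reactor_temperature{host="dmz-telegraf",unit="R101"} 351.2
pump_speed{host="dmz-telegraf",unit="P7"} 1480
"""
parsed = parse_metrics(SAMPLE)
print(parsed)
```

In practice the text would come from the /metrics URL, for example via urllib.request.urlopen, rather than from a string constant.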
34. Harden access to /metrics
• Can easily pass that endpoint to a local or neighboring
Apache or NGINX web server
• These can handle authentication, HTTPS, logs, load
balancing, etc.
• Leaves Telegraf to focus on its task
37. What do we do with this data?
• Benchmark previous performance so we can:
– Identify outliers in a currently running process
– Forecast future behavior
– Predict when maintenance is needed
– Make informed decisions on whether to upgrade or scale up
• Identify correlations and engineering trends
• Aggregate data from multiple and varied sources
– e.g. Anomalous electrical behavior vs. weather
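One simple way to benchmark past performance and flag outliers in a running process is a robust z-score built from the median and the median absolute deviation (MAD) of historical values. This is one technique among many, not necessarily the one used in the talk. The readings and the 3.5 threshold below are illustrative assumptions.

```python
from statistics import median
from typing import Sequence


def modified_z_score(history: Sequence[float], current: float) -> float:
    """Score `current` against historical values using median and MAD."""
    center = median(history)
    mad = median([abs(value - center) for value in history])
    if mad == 0:
        raise ValueError("history has no spread; MAD is zero")
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    return 0.6745 * (current - center) / mad


# Hypothetical historical readings from a stable process
history = [50.1, 49.8, 50.3, 50.0, 49.9, 50.2, 50.1]
THRESHOLD = 3.5  # a commonly quoted cutoff; tune it per process

for reading in (50.2, 57.5):
    score = modified_z_score(history, reading)
    status = "outlier" if abs(score) > THRESHOLD else "normal"
    print(f"{reading}: score={score:.2f} ({status})")
```

The median and MAD are used instead of the mean and standard deviation so that a few bad historical points do not distort the benchmark.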
38. Currently teaching a course on Process Analytics
• Course learning objectives center on the collection and
analysis of process data to make informed engineering
decisions
• Students typically have exposure to:
– MS Excel
– MATLAB
– R
– Python
39. Skillset Growth
• Students start with familiar tools
– Data into spreadsheets
– CSV files
• Timeseries data and databases
• Move to key performance indicators (KPIs) and dashboards
– InfluxDB and Grafana
• Bulk of time with interactive Python data notebooks
41. Onboarding Exercise for Dashboards and Time Series Data
• Loaded five years of historical popularity data for the
top 100 games on Steam into InfluxDB
• Students mined the data to find KPIs and then prepared a
dashboard with those KPIs
• Dataset helps them understand concepts like seasonality
• Quickly learn to identify outliers
44. Jupyter Notebooks
• Interactive notebooks that allow engineers to mock-up a data
science experiment in no time
• Many are cloud-based and run through the browser, so no
additional software needed
• Rich text support through Markdown
• Includes support for mathematical equations through MathJax
(subset of LaTeX)
• Now supports a multitude of kernels besides Python
• Easily shared and version controlled
46. • pandas dataframe filled
with historized data
• Visualization techniques
• Dimensionality reduction
• k-means clustering
• Principal Component
Analysis
• Time series forecasting
• Regression techniques
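As a small taste of the regression and forecasting items above, the sketch below fits a straight-line trend by ordinary least squares and extrapolates one step ahead. It uses only the standard library so it runs anywhere; in a notebook the same fit would more likely be done on a pandas dataframe with a numerical library. The data points are made up.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the least-squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up hourly readings that drift upward
hours = [0, 1, 2, 3]
values = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(hours, values)
forecast = slope * 4 + intercept  # extrapolate to hour 4
print(slope, intercept, forecast)
```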
47. Notebooks: Focus is on communications
• Clear and reproducible connections to data
• Processing techniques with lots of comments
• Crisp, informative visuals
48. Conclusions:
• We are working to create secure channels to bring data out of
critical infrastructure
• Once out, we want reproducible data and methods
• Data stack: Equipment → OPC UA → Telegraf → InfluxDB
• Methods: Grafana, Jupyter Notebooks