1. Manuel Offenberg of Seagate discussed securing data at the edge using RISC-V and Keystone enclaves to protect data during creation and movement.
2. OpenTitan can provide another layer of trust by securing the root of trust.
3. Endpoint security is crucial for ensuring overall data integrity and trustworthiness when significant data is being generated at billions of sensors and IoT devices.
3. Information Classification: General
Edge Data Challenges
1) Significant growth in data-driven autonomous decision-making
2) Billions of sensors, IoT devices, and endpoints to generate data for machine learning training and inference
3) Many of these endpoints have weak or no security, increasing the risk of unauthorized data manipulation
4. Chain of Custody for Data
Trusted Endpoint
• Identity attestation
• Firmware and run-time attestation
• Secure isolation of critical functionality
• Origin attestation with data fingerprinting
Assurance
• Immutable and verifiable object storage framework
• Fingerprints for content and metadata integrity and origin
Notarization
• Ledger to record content manifest identifiers
• Immutable relative ordering of events
Mobilization
• Devices and applications have cryptographic identities
• Only provisioned member devices within the domain
• Verifiable provenance of data objects based on crypto identity
5. Concept Use Case
[Diagram. Components: Endpoint (RISC-V with secure enclave(s) as Root of Trust), Edge Data Storage, Cloud (Provisioning, Notary, Storage), Mobile Device, MFA Device. Flows: data offload/storage and manifest validation; data manifest transfers; device identity provisioning.]
6. Trusted Endpoint
Building Blocks
• DJI Matrice 100
• HiFive Unleashed
• Keystone Enclave
• Yubico Yubikey
Endpoint Services
• Lightweight object storage
• Verified data transfers
• Device provisioning
• Data movement
• Secure data logging
7. Enclaves and Root of Trust
Keystone: Open-Source Enclave Framework for RISC-V
• Trusted run-time for applications
• Isolation of sensitive data & functionality
D. Lee et al., “Keystone: An Open Framework for Architecting Trusted Execution Environments”
https://doi.org/10.1145/3342195.3387532
8. Enclaves and Root of Trust
Keystone: Open-Source Enclave Framework for RISC-V
• Trusted run-time for applications
• Isolation of sensitive data & functionality
• Use Cases:
  • Device/endpoint attestation
  • Secure endpoint services, e.g., data fingerprinting, key management
9. Enclaves and Root of Trust
Keystone: Open-Source Enclave Framework for RISC-V
• Trusted run-time for applications
• Isolation of sensitive data & functionality
• Use Cases:
  • Device/endpoint attestation
  • Secure endpoint services, e.g., data fingerprinting, key management
Root of Trust
• Platform integrity
  • Self and system, e.g., Keystone SM
• Secrets storage and crypto operations
• Cryptographic identity
  • E.g., Trusted Computing Group’s DICE (Device Identifier Composition Engine)
10. OpenTitan is the first open source project building a transparent, high-quality reference design for silicon root of trust (RoT) chips.
[Figure: side-by-side comparison of a traditional RoT and OpenTitan, marking which layers are proprietary and which are open. Layers are grouped as Software, Silicon, and Integration: Firmware, Instruction Set Architecture, SoC Architecture, Digital IP (RTL), Foundry IP, Protocols, Physical Design Kit, Chip Fabrication, Chip Packaging, PCB Interface, PCB Design (Sch & Layout), APIs, RTL Verification, Analog IP.]
11. Root of Trust Prototype
Seagate evaluation platform for endpoint storage
• Trenz TE0841 - Xilinx Kintex UltraScale XCKU035
• USB 3.x host interface
Ported OpenTitan to TE0841
• Added peripheral proprietary IP blocks
• Added placeholder IP as needed
Firmware/software
• Secure boot and secure updates
• Device identity and attestation
• Advanced features, e.g., HSM
What’s next
• Maturation of OpenTitan
• Attestation protocol enhancements
• Integrated IP for custom SoCs
Hi everyone, I'm Manuel and I work at Seagate in the Research organization, and focus on Data Security.
I would like to discuss how a RISC-V based architecture helps to improve data trust at the Edge.
This is not a very technical talk, as I will try to explain how RISC-V enables better data protection,
specifically for data coming from endpoints and processed at the Edge.
A quick explainer about The Edge: it is a location, not a thing.
It is the outer boundary of the network - sometimes hundreds or even thousands of miles from the nearest enterprise or cloud data center.
The edge can be found in a wide range of locations, including:
• on the floors of manufacturing plants
• on the roofs of buildings
• near cell phone towers
• in barns on farms
• etc.
Hopefully, that helps to set the context for the remainder of this talk.
I think pretty much everyone can agree that Machine Learning and Artificial Intelligence will influence the design of next-generation compute infrastructures.
And that trend will drive a significant increase in Edge-based autonomous decision-making systems, which will increasingly impact our daily lives.
The combination of Edge and emerging threats requires us to rethink data security.
Not only are the threat models for Edge deployments different when compared to traditional data centers,
ML and AI also introduce a whole new set of vulnerabilities.
For example: we have seen a piece of tape on a stop sign throwing off self-driving cars.
Or - in another situation - the misclassification of objects by ML systems when inserting hidden information into an image.
These new threats are the result of limitations in the current ML technologies,
and are generally classified into two groups (taken from a 2019 Harvard University report).
At a high level, the attacks are categorized as
- data input attacks, and
- data poisoning attacks.
Data Input attacks exploit model weaknesses. Malicious actors change the inputs, fooling the ML system into making mistakes, typically undetected.
E.g., a 2014 paper by Goodfellow et al. showed that adding a small amount of noise to an image, invisible to the human eye,
makes an ML system misclassify that image.
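To make the input-attack idea concrete, here is a minimal sketch, assuming a toy linear classifier with made-up weights and a made-up four-feature input (neither comes from the paper or from this talk). It shifts every feature by a small epsilon in the direction that lowers the score, which is the sign-of-the-gradient idea behind this class of attack.

```python
def score(weights, bias, x):
    """Linear score; a positive value is read as class 'stop sign'."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def perturb(weights, x, epsilon):
    """Shift each feature by epsilon against the gradient of the score.

    For a linear model the gradient with respect to the input is simply
    the weight vector, so no autodiff is needed in this toy example.
    """
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical model and input, chosen only for illustration.
WEIGHTS = [0.5, -0.25, 0.75, -0.5]
BIAS = 0.0
x_clean = [0.5, 0.25, 0.5, 0.5]
x_adv = perturb(WEIGHTS, x_clean, epsilon=0.25)

if __name__ == "__main__":
    print("clean score:", score(WEIGHTS, BIAS, x_clean))
    print("adversarial score:", score(WEIGHTS, BIAS, x_adv))
```

With only four features the epsilon has to be fairly large to flip the decision. In a real image there are many thousands of pixels, so the per-pixel changes add up and a far smaller epsilon can have the same effect.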
On the other hand, data poisoning attacks corrupt the training process.
The model learns malicious behavior or even a backdoor, causing the system to malfunction, benefiting malicious actors.
A 2020 paper from Texas A&M shows how effective such an attack can be when training a trojan for deep neural networks.
Countering these new threats requires an increasing focus on data integrity & trustworthiness.
For that, we need a new generation of security infrastructures that always protect data.
RISC-V gives us the opportunity to rethink how to address these challenges.
Data integrity and trust are key in a world that uses autonomous systems.
All data flowing into these systems should come from sensors or endpoints that are known and trusted.
And when data are moved around, all systems touching those data must be secured as well.
If not, malicious actors may alter the data feeding these autonomous systems, causing them to malfunction.
What is needed is a Chain of custody architecture for data. It contains 4 elements.
1) Trusted Endpoints must have
- the ability to respond to secure identity challenges;
- the ability to respond to challenges attesting to the endpoint's firmware and run-time correctness; and
- the ability to enroll into one or more trust domains.
The 2nd component is Data Assurance:
- key data are managed as objects that are cryptographically protected for integrity and, optionally, for confidentiality
- then we extend the integrity protection to include provenance of data, i.e., we use the endpoint's crypto ID to sign data manifests for proof of origin.
This requires that the crypto ID is recognized as part of a trust domain - by provisioning it for that domain.
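As a rough illustration of the fingerprinting and manifest signing described above, here is a minimal sketch. The manifest layout, the function names, and the device ID are assumptions made for this example, not the PoC's actual format. HMAC-SHA256 from the standard library stands in for the signature; a real endpoint would sign with an asymmetric private key held inside the enclave so that verifiers need only the public key.

```python
import hashlib
import hmac
import json


def fingerprint(data: bytes) -> str:
    """SHA-256 fingerprint of a byte string, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(device_id: str, objects: dict) -> dict:
    """Build a manifest; `objects` maps name -> (content bytes, metadata dict)."""
    entries = []
    for name in sorted(objects):  # sorted for a stable, reproducible order
        content, metadata = objects[name]
        meta_bytes = json.dumps(metadata, sort_keys=True).encode()
        entries.append({
            "name": name,
            "content_sha256": fingerprint(content),
            "metadata_sha256": fingerprint(meta_bytes),
        })
    return {"device_id": device_id, "objects": entries}


def canonical(manifest: dict) -> bytes:
    """Serialize the manifest deterministically before signing."""
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()


def sign_manifest(manifest: dict, device_key: bytes) -> str:
    """Stand-in signature (HMAC), NOT the asymmetric scheme a real device needs."""
    return hmac.new(device_key, canonical(manifest), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    flight = {"frame-0001.jpg": (b"...pixels...", {"altitude_m": 42})}
    manifest = build_manifest("drone-1", flight)
    print(sign_manifest(manifest, b"hypothetical-device-key"))
```

Any change to an object's content or metadata changes its fingerprint, which changes the canonical manifest bytes and therefore invalidates the signature.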
The 3rd component is Data Notarization.
It is (optionally) used to establish data creation in an immutable way -- by recording the unique identifiers of specific data objects -- or data object sets.
This is typically done for non-repudiation reasons and may use ledger technologies.
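The property that matters here is an immutable relative ordering of recorded identifiers. A minimal sketch of that idea, assuming a simple in-memory hash chain (the `Ledger` class and its layout are hypothetical and much simpler than a real distributed ledger):

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


class Ledger:
    """Append-only list in which each entry commits to the one before it."""

    def __init__(self):
        # Each entry is (manifest_digest, prev_hash, entry_hash).
        self.entries = []

    def record(self, manifest_digest: str) -> int:
        """Append a manifest digest and return its position in the ledger."""
        prev_hash = self.entries[-1][2] if self.entries else GENESIS
        entry_hash = hashlib.sha256((prev_hash + manifest_digest).encode()).hexdigest()
        self.entries.append((manifest_digest, prev_hash, entry_hash))
        return len(self.entries) - 1

    def verify(self) -> bool:
        """Recompute the chain; any edit or reordering breaks a link."""
        prev_hash = GENESIS
        for digest, stored_prev, entry_hash in self.entries:
            expected = hashlib.sha256((prev_hash + digest).encode()).hexdigest()
            if stored_prev != prev_hash or entry_hash != expected:
                return False
            prev_hash = entry_hash
        return True


if __name__ == "__main__":
    ledger = Ledger()
    ledger.record("digest-of-manifest-A")
    ledger.record("digest-of-manifest-B")
    print(ledger.verify())
```

A single-process chain like this only detects tampering by whoever cannot rewrite the whole list; ledger technologies add replication and consensus so that no single party can do that.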
The 4th component is Data Mobilization. It is key in any Endpoint and Edge use case, whether data stays at the Edge or is moved from endpoint to Edge to cloud.
In all cases, it is imperative that all devices and applications that touch or transform data are known and trusted.
The main ingredient for this architecture is secure compute, to support sensitive services
such as identity and platform integrity attestation, data integrity and origin validation, and provisioning.
What does such a framework look like? To answer that question, we created a proof of concept, using a RISC-V platform.
We took a DJI drone and added - what we call - a secure data creation device, plus supporting applications.
This setup captures data acquired during drone flights.
The collected data are stored in a lightweight object store on the drone and cryptographically secured by a drone-mounted trusted element.
All applications within the trusted element execute on a RISC-V development board, secured by enclaves.
That allowed us to separate critical security functionality from the general-purpose Linux OS that is running on the board.
The board is then mounted to the drone and powered by the drone’s battery.
After drone flights, data are offloaded via the network interface of the board.
Besides the drone, the project also included other components.
A workstation as a stand-in for an Edge storage solution. It runs a data offload service.
That offload service implements multiple checks:
1) it verifies a manifest and its signature, 2) it verifies the data against the manifest, and 3) it records or checks data ownership with the notary service.
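The three checks can be sketched as follows. This is a minimal illustration under stated assumptions: a simplified manifest layout (object name mapped to its SHA-256 digest), an in-memory dictionary standing in for the notary service, and HMAC-SHA256 standing in for the manifest signature. None of these names or formats are taken from the PoC.

```python
import hashlib
import hmac
import json


def manifest_digest(manifest: dict) -> str:
    """Digest of the canonical manifest, used as its notary identifier."""
    canonical = json.dumps(manifest, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()


def check_offload(manifest, signature, objects, device_key, notary):
    """Run the three offload checks and return (ok, reason).

    `objects` maps object name -> bytes received from the endpoint.
    `notary` maps manifest digest -> device id (hypothetical stand-in).
    """
    canonical = json.dumps(manifest, sort_keys=True).encode()

    # 1) Verify the manifest signature (HMAC stand-in for a real signature).
    expected = hmac.new(device_key, canonical, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return False, "bad manifest signature"

    # 2) Verify every received object against its digest in the manifest.
    for name, digest in manifest["objects"].items():
        data = objects.get(name)
        if data is None or hashlib.sha256(data).hexdigest() != digest:
            return False, "object mismatch: " + name

    # 3) Record ownership on first sight, otherwise check it.
    owner = notary.setdefault(manifest_digest(manifest), manifest["device_id"])
    if owner != manifest["device_id"]:
        return False, "manifest notarized by another device"
    return True, "ok"
```

The order matters: the data is only compared against the manifest once the manifest itself is known to be authentic, and nothing is notarized until both earlier checks have passed.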
Once data are authenticated, verified, and notarized, they are moved to the cloud.
In our PoC, that backend deployment consists of multiple containerized services running in Azure, including:
- a blockchain service using Quorum -- with a smart contract to record the digest value of the data manifest
- an authority and provisioning service, used in combination with a secure hardware key.
- a NoSQL database to store notarized manifests, object descriptors and associated metadata
- and ... a blob store for cloud data storage
We made sure a trust relationship exists with any system element that touches the data.
E.g., the drone or edge server requires provisioning of its devices and services.
That's done using a pre-provisioned Multi-Factor Authentication device to authorize,
and then create and sign the certificates for all devices and services within the trust network.
The majority of the project was developed in Go, using gRPC for communications, with Solidity for a Quorum smart contract and C/C++ for enclave applications.
The Trusted Endpoint uses a HiFive Unleashed development board.
The board has a multi-core RISC-V chip, in this case a SiFive Freedom SoC, and contains
- 4x RV64 Cores with Virtual Memory Support
- 1x Management Core,
- 2MB L2 Cache,
- 8GB of DRAM,
- plus some peripherals.
The board runs a Linux OS, onto which we deployed a set of Keystone secure enclaves.
We also tested a DICE cryptographic identity to generate fingerprints for all drone data.
For the purposes of this proof-of-concept, the enclave’s cryptographic identity is used as a proxy for the drone’s identity.
As mentioned earlier, in order to trust the drone as an authorized vehicle for data offload -- its identity must be known and provisioned within a trusted domain.
For that we used a Yubico Multi-Factor Authentication key to generate X.509 certificates, including one for the drone.
We did the same for the other components in the PoC eco-system.
As mentioned, we used enclaves based on Keystone, an open source enclave framework developed at UC Berkeley’s ADEPT lab.
Simply put, the enclave model can be compared to a secure container in which applications run in isolation -- while assuming that the rest of the system is untrusted.
Part of the secure enclave functionality is the ability to authenticate and attest to the integrity of enclaves.
The attestation proves that the enclave has not been tampered with.
This then makes a statement about the integrity of the enclave itself plus the integrity of the code running within the enclave.
Keystone uses RISC-V Physical Memory Protection (PMP). This allows code running in Machine Mode, underneath e.g. Linux, to create protected physical memory regions.
In the case of Keystone, that Machine Mode code is a very small Trusted Computing Base called the security monitor.
The security monitor is key to the memory isolation model for the enclaves.
Each enclave has its own isolated physical memory, in which you can e.g. run a real-time OS in Supervisor Mode.
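For readers unfamiliar with PMP, the sketch below shows only the address arithmetic for one PMP region in NAPOT (naturally aligned power-of-two) mode, as defined in the RISC-V privileged specification. The base address and size are made-up values, and how Keystone's security monitor actually allocates and prioritizes PMP entries is not shown here.

```python
def napot_pmpaddr(base: int, size: int) -> int:
    """Encode a naturally aligned power-of-two region as a pmpaddr value.

    pmpaddr holds the address shifted right by 2; the number of trailing
    one bits encodes the region size (k trailing ones -> 2**(k+3) bytes).
    """
    if size < 8 or size & (size - 1):
        raise ValueError("size must be a power of two of at least 8 bytes")
    if base % size:
        raise ValueError("base must be aligned to size")
    return (base >> 2) | ((size >> 3) - 1)


def napot_region(pmpaddr: int) -> tuple:
    """Decode a NAPOT pmpaddr value back into (base, size) in bytes."""
    trailing_ones = 0
    while (pmpaddr >> trailing_ones) & 1:
        trailing_ones += 1
    size = 1 << (trailing_ones + 3)
    base = (pmpaddr & ~((1 << trailing_ones) - 1)) << 2
    return base, size


def in_region(pmpaddr: int, addr: int) -> bool:
    """True if a physical address falls inside the encoded region."""
    base, size = napot_region(pmpaddr)
    return base <= addr < base + size


# Hypothetical 2 MiB enclave region, chosen only for illustration.
ENCLAVE_BASE = 0x8000_0000
ENCLAVE_SIZE = 0x20_0000

if __name__ == "__main__":
    value = napot_pmpaddr(ENCLAVE_BASE, ENCLAVE_SIZE)
    print(hex(value), [hex(v) for v in napot_region(value)])
```

The real hardware also attaches read, write, and execute permissions to each region and resolves overlaps by entry priority; those parts are omitted to keep the example short.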
Linux support is provided by the project. This allowed us to deploy Keystone enclaves onto the HiFive board in a straightforward manner.
And ... the open-source nature of Keystone allowed us to tinker and make it work for our environment. E.g., we integrated TCG’s DICE functionality.
A key takeaway for us is that virtualization really increased efficiency. We used QEMU, which allowed us to seamlessly migrate from emulator to hardware and back.
This is extremely useful for debugging, and allowed us to move between RISC-V, x86 Linux, and ARM platforms with a single set of source code.
-------
All major CPU vendors have their own enclave solutions e.g., ARM TrustZone, Intel SGX, and AMD SEV, but all are pretty much closed source. Amazon has recently announced general availability of Nitro Enclaves, AWS’ cloud enclave solution running on their EC2 hosts.
There are many use cases for secure process isolation, but our main focus - for now - is the protection of Cryptographic functionality.
E.g., in our case, cryptographic signing functions needed for endpoint identification are executed within the enclave, thus never exposing the private key.
For this PoC, we developed and deployed secure enclaves for identity attestation and for digital signing of data manifests.
As noted earlier in the talk, the Keystone enclave framework has its own built-in attestation capabilities, based on measurements taken during the secure boot process.
At each CPU reset, the root-of-trust does several things:
- it measures the security monitor image,
- it generates an attestation secret key,
- and it signs any measurements and public keys with that secret key.
But ... Out of the box -- Keystone currently simulates a secure boot process using a modified bootloader.
In the real world we need a secure boot process backed by a hardware root of trust; and there are multiple ways of solving that problem.
Example: during the 2018 RISC-V Summit, speakers from Microsemi showed a solution using an FPGA system controller.
In our PoC, we changed Keystone to make it work with the Trusted Computing Group’s DICE functionality.
Going forward though, we see OpenTitan as another very viable solution for a hardware-backed secure boot of Keystone.
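The measure-then-derive pattern behind a DICE-style boot can be sketched as below. This is a minimal illustration, not the PoC's integration: the function names and the key-derivation label are assumptions, and HMAC-SHA256 stands in both for the one-way derivation and for the signature that a real device would produce with an asymmetric key.

```python
import hashlib
import hmac


def measure(image: bytes) -> bytes:
    """Measurement of a boot image (here: its SHA-256 digest)."""
    return hashlib.sha256(image).digest()


def derive_cdi(uds: bytes, first_stage_image: bytes) -> bytes:
    """Compound Device Identifier from the device secret and a measurement.

    `uds` is the Unique Device Secret that only the root of trust can read.
    Because the derivation is one-way, modified firmware gets a different
    CDI and cannot recover the identity of the original firmware.
    """
    return hmac.new(uds, measure(first_stage_image), hashlib.sha256).digest()


def derive_attestation_key(cdi: bytes) -> bytes:
    """Derive a per-boot attestation secret from the CDI (label is made up)."""
    return hmac.new(cdi, b"attestation-key", hashlib.sha256).digest()


def sign_report(attestation_key: bytes, measurement: bytes, nonce: bytes) -> bytes:
    """Stand-in 'signature' over a measurement and a verifier-supplied nonce."""
    return hmac.new(attestation_key, measurement + nonce, hashlib.sha256).digest()


if __name__ == "__main__":
    uds = bytes(32)                    # hypothetical device secret
    sm_image = b"security monitor v1"  # hypothetical image contents
    key = derive_attestation_key(derive_cdi(uds, sm_image))
    print(sign_report(key, measure(sm_image), b"verifier-nonce").hex())
```

The nonce in the report keeps an old attestation from being replayed, and tying the key to the measurement means a tampered security monitor cannot produce reports that verify as the original.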
Soo ... Quick backgrounder on OpenTitan.
It is an open-source project with the objective to design an open, transparent and high-quality silicon root of trust.
Seagate - among others - is a member of the OpenTitan consortium, but the project is managed by lowRISC, a not-for-profit out of Cambridge, UK.
lowRISC also manages the Ibex core project, which defines a configurable 32-bit in-order RISC-V core with a 2-stage pipeline.
This Ibex core forms the basis for OpenTitan and is complemented with a number of IP blocks.
Those include memories, security blocks (such as AES, SHA), I/O and other peripherals, a TileLink interconnect, and device interface functions written in C.
The project currently supports both a simulator and FPGA targets and is available on GitHub.
I recommend you check out opentitan dot org for more information about the project.
Seagate is investigating how open source hardware designs can be used to implement the security needs of endpoints and any storage platforms in the future.
In order to test our thesis that - Edge deployments benefit from an open RoT and an open Enclave model - we developed a research platform.
We converted an existing Seagate product into an endpoint storage eval platform by redesigning the board and adding an FPGA device.
Besides those key changes, we also:
- added peripheral IP blocks where needed, in this case Xilinx IP blocks for SPI, QSPI, and I2C;
- added a placeholder IP block for functionality not yet available in OpenTitan;
- added proprietary IP blocks to enable advanced functionality, e.g., an SPI interposer and a reset controller;
- added an extender board for debug and bootstrap; and
- modified the power supply to support the additional boards and the FPGA.
We will use the FPGA as a target for our internal OpenTitan builds, both to test OpenTitan IP and to investigate integration scenarios.
The main one is to validate Tock, the open source OS written in Rust and modified by the community to support OpenTitan.
In the future, we would like to look into DICE support and integration with OCP's Project Cerberus -
using Manticore, an open source implementation of the Cerberus protocol.
For the latter, we will focus on missing functionality required to improve Edge storage security.
To summarize,
The growth in Edge compute and the rapid adoption of ML and AI technologies will drive the need for better data security.
For those use cases, it is key that we trust all data created by endpoints and have a way to always validate the integrity and provenance of that data.
RISC-V and its ecosystem provide a unique baseline for building more secure systems that will protect all data throughout its life cycle.
These systems will range from small endpoints to large HPC clusters, but in all cases, improved security is needed for data trustworthiness.
Finally, OpenTitan is a RISC-V based, domain-specific security solution that will significantly improve hardware security and thus the overall integrity of systems going forward.
If you like, reach out to me at manuel dot offenberg at seagate dot com.
Thank you for your time and have a good day.