Secure Cloud Storage and
    Computing Using
Reconfigurable Hardware

Victor Costan (龍望), Hsin-Jung Yang (楊昕蓉),
      Srini Devadas, Nickolai Zeldovich
Why Security Matters
Cloud Computing: Dreams and Reality

• The Cloud: Ideal Picture   • The Cloud: Reality
Cloud Storage: Attack Vectors




 Hypervisor        State        Hardware
   Bugs         Manipulation     Attacks
Replay Attacks are Harmful
Spot the Differences
Spot the Differences: Job
Spot the Differences: Name, Relationship Status
Why It Matters

• We rely on fresh data to make decisions
  – Google searches
  – Facebook profiles
  – Twitter, LinkedIn


• Outdated data has big impact on users
  – Wrong profile information: confusion, embarrassment
  – Old search results: bad business decisions, embarrassment
  – Old document versions: costly business decisions, regulatory issues
System Design
Design:
Cloud Storage API
• Block Device
  – Fixed block size (1MB)
  – Write(block number, block)
  – Read(block number) → block


• Easy to reason about the security

• File systems operate on top of this abstraction


                   B1            B2            B3          B4

                            Disk divided into 1MB blocks
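
The block-device API above can be sketched as follows. This is a minimal in-memory illustration, not the prototype's code: the class name `BlockDevice`, the zero-filled default for unwritten blocks, and the error handling are all assumptions made for the example.

```python
BLOCK_SIZE = 1 << 20  # 1 MB, the fixed block size from the slide


class BlockDevice:
    """Toy in-memory block device (hypothetical, for illustration only)."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self._blocks = {}  # block number -> block contents

    def write(self, block_number, block):
        self._check(block_number)
        if len(block) != BLOCK_SIZE:
            raise ValueError("block must be exactly BLOCK_SIZE bytes")
        self._blocks[block_number] = bytes(block)

    def read(self, block_number):
        self._check(block_number)
        # Assumption: blocks that were never written read back as zeros.
        return self._blocks.get(block_number, bytes(BLOCK_SIZE))

    def _check(self, block_number):
        if not 0 <= block_number < self.num_blocks:
            raise IndexError("block number out of range")


disk = BlockDevice(num_blocks=4)
disk.write(2, b"\x01" * BLOCK_SIZE)
data = disk.read(2)
```

Because every operation touches exactly one fixed-size block, integrity checks can be stated per block, which is what makes this interface easy to reason about.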
Design:
System Architecture

                                                                Client
 FPGA / ASIC                 Secure NVRAM
  (Trusted)                       Chip



                                             System Bus
                                             (Untrusted)
                                                                   Internet
                                                                 (Untrusted)



    CPU           Disk            RAM
 (Untrusted)                   (Untrusted)       Network Card
               (Untrusted)                        (Untrusted)
Design:
Trusted Storage on Untrusted Disks
160-bit hash in trusted memory authenticates 1TB disk


                                    Root Hash
                                                                            Root hash matches
                                    h7=h(h5||h6)                            iff all blocks match
  20
levels
                h5=h(h1||h2)                                                       Nodes hash
                                                        h6=h(h3||h4)              their children


         h1=h(B1)        h2=h(B2)            h3=h(B3)            h4=h(B4)         Leaves hash
                                                                                   their blocks


           B1                  B2                  B3                  B4

                        Disk divided into 1MB blocks
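
The hash tree on this slide can be sketched in a few lines. The example assumes SHA-1 as the 160-bit hash (the slides do not name the function) and uses short byte strings in place of 1MB blocks; `h` and `root_hash` are names chosen for the illustration.

```python
import hashlib


def h(data):
    # Assumption: SHA-1 stands in for the 160-bit hash on the slide.
    return hashlib.sha1(data).digest()


def root_hash(blocks):
    """Root of a binary hash tree; len(blocks) must be a power of two."""
    level = [h(b) for b in blocks]  # leaves hash their blocks
    while len(level) > 1:
        # each parent hashes the concatenation of its two children
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


blocks = [b"B1", b"B2", b"B3", b"B4"]
root = root_hash(blocks)

# Changing any single block changes the root.
tampered_root = root_hash([b"B1", b"B2", b"XX", b"B4"])

# 1 TB of 1 MB blocks gives 2**20 leaves, hence 20 levels above the leaves.
levels = ((1 << 40) // (1 << 20)).bit_length() - 1
```

Only `root` has to live in trusted memory; every other hash can be stored on untrusted media and rechecked against it.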
Design:
Hash Tree Caching
 Node number   Hash                   Verified   Left child   Right child
 1             fabe3c05d8ba995af93e   Y          Y            N
 2             e6fc9bc13d624ace2394   Y          Y            Y
 4             53a81fc2dcc53e4da819   Y          N            N
 5             b2ce548dfa2f91d83ec6   Y          N            N

• The FPGA caches hash tree nodes
• The untrusted OS is free to choose the caching policy, for maximum
  performance

                 1
         2               3
     4       5       6       7
Design:
Hash Tree Cache
• Server stores entire hash tree in RAM
• FPGA has a cache that stores a subset of nodes
• Server tells FPGA what nodes to store




                          Cache management commands



              1                             Node Hash    Verified
                                               1 fabe…      Y
      2               3                        2 e6fc…      Y
                                               4 53a8…      Y
  4       5       6         7
                                               5 b2ce…      Y
Design:
Hash Tree Cache - Load

• Server tells the FPGA to load a node into a cache entry
• The cache entry is unverified right after a load

                1                                       1

       2                                        2

  4                                         4       5

 Node Hash          Verified          Node Hash         Verified
      1 fabe…          Y                  1 fabe…           Y
      2 e6fc…          Y                  2 e6fc…           Y
      4 53a8…          N                  4 53a8…           N
                                          5 b2ce…           N
Design:
Hash Tree Cache - Verify

• Server tells the FPGA to use a node to verify its children
• FPGA checks that parent’s hash matches children hashes

                1                                       1

       2                                        2

  4        5                                4       5

 Node Hash          Verified          Node Hash         Verified
      1 fabe…          Y                 1 fabe…            Y
      2 e6fc…          Y                 2 e6fc…            Y
      4 53a8…          N                 4 53a8…            Y
      5 b2ce…          N                 5 b2ce…            Y
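
The Load and Verify commands can be modeled roughly as below. This is a sketch under stated assumptions, not the FPGA logic: SHA-1 stands in for the hash, the class and method names are invented for the example, and rejecting a second load of the same node reflects the "at most one cache line" rule stated in the deck's security notes.

```python
import hashlib


def h(data):
    # Assumption: SHA-1 as the 160-bit hash.
    return hashlib.sha1(data).digest()


class HashTreeCache:
    """Toy model of the FPGA cache: node number -> [hash, verified]."""

    def __init__(self, root_hash):
        # The root comes from trusted memory, so it starts out verified.
        self.entries = {1: [root_hash, True]}

    def load(self, node, node_hash):
        """Store a hash supplied by the untrusted server; not yet trusted."""
        if node in self.entries:
            raise ValueError("node is already cached")
        self.entries[node] = [node_hash, False]

    def verify(self, parent):
        """Use a verified parent to verify both of its cached children."""
        parent_hash, parent_ok = self.entries[parent]
        left = self.entries[2 * parent]
        right = self.entries[2 * parent + 1]
        if not parent_ok or h(left[0] + right[0]) != parent_hash:
            return False
        left[1] = right[1] = True
        return True


# Four-leaf tree, numbered as on the slides (root = 1, children 2n and 2n+1).
h4, h5, h6, h7 = (h(b) for b in (b"B1", b"B2", b"B3", b"B4"))
h2, h3 = h(h4 + h5), h(h6 + h7)
cache = HashTreeCache(h(h2 + h3))

cache.load(2, h2)
cache.load(3, h3)
ok_top = cache.verify(1)    # children 2 and 3 match the root

cache.load(4, h4)
cache.load(5, h(b"stale"))  # server supplies a wrong hash for node 5
ok_bad = cache.verify(2)    # mismatch: nodes 4 and 5 stay unverified
```

The server decides what to load and when, but it cannot make an entry verified without supplying hashes that chain up to the trusted root.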
Design:
Hash Tree Cache - Efficiency

• Checking leaf 33 requires 10 node loads for a cold cache on
  this toy example (38 loads on the real FPGA tree)
• Remember the root is always loaded in the cache

                                         1

                                 2           3

                             4       5

                    8            9

          16            17

     32        33
Design:
Hash Tree Cache - Efficiency

• Checking leaf 38 requires only 4 node loads, because 9 is already in
  the cache and verified
• Server can predict client requests and manage cache for
  high performance
                                                    1

                                      2                 3

                             4                  5

                    8                  9

          16            17       18        19

     32        33                     38        39
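
The load counts in the two efficiency examples can be reproduced with a short calculation. The sketch assumes that verifying a node means loading both children of each unverified ancestor, and that nothing is evicted between the two lookups; `loads_needed` is a name made up for the example.

```python
def loads_needed(leaf, verified):
    """Nodes to load so that `leaf` can be verified, given verified nodes."""
    to_load = []
    node = leaf
    while node not in verified:
        parent = node // 2
        to_load += [2 * parent, 2 * parent + 1]  # both children of the parent
        node = parent
    return to_load


verified = {1}              # the root is always cached and verified
cold = loads_needed(33, verified)
verified.update(cold)       # once verified, these nodes stay trusted
warm = loads_needed(38, verified)

print(len(cold), len(warm))  # 10 4
```

The second lookup stops climbing at node 9, which the first lookup already brought in as the sibling of node 8.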
Results
Results:
System Architecture

                                                                Client
 FPGA / ASIC                 Secure NVRAM
  (Trusted)                       Chip



                                             System Bus
                                             (Untrusted)
                                                                   Internet
                                                                 (Untrusted)



    CPU           Disk            RAM
 (Untrusted)                   (Untrusted)       Network Card
               (Untrusted)                        (Untrusted)
Results: Server Prototype
Results: Normal Operation
Results: FPGA Board, Normal Operation
Results: Attack Does Not Impact User
Results: FPGA Board, Under Attack
Results: Performance Block Diagram

         Read / Write 1MB Data Block to Disk

                Limit: Disk I/O Speed



                Hash 1MB Data Block

Limit: Hash Engine Speed      Limit: FPGA Data Bus



           Load & Verify Hash Tree Nodes

Limit: Hash Engine Speed       Limit: Dependencies



           Update Hash Tree (Writes Only)

Limit: Hash Engine Speed       Limit: Dependencies



                HMAC (Sign) Result

              Limit: Hash Engine Speed
Results: Prototype Performance (est.)

Disk I/O            Throughput
7,200 RPM HDD          70 MB/s
10,000 RPM HDD        100 MB/s
15,000 RPM HDD        130 MB/s
SSD                   250 MB/s

1 MB = 1 block
Results: Prototype Performance (est.)

Operation               Throughput
Block Hash                800 MB/s
Pipelined Block Hash    3,200 MB/s

1 MB = 1 block

Transport               Throughput
PCI Express x16         4,096 MB/s
SATA II                   384 MB/s
PCI Express x1            250 MB/s
Ethernet                  125 MB/s
Results: Prototype Performance (est.)

Operation                    Throughput
Tree Node Hash                 1.25 M/s
Pipelined Tree Node Hash        5.0 M/s
Tree Operations                62.5 k/s
Optimized Tree Operations       2.5 M/s
Results: Prototype Performance (est.)

Operation                    Throughput
Tree Node Hash                 1.25 M/s
Pipelined Tree Node Hash        5.0 M/s
Tree Operations                62.5 k/s
Results: Prototype Performance (est.)

Operation                    Throughput
Node HMAC                      1.25 M/s
Results: Performance Block Diagram

• Steps are performed in parallel (pipelined), because they are in
  different system components
• However, the slowest step is the bottleneck for the entire system
• Each step can be made faster by adding more hardware (e.g. more
  disks), assuming cache policies can scale up
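
A rough way to see the bottleneck argument is to take the minimum over the pipelined stages. The figures below are the estimated MB/s values from the tables in this deck; which stages to include, the choice of a 10,000 RPM disk, and the linear-scaling assumption in `with_disks` are simplifications made for the example.

```python
# Estimated stage throughputs in MB/s, taken from the tables in this deck.
stages = {
    "disk (10,000 RPM HDD)": 100,
    "pipelined block hash": 3200,
    "FPGA data bus (PCI Express x16)": 4096,
}

# In a pipeline, the slowest stage sets the system throughput.
bottleneck = min(stages, key=stages.get)
system_throughput = stages[bottleneck]


def with_disks(n):
    """System throughput with n disks, assuming disk throughput adds up."""
    scaled = dict(stages)
    scaled["disk (10,000 RPM HDD)"] = 100 * n
    return min(scaled.values())


print(bottleneck, system_throughput)   # disk (10,000 RPM HDD) 100
print(with_disks(10), with_disks(100))  # 1000 3200
```

With one disk the disk is the limit; with enough disks the limit moves to the next-slowest stage, which here is the hash engine.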
Results: Ping-Pong Workload
• Typical collaboration scenario

• Real-Life
  – Google Docs
  – Facebook Messages
  – Dropbox

• Straight-up LRU shines here

[Plot: block accessed (0 to 10) over time (0 to 20)]
Results: Photo Gallery Workload
• Modeled after data on photo applications

• Real-Life
  – Facebook’s #1 Feature
  – Google Picasa
  – Flixter

• Special policy inspired by Facebook Haystack classifies photos,
  loads cache predictively

[Plot: block accessed (0 to 10) over time (0 to 20)]
Results: Map-Reduce Workload
• Index-generating Map-Reduce

• Real-Life
  – Google PageRank
  – Facebook friend graph (EdgeRank)

• Special policy that takes advantage of Map-Reduce access pattern

[Plot: block accessed (0 to 30) over time (0 to 10)]
Results: Cache Hit Rates
• Applications: 2 users collaborating on a file (ping-pong), photo
  gallery browsing, Map-Reduce job

• Cache policies: Speculative Least-Recently Used, Facebook Haystack’s
  policy optimized for caching, policy optimized for Map-Reduce access
  patterns

• Conclusion: no policy works well on all applications, so app server
  must drive policy

[Chart: cache hit rate (0.5 to 1) for the Spec LRU, Haystack, and
MR-Aware policies]
Results: Protocol Overhead

  • Client – Server Bandwidth overhead: 0.002%
      – Operation: 1 HMAC (20 bytes) per 1MB = 0.002%
      – Handshake: extra secret exchange piggybacks on SSL: 5%


  • Latency overhead (1 client): 4%
      – Without security: 8.2ms / request
      – With security: 8.5ms / request
      – Latency overhead = the latency of a very fast Internet hop


  • No throughput overhead (N-clients)
      – With or without security: 100MB/s
      – Need 40 HDDs to saturate PCI-E x16, 52 HDDs to saturate FPGA
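
The two percentages on this slide follow from simple ratios. The sketch below redoes the arithmetic, assuming 1 MB means 2**20 bytes; the variable names are illustrative.

```python
HMAC_BYTES = 20
BLOCK_BYTES = 1 << 20  # assumes 1 MB = 2**20 bytes

# Extra bytes sent per block, as a percentage of the block itself.
bandwidth_overhead_pct = 100 * HMAC_BYTES / BLOCK_BYTES

latency_plain_ms = 8.2
latency_secure_ms = 8.5
latency_overhead_pct = (
    100 * (latency_secure_ms - latency_plain_ms) / latency_plain_ms
)

print(f"{bandwidth_overhead_pct:.3f}%  {latency_overhead_pct:.1f}%")
```

The bandwidth figure comes out at about 0.002%, and the latency figure at about 3.7%, which the slide rounds to 4%.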



MIT COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE LABORATORY
Results: Protocol Overhead

• Protocol is simple
  enough to implement
  on browser side
  – Chrome
  – Firefox
  – Internet Explorer 10


• Easy integration in
  existing Web
  applications

• End-to-end security
Thank You!



  Questions?
Other Applications


  • FPGA can be used to load user-specified circuits and
    perform arbitrary computation with security guarantees

  • Applications: encrypted image search, financial calculations

  • Potential applications in highly regulated industries, e.g.
    medical record keeping and processing, secure financial
    services




Secure Computation:
   Overview
                                Untrusted
                               computation:                      VM image 
                                                                  CPU cores
                                VM image                Cloud
          Task
                                 Trusted               Machine   Circuit spec
                               computation:                        FPGA
                               Circuit spec                         LUTs




     • Most code is untrusted, executes in a VM

     • Trusted code is broken up into kernels which become
       circuits deployed onto an FPGA

     • If efficiency is not an issue, deploy a processor on the
       FPGA, execute software securely
MIT COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE LABORATORY             6/9/2011
Secure Computation: Challenge

• Multi-tenancy is the key to the cloud’s cost effectiveness

• FPGA can host different applications running in parallel

• Challenge: isolation between applications, just like a hypervisor

                  VM Hypervisor
   Client 1 VM    Client 2 VM    Client 3 VM

                   PCI Express

                 FPGA controller
   Client 1       Client 2       Client 3
   Application    Application    Application
Design:
FPGA Boot Sequence
                         random nonce

                         PKcard + Manufacturer Certificate

Check certificate against e-fuses
Check PKcard against certificate
                                    PUFsyndrome + SignPKcard(PUFsyndrome)

Compute SKfpga from PUFsyndrome

                         Root Hash + SignPKcard(nonce || Root Hash)

Verify signature
                        EncSKfpga(SKcard) + MACSKfpga(nonce || SKcard)

Verify MAC
Design:
Client Trust Model
• Each FPGA – NVRAM pair has an Endorsement Key (EK)
• Manufacturer certifies the public EK
• Client uses the public EK to encrypt an HMAC key, which
  becomes its shared secret with the trusted hardware


                                                 Manufacturer
                  verify           Endorsement         sign
   Client
                                   Certificate
       generate
  HMAC key                                         PubEK PrivEK
              encrypt with PubEK
                                                                decrypt with
    Encrypted HMAC key                                            PrivEK
                                                          HMAC key
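
Once the HMAC key is shared, the client can check that a reply really came from the trusted hardware. The sketch below covers only that check and leaves out the public-key step. It assumes HMAC-SHA1 (consistent with the 20-byte HMAC quoted in this deck); the message layout (nonce, block number, block hash) and the function name `sign` are hypothetical.

```python
import hashlib
import hmac
import os


def sign(key, nonce, block_number, block):
    # Hypothetical message layout: nonce || block number || hash of block.
    message = (
        nonce
        + block_number.to_bytes(8, "big")
        + hashlib.sha1(block).digest()
    )
    return hmac.new(key, message, hashlib.sha1).digest()


hmac_key = os.urandom(20)  # generated by the client, shared with the FPGA
nonce = os.urandom(16)     # fresh per request (assumed, to bind the reply)

block = b"example block contents"
tag = sign(hmac_key, nonce, 7, block)  # computed by the trusted hardware

# Client side: recompute the tag and compare in constant time.
genuine = hmac.compare_digest(tag, sign(hmac_key, nonce, 7, block))
forged = hmac.compare_digest(tag, sign(hmac_key, nonce, 7, b"stale contents"))
```

A server that returns stale or altered contents cannot produce a matching tag, because it never sees the HMAC key.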
Design:
Hash Tree Security

1. Computationally infeasible to come up with a block B1’ such that
   B1 ≠ B1’ but h(B1) = h(B1’)

2. Computationally infeasible to come up with a node hash h1’ such that
   h1 ≠ h1’ but h(h1||h2) = h(h1’||h2)

Therefore, the root hash authenticates the entire contents of
the tree.
Design:
FPGA Boot Sequence Security

• Server OS transfers messages between FPGA and Trusted
  Memory → untrusted channel

• FPGA authenticates Trusted Memory using Manufacturer
  Certificate, whose public key is burned into FPGA’s e-fuses

• Trusted Memory authenticates FPGA using its Physically
  Unclonable Function (PUF)

• At manufacturing time, FPGA is paired with memory chip

• FPGA can be paired with new memory chip if necessary
Design:
Hash Tree Cache Security

• Server OS responsible for loading and verifying tree nodes

• Parent node hash verifies children nodes

• Reading a block requires the block’s leaf to be verified

• Writing a block requires the path from the block’s leaf to the
  root to be loaded and verified

• A node can be loaded in at most one cache line, to prevent
  replay attacks using stale node hashes

RAG Patterns and Vector Search in Generative AIUdaiappa Ramachandran
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.francesco barbera
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 

Último (20)

Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 
Things you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceThings you didn't know you can use in your Salesforce
Things you didn't know you can use in your Salesforce
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
 
Introduction to Quantum Computing
Introduction to Quantum ComputingIntroduction to Quantum Computing
Introduction to Quantum Computing
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
RAG Patterns and Vector Search in Generative AI
RAG Patterns and Vector Search in Generative AIRAG Patterns and Vector Search in Generative AI
RAG Patterns and Vector Search in Generative AI
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 

Trusted Cloud Storage Tech Talk

  • 1. Secure Cloud Storage and Computing Using Reconfigurable Hardware Victor Costan (龍望), Hsin-Jung Yang (楊昕蓉), Srini Devadas, Nickolai Zeldovich
  • 3. Cloud Computing: Dreams and Reality • The Cloud: Ideal Picture • The Cloud: Reality
  • 4. Cloud Storage: Attack Vectors: Hypervisor Bugs, State Manipulation, Hardware Attacks
  • 11. Spot the Differences: Name, Relationship Status
  • 12. Why It Matters
      • We rely on fresh data to make decisions
        – Google searches
        – Facebook profiles
        – Twitter, LinkedIn
      • Outdated data has a big impact on users
        – Wrong profile information: confusion, embarrassment
        – Old search results: bad business decisions, embarrassment
        – Old document versions: costly business decisions, regulatory issues
  • 14. Design: Cloud Storage API
      • Block device
        – Fixed block size (1 MB)
        – Write(block number, block)
        – Read(block number) → block
      • Easy to reason about the security
      • File systems operate on top of this abstraction
      • Diagram: disk divided into 1 MB blocks (B1, B2, B3, B4)
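The block-device interface on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `BlockDevice`, the in-memory dictionary, and the zero-filled default for unwritten blocks are hypothetical and do not come from the talk.

```python
BLOCK_SIZE = 1 << 20  # fixed 1 MB block size, as on the slide


class BlockDevice:
    """Hypothetical in-memory stand-in for the Read/Write block API."""

    def __init__(self, num_blocks: int):
        self.num_blocks = num_blocks
        self._blocks = {}  # block number -> block contents

    def _check(self, block_number: int) -> None:
        if not 0 <= block_number < self.num_blocks:
            raise IndexError("block number out of range")

    def write(self, block_number: int, block: bytes) -> None:
        self._check(block_number)
        if len(block) != BLOCK_SIZE:
            raise ValueError("blocks must be exactly BLOCK_SIZE bytes")
        self._blocks[block_number] = bytes(block)

    def read(self, block_number: int) -> bytes:
        self._check(block_number)
        # Assumption: blocks that were never written read back as zeros.
        return self._blocks.get(block_number, bytes(BLOCK_SIZE))


disk = BlockDevice(num_blocks=4)
disk.write(1, b"\x01" * BLOCK_SIZE)
```

Because every operation names exactly one fixed-size block, the security argument only has to cover two calls, which is the point the slide makes about this abstraction.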
  • 15. Design: System Architecture
      • Diagram components: FPGA / ASIC (Trusted), Secure NVRAM Chip, System Bus (Untrusted), CPU (Untrusted), Disk (Untrusted), RAM (Untrusted), Network Card (Untrusted), Internet (Untrusted), Client
  • 16. Design: Trusted Storage on Untrusted Disks
      • A 160-bit hash in trusted memory authenticates a 1 TB disk
      • Root hash: h7 = h(h5 || h6); the root hash matches iff all blocks match
      • Nodes hash their children: h5 = h(h1 || h2), h6 = h(h3 || h4)
      • Leaves hash their blocks: h1 = h(B1), h2 = h(B2), h3 = h(B3), h4 = h(B4)
      • 20 levels; disk divided into 1 MB blocks (B1, B2, B3, B4)
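The four-block hash tree on this slide can be reproduced as a small sketch. SHA-1 is used here only because it produces a 160-bit digest; the slides do not name the hash function, so that choice, along with the helper names `h` and `root_hash`, is an assumption.

```python
import hashlib


def h(data: bytes) -> bytes:
    """160-bit hash; SHA-1 is an assumption, the slide only gives the size."""
    return hashlib.sha1(data).digest()


def root_hash(blocks: list) -> bytes:
    """Hash the leaves, then hash pairs of children until one node is left."""
    if len(blocks) & (len(blocks) - 1):
        raise ValueError("this sketch needs a power-of-two number of blocks")
    level = [h(b) for b in blocks]  # h1..h4 = h(B1)..h(B4)
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


blocks = [b"B1", b"B2", b"B3", b"B4"]
root = root_hash(blocks)

# Replaying an old version of any single block changes the root.
tampered = root_hash([b"B1", b"B2-old", b"B3", b"B4"])

# 1 TB of 1 MB blocks gives 2**20 leaves, hence 20 levels of hashing.
num_leaves = (1 << 40) // (1 << 20)
levels = num_leaves.bit_length() - 1
```

Only `root` has to live in trusted memory; every other hash can sit on the untrusted disk or in untrusted RAM and be rechecked against it.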
  • 17. Design: Hash Tree Caching
      • The FPGA caches hash tree nodes
      • The untrusted OS is free to choose the caching policy, for maximum performance
      • Cache table (node: hash, verified, left child, right child)
        – 1: fabe3c05d8ba995af93e, Y, Y, N
        – 2: e6fc9bc13d624ace2394, Y, Y, Y
        – 4: 53a81fc2dcc53e4da819, Y, N, N
        – 5: b2ce548dfa2f91d83ec6, Y, N, N
      • Diagram: tree with nodes 1; 2, 3; 4, 5, 6, 7
  • 18. Design: Hash Tree Cache
      • Server stores entire hash tree in RAM
      • FPGA has a cache that stores a subset of nodes
      • Server tells FPGA what nodes to store (cache management commands)
      • Cache table (node: hash, verified)
        – 1: fabe…, Y
        – 2: e6fc…, Y
        – 4: 53a8…, Y
        – 5: b2ce…, Y
      • Diagram: tree with nodes 1; 2, 3; 4, 5, 6, 7
  • 19. Design: Hash Tree Cache - Load
      • Server tells the FPGA to load a node into a cache entry
      • The cache entry is unverified right after a load
      • Cache table before the load (node: hash, verified)
        – 1: fabe…, Y
        – 2: e6fc…, Y
        – 4: 53a8…, N
      • Cache table after loading node 5
        – 1: fabe…, Y
        – 2: e6fc…, Y
        – 4: 53a8…, N
        – 5: b2ce…, N
  • 20. Design: Hash Tree Cache - Verify
      • Server tells the FPGA to use a node to verify its children
      • FPGA checks that the parent’s hash matches the children’s hashes
      • Cache table before the verify (node: hash, verified)
        – 1: fabe…, Y
        – 2: e6fc…, Y
        – 4: 53a8…, N
        – 5: b2ce…, N
      • Cache table after using node 2 to verify nodes 4 and 5
        – 1: fabe…, Y
        – 2: e6fc…, Y
        – 4: 53a8…, Y
        – 5: b2ce…, Y
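The load and verify commands from the last two slides can be modeled as a toy cache. This is a sketch under assumptions, not the FPGA's actual interface: the class `HashTreeCache`, its method names, the use of SHA-1, and the rule that a parent must itself be verified before it can vouch for its children are all illustrative choices. Node numbering follows the diagrams, where node n has children 2n and 2n + 1.

```python
import hashlib


def h(data: bytes) -> bytes:
    """160-bit hash; SHA-1 is an assumption."""
    return hashlib.sha1(data).digest()


class HashTreeCache:
    """Toy model of the trusted cache: node number -> [hash, verified]."""

    def __init__(self, root_hash: bytes):
        # Node 1 (the root) comes from trusted memory, so it starts verified.
        self.entries = {1: [root_hash, True]}

    def load(self, node: int, node_hash: bytes) -> None:
        if node in self.entries:
            raise ValueError("node is already cached")
        self.entries[node] = [node_hash, False]  # unverified right after a load

    def verify(self, parent: int) -> None:
        p_hash, p_verified = self.entries[parent]
        left = self.entries[2 * parent]
        right = self.entries[2 * parent + 1]
        if not p_verified:
            raise ValueError("parent is not verified")
        if h(left[0] + right[0]) != p_hash:
            raise ValueError("children do not match the parent hash")
        left[1] = right[1] = True


# Untrusted server side: the full tree for four blocks (leaves are nodes 4..7).
tree = {4 + i: h(b) for i, b in enumerate([b"B1", b"B2", b"B3", b"B4"])}
for n in (3, 2, 1):
    tree[n] = h(tree[2 * n] + tree[2 * n + 1])

cache = HashTreeCache(tree[1])
cache.load(2, tree[2])
cache.load(3, tree[3])
cache.verify(1)          # root vouches for nodes 2 and 3
cache.load(4, tree[4])
cache.load(5, tree[5])
cache.verify(2)          # node 2 vouches for nodes 4 and 5
```

The server decides which nodes to load and when, but it cannot make a stale hash look verified, because every verify is checked against an already-verified parent.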
  • 21. Design: Hash Tree Cache - Efficiency
      • Checking leaf 33 requires 10 node loads for a cold cache on this toy example (38 loads on the real FPGA tree)
      • Remember that the root is always loaded in the cache
      • Diagram: tree nodes 1, 2, 3, 4, 5, 8, 9, 16, 17, 32, 33
  • 22. Design: Hash Tree Cache - Efficiency
      • Checking leaf 38 requires only 4 node loads, because node 9 is already in the cache and verified
      • Server can predict client requests and manage the cache for high performance
      • Diagram: tree nodes 1, 2, 3, 4, 5, 8, 9, 16, 17, 18, 19, 32, 33, 38, 39
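The load counts on the two efficiency slides can be reproduced with a short walk up the tree. The function name `nodes_to_load` is hypothetical; the sketch assumes that verifying a node requires loading both the node and its sibling, and that the walk stops at the first ancestor that is already cached and verified.

```python
def nodes_to_load(leaf: int, verified: set) -> list:
    """Return the nodes that must be loaded before `leaf` can be verified."""
    if 1 not in verified:
        raise ValueError("the root (node 1) must always be cached")
    loads = []
    node = leaf
    while node not in verified:
        loads += [node, node ^ 1]  # the node and its sibling
        node //= 2                 # move up to the parent
    return sorted(loads)


cold_cache = {1}                                 # only the root is cached
first = nodes_to_load(33, cold_cache)            # 10 loads
warm_cache = cold_cache | set(first)
second = nodes_to_load(38, warm_cache)           # 4 loads, stops at node 9
```

The second request is cheap because its path joins the first one at node 9, which is why a server that can predict client requests can keep the right subtrees warm.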
  • 24. Results: System Architecture
      • Diagram components: FPGA / ASIC (Trusted), Secure NVRAM Chip, System Bus (Untrusted), CPU (Untrusted), Disk (Untrusted), RAM (Untrusted), Network Card (Untrusted), Internet (Untrusted), Client
  • 28. Results: FPGA Board, Normal Operation
  • 29. Results: Attack Does Not Impact User
  • 30. Results: FPGA Board, Under Attack
  • 31. Results: Performance Block Diagram
      • Read / Write 1 MB Data Block to Disk (limit: disk I/O speed)
      • Hash 1 MB Data Block (limits: hash engine speed, FPGA data bus)
      • Load & Verify Hash Tree Nodes (limits: hash engine speed, dependencies)
      • Update Hash Tree, writes only (limits: hash engine speed, dependencies)
      • HMAC (Sign) Result (limit: hash engine speed)
  • 33. Results: Prototype Performance (est.)
      • Disk I/O throughput (1 MB = 1 block)
        – 7,200 RPM HDD: 70 MB/s
        – 10,000 RPM HDD: 100 MB/s
        – 15,000 RPM HDD: 130 MB/s
        – SSD: 250 MB/s
  • 35. Results: Prototype Performance (est.)
      • Operation throughput (1 MB = 1 block)
        – Block Hash: 800 MB/s
        – Pipelined Block Hash: 3,200 MB/s
      • Transport throughput
        – PCI Express x16: 4,096 MB/s
        – SATA II: 384 MB/s
        – PCI Express x1: 250 MB/s
        – Ethernet: 125 MB/s
  • 37. Results: Prototype Performance (est.)
      • Operation throughput (1 MB = 1 block)
        – Tree Node Hash: 1.25 M/s
        – Pipelined Tree Node Hash: 5.0 M/s
        – Tree Operations: 62.5 k/s
        – Optimized Tree Operations: 2.5 M/s
      • Transport throughput
        – PCI Express x16: 4,096 MB/s
        – SATA II: 384 MB/s
        – PCI Express x1: 250 MB/s
        – Ethernet: 125 MB/s
  • 39. Results: Prototype Performance (est.)
      • Operation throughput (1 MB = 1 block)
        – Tree Node Hash: 1.25 M/s
        – Pipelined Tree Node Hash: 5.0 M/s
        – Tree Operations: 62.5 k/s
      • Transport throughput
        – PCI Express x16: 4,096 MB/s
        – SATA II: 384 MB/s
        – PCI Express x1: 250 MB/s
        – Ethernet: 125 MB/s
  • 41. Results: Prototype Performance (est.)
      • Operation throughput (1 MB = 1 block)
        – Node HMAC: 1.25 M/s
      • Transport throughput
        – PCI Express x16: 4,096 MB/s
        – SATA II: 384 MB/s
        – PCI Express x1: 250 MB/s
        – Ethernet: 125 MB/s
  • 42. Results: Performance Block Diagram
      • Steps are performed in parallel (pipelined), because they are in different system components
      • However, the slowest step is the bottleneck for the entire system
      • Each step can be made faster by adding more hardware (e.g. more disks), assuming cache policies can scale up
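The bottleneck rule on this slide amounts to taking the minimum over the pipelined stages. The sketch below plugs in estimates quoted on the prototype performance slides; which disk and which transport to pair together is an illustrative assumption, not a configuration stated in the talk.

```python
# Estimated stage throughputs in MB/s, taken from the prototype slides.
# The particular combination (one 10,000 RPM HDD, PCI Express x16) is
# an assumption made for this example.
stages_mb_per_s = {
    "disk I/O (10,000 RPM HDD)": 100,
    "block hash": 800,
    "transport (PCI Express x16)": 4096,
}

# In a pipeline every stage runs concurrently, so the slowest stage
# sets the throughput of the whole system.
bottleneck = min(stages_mb_per_s, key=stages_mb_per_s.get)
throughput = stages_mb_per_s[bottleneck]
```

With a single disk, the disk is the limit, so adding hashing capacity would not help until more disks are attached.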
  • 43. Results: Ping-Pong Workload
      • Typical collaboration scenario
      • Real-life examples
        – Google Docs
        – Facebook Messages
        – Dropbox
      • Straight-up LRU shines here
      • Plot: block accessed (0 to 10) over time (0 to 20)
  • 44. Results: Photo Gallery Workload
      • Modeled after data on photo applications
      • Real-life examples
        – Facebook’s #1 Feature
        – Google Picasa
        – Flixter
      • Special policy inspired by Facebook Haystack classifies photos, loads cache predictively
      • Plot: block accessed (0 to 10) over time (0 to 20)
  • 45. Results: Map-Reduce Workload
      • Index-generating Map-Reduce
      • Real-life examples
        – Google PageRank
        – Facebook friend graph (EdgeRank)
      • Special policy that takes advantage of the Map-Reduce access pattern
      • Plot: block accessed (0 to 30) over time (0 to 10)
  • 46. Results: Cache Hit Rates
      • Applications: 2 users collaborating on a file (ping-pong), photo gallery browsing, Map-Reduce job
      • Cache policies: Speculative Least-Recently Used (Spec LRU), Facebook Haystack’s policy optimized for caching (Haystack), policy optimized for Map-Reduce access patterns (MR-Aware)
      • Conclusion: no policy works well on all applications, so the app server must drive the policy
      • Chart: cache hit rate (0.5 to 1) per application for Spec LRU, Haystack, and MR-Aware
  • 47. Results: Protocol Overhead
      • Client-server bandwidth overhead: 0.002%
        – Operation: 1 HMAC (20 bytes) per 1 MB = 0.002%
        – Handshake: extra secret exchange piggybacks on SSL: 5%
      • Latency overhead (1 client): 4%
        – Without security: 8.2 ms / request
        – With security: 8.5 ms / request
        – Latency overhead = the latency of a very fast Internet hop
      • No throughput overhead (N clients)
        – With or without security: 100 MB/s
        – Need 40 HDDs to saturate PCI-E x16, 52 HDDs to saturate FPGA
      • MIT Computer Science and Artificial Intelligence Laboratory
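The two headline percentages on this slide follow directly from the numbers it quotes. A quick check, assuming 1 MB means 2**20 bytes:

```python
HMAC_BYTES = 20          # one HMAC per operation
BLOCK_BYTES = 1 << 20    # 1 MB block (assumed to be 2**20 bytes)

bandwidth_overhead_pct = 100 * HMAC_BYTES / BLOCK_BYTES

latency_plain_ms = 8.2   # without security
latency_secure_ms = 8.5  # with security
latency_overhead_pct = (
    100 * (latency_secure_ms - latency_plain_ms) / latency_plain_ms
)

print(f"bandwidth overhead: {bandwidth_overhead_pct:.3f}%")  # 0.002%
print(f"latency overhead: {latency_overhead_pct:.0f}%")      # 4%
```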
  • 48. Results: Protocol Overhead • Protocol is simple enough to implement on browser side – Chrome – Firefox – Internet Explorer 10 • Easy integration in existing Web applications • End-to-end security
  • 49. Thank You! Questions?
  • 50. Other Applications
      • FPGA can be used to load user-specified circuits and perform arbitrary computation with security guarantees
      • Applications: encrypted image search, financial calculations
      • Potential applications in highly regulated industries, e.g. medical record keeping and processing, secure financial services
  • 51. Secure Computation: Overview (6/9/2011)
      • Diagram: a cloud task (VM image + circuit spec) runs on a machine; untrusted computation: VM image → CPU cores; trusted computation: circuit spec → FPGA LUTs
      • Most code is untrusted, executes in a VM
      • Trusted code is broken up into kernels which become circuits deployed onto an FPGA
      • If efficiency is not an issue, deploy a processor on the FPGA, execute software securely
  • 52. Secure Computation: Challenge
      • Multi-tenancy is the key to the cloud’s cost effectiveness
      • FPGA can host different applications running in parallel
      • Challenge: isolation between applications, just like a hypervisor
      • Diagram: VM hypervisor hosting Client 1, Client 2, and Client 3 VMs, connected over PCI Express to an FPGA controller hosting the Client 1, Client 2, and Client 3 applications
  • 54. Design: FPGA Boot Sequence
      • Message: random nonce
      • Message: PK_card + Manufacturer Certificate
        – Check certificate against e-fuses
        – Check PK_card against certificate
      • Message: PUF_syndrome + Sign_PKcard(PUF_syndrome)
        – Compute SK_fpga from PUF_syndrome
      • Message: Root Hash + Sign_PKcard(nonce || Root Hash)
        – Verify signature
      • Message: Enc_SKfpga(SK_card) + MAC_SKfpga(nonce || SK_card)
        – Verify MAC
  • 55. Design: Client Trust Model
      • Each FPGA – NVRAM pair has an Endorsement Key (EK)
      • Manufacturer certifies the public EK
      • Client uses the public EK to encrypt an HMAC key, which becomes its shared secret with the trusted hardware
      • Diagram: the manufacturer signs the Endorsement Certificate and the client verifies it; the client generates an HMAC key and encrypts it with PubEK; the trusted hardware decrypts the encrypted HMAC key with PrivEK to recover the HMAC key
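Once the client and the trusted hardware share an HMAC key, each result can be authenticated with a 20-byte tag. The sketch below covers only that step; the PubEK encryption of the key is omitted, and the message layout (nonce, block number, hash of the block), the helper names, and the choice of HMAC-SHA1 are assumptions made for illustration rather than the protocol's actual wire format.

```python
import hashlib
import hmac
import os


def sign_result(key: bytes, nonce: bytes, block_number: int, block: bytes) -> bytes:
    """Trusted hardware side: tag a result with the shared HMAC key."""
    # Hypothetical layout: nonce || 8-byte block number || SHA-1 of the block.
    message = nonce + block_number.to_bytes(8, "big") + hashlib.sha1(block).digest()
    return hmac.new(key, message, hashlib.sha1).digest()


def check_result(key: bytes, nonce: bytes, block_number: int,
                 block: bytes, tag: bytes) -> bool:
    """Client side: recompute the tag and compare in constant time."""
    expected = sign_result(key, nonce, block_number, block)
    return hmac.compare_digest(expected, tag)


key = os.urandom(20)     # stands in for the HMAC key sent under PubEK
nonce = os.urandom(16)   # fresh per request, so old answers cannot be replayed
tag = sign_result(key, nonce, 7, b"fresh contents")
```

Binding a fresh nonce into the tag is what lets the client reject a server that replays a correctly signed but outdated answer.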
  • 56. Design: Hash Tree Security
      1. It is computationally infeasible to come up with a block B1’ such that B1 ≠ B1’ but h(B1) = h(B1’)
      2. It is computationally infeasible to come up with a node hash h1’ such that h1 ≠ h1’ but h(h1||h2) = h(h1’||h2)
      Therefore, the root hash authenticates the entire contents of the tree.
  • 57. Design: FPGA Boot Sequence Security
      • Server OS transfers messages between FPGA and Trusted Memory → untrusted channel
      • FPGA authenticates Trusted Memory using the Manufacturer Certificate, whose public key is burned into the FPGA’s e-fuses
      • Trusted Memory authenticates FPGA using its Physically Unclonable Function (PUF)
      • At manufacturing time, FPGA is paired with memory chip
      • FPGA can be paired with new memory chip if necessary
  • 58. Design: Hash Tree Cache Security
      • Server OS responsible for loading and verifying tree nodes
      • Parent node hash verifies children nodes
      • Reading a block requires the block’s leaf to be verified
      • Writing a block requires the path from the block’s leaf to the root to be loaded and verified
      • A node can be loaded in at most one cache line, to prevent replay attacks using stale node hashes
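The write rule on this slide exists because changing one block changes every hash on the path from its leaf to the root. A minimal sketch of that update follows, with the whole tree held in an ordinary dictionary; the function names and SHA-1 are assumptions, and the cache-verification checks that the real design requires before a write are left out.

```python
import hashlib


def h(data: bytes) -> bytes:
    """160-bit hash; SHA-1 is an assumption."""
    return hashlib.sha1(data).digest()


def build_tree(blocks: list) -> dict:
    """Node n has children 2n and 2n + 1; leaves are nodes len(blocks).."""
    n = len(blocks)
    tree = {n + i: h(b) for i, b in enumerate(blocks)}
    for node in range(n - 1, 0, -1):
        tree[node] = h(tree[2 * node] + tree[2 * node + 1])
    return tree


def write_block(tree: dict, num_blocks: int, index: int, new_block: bytes) -> list:
    """Rehash the leaf for `index`, then every ancestor up to the root."""
    node = num_blocks + index
    tree[node] = h(new_block)
    updated = [node]
    while node > 1:
        node //= 2
        tree[node] = h(tree[2 * node] + tree[2 * node + 1])
        updated.append(node)
    return updated


blocks = [b"B1", b"B2", b"B3", b"B4"]
tree = build_tree(blocks)
old_root = tree[1]
updated_nodes = write_block(tree, len(blocks), 1, b"B2-new")
```

Each ancestor is recomputed from its two children, so the sibling hashes along the path must be trustworthy too, which is why the path has to be loaded and verified before the write.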