+
Building Secure Applications
With HBase / Accumulo
Sujee Maniyam
sujee@elephantscale.com
NoSQL Now! 2014 Conference
Aug 2014, San Jose, CA
+
About This Talk…
• Some practical tips & design patterns on building secure applications using HBase and Accumulo
• A quick demo (fingers crossed!)
• Audience: technical
+
Who Invited This Guy?
• Hi, I am Sujee Maniyam
• Founder / Principal @ Elephant Scale: consulting & training in Big Data, NoSQL
• Co-author of open source Hadoop book: http://hadoopilluminated.com
• Founder / Organizer of ‘Big Data Guru’ meetup: http://www.meetup.com/BigDataGurus/
• Open source: http://github.com/sujee
• http://sujee.net | http://www.linkedin.com/in/sujeemaniyam
+
NoSQL eco-system (too many!)
+
HBase: Quick Intro
• Modeled after Google Bigtable
• Distributed NoSQL store built on Hadoop / HDFS
• Apache project
• http://hbase.apache.org/
[Diagram: HBase layered on top of HDFS]
+
Accumulo: Quick Intro
• Developed by the National Security Agency (NSA)!
• An implementation of Google Bigtable
• NoSQL store on top of HDFS
• Security is a first-class concept
[Diagram: Accumulo layered on top of HDFS]
+
HBase & Accumulo
• Both are Bigtable implementations
• Based on HDFS
• Written in Java
• Apache open source projects
[Diagram: HBase and Accumulo side by side, both layered on top of HDFS]
+
Approach to Security in Hadoop
Until Recently…
+
But the Security Picture Has Improved Rapidly…
• A lot of work is going on in the ecosystem
• Hadoop vendors (Cloudera, Hortonworks, ...) have been very actively working on security features
• ‘The core’ features are in
• Ease of use is improving as well
+
Next : Building Secure Applications
+
What Does It Mean to be ‘Secure’?
• 1) Control who can get in
• 2) Verify the person’s identity
• 3) Safeguard communications with the user
• 4) Decide what is allowed for this user
• 5) And finally… protect data at rest
+
1) Who can get in
• Control which machines can connect to the NoSQL cluster
• Don’t expose the cluster to the public
  • Too many open ports
  • Too vulnerable
• Solutions:
  • Run the cluster behind a firewall
  • Restrict which machines can connect to the cluster
  • Linux / network-level security
  • Configured outside the NoSQL store itself
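
The "restrict which machines can connect" idea can be sketched as a simple allow-list check. This is only a toy model: in practice the rule lives in a firewall or security group, not in application code, and the networks below are made-up assumptions, not values from the talk.

```python
import ipaddress

# Hypothetical allow-list: only the application tier and one admin host
# may reach the cluster. These networks are illustrative assumptions.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.1.0/24"),  # application servers
    ipaddress.ip_network("10.0.9.5/32"),  # admin bastion host
]

def may_connect(client_ip):
    """Return True if client_ip falls inside any allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)
```

Everything not explicitly listed is rejected, which matches the "don’t expose the cluster to the public" advice: the default answer is no.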
+
Trusted Environment
+
2) User Authentication
• Wolf: Knock… Knock…
• Pig: Who is there?
• Wolf: It is me… little pig
• How can we verify the user?
  • Username / password (as with Gmail)
  • Or use a trusted third party (a referee)
  • Kerberos
Image source: http://1.bp.blogspot.com/
+
Kerberos: Quick Primer
• Kerberos is an authentication protocol for networked machines
• Validates client to server and vice versa
• Strong crypto algorithms (AES, 3DES…)
+
Kerberos Protocol for Getting a Beer at a Carnival / Fair :)
+
Kerberos Protocol Explained: Getting Beer @ Fair / Party
• Prove your age (identity) to the wristband issuer
  • Ticket Granting Ticket
• Get a wristband → qualifies you to get beer
  • Service Ticket
• Go to the bartender and ask for beer using your wristband
  • Service Request
• Get beer! :)
• For a technically correct explanation see: http://www.roguelynn.com/words/explain-like-im-5-kerberos/
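
The three steps above can be sketched as a toy ticket flow. This is a deliberately simplified model and not the real protocol: real Kerberos never sends the password to the KDC, encrypts its tickets, and adds timestamps and session keys. All names, keys, and passwords below are invented for illustration.

```python
import hashlib
import hmac

# Toy secrets. In real Kerberos these are long-term keys held by the KDC
# and by each service; the values here are made up.
KDC_KEY = b"kdc-secret"
SERVICE_KEY = b"bartender-secret"
KNOWN_USERS = {"alice": "correct-password"}

def _sign(key, message):
    """Sign a message so the holder of the same key can verify it later."""
    return hmac.new(key, message.encode(), hashlib.sha256).hexdigest()

def get_tgt(user, password):
    """Step 1: prove identity once, receive a ticket-granting ticket."""
    if KNOWN_USERS.get(user) != password:
        raise PermissionError("unknown user or bad password")
    return user, _sign(KDC_KEY, "tgt:" + user)

def get_service_ticket(tgt, service):
    """Step 2: trade a valid TGT for a ticket to one named service."""
    user, signature = tgt
    if not hmac.compare_digest(signature, _sign(KDC_KEY, "tgt:" + user)):
        raise PermissionError("invalid TGT")
    return user, service, _sign(SERVICE_KEY, service + ":" + user)

def service_accepts(ticket, service):
    """Step 3: the service verifies the ticket without seeing the password."""
    user, ticket_service, signature = ticket
    expected = _sign(SERVICE_KEY, service + ":" + user)
    return ticket_service == service and hmac.compare_digest(signature, expected)
```

The point of the sketch is the shape of the exchange: the user authenticates once, and each service trusts the ticket issuer rather than checking credentials itself.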
+
Kerberos Integration
                        HBase    Accumulo
Kerberos integration    Yes      Yes (simple authentication built-in also)
+
3) Secure Client Communication
• Guard client / server communication (‘on the wire’)
• Done by using SASL (certificates)
• Prevents snooping by third parties
                                HBase    Accumulo
Secure client communications    Yes      Yes
+
4) What Is Allowed For This User?
• In an unsecured environment, users can read / write to any table
  → not very secure!
• Control which data users can see
+
Quick Primer on HBase Storage
• Tables have many rows
• A row has multiple columns (or qualifiers)
• Columns are grouped into column families
• Each cell also has a timestamp (not shown here)
[Diagram: one table row with two column families, "info" (Family1) and "secure" (Family2), over the columns Customer_id, name, email, phone, Last 4 social, Full SSN; one cell is highlighted]
+
HBase Allows Access Control At Family Level
[Diagram: the same table with column families "info" and "secure" over the columns Customer_id, name, email, phone, Last 4 social, Full SSN]
• First-level CSRs can access only the "info" family
• Only supervisors can access the "secure" family
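
Family-level access control can be pictured with a small model like the one below. It is a sketch of the idea only, not HBase's API: on a real cluster the permissions are granted through HBase's own access control features rather than in application code. The role names and the sample row are assumptions chosen to mirror the slide.

```python
# Toy model: which roles may read which column family.
# Role names ("csr", "supervisor") are illustrative assumptions.
FAMILY_READERS = {
    "info": {"csr", "supervisor"},
    "secure": {"supervisor"},
}

# Sample row with one column family per sensitivity level.
ROW = {
    "info": {"name": "Joe", "email": "joe@gmail.com"},
    "secure": {"full_ssn": "123-45-6789"},
}

def read_row(row, role):
    """Return only the column families this role is allowed to read."""
    return {
        family: columns
        for family, columns in row.items()
        if role in FAMILY_READERS.get(family, set())
    }
```

Because the decision is made per family, every column in "secure" is either visible or hidden together, which is the limitation the next slide addresses.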
+
Need More Fine-Grained Access
• We would like to provide ‘cell level’ access controls
• Greater flexibility in application development
• More fine-grained access controls
• Meet Accumulo’s data model
+
Accumulo Data Model
Family: info
Column            Visibility token
name              Level 1
email             Level 1
Last 4 SSN        Level 1
SSN               Level 2 OR Top clearance
Gmail password    Top clearance
• Everything in the HBase data model
• Plus each cell has a ‘Visibility Token’
+
Users Are Assigned ‘Visibility Tokens’
User id           Visibility levels
User 1            Level 1
User 2            Level 1 + Level 2
Edward Snowden    Level 1 + Level 2 + Top Clearance
+
Accumulo only returns cells visible to user
Column            Value (person1)    Visibility token
name              Joe                Level 1
email             joe@gmail.com      Level 1
Last 4 SSN        6789               Level 1
Full SSN          123-45-6789        Level 2 OR Top clearance
Gmail password    JoeSuperMan!       Top clearance
+
What Users Can See…
User              Visibility privilege                 Visible cells
User 1            Level 1                              Name, Email, Last 4 SSN
User 2            Level 1 + Level 2                    Name, Email, Last 4 SSN, Full SSN
Edward Snowden    Level 1 + Level 2 + Top Clearance    Name, Email, Last 4 SSN, Full SSN, Gmail Password
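
The filtering shown in the table above can be reproduced with a small model. This is a sketch of the concept rather than Accumulo's client API: each visibility expression is modeled as a list of alternatives (OR), where each alternative is a set of labels the user must hold in full (AND). Accumulo's own expressions use `&` and `|` for the same purpose. The label names and user ids are assumptions based on the slides.

```python
# (column, value, visibility expression) for one person.
# Labels such as "level1" and "top" are illustrative names.
CELLS = [
    ("name", "Joe", [{"level1"}]),
    ("email", "joe@gmail.com", [{"level1"}]),
    ("last4_ssn", "6789", [{"level1"}]),
    ("full_ssn", "123-45-6789", [{"level2"}, {"top"}]),  # Level 2 OR Top
    ("gmail_password", "JoeSuperMan!", [{"top"}]),
]

# Authorizations assigned to each user.
USER_AUTHS = {
    "user1": {"level1"},
    "user2": {"level1", "level2"},
    "esnowden": {"level1", "level2", "top"},
}

def visible_columns(user):
    """Return the columns whose visibility expression the user satisfies."""
    auths = USER_AUTHS.get(user, set())
    return [
        column
        for column, _value, expression in CELLS
        if any(alternative <= auths for alternative in expression)
    ]
```

Calling `visible_columns("user2")` returns the first four columns, and a user with no authorizations gets an empty list: cells the user cannot see are simply never returned.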
+
Good News For HBase
• As of release 0.98, HBase also allows cell-based access controls
• Called ‘tags’
• Requires upgrading to the HFile V3 (version 3) format
+
Visibility / Access Controls
• Both HBase and Accumulo allow access control for the data
                         HBase                        Accumulo
Cell-level visibility    Yes (starting with v0.98)    Yes
+
5) Final Step: Encrypt Data At Rest
• Eventually data ends up on disk
• We need to protect the ‘raw data’ on disk
• To prevent:
  • Users going to the disk directly
  • Theft of hardware
+
Solution: Encrypt Data Transparently
• Encryption is done via keys
• Uses Java Cryptography Extension (JCE)
• Data is encrypted before writing to HDFS
• Does not rely on HDFS or Linux-level encryption
• Per-family encryption is supported
                      HBase    Accumulo
Encryption at rest    Yes      Yes
+
HBase & Accumulo: Transparent Encryption
+
Encryption: Key Management
• The keys have to be managed carefully…
  • Don’t lose them!
  • Don’t compromise them!!
• Possible storage mechanisms:
  • Database
  • Remote file server
  • Key management server
  • Local file system
+
Summary
                                          HBase                          Accumulo
Runs in a trusted environment             Yes (outside configuration)    Yes (outside configuration)
User authentication                       Kerberos                       Kerberos + built-in
Secure client communications (via SSL)    Yes                            Yes
Visibility at cell level                  Yes (starting from v0.98)      Yes
Encrypt data at rest                      Yes                            Yes
+
Useful Resources
• Accumulo
  • http://www.slideshare.net/DonaldMiner/accumulo-oct2013bofpresentation
• HBase
  • http://hbase.apache.org/book/hbase.encryption.server.html
+
DEMO
+
Demo Explained
Demonstrates the cell-level visibility feature of Accumulo. Here is what the data looks like:
                    Name         email            ssn            Gmail_password
Person1             Joe Smith    joe@gmail.com    123-45-6789    ‘JoeDaMan!’
Visibility level    Level 1      Level 1          Level 2        Top
+
Demo: Accumulo Users + Visibility
Accumulo user    Table1 access    Access level               Visible columns
root             yes              all                        all
user1            yes              Level 1                    Name, email
user2            yes              Level 1 + Level 2          Name, email + SSN
esnowden         yes              Level 1 + Level 2 + Top    Name, email + SSN + Gmail password :)
user3            no               N/A                        N/A
+
Thanks & Questions!
sujee@ElephantScale.com
http://ElephantScale.com
Expert consulting & training in Big Data
(Hadoop, NoSQL, Spark)
Free, online Hadoop book
‘Hadoop illuminated’
