HBase Security for the Enterprise
Andrew Purtell, Trend Micro
On behalf of the Trend Hadoop Group
apurtell@apache.org
Agenda

•
    Who we are

•
    Motivation

•
    Use Cases

•
    Implementation

•
    Experience

•
    Quickstart Tutorial
Introduction
Trend Micro




    Headquartered: Tokyo, Japan
    Founded: LA, 1988




•
    Technology innovator and top ranked security solutions
    provider
•
    4,000+ employees worldwide
Trend Micro Smart Protection Network

[Diagram. Labels: Threats; Threat Collection; Data Platform; Web Reputation; Email Reputation; File Reputation; Management; SaaS; Cloud; Endpoint; Off Network; Gateway; Messaging.]

Threat Collection: Customers; Partners; TrendLabs Research, Service & Support; Samples; Submissions; Honeypots; Web Crawling; Feedback Loops; Behavioral Analysis

Partners: ISPs, Routers, etc.

•
    Information integration is our advantage
Trend Hadoop Group

[Diagram. Projects shown: HDFS, HBase, MapReduce, ZooKeeper, Pig, Hive, Giraph, Oozie, Mahout, Cascading, Flume, Sqoop, Avro, Gora, Solr. Legend: Core; Supported; Not Supported, Monitoring.]


•
    We curate and support a complete internal distribution
•
    We act within the ASF community processes on behalf of
    internal stakeholders, and are ASF evangelists
Motivation
Our Challenges

•
    As we grow our business we see the network effects of
    our customers' interactions with the Internet and each
    other




•
    This is a volume, variety, and velocity problem
Why HBase?

•
    For our Hadoop-based applications, if we were forced to
    use MR for every operation, it would not be useful
•
    Fortunately, HBase provides low-latency random access
    to very large data tables and first-class Hadoop platform
    integration
But...
•
    Hadoop, for us, is the centerpiece of a data management
    consolidation strategy
•
    (Prior to release 0.92) HBase did not have intrinsic
    access control facilities
•
    Why do we care? Provenance, fault isolation, data
    sensitivity, auditable controls, ...
Our Solution

•
    Use HBase where appropriate
•
    Build in the basic access control features we need
    (added in 0.92, evolving in 0.94+)
•
    Do so with a community sanctioned approach
•
    As a byproduct of this work, we have Coprocessors,
    which are interesting in their own right
Use Cases
Meta
•
     Our meta use case: Data integration, storage and service
     consolidation




    Yesterday: Data islands
    Today: “Data neighborhood”
Application Fault Isolation

•
    Multitenant cluster, multiple application dev teams
•
    Need to strongly authenticate users to all system
    components: HDFS, HBase, ZooKeeper
•
    Rogue users cannot subvert authentication
•
    Allow and enforce restrictive permissions on internal
    application state: files, tables/CFs, znodes
Private Table (Default case)

•
    Strongly authenticate users to all system components
•
    Assign ownership when a table is created
•
    Allow only the owner full access to table resources
•
    Deny all others
•
    (Optional) Privacy on the wire with encrypted RPC




•
    Internal application state
•
    Applications under development, proofs of concept
Sensitive Column Families in Shared Tables

•
    Strongly authenticate users to all system components
•
    Grant read or read-write permissions to some CFs
•
    Restrict access to one or more other CFs only to owner
•
    Requires ACLs at per-CF granularity
•
    Default deny to help avoid policy mistakes




•
    Domain Reputation Repository (DRR)
•
    Tracking and logging system (TLS), like Google's Dapper
Read-only Access for Ad Hoc Query

•
    Strongly authenticate users to all system components
•
    Need to supply HBase delegation tokens to MR
•
    Grant write permissions to data ingress and analytic
    pipeline processes
•
    Grant read only permissions for ad hoc uses, such as Pig
    jobs
•
    Default deny to help avoid policy mistakes



•
    Knowledge discovery via ad hoc query (Pig)
Implementation
Goals and Non-Goals

Goals

•
    Satisfy use cases
•
    Use what Secure Hadoop Core provides as much as
    possible
•
    Minimally invasive to core code

Non-Goals

•
    Row-level or per-value (cell) access control
•
    Complex policy, full role based access control
•
    Push down of file ownership to HDFS
Coprocessors

•
    Inspired by Bigtable coprocessors, hinted at like the
    Higgs Boson in Jeff Dean's LADIS '09 keynote talk
•
    Dynamically installed code that runs at each region in the
    RegionServers, loaded on a per-table basis:
      Observers: Like database triggers, provide event-based hooks for
        interacting with normal operations
      Endpoints: Like stored procedures, custom RPC methods called
        explicitly with parameters
•
    A high-level call interface for clients: Calls addressed to
    rows or ranges of rows are mapped to data location and
    parallelized by client library
•
    Access checking is done by an Observer
•
    New security APIs implemented as Endpoints
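
The Observer and Endpoint roles described above can be sketched with a toy model. This is not the HBase Coprocessor API; `Region`, `deny_unless_owner`, and `row_count` are hypothetical names used only to show the two call styles: hooks that run around normal operations, and methods invoked explicitly by name.

```python
from typing import Callable, Dict, List


class Region:
    """Toy stand-in for a region: a dict of row -> value plus coprocessor hooks."""

    def __init__(self, rows: Dict[str, str]) -> None:
        self.rows = dict(rows)
        self.pre_get_hooks: List[Callable[[str, str], None]] = []
        self.endpoints: Dict[str, Callable[..., object]] = {}

    def get(self, user: str, row: str) -> str:
        # Observer style: every hook runs before the normal operation
        # and may veto it by raising.
        for hook in self.pre_get_hooks:
            hook(user, row)
        return self.rows[row]

    def call_endpoint(self, name: str, *args: object) -> object:
        # Endpoint style: invoked explicitly by name with parameters,
        # much like a stored procedure.
        return self.endpoints[name](self, *args)


def deny_unless_owner(owner: str) -> Callable[[str, str], None]:
    """Build a hypothetical access-checking observer hook."""
    def pre_get(user: str, row: str) -> None:
        if user != owner:
            raise PermissionError(f"{user} may not read {row}")
    return pre_get


def row_count(region: Region) -> int:
    """Hypothetical endpoint: count the rows held by this region."""
    return len(region.rows)


region = Region({"r1": "a", "r2": "b"})
region.pre_get_hooks.append(deny_unless_owner("alice"))
region.endpoints["row_count"] = row_count
```

Here `region.get("alice", "r1")` passes the hook, while the same call as another user raises `PermissionError` before the read happens.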
Authentication

•
    Built on Secure Hadoop
      Client authentication via Kerberos, a trusted third party
      Secure RPC based on SASL
•
    SASL can negotiate encryption and/or message integrity
    verification on a per connection basis
•
    Make RPC extensible and pluggable, add a
    SecureRpcEngine option
•
    Support DIGEST-MD5 authentication, allowing Hadoop
    delegation token use for MapReduce
      TokenProvider, a Coprocessor that provides and verifies HBase
        delegation tokens, and manages shared secrets on the cluster
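
A minimal sketch of how a shared-secret token scheme of this kind can work, assuming the usual design in which the token password is an HMAC of the token identifier under a rolling master key. `TokenIssuer` and its identifier format are hypothetical; this is not the TokenProvider implementation.

```python
import hashlib
import hmac
import os
from typing import Dict, Tuple


class TokenIssuer:
    """Hypothetical issuer: password = HMAC(master key, identifier)."""

    def __init__(self) -> None:
        self._keys: Dict[int, bytes] = {}
        self._current_id = 0
        self.roll_key()

    def roll_key(self) -> None:
        # Older keys are kept so tokens issued before the roll still verify.
        self._current_id += 1
        self._keys[self._current_id] = os.urandom(32)

    def issue(self, user: str) -> Tuple[bytes, bytes]:
        # Illustrative identifier format: "<user>:<key id>".
        identifier = f"{user}:{self._current_id}".encode("utf-8")
        return identifier, self._sign(self._current_id, identifier)

    def verify(self, identifier: bytes, password: bytes) -> bool:
        try:
            key_id = int(identifier.decode("utf-8").rsplit(":", 1)[1])
        except (IndexError, ValueError):
            return False
        if key_id not in self._keys:
            return False  # unknown key: fail closed
        expected = self._sign(key_id, identifier)
        return hmac.compare_digest(expected, password)

    def _sign(self, key_id: int, identifier: bytes) -> bytes:
        return hmac.new(self._keys[key_id], identifier, hashlib.sha1).digest()


issuer = TokenIssuer()
identifier, password = issuer.issue("alice")
```

The client presents the identifier and proves knowledge of the password during DIGEST-MD5 negotiation; the server recomputes the password from the identifier and the shared key rather than storing it per token.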
Authorization – AccessController

•
    AccessController: A Coprocessor that manages access
    control lists
•
    Simple and familiar permissions model: READ, WRITE,
    CREATE, ADMIN
•
    Permissions grantable at table, column family, and
    column qualifier granularity
•
    Supports user and group based assignment
•
    The Hadoop group mapping service can model
    application roles as groups
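
The permissions model above can be sketched as a small default-deny lookup. This is an illustration only, not the AccessController's code: `AclStore` is a hypothetical name, and the "@" prefix for group principals is a convention of this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, Optional, Set, Tuple

READ, WRITE, CREATE, ADMIN = "R", "W", "C", "A"

# A scope is (table, column family or None, column qualifier or None).
Scope = Tuple[str, Optional[str], Optional[str]]


@dataclass
class AclStore:
    grants: Dict[Tuple[str, Scope], Set[str]] = field(default_factory=dict)
    groups: Dict[str, FrozenSet[str]] = field(default_factory=dict)  # user -> groups

    def grant(self, principal: str, perms: Iterable[str], table: str,
              family: Optional[str] = None, qualifier: Optional[str] = None) -> None:
        key = (principal, (table, family, qualifier))
        self.grants.setdefault(key, set()).update(perms)

    def allowed(self, user: str, perm: str, table: str,
                family: Optional[str] = None, qualifier: Optional[str] = None) -> bool:
        # A user acts as themselves and as each group they belong to.
        principals = {user} | {"@" + g for g in self.groups.get(user, frozenset())}
        # A grant at a wider scope covers every narrower scope beneath it.
        scopes = [(table, None, None), (table, family, None), (table, family, qualifier)]
        for principal in principals:
            for scope in scopes:
                if perm in self.grants.get((principal, scope), set()):
                    return True
        return False  # default deny


acl = AclStore(groups={"carol": frozenset({"analysts"})})
acl.grant("alice", {READ, WRITE}, "drr")              # table-wide grant
acl.grant("@analysts", {READ}, "drr", "public")       # one column family, by group
```

With these grants, `carol` can read any qualifier in `drr:public` through her group but nothing in other column families, and users with no grant are denied everywhere.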
Authorization – Secure ZooKeeper

•
    ZooKeeper plays a critical role in HBase cluster
    operations and in the security implementation; needs
    strong security or it becomes a weak point
•
    Kerberos-based client authentication
•
    Znode ACLs enforce SASL authenticated access for
    sensitive data
Audit

•
    Simple audit log via Log4J
•
    Still need to work out a structured format for audit log
    messages
Two Implementation “Levels”

1. Secure RPC
•
    SecureRpcEngine for integration with Secure Hadoop,
    strong user authentication, message integrity, and
    encryption on the wire
•
    Implementation is solid


2. Coprocessor-based add-ons
•
    TokenProvider: Install only if running MR jobs with HBase
    RPC security enabled
•
    AccessController: Install on a per table basis, configure
    per CF policy, otherwise no overheads
•
    These implementations bring in new runtime dependencies
    on ZooKeeper and are still considered experimental
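
The two levels map naturally onto two groups of `hbase-site.xml` properties. The sketch below renders such a fragment; the property names and class names are the ones commonly documented for 0.92-era secure HBase and should be treated as assumptions to verify against the reference guide for your release. Note that it loads the coprocessors cluster-wide through configuration, which is a different deployment choice from the per-table installation mentioned above.

```python
import xml.etree.ElementTree as ET

# Level 1: secure RPC only (assumed property names, verify for your release).
SECURE_RPC = {
    "hbase.security.authentication": "kerberos",
    "hbase.rpc.engine": "org.apache.hadoop.hbase.ipc.SecureRpcEngine",
}

# Level 2: coprocessor-based add-ons, loaded cluster-wide in this sketch.
COPROCESSOR_ADDONS = {
    "hbase.security.authorization": "true",
    "hbase.coprocessor.master.classes":
        "org.apache.hadoop.hbase.security.access.AccessController",
    "hbase.coprocessor.region.classes":
        "org.apache.hadoop.hbase.security.token.TokenProvider,"
        "org.apache.hadoop.hbase.security.access.AccessController",
}


def render_site_xml(*levels: dict) -> str:
    """Render the given property groups as a <configuration> document."""
    root = ET.Element("configuration")
    for level in levels:
        for name, value in level.items():
            prop = ET.SubElement(root, "property")
            ET.SubElement(prop, "name").text = name
            ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


site_xml = render_site_xml(SECURE_RPC, COPROCESSOR_ADDONS)
```

Calling `render_site_xml(SECURE_RPC)` alone yields the level 1 configuration without either coprocessor.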
Layering
                                         Thrift client     REST client

       HBase MapReduce client           HBase Thrift      HBase REST

       TokenProvider                    HBase Java client

       HBase Secure RPC

       AccessController (optional on a per-table basis)

       HBase

       Hadoop Secure RPC

       MapReduce                        HDFS

       Authentication infrastructure: Kerberos + LDAP

       OS
Experience
Secure RPC Engine
•
    Authentication adds latency at connection setup: Extra
    round trips for SASL negotiation
•
    Recommendation: Increase RPC idle time for better
    connection reuse

•
    Negotiating message integrity (“auth-int”) takes ~5% off
    of max throughput
•
    Negotiating SASL encryption (“auth-conf”) takes ~10%
    off of max throughput
•
    Recommendation: Consider your need for such options
    carefully
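
As a rough worked example of the figures above, the following applies the approximate 5% and 10% reductions to a baseline. The baseline number is hypothetical, and the percentages are the approximate values quoted here, not measurements of any particular cluster.

```python
# Approximate throughput cost, in percent of maximum, per SASL quality of protection.
QOP_OVERHEAD_PERCENT = {
    "auth": 0,        # authentication only
    "auth-int": 5,    # plus message integrity
    "auth-conf": 10,  # plus encryption
}


def estimated_throughput(baseline_ops_per_sec: int, qop: str) -> float:
    """Scale a baseline throughput by the approximate overhead for a QoP."""
    return baseline_ops_per_sec * (100 - QOP_OVERHEAD_PERCENT[qop]) / 100


# Hypothetical baseline of 100,000 operations per second.
with_integrity = estimated_throughput(100_000, "auth-int")
with_encryption = estimated_throughput(100_000, "auth-conf")
```

On that hypothetical baseline, integrity checking would cost roughly 5,000 operations per second and encryption roughly 10,000.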
Secure RPC Engine

•
    A Hadoop system including HBase will initiate RPC far
    more frequently than without (file reads, compactions,
    client API access, …)
•
    If the KDC is overloaded then not only client operations
    but also things like region post deployment tasks may fail,
    increasing region transition time
•
    Recommendation: HA KDC deployment, KDC capacity
    planning, trust federation over multiple KDC HA-pairs
Secure RPC Engine

•
    Activity swarms may be seen by a KDC as replay attacks
    (“Request is a replay (34)”)
•
    Recommendation: Ensure unique keys for each service
    instance, e.g. hbase/host@realm where host is the FQDN
•
    Recommendation: Check for clock skew over cluster hosts
•
    Recommendation: Use MIT Kerberos 1.8
•
    Recommendation: Increase RPC idle time for better
    connection reuse
•
    Recommendation: Avoid too frequent HBCK validation of
    cluster health
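
Two of the recommendations above lend themselves to a quick check script. This sketch builds per-host service principal names and flags hosts whose clocks differ from a reference by more than 300 seconds, the default skew tolerance in MIT Kerberos. The function names and the dot-based FQDN check are assumptions of this example.

```python
from typing import Dict, List

MAX_SKEW_SECONDS = 300  # MIT Kerberos default clock skew tolerance


def service_principal(service: str, fqdn: str, realm: str) -> str:
    """Build a per-host principal such as hbase/host.example.com@REALM."""
    if "." not in fqdn:
        # Crude heuristic: a short hostname is probably not fully qualified.
        raise ValueError(f"{fqdn!r} does not look like a fully qualified name")
    return f"{service}/{fqdn}@{realm}"


def hosts_out_of_skew(reference_epoch: float, host_epochs: Dict[str, float],
                      max_skew: float = MAX_SKEW_SECONDS) -> List[str]:
    """Return hosts whose reported time is too far from the reference."""
    return sorted(
        host for host, epoch in host_epochs.items()
        if abs(epoch - reference_epoch) > max_skew
    )


principal = service_principal("hbase", "rs1.dc.example.com", "EXAMPLE.COM")
skewed = hosts_out_of_skew(1000, {"a": 1000, "b": 1400, "c": 690})
```

How the per-host timestamps are collected (for example over SSH or from monitoring) is left out; the check itself only compares numbers.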
Hadoop Security Issues (?)
•
    Open issue: Occasional swarms of 5-10 seconds at
    intervals of about TGT lifetime of:
     date time host.dc ERROR [PostOpenDeployTasks:
        a74847b544ba37001f56a9d716385253]
        (org.apache.hadoop.security.UserGroupInformation) -
        PriviledgedActionException as:hbase/host.dc@realm (auth:KERBEROS)
        cause:javax.security.sasl.SaslException: GSS initiate failed
        [Caused by GSSException: No valid credentials provided (Mechanism
        level: Failed to find any Kerberos tgt)]

    Some Hadoop RPC improvements not yet ported
•
    Speaking of swarms, at or about delegation token
    expiration interval you may see runs of:
     date time host.dc ERROR [DataStreamer for file file block blockId]
        (org.apache.hadoop.security.UserGroupInformation) -
        PriviledgedActionException as:blockId (auth:SIMPLE)
        cause:org.apache.hadoop.ipc.RemoteException: Block token with
        block_token_identifier (expiryDate=timestamp, keyId=keyId,
        userId=hbase, blockIds=blockId, access modes=[READ|WRITE]) is
        expired.

These should probably not be logged at ERROR level
TokenProvider

•
    Increases exposure to ZooKeeper related RegionServer
    aborts: If keys cannot be rolled or accessed due to a ZK
    error, we must fail closed
•
    Recommendation: Provision sufficient ZK quorum peers
    and deploy them in separate failure domains (one at
    each top of rack, or similar)
•
    Recommendation: Redundant L2 or L2+L3 networking; you
    probably have it already

•
    Recent versions of ZooKeeper have important bug fixes
•
    Recommendation: Use ZooKeeper 3.4.4 (when released)
    or higher
    For more detail on HBase token authentication:
    http://wiki.apache.org/hadoop/Hbase/HBaseTokenAuthentication
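
The quorum sizing recommendation can be checked with simple arithmetic: a ZooKeeper ensemble stays available only while a strict majority of its peers can talk to each other. The sketch below, with hypothetical function names and rack layouts, computes how many peer failures an ensemble tolerates and whether it survives the loss of one failure domain.

```python
from typing import Dict


def tolerated_failures(peers: int) -> int:
    """Number of peers that can fail while a strict majority remains."""
    if peers < 1:
        raise ValueError("an ensemble needs at least one peer")
    return (peers - 1) // 2


def quorum_survives(peers_per_domain: Dict[str, int], failed_domain: str) -> bool:
    """True if a majority of peers remains after one failure domain is lost."""
    total = sum(peers_per_domain.values())
    remaining = total - peers_per_domain[failed_domain]
    return remaining > total // 2


# Hypothetical layouts: one peer per rack versus peers packed into two racks.
spread = {"rack1": 1, "rack2": 1, "rack3": 1}
packed = {"rack1": 3, "rack2": 2}
```

Three peers spread over three racks survive the loss of any one rack, while five peers packed as 3+2 do not survive the loss of the larger rack, even though five peers tolerate two individual failures.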
AccessController
•
    Use 0.92.1 or above for a bug fix with Get protection

•
    The AccessController will create a small new “system”
    table named _acl_; the data in this table is almost as
    important as that in .META.
•
    Recommendation: Use the shell to manually flush the
    ACL table after permissions changes to ensure changes
    are persisted

•
    Recommendation: The recommendations related to
    ZooKeeper for TokenProvider apply equally here
Shell Support

•
    Shell support is rudimentary but covers the basic use
    cases
•
    Note: You must supply exactly the same permission
    specification to revoke as you did to grant; there is no
    wildcarding and nothing like revoke all
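
The exact-match revoke behavior noted above can be modeled as set membership on the full grant specification. `GrantTable` is a hypothetical name, and treating the permission letters as an unordered set is an assumption of this sketch, not a statement about how the shell parses them.

```python
from typing import Optional, Set, Tuple

Spec = Tuple[str, frozenset, str, Optional[str], Optional[str]]


class GrantTable:
    """Toy model: a revoke succeeds only for a spec identical to a prior grant."""

    def __init__(self) -> None:
        self._grants: Set[Spec] = set()

    def grant(self, user: str, perms: str, table: str,
              family: Optional[str] = None, qualifier: Optional[str] = None) -> None:
        self._grants.add((user, frozenset(perms), table, family, qualifier))

    def revoke(self, user: str, perms: str, table: str,
               family: Optional[str] = None, qualifier: Optional[str] = None) -> bool:
        spec = (user, frozenset(perms), table, family, qualifier)
        if spec not in self._grants:
            return False  # no wildcarding, no partial match, no "revoke all"
        self._grants.remove(spec)
        return True


grants = GrantTable()
grants.grant("bob", "RW", "t1", "cf1")
```

Revoking only `R`, or revoking `RW` at table scope without the column family, leaves the original grant in place; only the identical specification removes it.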
Demonstration Video
Thank You!

HBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend Micro

  • 1. HBase Security for the Enterprise Andrew Purtell, Trend Micro On behalf of the Trend Hadoop Group apurtell@apache.org
  • 2. Agenda • Who we are • Motivation • Use Cases • Implementation • Experience • Quickstart Tutorial
  • 4. Trend Micro Headquartered: Tokyo, Japan Founded: LA 1988 • Technology innovator and top ranked security solutions provider • 4,000+ employees worldwide
  • 5. Trend Micro Smart Protection Network [Diagram. Labels: Threats; Threat Collection (Customers, Partners, TrendLabs Research, Service & Support, Samples, Submissions, Honeypots, Web Crawling, Feedback Loops, Behavioral Analysis); Data Platform; Web Reputation, Email Reputation, File Reputation; Management; SaaS; Cloud; Endpoint; Off Network; Gateway; Messaging; Partners (ISPs, Routers, etc.)] • Information integration is our advantage
  • 6. Trend Hadoop Group [Diagram. Components: Cascading, Pig, Giraph, Flume, HDFS, Oozie, HBase, Sqoop, MapReduce, ZooKeeper, Mahout, Avro, Core, Hive, Gora, Solr. Legend: Supported; Not Supported, Monitoring] • We curate and support a complete internal distribution • We act within the ASF community processes on behalf of internal stakeholders, and are ASF evangelists
  • 8. Our Challenges • As we grow our business we see the network effects of our customers' interactions with the Internet and each other • This is a volume, variety, and velocity problem
  • 9. Why HBase? • For our Hadoop based applications, if we were forced to use MR for every operation, it would not be useful • Fortunately, HBase provides low latency random access to very large data tables and first class Hadoop platform integration
  • 10. But... • Hadoop, for us, is the centerpiece of a data management consolidation strategy • (Prior to release 0.92) HBase did not have intrinsic access control facilities • Why do we care? Provenance, fault isolation, data sensitivity, auditable controls, ...
  • 11. Our Solution • Use HBase where appropriate • Build in the basic access control features we need (added in 0.92, evolving in 0.94+) • Do so with a community sanctioned approach • As a byproduct of this work, we have Coprocessors, separately interesting
  • 13. Meta • Our meta use case: data integration, storage and service consolidation • Yesterday: data islands • Today: “data neighborhood”
  • 14. Application Fault Isolation • Multitenant cluster, multiple application dev teams • Need to strongly authenticate users to all system components: HDFS, HBase, ZooKeeper • Rogue users cannot subvert authentication • Allow and enforce restrictive permissions on internal application state: files, tables/CFs, znodes
  • 15. Private Table (Default case) • Strongly authenticate users to all system components • Assign ownership when a table is created • Allow only the owner full access to table resources • Deny all others • (Optional) Privacy on the wire with encrypted RPC • Internal application state • Applications under development, proofs of concept
  • 16. Sensitive Column Families in Shared Tables • Strongly authenticate users to all system components • Grant read or read-write permissions to some CFs • Restrict access to one or more other CFs only to owner • Requires ACLs at per-CF granularity • Default deny to help avoid policy mistakes • Domain Reputation Repository (DRR) • Tracking and logging system (TLS), like Google's Dapper
  • 17. Read-only Access for Ad Hoc Query • Strongly authenticate users to all system components • Need to supply HBase delegation tokens to MR • Grant write permissions to data ingress and analytic pipeline processes • Grant read only permissions for ad hoc uses, such as Pig jobs • Default deny to help avoid policy mistakes • Knowledge discovery via ad hoc query (Pig)
  • 19. Goals and Non-Goals • Goals: satisfy use cases; use what Secure Hadoop Core provides as much as possible; minimally invasive to core code • Non-Goals: row-level or per-value (cell); complex policy, full role-based access control; push-down of file ownership to HDFS
  • 20. Coprocessors • Inspired by Bigtable coprocessors, hinted at like the Higgs Boson in Jeff Dean's LADIS '09 keynote talk • Dynamically installed code that runs at each region in the RegionServers, loaded on a per-table basis • Observers: like database triggers, provide event-based hooks for interacting with normal operations • Endpoints: like stored procedures, custom RPC methods called explicitly with parameters • A high-level call interface for clients: calls addressed to rows or ranges of rows are mapped to data location and parallelized by the client library • Access checking is done by an Observer • New security APIs are implemented as Endpoints
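The Observer/Endpoint split described on this slide can be illustrated with a small toy model. This is only a sketch in plain Python, not the HBase coprocessor API: the `Region` class, the `pre_get_observers` hook list, and the `row_count` endpoint are all hypothetical names invented for illustration.

```python
from typing import Callable, Dict, List


class Region:
    """Toy stand-in for one region holding a slice of a table's rows."""

    def __init__(self, rows: Dict[str, str]):
        self.rows = dict(rows)
        # Hypothetical hook list: callables run before every get().
        self.pre_get_observers: List[Callable[[str, str], None]] = []
        # Hypothetical endpoint registry: name -> callable run on request.
        self.endpoints: Dict[str, Callable[["Region"], int]] = {}

    def get(self, user: str, row: str) -> str:
        # Observers fire implicitly around a normal operation, like a trigger.
        for observer in self.pre_get_observers:
            observer(user, row)
        return self.rows[row]

    def call_endpoint(self, name: str) -> int:
        # Endpoints are invoked explicitly, like a stored procedure.
        return self.endpoints[name](self)


def deny_unless_alice(user: str, row: str) -> None:
    """Observer that performs an access check before the read proceeds."""
    if user != "alice":
        raise PermissionError(f"{user} may not read {row}")


def row_count(region: Region) -> int:
    """Endpoint that computes a partial result local to one region."""
    return len(region.rows)


def make_regions() -> List[Region]:
    regions = [Region({"a": "1", "b": "2"}), Region({"m": "3"})]
    for region in regions:
        region.pre_get_observers.append(deny_unless_alice)
        region.endpoints["row_count"] = row_count
    return regions


def total_rows(regions: List[Region]) -> int:
    # The client side fans the call out to every region and merges results.
    return sum(region.call_endpoint("row_count") for region in regions)
```

In this model the access check never appears in the read path itself; it is attached as a hook, which is the property that lets access control be installed per table.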
  • 21. Authentication • Built on Secure Hadoop: client authentication via Kerberos, a trusted third party; secure RPC based on SASL • SASL can negotiate encryption and/or message integrity verification on a per-connection basis • Make RPC extensible and pluggable, add a SecureRpcEngine option • Support DIGEST-MD5 authentication, allowing Hadoop delegation token use for MapReduce • TokenProvider: a Coprocessor that provides and verifies HBase delegation tokens, and manages shared secrets on the cluster
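A rough sketch of how a delegation token can be issued and verified from a shared secret follows. The slide does not spell out the construction, so this assumes a simple HMAC over a token identifier; the function names, the identifier format, and the choice of SHA-1 are illustrative assumptions rather than the TokenProvider implementation.

```python
import hashlib
import hmac


def issue_token(master_key: bytes, identifier: bytes) -> bytes:
    """Derive a token password as an HMAC of the identifier (assumed scheme)."""
    return hmac.new(master_key, identifier, hashlib.sha1).digest()


def verify_token(master_key: bytes, identifier: bytes, password: bytes) -> bool:
    """Recompute the password from the shared secret and compare in constant time."""
    expected = issue_token(master_key, identifier)
    return hmac.compare_digest(expected, password)
```

Under this assumption a MapReduce task only needs the identifier and password, never the Kerberos credentials, and any server holding the shared secret can verify the pair without contacting the KDC.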
  • 22. Authorization – AccessController • AccessController: A Coprocessor that manages access control lists • Simple and familiar permissions model: READ, WRITE, CREATE, ADMIN • Permissions grantable at table, column family, and column qualifier granularity • Supports user and group based assignment • The Hadoop group mapping service can model application roles as groups
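The permission model on this slide, combined with the default-deny policy from the use cases, can be sketched as a small lookup structure. This is a toy model and not the AccessController code: the class name, the `@` prefix used to mark group principals, and the rule that a broader grant covers narrower scopes are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set, Tuple

ACTIONS = frozenset({"READ", "WRITE", "CREATE", "ADMIN"})

# A scope is (table, column family or None, column qualifier or None).
Scope = Tuple[str, Optional[str], Optional[str]]


@dataclass
class AccessControlList:
    # (principal, scope) -> granted actions
    grants: Dict[Tuple[str, Scope], Set[str]] = field(default_factory=dict)
    # user -> group names, standing in for a group mapping service
    groups: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, principal: str, actions: Iterable[str], table: str,
              family: Optional[str] = None,
              qualifier: Optional[str] = None) -> None:
        actions = set(actions)
        unknown = actions - ACTIONS
        if unknown:
            raise ValueError(f"unknown actions: {sorted(unknown)}")
        key = (principal, (table, family, qualifier))
        self.grants.setdefault(key, set()).update(actions)

    def is_allowed(self, user: str, action: str, table: str,
                   family: Optional[str] = None,
                   qualifier: Optional[str] = None) -> bool:
        # The user's own grants plus those of each group (marked with "@").
        principals = {user} | {"@" + g for g in self.groups.get(user, set())}
        # Assumption: a grant at a broader scope covers narrower ones.
        scopes = [(table, None, None)]
        if family is not None:
            scopes.append((table, family, None))
            if qualifier is not None:
                scopes.append((table, family, qualifier))
        for principal in principals:
            for scope in scopes:
                if action in self.grants.get((principal, scope), set()):
                    return True
        return False  # default deny: no matching grant means no access
```

For example, granting `READ` on one column family to a hypothetical `@analysts` group lets its members read that family while every other family in the same table stays closed to them.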
  • 24. Authorization – Secure ZooKeeper • ZooKeeper plays a critical role in HBase cluster operations and in the security implementation; needs strong security or it becomes a weak point • Kerberos-based client authentication • Znode ACLs enforce SASL authenticated access for sensitive data
  • 25. Audit • Simple audit log via Log4J • Still need to work out a structured format for audit log messages
  • 26. Two Implementation “Levels” • 1. Secure RPC: SecureRpcEngine for integration with Secure Hadoop, strong user authentication, message integrity, and encryption on the wire; implementation is solid • 2. Coprocessor-based add-ons: TokenProvider (install only if running MR jobs with HBase RPC security enabled); AccessController (install on a per-table basis, configure per-CF policy, otherwise no overheads); these implementations bring in new runtime dependencies on ZooKeeper and are still considered experimental
  • 28. Layering [Diagram: layered stack; labels in the order extracted] • Thrift client, REST client, HBase MapReduce client • HBase Thrift, HBase REST, TokenProvider • HBase Java client • HBase Secure RPC • AccessController (optional on a per-table basis) • HBase • Hadoop Secure RPC • MapReduce, HDFS • Authentication infrastructure: Kerberos + LDAP • OS
  • 30. Secure RPC Engine • Authentication adds latency at connection setup: Extra round trips for SASL negotiation • Recommendation: Increase RPC idle time for better connection reuse • Negotiating message integrity (“auth-int”) takes ~5% off of max throughput • Negotiating SASL encryption (“auth-conf”) takes ~10% off of max throughput • Recommendation: Consider your need for such options carefully
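As a rough back-of-the-envelope aid for the figures on this slide, the sketch below applies the quoted ~5% and ~10% reductions to a baseline. The baseline number and the function name are hypothetical, and treating plain authentication ("auth") as having no steady-state throughput cost is an assumption; the slide only says it adds latency at connection setup.

```python
# Approximate fractions of max throughput lost per SASL quality of protection,
# taken from the slide. The 0.0 entry for "auth" is an assumption.
QOP_OVERHEAD = {
    "auth": 0.00,       # authentication only (assumed no steady-state cost)
    "auth-int": 0.05,   # message integrity, ~5% per the slide
    "auth-conf": 0.10,  # encryption, ~10% per the slide
}


def estimated_max_throughput(baseline_ops_per_sec: float, qop: str) -> float:
    """Scale a measured insecure baseline by the approximate overhead for a QOP."""
    return baseline_ops_per_sec * (1.0 - QOP_OVERHEAD[qop])


if __name__ == "__main__":
    baseline = 100_000.0  # hypothetical measured baseline, ops/sec
    for qop in ("auth", "auth-int", "auth-conf"):
        print(f"{qop}: ~{estimated_max_throughput(baseline, qop):.0f} ops/sec")
```

These are planning estimates only; actual overhead depends on workload, message sizes, and hardware, so measuring on a test cluster is the safer guide.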
  • 31. Secure RPC Engine • A Hadoop system that includes HBase will initiate RPC far more frequently than one without it (file reads, compactions, client API access, …) • If the KDC is overloaded then not only client operations but also things like region post-deployment tasks may fail, increasing region transition time • Recommendation: HA KDC deployment, KDC capacity planning, trust federation over multiple KDC HA-pairs
  • 32. Secure RPC Engine • Activity swarms may be seen by a KDC as replay attacks (“Request is a replay (34)”) • Recommendation: Ensure unique keys for each service instance, e.g. hbase/host@realm where host is the FQDN • Recommendation: Check for clock skew across cluster hosts • Recommendation: Use MIT Kerberos 1.8 • Recommendation: Increase RPC idle time for better connection reuse • Recommendation: Avoid overly frequent HBCK validation of cluster health
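The clock skew recommendation could be checked with something like the sketch below. It assumes you have already collected each host's clock offset in seconds relative to a reference (for example from your time synchronization tooling); the host names and the helper function are hypothetical. The 300 second limit is the MIT Kerberos default clock skew tolerance, which your realm may override.

```python
from typing import Dict, List

# MIT Kerberos default tolerance (the "clockskew" setting); verify against your realm.
MAX_SKEW_SECONDS = 300.0


def hosts_with_excess_skew(offsets: Dict[str, float],
                           limit: float = MAX_SKEW_SECONDS) -> List[str]:
    """Return hosts whose absolute clock offset exceeds the limit, sorted by name."""
    return sorted(host for host, offset in offsets.items() if abs(offset) > limit)


if __name__ == "__main__":
    # Hypothetical offsets in seconds relative to the reference clock.
    measured = {"rs1.dc": 0.4, "rs2.dc": -12.0, "rs3.dc": 412.5}
    print(hosts_with_excess_skew(measured))
```

In practice a much tighter alerting limit than the Kerberos tolerance is probably sensible, so that drift is caught well before authentication starts to fail.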
  • 33. Hadoop Security Issues (?) • Open issue: Occasional swarms, lasting 5-10 seconds and recurring at intervals of about the TGT lifetime, of: date time host.dc ERROR [PostOpenDeployTasks: a74847b544ba37001f56a9d716385253] (org.apache.hadoop.security.UserGroupInformation) - PriviledgedActionException as:hbase/host.dc@realm (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] • Some Hadoop RPC improvements not yet ported • Speaking of swarms, at or about the delegation token expiration interval you may see runs of: date time host.dc ERROR [DataStreamer for file file block blockId] (org.apache.hadoop.security.UserGroupInformation) - PriviledgedActionException as:blockId (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException: Block token with block_token_identifier (expiryDate=timestamp, keyId=keyId, userId=hbase, blockIds=blockId, access modes=[READ|WRITE]) is expired. • These should probably not be logged at ERROR level
  • 34. TokenProvider • Increases exposure to ZooKeeper-related RegionServer aborts: If keys cannot be rolled or accessed due to a ZK error, we must fail closed • Recommendation: Provision sufficient ZK quorum peers and deploy them in separate failure domains (one at each top of rack, or similar) • Recommendation: Redundant L2 / L2+L3 networking; you probably have it already • Recent versions of ZooKeeper have important bug fixes • Recommendation: Use ZooKeeper 3.4.4 (when released) or higher • For more detail on HBase token authentication: http://wiki.apache.org/hadoop/Hbase/HBaseTokenAuthentication
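To make "sufficient quorum peers in separate failure domains" concrete, the sketch below works through the majority arithmetic that ZooKeeper relies on: an ensemble stays available only while more than half of its peers are up. The function names and the rack layout are hypothetical.

```python
from typing import Dict


def majority(ensemble_size: int) -> int:
    """Smallest number of peers that forms a majority quorum."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many peers can fail while a majority remains."""
    return (ensemble_size - 1) // 2


def survives_any_single_domain_loss(peers_per_domain: Dict[str, int]) -> bool:
    """True if losing any one failure domain still leaves a majority of peers."""
    total = sum(peers_per_domain.values())
    needed = majority(total)
    return all(total - count >= needed for count in peers_per_domain.values())


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, "peers tolerate", tolerated_failures(n), "failure(s)")
    # Hypothetical layouts: one peer per rack versus two peers sharing a rack.
    print(survives_any_single_domain_loss({"rack1": 1, "rack2": 1, "rack3": 1}))
    print(survives_any_single_domain_loss({"rack1": 2, "rack2": 1}))
```

Note that an even-sized ensemble tolerates no more failures than the next smaller odd size, which is one reason odd ensemble sizes are the usual choice.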
  • 35. AccessController • Use 0.92.1 or above for a bug fix with Get protection • The AccessController will create a small new “system” table named _acl_; the data in this table is almost as important as that in .META. • Recommendation: Use the shell to manually flush the ACL table after permissions changes to ensure changes are persisted • Recommendation: The recommendations related to ZooKeeper for TokenProvider apply equally here
  • 36. Shell Support • Shell support is rudimentary but covers the basic use cases • Note: You must supply exactly the same permission specification to revoke as you did to grant; there is no wildcarding and nothing like revoke all
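The exact-match caveat above can be pictured with a toy model. This is not the HBase shell or its implementation; the `grant` and `revoke` functions, the set-based store, and the single-letter permission strings are assumptions used only to show why a revoke with a different specification leaves the original grant in place.

```python
from typing import Optional, Set, Tuple

# One ACL entry: (user, permissions, table, family, qualifier). Hypothetical layout.
Entry = Tuple[str, frozenset, str, Optional[str], Optional[str]]


def grant(acl: Set[Entry], user: str, perms: str, table: str,
          family: Optional[str] = None, qualifier: Optional[str] = None) -> None:
    """Record a grant exactly as specified."""
    acl.add((user, frozenset(perms), table, family, qualifier))


def revoke(acl: Set[Entry], user: str, perms: str, table: str,
           family: Optional[str] = None, qualifier: Optional[str] = None) -> bool:
    """Remove a grant only if every field matches; no wildcards, no 'revoke all'."""
    entry = (user, frozenset(perms), table, family, qualifier)
    if entry in acl:
        acl.remove(entry)
        return True
    return False


if __name__ == "__main__":
    acl: Set[Entry] = set()
    grant(acl, "alice", "RW", "t1", "cf1")
    print(revoke(acl, "alice", "R", "t1", "cf1"))   # different permission set: no effect
    print(revoke(acl, "alice", "RW", "t1"))         # different scope: no effect
    print(revoke(acl, "alice", "RW", "t1", "cf1"))  # exact match: removed
```

Under this model it helps to keep a record of the exact specification used for each grant, since that is what a later revoke has to reproduce.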