Trend Micro developed the new security features in HBase 0.92 and has the first known deployment of secure HBase in production. We will share our motivations, use cases, and experiences, and provide a 10-minute tutorial on how to set up a test secure HBase cluster, along with a walkthrough of a simple usage example. The tutorial will be carried out live on an on-demand EC2 cluster, with a video backup in case of network or EC2 unavailability.
4. Trend Micro
• Founded: Los Angeles, 1988
• Headquartered: Tokyo, Japan
• Technology innovator and top-ranked security solutions provider
• 4,000+ employees worldwide
5. Trend Micro Smart Protection Network
[Diagram: threat collection (samples, submissions, honeypots, web crawling data, feedback loops) feeds web, email, and file reputation services, backed by TrendLabs research, service & support, management, and behavioral analysis; delivered to customers and partners (ISPs, routers, etc.) across SaaS, cloud, endpoint, off-network, gateway, and messaging channels]
• Information integration is our advantage
6. Trend Hadoop Group
[Diagram: internal distribution stack — Core, HDFS, MapReduce, HBase, ZooKeeper, Pig, Hive, Oozie, Sqoop, Flume, Avro, Cascading, Giraph, Mahout, Gora, Solr — each marked either supported or not supported (monitoring only)]
• We curate and support a complete internal distribution
• We act within the ASF community processes on behalf of internal stakeholders, and are ASF evangelists
8. Our Challenges
• As we grow our business we see the network effects of our customers' interactions with the Internet and each other
• This is a volume, variety, and velocity problem
9. Why HBase?
• For our Hadoop-based applications, if we were forced to use MR for every operation, the platform would not be useful
• Fortunately, HBase provides low-latency random access to very large data tables and first-class Hadoop platform integration
10. But...
• Hadoop, for us, is the centerpiece of a data management consolidation strategy
• (Prior to release 0.92) HBase did not have intrinsic access control facilities
• Why do we care? Provenance, fault isolation, data sensitivity, auditable controls, ...
11. Our Solution
• Use HBase where appropriate
• Build in the basic access control features we need (added in 0.92, evolving in 0.94+)
• Do so with a community-sanctioned approach
• As a byproduct of this work, we have Coprocessors, which are separately interesting
13. Meta
• Our meta use case: data integration, storage, and service consolidation
• Yesterday: data islands. Today: a “data neighborhood”
14. Application Fault Isolation
• Multitenant cluster, multiple application dev teams
• Need to strongly authenticate users to all system components: HDFS, HBase, ZooKeeper
• Rogue users cannot subvert authentication
• Allow and enforce restrictive permissions on internal application state: files, tables/CFs, znodes
15. Private Table (Default Case)
• Strongly authenticate users to all system components
• Assign ownership when a table is created
• Allow only the owner full access to table resources
• Deny all others
• (Optional) Privacy on the wire with encrypted RPC
• Use cases: internal application state; applications under development; proofs of concept
16. Sensitive Column Families in Shared Tables
• Strongly authenticate users to all system components
• Grant read or read-write permissions to some CFs
• Restrict access to one or more other CFs to the owner only
• Requires ACLs at per-CF granularity
• Default deny to help avoid policy mistakes
• Use cases: Domain Reputation Repository (DRR); tracking and logging system (TLS), like Google's Dapper
17. Read-only Access for Ad Hoc Query
• Strongly authenticate users to all system components
• Need to supply HBase delegation tokens to MR jobs
• Grant write permissions to data ingress and analytic pipeline processes
• Grant read-only permissions for ad hoc uses, such as Pig jobs
• Default deny to help avoid policy mistakes
• Use case: knowledge discovery via ad hoc query (Pig)
19. Goals and Non-Goals
Goals:
• Satisfy our use cases
• Use what Secure Hadoop Core provides as much as possible
• Be minimally invasive to core code
Non-Goals:
• Row-level or per-value (cell) access control
• Complex policy, full role-based access control
• Push-down of file ownership to HDFS
20. Coprocessors
• Inspired by Bigtable coprocessors, hinted at (like the Higgs boson) in Jeff Dean's LADIS '09 keynote talk
• Dynamically installed code that runs at each region in the RegionServers, loaded on a per-table basis:
   Observers: like database triggers, provide event-based hooks for interacting with normal operations
   Endpoints: like stored procedures, custom RPC methods called explicitly with parameters
• A high-level call interface for clients: calls addressed to rows or ranges of rows are mapped to data locations and parallelized by the client library
• Access checking is done by an Observer
• New security APIs are implemented as Endpoints
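Per-table loading can be sketched from the shell as a table attribute, using the pipe-delimited jar-path|class|priority|arguments form; the table name, jar path, observer class, and priority below are hypothetical:

```
hbase> disable 'mytable'
hbase> alter 'mytable', METHOD => 'table_att',
         'coprocessor' => 'hdfs:///cp/observers.jar|com.example.MyObserver|1001|'
hbase> enable 'mytable'
```

The region server loads the named class from the jar when regions of the table open, so the observer's hooks fire only for that table.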
21. Authentication
• Built on Secure Hadoop:
   Client authentication via Kerberos, a trusted third party
   Secure RPC based on SASL
• SASL can negotiate encryption and/or message integrity verification on a per-connection basis
• Make RPC extensible and pluggable; add a SecureRpcEngine option
• Support DIGEST-MD5 authentication, allowing Hadoop delegation token use for MapReduce
• TokenProvider: a Coprocessor that provides and verifies HBase delegation tokens, and manages shared secrets on the cluster
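A minimal sketch of enabling the secure RPC engine in hbase-site.xml, assuming the 0.92-era property names; the keytab path and principal are placeholders for your environment:

```xml
<!-- hbase-site.xml: select the secure RPC engine and Kerberos authentication -->
<property>
  <name>hbase.rpc.engine</name>
  <value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
</property>
<property>
  <name>hbase.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hbase.regionserver.kerberos.principal</name>
  <value>hbase/_HOST@EXAMPLE.COM</value>
</property>
<property>
  <name>hbase.regionserver.keytab.file</name>
  <value>/etc/hbase/conf/hbase.keytab</value>
</property>
```

The `_HOST` placeholder is substituted with each server's FQDN, which also matters for the per-instance key recommendation later in this deck.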
22. Authorization – AccessController
• AccessController: a Coprocessor that manages access control lists
• Simple and familiar permissions model: READ, WRITE, CREATE, ADMIN
• Permissions grantable at table, column family, and column qualifier granularity
• Supports user- and group-based assignment
• The Hadoop group mapping service can model application roles as groups
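A short shell session sketching these grants; the user bob, group pipeline, table analytics, and column family results are all hypothetical:

```
hbase> grant 'bob', 'RW', 'analytics', 'results'   # user: read-write on one CF
hbase> grant '@pipeline', 'W', 'analytics'         # group: table-wide write
hbase> user_permission 'analytics'                 # list current grants
```

Group names are distinguished by the @ prefix and resolve through the Hadoop group mapping service mentioned above.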
24. Authorization – Secure ZooKeeper
• ZooKeeper plays a critical role in HBase cluster operations and in the security implementation; it needs strong security or it becomes a weak point
• Kerberos-based client authentication
• Znode ACLs enforce SASL-authenticated access for sensitive data
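The ZooKeeper server side is typically pointed at a JAAS login configuration (via -Djava.security.auth.login.config); the keytab path, hostname, and realm below are placeholders:

```
Server {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  storeKey=true
  keyTab="/etc/zookeeper/conf/zk.keytab"
  principal="zookeeper/zk1.example.com@EXAMPLE.COM";
};
```

HBase daemons, as ZooKeeper clients, use an analogous Client { ... } section with the hbase principal, so that znode ACLs can require SASL-authenticated identities.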
25. Audit
• Simple audit log via Log4J
• Still need to work out a structured format for audit log messages
26. Two Implementation “Levels”
1. Secure RPC
• SecureRpcEngine for integration with Secure Hadoop; strong user authentication, message integrity, and encryption on the wire
• Implementation is solid
2. Coprocessor-based add-ons
• TokenProvider: install only if running MR jobs with HBase RPC security enabled
• AccessController: install on a per-table basis, configure per-CF policy; otherwise no overhead
• Implementations bring in new runtime dependencies on ZooKeeper; still considered experimental
30. Secure RPC Engine
• Authentication adds latency at connection setup: extra round trips for SASL negotiation
• Recommendation: increase RPC idle time for better connection reuse
• Negotiating message integrity (“auth-int”) takes ~5% off of max throughput
• Negotiating SASL encryption (“auth-conf”) takes ~10% off of max throughput
• Recommendation: consider your need for such options carefully
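One way to apply the idle-time recommendation, assuming the 0.92-era client property name; the value shown is only an example, not a tuned setting:

```xml
<!-- hbase-site.xml (client side): keep idle RPC connections open longer,
     so the SASL negotiation cost is paid less often -->
<property>
  <name>hbase.ipc.client.connection.maxidletime</name>
  <value>120000</value>
</property>
```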
31. Secure RPC Engine
• A Hadoop system including HBase will initiate RPC far more frequently than one without (file reads, compactions, client API access, ...)
• If the KDC is overloaded, then not only client operations but also things like region post-deployment tasks may fail, increasing region transition time
• Recommendation: HA KDC deployment, KDC capacity planning, trust federation over multiple KDC HA pairs
32. Secure RPC Engine
• Activity swarms may be seen by a KDC as replay attacks (“Request is a replay (34)”)
• Recommendation: ensure unique keys for each service instance, e.g. hbase/host@realm where host is the FQDN
• Recommendation: check for clock skew over cluster hosts
• Recommendation: use MIT Kerberos 1.8
• Recommendation: increase RPC idle time for better connection reuse
• Recommendation: avoid too-frequent HBCK validation of cluster health
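Creating unique per-host service principals can be sketched with MIT Kerberos kadmin; the hostnames, realm, and keytab file name are placeholders:

```
kadmin: addprinc -randkey hbase/rs1.example.com@EXAMPLE.COM
kadmin: addprinc -randkey hbase/rs2.example.com@EXAMPLE.COM
kadmin: xst -k hbase-rs1.keytab hbase/rs1.example.com@EXAMPLE.COM
```

Because each RegionServer authenticates with its own key, simultaneous connection storms from many hosts are not mistaken by the KDC for replays of a single shared credential.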
33. Hadoop Security Issues (?)
• Open issue: occasional swarms, lasting 5-10 seconds at intervals of about the TGT lifetime, of:

   date time host.dc ERROR [PostOpenDeployTasks: a74847b544ba37001f56a9d716385253]
   (org.apache.hadoop.security.UserGroupInformation) -
   PriviledgedActionException as:hbase/host.dc@realm (auth:KERBEROS)
   cause:javax.security.sasl.SaslException: GSS initiate failed
   [Caused by GSSException: No valid credentials provided (Mechanism
   level: Failed to find any Kerberos tgt)]

   Some Hadoop RPC improvements are not yet ported
• Speaking of swarms, at or about the delegation token expiration interval you may see runs of:

   date time host.dc ERROR [DataStreamer for file file block blockId]
   (org.apache.hadoop.security.UserGroupInformation) -
   PriviledgedActionException as:blockId (auth:SIMPLE)
   cause:org.apache.hadoop.ipc.RemoteException: Block token with
   block_token_identifier (expiryDate=timestamp, keyId=keyId,
   userId=hbase, blockIds=blockId, access modes=[READ|WRITE]) is
   expired.

   These should probably not be logged at ERROR level
34. TokenProvider
• Increases exposure to ZooKeeper-related RegionServer aborts: if keys cannot be rolled or accessed due to a ZK error, we must fail closed
• Recommendation: provision sufficient ZK quorum peers and deploy them in separate failure domains (one at each top of rack, or similar)
• Recommendation: redundant L2 / L2+L3 networking; you probably have it already
• Recent versions of ZooKeeper have important bug fixes
• Recommendation: use ZooKeeper 3.4.4 (when released) or higher
For more detail on HBase token authentication:
http://wiki.apache.org/hadoop/Hbase/HBaseTokenAuthentication
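Installing the TokenProvider endpoint is a site-configuration change; a minimal sketch, assuming the class name shipped with the 0.92 security build:

```xml
<!-- hbase-site.xml: load the TokenProvider endpoint on every region -->
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider</value>
</property>
```

MR jobs can then obtain an HBase delegation token during job setup (e.g. via TableMapReduceUtil.initCredentials(job)) so that tasks authenticate with DIGEST-MD5 rather than each needing Kerberos credentials.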
35. AccessController
• Use 0.92.1 or above for a bug fix with Get protection
• The AccessController will create a small new “system” table named _acl_; the data in this table is almost as important as that in .META.
• Recommendation: use the shell to manually flush the ACL table after permissions changes to ensure changes are persisted
• Recommendation: the recommendations related to ZooKeeper for TokenProvider apply equally here
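The manual flush recommendation above is a one-liner in the shell, using the _acl_ table name the AccessController creates:

```
hbase> flush '_acl_'
```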
36. Shell Support
• Shell support is rudimentary, but will support the basic use cases
• Note: you must supply exactly the same permission specification to revoke as you did to grant; there is no wildcarding and nothing like “revoke all”
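A matched grant/revoke pair illustrating the exact-specification rule; the user, table, and column family names are hypothetical:

```
hbase> grant 'alice', 'RW', 'prototype', 'd'
hbase> revoke 'alice', 'prototype', 'd'    # same table and CF as the grant
```

A broader revoke such as revoke 'alice', 'prototype' would not remove the CF-scoped grant above, since the specifications do not match.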