Hadoop security
1. Page1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Hadoop Security with HDP/PHD
2. Page2
Disclaimer
This document may contain product features and technology directions that are under development or may be under development in the future.
Technical feasibility, market demand, user feedback, and the Apache Software Foundation community development process can all affect timing and final delivery.
This document's description of these features and technology directions does not represent a contractual commitment from Hortonworks to deliver these features in any generally available product.
Product features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
3. Page3
Agenda
• Hadoop Security
• Kerberos
• Authorization and Auditing with Ranger
• Gateway Security with Knox
• Encryption
4. Page4
Security today in Hadoop with HDP/PHD – centralized security administration as an enterprise service:
• Authentication (Who am I / prove it?): Kerberos; API security with Apache Knox
• Authorization (What can I do?): fine-grain access control with Apache Ranger
• Audit (What did I do?): centralized audit reporting with Apache Ranger
• Data Protection (Can data be encrypted at rest and over the wire?): wire encryption in Hadoop; native and partner encryption
5. Page5
Security needs are changing
• Administration: central management & consistent security
• Authentication: authenticate users and systems
• Authorization: provision access to data
• Audit: maintain a record of data access
• Data Protection: protect data at rest and in motion
Why the change?
• YARN unlocks the data lake
• Multi-tenant: multiple applications for data access
• Different kinds of data
• Changing and complex compliance environment
Fall 2013: largely silo'd deployments with single-workload clusters → 2014: 65% of clusters host multiple workloads
6. Page6
Typical Flow – Hive Access through Beeline Client
A Beeline client submits queries to HiveServer2, which reads the data (blocks A, B, C) from HDFS.
7. Page7
Typical Flow – Authenticate through Kerberos
1. The Beeline client requests a TGT from the KDC, receives it, and decrypts it with the password hash.
2. The client presents the TGT and receives a Hive service ticket.
3. The client uses the Hive service ticket to submit the query to HiveServer2.
4. Hive gets a NameNode (NN) service ticket and creates MapReduce jobs using it to read the data (blocks A, B, C) from HDFS.
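The ticket exchange above can be sketched as a toy simulation. This is not the real Kerberos protocol (no timestamps, nonces, session keys, or real ciphers) and uses no Kerberos library; HMAC tags stand in for ticket encryption, and all names and secrets are made up for illustration:

```python
import hashlib, hmac, json

def derive_key(password):
    # Long-term key derived from a password (stand-in for Kerberos string-to-key)
    return hashlib.sha256(password.encode()).digest()

def seal(key, payload):
    # Toy stand-in for encryption: the payload stays readable, but the HMAC
    # tag proves which key sealed it, so only that key holder can "open" it.
    blob = json.dumps(payload, sort_keys=True).encode()
    return blob, hmac.new(key, blob, hashlib.sha256).digest()

def unseal(key, blob, tag):
    if not hmac.compare_digest(tag, hmac.new(key, blob, hashlib.sha256).digest()):
        raise ValueError("wrong key: cannot read ticket")
    return json.loads(blob)

# KDC state: user long-term keys and the Hive service's key (all hypothetical)
kdc_user_keys = {"alice": derive_key("alice-password")}
hive_key = derive_key("hive-service-secret")

# 1. Client requests a TGT; the KDC returns it sealed under the user's key
tgt_blob, tgt_tag = seal(kdc_user_keys["alice"], {"user": "alice", "for": "TGS"})

# 2. Client "decrypts" (verifies) the TGT with its password-derived key
tgt = unseal(derive_key("alice-password"), tgt_blob, tgt_tag)

# 3. Client presents the TGT; the KDC issues a Hive service ticket sealed
#    under the Hive service's own key
st_blob, st_tag = seal(hive_key, {"user": tgt["user"], "service": "hiveserver2"})

# 4. HiveServer2 verifies the service ticket with its key and accepts the query
session = unseal(hive_key, st_blob, st_tag)
print(session["user"], "authenticated to", session["service"])
```

The point of the sketch: the client never sends its password anywhere, and HiveServer2 never talks to the KDC per request; possession of a valid service ticket is the proof of identity.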
8. Page8
Typical Flow – Add Authorization through Ranger (XA Secure)
1. The Beeline client gets a service ticket for Hive from the KDC.
2. The client uses the Hive service ticket to submit the query to HiveServer2.
3. HiveServer2 checks the request against the policies held in Ranger.
4. Hive gets a NameNode (NN) service ticket and creates MapReduce jobs using it against the data (blocks A, B, C) in HDFS.
9. Page9
Typical Flow – Firewall, Route through Knox Gateway
1. The original request, with user id/password, goes from the Beeline client to Apache Knox.
2. Knox gets a service ticket for Hive from the KDC and runs as a proxy user using the Hive service ticket.
3. Knox submits the query to HiveServer2, where Ranger authorizes it.
4. Hive gets a NameNode (NN) service ticket and creates MapReduce jobs using it against HDFS (blocks A, B, C); the client gets the query result back through Knox.
10. Page10
Typical Flow – Add Wire and File Encryption
Same Knox-routed query flow, with each link encrypted: SSL between the Beeline client and Knox, SSL from Knox to HiveServer2, SASL between Hive and the NameNode, and SSL on the remaining service links; the data in HDFS (blocks A, B, C) can additionally be encrypted at rest.
11. Page11
Security Features – PHD/HDP Security
Authentication
• Kerberos support ✔
• Perimeter security, for services and REST APIs ✔
Authorization
• Fine-grained access control: HDFS, HBase, Hive, Storm and Knox
• Role-based access control ✔
• Column level ✔
• Permission support: Create, Drop, Index, Lock, User
Auditing
• Resource access auditing: extensive auditing
• Policy auditing ✔
12. Page12
HDP/PHD Security w/ Ranger – Security Features
Data Protection
• Wire encryption ✔
• Volume encryption: TDE
• File/column encryption: HDFS TDE & partners
Reporting
• Global view of policies and audit data ✔
Manage
• User/group mapping ✔
• Global policy manager, Web UI ✔
• Delegated administration ✔
13. Page13
Partner Integration
Security integrations:
– Ranger plugins: centralize authorization/audit of 3rd-party software in the Ranger UI
– A custom Log4j appender can stream audit events to INFA infrastructure
– Knox: route partner APIs through Knox after validating compatibility
– Provide SSO capability to end users
14. Page14
Authentication w/ Kerberos
15. Page15
Kerberos in the field
Kerberos is no longer "too complex"; adoption is growing.
– Ambari helps automate and manage Kerberos integration with the cluster
Use Active Directory, or a combined MIT Kerberos/Active Directory setup:
– Active Directory is seen most commonly in the field
– Many start with a separate MIT KDC and later grow into the AD KDC
Knox should be considered for API/perimeter security:
– Removes the need for Kerberos for end users
– Enables integration with different authentication standards
– Single location to manage security for REST APIs & HTTP-based services
– Tip: place it in the DMZ
16. Page22
Authorization and Auditing
Apache Ranger
17. Page23
Authorization and Audit
Authorization – fine-grain access control:
• HDFS – folder, file
• Hive – database, table, column
• HBase – table, column family, column
• Storm, Knox and more
Audit – extensive user access auditing in HDFS, Hive and HBase:
• IP address
• Resource type/resource
• Timestamp
• Access granted or denied
Goals: control access into the system, with flexibility in defining policies.
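Conceptually, each Ranger policy ties a resource at this granularity to users/groups, the permissions they get, and an audit flag. A sketch of what a Hive policy might contain (field names are illustrative, not the exact Ranger REST schema):

```json
{
  "policyName": "marketing-read-only",
  "repositoryType": "hive",
  "database": "sales",
  "table": "transactions",
  "columns": "id, amount, region",
  "permissions": [
    { "group": "marketing", "accessTypes": ["select"] }
  ],
  "auditEnabled": true
}
```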
18. Page24
Central Security Administration
Apache Ranger
• Delivers a "single pane of glass" for the security administrator
• Centralizes administration of security policy
• Ensures consistent coverage across the entire Hadoop stack
19. Page25
Setup Authorization Policies
[Screenshot: the Ranger policy setup UI – file-level access control with flexible policy definition and permission control.]
20. Page26
Monitor through Auditing
[Screenshot: the Ranger audit view.]
22. Page28
Authorization and Auditing w/ Ranger
The Ranger Administration Portal fronts a Ranger Policy Server and a Ranger Audit Server, both backed by an RDBMS. Ranger plugins embedded in the Hadoop components – HDFS, HBase, HiveServer2, Knox and Storm – enforce the policies and ship audit events back; Knox and Storm plugins are HDP 2.2 additions, with further plugins planned for 2015 (TBD). Legacy tools and data governance integrate through an API.
23. Page29
Installation Steps
• Install PHD 3.0
• Install Apache Ranger (https://tinyurl.com/mlgs3jy)
  – Install Policy Manager
  – Install User Sync
  – Install Ranger Plugins
• Start the Policy Manager: service ranger-admin start
• Verify at http://<host>:6080/ (default login admin/admin)
24. Page30
Ranger Plugins
• HDFS
• Hive
• Knox
• Storm
• HBase
Steps to enable plugins:
1. Start the Policy Manager
2. Create the plugin repository in the Policy Manager
3. Install the plugin
  • Edit install.properties
  • Execute ./enable-<plugin>.sh
4. Restart the plugin's service (e.g. HDFS, Hive, etc.)
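Before running enable-<plugin>.sh, the essential edits in install.properties point the plugin at the Policy Manager. A minimal sketch for the HDFS plugin (property names vary by plugin and Ranger version; hostnames and values here are placeholders):

```properties
# URL of the running Ranger/XA Secure Policy Manager
POLICY_MGR_URL=http://ranger-host:6080
# Must match the repository name created in the Policy Manager (step 2)
REPOSITORY_NAME=hadoopdev
# Audit destination switch (illustrative; audit keys differ across versions)
XAAUDIT.DB.IS_ENABLED=true
```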
25. Page31
Ranger Console
• The Repository Manager tab
• The Policy Manager tab
• The User/Group tab
• The Analytics tab
• The Audit tab
26. Page32
Repository Manager
• Add New Repository
• Edit Repository
• Delete Repository
28. Page34
REST API Security through Knox
Securely share a Hadoop cluster
29. Page35
Share the Data Lake with everyone – securely
• Simplifies access: extends Hadoop's REST/HTTP services by encapsulating Kerberos within the cluster
• Enhances security: exposes Hadoop's REST/HTTP services without revealing network details, providing SSL out of the box
• Centralized control: enforces REST API security centrally, routing requests to multiple Hadoop clusters
• Enterprise integration: supports LDAP, Active Directory, SSO, SAML and other authentication systems
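Because Knox terminates SSL and checks credentials against LDAP/AD, a client call is just an authenticated HTTPS request. A sketch of building (not sending) a WebHDFS call routed via Knox; the gateway host, port, topology name and credentials are placeholders, and the `/gateway/<topology>` context path assumes a default-style Knox deployment:

```python
import base64
import urllib.request

def knox_webhdfs_request(gateway, topology, path, user, password):
    """Build (but do not send) a WebHDFS LISTSTATUS request routed via Knox.

    Knox authenticates the Basic-auth credentials against LDAP/AD, so the
    client never needs Kerberos. All connection details are placeholders.
    """
    url = (f"https://{gateway}:8443/gateway/{topology}"
           f"/webhdfs/v1{path}?op=LISTSTATUS")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"})

req = knox_webhdfs_request("knox-host", "default", "/tmp",
                           "guest", "guest-password")
print(req.full_url)
```

Sending the request (e.g. with `urllib.request.urlopen`) would of course require a reachable gateway and a trusted SSL certificate.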
30. Page36
Apache Knox
Knox can be used with both unsecured Hadoop clusters and Kerberos-secured clusters. In an enterprise solution that employs Kerberos-secured clusters, the Apache Knox Gateway provides an enterprise security solution that:
• Integrates well with enterprise identity management solutions
• Protects the details of the Hadoop cluster deployment (hosts and ports are hidden from end users)
• Reduces the number of services with which a client needs to interact
31. Page37
Extend Hadoop API reach with Knox
[Diagram: REST/HTTP and JDBC/ODBC traffic from business users and the application tier (apps A–N) passes through a load balancer to Knox, which fronts the Hadoop cluster; Hadoop admins and operators reach the cluster via a bastion node over SSH and RPC calls; data operators drive ingest/ETL through Falcon, Oozie, Sqoop and Flume.]
32. Page38
Typical Flow – Add Wire and File Encryption (recap)
The Knox-routed query flow with each link encrypted: SSL between the Beeline client and Knox, SSL from Knox to HiveServer2, SASL between Hive and the NameNode, and SSL on the remaining service links; data at rest in HDFS (blocks A, B, C) can additionally be encrypted.
33. Page39
Why Knox?
Simplified Access
• Kerberos encapsulation
• Extends API reach
• Single access point
• Multi-cluster support
• Single SSL certificate
Centralized Control
• Central REST API auditing
• Service-level authorization
• Alternative to SSH "edge node"
Enterprise Integration
• LDAP integration
• Active Directory integration
• SSO integration
• Apache Shiro extensibility
• Custom extensibility
Enhanced Security
• Protect network details
• SSL for non-SSL services
• WebApp vulnerability filter
34. Page40
Hadoop REST APIs with Knox
Service  | Direct URL                           | Knox URL
WebHDFS  | http://namenode-host:50070/webhdfs   | https://knox-host:8443/webhdfs
WebHCat  | http://webhcat-host:50111/templeton  | https://knox-host:8443/templeton
Oozie    | http://oozie-host:11000/oozie        | https://knox-host:8443/oozie
HBase    | http://hbase-host:60080              | https://knox-host:8443/hbase
Hive     | http://hive-host:10001/cliservice    | https://knox-host:8443/hive
YARN     | http://yarn-host:yarn-port/ws        | https://knox-host:8443/resourcemanager
Masters may be on many different hosts; with Knox there is one host, one port, consistent paths, and SSL configured at one host.
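The table's rewrite rule can be expressed as a small helper. The hostnames and path map come straight from the table above; this is an illustration of the mapping, not a Knox API (HBase's root-path URL is omitted because it has no path prefix to match on):

```python
from urllib.parse import urlparse

# Direct-URL path prefix -> Knox context path (taken from the table above)
KNOX_PATHS = {
    "/webhdfs": "/webhdfs",        # WebHDFS
    "/templeton": "/templeton",    # WebHCat
    "/oozie": "/oozie",            # Oozie
    "/cliservice": "/hive",        # HiveServer2
    "/ws": "/resourcemanager",     # YARN
}

def to_knox_url(direct_url, knox_host="knox-host", knox_port=8443):
    """Rewrite a per-service direct URL to its single-endpoint Knox form."""
    path = urlparse(direct_url).path
    for prefix, knox_path in KNOX_PATHS.items():
        if path.startswith(prefix):
            return f"https://{knox_host}:{knox_port}{knox_path}"
    raise ValueError(f"no Knox mapping for {direct_url}")

print(to_knox_url("http://namenode-host:50070/webhdfs"))
```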
35. Page41
Hadoop REST API Security: Drill-Down
[Diagram: a REST client passes through the outer firewall into a DMZ, where a load balancer fronts the Knox Gateway instances; Knox authenticates against an enterprise identity provider (LDAP/AD) and forwards HTTP traffic through an inner firewall to the masters (NN, RM, WebHCat, Oozie, HS2, HBase) and slaves (DN, NM) of one or more Hadoop clusters; edge-node Hadoop CLIs still use RPC directly.]
36. Page42
Knox – features in PHD
• Use Ambari for install/start/stop/configuration
• Knox support for HDFS HA
• Support for the YARN REST API
• Support for SSL to Hadoop cluster services (WebHDFS, HBase, Hive & Oozie)
• Integration with Ranger for Knox service-level authorization
• Knox Management REST API
37. Page43
Installation
• Installed via Ambari
  – This can also be done manually
  – Start the embedded LDAP
• There are good examples in the Apache docs with Groovy scripts:
  – https://knox.apache.org/books/knox-0-4-0/knox-0-4-0.html
38. Page44
Data Protection
Wire and data-at-rest encryption
39. Page45
Data Protection
HDP allows you to apply data protection policy at different layers across the Hadoop stack:
Layer              | What?                            | How?
Storage and Access | Encrypt data while it is at rest | Partners, HDFS TDE (tech preview), HBase encryption, OS-level encryption
Transmission       | Encrypt data as it moves         | Supported from HDP 2.1
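For the transmission layer, wire encryption is driven by a handful of Hadoop configuration properties. A minimal sketch (the exact property set and values depend on the HDP/PHD version and must match across the cluster):

```xml
<!-- hdfs-site.xml / core-site.xml: illustrative wire-encryption settings -->
<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>    <!-- SASL: encrypt client/NameNode RPC -->
</property>
<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>       <!-- encrypt DataNode block transfers -->
</property>
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value> <!-- serve web UIs and WebHDFS over SSL -->
</property>
```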
40. Page49
HDFS Transparent Data Encryption (TDE) in 2.2
• Encryption at a higher level than the OS, while remaining native and transparent to Hadoop
• End-to-end: data is encrypted and decrypted only by the clients
• Encryption/decryption uses the usual HDFS functions from the client
• No need to change user application code
• No need to store data encryption keys on HDFS itself
• HDFS itself never handles unencrypted data
• Data is effectively encrypted at rest, and since it is decrypted on the client side, it is also encrypted on the wire while being transmitted
• HDFS file encryption/decryption is transparent to clients: users can read/write files to/from an encryption zone as long as they have permission to access it
• Depends on installing a Key Management Server (KMS)
42. Page54
HDFS Transparent Data Encryption (TDE) – Steps
• Install and run the KMS on top of HDP 2.2
• Change the HDFS parameters via Ambari
• Create an encryption key:
  hadoop key create key1 -size 256
  hadoop key list -metadata
• Create an encryption zone using the key:
  hdfs dfs -mkdir /zone1
  hdfs crypto -createZone -keyName key1 /zone1
  hdfs crypto -listZones
– http://hortonworks.com/kb/hdfs-transparent-data-encryption/