1. © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Built-in Security For The Cloud
DataWorks Summit Sydney
September 2017
2.
Presenters
Jeff Sposetti
Senior Director of Product Management, Cloud
Hortonworks Data Cloud, Cloudbreak
3.
Agenda
Introduction
Quick Demo
Security Building Blocks: Apache Ranger and Knox
Bringing It Together: Cloud and Data Lake Security
Longer Demo
Wrap Up
Q & A
4.
Background: Ephemeral Workloads + Cloud Storage
Cloud is driving more ephemeral data processing use cases
Cloud requires a robust integration with cloud storage
CLOUD STORAGE
S3
ADLS
WASB
WORKLOAD CLUSTERS
Durable Ephemeral
5.
Background: Hortonworks Data Cloud for AWS
Focuses on business agility, rather than
infinite configurability and cluster
management
Addresses prescriptive, ephemeral use
cases around Apache Spark + Apache Hive
Pre-tuned and configured for use with
Amazon S3
Learn more:
http://hortonworks.com/products/cloud/aws/
7.
Security Building Blocks:
Apache Ranger and Knox
8.
Protecting the Elephant in the Castle…
Kerberos,
Wire Encryption
HDFS Encryption
Apache Ranger
Network Segmentation,
Firewalls
LDAP/AD
Apache Knox
9.
Apache Knox
Proxying Services
★ Provide access to Hadoop via proxying of HTTP resources
★ Ecosystem APIs and UIs + Hadoop-oriented dispatching for Kerberos + doAs (impersonation) etc.
Authentication Services
★ REST API access, WebSSO flow for UIs
★ LDAP/AD, Header based PreAuth
★ Kerberos, SAML, OAuth
Client DSL/SDK Services
★ Scripting through DSL
★ Using Knox Shell classes directly as SDK
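As a rough illustration of the proxying model, the sketch below builds the kind of URL a client would use to reach WebHDFS through a Knox gateway rather than contacting cluster nodes directly. The host name, port, topology name (`default`), user and path are placeholder assumptions, not values from this deck; the `/gateway/{topology}/webhdfs/v1` layout follows Knox's commonly documented default and may differ in a given deployment.

```python
import base64
from urllib.parse import quote, urlencode


def knox_webhdfs_url(host, path, op="LISTSTATUS", port=8443,
                     gateway_path="gateway", topology="default"):
    """Build a WebHDFS URL routed through a Knox gateway (assumed default layout)."""
    clean_path = quote(path.lstrip("/"))
    query = urlencode({"op": op})
    return (f"https://{host}:{port}/{gateway_path}/{topology}"
            f"/webhdfs/v1/{clean_path}?{query}")


def basic_auth_header(user, password):
    """HTTP Basic credentials, which the gateway can check against LDAP/AD."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Hypothetical host and user, for illustration only.
    print(knox_webhdfs_url("knox.example.com", "/tmp"))
    print(basic_auth_header("alice", "secret"))
```

The client only ever sees one HTTPS endpoint; which internal service the request reaches is decided by the gateway's topology.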
10.
Apache Ranger
Comprehensive and Extensible Security Model
• Centralized platform to define, administer and manage
security policies across Hadoop components (HDFS, Hive,
HBase, YARN, Kafka, Solr, Storm, Knox, NiFi, Atlas)
• Extensible Architecture with ability to add custom policy
conditions, user context enrichers
Fine-Grained Authorization
• Data access control at the Database, Table and Column level, for LDAP
Groups & Specific Users
Centralized Auditing
• Central audit location for all access requests
• Supports multiple audit destinations (HDFS, Solr, etc.)
• Real-time visual query interface
Advanced Security
• Dynamic Security Policies: Prohibition, Time, Location and
Tag (Atlas)
• Dynamic Column Masking & Row Filtering
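To make the fine-grained authorization idea concrete, here is a minimal sketch of a column-level Hive policy expressed as a Python dictionary. The field names follow the general shape of Ranger's policy JSON as commonly documented for its public REST API, but treat them as an assumption to verify against the Ranger version in use. The service name `datalake_hive`, the database, table, columns and group are all hypothetical.

```python
import json


def hive_column_policy(service, name, database, table, columns, groups,
                       access_types=("select",)):
    """Build a Ranger-style Hive policy granting column-level access to groups."""
    return {
        "service": service,
        "name": name,
        "isEnabled": True,
        "resources": {
            "database": {"values": [database]},
            "table": {"values": [table]},
            "column": {"values": list(columns)},
        },
        "policyItems": [
            {
                "groups": list(groups),
                "users": [],
                "accesses": [{"type": t, "isAllowed": True}
                             for t in access_types],
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical names: analysts may select two columns of sales.customers.
    policy = hive_column_policy("datalake_hive", "analysts-read-customers",
                                "sales", "customers", ["name", "region"],
                                ["analysts"])
    print(json.dumps(policy, indent=2))
```

Because the policy names groups rather than individual users, membership changes in LDAP/AD take effect without editing the policy itself.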
[Diagram: platform layers (Operations, Security, Governance, Storage) with workload types Machine Learning, Batch, Streaming, Interactive and Search]
11.
Bringing It Together:
Cloud and Data Lake Services
12.
CLOUD
DATA LAKE
SECURITY
13.
Key Components for Enterprise Security
SCHEMA
WHAT: Provides Hive schema (tables, views, etc.).
WHY: If you have 2+ workloads accessing the same data, you need to share schema across those workloads.
HOW: Externalize the Hive Metastore for schema definition.
POLICY
WHAT: Defines security policies around Hive schema.
WHY: If you have 2+ users accessing the same data, policies need to be consistently available and enforced.
HOW: Externalize and share Ranger across workloads and store policies externally.
AUDIT
WHAT: Audit user access.
WHY: Capture data access activity.
HOW: Externalize and share Ranger across workloads, leverage cloud storage for audit data.
GATEWAY
WHAT: Provide a single endpoint that can be protected with SSL and enabled for authentication to access cluster resources.
WHY: Avoid opening many ports, some potentially without authentication or SSL protection.
HOW: Deploy a centralized protected gateway automatically.
DIRECTORY
WHAT: Users and groups.
WHY: Provide an authentication source for users and an authorization source for groups.
HOW: Leverage external LDAP or Active Directory.
14.
Ephemeral Workloads: With Enterprise Security
Tuned and Optimized Infrastructure
Simplified, Automated Operations
S3 Integration
Protected Network Access
                 Ephemeral                 Enterprise Security
Schema           Shared (Hive Metastore)   Shared (Hive Metastore)
Authentication   Single-user               Multi-User (LDAP/AD)
Authorization    -                         Security Policies (Ranger)
Audit            -                         Audit (Ranger)
15.
Ephemeral Workloads + Cloud Storage + Shared “Data Lake” Services
CLOUD STORAGE
S3
ADLS
WASB
WORKLOAD CLUSTERS
Durable Ephemeral
SHARED DATA LAKE SERVICES
Metastore
SCHEMA
Long Running
Define your data schema and
security policies once for your
ephemeral and always-on
workloads
Ranger
POLICY
Secure access to workload
clusters via a Protected Gateway
enabled for AuthN and HTTPS.
16.
Shared Schema: Hive Metastore
Register external “Amazon RDS” instances to use with Hive Metastore
Preserve Hive schema across multiple ephemeral clusters
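One way to picture the shared schema: every ephemeral cluster is configured with the same metastore database connection, so tables defined by one cluster are visible to the next. The sketch below generates the standard Hive `javax.jdo.option.*` connection properties for an external database. The PostgreSQL engine, endpoint, database name and credentials are assumptions for illustration; the deck does not state which RDS engine is used.

```python
def metastore_db_properties(rds_endpoint, db_name, user, password, port=5432):
    """Hive Metastore connection properties for an external database.

    Assumes a PostgreSQL-flavoured RDS instance; other engines need a
    different JDBC URL scheme and driver class.
    """
    return {
        "javax.jdo.option.ConnectionURL":
            f"jdbc:postgresql://{rds_endpoint}:{port}/{db_name}",
        "javax.jdo.option.ConnectionDriverName": "org.postgresql.Driver",
        "javax.jdo.option.ConnectionUserName": user,
        "javax.jdo.option.ConnectionPassword": password,
    }


if __name__ == "__main__":
    # Hypothetical endpoint; two short-lived clusters receive identical settings.
    shared = metastore_db_properties(
        "metastore.abc123.ap-southeast-2.rds.amazonaws.com",
        "hive", "hive", "change-me")
    for cluster in ("etl-monday", "adhoc-tuesday"):
        print(cluster, shared["javax.jdo.option.ConnectionURL"])
```

Terminating either cluster leaves the schema untouched, because the metastore database lives outside both of them.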
17.
Protected Network Access: Knox
18.
Shared Security Policies: Ranger
Create a set of “Shared Data Lake Services”
Preserve Ranger Security Policies across multiple ephemeral clusters
19.
Deployment Architecture
Access your cluster
components through the
protected gateway over SSL
on port 443, which is open on
the controller security group.
CONTROLLER
PROTECTED
GATEWAY
USER ACCESS
Zeppelin
HIVE LLAP / SPARK WORKLOADS
Hive
LLAP
SHARED DATA LAKE SERVICES
Ranger
POLICY
(RDS)
AUDIT
(S3)
SCHEMA
(RDS)
DIRECTORY
(LDAP/AD)
Spark
Hive
Metastore
20.
Hortonworks Data Cloud + Shared Data Lake Services
1. Register an Authentication Source (i.e. LDAP/AD).
2. Create a “Shared Data Lake”, specify S3 Bucket & RDS.
3. When you create a cluster, “attach” to the Shared Data Lake Services:
• for Multi-User AuthN (LDAP/AD)
• for AuthZ + Audit (Ranger)
• for Schema (Hive Metastore)
PREREQUISITES
• LDAP/AD
• S3 Bucket
• RDS Instance
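The three steps and their prerequisites can be sketched as a small data model. This is plain illustrative Python, not the Hortonworks Data Cloud API: the class names, fields and the `attach` method are all hypothetical, chosen only to show that the data lake is defined once and then referenced by each cluster.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SharedDataLake:
    """Step 2: long-running shared services backed by S3 and RDS."""
    name: str
    auth_source: str   # step 1: registered LDAP/AD source (hypothetical URL)
    s3_bucket: str     # holds Ranger audit data
    rds_instance: str  # holds Ranger policies and Hive Metastore schema

    def __post_init__(self):
        # All prerequisites must be supplied before the lake can exist.
        missing = [k for k, v in vars(self).items() if not v]
        if missing:
            raise ValueError(f"missing prerequisites: {', '.join(missing)}")


@dataclass
class Cluster:
    name: str
    data_lake: Optional[SharedDataLake] = None

    def attach(self, lake: SharedDataLake) -> dict:
        """Step 3: point an ephemeral cluster at the shared services."""
        self.data_lake = lake
        return {
            "authn": lake.auth_source,
            "authz_audit": f"Ranger (policies in {lake.rds_instance}, "
                           f"audit in {lake.s3_bucket})",
            "schema": f"Hive Metastore (in {lake.rds_instance})",
        }


if __name__ == "__main__":
    lake = SharedDataLake("prod-lake", "ldaps://ad.example.com",
                          "prod-lake-audit", "prod-lake-rds")
    for name in ("etl-1", "adhoc-2"):
        print(name, Cluster(name).attach(lake))
```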
22.
General Guidelines
Think ephemeral. Keep all of your data in S3 and all of your metadata in RDS; do not
create tables or files in the local HDFS.
The Hive warehouse is set up on S3 for data lakes. Create tables in this location
rather than in individual S3 buckets; this makes them easier to manage.
Use Hive “external tables” for tables that live outside this warehouse, typically when the
data is ingested through some path outside of Hadoop.
Create S3 bucket policies that exactly match usage so that you can spin up clusters with
least privilege.
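As a hedged example of the least-privilege guideline, the sketch below generates an IAM-style policy document that limits a cluster to one warehouse prefix in one bucket. The bucket name and prefix are hypothetical, and the document is written as an identity policy for the cluster's role; a true bucket policy would additionally need a `Principal` element. The exact set of S3 actions a given workload needs should be confirmed rather than assumed.

```python
import json


def least_privilege_s3_policy(bucket, prefix, read_only=False):
    """Allow listing and object access under a single prefix of one bucket."""
    prefix = prefix.strip("/")
    object_actions = ["s3:GetObject"]
    if not read_only:
        object_actions += ["s3:PutObject", "s3:DeleteObject"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListWarehousePrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
                # Listing is restricted to keys under the warehouse prefix.
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                "Sid": "AccessWarehouseObjects",
                "Effect": "Allow",
                "Action": object_actions,
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and prefix for the Hive warehouse.
    print(json.dumps(least_privilege_s3_policy("my-datalake", "warehouse"),
                     indent=2))
```

A read-only reporting cluster could use `read_only=True`, so that it cannot modify or delete warehouse data even if a job misbehaves.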
24.
Takeaways
Cloud is driving more ephemeral data processing use cases
Ephemeral workloads leverage cloud storage
This pattern is driving an architectural approach for “Shared Data Lake Services”
Building blocks are Apache Ranger and Apache Knox
Resource Link
Hortonworks Data Cloud https://hortonworks.com/products/cloud/aws/
Apache Ranger https://hortonworks.com/apache/ranger/
Apache Knox https://hortonworks.com/apache/knox-gateway/
25.
Learn More
Enterprise ready security and governance for Hadoop ecosystem
Breakout Session
Thursday, September 21 @ 3:10p
https://dataworkssummit.com/sydney-2017/sessions/treat-your-enterprise-data-lake-indigestion-enterprise-ready-security-and-governance-for-hadoop-ecosystem
Security, Governance and Cybersecurity
Birds of a Feather
Thursday, September 21 @ 6:00p
https://dataworkssummit.com/sydney-2017/birds-of-a-feather/security-governance-cybersecurity/
26.
Thank You
https://hortonworks.com/products/cloud/aws/
https://hortonworks.com/apache/ranger/
https://hortonworks.com/apache/atlas/