Cloudera GoDataFest Security and Governance
1. © Cloudera, Inc. All rights reserved.
Cloudera Security & Governance
Wim Villano, Sales Engineer Cloudera
2.
Comprehensive, Compliance-Ready Security
Authentication, Authorization, Audit, and Compliance
Perimeter (Cloudera Manager)
Guarding access to the cluster itself
Technical Concepts: Authentication, Network isolation
Access (Apache Sentry)
Defining what users and applications can do with data
Technical Concepts: Permissions, Authorization
Data (Navigator Encrypt & Key Trustee | Partners)
Protecting data in the cluster from unauthorized visibility
Technical Concepts: Encryption, Tokenization, Data masking
Visibility (Cloudera Navigator)
Reporting on where data came from and how it’s being used
Technical Concepts: Auditing, Lineage
3.
Perimeter Security – Isolation, Authentication
Preserve user choice of the right Hadoop service (e.g. Impala, Spark)
Conform to centrally managed authentication policies
Implement with existing standard systems: Active Directory (LDAP) and Kerberos
Perimeter (Cloudera Manager)
Guarding access to the cluster itself
Technical Concepts: Authentication, Network isolation
4.
Active Directory and Kerberos
Active Directory
• Manages Users, Groups, and Services
• Provides username / password authentication
• Group membership determines Service access
Kerberos
• Trusted, standard third-party authentication
• Authenticated users receive “Tickets”
• “Tickets” gain access to Services
Flow: User authenticates to AD → Authenticated user gets Kerberos Ticket → Ticket grants access to Services, e.g. Impala
Login example: User [ssmith], Password [*****]
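The three-step flow on this slide (authenticate, receive a ticket, present the ticket to a service) can be illustrated with a small toy model. This is only a sketch of the idea and not the Kerberos protocol itself: real Kerberos relies on a key distribution center and encrypted tickets, whereas this example signs a ticket with an HMAC. The names DIRECTORY, SERVICE_GROUPS, ISSUER_KEY, authenticate and access_service, the password, and the one-hour lifetime are all assumptions made for illustration.

```python
import hashlib
import hmac
import time

# Hypothetical secret known to the ticket issuer and the services (toy stand-in only).
ISSUER_KEY = b"demo-only-key"

# Hypothetical directory: user -> (SHA-256 of password, groups).
# Unsalted hashing is used only to keep the toy short; do not copy this.
DIRECTORY = {
    "ssmith": (hashlib.sha256(b"s3cret").hexdigest(), {"analysts"}),
}

# Hypothetical mapping: service -> group whose members may use it.
SERVICE_GROUPS = {"impala": "analysts"}


def _sign(payload):
    """Return a hex HMAC-SHA256 signature of the payload."""
    return hmac.new(ISSUER_KEY, payload.encode(), hashlib.sha256).hexdigest()


def authenticate(user, password, now=None):
    """Steps 1 and 2: check the password, then issue a signed, time-limited ticket."""
    record = DIRECTORY.get(user)
    supplied = hashlib.sha256(password.encode()).hexdigest()
    if record is None or not hmac.compare_digest(record[0].encode(), supplied.encode()):
        return None
    expires = int(time.time() if now is None else now) + 3600
    payload = f"{user}|{expires}"
    return f"{payload}|{_sign(payload)}"


def access_service(ticket, service, now=None):
    """Step 3: the service checks the ticket, never the password."""
    try:
        user, expires, signature = ticket.split("|")
    except (AttributeError, ValueError):
        return False  # missing or malformed ticket
    expected = _sign(f"{user}|{expires}")
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False  # forged or altered ticket
    if int(expires) < int(time.time() if now is None else now):
        return False  # expired ticket
    groups = DIRECTORY.get(user, (None, set()))[1]
    # Group membership determines service access.
    return SERVICE_GROUPS.get(service) in groups
```

In this sketch the password is seen only by authenticate; every later request carries the ticket, which mirrors the point of the slide that services trust tickets rather than credentials.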
5.
Access Security Requirements
Provide users access to data needed to do their job
Centrally manage access policies
Leverage a role-based access control model built on AD
Access (Apache Sentry)
Defining what users and applications can do with data
InfoSec Concept: Authorization
6.
Authorization
• (Linux) POSIX: Directory, File
• (Linux) ACL: Management of services/resources
• Apache Sentry: RBAC within services (Impala, Hive, Search, Kafka)
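As an illustration of the first layer in this list, the sketch below shows how a classic POSIX permission check on a file or directory can be evaluated from its mode bits. It is a simplified model under stated assumptions: it ignores the superuser and extended ACLs, and the function name posix_allows and its parameters are hypothetical.

```python
import stat

# Permission bits for each access type, ordered as (owner, group, other).
_BITS = {
    "r": (stat.S_IRUSR, stat.S_IRGRP, stat.S_IROTH),
    "w": (stat.S_IWUSR, stat.S_IWGRP, stat.S_IWOTH),
    "x": (stat.S_IXUSR, stat.S_IXGRP, stat.S_IXOTH),
}


def posix_allows(mode, file_uid, file_gid, uid, gids, want):
    """Return True if a user (uid, set of gids) may perform `want` ('r', 'w' or 'x').

    Exactly one class applies: owner first, then group, then other.
    """
    owner_bit, group_bit, other_bit = _BITS[want]
    if uid == file_uid:
        return bool(mode & owner_bit)
    if file_gid in gids:
        return bool(mode & group_bit)
    return bool(mode & other_bit)
```

For example, with mode 0o640 the owner can read and write, members of the file's group can only read, and everyone else is denied. Role-based rules inside a service such as Impala or Hive need a separate mechanism, which is where Sentry comes in on the slide.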
7.
RBAC and Centralized Authorization
Manage data access by role, instead of by individual user
• Customer Support Rep has read access to US Customers
• Broker Analyst has read access to US Transactions
• Relationships between users and roles are established via groups
An RBAC policy is then uniformly enforced for all Hadoop services
• Provides unified authorization controls
• As opposed to tools for managing numerous, service-specific policies
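The user → group → role → privilege chain described on this slide can be sketched in a few lines. This is an illustrative model only and not the Sentry API: the user names, group names, and the functions privileges_for and is_allowed are hypothetical, while the two roles and their read access follow the examples on the slide.

```python
# Hypothetical users and groups; in the deck, groups would come from AD.
USER_GROUPS = {
    "alice": {"support_reps"},
    "bob": {"broker_analysts"},
}

# Relationships between users and roles are established via groups.
GROUP_ROLES = {
    "support_reps": {"customer_support_rep"},
    "broker_analysts": {"broker_analyst"},
}

# Privileges are attached to roles, never to individual users.
ROLE_PRIVILEGES = {
    "customer_support_rep": {("read", "us_customers")},
    "broker_analyst": {("read", "us_transactions")},
}


def privileges_for(user):
    """Collect every (action, resource) pair granted through the user's groups and roles."""
    granted = set()
    for group in USER_GROUPS.get(user, set()):
        for role in GROUP_ROLES.get(group, set()):
            granted |= ROLE_PRIVILEGES.get(role, set())
    return granted


def is_allowed(user, action, resource):
    """One policy check that any service could call, so the rule is enforced uniformly."""
    return (action, resource) in privileges_for(user)
```

Because access is decided by a single function over shared tables, changing a role's privileges or a user's group membership takes effect everywhere at once, which is the contrast the slide draws with service-specific policies.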
8.
Unified Authorization with Apache Sentry
Sentry provides unified authorization via:
Fine-grained RBAC for Impala, Hive, Search and Kafka
Impala/Hive permissions synced in HDFS for all other components (Spark, MapReduce, etc.)
Goal: Unified authorization for all Hadoop services and applications
Example: Sam Smith (user) → Fraud Analysts (group) → Fraud Analyst Role (Sentry role) → Read access to ALL transaction data (Sentry permission)
10.
Cloudera Manager Roles - Separation of Duties
Auditor
Read-Only
Limited Operator
Operator
Configurator
Cluster Administrator
BDR Administrator
Navigator Administrator
User Administrator
Key Administrator
Full Administrator
12.
Data Security Requirements
Perform analytics on regulated data
Encrypt data, conform to key management policies, protect from root
Integrate with existing HSM as part of key management infrastructure
Data (Navigator Encrypt & Key Trustee)
Protecting data in the cluster from unauthorized visibility
InfoSec Concept: Compliance
13.
Compliance-Ready Encryption & Key Management
Cloudera’s Solution:
• ALL data encrypted: HDFS, HBase, metadata, log files, ingest paths
• Enterprise Key Management via Navigator Key Trustee
• Configuration support via Cloudera Manager
• Audit integration to Cloudera Navigator
• Optional root-of-trust integration with HSMs
[Diagram: Manager, Navigator, Impala, Hive, HDFS, HBase, Sentry, Navigator Key Trustee, Log Files, Metadata Store, Ingest Paths. Legend: Encrypted Data, Encryption Key]
15.
Visibility Security Requirements
Understand where report data came from and discover more data like it
Comply with policies for audit, data classification, and lineage
Centralize the audit repository; perform discovery; automate lineage
Visibility (Cloudera Navigator)
Reporting on where data came from and how it’s being used
InfoSec Concept: Audit
16.
Governance is the Foundation of Data Management
Compliance
Track, understand and protect access to data
• Am I prepared for an audit?
• Who’s accessing sensitive data?
• What are they doing with the data?
• Is sensitive data governed and protected?
Stewardship
Manage and organize data assets at Hadoop scale
• How can I efficiently manage data lifecycle, from ingest to purge?
• How can I efficiently organize and classify all my data?
• How can I efficiently make data available to my end users?
End User Productivity
Effortlessly find and trust the data that matters most
• How can I find and explore data sets on my own?
• Can I trust what I find?
• How do I use what I find?
• How do I find and use related data sets?
Administration
Boost user productivity and cluster performance
• Is my data optimized to support current access patterns?
• How can I optimize for future workloads?
• How can I migrate workloads to Hadoop risk-free?
Hadoop Governance Foundation
Centralized audits, Unified metadata catalog, Comprehensive lineage, Data policies