1. Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache Knox
Sandeep Moré, Sr. Software Engineer
Kiran Matty, Sr. Product Manager
06/19/18
© Hortonworks Inc. 2011–2018. All rights reserved
2. Agenda
• Multi-Cluster Hybrid Cloud Data Lakes
• Apache Knox
• Demo
• Q&A
3. Who are We?
Sandeep Moré
• Apache Knox PMC member
• Sr. Software Engineer @ Hortonworks
• Software Engineer / Security Gateway, Intel
Kiran Matty
• PM @ Hortonworks: Apache Knox, HDP Search/Solr, and Platform Security
• Big Data Analytics and Security @ startup, HPE, and Cisco
4. Multi-Cluster Hybrid Cloud Data Lakes
5. Why Hybrid Cloud?
[Diagram: a unified security & governance model spanning two data lakes]
Data Lake 1, San Jose: Cluster 1 (Structured), Cluster 2 (Unstructured), Cluster 3 (Structured), Cluster 4 (Unstructured)
Data Lake 2, UK: Cluster 1 (Unstructured), Cluster 2 (Structured)
Workloads (typical)
Criterion | On-prem | Cloud
Compliance | Sensitive | Non-sensitive
Flexibility | Production | Test/Demo
Cost Optimization | Fixed | Variable
Best Practice: Run your analytics workloads where data is stored
6. Need to augment existing security controls offered by Cloud Providers for Hadoop Workloads
Security Control | AWS | Azure | GCP
Network Isolation | Virtual Private Cloud (VPC) | Microsoft Azure Virtual Network (VNet) | Virtual Private Cloud (VPC) network
Network Security | Security Groups | Network Access Control List (NACL) and Network Security Groups (NSGs) | Firewall rules
Identity Management | Identity and Access Management (IAM) | Azure Active Directory (AAD) | Google Cloud Identity and Access Management (Cloud IAM)
7. A Few Issues across the Hybrid Cloud Data Lakes
How To:
• authenticate cloud users without moving your on-prem LDAP to the cloud?
• keep unauthorized users from accessing your customer data, i.e., an insider attack?
• protect your clusters from stolen credentials, i.e., account hijacking?
8. AuthN Challenges: Options for Connecting to On-prem Active Directory
[Diagram: three numbered options connecting apps and AD across the corporate DC and the cloud. Recoverable labels: AD replication, VPN, and domain join to on-prem AD over VPN]
10. Apache Knox Gateway is…
• an extensible reverse proxy framework
• that can be deployed in the cloud or on-prem
• for securely exposing REST APIs, HTTP, and WebSocket-based services
• and out of the box it provides:
  • Proxying of HTTP services - REST, UIs, WebSockets
  • Authentication services - pluggable authentication and federation providers, plus token and SSO services
  • Client services - KnoxShell for consuming cluster services through Knox
  • And many other features…
11. Apache Knox Gateway is NOT…
• a Firewall
• a Load balancer
• a Kerberos replacement
12. Why Knox?
Simplified Access
• Kerberos encapsulation
• Extends API reach
• Single access point
• Multi-cluster support
• Single SSL certificate
Centralized Control
• Auditing
• Service-level authorization
• Knox Admin UI
• Service Discovery and Topology Generation Framework
Enterprise Integration
• LDAP/AD integration
• Support for SAMLv2
• SSO integration
Enhanced Security
• Proxy to abstract network details
• TLS Termination for non-SSL services
13. Apache Knox Community Snapshot
Timeline
• Mar 2013: Entered Incubator
• Oct 2013: 0.1.0 - 0.3.0 Incubator Releases
• Feb 2014: Graduates to Apache TLP
• Apr 2014: 0.4.0 TLP Release
• Nov 2014: 0.5.0
• May 2015: 0.6.0
• Dec 2015: 0.7.0
• Feb 2016: 0.8.0
• Apr/Aug 2016: 0.9.0/0.9.1
• Nov 2016: 0.10.0
• Dec 2016: 0.11.0
• Mar 2017: 0.12.0
• Aug 2017: 0.13.0
• Dec 2017: 0.14.0
• Feb 2018: 1.0
Community
• Committers: 20
• Contributors from: Hortonworks, IBM, CGI, Uber, Oracle, Blue Talon, Microsoft, Talend
• @apache_knox
Apache Knox 0.14.0
• Service Discovery and Topology Generation Framework
• Add support for proxying NiFi and Livy (Spark REST Service)
• High Availability Support for Apache Solr, HBase & Kafka
Apache Knox 1.0.0
• Ambari Service Discovery Support for HA-Enabled Services
• Update Hadoop dependencies to Hadoop 3
15. Demo Coverage
How To:
• authenticate users without moving your on-prem LDAP to the cloud?
  • Knox Federation
• keep unauthorized users from accessing your customer data, i.e., an insider attack?
  • Knox AuthZ
• protect your clusters from stolen credentials, i.e., account hijacking?
  • MFA* on Knox
*no out-of-the-box support
16. Knox Providers - Primer
• Providers add new features to the gateway
• These features can be used by all services
• Example providers used for federation:
• Auth Provider - Knox Federation
Header Based Pre Auth
<provider>
    <role>federation</role>
    <name>HeaderPreAuth</name>
    <enabled>true</enabled>
    <param>
        <name>preauth.custom.header</name>
        <value>aws_user</value>
    </param>
</provider>
• Authorization Provider - Knox AuthZ
AclsAuthz
<provider>
    <role>authorization</role>
    <name>AclsAuthz</name>
    <enabled>true</enabled>
    <param>
        <name>hive.acl</name>
        <value>*;sales;*</value>
    </param>
</provider>
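The hive.acl value above packs three fields into one string: users, groups, and IP addresses, separated by semicolons, with * as a wildcard. The sketch below is not Knox code; it is a minimal Python model of how such a value could be evaluated, assuming all three fields must match (believed to be the default mode; check the Knox documentation) and assuming exact, case-sensitive comparison. The names Acl, parse_acl, and is_allowed are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acl:
    users: frozenset
    groups: frozenset
    ips: frozenset


def parse_acl(value: str) -> Acl:
    """Split a 'users;groups;ips' string; each field is a comma-separated list."""
    parts = [p.strip() for p in value.split(";")]
    if len(parts) != 3:
        raise ValueError("expected 'users;groups;ips'")
    users, groups, ips = (
        frozenset(item.strip() for item in part.split(",") if item.strip())
        for part in parts
    )
    return Acl(users, groups, ips)


def _matches(allowed: frozenset, actual) -> bool:
    # '*' accepts anything; otherwise at least one value must be listed.
    return "*" in allowed or bool(allowed & set(actual))


def is_allowed(acl: Acl, user: str, groups, ip: str) -> bool:
    # Assumption: AND semantics, i.e. user, group, and IP must all match.
    return (
        _matches(acl.users, [user])
        and _matches(acl.groups, groups)
        and _matches(acl.ips, [ip])
    )


acl = parse_acl("*;sales;*")
print(is_allowed(acl, "some_user", ["sales"], "10.0.0.5"))
print(is_allowed(acl, "some_user", ["devops"], "10.0.0.5"))
```

Under these assumptions, *;sales;* admits any user from any address as long as the user belongs to the sales group, and rejects members of other groups.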
17. Knox Cloud Federation
• Part of KIP-11: Cloud use cases
• KNOX-1339 - Support for cloud federation
• Leverages Knox Header Based Pre Auth provider
• JDBC / Beeline / REST
• JDBC + KnoxShell for demo
• Federation Dispatch:
<dispatch classname="org.apache.knox.gateway.dispatch.HeaderPreAuthFederationDispatch" use-two-way-ssl="true" />
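To make header-based pre-authentication concrete, the sketch below builds (but does not send) the kind of request a trusted upstream gateway might forward: the already-authenticated user name travels in the custom header configured as preauth.custom.header (aws_user in the provider example). This is an illustration in Python, not the Knox dispatch implementation; the function name, host name, and topology name are hypothetical. Because the receiving gateway trusts the header value, the link has to be protected, which is why the dispatch above sets use-two-way-ssl="true"; the TLS setup is not shown here.

```python
import urllib.request


def build_preauth_request(url: str, user: str, header: str = "aws_user") -> urllib.request.Request:
    """Build a GET request that carries the pre-authenticated user in a header.

    The default header name mirrors the preauth.custom.header value from the
    provider example; everything else here is illustrative.
    """
    if not user or any(c in user for c in "\r\n"):
        raise ValueError("user must be a non-empty, single-line value")
    req = urllib.request.Request(url, method="GET")
    req.add_header(header, user)
    return req


# Hypothetical host and topology name, used only for illustration.
req = build_preauth_request(
    "https://cloud-knox.example.com:8443/gateway/demo/webhdfs/v1/tmp?op=LISTSTATUS",
    "michelle",
)
# urllib normalizes header capitalization; HTTP header names are case-insensitive.
headers = {name.lower(): value for name, value in req.header_items()}
print(headers)
```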
18. Demo Personas
Kate
LDAP Group: DevOps
Cluster Access: Prod and Demo
AWS IAM user
Michelle (Malicious Insider)
LDAP Group: Sales
Cluster Access: Demo
Not AWS IAM user
Maximus (Hacker)
19. Demo Architecture
[Diagram: two clusters, Prod (on-prem) and Demo (cloud). Prod runs Ambari, HDFS, Hive, Knox, and LDAP; Demo runs Ambari, HDFS, Hive, and Knox. Labels: 2-way, Inbound: 8443 on each side, and the clients JDBC Client and Knoxline]
20. Scenario 1: Access on-prem cluster
[Diagram: one cluster with Ambari, HDFS, Hive, Knox, and LDAP; the other with Ambari, HDFS, Hive, and Knox]
1. Hive (JDBC)
2. Authenticate Kate
3. Access HDFS
4. HDFS Response
5. Response
21. Scenario 2: Access cloud cluster by AuthN w/ on-prem LDAP
[Diagram: one cluster with Ambari, HDFS, Hive, Knox, and LDAP; the other with Ambari, HDFS, Hive, and Knox]
1. GET WebHDFS
2. Authenticate Michelle
3. Dispatch request to Cloud Knox
4. Header based pre auth
5. Access HDFS
6. HDFS Response
7. Response
8. Response
Knox Federation
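Step 1 of this scenario is a WebHDFS GET issued through the gateway rather than against HDFS directly. Knox exposes WebHDFS under a URL of the form https://{host}:{port}/{gateway-path}/{topology}/webhdfs/v1/{path}. The Python sketch below only assembles such a URL; the function name, host, and topology name are hypothetical, port 8443 is taken from the demo architecture, and "gateway" is assumed as the gateway path.

```python
from urllib.parse import quote, urlencode


def knox_webhdfs_url(
    host: str,
    topology: str,
    hdfs_path: str,
    op: str = "LISTSTATUS",
    port: int = 8443,
    gateway_path: str = "gateway",
) -> str:
    """Assemble a WebHDFS URL that goes through a Knox gateway."""
    # Percent-encode the HDFS path but keep '/' separators intact.
    path = quote(hdfs_path.lstrip("/"))
    query = urlencode({"op": op})
    return f"https://{host}:{port}/{gateway_path}/{topology}/webhdfs/v1/{path}?{query}"


# Hypothetical host and topology name, used only for illustration.
print(knox_webhdfs_url("onprem-knox.example.com", "prod", "/tmp"))
```

The client only ever sees the gateway address; which cluster actually serves the request is decided by the topology and its dispatch configuration.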
22. Scenario 3: Blocking Michelle's Unauthorized Access
• AuthN
• Run a Hive query against the customer DB to get names and phone numbers
• Load into CSV file
• Exfiltrate via USB drive
23. Scenario 3: Blocking Michelle's Unauthorized Access
[Diagram: one cluster with Ambari, HDFS, Hive, Knox, and LDAP; the other with Ambari, HDFS, Hive, and Knox]
1. Hive (JDBC)
2. Authorization failure
3. 403 Forbidden
Knox AuthZ
24. Scenario 4: Thwarting Maximus's Kill Chain
• Harvest Kate's credentials from GitHub via social engineering
• Create an exploit to scan and identify sensitive tables, and exfiltrate to EC2 server
• AuthN using Kate's stolen credentials
• Install the exploit to scan sensitive tables
• Chunk data and send to C2 server
• Request for Ransom
25. Scenario 4: Thwarting Maximus's Kill Chain
MFA* on Knox
katec@newcor.com
* No out-of-the-box support for MFA