Cloudera Navigator
Patrick Angeles

Why You Need Cloudera Navigator

1. Lots of Data Landing in Cloudera Enterprise
• Huge quantities
• Many different sources – structured & unstructured
• Varying levels of sensitivity

2. Many Users Working with the Data
• Administrators & compliance officers
• Analysts & data scientists
• Business users

3. Need to Effectively Control & Consume Data
• Get visibility & control over the environment
• Discover and explore data

Cloudera Navigator
Data Management Layer for Cloudera Enterprise

• Audit & Access Control: ensuring appropriate permissions & reporting on data access for compliance
• Discovery & Exploration: finding out what data is available and what it looks like
• Lineage: tracing data back to its original source
• Lifecycle Management: migration of data based on policies

[Diagram: CLOUDERA NAVIGATOR (Audit & Access Control, Discovery & Exploration, Lineage, Lifecycle Mgmt.); Enterprise Metadata Repository (business metadata, lineage metadata, operational metadata); CDH (HDFS, HBASE, HIVE)]

Cloudera Navigator 1.0
Data Audit & Access Control

• Verify Permissions: view which users and groups have access to files and directories
• Audit Configuration: configuration of audit tracking for HDFS, HBase and Hive
• Audit Dashboard: simple, queryable interface to view data access
• Information Export: export audit information for integration with SIEM tools

[Diagram: CLOUDERA NAVIGATOR 1.0 (ACCESS SERVICE, AUDIT LOG SERVICE); VIEW PERMISSIONS; AUDIT LOG CONFIG; AUDIT LOG COLLECTION; HDFS, HBASE, HIVE; IAM / LDAP SYSTEM; 3rd PARTY SIEM / GRC SYSTEM]

Benefits of Cloudera Navigator 1.0

Control
• Store sensitive data
• Maintain full audit history
• The first & only centralized audit tool for Hadoop

Visibility
• Verify access permissions to files & directories
• Report on data access by user and type

Integration
• View permissions for LDAP/IAM users
• Export audit data for integration with 3rd party SIEM tools

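As a rough illustration of what "report on data access by user and type" could look like once audit data has been exported, here is a minimal Python sketch. The record layout (`user`, `service`, `operation`, `allowed`) and the sample events are hypothetical assumptions for illustration only; they are not Navigator's actual audit schema or export format.

```python
from collections import Counter

# Hypothetical audit events; field names are illustrative assumptions,
# not the schema Cloudera Navigator uses.
AUDIT_EVENTS = [
    {"user": "alice", "service": "HDFS", "operation": "open", "allowed": True},
    {"user": "alice", "service": "HIVE", "operation": "query", "allowed": True},
    {"user": "bob", "service": "HDFS", "operation": "open", "allowed": False},
    {"user": "alice", "service": "HDFS", "operation": "open", "allowed": True},
]


def access_report(events):
    """Count audit events per (user, service, operation) triple."""
    return Counter((e["user"], e["service"], e["operation"]) for e in events)


def denied_by_user(events):
    """Count denied access attempts per user."""
    return Counter(e["user"] for e in events if not e["allowed"])


if __name__ == "__main__":
    for key, count in sorted(access_report(AUDIT_EVENTS).items()):
        print(key, count)
```

Grouping by a plain tuple keeps the sketch independent of any particular SIEM or reporting tool; a real export would need to be mapped onto whatever fields the audit log actually provides.
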
Navigator Subscription

• Data Management Layer for Hadoop
• Centralized audit management & access control
• 8x5 or 24x7 support
• Optional add-on to Cloudera Enterprise subscription

[Diagram labels: Cloudera Enterprise; Navigator Subscription; CLOUDERA SUPPORT; CLOUDERA NAVIGATOR; CLOUDERA MANAGER; CORE PROJECTS; DATA AUDIT; ACCESS MGMT; BASIC FEATURES; ADVANCED FEATURES; CDH; HBASE; IMPALA; SEARCH; BACKUP & DR]

Navigator 2.0 – Q1 2014

Manage and explore your data with Cloudera Navigator 2.0 (Q1 2014)

• Data Discovery (what data do we have?), Annotations/Tags
  • Search, explore, define, and tag data sets.
  • Important for:
    • DBAs/Data Modelers
    • Self-Service Business Analysts
    • Data Scientists

• Data Lineage (where did the data come from? where is it used?)
  • For files and tables, MR jobs, Hive queries, Impala queries, Pig scripts, Sqoop load/export.
  • Important for:
    • Risk and compliance audits.
    • BI users facing 10K tables in HDFS. Which ones are relevant to the source data I need, or the table I’m looking at?
    • Data retention policies, where you need to purge not just the source data, but any data that’s been derived from it.

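The retention case above comes down to a downstream walk over lineage: start at the source data set and collect everything derived from it, directly or indirectly. The following Python sketch shows that idea with a breadth-first traversal. The lineage map and data set names are made up for illustration, and nothing here reflects how Navigator stores or exposes lineage.

```python
from collections import deque

# Hypothetical lineage edges: each key is a data set, each value lists the
# data sets derived directly from it. Names are made up for illustration.
LINEAGE = {
    "raw/customers": ["hive.customers_clean"],
    "hive.customers_clean": ["hive.customer_orders", "impala.customer_summary"],
    "raw/orders": ["hive.customer_orders"],
    "hive.customer_orders": [],
    "impala.customer_summary": [],
}


def derived_from(lineage, source):
    """Return every data set reachable downstream of `source` (breadth-first)."""
    seen = set()
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


if __name__ == "__main__":
    # Everything that would also need purging if raw/customers is purged.
    print(sorted(derived_from(LINEAGE, "raw/customers")))
```

The `seen` set keeps the walk from revisiting a data set that is derived through more than one path, such as a table joined from two sources.
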
Navigator 2.0 - Lineage

• Audit data access
• Verify access privileges
• Search metadata
• Visualize lineage


Big Data Warehousing Meetup: Cloudera Navigator
