1
GOVERN THIS!
Data Discovery & the Application of Data Governance
Cloudera and Tableau Software Online Webinar
May 1, 2014
Paul Lilford, Tableau Software
Marc Lobree, Tableau Software
Arlene Boyd, Cloudera
Mark Donsky, Cloudera
2
Agenda
©2014 Cloudera and Tableau Software . All rights reserved.
• Data Governance Requires a New Approach
• From Apache Hadoop to an Enterprise Data Hub
• Enterprise-Grade Governance with Cloudera Navigator
• Data Discovery and the Application of Data Governance
• Live Demo – Tableau Data Discovery
• Live Demo – Cloudera Navigator
• Q&A
3
Polling Question
How do you view existing governance processes?
1. Completely appropriate
2. Effective
3. Ineffective but needed
4. Obstructive
4
Polling Question
Are you a line-of-business user or an IT person?
1. Business user
2. IT admin
5
Hadoop and Cloudera’s EDH:
A New Approach to Data
6
Expanding Data Requires A New Approach
©2014 Cloudera, Inc. All rights reserved.
Then: Bring Data to Compute
Process-centric businesses use:
• Structured data mainly
• Internal data only
• “Important” data only
Now: Bring Compute to Data
Information-centric businesses use all data: multi-structured, internal & external data of all types
[Diagram: Compute and Data nodes in the "Then" and "Now" arrangements]
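As a rough illustration of "bring compute to data", the sketch below runs a small aggregation where each block of records lives and ships only the compact partial results to be combined, instead of copying every record to one central compute node. Everything in it is an assumption made for the example: the plain Python lists standing in for data blocks on separate nodes, the sample values, and the function names `count_local` and `combine`.

```python
from collections import Counter

# Hypothetical stand-in: each inner list is a block of records stored on a different node.
blocks = [
    ["web", "mobile", "web"],
    ["store", "web"],
    ["mobile", "mobile", "store"],
]

def count_local(block):
    """Runs next to the data: summarize one block without moving its records."""
    return Counter(block)

def combine(partials):
    """Runs centrally: merge the small per-block summaries into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

# Only the per-block counts travel, not the records themselves.
partial_counts = [count_local(block) for block in blocks]
channel_counts = combine(partial_counts)
print(dict(channel_counts))
```

The same shape (local work per block, then a small merge step) is what batch frameworks such as MapReduce generalize across a cluster.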
7
From Apache Hadoop to an enterprise data hub
Open Source, Scalable, Flexible, Cost-Effective ✔
Managed ✖
Open Architecture ✖
Secure and Governed ✖
• Batch processing: MapReduce
• Storage for any type of data (unified, elastic, resilient, secure): filesystem (HDFS)
Core Apache Hadoop is great, but…
1) Hard to use and manage.
2) Only supports batch processing.
3) Not comprehensively secure.
8
From Apache Hadoop to an enterprise data hub
Open Source, Scalable, Flexible, Cost-Effective ✔
Managed ✔
Open Architecture ✖
Secure and Governed ✖
• Batch processing: MapReduce
• Storage for any type of data (unified, elastic, resilient, secure): filesystem (HDFS)
• System management: Cloudera Manager
9
From Apache Hadoop to an enterprise data hub
Open Source, Scalable, Flexible, Cost-Effective ✔
Managed ✔
Open Architecture ✔
Secure and Governed ✖
• Batch processing: MapReduce
• Analytic SQL: Impala
• Search engine: Solr
• Machine learning: Spark
• Stream processing: Spark Streaming
• 3rd party apps
• Workload management: YARN
• Storage for any type of data (unified, elastic, resilient, secure): filesystem (HDFS) and online NoSQL (HBase)
• System management: Cloudera Manager
10
From Apache Hadoop to an enterprise data hub
Open Source, Scalable, Flexible, Cost-Effective ✔
Managed ✔
Open Architecture ✔
Secure and Governed ✔
• Batch processing: MapReduce
• Analytic SQL: Impala
• Search engine: Solr
• Machine learning: Spark
• Stream processing: Spark Streaming
• 3rd party apps
• Workload management: YARN
• Storage for any type of data (unified, elastic, resilient, secure): filesystem (HDFS) and online NoSQL (HBase)
• Data management: Cloudera Navigator
• System management: Cloudera Manager
• Sentry
11
From Apache Hadoop to an enterprise data hub
Open Source, Scalable, Flexible, Cost-Effective ✔
Managed ✔
Open Architecture ✔
Secure and Governed ✔
Cloudera’s Enterprise Data Hub
• Batch processing: MapReduce
• Analytic SQL: Impala
• Search engine: Solr
• Machine learning: Spark
• Stream processing: Spark Streaming
• 3rd party apps
• Workload management: YARN
• Storage for any type of data (unified, elastic, resilient, secure): filesystem (HDFS) and online NoSQL (HBase)
• Data management: Cloudera Navigator
• System management: Cloudera Manager
• Sentry
12
Cloudera: Your Trusted Advisor for Big Data
Advance from Strategy to ROI with Best Practices and Peak Performance
• Partners
• Proactive & Predictive Support
• Professional Services
• Training
13
Polling Question
Do you use Hadoop for data discovery?
1. Yes, currently use Hadoop
2. No, but planning to start
3. Currently have no plans
14
Hadoop/EDH Data Management:
Cloudera Navigator
15
Problem Statement
1. Lots of data landing in the enterprise data hub
• Huge quantities with varying levels of sensitivity
• Many different sources – structured & unstructured
2. Many users working with the data in multiple ways
• Users: Compliance Officers, Analysts, Data Scientists, LOB
• Tools: BI tools, ETL tools, Hue, and more
3. Need to effectively control & consume data
• Get visibility & control over the environment
• Discover, explore and consume data
16
Data Management Challenges
Auditing and Access Management
• View, grant, and revoke permissions across the Hadoop stack
• Identify access to a data asset around the time of a security breach
• Generate an alert when a restricted data asset is accessed
Lineage
• Given a data set, trace back to the original source
• Understand the downstream impact of purging/modifying a data set
Metadata Tagging and Discovery
• Search through metadata to find data sets of interest
• Given a data set, view schema, metadata and policies
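The auditing challenges on this slide can be sketched in a few lines of Python. This is a minimal illustration over an in-memory list of made-up audit events; the `AuditEvent` fields, the sample paths and users, and the `RESTRICTED` set are all assumptions for the example and do not reflect Cloudera Navigator's data model or API. It lists who touched a data asset within a window around a suspected breach, and produces an alert line for every access to a restricted asset.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical audit record: who did what to which asset, and when."""
    user: str
    asset: str
    action: str
    at: datetime

# Assumed list of restricted assets and sample events, for illustration only.
RESTRICTED = {"/data/hr/salaries"}

EVENTS = [
    AuditEvent("alice", "/data/sales/orders", "read", datetime(2014, 5, 1, 9, 0)),
    AuditEvent("bob", "/data/hr/salaries", "read", datetime(2014, 5, 1, 9, 40)),
    AuditEvent("carol", "/data/hr/salaries", "read", datetime(2014, 5, 1, 14, 0)),
]

def accesses_near(events, asset, breach_time, window):
    """Return events on `asset` that fall within `window` of `breach_time`."""
    return [e for e in events
            if e.asset == asset and abs(e.at - breach_time) <= window]

def restricted_alerts(events, restricted):
    """Build one alert message per access to a restricted asset."""
    return [f"ALERT: {e.user} {e.action} {e.asset} at {e.at.isoformat()}"
            for e in events if e.asset in restricted]

breach = datetime(2014, 5, 1, 10, 0)
suspects = accesses_near(EVENTS, "/data/hr/salaries", breach, timedelta(hours=1))
alerts = restricted_alerts(EVENTS, RESTRICTED)

for event in suspects:
    print("Access near breach:", event.user, event.at.isoformat())
for line in alerts:
    print(line)
```

In a real deployment the events would come from the platform's audit log rather than a hard-coded list; the filtering logic is the part this sketch is meant to show.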
17
Cloudera Navigator
Data Management Suite for Hadoop and Cloudera’s EDH
• Audit & Access Management: ensuring appropriate permissions & auditing on data access
• Discovery & Exploration: finding out what data is available and what it looks like
• Lineage: tracing data back to its original source
Enterprise Metadata Repository
• Business metadata
• Lineage metadata
• Operational metadata
[Diagram labels: Audit & Access Mgmt; Lineage Metadata; Discovery & Exploration; HDFS, HBase, Hive; Cloudera Navigator; CDH; ETL, DW, DBMS, DM, …; Self Tooling; REST; XMI]
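To make "tracing data back to its original source" concrete, here is a small Python sketch that treats lineage as a directed graph. The data set names and the `PARENTS` mapping are hypothetical, and this is not how Cloudera Navigator stores or exposes lineage; it only illustrates upstream tracing and downstream impact analysis on lineage metadata.

```python
# Hypothetical lineage metadata: each data set maps to the data sets it was derived from.
PARENTS = {
    "raw_clicks": [],
    "raw_orders": [],
    "sessions": ["raw_clicks"],
    "orders_clean": ["raw_orders"],
    "revenue_by_session": ["sessions", "orders_clean"],
}

def original_sources(dataset, parents=PARENTS):
    """Walk upstream until reaching data sets that have no parents."""
    sources, stack, seen = set(), [dataset], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        upstream = parents.get(current, [])
        if not upstream:
            sources.add(current)
        stack.extend(upstream)
    return sources

def downstream_impact(dataset, parents=PARENTS):
    """Find every data set that directly or indirectly depends on `dataset`."""
    impacted, frontier = set(), {dataset}
    while frontier:
        # Children of the current frontier that have not been recorded yet.
        frontier = {child for child, ups in parents.items()
                    if frontier & set(ups)} - impacted
        impacted |= frontier
    return impacted

print(sorted(original_sources("revenue_by_session")))
print(sorted(downstream_impact("raw_clicks")))
```

The second function answers the purge question from the challenges slide: before deleting or modifying `raw_clicks`, it lists everything built on top of it.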
18
Tableau Data Discovery
19
20
Data Discovery: the new way!
• Support the process of discovery and new insights through direct access to data by subject experts
• LOB subject experts, empowered for their subject area
• Active IT support and engagement
• Security is still fundamental, and data is still protected
• Flexibility in governance: this is discovery, not production
• Better-vetted requirements feed production and more highly governed data types
• Help organizations move toward becoming data driven
21
But don’t take our word for it!
• The new normal:
• Business Driven
• Ease of use
• Self reliance
• Visual
22
For Everyone
Ease of use leads to adoption across all departments and use cases
23
Polling Question
What percentage of time would you like to spend in
actual data discovery?
1. 0-10%
2. 10-20%
3. 20-30%
4. 30%+
24
• LIVE DEMO
Tableau Data Discovery
25
• LIVE DEMO
Cloudera Navigator
26
Summary
• Business-driven data discovery is fundamental for all organizations
• Move from insight to action - become data driven
• Flexibility is key, yet so is scalability, integrated management, security, and governance
• Prove it first – data discovery allows you to better vet your solution before you invest
• The discovery layer brings IT and business users together in a collaborative form
27
Questions?
Use the Chat tab on the left side of your screen to submit questions
Watch this webinar on-demand:
www.cloudera.com
Contact Our Presenters:
plilford@tableausoftware.com
aboyd@cloudera.com
Or contact your account team
Thank you for attending!
Connector: Tableau on Cloudera
http://onlinehelp.tableausoftware.com/current/pro/online/en-us/help.htm#examples_hadoop.html
Download Tableau
http://www.tableausoftware.com/
Download CDH – Free Open Source
http://www.cloudera.com/downloads
Cloudera and Tableau:
http://www.cloudera.com/content/cloudera/en/solutions/partner/Tableau.html
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Último

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 

Último (20)

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 

Govern This! Data Discovery and the application of data governance with new stack technologies

  • 1. GOVERN THIS! Data Discovery & the Application of Data Governance. Cloudera and Tableau Software Online Webinar, May 1, 2014. Paul Lilford, Tableau Software; Marc Lobree, Tableau Software; Arlene Boyd, Cloudera; Mark Donsky, Cloudera
  • 2. Agenda: • Data Governance Requires a New Approach • From Apache Hadoop to an Enterprise Data Hub • Enterprise-Grade Governance with Cloudera Navigator • Data Discovery and the Application of Data Governance • Live Demo – Tableau Data Discovery • Live Demo – Cloudera Navigator • Q&A ©2014 Cloudera and Tableau Software. All rights reserved.
  • 3. Polling Question: How do you view existing governance processes? 1. Completely appropriate 2. Effective 3. Ineffective but needed 4. Obstructive
  • 4. Polling Question: Are you a line-of-business person or an IT person? 1. Business user 2. IT admin
  • 5. Hadoop and Cloudera’s EDH: A New Approach to Data
  • 6. Expanding Data Requires a New Approach. Then: Bring Data to Compute. Process-centric businesses use: • Structured data mainly • Internal data only • “Important” data only. Now: Bring Compute to Data. Information-centric businesses use all data: multi-structured, internal & external data of all types. [Diagram: compute and data nodes] ©2014 Cloudera, Inc. All rights reserved.
  • 7. From Apache Hadoop to an enterprise data hub. Open Source, Scalable, Flexible, Cost-Effective ✔; Managed ✖; Open Architecture ✖; Secure and Governed ✖. Stack: BATCH PROCESSING (MAPREDUCE); STORAGE FOR ANY TYPE OF DATA: UNIFIED, ELASTIC, RESILIENT, SECURE; FILESYSTEM (HDFS). Core Apache Hadoop is great, but… 1) Hard to use and manage. 2) Only supports batch processing. 3) Not comprehensively secure.
  • 8. From Apache Hadoop to an enterprise data hub. Open Source, Scalable, Flexible, Cost-Effective ✔; Managed ✔; Open Architecture ✖; Secure and Governed ✖. Stack: BATCH PROCESSING (MAPREDUCE); STORAGE FOR ANY TYPE OF DATA: UNIFIED, ELASTIC, RESILIENT, SECURE; FILESYSTEM (HDFS); SYSTEM MANAGEMENT (CLOUDERA MANAGER).
  • 9. From Apache Hadoop to an enterprise data hub. Open Source, Scalable, Flexible, Cost-Effective ✔; Managed ✔; Open Architecture ✔; Secure and Governed ✖. Stack: BATCH PROCESSING (MAPREDUCE); ANALYTIC SQL (IMPALA); SEARCH ENGINE (SOLR); MACHINE LEARNING (SPARK); STREAM PROCESSING (SPARK STREAMING); 3RD PARTY APPS; WORKLOAD MANAGEMENT (YARN); STORAGE FOR ANY TYPE OF DATA: UNIFIED, ELASTIC, RESILIENT, SECURE; FILESYSTEM (HDFS); ONLINE NOSQL (HBASE); SYSTEM MANAGEMENT (CLOUDERA MANAGER).
  • 10. From Apache Hadoop to an enterprise data hub. Open Source, Scalable, Flexible, Cost-Effective ✔; Managed ✔; Open Architecture ✔; Secure and Governed ✔. Stack: BATCH PROCESSING (MAPREDUCE); ANALYTIC SQL (IMPALA); SEARCH ENGINE (SOLR); MACHINE LEARNING (SPARK); STREAM PROCESSING (SPARK STREAMING); 3RD PARTY APPS; WORKLOAD MANAGEMENT (YARN); STORAGE FOR ANY TYPE OF DATA: UNIFIED, ELASTIC, RESILIENT, SECURE; FILESYSTEM (HDFS); ONLINE NOSQL (HBASE); DATA MANAGEMENT (CLOUDERA NAVIGATOR); SYSTEM MANAGEMENT (CLOUDERA MANAGER); SENTRY.
  • 11. From Apache Hadoop to an enterprise data hub. Open Source, Scalable, Flexible, Cost-Effective ✔; Managed ✔; Open Architecture ✔; Secure and Governed ✔. CLOUDERA’S ENTERPRISE DATA HUB: BATCH PROCESSING (MAPREDUCE); ANALYTIC SQL (IMPALA); SEARCH ENGINE (SOLR); MACHINE LEARNING (SPARK); STREAM PROCESSING (SPARK STREAMING); 3RD PARTY APPS; WORKLOAD MANAGEMENT (YARN); STORAGE FOR ANY TYPE OF DATA: UNIFIED, ELASTIC, RESILIENT, SECURE; FILESYSTEM (HDFS); ONLINE NOSQL (HBASE); DATA MANAGEMENT (CLOUDERA NAVIGATOR); SYSTEM MANAGEMENT (CLOUDERA MANAGER); SENTRY.
  • 12. Cloudera: Your Trusted Advisor for Big Data. Advance from Strategy to ROI with Best Practices and Peak Performance. Partners; Proactive & Predictive Support; Professional Services; Training
  • 13. Polling Question: Do you use Hadoop for data discovery? 1. Yes, currently use Hadoop 2. No, but planning to start 3. Currently have no plans
  • 15. Problem Statement. 1. Lots of data landing in the enterprise data hub: • Huge quantities with varying levels of sensitivity • Many different sources, structured & unstructured. 2. Many users working with the data in multiple ways: • Users: Compliance Officers, Analysts, Data Scientists, LOB • Tools: BI tools, ETL tools, Hue, and more. 3. Need to effectively control & consume data: • Get visibility & control over the environment • Discover, explore and consume data
  • 16. Data Management Challenges. Auditing and Access Management: • View, grant, and revoke permissions across the Hadoop stack • Identify access to a data asset around the time of a security breach • Generate an alert when a restricted data asset is accessed. Lineage: • Given a data set, trace back to the original source • Understand the downstream impact of purging/modifying a data set. Metadata Tagging and Discovery: • Search through metadata to find data sets of interest • Given a data set, view schema, metadata and policies
  • 17. Cloudera Navigator: Data Management Suite for Hadoop and Cloudera’s EDH. Audit & Access Management: ensuring appropriate permissions & auditing on data access. Discovery & Exploration: finding out what data is available and what it looks like. Lineage: tracing data back to its original source. Enterprise Metadata Repository: • Business metadata • Lineage metadata • Operational metadata. [Diagram labels: CLOUDERA NAVIGATOR (Audit & Access Mgmt, Lineage, Metadata, Discovery & Exploration); CDH (HDFS, HBASE, HIVE); ETL, DW, DBMS, DM, …, Self Tooling; REST, XMI]
  • 19. 19
  • 20. Data Discovery the new way! • Support the process of discovery and new insights through direct access to data by subject experts • LOB subject experts (empowered for their subject area) • Active IT support and engagement • Security is still fundamental and data is still protected • Flexibility in governance: this is discovery, not production • Better-vetted requirements feed production and more highly governed data types • Help organizations in the move to become data driven
  • 21. But don’t take our word for it! The new normal: • Business Driven • Ease of use • Self reliance • Visual
  • 22. For Everyone: Ease of use leads to adoption across all departments and use cases
  • 23. Polling Question: What percentage of time would you like to spend in actual data discovery? 1. 0-10% 2. 10-20% 3. 20-30% 4. 30%+
  • 24. LIVE DEMO: Tableau Data Discovery
  • 25. LIVE DEMO: Cloudera Navigator
  • 26. Summary: • Business-driven data discovery is fundamental for all organizations • Move from insight to action: become data driven • Flexibility is key, yet so are scalability, integrated management, security, and governance • Prove it first: data discovery allows you to better vet your solution before you invest • The discovery layer brings IT and business users together in a collaborative form
  • 27. Questions? Use the Chat tab on the left side of your screen to submit a question. Watch this webinar on-demand: www.cloudera.com. Contact our presenters: plilford@tableausoftware.com, aboyd@cloudera.com, or contact your account team. Thank you for attending! Connector: Tableau on Cloudera: http://onlinehelp.tableausoftware.com/current/pro/online/en-us/help.htm#examples_hadoop.html Download Tableau: http://www.tableausoftware.com/ Download CDH (Free, Open Source): http://www.cloudera.com/downloads Cloudera and Tableau: http://www.cloudera.com/content/cloudera/en/solutions/partner/Tableau.html
  • 28. 28 ©2014 Cloudera and Tableau Software . All rights reserved.
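
The auditing challenges listed on slide 16 (finding who accessed a data asset around the time of a breach, and alerting when a restricted asset is touched) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event records, paths, user names, and the `RESTRICTED_PREFIXES` tuple are hypothetical, and none of this is Cloudera Navigator's actual API or audit format.

```python
from datetime import datetime, timedelta

# Hypothetical audit events; a real audit store would have its own schema.
AUDIT_LOG = [
    {"user": "alice", "path": "/data/finance/payroll", "time": datetime(2014, 5, 1, 9, 0)},
    {"user": "bob", "path": "/data/public/weather", "time": datetime(2014, 5, 1, 9, 30)},
    {"user": "carol", "path": "/data/finance/payroll", "time": datetime(2014, 5, 1, 14, 0)},
]

# Hypothetical policy: everything under these prefixes counts as restricted.
RESTRICTED_PREFIXES = ("/data/finance/",)


def accesses_near(log, path, incident_time, window):
    """Return events on `path` within `window` before or after `incident_time`."""
    return [
        event for event in log
        if event["path"] == path and abs(event["time"] - incident_time) <= window
    ]


def restricted_alerts(log, prefixes):
    """Return one alert string per event that touches a restricted path."""
    return [
        f"ALERT: {event['user']} accessed {event['path']}"
        for event in log
        if event["path"].startswith(prefixes)
    ]


if __name__ == "__main__":
    incident = datetime(2014, 5, 1, 9, 15)
    for event in accesses_near(AUDIT_LOG, "/data/finance/payroll", incident, timedelta(hours=1)):
        print(event["user"], event["time"])
    for alert in restricted_alerts(AUDIT_LOG, RESTRICTED_PREFIXES):
        print(alert)
```

Keeping the time-window query and the alert rule as separate functions mirrors the two distinct bullets on the slide: one is an after-the-fact investigation, the other a standing policy check.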

Editor's Notes

  1. Today we're in the middle of a shift in how businesses use information. In the past, you'd define a set of business processes, build applications around each of them, and then go about gathering, conforming, and merging the necessary data sets to support those applications. From an infrastructure perspective, you'd be bringing the data over to the compute, often in relational databases. But you'd be leaving quite a lot on the table. The modern realities of business demand a new approach. Today companies need, more than ever, to become information-driven, but given the amount and diversity of information available, and the rate of change in business, it's simply unsustainable to keep moving around and transforming huge volumes of data.
  2. The foundational platform that's addressing this wide range of problems today is Apache Hadoop, an open source platform for scalable, fault-tolerant data storage and processing that runs on a cluster of industry-standard servers. But Hadoop, in the beginning, wasn't capable of solving these problems. Originally, Hadoop was just a scalable distributed system for storing and processing large amounts of data. You could bring workloads to an effectively limitless amount and variety of data, provided the only kind of work you wanted to do was batch processing by writing Java code, and provided you liked hiring highly-skilled computer scientists to operate it.
  3. Cloudera solved the latter problem with Cloudera Manager, the leading system management application for Apache Hadoop. Customers love Cloudera Manager because it makes the complex simple. Hadoop is more than a dozen services running across many machines, with limitless configuration permutations. With Cloudera Manager, customers can centrally manage and monitor their clusters from a single tool. It provides automated installation and configuration of your cluster. Cloudera Manager is really our many years of Hadoop experience realized in software, and helps you get up and running quickly.
  4. Our customers liked the scalability, flexibility, and economic properties of the platform, but, for example, didn't like that they had to move data out to other MPP analytic databases just to run fast SQL queries, so we built Impala, the world's first open source MPP analytic SQL query engine expressly designed for Hadoop. With Impala, you now have a viable open source alternative to proprietary MPP analytic databases, one that also delivers the core scalability, flexibility, and economic benefits of Hadoop. Now, over the past year we've continued to add to the platform, with Search, and Spark for interactive iterative analytics and stream processing. You also get HBase, the online key-value store, to enable real-time applications on the platform. With this range of diverse ways to access your data in Hadoop, far beyond just Java and MapReduce, you can now bring your existing tools and skill sets to the platform. What's even more exciting is that we've recently made it possible for our partners and other 3rd parties to deploy, manage, and monitor their apps in the platform, again leveraging your existing investments while letting you access an even greater breadth and depth of data, all in one place.
  5. Of course, none of this would matter if the platform weren't reliable, secure, and manageable. * Hadoop today is highly available and Cloudera provides extensions for automated backup and disaster recovery. * Hadoop has had perimeter security for some time but there was a significant gap in the area of fine-grained role-based access controls, the kind you'd expect from a DBMS. That's why, together with the community, we built and contributed the Apache Sentry project which delivers this security for Hive and Impala today, and why we developed Cloudera Navigator to support metadata management, including things like rights auditing, data lineage, and data discovery native to Hadoop. * And all this in addition to the industry-leading system management and customer support you expect from Cloudera.
  6. So you can see a lot has happened in just a few short years. Ultimately what you have here is an enterprise data hub, which has four necessary attributes: * It's Secure and Compliant. In addition to perimeter security and encryption, an EDH offers fine-grained (row and column-level) role-based access controls over data, just like your data warehouse. * It's Governed. You need to understand what data is in your EDH and how it’s used, so an EDH must offer data discovery, data auditing, and data lineage. * It's Unified and Manageable. You need to be able to trust that your data is safe, so an EDH must provide not only native high-availability, fault-tolerance and self-healing storage, but also automated replication and disaster recovery. It must also provide advanced system and data management to enable distributed multi-tenant performance. * And it's Open. As an EDH makes it possible to cost-effectively retain data for decades, you need to ensure that the foundational infrastructure is based on open source software and an open platform for 3rd parties. Open source ensures that you are not locked in to any particular vendor’s license agreement; nobody can hold your data or applications hostage. An open platform ensures that you’re not locked into a particular vendor’s stack and that you have a choice of what tools to use with the EDH; for example, over 200 ISV products, such as Tableau Software, work with Cloudera today. With an enterprise data hub, our customers are able to store and drive real business impact from more data than they'd ever thought possible.
  7. The expansive capabilities of Hadoop and an enterprise data hub (the ability to store, process, and analyze huge quantities of data with varying levels of sensitivity from many different sources: structured, semi-structured, and unstructured) require a robust security capability to manage the range of vulnerabilities that may arise. As data proliferates, many new users of different types require access, and many different types of tools will access the data, raising concerns about ongoing management and compliance. Organizations will need to anticipate how they will ensure data quality throughout the information pipeline, enforce controls that guarantee appropriate access and rights, and move from ungoverned data systems to systems with full administration, visibility, and security that allow them to discover, explore, and consume data with full confidence.
  8. Enter Cloudera Navigator, the first fully integrated data management application for Apache Hadoop, designed to provide all of the capabilities required for administrators, data managers and analysts to secure, govern, classify and explore the large amounts of diverse data in their Hadoop clusters. Control: Navigator provides the system and data control necessary for compliance and risk management teams to ensure that their organization’s policies extend to critical and sensitive data within Hadoop. IT professionals benefit from the simple, centralized management functions offered by Cloudera Manager, so they gain both system and data control from an integrated end-to-end experience. Visibility: Navigator establishes a centralized system for verifying access permissions across all files and directories within Hadoop. Administrators and operations teams can validate their usage and data access policies by confirming individual and group rights and access. Productivity: Analysts, data scientists and business users easily identify data sets of interest and familiarize themselves with the various structures and formats. As a result, they can more quickly generate insights that benefit the business. Reliability: Navigator Lineage capabilities offer the ability to visually trace the progression of a data set from original source(s) to current state. This gives compliance officers, quality managers, executives and anyone else concerned with data cleanliness a high degree of confidence in the reliability of the data they use for reporting or to make decisions.
  9. Tableau's mission is to help people see and understand their data. We have had this mission for over 10 years, and remain completely committed to helping business users discover new insights.
  10. Data discovery has evolved. It has always been part of business, but it was typically done on the desktop or in “business server” environments. Business analysts spend most of their time preparing data to do work, rather than doing the work. Governance was, and is, broken! Business users print, email, duplicate, and extract data assets from all over the organization in an attempt to get their job done. The requirements process of traditional BI tools has failed organizations: 1) too slow; 2) requirements change; 3) relies on a limited few; 4) too inflexible for the needs of the business; 5) costly; and 6) reactive.
  11. We made it for everyone. We made it easy so that anyone would want to adopt it.
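
The lineage idea in note 8 (tracing a data set back to its original sources, and the related slide-16 question of what is affected downstream when a data set is purged or modified) amounts to walking a dependency graph in both directions. The sketch below is a minimal illustration in plain Python; the data set names and the `UPSTREAM` mapping are hypothetical, and it does not reflect how Cloudera Navigator stores or queries lineage metadata.

```python
from collections import deque

# Hypothetical lineage edges: each data set maps to the data sets it was derived from.
UPSTREAM = {
    "sales_report": ["sales_clean"],
    "sales_clean": ["sales_raw", "customer_dim"],
    "customer_dim": ["crm_export"],
    "sales_raw": [],
    "crm_export": [],
}


def trace_to_sources(dataset, upstream):
    """Return the sorted original sources (data sets with no parents) behind `dataset`."""
    seen, sources = {dataset}, set()
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        parents = upstream.get(current, [])
        if not parents:
            sources.add(current)
        for parent in parents:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(sources)


def downstream_impact(dataset, upstream):
    """Return the sorted data sets derived, directly or indirectly, from `dataset`."""
    # Invert the upstream edges so we can walk from a source toward its consumers.
    children = {}
    for child, parents in upstream.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)
    seen = set()
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


if __name__ == "__main__":
    print(trace_to_sources("sales_report", UPSTREAM))
    print(downstream_impact("crm_export", UPSTREAM))
```

A breadth-first walk with a `seen` set keeps the traversal safe even if the recorded lineage happens to contain shared ancestors or cycles.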