1© Cloudera, Inc. All rights reserved.
Cloudera Altus: Big Data in the Cloud Made Easy
Michael Kohs | Sales Engineer | mkohs@cloudera.com
2© Cloudera, Inc. All rights reserved.
Shift to cloud: an analyst view
Source: 451 Research, Voice of the
Enterprise: Workloads and Key Projects,
Cloud Transformation, 2017.
Key Points
▪ Cloud deployments will be
the dominant environment in
every category
▪ Every cloud deployment
environment will see
increases in every workload
category
▪ Analytics and App
Development areas are
expected to see strong gains
3© Cloudera, Inc. All rights reserved.
My organization
is moving to the cloud,
why should we
consider Cloudera?
4© Cloudera, Inc. All rights reserved.
Traditional on-premises deployments perform reasonably well
Strong multi-function support
Strong shared data experience
Strong operational model
Moderate cost management
Moderate tenant isolation
Moderate workload elasticity
Weak on self service
Weak on speed of deployment
Shared Data Experience (Metadata, Security, Governance)
One physical cluster provides a shared data experience to multiple
workloads and tenants
But not good enough for tomorrow
5© Cloudera, Inc. All rights reserved.
Traditional cloud deployments are strong where on-premises is
weak, but at the expense of creating workload silos
Moderate multi-function support
Weak on shared data experience
Weak operational model
Moderate cost management
Strong on tenant isolation
Strong on workload elasticity
Strong on self service
Strong on speed of deployment
Shared Data (S3, ADLS)
Cloud
This is the experience of cloud house offerings
6© Cloudera, Inc. All rights reserved.
Only Cloud deployments with SDX optimize for all design goals
Shared Data Experience (Metadata, Security, Governance)
One logical cluster provides a shared data experience to multiple
workloads and tenants
SDX makes it possible to transfer on-premises design wins to cloud
Shared storage (S3, ADLS)
Cloud
Strong multi-function support
Strong shared data experience
Strong operational model
Strong on cost management
Strong on tenant isolation
Strong on workload elasticity
Strong on self service
Strong on speed of deployment
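The ratings on the three comparison slides can be put side by side in a few lines of Python. This is only a sketch for reading the slides: the ratings are transcribed from them, while the dictionary layout, the numeric scores, and the function names are assumptions made for illustration.

```python
# Ratings transcribed from the three comparison slides.
RATINGS = {
    "on-premises": {
        "multi-function support": "Strong",
        "shared data experience": "Strong",
        "operational model": "Strong",
        "cost management": "Moderate",
        "tenant isolation": "Moderate",
        "workload elasticity": "Moderate",
        "self service": "Weak",
        "speed of deployment": "Weak",
    },
    "traditional cloud": {
        "multi-function support": "Moderate",
        "shared data experience": "Weak",
        "operational model": "Weak",
        "cost management": "Moderate",
        "tenant isolation": "Strong",
        "workload elasticity": "Strong",
        "self service": "Strong",
        "speed of deployment": "Strong",
    },
    "cloud with SDX": {
        goal: "Strong"
        for goal in (
            "multi-function support", "shared data experience",
            "operational model", "cost management", "tenant isolation",
            "workload elasticity", "self service", "speed of deployment",
        )
    },
}

# Assumed numeric ordering of the three rating words.
SCORE = {"Weak": 1, "Moderate": 2, "Strong": 3}


def gaps(model):
    """Design goals that the given deployment model does not rate as Strong."""
    return [goal for goal, rating in RATINGS[model].items() if rating != "Strong"]


def improved_by(model, baseline):
    """Design goals where `model` rates strictly higher than `baseline`."""
    return [
        goal for goal in RATINGS[model]
        if SCORE[RATINGS[model][goal]] > SCORE[RATINGS[baseline][goal]]
    ]


if __name__ == "__main__":
    for name in RATINGS:
        print(name, "->", gaps(name) or "no gaps")
```

Calling `improved_by("traditional cloud", "on-premises")` lists the goals where the slides rate traditional cloud higher, and `gaps("traditional cloud")` lists what it gives up in exchange.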
7© Cloudera, Inc. All rights reserved.
Cloudera’s Public Cloud Reference Architecture
Compute
Spark, Hive, Hive on Spark, Impala, Solr, …
Each workload runs in
an isolated Workload
Cluster
Navigator Metastore, Hive Metastore, Sentry Metastore
SDX services run on
shared RDS | MySQL
Management
Cloudera Manager, Cloudera Navigator, Cloudera Director
Each management
service supports all
Workload and SDX
services
Storage
S3, ADLS
Altus
8© Cloudera, Inc. All rights reserved. 8
The modern platform for machine learning and analytics optimized for the cloud
Cloudera Enterprise
Data Catalog, Security, Governance, Workload Management, Ingest & Replication
Extensible Services
Core Services: Data Engineering, Operational Database, Analytic Database, Data Science
Storage Services: S3, ADLS, HDFS, Kudu
Deployment Options: Private Cloud, Bare Metal, Infrastructure, Services
9© Cloudera, Inc. All rights reserved.
Cloudera Director for cluster lifecycle management
Easy
• Single pane of glass for all cloud
infrastructure
• Create templates to run applications in a pre-optimized
manner
Flexible
• Multi-cloud: AWS, Azure, GCP
• Hourly pricing with auto billing & metering
• Spot instance/block support
Enterprise-grade
• Integration across Cloudera Enterprise
• Management of CDH deployments at scale
• Deeply integrated with Cloudera Manager
10© Cloudera, Inc. All rights reserved.
Cloudera Altus
Multi-cloud foundation for building new cloud services
Altus PaaS Foundation
DATA
ENGINEERING
ANALYTIC
DATABASE
(beta)
Altus Platform Services
11© Cloudera, Inc. All rights reserved.
Director vs Altus - when to use either?
Capability | Altus | Director
Automated log saving | x | -
Automated cluster spin up / down (no extra coding) | x | -
Data Engineering – Hive, Spark, HoS, MR | x | x
Production job driven | x | -
Workload Analytics | x | -
Cluster duration | Purely transient | Transient OR persistent
Job development / exploration | - | x
3rd party installations | - | x
Full control of CM | - | x
Analytical / operational - Impala, HBase, Search | - | x
Persistent (or transient) | - | x
Grow / shrink cluster | - | x
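One way to read the comparison above is as a simple decision rule. The sketch below assumes that the single "x" marks in the first rows belong to the Altus column and the later ones to the Director column, which matches the order of the "Cluster Duration" row; the requirement names and the `recommend` function are hypothetical and not part of any Cloudera tool.

```python
# Capabilities the slide marks for one product only (as read here),
# and the one capability it marks for both.
ALTUS_ONLY = {
    "automated_log_saving",
    "automated_cluster_spin_up_down",
    "production_job_driven",
    "workload_analytics",
}
DIRECTOR_ONLY = {
    "job_development_exploration",
    "third_party_installations",
    "full_control_of_cm",
    "analytical_operational_workloads",  # Impala, HBase, Search
    "persistent_cluster",
    "grow_shrink_cluster",
}
BOTH = {"data_engineering"}  # Hive, Spark, HoS, MR


def recommend(requirements):
    """Suggest Altus, Director, both, or either for a set of requirements."""
    needs = set(requirements)
    unknown = needs - ALTUS_ONLY - DIRECTOR_ONLY - BOTH
    if unknown:
        raise ValueError(f"unknown requirements: {sorted(unknown)}")
    wants_altus = bool(needs & ALTUS_ONLY)
    wants_director = bool(needs & DIRECTOR_ONLY)
    if wants_altus and wants_director:
        return "both"  # no single column covers every requirement
    if wants_altus:
        return "Altus"
    if wants_director:
        return "Director"
    return "either"


if __name__ == "__main__":
    print(recommend({"data_engineering", "workload_analytics"}))
    print(recommend({"full_control_of_cm", "grow_shrink_cluster"}))
```

A result of "both" simply means the requirements span the two columns, which is the mixed setup the later architecture slides show.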
12© Cloudera, Inc. All rights reserved.
Altus Service Architecture (AWS)
● Runs in Cloudera’s secured and monitored environment
● Manages CDH clusters in customer cloud account
● Customer data does not pass to Cloudera (Workload Analytics
requires opt-in log data transfer to Cloudera)
13© Cloudera, Inc. All rights reserved.
Altus Service Architecture (Azure)
● Runs in Cloudera’s secured and monitored environment
● Manages CDH clusters in customer cloud account
● Customer data does not pass to Cloudera (Workload Analytics
requires opt-in log data transfer to Cloudera)
14© Cloudera, Inc. All rights reserved.
Altus Data Engineering
for ETL, machine learning, and data processing
• Fast, easy job submission without the
cluster management
• Built-in Workload Analytics for
troubleshooting and optimization
• Lower costs with transient resources
and pay-per-use pricing
• Full benefits of isolation + Shared Data
Experience
15© Cloudera, Inc. All rights reserved.
End-user focused with jobs as first-class
objects
Workload troubleshooting and
analytics
● Troubleshoot jobs after cluster
termination through job log and
configuration browsing
● Insight into causes of job failure
● Identification and root cause analysis
of slow jobs
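The idea of jobs as first-class objects can be sketched as a small data model in which logs and configuration live on the job rather than on the cluster, so they remain browsable after the cluster is terminated. Everything here (class names, fields, the failing job) is hypothetical and only illustrates the concept; it is not the Altus API.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """A job keeps its own configuration and logs."""
    name: str
    config: dict
    log_lines: list = field(default_factory=list)
    succeeded: bool = False


@dataclass
class TransientCluster:
    """A cluster that exists only for the duration of its jobs."""
    name: str
    running: bool = True

    def run(self, job, work):
        """Run `work` for `job` and record the outcome on the job itself."""
        if not self.running:
            raise RuntimeError("cluster already terminated")
        try:
            work()
            job.succeeded = True
            job.log_lines.append("job finished")
        except Exception as exc:
            job.log_lines.append(f"job failed: {exc}")

    def terminate(self):
        self.running = False


def failing_work():
    # Stand-in for a real job that fails on a missing input.
    raise ValueError("input path not found")


cluster = TransientCluster("etl-nightly")
job = Job("clean-weblogs", {"executor_memory": "4g"})
cluster.run(job, failing_work)
cluster.terminate()

# The cluster is gone, but the job's log and configuration can still be inspected.
if __name__ == "__main__":
    print(job.log_lines, job.config)
```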
16© Cloudera, Inc. All rights reserved.
Capture metadata spanning multiple clusters
Persist metadata with
Cloudera Navigator
● Export metadata and lineage
information from Altus clusters
● Insight into full data management
pipeline including transient
clusters
17© Cloudera, Inc. All rights reserved.
Three immediate use cases for Altus Data Engineering
ETL FOR ANALYTIC DB
Cloud-native batch preparation for Impala on IaaS or, soon, Altus Analytic DB.
BATCH MACHINE LEARNING
Scalable compute for massively parallel batch machine learning training, scoring, or simulation.
ETL OFFLOAD
Offload batch processing jobs from overburdened on-premises clusters.
Diagram labels: ML, Data Science, ETL, Analytic DB, ETL, On-Prem
18© Cloudera, Inc. All rights reserved.
Altus Analytic Database
The first data warehouse cloud service to bring the warehouse to the data - delivering instant
analytics to anyone
For business analysts:
• Query with predictable performance, at
any time, without risking SLAs
• Bring limitless new users and use cases
with instant self-service analytic access
• Data available for broad access (SQL, BI
tools, Python, R, etc.)
For IT:
• Easily and elastically provision isolated
resources as and when they’re needed
• Simple multi-tenant management including
federated identity and consistent governance
• Eliminate data movement and copies across
workloads
19© Cloudera, Inc. All rights reserved.
Long Running Cluster:
Director Managed
Elastic Cluster(s):
Director Managed
/ Altus Managed
ETL to Cloud-Native Analytic Database
SDX: Schema (HMS), Security (Sentry), Lineage/Audit (Navigator)
Data Engineering Workflow: ETL
Source
Data
1+ Jobs
(e.g. Hive)
1+ Jobs
(e.g. Spark)
Analytic Database
(Impala)
Source
Data
BI Tools
(e.g. Tableau)
SQL Editor
(e.g. Hue)
Transient Cluster(s):
Altus Managed
Workflow, Monitoring, and Scheduling (e.g. AutoSys, ControlM, Airflow, Talend, Informatica)
Object Store (ADLS, S3)
20© Cloudera, Inc. All rights reserved.
The Scenario
My Role: Data Analyst at DataCo – a Sports Retailer
Business Issue: Experiencing lower than expected website sales. Why?
Technical Issues:
• I have an on-premises data warehouse that contains my sales order data, but it is
very old and slow, and ad hoc queries on it are difficult.
• My clickstream data is too big to ingest into my data warehouse.
Requirements:
• Need an Analytic Database to do ad hoc queries on order data
• Need a temporary platform to process weblogs once a day
• Ability to join processed weblogs to order data
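The last requirement, joining processed weblogs to order data, can be illustrated with a small in-memory sketch. The sample page views, the orders, and the rule that the last URL segment names the product are all invented for illustration; in the scenario itself this join would run as SQL in the Analytic Database.

```python
from collections import Counter

# Hypothetical processed weblog records: one product-page URL per view.
page_views = [
    "/product/tent", "/product/tent", "/product/tent",
    "/product/kayak", "/product/kayak",
    "/product/helmet",
]

# Hypothetical order data from the warehouse: (product, quantity).
orders = [("tent", 1), ("helmet", 1), ("helmet", 2)]


def views_per_product(urls):
    """Count views per product, taking the last URL segment as the product."""
    return Counter(url.rsplit("/", 1)[-1] for url in urls)


def units_per_product(order_rows):
    """Sum the ordered quantity per product."""
    totals = Counter()
    for product, quantity in order_rows:
        totals[product] += quantity
    return totals


def join_views_to_orders(urls, order_rows):
    """Left-join views to orders on product; products never ordered get 0 units."""
    views = views_per_product(urls)
    units = units_per_product(order_rows)
    rows = [(product, count, units.get(product, 0)) for product, count in views.items()]
    # Most-viewed products first, so viewed-but-unsold items stand out.
    return sorted(rows, key=lambda row: (-row[1], row[0]))


report = join_views_to_orders(page_views, orders)

if __name__ == "__main__":
    for product, viewed, sold in report:
        print(f"{product}: {viewed} views, {sold} units sold")
```

Products with many views but few or no units sold are the natural starting point for the "why are website sales low?" question.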
21© Cloudera, Inc. All rights reserved.
22© Cloudera, Inc. All rights reserved.
Demo - Retail clickstream analysis
Semi-structured weblogs [IP, date, method, url, http_version, code1, code2, dash, user_agent]
→ Data engineering: cleaning, formatting, filtering
→ Well-formatted data
→ Analytic database: SQL reporting, self-service BI
→ Insights
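The cleaning, formatting, and filtering step can be sketched in Python as follows. This is a minimal illustration, not the demo's actual job: only the field names come from the slide. The sketch assumes the weblogs use the Apache combined log format, so that `code1` is the status code, `code2` the response size, and `dash` the referrer field; the regular expression, the sample lines, and the GET-only filter are likewise assumptions.

```python
import re

# Field names as listed on the slide.
FIELDS = ["ip", "date", "method", "url", "http_version",
          "code1", "code2", "dash", "user_agent"]

# Assumed layout: Apache combined log format.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+) (?P<http_version>[^"]+)" '
    r'(?P<code1>\d{3}) (?P<code2>\d+|-) '
    r'"(?P<dash>[^"]*)" "(?P<user_agent>[^"]*)"$'
)


def parse_line(line):
    """Return a dict of the slide's fields, or None if the line is malformed."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    return {name: match.group(name) for name in FIELDS}


def clean(lines):
    """Drop malformed lines and keep only GET requests (an assumed filter)."""
    records = (parse_line(line) for line in lines)
    return [r for r in records if r is not None and r["method"] == "GET"]


# Hypothetical sample input.
SAMPLE = [
    '10.0.0.1 - - [14/Jun/2014:10:30:13 -0400] "GET /department/fitness HTTP/1.1" 200 1120 "-" "Mozilla/5.0"',
    'not a log line',
    '10.0.0.2 - - [14/Jun/2014:10:31:02 -0400] "POST /checkout HTTP/1.1" 302 0 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for record in clean(SAMPLE):
        print(record["ip"], record["url"], record["code1"])
```

Each surviving record is a flat dictionary keyed by the slide's field names, which is the shape a downstream SQL table would need.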
23© Cloudera, Inc. All rights reserved.
Object store
Cloudera Director
Long-running Kafka cluster
Long-running stream processing cluster
Long-running Analytic DB cluster (Impala)
Click stream data
Transient Data Engineering clusters (Altus)
Cloudera Altus
Cloudera Manager
HDFS on premises
Self-Service Analytic DB clusters
Cloudera Altus
Shared Data Experience (Metadata, Security, Governance)
24© Cloudera, Inc. All rights reserved.
Q&A
25© Cloudera, Inc. All rights reserved.
Resources
Cloudera Altus
Cloudera Altus documentation
Cloudera Director
Cloudera Director documentation
Cloudera SDX
Try Cloudera Director on Microsoft Azure
Try Cloudera Director on AWS with AWS Quickstart
Cloudera Reference Architectures for public and private cloud deployments
