We’ll get started soon… 
Q&A box is available for your questions 
Webinar will be recorded for future viewing 
Thank you for joining! 
© Hortonworks Inc. 2011 – 2014. All Rights Reserved
Combine SAS High-Performance Capabilities with Hadoop YARN
We do Hadoop.
Your speakers… 
Arun Murthy, Founder and Architect 
Hortonworks 
@acmurthy 
Paul Kent, Vice President Big Data 
SAS 
@hornpolish 
Agenda 
• Introduction to YARN 
• SAS Workloads on the Cluster 
• SAS Workloads: Resource Settings 
• SAS and YARN 
• YARN Futures 
• Next Steps 
The 1st Generation of Hadoop: Batch
HADOOP 1.0: Built for Web-Scale Batch Apps
• All other usage patterns must leverage that same infrastructure
• Forces the creation of silos for managing mixed workloads
[Diagram: separate single-app silos labeled BATCH, INTERACTIVE, and ONLINE, sitting on HDFS]
Hadoop MapReduce Classic 
JobTracker 
§ Manages cluster resources and job scheduling 
TaskTracker 
§ Per-node agent 
§ Manages tasks
MapReduce Classic: Limitations 
Scalability 
§ Maximum Cluster size – 4,000 nodes 
§ Maximum concurrent tasks – 40,000 
§ Coarse synchronization in JobTracker 
Availability 
§ Failure kills all queued and running jobs 
Hard partition of resources into map and reduce slots 
§ Low resource utilization 
Lacks support for alternate paradigms and services 
§ Iterative applications implemented using MapReduce are 10x slower 
Our Vision: Hadoop as Next-Gen Platform
Hadoop 1
• Silos & largely batch
• Single processing engine
[Diagram: MapReduce (Cluster Resource Management & Data Processing) on HDFS (Hadoop Distributed File System), nodes 1 … N, with Script (Pig), SQL (Hive), and Others on top]
Hadoop 2
• Multiple engines, single data set
• Batch, interactive & real-time
[Diagram: YARN: Data Operating System (Cluster Resource Management) on HDFS (Hadoop Distributed File System), nodes 1 … N, with Script (Pig), SQL (Hive), Java (Cascading), Real-time (HBase), Others (Accumulo, Storm, Solr, Spark), and ISV Engines on top; Tez appears beneath several of these engines]
YARN: Taking Hadoop Beyond Batch
Store ALL DATA in one place…
Interact with that data in MULTIPLE WAYS
with Predictable Performance and Quality of Service
Applications Run Natively IN Hadoop
• BATCH (MapReduce)
• INTERACTIVE (Tez)
• STREAMING (Storm, S4, …)
• GRAPH (Giraph)
• IN-MEMORY (Spark)
• HPC MPI (OpenMPI)
• ONLINE (HBase)
• OTHER (Search, Weave, …)
YARN (Cluster Resource Management)
HDFS2 (Redundant, Reliable Storage)
Hortonworks Data Platform
[Diagram: engines sharing one cluster: Batch (MR), Script (Pig), SQL (Hive), Java (Cascading), NoSQL (HBase, Accumulo), Stream (Storm), In-Memory (Spark), PaaS (Kubernetes), LASR, HPA, and other engines, with Tez and Slider layers beneath some of them. All run on YARN: Data Operating System (Cluster Resource Management) over HDFS (Hadoop Distributed File System), nodes 1 … N]
5 Key Benefits of YARN 
1. Scale 
2. New Programming Models & Services
3. Improved cluster utilization 
4. Agility 
5. Beyond Java 
Concepts 
Application 
§ An application is a temporal job or a service submitted to YARN
§ Examples
– MapReduce job (job)
– HBase cluster (service)
Container 
§ Basic unit of allocation 
§ Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.)
– container_0 = 2GB, 1CPU 
– container_1 = 1GB, 6 CPU 
§ Replaces the fixed map/reduce slots 
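The container idea can be sketched in a few lines of shell. This is a toy first-fit check, not YARN's scheduler: the node size (8 GB, 8 CPUs), the `try_allocate` helper, and the third request are all made up for illustration. Only the sizes of container_0 and container_1 come from the slide.

```shell
#!/bin/sh
# Illustrative only: a toy first-fit check, not YARN's actual scheduling logic.
# Hypothetical node capacity: 8 GB of memory and 8 CPUs.
free_mem_gb=8
free_cpus=8

# try_allocate NAME MEM_GB CPUS
# Grants the request only if both resource types are still available.
try_allocate() {
    if [ "$2" -le "$free_mem_gb" ] && [ "$3" -le "$free_cpus" ]; then
        free_mem_gb=$((free_mem_gb - $2))
        free_cpus=$((free_cpus - $3))
        echo "$1 granted: ${2}GB, ${3} CPU"
    else
        echo "$1 rejected: node has ${free_mem_gb}GB, ${free_cpus} CPU free"
        return 1
    fi
}

try_allocate container_0 2 1            # sizes from the slide
try_allocate container_1 1 6            # sizes from the slide
try_allocate container_2 4 2 || true    # hypothetical; only 1 CPU is left, so it does not fit
echo "remaining: ${free_mem_gb}GB, ${free_cpus} CPU"
```

Because each request names its own memory and CPU amounts, a memory-heavy and a CPU-heavy container can share a node, which fixed map/reduce slots could not express.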
Design Centre 
Split up the two major functions of JobTracker 
§ Cluster resource management 
§ Application life-cycle management 
MapReduce becomes a user-land library
YARN Architecture - Walkthrough
[Diagram: Client2, the ResourceManager with its Scheduler, and a grid of NodeManagers hosting AM 1 with Containers 1.1, 1.2, and 1.3, and AM2 with Containers 2.1, 2.2, 2.3, and 2.4]
Multi-Tenancy with YARN 
Economics as queue-capacity 
§ Hierarchical Queues
SLAs 
§ Preemption 
Resource Isolation 
§ Linux: cgroups 
§ MS Windows: Job Control 
§ Roadmap: Virtualization (Xen, KVM) 
Administration 
§ Queue ACLs 
§ Run-time re-configuration for queues 
§ Charge-back 
[Diagram: ResourceManager Scheduler running the Capacity Scheduler with hierarchical queues under root. Queue shares shown, grouped by level: Adhoc 10%, DW 70%, Mrkting 20%; Dev 10%, Reserved 20%, Prod 70%; Prod 80%, Dev 20%; P0 70%, P1 30%]
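With hierarchical queues, each percentage is a share of the parent queue, so a leaf queue's share of the whole cluster is the product of the percentages along its path. The sketch below works that arithmetic through for the diagram's numbers. The parentage is an assumption (root -> DW -> Prod -> P0/P1, and root -> Mrkting -> Prod), since the flattened diagram does not show which queue sits under which, and `abs_capacity` is a hypothetical helper, not a YARN command.

```shell
#!/bin/sh
# abs_capacity PCT [PCT...]
# Multiplies the per-level percentages along a queue path, starting at root.
# Works in hundredths of a percent so plain integer arithmetic is enough.
# The body runs in a subshell, so its variables do not leak to the caller.
abs_capacity() (
    result=10000    # 100.00% of the cluster
    for pct in "$@"; do
        result=$((result * pct / 100))
    done
    printf '%d.%02d%%\n' "$((result / 100))" "$((result % 100))"
)

# Assumed paths (the parentage is a guess from the slide's grouping):
abs_capacity 70          # root/DW
abs_capacity 70 70       # root/DW/Prod
abs_capacity 70 70 70    # root/DW/Prod/P0
abs_capacity 70 70 30    # root/DW/Prod/P1
abs_capacity 20 80       # root/Mrkting/Prod
```

Under that assumed tree, P0 would be guaranteed roughly a third of the cluster even though its own label says 70%.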
YARN Applications 
Data processing applications and services 
§ Services - Slider 
§ Real-time event processing – Storm, S4, other commercial platforms 
§ Tez – Generic framework to run a complex DAG 
§ MPI: OpenMPI, MPICH2 
§ Master-Worker 
§ Machine Learning: Spark 
§ Graph processing: Giraph 
§ Enabled by allowing the use of paradigm-specific application masters
Run all on the same Hadoop cluster!
SHARE! 
Customers are: 
wrapping up POCs 
building Bigger Clusters 
assembling their Data { Lake, Reservoir } 
wanting their software to SHARE the cluster
Copyright © 2014, SAS Institute Inc. All rights reserved.
SAS Workloads on the Cluster 
SAS Workloads on the Cluster - Video 
SAS Workloads on the Cluster 
Some Requests are for a significant slice of the cluster 
Reservation will be ALL DAY, ALL WEEK, ALL MONTH? 
Memory typically fixed (15% of cluster) 
CPU floor, would like the spare capacity when available 
Some Requests are more short term 
Memory can be estimated 
Duration can be capped 
CPU floor, would like spare capacity 
SAS Workloads – Resource Settings 
How much should you reserve? 
not a perfect science yet 
Long Running? 
LASR server by percent of total memory 
More like a batch request? 
HPA procedure by anecdotal experience 
SAS Workloads – Resource Settings 
if [ "$USER" = "lasradm" ]; then
    # Custom settings for running under the lasradm account.
    export TKMPI_ULIMIT="-v 50000000"
    export TKMPI_MEMSIZE=50000
    export TKMPI_CGROUP="cgexec -g cpu:75"
fi
# if [ "$TKMPI_APPNAME" = "lasr" ]; then
#     # Custom settings for a lasr process running under any account.
#     export TKMPI_ULIMIT="-v 50000000"
#     export TKMPI_MEMSIZE=50000
#     export TKMPI_CGROUP="cgexec -g cpu:75"
# fi
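One way to arrive at numbers like these is to derive them from a percentage of node memory, as the earlier slide suggests for a LASR server. The sketch below is only that arithmetic. It assumes TKMPI_MEMSIZE is in megabytes and the `ulimit -v` value in kilobytes (consistent with the 50000 / 50000000 pairing above, but not stated in the deck). The 256000 MB node size and the `pct_of` helper are hypothetical, and the 15% figure is borrowed from the "15% of cluster" remark, applied per node.

```shell
#!/bin/sh
# pct_of TOTAL PERCENT
# Prints the integer share PERCENT% of TOTAL.
pct_of() {
    echo $(( $1 * $2 / 100 ))
}

# Hypothetical node with 256000 MB of RAM. On Linux the real figure could be
# taken from /proc/meminfo instead of being hard-coded.
total_mb=256000
lasr_mb=$(pct_of "$total_mb" 15)

# Assumption: MEMSIZE is in MB and the ulimit value in KB, using the same
# factor of 1000 as the slide's 50000 / 50000000 pair.
echo "export TKMPI_MEMSIZE=${lasr_mb}"
echo "export TKMPI_ULIMIT=\"-v $((lasr_mb * 1000))\""
```

The printed lines could then be pasted into the per-user block above. As the deck says, sizing is not a perfect science yet, so treat the result as a starting point.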
YARN: Taking Hadoop Beyond Batch 
Store ALL DATA in one place… 
Interact with that data in MULTIPLE WAYS 
with Predictable Performance and Quality of Service 
Applications Run Natively IN Hadoop
• BATCH (MapReduce)
• INTERACTIVE (Tez)
• STREAMING (Storm, S4, …)
• GRAPH (Giraph)
• IN-MEMORY (Spark)
• ONLINE (HBase)
YARN
HDFS2 (Redundant, Reliable Storage)
YARN Futures 
YARN – Delegated Container Model
[Diagram 1 of 4: AM 1, the ResourceManager (Scheduler), and a grid of NodeManagers; numbered steps 1 to 3 labeled allocate!, container!, and startContainer!, ending with Container 1.1 on a NodeManager]
[Diagram 2 of 4: same layout with ServiceX on a NodeManager; numbered steps 1 to 4 labeled allocate!, container!, and delegateContainer!]
[Diagram 3 of 4: step 5, with ServiceX on a NodeManager]
[Diagram 4 of 4: step 6, with ServiceX]
PaaS - Kubernetes-on-YARN 
YARN as the default enterprise-class scheduler and resource manager for Kubernetes and OpenShift 3
• First-class support for containerization and mainstream PaaS
• Updated Go language bindings for YARN
• Uses the container delegation model
Labels – Constraint Specifications
[Diagram: ResourceManager (Scheduler) and a grid of NodeManagers, some labeled "w/ GPU"; MR AM 1 with map1.1, map1.2, and reduce1.1; DL-AM with DL1.1, DL1.2, and DL1.3]
Reservations - SLAs via Allocation Planning 
Hortonworks Data Platform
[Diagram: engines sharing one cluster: Batch (MR), Script (Pig), SQL (Hive), Java (Cascading), NoSQL (HBase, Accumulo), Stream (Storm), In-Memory (Spark), PaaS (Kubernetes), LASR, HPA, and other engines, with Tez and Slider layers beneath some of them. All run on YARN: Data Operating System (Cluster Resource Management) over HDFS (Hadoop Distributed File System), nodes 1 … N]
Next Steps… 
More about SAS & Hortonworks 
http://hortonworks.com/partner/SAS/ 
Download the Hortonworks Sandbox 
Learn Hadoop 
Build Your Analytic App 
Try Hadoop 2 
Contact us: events@hortonworks.com 

Combine SAS High-Performance Capabilities with Hadoop YARN

  • 1. We’ll get started soon… Q&A box is available for your questions Webinar will be recorded for future viewing Thank you for joining! © Hortonworks Inc. 2011 – 2014. All Rights Reserved
  • 2. Combine SAS High-Performance Capabilities with Hadoop YARN We do Hadoop.
  • 3. Your speakers… Arun Murthy, Founder and Architect Hortonworks @acmurthy Paul Kent, Vice President Big Data SAS @hornpolish
  • 4. Agenda • Introduction to YARN • SAS Workloads on the Cluster • SAS Workloads: Resource Settings • SAS and YARN • YARN Futures • Next Steps
  • 5. The 1st Generation of Hadoop: Batch HADOOP 1.0 Built for Web-Scale Batch Apps • All other usage patterns must leverage that same infrastructure • Forces the creation of silos for managing mixed workloads [Diagram: single-app silos labeled BATCH, INTERACTIVE, and ONLINE over HDFS]
  • 6. Hadoop MapReduce Classic JobTracker § Manages cluster resources and job scheduling TaskTracker § Per-node agent § Manages tasks
  • 7. MapReduce Classic: Limitations Scalability § Maximum cluster size – 4,000 nodes § Maximum concurrent tasks – 40,000 § Coarse synchronization in JobTracker Availability § Failure kills all queued and running jobs Hard partition of resources into map and reduce slots § Low resource utilization Lacks support for alternate paradigms and services § Iterative applications implemented using MapReduce are 10x slower
  • 8. Our Vision: Hadoop as Next-Gen Platform Hadoop 1 • Silos & Largely batch • Single Processing engine Hadoop 2 w/ • Multiple Engines, Single Data Set • Batch, Interactive & Real-Time [Diagram: two stacks over nodes 1…N. Hadoop 1: HDFS (Hadoop Distributed File System); MapReduce (Cluster Resource Management & Data Processing); Script (Pig), SQL (Hive), Others. Hadoop 2: HDFS (Hadoop Distributed File System); YARN: Data Operating System (Cluster Resource Management); Tez under Script (Pig), SQL (Hive), Java (Cascading); Real-time (HBase); Others (Storm, Solr, etc.); Engines (HBase, Accumulo, Storm, Solr, Spark); Others (ISV Engines)]
  • 9. YARN: Taking Hadoop Beyond Batch Store ALL DATA in one place… Interact with that data in MULTIPLE WAYS with Predictable Performance and Quality of Service Applications Run Natively IN Hadoop: BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4,…), GRAPH (Giraph), IN-MEMORY (Spark), HPC MPI (OpenMPI), ONLINE (HBase), OTHER (Search) (Weave…) YARN (Cluster Resource Management) HDFS2 (Redundant, Reliable Storage)
  • 10. YARN Hortonworks Data Platform [Diagram: HDFS (Hadoop Distributed File System) across nodes 1…N; YARN: Data Operating System (Cluster Resource Management) above it; engines on YARN: Batch (MR); Script (Pig), SQL (Hive), and Java (Cascading) on Tez; NoSQL (HBase, Accumulo) and Stream (Storm) on Slider; In-Memory (Spark); PaaS (Kubernetes); LASR and HPA; Others (Engines) on Tez and on Slider]
  • 11. 5 Key Benefits of YARN 1. Scale 2. New Programming Models & Services 3. Improved cluster utilization 4. Agility 5. Beyond Java
  • 12. Concepts Application § An application is a temporal job or a service submitted to YARN § Examples – MapReduce job (job) – HBase cluster (service) Container § Basic unit of allocation § Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.) – container_0 = 2GB, 1 CPU – container_1 = 1GB, 6 CPU § Replaces the fixed map/reduce slots
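
The container idea on slide 12 can be illustrated with a small shell sketch. This is not YARN code: `fits_on_node` is a hypothetical helper, and the free-resource figures are made up. It only shows that a request is checked against each resource type separately rather than against a fixed slot count.

```shell
#!/bin/sh
# Hypothetical helper (not part of YARN): does a container request fit
# into a node's free resources?
# Arguments: request_mb request_vcores free_mb free_vcores
fits_on_node() {
    if [ "$1" -le "$3" ] && [ "$2" -le "$4" ]; then
        echo yes
    else
        echo no
    fi
}

# Made-up node with 4096 MB and 8 vcores free.
fits_on_node 2048 1 4096 8   # container_0 = 2GB, 1 CPU
fits_on_node 1024 6 4096 8   # container_1 = 1GB, 6 CPU
# Same container_1 request against a node with only 4 free vcores.
fits_on_node 1024 6 4096 4
```

Both slide-12 containers fit on the first node even though their shapes differ, while the last request is rejected on CPU alone despite ample memory.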
  • 13. Design Centre Split up the two major functions of JobTracker § Cluster resource management § Application life-cycle management MapReduce becomes a user-land library
  • 14. YARN Architecture - Walkthrough [Diagram: Client2, the ResourceManager with its Scheduler, and a grid of NodeManagers hosting AM 1 with Containers 1.1 to 1.3 and AM2 with Containers 2.1 to 2.4]
  • 15. Multi-Tenancy with YARN Economics as queue-capacity § Hierarchical Queues SLAs § Preemption Resource Isolation § Linux: cgroups § MS Windows: Job Control § Roadmap: Virtualization (Xen, KVM) Administration § Queue ACLs § Run-time re-configuration for queues § Charge-back [Diagram: ResourceManager Scheduler with Capacity Scheduler hierarchical queues under root: Adhoc 10%, DW 70%, Mrkting 20%; sub-queues Dev 10%, Reserved 20%, Prod 70%; Prod 80%, Dev 20%; P0 70%, P1 30%]
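
With hierarchical queues, a leaf queue's share of the whole cluster is the product of the percentages along its path from root. The sketch below works through that arithmetic. The path used (DW 70% → Prod 70% → P0 70% or P1 30%) is an assumption for illustration, because the flattened diagram does not show which sub-queues belong to which parent, and `abs_capacity` is a hypothetical helper rather than a YARN command.

```shell
#!/bin/sh
# Hypothetical helper: multiply the percentages along a queue path and
# print the resulting share of the whole cluster with one decimal place.
# Uses integer arithmetic in tenths of a percent, so finer fractions
# are truncated.
abs_capacity() {
    acc=1000   # 100.0% expressed in tenths of a percent
    for pct in "$@"; do
        acc=$(( acc * pct / 100 ))
    done
    echo "$(( acc / 10 )).$(( acc % 10 ))"
}

# Assumed path root -> DW (70%) -> Prod (70%) -> P0 (70%)
abs_capacity 70 70 70
# Assumed path root -> DW (70%) -> Prod (70%) -> P1 (30%)
abs_capacity 70 70 30
# Top-level queue Adhoc (10%)
abs_capacity 10
```

Under these assumptions P0 is guaranteed 34.3% of the cluster and P1 14.7%, which together make up the 49.0% that Prod receives from DW.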
  • 16. YARN Applications Data processing applications and services § Services - Slider § Real-time event processing – Storm, S4, other commercial platforms § Tez – Generic framework to run a complex DAG § MPI: OpenMPI, MPICH2 § Master-Worker § Machine Learning: Spark § Graph processing: Giraph § Enabled by allowing the use of paradigm-specific application master Run all on the same Hadoop cluster!
  • 17. SHARE! Customers are: wrapping up POCs building Bigger Clusters assembling their Data { Lake, Reservoir } want their software to SHARE the cluster Copyright © 2014, SAS Institute Inc. All rights reserved.
  • 18. SAS Workloads on the Cluster
  • 19. SAS Workloads on the Cluster - Video
  • 20. SAS Workloads on the Cluster Some Requests are for a significant slice of the cluster Reservation will be ALL DAY, ALL WEEK, ALL MONTH? Memory typically fixed (15% of cluster) CPU floor, would like the spare capacity when available Some Requests are more short term Memory can be estimated Duration can be capped CPU floor, would like spare capacity
  • 21. SAS Workloads on the Cluster
  • 22. SAS Workloads – Resource Settings How much should you reserve? not a perfect science yet Long Running? LASR server by percent of total memory More like a batch request? HPA procedure by anecdotal experience
  • 23. SAS Workloads – Resource Settings
      if [ "$USER" = "lasradm" ]; then
        # Custom settings for running under the lasradm account.
        export TKMPI_ULIMIT="-v 50000000"
        export TKMPI_MEMSIZE=50000
        export TKMPI_CGROUP="cgexec -g cpu:75"
      fi
      # if [ "$TKMPI_APPNAME" = "lasr" ]; then
      #   Custom settings for a lasr process running under any account.
      #   export TKMPI_ULIMIT="-v 50000000"
      #   export TKMPI_MEMSIZE=50000
      #   export TKMPI_CGROUP="cgexec -g cpu:75"
      # fi
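
Slide 22 suggests sizing a LASR server as a percentage of total memory, and slide 23 shows the resulting settings as fixed numbers. The sketch below connects the two. It is only an illustration: `lasr_limits` is a hypothetical helper, the slides do not state units, and the factor of 1000 between the two values is assumed from the slide's own pair (`50000` and `50000000`).

```shell
#!/bin/sh
# Hypothetical helper (not a SAS tool): derive the two memory settings
# shown on slide 23 from a node's total memory and a percentage.
# Arguments: total_memory percent
# Assumption: the ulimit value is 1000 times TKMPI_MEMSIZE, mirroring
# the ratio between the two numbers on the slide.
lasr_limits() {
    mem=$(( $1 * $2 / 100 ))
    echo "export TKMPI_MEMSIZE=$mem"
    echo "export TKMPI_ULIMIT=\"-v $(( mem * 1000 ))\""
}

# A made-up node total of 100000 with a 50% share reproduces the
# values from the slide.
lasr_limits 100000 50
```

The printed lines could be pasted into the `lasradm` branch of the settings script in place of the hard-coded values; the percentage itself remains a judgment call, as slide 22 notes.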
  • 24. YARN: Taking Hadoop Beyond Batch Store ALL DATA in one place… Interact with that data in MULTIPLE WAYS with Predictable Performance and Quality of Service Applications Run Natively IN Hadoop: BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4,…), GRAPH (Giraph), IN-MEMORY (Spark), ONLINE (HBase) YARN HDFS2 (Redundant, Reliable Storage)
  • 25. YARN Futures
  • 26. YARN – Delegated Container Model [Diagram: ResourceManager with Scheduler, a grid of NodeManagers, AM 1, and Container 1.1; arrows labeled allocate, container, and startContainer; step numbers 1 to 3]
  • 27. YARN – Delegated Container Model [Diagram: ResourceManager with Scheduler, a grid of NodeManagers, AM 1, and ServiceX; arrows labeled allocate, container, and delegateContainer; step numbers 1 to 4]
  • 28. YARN – Delegated Container Model [Diagram: ResourceManager with Scheduler, a grid of NodeManagers, AM 1, and ServiceX; step number 5]
  • 29. YARN – Delegated Container Model [Diagram: ResourceManager with Scheduler, a grid of NodeManagers, AM 1, and ServiceX; step number 6]
  • 30. PaaS - Kubernetes-on-YARN YARN as the default enterprise-class scheduler and resource manager for Kubernetes and OpenShift 3 • First class support for containerization and mainstream PaaS • Updated Go language bindings for YARN • Uses container delegation model
  • 31. Labels – Constraint Specifications [Diagram: ResourceManager with Scheduler and a grid of NodeManagers, some labeled "w/ GPU"; MR AM 1 with map 1.1, map 1.2, and reduce 1.1; DL-AM with DL 1.1, DL 1.2, and DL 1.3]
  • 32. Reservations - SLAs via Allocation Planning
  • 33. YARN Hortonworks Data Platform [Diagram: HDFS (Hadoop Distributed File System) across nodes 1…N; YARN: Data Operating System (Cluster Resource Management) above it; engines on YARN: Batch (MR); Script (Pig), SQL (Hive), and Java (Cascading) on Tez; NoSQL (HBase, Accumulo) and Stream (Storm) on Slider; In-Memory (Spark); PaaS (Kubernetes); LASR and HPA; Others (Engines) on Tez and on Slider]
  • 34. Next Steps… More about SAS & Hortonworks http://hortonworks.com/partner/SAS/ Download the Hortonworks Sandbox Learn Hadoop Build Your Analytic App Try Hadoop 2 Contact us: events@hortonworks.com