© Hortonworks Inc. 2014
Apache Hadoop YARN
Present and Future
Vinod Kumar Vavilapalli
vinodkv [at] apache.org
@tshooter
Jian He
jianhe [at] apache.org
Who are we?
• Vinod Kumar Vavilapalli
– 7 Hadoop-years old
– Previously @Yahoo!, now @Hortonworks
– Hadoop MapReduce and YARN Development lead & Architect at Hortonworks
– Apache Hadoop YARN project lead
– Apache Hadoop PMC, Apache Member
– 99% + code in Apache, Hadoop
• Jian He
– Software Engineer @ Hortonworks
– Apache Hadoop Committer
– Master's degree from Brown University
– Focus on YARN/MapReduce
Architecting the Future of Big Data
A quick show of hands..
• Hadoop 1
• Hadoop 2 & YARN
• YARN for MapReduce2
• YARN for beyond MR2
Agenda
• Apache Hadoop 2 : Overview
• Community
• Present
• Future
Apache Hadoop 2
Next Generation Architecture
YARN: the Data Operating System
• Resource Management Platform
• MapReduce v2
• Beyond MapReduce with Tez, Storm, Spark; in Hadoop!
• Did I mention Services like HBase, Accumulo on YARN with Apache Slider?
Why?
• 2.0 >= 2 * 1.0
– YARN: Next generation architecture
• Scale
• Agility
• Return on Investment: 2x throughput on same hardware!
• Ready for improvements in hardware
• Not convinced? Let’s see what others are saying!
Yahoo!
• Leader/Visionary on all things Hadoop!
• On YARN (0.23.x)
• Moving fast to 2.x
http://developer.yahoo.com/blogs/ydn/hadoop-yahoo-more-ever-54421.html
Twitter
Talk: “Hadoop 2 @Twitter, Elephant Scale”
By: Lohit Vijayarenu & Gera Shegalov
eBay
• Has one of the largest Hadoop clusters in the industry, with tens to hundreds of petabytes of data
• Migrated production clusters to Hadoop-2
YARN Community
At Apache Software Foundation
YARN contributions
[Chart: YARN Releases - 06/02/14; x-axis: release lines 2.0.x 2.1.x 2.2.x 2.3.x 2.4.x 2.x trunk; y-axis: 0 to 400]
Contributors
• 104 and counting
• Few ‘big’ contributors
• And a long tail
Present
Apache Hadoop releases
• 15 October, 2013
• The 1st GA release of Apache Hadoop 2.x
• YARN
– First stable and supported release of YARN
– YARN level APIs solidified for the future
– Binary Compatibility for MapReduce applications built on hadoop-1.x
– Performance
– Scale!
• Support for running Hadoop on Microsoft Windows
• Substantial amount of integration testing with the rest of the projects in the ecosystem
– Pig, Hive, Oozie, HBase..
Apache Hadoop 2.2
Apache Hadoop releases (contd)
• 24 February, 2014
• First post GA release for the year 2014
• Alpha features in YARN
– ResourceManager High Availability
– Application History Server
– Will be covered in detail in the 2.4 section
• Number of bug-fixes, enhancements
Apache Hadoop 2.3
Apache Hadoop releases (contd)
• 7 April, 2014
• Most recent release
• Stabilizing features in YARN
– Details follow
– ResourceManager HA
– YARN Timeline Server (beyond history server)
– Preemption in YARN CapacityScheduler
– Container-preserving AM recovery.
Apache Hadoop 2.4
ResourceManager High Availability
• RM – single point of failure
• Goal : Downtime invisible to end-users
– Apps not required to be re-submitted
– NMs to rebind with newly started RM
• Two stories:
– Recovery of state
– Failover
ResourceManager High Availability
• Active/Standby
– Leader election (ZooKeeper)
• Standby, on transition to Active, loads all the state from the state store
• NM, AM and clients redirect to the new RM
– RMProxy lib
Talk: Highly Available Resource Management for YARN
By: Karthik Kambatla, Xuan Gong
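The client-side redirect after a failover can be illustrated with a minimal sketch. It is written in Python rather than Hadoop's Java, and every name in it (`FakeRM`, `FailoverProxy`, `StandbyError`) is hypothetical; it does not mirror the real RMProxy library, only the general idea of retrying against the next configured RM until the Active one answers.

```python
class StandbyError(Exception):
    """Raised by a (simulated) RM that is not currently Active."""


class FakeRM:
    """Hypothetical stand-in for a ResourceManager endpoint."""

    def __init__(self, name, active=False):
        self.name = name
        self.active = active

    def submit(self, app):
        if not self.active:
            raise StandbyError(self.name)
        return f"{app} accepted by {self.name}"


class FailoverProxy:
    """Tries the last known Active RM first, then rotates through the rest."""

    def __init__(self, rms):
        self.rms = list(rms)
        self.current = 0  # index of the RM we believe is Active

    def submit(self, app):
        for _ in range(len(self.rms)):
            rm = self.rms[self.current]
            try:
                return rm.submit(app)
            except StandbyError:
                # Redirect: move on to the next configured RM.
                self.current = (self.current + 1) % len(self.rms)
        raise RuntimeError("no Active ResourceManager found")


rm1, rm2 = FakeRM("rm1", active=True), FakeRM("rm2")
proxy = FailoverProxy([rm1, rm2])
first = proxy.submit("app-1")

# Simulate a failover: rm1 steps down and rm2 becomes Active.
rm1.active, rm2.active = False, True
second = proxy.submit("app-2")
```

The caller never names a specific RM; the proxy remembers which one answered last, so only the first call after a failover pays for the retry.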
YARN Timeline Server
• Few MR specific implementations: History and web-UI
• YARN: Not just MR anymore!
• Previous state
– MapReduce specific Job History Server
– YARN level ‘History’ lost beyond ResourceManager Restart
YARN Timeline Server (contd)
• Entity and Event collection
– RM and Applications periodically send events to the Timeline Server
• Pluggable store
– Depending on site requirements
• REST APIs or RPC
– Applications and user-interfaces can access information via REST/RPC
• Visualizations
– Users can build tools and visualizations using the APIs
• Apps and System
– Applications as well as the system entities/events
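To make the entity/event and pluggable-store ideas concrete, here is a small sketch under stated assumptions. The names `Event`, `TimelineStore` and `MemoryStore` are invented for illustration and are not the Timeline Server's actual classes or wire format; the point is only that producers write events against an interface and the backing store can be swapped per site.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    entity_type: str  # application-defined or system-defined type
    entity_id: str
    event_type: str
    timestamp: int


class TimelineStore(ABC):
    """Hypothetical pluggable storage interface."""

    @abstractmethod
    def put(self, event): ...

    @abstractmethod
    def events_for(self, entity_type, entity_id): ...


class MemoryStore(TimelineStore):
    """Simplest possible backend: keeps everything in a dict."""

    def __init__(self):
        self._events = {}

    def put(self, event):
        key = (event.entity_type, event.entity_id)
        self._events.setdefault(key, []).append(event)

    def events_for(self, entity_type, entity_id):
        # Return the entity's events, newest first.
        events = self._events.get((entity_type, entity_id), [])
        return sorted(events, key=lambda e: e.timestamp, reverse=True)


store = MemoryStore()
store.put(Event("APP", "app_1", "STARTED", 100))
store.put(Event("APP", "app_1", "FINISHED", 250))
store.put(Event("APP", "app_2", "STARTED", 120))
history = [e.event_type for e in store.events_for("APP", "app_1")]
```

A site with different requirements would provide another `TimelineStore` subclass; readers such as a UI or monitoring tool only depend on `events_for`.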
YARN Timeline Server (contd)
[Diagram: App1, App2 and the RM send Events to the YARN Timeline Server; Ambari and a custom app monitoring client access it via RPC and the REST API]
Talk: “Analyzing Historical Data of Applications on Hadoop YARN: for Fun and Profit”
By: Zhijie Shen, Mayank Bansal
Capacity Scheduler Preemption
• Enforce SLAs
• Preempt across queues
• STEP1: Gather queue state
– Current capacity
– Guaranteed capacity
• STEP2: Identify preemptions
– Select applications to preempt: over-capacity queues
• STEP3: Issue preemptions
– Issue preemptions for containers to the application
• STEP4: Kill containers
– Track containers that have been issued a preemption but have not yet executed it
– Forcibly kill these containers after a timeout
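The four steps can be sketched as a toy model. This is an illustration only, not the CapacityScheduler's implementation: capacities are counted in whole containers, the victim-selection rule (most recently started first) is an assumption, demand from under-served queues is ignored, and the names `Queue`, `identify_preemptions` and `PreemptionMonitor` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    name: str
    guaranteed: int  # guaranteed capacity, in containers
    containers: list = field(default_factory=list)  # running container ids

    @property
    def current(self):
        return len(self.containers)


def identify_preemptions(queues):
    """Steps 1-2: read queue state, pick containers from over-capacity queues."""
    to_preempt = []
    for q in queues:
        excess = q.current - q.guaranteed
        if excess > 0:
            # Assumption: the most recently started containers go first.
            to_preempt.extend((q, c) for c in q.containers[-excess:])
    return to_preempt


class PreemptionMonitor:
    def __init__(self, timeout):
        self.timeout = timeout
        self.pending = {}  # container id -> (queue, time the request was issued)

    def issue(self, queues, now):
        """Step 3: ask applications to release the selected containers."""
        for q, c in identify_preemptions(queues):
            self.pending.setdefault(c, (q, now))

    def released(self, container):
        """Called when an application gives a container back voluntarily."""
        q, _ = self.pending.pop(container)
        q.containers.remove(container)

    def kill_expired(self, now):
        """Step 4: forcibly kill containers still pending after the timeout."""
        killed = []
        for c, (q, issued) in list(self.pending.items()):
            if now - issued >= self.timeout:
                q.containers.remove(c)
                del self.pending[c]
                killed.append(c)
        return killed


a = Queue("a", guaranteed=2, containers=["a1", "a2", "a3", "a4"])
b = Queue("b", guaranteed=2, containers=["b1"])
monitor = PreemptionMonitor(timeout=15)
monitor.issue([a, b], now=0)
monitor.released("a4")  # the application cooperates for one container
killed_early = monitor.kill_expired(now=10)
killed_late = monitor.kill_expired(now=15)
```

Giving the application a grace period before the kill lets it pick its own victims or checkpoint work; the forced kill only covers containers it did not release in time.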
Capacity Scheduler Preemption (Contd)
[Diagram: the Scheduler issues preemptions to the Application; the Application releases resources; containers are killed forcibly after a timeout]
Container-preserving AM restart
• Problem
– Containers are killed when AM goes down.
– New AM needs to know where the previous containers are running
– Previous containers need to know about the new AM. (WIP)
[Diagram: AM1 restarts as AM2 while Container1, Container2 and Container3 keep running]
Apache Hadoop releases (contd)
• Next releases
– 2.4.1
– 2.5.x
• YARN
– Details follow in the Future section
– ResourceManager work-preserving restart for High Availability
– YARN Timeline Server security & enhancement.
– Lots more
Apache Hadoop 2.5.x
Future
Future: Operational enhancements
• Rolling upgrades
– No/minimal impact to users
– Ideal: Always rolling!
• HDFS upgrades effort is in
• YARN
– RM restart
– NM restart
– Upgrades
Talk: “Hadoop Rolling Upgrades – Taking Availability to the Next Level”
By: Suresh Srinivas, Hortonworks & Jason Lowe, Yahoo!
Future: Enabling apps
• Beyond MapReduce
– Apache Tez, Apache Slider, Apache Storm.
• Discussing next
– Long running services
– Multi-dimensional resource scheduling
– Isolation
– Web services
Future: Long running services
• You can run them already!
• Few enhancements needed
– Logs
– Security
– Management/monitoring
• Resource sharing across workload types
Talk: “Bring your Service to YARN”
By: Sumit Mohanty
Multi-resource scheduling
• Today – memory & cpu
– Physical memory / virtual memory
– CPU Cores – Virtual cores
• CPU stuff: More bake in
• Disks
– Space
– IOPS
• Network
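One way to picture multi-dimensional scheduling is as a per-dimension fit check. The sketch below is an assumption-laden illustration, not YARN's resource model: the dimension names (`memory_mb`, `vcores`, `disk_iops`) and the helpers `fits` and `allocate` are hypothetical. Adding a new resource type such as disk or network then means adding a key, not a new code path.

```python
def fits(request, available):
    """A request fits only if every requested dimension is available."""
    return all(available.get(dim, 0) >= amount for dim, amount in request.items())


def allocate(request, available):
    """Return the node's remaining resources, or None if the request does not fit."""
    if not fits(request, available):
        return None
    return {dim: amount - request.get(dim, 0) for dim, amount in available.items()}


node = {"memory_mb": 8192, "vcores": 4}
after_first = allocate({"memory_mb": 4096, "vcores": 1}, node)
too_many_cores = allocate({"memory_mb": 1024, "vcores": 8}, node)
# A dimension the node does not advertise (here disk IOPS) is treated as zero.
needs_disk = allocate({"memory_mb": 1024, "disk_iops": 100}, node)
```

A request that is small in memory can still be rejected on cores, which is the practical difference from memory-only scheduling.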
Fine-grain isolation for multi-tenancy
• Custom memory-monitoring
• Cgroups
• Linux Containers
• VMs
Other features
• Application SLAs
– Run my application at 6:00 AM tomorrow and guarantee capacity for me!
• Node labels
– Some of the nodes in my cluster have specialized hardware, give them to me!
• Node affinity/anti-affinity
– Get me on to the nodes where my data is
– Get me off of this node
• Better online queue-management
– Centralized
– Quality feedback
• Web-services
– RESTful APIs for submitting, monitoring and killing apps
– Beyond java-only clients
YARN Ecosystem
Beyond the core YARN project: Briefly
Eco-system
Applications Powered by YARN
• Classic Apache Hadoop MapReduce – Batch
• Batch & Interactive
– Apache Tez – Batch/Interactive
• Stream Processing
– Apache Storm
– Apache Samza
• Apache Spark – Iterative applications
• YARN Frameworks
– Apache Twill
– Microsoft REEF
• Existing apps
– Apache Slider
• Graph Processing
– Apache Giraph
There's an app for that... YARN App Marketplace!
Talk: “Apache Tez - A New Chapter in Hadoop Data Processing”
By Bikas Saha, Hitesh Shah
Recap
• YARN helps Apache Hadoop 2 to be twice as good!
• Exciting journey with Hadoop for this decade…
– Hadoop is no longer a one-trick pony, err elephant
– Beyond just MapReduce
• Hadoop 2: Architecture for the future
– Centralized data, multiple apps
• Lots of exciting new features
– Exciting spectrum of application types, workloads and use-cases
Couple more things..
The Book is out!
Thank you!
Download Sandbox: Experience Apache Hadoop
Both 2.x and 1.x Versions Available!
http://hortonworks.com/products/hortonworks-sandbox/
Questions Time!

Mais conteúdo relacionado

Mais procurados

Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...StampedeCon
 
Hadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and Future
Hadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and FutureHadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and Future
Hadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and FutureVinod Kumar Vavilapalli
 
Apache Hadoop YARN
Apache Hadoop YARNApache Hadoop YARN
Apache Hadoop YARNAdam Kawa
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopHortonworks
 
Enabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARNEnabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARNDataWorks Summit
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsHortonworks
 
Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele
Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele
Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele Hakka Labs
 
YARN - Hadoop's Resource Manager
YARN - Hadoop's Resource ManagerYARN - Hadoop's Resource Manager
YARN - Hadoop's Resource ManagerVertiCloud Inc
 
Hadoop Summit San Jose 2015: Towards SLA-based Scheduling on YARN Clusters
Hadoop Summit San Jose 2015: Towards SLA-based Scheduling on YARN Clusters Hadoop Summit San Jose 2015: Towards SLA-based Scheduling on YARN Clusters
Hadoop Summit San Jose 2015: Towards SLA-based Scheduling on YARN Clusters Sumeet Singh
 
An Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop YarnAn Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop YarnMike Frampton
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureDataWorks Summit
 
Hadoop YARN | Hadoop YARN Architecture | Hadoop YARN Tutorial | Hadoop Tutori...
Hadoop YARN | Hadoop YARN Architecture | Hadoop YARN Tutorial | Hadoop Tutori...Hadoop YARN | Hadoop YARN Architecture | Hadoop YARN Tutorial | Hadoop Tutori...
Hadoop YARN | Hadoop YARN Architecture | Hadoop YARN Tutorial | Hadoop Tutori...Simplilearn
 
NextGen Apache Hadoop MapReduce
NextGen Apache Hadoop MapReduceNextGen Apache Hadoop MapReduce
NextGen Apache Hadoop MapReduceHortonworks
 
Introduction to YARN Apps
Introduction to YARN AppsIntroduction to YARN Apps
Introduction to YARN AppsCloudera, Inc.
 
Running Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache HadoopRunning Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache Hadoophitesh1892
 
YARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformYARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformBikas Saha
 
Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014Hortonworks
 
Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...
Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...
Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...DataWorks Summit
 

Mais procurados (20)

Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
 
Yarns About Yarn
Yarns About YarnYarns About Yarn
Yarns About Yarn
 
Hadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and Future
Hadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and FutureHadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and Future
Hadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and Future
 
Apache Hadoop YARN
Apache Hadoop YARNApache Hadoop YARN
Apache Hadoop YARN
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with Hadoop
 
Enabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARNEnabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARN
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data Applications
 
Yarn
YarnYarn
Yarn
 
Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele
Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele
Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele
 
YARN - Hadoop's Resource Manager
YARN - Hadoop's Resource ManagerYARN - Hadoop's Resource Manager
YARN - Hadoop's Resource Manager
 
Hadoop Summit San Jose 2015: Towards SLA-based Scheduling on YARN Clusters
Hadoop Summit San Jose 2015: Towards SLA-based Scheduling on YARN Clusters Hadoop Summit San Jose 2015: Towards SLA-based Scheduling on YARN Clusters
Hadoop Summit San Jose 2015: Towards SLA-based Scheduling on YARN Clusters
 
An Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop YarnAn Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop Yarn
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and Future
 
Hadoop YARN | Hadoop YARN Architecture | Hadoop YARN Tutorial | Hadoop Tutori...
Hadoop YARN | Hadoop YARN Architecture | Hadoop YARN Tutorial | Hadoop Tutori...Hadoop YARN | Hadoop YARN Architecture | Hadoop YARN Tutorial | Hadoop Tutori...
Hadoop YARN | Hadoop YARN Architecture | Hadoop YARN Tutorial | Hadoop Tutori...
 
NextGen Apache Hadoop MapReduce
NextGen Apache Hadoop MapReduceNextGen Apache Hadoop MapReduce
NextGen Apache Hadoop MapReduce
 
Introduction to YARN Apps
Introduction to YARN AppsIntroduction to YARN Apps
Introduction to YARN Apps
 
Running Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache HadoopRunning Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache Hadoop
 
YARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformYARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute Platform
 
Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014
 
Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...
Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...
Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...
 

Destaque

Application Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and FutureApplication Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and FutureVARUN SAXENA
 
How to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop ClusterHow to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop ClusterAltoros
 
Ceph Day SF 2015 - SysAdmin's Toolbox: Tools for Running Ceph in Production
Ceph Day SF 2015 - SysAdmin's Toolbox: Tools for Running Ceph in Production Ceph Day SF 2015 - SysAdmin's Toolbox: Tools for Running Ceph in Production
Ceph Day SF 2015 - SysAdmin's Toolbox: Tools for Running Ceph in Production Ceph Community
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationDataWorks Summit
 
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSDiscover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSHortonworks
 
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...Hortonworks
 
Alluxio Keynote at Strata+Hadoop World Beijing 2016
Alluxio Keynote at Strata+Hadoop World Beijing 2016Alluxio Keynote at Strata+Hadoop World Beijing 2016
Alluxio Keynote at Strata+Hadoop World Beijing 2016Alluxio, Inc.
 
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopApache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopHortonworks
 
Hadoop configuration & performance tuning
Hadoop configuration & performance tuningHadoop configuration & performance tuning
Hadoop configuration & performance tuningVitthal Gogate
 
Hadoop introduction , Why and What is Hadoop ?
Hadoop introduction , Why and What is  Hadoop ?Hadoop introduction , Why and What is  Hadoop ?
Hadoop introduction , Why and What is Hadoop ?sudhakara st
 
How to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your NicheHow to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your NicheLeslie Samuel
 

Destaque (14)

Application Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and FutureApplication Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and Future
 
How to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop ClusterHow to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop Cluster
 
Ceph Day SF 2015 - SysAdmin's Toolbox: Tools for Running Ceph in Production
Ceph Day SF 2015 - SysAdmin's Toolbox: Tools for Running Ceph in Production Ceph Day SF 2015 - SysAdmin's Toolbox: Tools for Running Ceph in Production
Ceph Day SF 2015 - SysAdmin's Toolbox: Tools for Running Ceph in Production
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
 
HDP2 and YARN operations point
HDP2 and YARN operations pointHDP2 and YARN operations point
HDP2 and YARN operations point
 
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSDiscover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
 
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
 
Optimizing Hive Queries
Optimizing Hive QueriesOptimizing Hive Queries
Optimizing Hive Queries
 
Alluxio Keynote at Strata+Hadoop World Beijing 2016
Alluxio Keynote at Strata+Hadoop World Beijing 2016Alluxio Keynote at Strata+Hadoop World Beijing 2016
Alluxio Keynote at Strata+Hadoop World Beijing 2016
 
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopApache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
 
Cisco OpenSOC
Cisco OpenSOCCisco OpenSOC
Cisco OpenSOC
 
Hadoop configuration & performance tuning
Hadoop configuration & performance tuningHadoop configuration & performance tuning
Hadoop configuration & performance tuning
 
Hadoop introduction , Why and What is Hadoop ?
Hadoop introduction , Why and What is  Hadoop ?Hadoop introduction , Why and What is  Hadoop ?
Hadoop introduction , Why and What is Hadoop ?
 
How to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your NicheHow to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your Niche
 

Semelhante a Apache Hadoop YARN: Present and Future

Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionDataWorks Summit
 
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionDataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionWangda Tan
 
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and FutureHadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and FutureVinod Kumar Vavilapalli
 
Apache Hadoop YARN: State of the Union
Apache Hadoop YARN: State of the UnionApache Hadoop YARN: State of the Union
Apache Hadoop YARN: State of the UnionDataWorks Summit
 
Get Started Building YARN Applications
Get Started Building YARN ApplicationsGet Started Building YARN Applications
Get Started Building YARN ApplicationsHortonworks
 
Apache Hadoop YARN: state of the union - Tokyo
Apache Hadoop YARN: state of the union - Tokyo Apache Hadoop YARN: state of the union - Tokyo
Apache Hadoop YARN: state of the union - Tokyo DataWorks Summit
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionDataWorks Summit
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureDataWorks Summit
 
YARN - Past, Present, & Future
YARN - Past, Present, & FutureYARN - Past, Present, & Future
YARN - Past, Present, & FutureDataWorks Summit
 
Deploying and Managing Hadoop Clusters with AMBARI
Deploying and Managing Hadoop Clusters with AMBARIDeploying and Managing Hadoop Clusters with AMBARI
Deploying and Managing Hadoop Clusters with AMBARIDataWorks Summit
 
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnBikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnhdhappy001
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionDataWorks Summit
 
YARN: Future of Data Processing with Apache Hadoop
YARN: Future of Data Processing with Apache HadoopYARN: Future of Data Processing with Apache Hadoop
YARN: Future of Data Processing with Apache HadoopHortonworks
 
Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2hdhappy001
 
YARN Ready - Integrating to YARN using Slider Webinar
YARN Ready - Integrating to YARN using Slider WebinarYARN Ready - Integrating to YARN using Slider Webinar
YARN Ready - Integrating to YARN using Slider WebinarHortonworks
 
What's new in Hadoop Yarn- Dec 2014
What's new in Hadoop Yarn- Dec 2014What's new in Hadoop Yarn- Dec 2014
What's new in Hadoop Yarn- Dec 2014InMobi Technology
 

Semelhante a Apache Hadoop YARN: Present and Future (20)

Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the union
 
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionDataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
 
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and FutureHadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
 
Apache Hadoop YARN: State of the Union
Apache Hadoop YARN: State of the UnionApache Hadoop YARN: State of the Union
Apache Hadoop YARN: State of the Union
 
Get Started Building YARN Applications
Get Started Building YARN ApplicationsGet Started Building YARN Applications
Get Started Building YARN Applications
 
Yarnthug2014
Yarnthug2014Yarnthug2014
Yarnthug2014
 
Apache Hadoop YARN: state of the union - Tokyo
Apache Hadoop YARN: state of the union - Tokyo Apache Hadoop YARN: state of the union - Tokyo
Apache Hadoop YARN: state of the union - Tokyo
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the union
 
MHUG - YARN
MHUG - YARNMHUG - YARN
MHUG - YARN
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and Future
 
YARN - Past, Present, & Future
YARN - Past, Present, & FutureYARN - Past, Present, & Future
YARN - Past, Present, & Future
 
Deploying and Managing Hadoop Clusters with AMBARI
Deploying and Managing Hadoop Clusters with AMBARIDeploying and Managing Hadoop Clusters with AMBARI
Deploying and Managing Hadoop Clusters with AMBARI
 
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnBikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the union
 
YARN: Future of Data Processing with Apache Hadoop
YARN: Future of Data Processing with Apache HadoopYARN: Future of Data Processing with Apache Hadoop
YARN: Future of Data Processing with Apache Hadoop
 
Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2
 
Munich HUG 21.11.2013
Munich HUG 21.11.2013Munich HUG 21.11.2013
Munich HUG 21.11.2013
 
YARN
YARNYARN
YARN
 
YARN Ready - Integrating to YARN using Slider Webinar
YARN Ready - Integrating to YARN using Slider WebinarYARN Ready - Integrating to YARN using Slider Webinar
YARN Ready - Integrating to YARN using Slider Webinar
 
What's new in Hadoop Yarn- Dec 2014
What's new in Hadoop Yarn- Dec 2014What's new in Hadoop Yarn- Dec 2014
What's new in Hadoop Yarn- Dec 2014
 

Apache Hadoop YARN: Present and Future

  • 1. © Hortonworks Inc. 2014 Apache Hadoop YARN Present and Future Vinod Kumar Vavilapalli vinodkv [at] apache.org @tshooter Jian He jianhe [at] apache.org Page 1
  • 2. Who are we? • Vinod Kumar Vavilapalli – 7 Hadoop-years old – Previously @Yahoo!, now @Hortonworks – Hadoop MapReduce and YARN Development lead & Architect at Hortonworks – Apache Hadoop YARN project lead – Apache Hadoop PMC, Apache Member – 99% + code in Apache, Hadoop • Jian He – Software Engineer @ Hortonworks – Apache Hadoop Committer – Master's degree from Brown University – Focus on YARN/MapReduce
  • 3. A quick show of hands.. • Hadoop 1 • Hadoop 2 & YARN • YARN for MapReduce2 • YARN for beyond MR2
  • 4. Agenda • Apache Hadoop 2: Overview • Community • Present • Future
  • 5. Apache Hadoop 2: Next Generation Architecture
  • 6. YARN: the Data Operating System • Resource Management Platform • MapReduce v2 • Beyond MapReduce with Tez, Storm, Spark; in Hadoop! • Did I mention Services like HBase, Accumulo on YARN with Apache Slider?
  • 7. Why? • 2.0 >= 2 * 1.0 – YARN: Next generation architecture • Scale • Agility • Return on Investment: 2x throughput on same hardware! • Ready for improvements in hardware • Not convinced? Let’s see what others are saying!
  • 8. Yahoo! • Leader/Visionary on all things Hadoop! • On YARN (0.23.x) • Moving fast to 2.x http://developer.yahoo.com/blogs/ydn/hadoop-yahoo-more-ever-54421.html
  • 9. Twitter Talk: “Hadoop 2 @Twitter, Elephant Scale” By: Lohit Vijayarenu & Gera Shegalov
  • 10. eBay • Has one of the largest Hadoop clusters in the industry, with tens to hundreds of petabytes of data • Migrated production clusters to Hadoop-2
  • 11. YARN Community At Apache Software Foundation
  • 12. YARN contributions [Chart: YARN Releases - 06/02/14; categories 2.0.x, 2.1.x, 2.2.x, 2.3.x, 2.4.x, 2.x, trunk; value axis 0 to 400]
  • 13. Contributors • 104 and counting • Few ‘big’ contributors • And a long tail [Chart: value axis 0 to 100]
  • 14. Present
  • 15. Apache Hadoop releases: Apache Hadoop 2.2 • 15 October, 2013 • The 1st GA release of Apache Hadoop 2.x • YARN – First stable and supported release of YARN – YARN level APIs solidified for the future – Binary Compatibility for MapReduce applications built on hadoop-1.x – Performance – Scale! • Support for running Hadoop on Microsoft Windows • Substantial amount of integration testing with rest of projects in the ecosystem – Pig, Hive, Oozie, HBase..
  • 16. Apache Hadoop releases (contd): Apache Hadoop 2.3 • 24 February, 2014 • First post GA release for the year 2014 • Alpha features in YARN – ResourceManager High Availability – Application History Server – Will be covered in detail in the 2.4 section • Number of bug-fixes, enhancements
  • 17. Apache Hadoop releases (contd): Apache Hadoop 2.4 • 7 April, 2014 • Most recent release • Stabilizing features in YARN – Details follow – ResourceManager HA – YARN Timeline Server (beyond history server) – Preemption in YARN CapacityScheduler – Container-preserving AM recovery.
  • 18. ResourceManager High Availability • RM – single point of failure • Goal: Downtime invisible to end-users – Apps not required to be re-submitted – NMs to rebind with newly started RM • Two stories: – Recovery of state – Failover
  • 19. ResourceManager High Availability • Active/Standby – Leader election (ZooKeeper) • Standby on transition to Active loads all the state from the state store. • NM, AM, clients, redirect to the new RM – RMProxy lib Talk: “Highly Available Resource Management for YARN” By: Karthik Kambatla, Xuan Gong
  • 20. YARN Timeline Server • Few MR specific implementations: History and web-UI • YARN: Not just MR anymore! • Previous state – MapReduce specific Job History Server – YARN level ‘History’ lost beyond ResourceManager Restart
  • 21. YARN Timeline Server (contd) • Entity and Event collection: RM and Applications periodically send events to Timeline server • Pluggable store: Depending on site requirements • REST APIs or RPC: Applications and user-interfaces can access information via REST/RPC • Visualizations: Users can build tools and visualizations using the APIs • Apps and System: Applications as well as the system entities/events
  • 22. YARN Timeline Server (contd) [Diagram labels: YARN Timeline Server; App1, App2, RM; Custom App monitoring client; AMBARI; RPC; REST API; Events] Talk: “Analyzing Historical Data of Applications on Hadoop YARN: for Fun and Profit” By: Zhijie Shen, Mayank Bansal
  • 23. Capacity Scheduler Preemption • Enforce SLAs • Preempt across queues • STEP 1, Gather Queue State: Current Capacity, Guaranteed Capacity • STEP 2, Identify preemptions: Select applications to preempt from over-capacity queues • STEP 3, Issue preemptions: Issue preemptions for containers to application • STEP 4, Kill containers: Track containers for which preemption has been issued but not yet executed; forcibly kill these containers after timeout
  • 24. Capacity Scheduler Preemption (Contd) [Diagram labels: Application; Scheduler; Preemptions; Release Resource; Preemptions; Kill containers forcibly after timeout]
  • 25. Container-preserving AM restart • Problem – Containers are killed when AM goes down. – New AM needs to know where the previous containers are running – Previous containers need to know about the new AM. (WIP) [Diagram labels: Container1, Container2, Container3; AM1; AM2; restart]
  • 26. Apache Hadoop releases (contd): Apache Hadoop 2.5.x • Next releases – 2.4.1 – 2.5.x • YARN – Details follow in the Future section – ResourceManager work-preserving restart for High Availability – YARN Timeline Server security & enhancement. – Lots more
  • 27. Future
  • 28. Future: Operational enhancements • Rolling upgrades – No/minimal impact to users – Ideal: Always rolling! • HDFS upgrades effort is in • YARN – RM restart – NM restart – Upgrades Talk: “Hadoop Rolling Upgrades – Taking Availability to the Next Level” By: Suresh Srinivas, Hortonworks & Jason Lowe, Yahoo!
  • 29. Future: Enabling apps • Beyond MapReduce – Apache Tez, Apache Slider, Apache Storm. • Discussing next – Long running services – Multi-dimensional resource scheduling – Isolation – Web services
  • 30. Future: Long running services • You can run them already! • Few enhancements needed – Logs – Security – Management/monitoring • Resource sharing across workload types Talk: “Bring your Service to YARN” By: Sumit Mohanty
  • 31. Multi-resource scheduling • Today – memory & cpu – Physical memory / virtual memory – CPU Cores – Virtual cores • CPU stuff: More bake in • Disks – Space – IOPS • Network
  • 32. Fine-grain isolation for multi-tenancy • Custom memory-monitoring • Cgroups • Linux Containers • VMs
  • 33. Other features • Application SLAs – Run my application at 6:00 AM tomorrow and guarantee capacity for me! • Node labels – Some of the nodes in my cluster have specialized hardware, give them to me! • Node affinity/anti-affinity – Get me on to the nodes where my data is – Get me off of this node • Better online queue-management – Centralized – Quality feedback • Web-services – RESTful APIs for submitting, monitoring and killing apps – Beyond java-only clients
  • 34. YARN Ecosystem: Beyond the core YARN project, briefly
  • 35. Eco-system: Applications Powered by YARN • Classic: Apache Hadoop MapReduce – Batch • Batch & Interactive: Apache Tez – Batch/Interactive • Stream Processing: Apache Storm, Apache Samza • Apache Spark – Iterative applications • YARN Frameworks: Apache Twill, Microsoft REEF • Existing apps: Apache Slider • Graph Processing: Apache Giraph • There's an app for that... YARN App Marketplace! Talk: “Apache Tez - A New Chapter in Hadoop Data Processing” By: Bikas Saha, Hitesh Shah
  • 36. Recap
  • 37. Recap • YARN helps Apache Hadoop 2 to be twice as good! • Exciting journey with Hadoop for this decade… – Hadoop is no longer a one-trick pony, err elephant – Beyond just MapReduce • Hadoop 2: Architecture for the future – Centralized data, multiple apps • Lots of exciting new features – Exciting spectrum of application types, workloads and use-cases
  • 38. Couple more things..
  • 39. The Book is out!
  • 40. (no slide text)
  • 41. Thank you! Download Sandbox: Experience Apache Hadoop. Both 2.x and 1.x Versions Available! http://hortonworks.com/products/hortonworks-sandbox/ Questions Time!
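
The four Capacity Scheduler preemption steps on slide 23 can be illustrated with a small sketch. This is not the CapacityScheduler code: the class and method names (`Queue`, `Container`, `PreemptionMonitor`) are hypothetical, only memory is tracked, containers are picked newest-first, and the sketch assumes any usage above a queue's guarantee is reclaimed regardless of whether another queue is actually asking for it.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    container_id: str
    memory_mb: int


@dataclass
class Queue:
    name: str
    guaranteed_mb: int
    containers: list = field(default_factory=list)

    @property
    def used_mb(self):
        # Current capacity in use, derived from the running containers.
        return sum(c.memory_mb for c in self.containers)


class PreemptionMonitor:
    """Hypothetical monitor walking through steps 1 to 4 from the slide."""

    def __init__(self, queues, kill_timeout_s):
        self.queues = queues
        self.kill_timeout_s = kill_timeout_s
        # container_id -> (queue, container, time the request was issued)
        self.pending = {}

    def select_preemptions(self):
        # Steps 1 and 2: compare current vs. guaranteed capacity per queue and
        # pick containers from over-capacity queues (newest first, an assumption).
        selected = []
        for queue in self.queues:
            excess = queue.used_mb - queue.guaranteed_mb
            for container in reversed(queue.containers):
                if excess <= 0:
                    break
                if container.container_id not in self.pending:
                    selected.append((queue, container))
                # Already-requested containers still count toward the excess.
                excess -= container.memory_mb
        return selected

    def issue_preemptions(self, now):
        # Step 3: record a preemption request for each selected container.
        issued = []
        for queue, container in self.select_preemptions():
            self.pending[container.container_id] = (queue, container, now)
            issued.append(container.container_id)
        return issued

    def release(self, container_id):
        # The application voluntarily gives back a container it was asked for.
        entry = self.pending.pop(container_id, None)
        if entry is not None:
            queue, container, _ = entry
            queue.containers.remove(container)

    def kill_expired(self, now):
        # Step 4: forcibly remove containers whose request outlived the timeout.
        killed = []
        for cid, (queue, container, issued_at) in list(self.pending.items()):
            if now - issued_at >= self.kill_timeout_s:
                queue.containers.remove(container)
                del self.pending[cid]
                killed.append(cid)
        return killed
```

Calling `issue_preemptions` and then `kill_expired` on a timer mirrors the loop in the slide: an application that calls `release` before the timeout keeps control over how it gives resources back, and only the containers it ignores are killed.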

Editor's Notes

  1. List of some of the applications which already support YARN, in some form. Graph processing: Giraph and Hama. Stream processing: Samza, Storm, Spark, S4 and DataTorrent. MapReduce. Tez: fast query execution. Weave and REEF: frameworks on top of YARN that make writing applications easier. There are also some GitHub projects: caching systems, on-demand web-server spin up.