Building Applications on YARN




Chris Riccomini
10/11/2012
Staff Software Engineer at LinkedIn
http://riccomini.name
@criccomini
What I Want to Talk About
Anatomy of a YARN Application

Things to consider when building your application
  Architecture
  Operations
Anatomy of a YARN App
Client

Application Master

Container Code

Resource Manager

Node Manager
Anatomy of a YARN App
[Diagram (simplified): Client, Resource Manager (RM), Node Manager (NM), Application Master (AM), Container Code (CC)]
A lot to consider
Deployment
Metrics
Configuration
Security
Language
Logging
Fault Tolerance
Isolation
Dashboard
State
Deployment
HDFS

HTTP

File (NFS)

DDOS’ing your servers

What we do: Tarball over HTTP. Life is easier with HDFS,
but operational overhead is too high.
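The tarball-over-HTTP approach might look roughly like the sketch below. It is an illustration under stated assumptions, not the deployment code from the talk: the function name `fetch_and_unpack` and the `max_jitter_s` parameter are hypothetical. The random delay before the download is one simple way to keep a large job from hitting the HTTP server with every container at once.

```python
import io
import random
import tarfile
import time
import urllib.request


def fetch_and_unpack(url, dest_dir, max_jitter_s=0.0):
    """Download a gzipped tarball from `url` and unpack it into `dest_dir`.

    `max_jitter_s` is a hypothetical knob: each container sleeps a random
    amount of time first so the downloads are spread out.
    """
    if max_jitter_s > 0:
        time.sleep(random.uniform(0, max_jitter_s))

    with urllib.request.urlopen(url) as response:
        payload = response.read()

    # Sketch only: the package is assumed to be trusted, so archive member
    # paths are not validated before extraction.
    with tarfile.open(fileobj=io.BytesIO(payload), mode="r:gz") as archive:
        archive.extractall(dest_dir)
        return archive.getnames()
```

A container start script could call `fetch_and_unpack(package_url, work_dir, max_jitter_s=30)` before launching the task, where `package_url` and `work_dir` are supplied by the Application Master.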
Metrics
Application-level metrics

YARN-level metrics

metrics2

Containers are transient

What we do: Both app-level and framework-level metrics use
same metrics framework. Pipe to in-house metrics
dashboard. We don’t use metrics2 since we don’t want a
dependency on Hadoop in our core jar.
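As a rough sketch of what "one metrics framework for both levels, with no Hadoop dependency" could mean in practice: a small in-process registry that both application code and framework code write to, serialized as JSON for whatever dashboard consumes it. The class name `MetricsRegistry`, the group names, and the JSON layout are assumptions made for illustration, not the framework described in the talk.

```python
import json
import threading
from collections import defaultdict


class MetricsRegistry:
    """A tiny registry of counters, grouped by who reports them."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counters = defaultdict(lambda: defaultdict(int))

    def inc(self, group, name, amount=1):
        """Add `amount` to the counter `name` in `group`."""
        with self._lock:
            self._counters[group][name] += amount

    def snapshot(self):
        """Return a plain-dict copy of all counters."""
        with self._lock:
            return {g: dict(c) for g, c in self._counters.items()}

    def to_json(self, container_id):
        """Serialize a snapshot, tagged with the (transient) container id."""
        return json.dumps(
            {"container": container_id, "metrics": self.snapshot()},
            sort_keys=True,
        )
```

Because containers are transient, a snapshot like this would need to be pushed out periodically rather than scraped after the fact; how it is shipped is left open here.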
Configuration
YARN config (yarn-site.xml, core-site.xml, etc)

Application Configuration

Transporting Configuration

What we do: Config is fully resolved at client execution time.
No admin-override/locked config protection yet. Config is
passed from client to AM to containers via environment
variables.
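One way to picture "resolve at the client, transport through environment variables" is the sketch below. The variable name `APP_CONFIG_JSON`, the use of JSON as the encoding, and the helper names are all assumptions; the talk only states that config travels from client to AM to containers in the environment.

```python
import json
import os

# Hypothetical environment variable name used to carry the resolved config.
CONFIG_ENV_VAR = "APP_CONFIG_JSON"


def resolve_config(defaults, overrides):
    """Client side: merge overrides onto defaults into one flat dict."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged


def config_to_env(config, base_env=None):
    """Return an environment dict that carries `config` as one JSON string."""
    env = dict(os.environ if base_env is None else base_env)
    env[CONFIG_ENV_VAR] = json.dumps(config, sort_keys=True)
    return env


def config_from_env(env=None):
    """AM or container side: decode the config from the environment."""
    env = os.environ if env is None else env
    return json.loads(env[CONFIG_ENV_VAR])
```

The AM can pass the same variable through unchanged when it launches containers, so every process sees the config exactly as the client resolved it.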
Security
Kerberos?

Firewalls are your friend

Gateway machine

Dashboard

What we do: Firewall all YARN machines so they can only
talk to each other. All users go through an LDAP-controlled
dashboard.
Language
Favor complexity in Application Master, and make
container-logic thin

Talk to RM via REST

Potential to talk to RM via Protobuf RPC

What we do: The Application Master is Java. The task side of
the application has Python and Java implementations.
Logging
Local storage (application is running)

HDFS storage (application has stopped for a while)

Be careful with STDOUT/STDERR (rollover)

What we do: No HDFS. Logs sit for 7 days, then disappear.
Not ideal.
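The STDOUT/STDERR rollover concern can be illustrated with a size-bounded log file written by the container itself instead of unbounded redirected output. This is a sketch using Python's standard `logging.handlers.RotatingFileHandler`; the function name, file naming, and size limits are assumptions, not what the talk's system does.

```python
import logging
import logging.handlers
import os


def build_container_logger(log_dir, name="container",
                           max_bytes=10 * 1024 * 1024, backups=5):
    """Return a logger that writes to a rotating file under `log_dir`.

    At most `backups` rolled-over files are kept next to the active file,
    so total disk use per container stays bounded.
    """
    os.makedirs(log_dir, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records off the root logger's stderr
    if not logger.handlers:
        handler = logging.handlers.RotatingFileHandler(
            os.path.join(log_dir, name + ".log"),
            maxBytes=max_bytes,
            backupCount=backups,
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```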
Fault Tolerance
Failure matrix

HA RM/NM

Orphaned processes

Pay attention to process trees

What we do: No HA. Manual failover when RM dies.
Orphaned process monitor (proc start time < RM start time).
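The orphan check in parentheses above reduces to a single comparison, sketched here. The `ProcInfo` type and `find_orphans` function are hypothetical names; how process start times and the RM start time are collected (and what is done with the orphans) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcInfo:
    pid: int
    start_time: float  # seconds since the epoch


def find_orphans(processes, rm_start_time):
    """Return processes that started before the current RM did.

    Without HA, a restarted RM has no record of containers launched by its
    predecessor, so anything older than the RM is treated as orphaned.
    """
    return [p for p in processes if p.start_time < rm_start_time]
```

For example, with an RM started at time 1000, a process started at 900 is reported and one started at 1100 is not.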
Isolation
Memory

Disk

CPU

Network

What we do: Nothing, right now. Hoping YARN will solve
this before we need it (cgroups?).
Dashboard
Application-specific information

Integrate with YARN

Application Master or Standalone?

What we do: Dashboard enforces security, talks to RM/AM
via HTTP/JSON to get information about jobs.
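A dashboard backend that asks the RM for job information over HTTP/JSON could be sketched as follows. The talk does not name an endpoint; this sketch assumes the ResourceManager REST API's application list at `/ws/v1/cluster/apps` as documented for Hadoop, and the `rm_address` value (host and port) and function names are hypothetical.

```python
import json
import urllib.request


def apps_url(rm_address):
    """Build the application-list URL for an RM at `host:port`."""
    return "http://%s/ws/v1/cluster/apps" % rm_address


def parse_apps(body):
    """Extract id, name, and state for each application in the response."""
    doc = json.loads(body)
    # "apps" may be null when the cluster has no applications.
    apps = (doc.get("apps") or {}).get("app", [])
    return [
        {"id": a.get("id"), "name": a.get("name"), "state": a.get("state")}
        for a in apps
    ]


def fetch_apps(rm_address, timeout_s=5):
    """Query the RM and return the parsed application list."""
    request = urllib.request.Request(
        apps_url(rm_address), headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return parse_apps(response.read().decode("utf-8"))
```

Keeping the parsing separate from the HTTP call lets the dashboard apply its own access checks before any request reaches the RM.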
State
HDFS

Deployed with Application

Remote data store

What we do: Nothing, right now.
Takeaways
There’s a lot more than just the YARN API

Look for examples (Spark, Storm, MapReduce)

Decide your level of Hadoop integration
  Metrics2

  HDFS

  Config

  Kerberos and doAs
Questions?
