© Hortonworks Inc. 2013
Hoya: HBase on YARN
Steve Loughran & Devaraj Das
{stevel, ddas} at hortonworks.com
@steveloughran, @ddraj
August 2013
Hadoop as Next-Gen Platform
HADOOP 1.0: Single Use System (Batch Apps)
• MapReduce (cluster resource management & data processing)
• HDFS (redundant, reliable storage)
HADOOP 2.0: Multi Purpose Platform (Batch, Interactive, Online, Streaming, …)
• MapReduce (data processing)
• Others (data processing)
• YARN (cluster resource management)
• HDFS2 (redundant, reliable storage)
YARN: Taking Hadoop Beyond Batch
Applications Run Natively IN Hadoop
HDFS2 (Redundant, Reliable Storage)
YARN (Cluster Resource Management)
• BATCH (MapReduce)
• INTERACTIVE (Tez)
• STREAMING (Storm, S4, …)
• GRAPH (Giraph)
• IN-MEMORY (Spark)
• HPC MPI (OpenMPI)
• OTHER (Search), (Weave…)
• Samza
Store ALL DATA in one place…
Interact with that data in MULTIPLE WAYS with Predictable Performance and Quality of Service
HDFS2 (Redundant, Reliable Storage)
YARN (Cluster Resource Management)
• BATCH (MapReduce)
• INTERACTIVE (Tez)
• STREAMING (Storm, S4, …)
• GRAPH (Giraph)
• IN-MEMORY (Spark)
• HPC MPI (OpenMPI)
• OTHER (Search), (Weave…)
• HBase
And HBase?
Hoya: On-demand HBase clusters
1. Small HBase cluster in large YARN cluster
2. Dynamic HBase clusters
3. Elastic HBase clusters
4. Transient/intermittent clusters for workflows
5. Custom versions & configurations
6. More efficient utilization/sharing of cluster
Goal: No code changes in HBase
• Today : none
HBase 0.95.2$ mvn install -Dhadoop.version=2.0
But we'd like
• ZK reporting of web UI ports
• Allocation of tables in RS to be block location aware
• A way to get from failed RS to YARN container
(configurable ID is enough)
Hoya – the tool
• Hoya (HBase On YArn)
–Java tool
–Completely CLI driven
• Input: cluster description as JSON
–Specification of cluster: node options, ZK params
–Configuration generated
–Entire state persisted
• Actions: create, freeze/thaw, flex, exists <cluster>
• Can change cluster state later
–Add/remove nodes, started / stopped states
YARN manages the cluster
[Diagram: one YARN Resource Manager and three YARN Node Managers, each alongside HDFS]
• Servers run YARN Node Managers
• NMs heartbeat to Resource Manager
• RM schedules work over cluster
• RM allocates containers to apps
• NMs start containers
• NMs report container health
Hoya Client creates App Master
[Diagram: YARN Resource Manager and three YARN Node Managers, each alongside HDFS; Hoya Client; Hoya AM]
AM deploys HBase with YARN
[Diagram: YARN Resource Manager and three YARN Node Managers, each alongside HDFS; Hoya Client; Hoya AM [HBase Master]; two HBase Region Servers]
HBase & clients bind via Zookeeper
[Diagram: YARN Resource Manager and three YARN Node Managers, each alongside HDFS; HBase Region Servers on two of the Node Managers; Hoya AM [HBase Master]; HBase Client; Hoya Client]
YARN notifies AM of failures
[Diagram: YARN Resource Manager and three YARN Node Managers, each alongside HDFS; Hoya Client; Hoya AM [HBase Master]; three HBase Region Servers]
HOYA - cool bits
• Cluster specification stored as JSON in HDFS
• Conf dir cached, dynamically patched before pushing
up as local resources for master & region servers
• HBase .tar file stored in HDFS; clusters can use the
same/different HBase versions
• Handling of cluster flexing is the same code as
unplanned container loss.
• No Hoya code on region servers
HOYA - AM RPC API
//shut down
public void stopCluster();
//change #of worker nodes in cluster
public boolean flexNodes(int workers);
//get JSON description of live cluster
public String getClusterStatus();
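To try client code against these three operations without a YARN cluster, an in-memory stand-in is enough. The sketch below is only an illustration: the interface name `ClusterControl`, the class `FakeClusterControl`, and the fields in the status string are assumptions, not Hoya's actual types or status schema.

```java
import java.io.IOException;

/** The three operations listed on the slide; the interface name is hypothetical. */
interface ClusterControl {
    void stopCluster() throws IOException;
    boolean flexNodes(int workers) throws IOException;
    String getClusterStatus() throws IOException;
}

/** In-memory stand-in for exercising client code; not Hoya's AM. */
class FakeClusterControl implements ClusterControl {
    private int workers;
    private boolean stopped;

    FakeClusterControl(int workers) {
        this.workers = workers;
    }

    @Override
    public void stopCluster() {
        stopped = true;
    }

    @Override
    public boolean flexNodes(int newWorkers) {
        if (newWorkers == workers) {
            return false;  // same size requested: report a no-op
        }
        workers = newWorkers;
        return true;
    }

    @Override
    public String getClusterStatus() {
        // Field names are illustrative only, not Hoya's status schema.
        return "{ \"workers\" : " + workers + ", \"stopped\" : " + stopped + " }";
    }
}
```

A caller would flex the fake to a new size and then read the status string back, which is roughly the round trip a CLI client makes over RPC.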
Flexing/failure handling is same code
public boolean flexNodes(int workers) throws IOException {
  log.info("Flexing cluster count from {} to {}", numTotalContainers, workers);
  if (numTotalContainers == workers) {
    //no-op
    log.info("Flex is a no-op");
    return false;
  }
  //update the #of workers
  numTotalContainers = workers;
  // ask for more containers if needed
  reviewRequestAndReleaseNodes();
  return true;
}
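The slide does not show `reviewRequestAndReleaseNodes()`. One way a single review step could serve both flexing and failure handling is sketched below, under stated assumptions: the class `ContainerReconciler`, its fields, and the signed-delta return value are hypothetical, and the sketch ignores container requests that are still outstanding. Both paths only adjust a counter and then compare the desired size with the live count.

```java
/** Hypothetical sketch; not Hoya's actual implementation. */
class ContainerReconciler {
    private int desired;  // target number of worker containers
    private int running;  // containers currently believed to be live

    ContainerReconciler(int desired) {
        this.desired = desired;
    }

    /** Planned change: the operator asks for a new cluster size. */
    synchronized int flex(int workers) {
        if (workers < 0) {
            throw new IllegalArgumentException("workers must be >= 0");
        }
        desired = workers;
        return review();
    }

    /** Unplanned change: YARN reports that a container has gone. */
    synchronized int containerLost() {
        if (running > 0) {
            running--;
        }
        return review();
    }

    /** YARN reports that a requested container has started. */
    synchronized void containerStarted() {
        running++;
    }

    /**
     * Shared path for both cases. A positive result is the number of
     * containers to request, a negative one the number to release,
     * and zero means the cluster already matches the target.
     */
    private int review() {
        return desired - running;
    }
}
```

Because `flex` and `containerLost` both end in `review()`, a lost region server and a request to grow the cluster by one produce the same follow-up action.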
{
  "name" : "TestHBaseMaster",
  "createTime" : 1371738651059,
  "flags" : { "--Xtest" : "true" },
  "originConfigurationPath" : "file:/Users/stevel/.hoya/cluster/TestHBaseMaster/orig",
  "generatedConfigurationPath" : "file:/Users/stevel/.hoya/cluster/TestHBaseMaster/gen",
  "hBaseClientProperties" : { },
  "hbaseHome" : "/Users/stevel/Java/Apps/hbase",
  "hbaseRootPath" : "file:/Users/stevel/.hoya/cluster/TestHBaseMaster/hbase",
  "zkHosts" : "127.0.0.1",
  "zkPath" : "/hbase",
  "zkPort" : 49564,
  "workers" : 5,
  "masterHeap" : 128,
  "masters" : 1,
  "workerHeap" : 256,
  "startTime" : 0,
  "state" : 1,
  "statusTime" : 0,
  "stopTime" : 0
}
Spec: declarative parts; this is persisted
Cluster Specification: persistent & wire
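The specification carries the ZooKeeper and root-path values that a generated HBase configuration needs. As a rough sketch of that step, the helper below copies four spec fields into standard HBase property names. The property keys are real HBase settings, but the class `SpecToHBaseConf` and the exact mapping are assumptions for illustration, not Hoya's generation code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: maps spec fields to standard HBase properties. */
class SpecToHBaseConf {
    static Map<String, String> clientProperties(String zkHosts, int zkPort,
                                                String zkPath, String hbaseRootPath) {
        // LinkedHashMap keeps insertion order, so generated output is stable.
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("hbase.zookeeper.quorum", zkHosts);
        conf.put("hbase.zookeeper.property.clientPort", Integer.toString(zkPort));
        conf.put("zookeeper.znode.parent", zkPath);
        conf.put("hbase.rootdir", hbaseRootPath);
        return conf;
    }
}
```

Giving each cluster its own `zkPath` and root path is what would let several HBase instances share one ZooKeeper ensemble and one filesystem without colliding.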
Current status
• Able to create & stop on-demand HBase clusters
–RegionServer failures handled
• Able to specify specific HBase configuration:
hbase-home or .tar.gz
• Cluster stop, restart, flex
• get (dynamic) conf as XML, properties
What's Next
• Multiple roles: worker, master, monitor
--role worker --roleopts worker yarn.vcores 2
• Multiple Providers: HBase + others
–client side: preflight, configuration patching
–server side: starting roles, liveness
• Liveness probes: HTTP GET, RPC port, RPC op?
• YARN enhancements
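Of the liveness probes listed, the "RPC port" check is the simplest to picture: it asks only whether something accepts TCP connections on a given port. A minimal sketch follows, with the class `PortProbe` and its method name being hypothetical. It says nothing about whether the service behind the port is healthy, which is why the HTTP GET and RPC operation probes are listed as well.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks only that a TCP port accepts connections. */
class PortProbe {
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // connect() throws if the port is closed, unreachable, or the timeout expires
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

A monitor could call `isListening` periodically and treat several consecutive failures, rather than a single one, as a sign that the role instance is down.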
YARN-896: long-lived services
1. Container reconnect on AM restart
2. Token renewal on long-lived apps
3. Containers: signalling, >1 process sequence
4. AM/RM managed gang scheduling
5. Anti-affinity hint in container requests
6. Service Registry - ZK?
7. Logging
All post Hadoop-2.1
Hoya needs a home!
https://github.com/hortonworks/hoya
Questions?
hortonworks.com
http://hortonworks.com/careers/
P.S: we are hiring
Requirements of an App: MUST
• Install from tarball; run as normal user
• Pre-configurable, static instance config data
• Deploy/start without human intervention
• Support dynamic discovery/binding of peers
• Co-exist with other app instances in cluster/nodes
• Handle co-located role instances
• Persist data to HDFS
• Support 'kill' as a shutdown option
• Support role instances moving after failure
• Handle failed role instances
Requirements of an App: SHOULD
• Be configurable by Hadoop XML files
• Publish dynamically assigned web UI & RPC ports
• Support cluster flexing up/down
• Support API to determine role instance status
• Make it possible to determine role instance ID from app
• Support simple remote liveness probes

Hoya : HBase on YARN (2013-08-20 HBase Hug)
