When Apache Hadoop was first introduced to the open source world, it was focused on implementing Google's Map/Reduce, a framework for batch processing of very large files in a distributed system. Hadoop implemented this framework along with a built-in cluster resource manager specialized for the execution of Map/Reduce programs. The popularity of Hadoop has since led to the (re-)emergence of various other "Big Data" processing paradigms, such as real-time stream processing (Storm, S4), message passing (MPI), and graph processing (Giraph). However, because Hadoop was specialized for Map/Reduce, it has been hard to leverage existing Hadoop infrastructure for these other paradigms. This changes with the release of Apache Hadoop 2.0 and its next-generation Map/Reduce engine: the new resource manager, YARN, not only improves availability, scalability, security and multi-tenancy, but also decouples cluster management from the Map/Reduce engine. YARN manages the cluster's compute resources as "compute slots". Through its application master API, each application can obtain slots from YARN and is free to use them for any type of computation. YARN schedules the available slots in the cluster, monitors running slots, and notifies the application if one of its slots has died. The application master decides how many slots to allocate and what tasks to run in those slots. With this architecture, YARN supports arbitrary compute paradigms that all share the resources of one cluster, allowing for innovation, agility and better hardware utilization. This talk gives a detailed introduction to YARN's concepts, architecture and interfaces. We will illustrate the use of YARN's APIs and show examples of YARN applications. We will share experience and lessons learned from building a real-time streaming engine on top of YARN and running it at scale in production, and we will present best practices and design patterns around YARN.
2. Andreas Neumann, Terence Yim
IN THIS TALK, YOU WILL
• Learn about Hadoop
• Learn about other distributed compute paradigms
• Learn about YARN and how it works
• Learn how to get more value out of your Hadoop cluster
• Learn about Weave
3. Andreas Neumann, Terence Yim
A WEB-APP
[Diagram: a web app with a stateless Presentation tier, a stateless Business Logic tier, and a Database]
4. Andreas Neumann, Terence Yim
A SCALABLE WEB-APP
[Diagram: the same web app scaled out: multiple stateless Presentation and Business Logic instances in front of a Database]
5. Andreas Neumann, Terence Yim
A LOGGING WEB-APP
[Diagram: the web app again, now with each stateless tier emitting log files alongside the Database]
6. Andreas Neumann, Terence Yim
WHAT TO DO WITH MY LOGS?
Analyzing web logs can create real business value
• Optimize for observed user behavior
• Personalize each user’s experience
• ...
Many web logs are simply being discarded
• Too expensive?
• Too hard to analyze?
• Too much heavy lifting?
8. Andreas Neumann, Terence Yim
APACHE HADOOP
• Open source and free
• Runs on commodity hardware - cheap to store
• Programming Paradigm: Map/Reduce (Google 2004)
• Batch processing of very large files in a distributed file system
• Simple yet powerful
• Turn your web app into a data-driven app
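The Map/Reduce paradigm really is simple: a map phase that emits key/value pairs, a shuffle that groups them by key, and a reduce phase that aggregates each group. A minimal in-memory sketch of the idea (plain Java, no Hadoop classes, using the classic word-count example; this illustrates the paradigm only, not the distributed execution):

```java
import java.util.*;

public class WordCount {

  // Map phase: emit a (word, 1) pair for every word in every input line.
  static List<Map.Entry<String, Integer>> map(List<String> lines) {
    List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
    for (String line : lines) {
      for (String word : line.split("\\s+")) {
        if (!word.isEmpty()) {
          emitted.add(new AbstractMap.SimpleEntry<>(word, 1));
        }
      }
    }
    return emitted;
  }

  // Shuffle + Reduce phase: group the pairs by key, then sum values per key.
  static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
    Map<String, Integer> counts = new TreeMap<>();
    for (Map.Entry<String, Integer> pair : pairs) {
      counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    List<String> lines = Arrays.asList("big data", "big cluster");
    System.out.println(reduce(map(lines))); // {big=2, cluster=1, data=1}
  }
}
```

In Hadoop, the mappers run in parallel over file splits and the shuffle moves data across the network, but the programming model is exactly this pair of functions.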
9. Andreas Neumann, Terence Yim
A MAP/REDUCE APP
[Diagram: input splits feeding Mappers, a shuffle phase, and Reducers writing output parts]
10. Andreas Neumann, Terence Yim
MAP/REDUCE CLUSTER
[Diagram: a Map/Reduce cluster with input splits and output parts distributed across several nodes]
11. Andreas Neumann, Terence Yim
I HAVE HADOOP - WHAT NEXT?
What if:
• Map/Reduce is not the only thing I want to run?
• I would like to do real-time analysis?
• Or run a message-passing algorithm over my data?
• My cluster has more compute capacity than my big data needs?
Can I utilize its idle resources for other, compute-intensive tasks?
14. Andreas Neumann, Terence Yim
A DISTRIBUTED LOAD TEST
[Diagram: many distributed test tasks hammering a single Web Service]
15. Andreas Neumann, Terence Yim
HADOOP AT CONTINUUITY
Continuuity AppFabric
• Developer-centric big data app platform
• Runs many different types of jobs in Hadoop cluster
• Real-time stream processing
• Ad-hoc OLAP queries
• Map/Reduce
16. Andreas Neumann, Terence Yim
A MULTI-PURPOSE CLUSTER
[Diagram: one cluster running mixed workloads side by side: a logging web app feeding a database, Map/Reduce jobs (splits and parts), data sets, a distributed load test against a web service, and an event stream into another database]
17. Andreas Neumann, Terence Yim
THE ANSWER: YARN
• The Resource Manager of Hadoop 2.0
• Separates resource management from programming paradigm
• Allows (almost) all kinds of distributed applications in a Hadoop cluster
• Multi-tenant, secure
• Scales to many thousands of nodes
• Pluggable scheduling algorithms
18. Andreas Neumann, Terence Yim
YARN APPLICATION MASTER
[Diagram: the same multi-purpose cluster, now with one Application Master (AM) per application, all coordinated by the YARN Resource Manager]
19. Andreas Neumann, Terence Yim
YARN - HOW IT WORKS
• Every node (machine) in the cluster has several slots.
• A node manager runs on each node to manage the node’s slots.
• The resource manager assigns the slots of the cluster to applications.
• Every application must have a master.
• The master requests slots from the resource manager and starts tasks.
20. Andreas Neumann, Terence Yim
YARN - HOW IT WORKS
Protocols
1) Client-RM: Submit the app master
2) RM-NM: Start the app master
3) AM-RM: Request + release containers
4) RM-NM: Start tasks in containers
[Diagram: these four protocols connecting the YARN Client, the Resource Manager, the Node Managers, the AM, and its tasks]
21. Andreas Neumann, Terence Yim
BACKGROUND: HADOOP+HDFS
[Diagram: each node runs a Data Node that contributes part of its local file system to the HDFS distributed file system, coordinated by the Name Node]
• Every node contributes part of its local file system to HDFS.
• Tasks can only depend on the local file system (the JVM class path does not understand the HDFS protocol).
22. Andreas Neumann, Terence Yim
BACKGROUND: HADOOP+HDFS
[Diagram: the YARN Client has AM.jar only on its own local file system, so a Node Manager cannot start the AM directly from it]
23. Andreas Neumann, Terence Yim
BACKGROUND: HADOOP+HDFS
[Diagram: 1) the client copies AM.jar to HDFS; 2) the target node copies AM.jar from HDFS to its local file system; 3) the Node Manager starts the AM from the local copy]
24. Andreas Neumann, Terence Yim
WRITING THE YARN CLIENT
1) Connect to the Resource Manager.
2) Request a new application ID.
3) Create a submission context and a container launch context.
4) Define the local resources for the AM.
5) Define the environment for the AM.
6) Define the command to run for the AM.
7) Define the resource limits for the AM.
8) Submit the request to start the app master.
25. Andreas Neumann, Terence Yim
WRITING THE YARN CLIENT
Connect to the Resource Manager:
YarnConfiguration yarnConf = new YarnConfiguration(conf);
InetSocketAddress rmAddress =
NetUtils.createSocketAddr(yarnConf.get(
YarnConfiguration.RM_ADDRESS,
YarnConfiguration.DEFAULT_RM_ADDRESS));
LOG.info("Connecting to ResourceManager at " + rmAddress);
Configuration rmServerConf = new Configuration(conf);
rmServerConf.setClass(
YarnConfiguration.YARN_SECURITY_INFO,
ClientRMSecurityInfo.class, SecurityInfo.class);
ClientRMProtocol resourceManager = (ClientRMProtocol) rpc.getProxy(
ClientRMProtocol.class, rmAddress, rmServerConf);
26. Andreas Neumann, Terence Yim
WRITING THE YARN CLIENT
Request an application ID, create a submission context and a launch
context:
GetNewApplicationRequest request =
Records.newRecord(GetNewApplicationRequest.class);
GetNewApplicationResponse response =
resourceManager.getNewApplication(request);
ApplicationId appId = response.getApplicationId();
LOG.info("Got new ApplicationId=" + appId);
ApplicationSubmissionContext appContext =
Records.newRecord(ApplicationSubmissionContext.class);
appContext.setApplicationId(appId);
appContext.setApplicationName(appName);
ContainerLaunchContext amContainer =
Records.newRecord(ContainerLaunchContext.class);
27. Andreas Neumann, Terence Yim
WRITING THE YARN CLIENT
Define the local resources:
Map<String, LocalResource> localResources = Maps.newHashMap();
// assume the AM jar is here:
Path jarPath; // <- known path to jar file
// Create a resource with location, time stamp and file length
LocalResource amJarRsrc = Records.newRecord(LocalResource.class);
amJarRsrc.setType(LocalResourceType.FILE);
amJarRsrc.setResource(ConverterUtils.getYarnUrlFromPath(jarPath));
FileStatus jarStatus = fs.getFileStatus(jarPath);
amJarRsrc.setTimestamp(jarStatus.getModificationTime());
amJarRsrc.setSize(jarStatus.getLen());
localResources.put("AppMaster.jar", amJarRsrc);
amContainer.setLocalResources(localResources);
28. Andreas Neumann, Terence Yim
WRITING THE YARN CLIENT
Define the environment:
// Set up the environment needed for the launch context
Map<String, String> env = new HashMap<String, String>();
// Setup the classpath needed.
// Assuming our classes are available as local resources in the
// working directory, we need to append "." to the path.
String classPathEnv = "$CLASSPATH:./*:";
env.put("CLASSPATH", classPathEnv);
// setup more environment
env.put(...);
amContainer.setEnvironment(env);
29. Andreas Neumann, Terence Yim
WRITING THE YARN CLIENT
Define the command to run for the AM:
// Construct the command to be executed on the launched container
String command =
"${JAVA_HOME}" + "/bin/java" +
" MyAppMaster" +
" arg1 arg2 arg3" +
" 1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout" +
" 2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr";
List<String> commands = new ArrayList<String>();
commands.add(command);
// Set the commands into the container spec
amContainer.setCommands(commands);
30. Andreas Neumann, Terence Yim
WRITING THE YARN CLIENT
Define the resource limits for the AM:
// Define the resource requirements for the container.
// For now, YARN only supports memory constraints.
// If the process takes more memory, it is killed by the framework.
Resource capability = Records.newRecord(Resource.class);
capability.setMemory(amMemory);
amContainer.setResource(capability);
// Set the container launch content into the submission context
appContext.setAMContainerSpec(amContainer);
31. Andreas Neumann, Terence Yim
WRITING THE YARN CLIENT
Submit the request to start the app master:
// Create the request to send to the Resource Manager
SubmitApplicationRequest appRequest =
Records.newRecord(SubmitApplicationRequest.class);
appRequest.setApplicationSubmissionContext(appContext);
// Submit the application to the ApplicationsManager
resourceManager.submitApplication(appRequest);
32. Andreas Neumann, Terence Yim
YARN: APPLICATION MASTER
[Diagram: a Node Manager has launched the AM in a container on one of the cluster's nodes]
33. Andreas Neumann, Terence Yim
YARN - APPLICATION MASTER
• Now we have an Application Master running in a container.
• Next, the master starts the application’s tasks in containers
• For each task, similar procedure as starting the master
34. Andreas Neumann, Terence Yim
YARN: APPLICATION MASTER
[Diagram: the AM has started three tasks in containers across the cluster's nodes]
35. Andreas Neumann, Terence Yim
YARN - APPLICATION MASTER
1) Connect with Resource Manager and register itself.
2) Loop:
• Send request to Resource Manager, to:
a) Send heartbeat
b) Request containers
c) Release containers
• Receive Response from Resource Manager
a) Receive containers
b) Receive notification of containers that terminated
• For each container received, send request to Node Manager to start task.
36. Andreas Neumann, Terence Yim
YARN - APPLICATION MASTER
• Master should terminate after all containers have terminated.
• If the master crashes (or fails to send a heartbeat), all containers are killed.
• In the future, YARN will support restart of the master.
• Starting a task in a container
• Very similar to YARN client starting the master.
• Node manager starts and monitors task.
• If task crashes, Node Manager informs Resource Manager
• Resource Manager informs master in next response
• Master must still release the container.
37. Andreas Neumann, Terence Yim
YARN - APPLICATION MASTER
• Resource Manager assigns containers asynchronously:
• Requested containers are returned, at the earliest, in the next call.
• The master must keep sending (possibly empty) requests until it has received all of its containers.
• Subsequent requests are incremental and can ask for additional containers.
• The master must keep track of what it has requested and received.
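The bookkeeping this implies is small but easy to get wrong. A minimal sketch of the state a master might track across allocate calls (the class and method names here are ours, not part of the YARN API):

```java
import java.util.HashSet;
import java.util.Set;

// Tracks containers the master has asked for but not yet received, plus
// the containers currently running, so the master knows when to keep
// heartbeating with empty requests and when it may terminate.
public class ContainerTracker {
  private int outstanding;                              // requested, not yet allocated
  private final Set<String> running = new HashSet<>();  // allocated container ids

  // Incremental ask: add n more containers to the outstanding count.
  public void request(int n) { outstanding += n; }

  // Called for each container returned in a Resource Manager response.
  public void onAllocated(String containerId) {
    outstanding--;
    running.add(containerId);
  }

  // Called when the RM reports that a container has terminated.
  public void onCompleted(String containerId) {
    running.remove(containerId);
  }

  // While true, keep sending (possibly empty) allocate requests.
  public boolean awaitingContainers() { return outstanding > 0; }

  // The master should exit once all of its containers have terminated.
  public boolean allDone() { return outstanding == 0 && running.isEmpty(); }
}
```

With this state, the heartbeat loop becomes: request once, poll with empty requests while awaitingContainers(), and exit when allDone().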
38. Andreas Neumann, Terence Yim
YARN - APPLICATION LOGS
[Diagram: log flow for a YARN application: 1) the AM and tasks write logs via log4j, stdout and stderr; 2) the Node Managers copy the logs to HDFS; 3) the client queries the RM for the logs' location; 4) the client accesses the logs through the Name Node]
39. Andreas Neumann, Terence Yim
EXAMPLE - DISTRIBUTED SHELL
• Example application in the YARN/Hadoop distribution
• Runs a shell command in a number of containers
• hadoop-2.0.0-alpha: 1681 lines of code
• somewhat simplified in 2.0.3-alpha: 1543 lines
• Conclusion: YARN is complex!
41. Andreas Neumann, Terence Yim
YARN - SHORTCOMINGS
• Complexity
• Protocols are at very low level and very verbose.
• Client must prepare all dependent jars on HDFS.
• Client must set up environment, class path etc.
• Logs are only collected after application terminates
• What about long-running apps?
• Applications don’t survive master crash: SPOF!
• No built-in communication between containers and master.
• Hard to debug
42. Andreas Neumann, Terence Yim
TYPICAL YARN APP
• Many distributed applications are simple.
• For example: Distributed load test for a queuing system:
• “Run n producers and m consumers for k seconds”
• Neither producer nor consumer know that they run in YARN
• This would be easy if we ran them as threads in a single JVM
• Can we make YARN as simple as Java Threads?
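For comparison, here is what that load test looks like with plain Java threads in a single JVM. This is a sketch with stand-in producer and consumer bodies (the real ones would talk to the queuing system under test), passing messages through an in-process queue:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class LoadTest {

  // Run n producers and m consumers; return the number of messages consumed.
  public static int run(int producers, int consumers, int messagesPerProducer)
      throws InterruptedException {
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    AtomicInteger consumed = new AtomicInteger();
    ExecutorService pool = Executors.newFixedThreadPool(producers + consumers);
    int total = producers * messagesPerProducer;

    // Producers: each pushes its share of messages onto the queue.
    for (int p = 0; p < producers; p++) {
      pool.execute(() -> {
        for (int i = 0; i < messagesPerProducer; i++) {
          queue.add("msg");
        }
      });
    }
    // Consumers: poll until all produced messages have been consumed.
    for (int c = 0; c < consumers; c++) {
      pool.execute(() -> {
        try {
          while (consumed.get() < total) {
            if (queue.poll(100, TimeUnit.MILLISECONDS) != null) {
              consumed.incrementAndGet();
            }
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.SECONDS);
    return consumed.get();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(run(2, 3, 100)); // 200
  }
}
```

Neither the producer nor the consumer code knows anything about the execution environment; the question is whether running the same bodies in YARN containers can be made this easy.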
43. Andreas Neumann, Terence Yim
TYPICAL NEEDS ...
... of a distributed application:
• Monitor and restart containers
• Collect logs and metrics
• Elastic scale (dynamic adding/removing of instances)
• Long-running jobs
45. Andreas Neumann, Terence Yim
HERE COMES WEAVE
• Open-Source, Apache License
• Does not replace YARN, but complements it
• Generic App Master
+ a DSL for specifying applications
+ an API for writing container tasks
+ built-in service discovery, logging, metrics, monitoring
47. Andreas Neumann, Terence Yim
WEAVE
[Diagram: the Weave Client uses a WeaveRunner to talk to the YARN Resource Manager; a generic AM bundled with Kafka runs the application's Weave Runnables as tasks in containers; ZooKeeper stores state and messages, and logs are streamed back via Kafka; the Weave Runnable is the only programming interface you need]
48. Andreas Neumann, Terence Yim
RUNNABLE
Define a task:
public class EchoServer implements WeaveRunnable
{
static Logger LOG = ...
private WeaveContext context;
//....
@Override
public void run() {
ServerSocket serverSocket = new ServerSocket(0);
context.announce("echo", serverSocket.getLocalPort());
while (isRunning()) {
Socket socket = serverSocket.accept();
//...
}
}
}
• WeaveRunnable = a single task
• SLF4J can be used for logging: collected via Kafka and available to the client
• A WeaveRunnable can run in a thread or in a YARN container
• announce() publishes a service endpoint
49. Andreas Neumann, Terence Yim
RUNNER
Start a task:
WeaveRunner runner = ...;
WeaveController controller =
runner.prepare(new EchoServer())
.addLogHandler(new PrinterLogHandler(
new PrintWriter(System.out)
)
).start();
//...
controller.sendCommand(
Commands.create("stat", "incoming"));
//...
controller.stop();
• WeaveRunner packages the runnable and its dependencies, plus any specified resources, and attaches a log collector to all containers
• WeaveController manages the lifecycle of a runnable or application
• The controller can also send commands to running containers
50. Andreas Neumann, Terence Yim
APPLICATION
Define an application:
public class WebServer implements WeaveApplication {
@Override
public WeaveSpecification configure() {
return WeaveSpecification.Builder.with()
.setName("Jetty WebServer")
.withRunnables()
.add(new JettyWebServer())
.withLocalFiles()
.add("www-html", new File("/var/html"))
.apply()
.add(new LogsCollector())
.anyOrder()
.build();
}
}
WeaveController controller =
runner.prepare(new WebServer()).start();
• WeaveApplication = a collection of tasks
• An application is described by a WeaveSpecification
• An application can specify additional local files and directories to be made available
• With anyOrder(), Weave starts containers in no particular order
• A WeaveRunner can also start an application
51. Andreas Neumann, Terence Yim
APPLICATION
Define an application with order:
public class DataApplication implements WeaveApplication
{
@Override
public WeaveSpecification configure() {
return WeaveSpecification.Builder.with()
.setName("Cool Data Application")
.withRunnables()
.add("reader", new InputReader())
.add("processor", new Processor())
.add("writer", new OutputWriter())
.order()
.first("reader")
.next("processor")
.next("writer")
.build();
}
}
• order() starts containers in the given order.
52. Andreas Neumann, Terence Yim
SUMMARY
• Web apps generate data.
• Data-driven apps with Hadoop.
• More value out of Hadoop with YARN.
• Easier with Weave.
• Now every Java developer can write distributed apps.
53. Andreas Neumann, Terence Yim
THANK YOU
https://github.com/continuuity/weave
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html
Yes, we are hiring:
careers@continuuity.com