4. Overview: Coordinator
• Oozie executes workflows based on:
– Time Dependency (Frequency)
– Data Dependency
• Introduced in Oozie 2.x.
[Diagram: Oozie Client, WS API, Oozie Server (Coordinator, Oozie Workflow), Hadoop; the Coordinator checks data availability]
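The two trigger conditions on this slide can be illustrated with a minimal sketch. It is not Oozie code: the function names, the minute-based frequency, and the `data_available` callback are assumptions made for illustration. An action is considered ready only when its nominal time has passed (time dependency) and its input data is present (data dependency).

```python
from datetime import datetime, timedelta


def nominal_times(start, end, frequency_minutes):
    """Yield the nominal time of each action in [start, end)."""
    step = timedelta(minutes=frequency_minutes)
    t = start
    while t < end:
        yield t
        t += step


def ready_actions(start, end, frequency_minutes, now, data_available):
    """Return nominal times whose time AND data dependencies are both met.

    data_available is a hypothetical callback: nominal time -> bool.
    """
    return [
        t
        for t in nominal_times(start, end, frequency_minutes)
        if t <= now and data_available(t)
    ]


if __name__ == "__main__":
    start = datetime(2011, 1, 1, 0, 0)
    end = datetime(2011, 1, 1, 4, 0)
    # Pretend only the first two hourly input directories have landed.
    landed = {start, start + timedelta(hours=1)}
    now = datetime(2011, 1, 1, 3, 0)
    ready = ready_actions(start, end, 60, now, lambda t: t in landed)
    print([t.strftime("%H:%M") for t in ready])
```

In this sketch an action whose nominal time has passed but whose data is missing simply stays pending until a later check.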
5. Oozie 3.x: Bundle
• Users can define and execute a set of
coordinator applications.
• Users can start/stop/suspend/resume/rerun at
the bundle level.
• Benefits: Easier to maintain and control large data
pipeline applications for the Service Engineering
team.
[Diagram: Oozie Client, WS API, Oozie Server (Bundle, Coordinator, Oozie Workflow), Hadoop; the server checks data availability]
7. Enhanced Stability and Scalability
• Issue:
– At very high load, Oozie becomes slow.
– This accounts for 90% of all Oozie support incidents.
• Reason:
– Many active but non-progressing jobs.
– Oozie's internal queue fills up.
• Resolution:
– Throttle the number of active jobs/coordinator.
– Put the job into a timeout state.
– Enforce uniqueness of Oozie queue elements.
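The queue-related part of the resolution can be sketched as a bounded queue that rejects duplicate entries. This is an illustrative model only, not Oozie's internal queue; the class name `UniqueBoundedQueue`, its methods, and the key format are assumptions.

```python
from collections import deque


class UniqueBoundedQueue:
    """FIFO queue that rejects duplicate keys and refuses work when full."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._items = deque()
        self._keys = set()

    def offer(self, key, command):
        """Queue a command; return False if it is a duplicate or the queue is full."""
        if key in self._keys or len(self._items) >= self.max_size:
            return False
        self._keys.add(key)
        self._items.append((key, command))
        return True

    def poll(self):
        """Remove and return the oldest (key, command) pair, or None if empty."""
        if not self._items:
            return None
        key, command = self._items.popleft()
        self._keys.discard(key)
        return key, command

    def __len__(self):
        return len(self._items)


if __name__ == "__main__":
    q = UniqueBoundedQueue(max_size=2)
    print(q.offer("coord-1:check", "check"))  # accepted
    print(q.offer("coord-1:check", "check"))  # duplicate, rejected
```

Rejecting duplicates keeps a stuck job from filling the queue with copies of the same command, and the size bound acts as a simple throttle.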
8. Improved Usability
• Issue:
– Coordinator job status is not intuitive and
confuses Oozie users.
• Reason:
– Status SUCCEEDED doesn’t mean the job was
successful!
– Status PREMATER is for Oozie internal use only,
but it was exposed to users.
• Resolution:
– Redesign the Coordinator status.
9. Coordinator Status Redesign
Current states: PREP, PREMATER, RUNNING, SUSPENDED, SUCCEEDED, KILLED, FAILED
New states: PREP, RUNNING, PAUSED, SUSPENDED, SUCCEEDED, DONE_WITH_ERROR, KILLED, FAILED
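One way to read the new DONE_WITH_ERROR state is sketched below. The enum lists the new states from the slide; the rule in `finished_status` is an assumption made for illustration (the slides do not spell out the transition rules), and it covers only a job whose actions have all finished.

```python
from enum import Enum


class CoordStatus(Enum):
    PREP = "PREP"
    RUNNING = "RUNNING"
    PAUSED = "PAUSED"
    SUSPENDED = "SUSPENDED"
    SUCCEEDED = "SUCCEEDED"
    DONE_WITH_ERROR = "DONE_WITH_ERROR"
    KILLED = "KILLED"
    FAILED = "FAILED"


def finished_status(action_succeeded):
    """Assumed rule: status of a coordinator job once every action has finished.

    action_succeeded is a list of booleans, one per action (True = success).
    KILLED and FAILED are deliberately not derived here.
    """
    if not action_succeeded:
        raise ValueError("expected at least one finished action")
    if all(action_succeeded):
        return CoordStatus.SUCCEEDED
    return CoordStatus.DONE_WITH_ERROR


if __name__ == "__main__":
    print(finished_status([True, True]).value)
    print(finished_status([True, False]).value)
```

Under this reading, SUCCEEDED is reserved for jobs in which every action succeeded, which addresses the confusion described on the previous slide.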
10. The Second Year ...
• Number of Releases
– Feature Releases : 3
– Patches : 9
• Backward compatibility is strongly maintained.
• No need to resubmit the job if Oozie is restarted.
• Code Overhaul:
– Redesigned the command pattern to avoid DB
connection leaks and to improve DB connection
usage.
11. Oozie Usage
• Y! internal usage:
– Total number of users: 377
– Total number of processed jobs ≈ 600K/month
• External downloads:
– 1500+ in the last 8 months from GitHub
– A large number of downloads through 3rd-party
packaging.
13. Challenge 1: Data Availability Check
• Issue:
– Currently checks the directory every minute (polling
based).
– Increases NN overhead and does not scale well.
• Reason: No metadata system with an
appropriate notification mechanism.
• Planned resolution: Integrate with the HCatalog
metadata system.
14. Challenge 2: Adaptability to Hadoop
• Issue: If the Hadoop NN or JT is down, Oozie still
submits the job, which then fails. User intervention
is required when the Hadoop server is back.
• Impact: Inconvenient for Oozie users. For example,
if Hadoop is restarted on Friday night, the job will not
run until next Monday.
• Planned resolution: Graceful handling of Hadoop
downtime:
– If Hadoop is down, block submission.
– When Hadoop becomes available:
• Submit the blocked jobs.
• Auto-resubmit the untraced jobs.
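The block-then-submit part of the planned resolution can be sketched as follows. This is a toy model, not Oozie's design: `DowntimeAwareSubmitter`, the `is_hadoop_up` health probe, and the `submit` callback are hypothetical names, and auto-resubmission of untraced jobs is not modeled.

```python
class DowntimeAwareSubmitter:
    """Holds job submissions while Hadoop is down and flushes them afterwards."""

    def __init__(self, is_hadoop_up, submit):
        self._is_hadoop_up = is_hadoop_up  # hypothetical probe: () -> bool
        self._submit = submit              # hypothetical callback: job_id -> None
        self.blocked = []

    def request(self, job_id):
        """Submit now if Hadoop is up; otherwise hold the job. True if submitted."""
        if self._is_hadoop_up():
            self._submit(job_id)
            return True
        self.blocked.append(job_id)
        return False

    def on_hadoop_available(self):
        """Submit held jobs in arrival order; return how many were submitted."""
        if not self._is_hadoop_up():
            return 0
        pending, self.blocked = self.blocked, []
        for job_id in pending:
            self._submit(job_id)
        return len(pending)


if __name__ == "__main__":
    state = {"up": False}
    submitted = []
    s = DowntimeAwareSubmitter(lambda: state["up"], submitted.append)
    s.request("job-1")          # held, Hadoop is down
    state["up"] = True
    s.on_hadoop_available()     # job-1 is submitted now
    print(submitted)
```

With this shape, the Friday-night restart in the example above would no longer need a user to resubmit on Monday, since held jobs go out as soon as the probe reports Hadoop as available.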
15. Challenge 3: Horizontal Scalability
• Issue: One instance of Oozie cannot efficiently
handle a very large number of jobs (say 100K/hour).
In addition, Oozie doesn’t support load
balancing.
• Reason: Oozie’s internal task queue is not
synchronized across multiple Oozie instances.
• Planned resolution: Use ZooKeeper for coordination.
• Benefits: As the load increases, add extra Oozie
servers.
19. Oozie Workflow Application
• Contents
– A workflow.xml file
– Resource files, config files and Pig scripts
– All necessary JAR and native library files
• Parameters
– The workflow.xml is parameterized; parameters
can be propagated to map-reduce, pig & ssh
jobs
• Deployment
– In a directory in the HDFS of the Hadoop cluster
where the Hadoop & Pig jobs will run
20. Running a Workflow Job (Oozie command line)
Workflow Application Deployment
$ hadoop fs -mkdir hdfs://bar.corp:9000/usr/tucu/wordcount-wf
$ hadoop fs -mkdir hdfs://bar.corp:9000/usr/tucu/wordcount-wf/lib
$ hadoop fs -copyFromLocal workflow.xml wordcount.xml hdfs://bar.corp:9000/usr/tucu/wordcount-wf
$ hadoop fs -copyFromLocal hadoop-examples.jar hdfs://bar.corp:9000/usr/tucu/wordcount-wf/lib
$
Workflow Job Execution
$ oozie run -o http://foo.corp:8080/oozie \
    -a hdfs://bar.corp:9000/usr/tucu/wordcount-wf \
    input=/data/2008/input output=/data/2008/output
Workflow job id [1234567890-wordcount-wf]
$
Workflow Job Status
$ oozie status -o http://foo.corp:8080/oozie -j 1234567890-wordcount-wf
Workflow job status [RUNNING]
...
$