2. Falcon is a feed processing and feed management system that makes it easier for
end consumers to onboard their feed processing and feed management onto Hadoop
clusters.
http://falcon.apache.org
19. Onboarding Pipeline
• Group all processes by frequency
• Minutely, hourly, daily, weekly, monthly
• Group related feeds
• Verify that all process jars and workflows have been pushed to the cluster
• Verify ownership of all feed and process directories
• Verify that owners have job scheduling access roles in the target cluster
• Validate the feeds
• Submit and schedule the feeds, so that retention and replication are in place
• Dry-run the process schedule
• Submit and schedule the process
• Document the feed SLA, HDFS usage, and retention period for monitoring
• Document the process SLA, to observe delays
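The ordering of the steps above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the feed names, process names, and XML file names are hypothetical, and the commands are only assembled as argument lists, never executed. The sketch uses the Falcon CLI entity options -validate and -submitAndSchedule; using -validate on a process to stand in for the dry-run step is an assumption, since whether validation triggers an Oozie dry run depends on the Falcon version and configuration.

```python
from collections import defaultdict

# Hypothetical inventory: (process name, frequency, definition file).
PROCESSES = [
    ("clicks-rollup", "hourly", "clicks-rollup.xml"),
    ("daily-report", "daily", "daily-report.xml"),
    ("clicks-dedup", "hourly", "clicks-dedup.xml"),
]
# Hypothetical feeds: (feed name, definition file).
FEEDS = [("raw-clicks", "raw-clicks.xml"), ("clean-clicks", "clean-clicks.xml")]


def group_by_frequency(processes):
    """Group process names by their scheduling frequency."""
    groups = defaultdict(list)
    for name, frequency, _path in processes:
        groups[frequency].append(name)
    return dict(groups)


def falcon_entity(entity_type, action, path):
    """Build one 'falcon entity' invocation as an argument list."""
    return ["falcon", "entity", "-type", entity_type, action, "-file", path]


def onboarding_commands(feeds, processes):
    """Return CLI invocations in onboarding order; nothing is executed."""
    commands = []
    # 1. Validate every feed before anything is scheduled.
    for _name, path in feeds:
        commands.append(falcon_entity("feed", "-validate", path))
    # 2. Submit and schedule feeds so retention and replication start.
    for _name, path in feeds:
        commands.append(falcon_entity("feed", "-submitAndSchedule", path))
    # 3. Validate each process (assumed stand-in for the dry-run step).
    for _name, _frequency, path in processes:
        commands.append(falcon_entity("process", "-validate", path))
    # 4. Submit and schedule the processes last.
    for _name, _frequency, path in processes:
        commands.append(falcon_entity("process", "-submitAndSchedule", path))
    return commands


if __name__ == "__main__":
    print(group_by_frequency(PROCESSES))
    for command in onboarding_commands(FEEDS, PROCESSES):
        print(" ".join(command))
```

Feeds are handled before processes because a process refers to its input and output feeds, so those feeds need to exist before the process is submitted. The jar, ownership, and access-role checks are left out of the sketch because they depend on the cluster setup.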
20. Challenges
• Tightly Integrated with Oozie
• Monitoring and onboarding need to be streamlined
• Real-time changes to schedule times and queues
Advantages
• Development is very active
• Industry is adopting it quickly
• Once onboarded, focus is needed only on the set of critical processes
• Easy shutdown and upgrade, as all the running jobs are managed by Oozie
• DevOps can easily set up and manage data