2. Overview
● Our Journey
● Analytics@Ola
● CDC Overview
● Application Events
● Majority Sources
● Hello Presto
● The Presto Kafka Problem
● Solution
● Results
● We like Ambari!
● How to expose?
● Hue drawbacks
● Presto as a first-class citizen of Hue
● Roadmap
4. Overview of Analytics@Ola
● 25k queries run daily by business analysts.
● ~400 business analysts.
● 2.5 TB of daily data ingest.
● ~3k tables maintained by the data platform.
● Auth managed via Ranger.
12. We like Ambari!
● Exposing Presto on Ambari.
● Patching open-source Ambari to fit our need of pulling tars from S3.
● Out-of-the-box alerting and monitoring.
● Releasing plugins via S3 polling.
● Autoscaling via AWS Auto Scaling groups.
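The slide does not say how the S3 poll works. As a minimal sketch, assuming each poll lists the plugin tars under a bucket prefix and compares their ETags with the previous poll, the change detection could look like this. The names `changed_plugins`, `poll_once`, `list_objects`, and `deploy` are hypothetical; `list_objects` stands in for whatever S3 listing call is used and `deploy` for whatever installs a tar on the cluster.

```python
def changed_plugins(previous, current):
    """Return the keys whose ETag is new or differs from the last poll.

    Both arguments map an object key (e.g. a plugin tar name) to its ETag.
    """
    return sorted(key for key, etag in current.items()
                  if previous.get(key) != etag)


def poll_once(list_objects, state, deploy):
    """Run one poll cycle and return the state to carry into the next one.

    list_objects: hypothetical callable returning {key: etag} for the prefix.
    state:        the {key: etag} mapping seen on the previous poll.
    deploy:       hypothetical callable invoked once per new or changed key.
    """
    current = list_objects()
    for key in changed_plugins(state, current):
        deploy(key)
    return current
```

A scheduler would call `poll_once` at a fixed interval, passing the returned mapping back in as `state`, so that only new or re-uploaded tars trigger a rollout.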
13. That's okay, but how to expose it?
● We had 3 choices:
○ MSTR
○ Hue
○ A new interface like Superset
14. Why will Hue not work?
● No results download
● No query progress
● No query kill functionality
● Result caching
● Download limit is on rows fetched, not on size.
● Launches a JVM for each user.
15. Why did MSTR not work?
● Downloading was tedious.
● Per-user memory issues.
● UI unfamiliarity.
16. Presto as a first-class citizen of Hue
● Results download up to 100 MB.
● Query progress.
● Query kill supported.
● Query expiry after 7 days. No need to rerun historical queries.
● Coordinator query URL.
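Two of the bullets above, the size-based download cap and the 7-day expiry, can be illustrated with a short sketch. This is not the actual Hue patch: the function names, the CSV format, and the reading of "expiry" as "stored results stay available for 7 days after the query finishes" are all assumptions made for illustration.

```python
import csv
import io
from datetime import timedelta

MAX_DOWNLOAD_BYTES = 100 * 1024 * 1024  # 100 MB cap taken from the slide
RESULT_TTL = timedelta(days=7)          # 7-day expiry taken from the slide


def write_capped_csv(rows, out, max_bytes=MAX_DOWNLOAD_BYTES):
    """Write rows as CSV to a binary stream, stopping before max_bytes is exceeded.

    The limit is applied to encoded bytes, not to the number of rows.
    Returns (rows_written, bytes_written, truncated).
    """
    bytes_written = 0
    rows_written = 0
    for row in rows:
        buf = io.StringIO()
        csv.writer(buf).writerow(row)
        data = buf.getvalue().encode("utf-8")
        if bytes_written + len(data) > max_bytes:
            return rows_written, bytes_written, True
        out.write(data)
        bytes_written += len(data)
        rows_written += 1
    return rows_written, bytes_written, False


def is_expired(finished_at, now, ttl=RESULT_TTL):
    """Return True once a query's stored results are older than ttl."""
    return now - finished_at >= ttl
```

Capping on bytes rather than rows keeps the download size predictable whether a table has 3 narrow columns or 300 wide ones, which is the drawback the row-based limit had.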
17. Roadmap
1. Contributing the Presto Kafka connector back
2. Presto Oozie support
3. Getting the Presto Ranger PR merged
4. Deprecating Hive for analysts