Teradata joined the Presto community in 2015 and is now a leading contributor to this open source SQL engine, originally created by Facebook. The project has a rapidly growing community of users, including Airbnb, FINRA, Netflix, Twitter, and Uber. Kamil Bajda-Pawlikowski explores the key architectural components that allow querying a variety of data sources and make Presto uniquely positioned for both Hadoop and cloud use cases. Along the way, Kamil covers Teradata’s recent enhancements in query performance, security integrations, and ANSI SQL coverage and shares the roadmap for 2017 and beyond.
Presto: Distributed SQL on Anything - Strata Hadoop 2017 San Jose, CA
1.
Presto: Distributed SQL on Anything
Strata Hadoop, San Jose 2017
Kamil Bajda-Pawlikowski
Chief Architect
Teradata Center for Hadoop
2.
What is Presto?
100% open source distributed ANSI SQL query engine
Originally developed by Facebook
Key Differentiators:
Performance & Scale
Cross-platform query capability, not only SQL on Hadoop
Apache licensed, hosted on GitHub
Certified distro & support from Teradata
3.
Presto Architecture
[Architecture diagram: a Client connects to the Coordinator (Parser/analyzer, Planner, Scheduler), which drives three Workers. The Metadata API, Data location API, and Data stream API are marked as pluggable.]
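To make the labels on the architecture slide concrete, here is a minimal Python sketch of how a coordinator, workers, and three pluggable APIs could fit together. It is a toy model, not Presto's actual connector interface: the names `Connector`, `Split`, `InMemoryConnector`, `worker_count`, and `coordinator_count`, the round-robin assignment, and the sample `hive` and `mysql` catalogs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Protocol, Tuple

Row = Tuple


@dataclass(frozen=True)
class Split:
    """Hypothetical unit of work: one chunk of a table that a worker reads."""
    table: str
    part: int


class Connector(Protocol):
    """Hypothetical plugin contract mirroring the three APIs on the slide."""

    def columns(self, table: str) -> List[str]: ...      # Metadata API
    def splits(self, table: str) -> List[Split]: ...     # Data location API
    def rows(self, split: Split) -> Iterator[Row]: ...   # Data stream API


class InMemoryConnector:
    """Toy connector: each table is a column list plus a list of row chunks."""

    def __init__(self, tables: Dict[str, Tuple[List[str], List[List[Row]]]]):
        self._tables = tables

    def columns(self, table: str) -> List[str]:
        return self._tables[table][0]

    def splits(self, table: str) -> List[Split]:
        chunks = self._tables[table][1]
        return [Split(table, i) for i in range(len(chunks))]

    def rows(self, split: Split) -> Iterator[Row]:
        yield from self._tables[split.table][1][split.part]


def worker_count(connector: Connector, split: Split) -> int:
    """Worker: count the rows of one split via the data stream API."""
    return sum(1 for _ in connector.rows(split))


def coordinator_count(catalogs: Dict[str, Connector], catalog: str,
                      table: str, n_workers: int = 3) -> int:
    """Coordinator: look up metadata, assign splits round-robin, sum results."""
    connector = catalogs[catalog]
    connector.columns(table)  # metadata lookup; KeyError if the table is unknown
    assignments: Dict[int, List[Split]] = {w: [] for w in range(n_workers)}
    for i, split in enumerate(connector.splits(table)):
        assignments[i % n_workers].append(split)
    return sum(worker_count(connector, s)
               for assigned in assignments.values() for s in assigned)


# Two illustrative catalogs behind the same interface (sample data is made up).
catalogs: Dict[str, Connector] = {
    "hive": InMemoryConnector(
        {"orders": (["id", "user_id"], [[(1, 10), (2, 11)], [(3, 10)]])}),
    "mysql": InMemoryConnector(
        {"users": (["id", "name"], [[(10, "ann"), (11, "bob")]])}),
}
```

Calling `coordinator_count(catalogs, "hive", "orders")` and `coordinator_count(catalogs, "mysql", "users")` runs the same coordinator and worker code against both sources. This is the sense in which a pluggable design lets one engine query data that lives outside Hadoop.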
12.
Open Source Community
• Collaboration with Facebook and Presto community
– Joint design and development
– Conference talks, meetups and webinars
• Major commitment from Teradata Labs:
– 20 full-time engineers
– Free and open source contributions
– Enterprise-ready distribution
"A special shout out goes to Teradata — which joined the Presto community this year
with a focus on enhancing enterprise features and providing support — for having
seven of our top 10 external contributors."
— Facebook