February 2014 HUG : Introduction to Tez
- 2. In The Beginning of Hadoop…
...there was MapReduce
– It could handle data sizes way beyond those of its competitors
– It was resilient in the face of failure
– It made it easy for users to bring their code and algorithms to the data (i.e. free to program in Java instead of just SQL)
© 2014 Hortonworks
Page 2
- 7. Why Tez? Enable Data Processing In Many Tools
• An execution engine that can be used by Hive, Pig, Cascading, and others
• Right now SQL on Hadoop is hot, and we want to enable that
• But we also want to keep in mind that there is a lot else to be done in Hadoop (machine learning, ETL, graph processing, etc.), and we want to open up the work we're doing to those groups as well
- 8. Why Tez? Span Batch and Interactive
• It's hard for customers to use different tools depending on their data size
• It's hard for applications like Hive to use different back end engines depending on the inputs and outputs
- 9. Why Tez? Preserve MapReduce Experience
• MapReduce represents engineering centuries of work
• Much has been learned (mostly the hard way) about scale and resiliency
• We are not excited to reinvent those wheels; we would rather rebuild the vehicle on top of them