February 2014 HUG : Introduction to Tez

Tez, An Introduction

Alan F. Gates
Founder & Architect
@alanfgates

Page 1

In The Beginning of Hadoop…
...there was MapReduce

–It could handle data sizes far beyond those of its competitors
–It was resilient in the face of failure
–It made it easy for users to bring their code and algorithms to the data (i.e., they were free to program in Java instead of just SQL)

© 2014 Hortonworks

Page 2

But, It Was Too Low Level

Page 3

But, It Was Too Rigid

Page 4

But, It Was Batch

Page 5

YARN to the Rescue

Page 6

Why Tez? Enable Data Processing In Many Tools
•An execution engine that can be used by Hive, Pig, Cascading, and others
•Right now SQL on Hadoop is hot, and we want to enable that
•But we also want to keep in mind that there’s a lot more to be done in Hadoop (machine learning, ETL, graph processing, etc.), and we want to open up the work we’re doing to those groups as well

Page 7

Why Tez? Span Batch and Interactive
•It’s hard for customers to use different tools depending on their data size
•It’s hard for applications like Hive to use different back-end engines depending on the inputs and outputs

Page 8

Why Tez? Preserve MapReduce Experience
•MapReduce represents centuries of engineering work
•Much has been learned (mostly the hard way) about scale and resiliency
•We are not excited to reinvent those wheels; we would rather rebuild the vehicle on top of them

Page 9
