Enviar pesquisa
Carregar
Hadoop Puzzlers
•
3 gostaram
•
1,120 visualizações
DataWorks Summit
Seguir
Tecnologia
Educação
Diversão e humor
Denunciar
Compartilhar
Denunciar
Compartilhar
1 de 49
Recomendados
Ipc: aidl sexy, not a curse
Ipc: aidl sexy, not a curse
Yonatan Levin
IPC: AIDL is sexy, not a curse
IPC: AIDL is sexy, not a curse
Yonatan Levin
GPars (Groovy Parallel Systems)
GPars (Groovy Parallel Systems)
Gagan Agrawal
Clustering your Application with Hazelcast
Clustering your Application with Hazelcast
Hazelcast
concurrency with GPars
concurrency with GPars
Paul King
Easy Scaling with Open Source Data Structures, by Talip Ozturk
Easy Scaling with Open Source Data Structures, by Talip Ozturk
ZeroTurnaround
Hazelcast
Hazelcast
Bruno Lellis
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
lucenerevolution
Recomendados
Ipc: aidl sexy, not a curse
Ipc: aidl sexy, not a curse
Yonatan Levin
IPC: AIDL is sexy, not a curse
IPC: AIDL is sexy, not a curse
Yonatan Levin
GPars (Groovy Parallel Systems)
GPars (Groovy Parallel Systems)
Gagan Agrawal
Clustering your Application with Hazelcast
Clustering your Application with Hazelcast
Hazelcast
concurrency with GPars
concurrency with GPars
Paul King
Easy Scaling with Open Source Data Structures, by Talip Ozturk
Easy Scaling with Open Source Data Structures, by Talip Ozturk
ZeroTurnaround
Hazelcast
Hazelcast
Bruno Lellis
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
lucenerevolution
Vielseitiges In-Memory Computing mit Apache Ignite und Kubernetes
Vielseitiges In-Memory Computing mit Apache Ignite und Kubernetes
QAware GmbH
Getting Started with Datatsax .Net Driver
Getting Started with Datatsax .Net Driver
DataStax Academy
ChtiJUG - Cassandra 2.0
ChtiJUG - Cassandra 2.0
Michaël Figuière
Cassandra summit 2013 - DataStax Java Driver Unleashed!
Cassandra summit 2013 - DataStax Java Driver Unleashed!
Michaël Figuière
Kotlin coroutines and spring framework
Kotlin coroutines and spring framework
Sunghyouk Bae
Apex code benchmarking
Apex code benchmarking
purushottambhaigade
Showdown of the Asserts by Philipp Krenn
Showdown of the Asserts by Philipp Krenn
JavaDayUA
.NET Multithreading and File I/O
.NET Multithreading and File I/O
Jussi Pohjolainen
Paris Cassandra Meetup - Cassandra for Developers
Paris Cassandra Meetup - Cassandra for Developers
Michaël Figuière
Clojure: a LISP for the JVM
Clojure: a LISP for the JVM
Knowledge Engineering and Machine Learning Group
Psycopg2 - Connect to PostgreSQL using Python Script
Psycopg2 - Connect to PostgreSQL using Python Script
Survey Department
Programming with Python and PostgreSQL
Programming with Python and PostgreSQL
Peter Eisentraut
Hazelcast and MongoDB at Cloud CMS
Hazelcast and MongoDB at Cloud CMS
uzquiano
Concurrency Concepts in Java
Concurrency Concepts in Java
Doug Hawkins
Vavr Java User Group Rheinland
Vavr Java User Group Rheinland
David Schmitz
Rx 101 Codemotion Milan 2015 - Tamir Dresher
Rx 101 Codemotion Milan 2015 - Tamir Dresher
Tamir Dresher
Building responsive application with Rx - confoo - tamir dresher
Building responsive application with Rx - confoo - tamir dresher
Tamir Dresher
Hadoop
Hadoop
Karunakar Singh Thakur
JVM Mechanics
JVM Mechanics
Doug Hawkins
NoSQL @ CodeMash 2010
NoSQL @ CodeMash 2010
Ben Scofield
Practical Computing With Chaos
Practical Computing With Chaos
DataWorks Summit
Omid Efficient Transaction Mgmt and Processing for HBase
Omid Efficient Transaction Mgmt and Processing for HBase
DataWorks Summit
Mais conteúdo relacionado
Mais procurados
Vielseitiges In-Memory Computing mit Apache Ignite und Kubernetes
Vielseitiges In-Memory Computing mit Apache Ignite und Kubernetes
QAware GmbH
Getting Started with Datatsax .Net Driver
Getting Started with Datatsax .Net Driver
DataStax Academy
ChtiJUG - Cassandra 2.0
ChtiJUG - Cassandra 2.0
Michaël Figuière
Cassandra summit 2013 - DataStax Java Driver Unleashed!
Cassandra summit 2013 - DataStax Java Driver Unleashed!
Michaël Figuière
Kotlin coroutines and spring framework
Kotlin coroutines and spring framework
Sunghyouk Bae
Apex code benchmarking
Apex code benchmarking
purushottambhaigade
Showdown of the Asserts by Philipp Krenn
Showdown of the Asserts by Philipp Krenn
JavaDayUA
.NET Multithreading and File I/O
.NET Multithreading and File I/O
Jussi Pohjolainen
Paris Cassandra Meetup - Cassandra for Developers
Paris Cassandra Meetup - Cassandra for Developers
Michaël Figuière
Clojure: a LISP for the JVM
Clojure: a LISP for the JVM
Knowledge Engineering and Machine Learning Group
Psycopg2 - Connect to PostgreSQL using Python Script
Psycopg2 - Connect to PostgreSQL using Python Script
Survey Department
Programming with Python and PostgreSQL
Programming with Python and PostgreSQL
Peter Eisentraut
Hazelcast and MongoDB at Cloud CMS
Hazelcast and MongoDB at Cloud CMS
uzquiano
Concurrency Concepts in Java
Concurrency Concepts in Java
Doug Hawkins
Vavr Java User Group Rheinland
Vavr Java User Group Rheinland
David Schmitz
Rx 101 Codemotion Milan 2015 - Tamir Dresher
Rx 101 Codemotion Milan 2015 - Tamir Dresher
Tamir Dresher
Building responsive application with Rx - confoo - tamir dresher
Building responsive application with Rx - confoo - tamir dresher
Tamir Dresher
Hadoop
Hadoop
Karunakar Singh Thakur
JVM Mechanics
JVM Mechanics
Doug Hawkins
NoSQL @ CodeMash 2010
NoSQL @ CodeMash 2010
Ben Scofield
Mais procurados
(20)
Vielseitiges In-Memory Computing mit Apache Ignite und Kubernetes
Vielseitiges In-Memory Computing mit Apache Ignite und Kubernetes
Getting Started with Datatsax .Net Driver
Getting Started with Datatsax .Net Driver
ChtiJUG - Cassandra 2.0
ChtiJUG - Cassandra 2.0
Cassandra summit 2013 - DataStax Java Driver Unleashed!
Cassandra summit 2013 - DataStax Java Driver Unleashed!
Kotlin coroutines and spring framework
Kotlin coroutines and spring framework
Apex code benchmarking
Apex code benchmarking
Showdown of the Asserts by Philipp Krenn
Showdown of the Asserts by Philipp Krenn
.NET Multithreading and File I/O
.NET Multithreading and File I/O
Paris Cassandra Meetup - Cassandra for Developers
Paris Cassandra Meetup - Cassandra for Developers
Clojure: a LISP for the JVM
Clojure: a LISP for the JVM
Psycopg2 - Connect to PostgreSQL using Python Script
Psycopg2 - Connect to PostgreSQL using Python Script
Programming with Python and PostgreSQL
Programming with Python and PostgreSQL
Hazelcast and MongoDB at Cloud CMS
Hazelcast and MongoDB at Cloud CMS
Concurrency Concepts in Java
Concurrency Concepts in Java
Vavr Java User Group Rheinland
Vavr Java User Group Rheinland
Rx 101 Codemotion Milan 2015 - Tamir Dresher
Rx 101 Codemotion Milan 2015 - Tamir Dresher
Building responsive application with Rx - confoo - tamir dresher
Building responsive application with Rx - confoo - tamir dresher
Hadoop
Hadoop
JVM Mechanics
JVM Mechanics
NoSQL @ CodeMash 2010
NoSQL @ CodeMash 2010
Destaque
Practical Computing With Chaos
Practical Computing With Chaos
DataWorks Summit
Omid Efficient Transaction Mgmt and Processing for HBase
Omid Efficient Transaction Mgmt and Processing for HBase
DataWorks Summit
Learning Linear Models with Hadoop
Learning Linear Models with Hadoop
DataWorks Summit
Scalding: Twitter's New DSL for Hadoop
Scalding: Twitter's New DSL for Hadoop
DataWorks Summit
Enterprise Integration of Disruptive Technologies
Enterprise Integration of Disruptive Technologies
DataWorks Summit
Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
DataWorks Summit
Hadoop and R Go to the Movies
Hadoop and R Go to the Movies
DataWorks Summit
Hadoop-scale Search with Solr
Hadoop-scale Search with Solr
DataWorks Summit
Safer on the road with Hadoop! Setting up a Data Science Platform
Safer on the road with Hadoop! Setting up a Data Science Platform
DataWorks Summit
Big Data Management System: Smart SQL Processing Across Hadoop and your Data ...
Big Data Management System: Smart SQL Processing Across Hadoop and your Data ...
DataWorks Summit
Building and Improving Products with Hadoop
Building and Improving Products with Hadoop
DataWorks Summit
Fast, Scalable Graph Processing: Apache Giraph on YARN
Fast, Scalable Graph Processing: Apache Giraph on YARN
DataWorks Summit
Hdfs high availability
Hdfs high availability
DataWorks Summit
Hive & HBase For Transaction Processing
Hive & HBase For Transaction Processing
DataWorks Summit
How to Determine which Algorithms Really Matter
How to Determine which Algorithms Really Matter
DataWorks Summit
Apache Pig for Data Scientists
Apache Pig for Data Scientists
DataWorks Summit
Analyzing Historical Data of Applications on YARN for Fun and Profit
Analyzing Historical Data of Applications on YARN for Fun and Profit
DataWorks Summit
SAS on Your (Apache) Cluster, Serving your Data (Analysts)
SAS on Your (Apache) Cluster, Serving your Data (Analysts)
DataWorks Summit
Experiences Streaming Analytics at Petabyte Scale
Experiences Streaming Analytics at Petabyte Scale
DataWorks Summit
Big Data Analytics for Dodd-Frank
Big Data Analytics for Dodd-Frank
DataWorks Summit
Destaque
(20)
Practical Computing With Chaos
Practical Computing With Chaos
Omid Efficient Transaction Mgmt and Processing for HBase
Omid Efficient Transaction Mgmt and Processing for HBase
Learning Linear Models with Hadoop
Learning Linear Models with Hadoop
Scalding: Twitter's New DSL for Hadoop
Scalding: Twitter's New DSL for Hadoop
Enterprise Integration of Disruptive Technologies
Enterprise Integration of Disruptive Technologies
Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
Hadoop and R Go to the Movies
Hadoop and R Go to the Movies
Hadoop-scale Search with Solr
Hadoop-scale Search with Solr
Safer on the road with Hadoop! Setting up a Data Science Platform
Safer on the road with Hadoop! Setting up a Data Science Platform
Big Data Management System: Smart SQL Processing Across Hadoop and your Data ...
Big Data Management System: Smart SQL Processing Across Hadoop and your Data ...
Building and Improving Products with Hadoop
Building and Improving Products with Hadoop
Fast, Scalable Graph Processing: Apache Giraph on YARN
Fast, Scalable Graph Processing: Apache Giraph on YARN
Hdfs high availability
Hdfs high availability
Hive & HBase For Transaction Processing
Hive & HBase For Transaction Processing
How to Determine which Algorithms Really Matter
How to Determine which Algorithms Really Matter
Apache Pig for Data Scientists
Apache Pig for Data Scientists
Analyzing Historical Data of Applications on YARN for Fun and Profit
Analyzing Historical Data of Applications on YARN for Fun and Profit
SAS on Your (Apache) Cluster, Serving your Data (Analysts)
SAS on Your (Apache) Cluster, Serving your Data (Analysts)
Experiences Streaming Analytics at Petabyte Scale
Experiences Streaming Analytics at Petabyte Scale
Big Data Analytics for Dodd-Frank
Big Data Analytics for Dodd-Frank
Semelhante a Hadoop Puzzlers
Introduction to Scalding and Monoids
Introduction to Scalding and Monoids
Hugo Gävert
C# - What's next
C# - What's next
Christian Nagel
Store and Process Big Data with Hadoop and Cassandra
Store and Process Big Data with Hadoop and Cassandra
Deependra Ariyadewa
Adam Sitnik "State of the .NET Performance"
Adam Sitnik "State of the .NET Performance"
Yulia Tsisyk
State of the .Net Performance
State of the .Net Performance
CUSTIS
Celery - A Distributed Task Queue
Celery - A Distributed Task Queue
Duy Do
Blazing Fast Windows 8 Apps using Visual C++
Blazing Fast Windows 8 Apps using Visual C++
Microsoft Developer Network (MSDN) - Belgium and Luxembourg
The Art Of Readable Code
The Art Of Readable Code
Baidu, Inc.
Hadoop MapReduce framework - Module 3
Hadoop MapReduce framework - Module 3
Rohit Agrawal
Scala @ TechMeetup Edinburgh
Scala @ TechMeetup Edinburgh
Stuart Roebuck
Anti patterns
Anti patterns
Alex Tumanoff
TechTalk - Dotnet
TechTalk - Dotnet
heinrich.wendel
C# 7.x What's new and what's coming with C# 8
C# 7.x What's new and what's coming with C# 8
Christian Nagel
Using xUnit as a Swiss-Aarmy Testing Toolkit
Using xUnit as a Swiss-Aarmy Testing Toolkit
Chris Oldwood
RxJava on Android
RxJava on Android
Dustin Graham
ECSE 221 - Introduction to Computer Engineering - Tutorial 1 - Muhammad Ehtas...
ECSE 221 - Introduction to Computer Engineering - Tutorial 1 - Muhammad Ehtas...
Muhammad Ulhaque
What is new in Java 8
What is new in Java 8
Sandeep Kr. Singh
JRubyKaigi2010 Hadoop Papyrus
JRubyKaigi2010 Hadoop Papyrus
Koichi Fujikawa
Stream analysis with kafka native way and considerations about monitoring as ...
Stream analysis with kafka native way and considerations about monitoring as ...
Andrew Yongjoon Kong
실시간 인벤트 처리
실시간 인벤트 처리
Byeongweon Moon
Semelhante a Hadoop Puzzlers
(20)
Introduction to Scalding and Monoids
Introduction to Scalding and Monoids
C# - What's next
C# - What's next
Store and Process Big Data with Hadoop and Cassandra
Store and Process Big Data with Hadoop and Cassandra
Adam Sitnik "State of the .NET Performance"
Adam Sitnik "State of the .NET Performance"
State of the .Net Performance
State of the .Net Performance
Celery - A Distributed Task Queue
Celery - A Distributed Task Queue
Blazing Fast Windows 8 Apps using Visual C++
Blazing Fast Windows 8 Apps using Visual C++
The Art Of Readable Code
The Art Of Readable Code
Hadoop MapReduce framework - Module 3
Hadoop MapReduce framework - Module 3
Scala @ TechMeetup Edinburgh
Scala @ TechMeetup Edinburgh
Anti patterns
Anti patterns
TechTalk - Dotnet
TechTalk - Dotnet
C# 7.x What's new and what's coming with C# 8
C# 7.x What's new and what's coming with C# 8
Using xUnit as a Swiss-Aarmy Testing Toolkit
Using xUnit as a Swiss-Aarmy Testing Toolkit
RxJava on Android
RxJava on Android
ECSE 221 - Introduction to Computer Engineering - Tutorial 1 - Muhammad Ehtas...
ECSE 221 - Introduction to Computer Engineering - Tutorial 1 - Muhammad Ehtas...
What is new in Java 8
What is new in Java 8
JRubyKaigi2010 Hadoop Papyrus
JRubyKaigi2010 Hadoop Papyrus
Stream analysis with kafka native way and considerations about monitoring as ...
Stream analysis with kafka native way and considerations about monitoring as ...
실시간 인벤트 처리
실시간 인벤트 처리
Mais de DataWorks Summit
Data Science Crash Course
Data Science Crash Course
DataWorks Summit
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
DataWorks Summit
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
DataWorks Summit
Managing the Dewey Decimal System
Managing the Dewey Decimal System
DataWorks Summit
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
DataWorks Summit
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
DataWorks Summit
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
DataWorks Summit
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
Security Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
DataWorks Summit
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
DataWorks Summit
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
DataWorks Summit
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
DataWorks Summit
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
DataWorks Summit
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
DataWorks Summit
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
DataWorks Summit
Mais de DataWorks Summit
(20)
Data Science Crash Course
Data Science Crash Course
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Managing the Dewey Decimal System
Managing the Dewey Decimal System
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Security Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Último
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
BookNet Canada
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
Pixlogix Infotech
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
2toLead Limited
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
Delhi Call girls
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
Results
Slack Application Development 101 Slides
Slack Application Development 101 Slides
praypatel2
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
Rafal Los
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Igalia
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
Scott Keck-Warren
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
soniya singh
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
Delhi Call girls
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
Allon Mureinik
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Drew Madelung
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
Paola De la Torre
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
Anna Loughnan Colquhoun
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Miguel Araújo
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
shyamraj55
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
Principled Technologies
Último
(20)
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
Slack Application Development 101 Slides
Slack Application Development 101 Slides
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
Hadoop Puzzlers
1.
1 Hadoop Puzzlers Aaron Myers
& Daniel Templeton Cloudera, Inc.
2.
2 Your Hosts Aaron “ATM”
Myers • AKA “Cash Money” • Software Engineer • Apache Hadoop Committer Daniel Templeton • Certification Developer • Crusty, old HPC guy • Likes Perl ©2014 Cloudera, Inc. All rights reserved.2
3.
3 What is a
Hadoop Puzzler ©2014 Cloudera, Inc. All rights reserved.3 • Shameless knockoff of Josh Bloch’s Java Puzzlers talks • We’ll walk through a puzzle • You vote on the answer • We all learn a valuable lesson
4.
4 ©2014 Cloudera,
Inc. All rights reserved.4 Let’s try it, OK?
5.
5 An Easy One public
class MaxMap extends Mapper<LongWritable, Text,Text,IntWritable> { Text k = new Text(); IntWritable v = new IntWritable(); protected void map(LongWritable key, Text val, Context c) … { String[] parts = val.toString().split(","); k.set(parts[0]); v.set(Integer.parseInt(parts[1])); c.write(k, v); } } public class MaxReduce extends Reducer<Text,IntWritable, Text,IntWritable> { protected void reduce(Text key, Iterable<IntWritable> values, Context c) … { IntWritable max = new IntWritable(0); for (IntWritable v: values) if (v.get() > max.get()) max = v; c.write(key, max); } } ©2014 Cloudera, Inc. All rights reserved.5
6.
6 An Easy One The
data: A,1 A,5 A,3 The results: a) A 5 b) A 1 c) A 3 d) The job fails ©2014 Cloudera, Inc. All rights reserved.6
7.
7 An Easy One public
class MaxMap extends Mapper<LongWritable, Text,Text,IntWritable> { Text k = new Text(); IntWritable v = new IntWritable(); protected void map(LongWritable key, Text val, Context c) … { String[] parts = val.toString().split(","); k.set(parts[0]); v.set(Integer.parseInt(parts[1])); c.write(k, v); } } public class MaxReduce extends Reducer<Text,IntWritable, Text,IntWritable> { protected void reduce(Text key, Iterable<IntWritable> values, Context c) … { IntWritable max = new IntWritable(0); for (IntWritable v: values) if (v.get() > max.get()) max = v; c.write(key, max); } } ©2014 Cloudera, Inc. All rights reserved.7 A 1 A 5 A 3
8.
8 An Easy One The
data: A,1 A,5 A,3 The results: a) A 5 b) A 1 c) A 3 d) The job fails ©2014 Cloudera, Inc. All rights reserved.8
9.
9 An Easy One
(Answer) The data: A,1 A,5 A,3 The results: a) A 5 b) A 1 c) A 3 d) The job fails ©2014 Cloudera, Inc. All rights reserved.9
10.
10 An Easy One
(Problem) public class MaxMap extends Mapper<LongWritable, Text,Text,IntWritable> { Text k = new Text(); IntWritable v = new IntWritable(); protected void map(LongWritable key, Text val, Context c) … { String[] parts = val.toString().split(","); k.set(parts[0]); v.set(Integer.parseInt(parts[1])); c.write(k, v); } } public class MaxReduce extends Reducer<Text,IntWritable, Text,IntWritable> { protected void reduce(Text key, Iterable<IntWritable> values, Context c) … { IntWritable max = new IntWritable(0); for (IntWritable v: values) if (v.get() > max.get()) max = v; c.write(key, max); } } ©2014 Cloudera, Inc. All rights reserved.10
11.
11 An Easy One
(Moral) ©2014 Cloudera, Inc. All rights reserved.11 • MapReduce reuses Writables whenever it can • That includes while iterating through the values • Always be careful to only store the value instead of the Writable!
12.
12 A Sinking Feeling public
class AsyncSubmit extends Configured implements Tool { public static void main(String[] args) throws Exception { int ret = ToolRunner.run( new Configuration(), new AsyncSubmit(), args); System.exit(ret); } public int run(String[] args) throws Exception { Job job = Job.getInstance(getConf()); job.setNumReduceTasks(0); FileInputFormat.addInputPath(job, new Path(args[0])); FileOutputFormat.setOutputPath(job, new Path(args[1])); job.waitForCompletion(false); return job.isComplete() ? 1 : 0; } } ©2014 Cloudera, Inc. All rights reserved.12
13.
13 A Sinking Feeling The
data: The complete works of William Shakespeare The results: a) Fails to compile b) The job fails c) Exits with 0 d) Exits with 1 ©2014 Cloudera, Inc. All rights reserved.13
14.
14 A Sinking Feeling public
class AsyncSubmit extends Configured implements Tool { public static void main(String[] args) throws Exception { int ret = ToolRunner.run( new Configuration(), new AsyncSubmit(), args); System.exit(ret); } public int run(String[] args) throws Exception { Job job = Job.getInstance(getConf()); job.setNumReduceTasks(0); FileInputFormat.addInputPath(job, new Path(args[0])); FileOutputFormat.setOutputPath(job, new Path(args[1])); job.waitForCompletion(false); return job.isComplete() ? 1 : 0; } } ©2014 Cloudera, Inc. All rights reserved.14 The complete works of William Shakespeare
15.
15 A Sinking Feeling The
data: The complete works of William Shakespeare The results: a) Fails to compile b) The job fails c) Exits with 0 d) Exits with 1 ©2014 Cloudera, Inc. All rights reserved.15
16.
16 A Sinking Feeling
(Answer) The data: The complete works of William Shakespeare The results: a) Fails to compile b) The job fails c) Exits with 0 d) Exits with 1 ©2014 Cloudera, Inc. All rights reserved.16
17.
17 A Sinking Feeling
(Problem) public class AsyncSubmit extends Configured implements Tool { public static void main(String[] args) throws Exception { int ret = ToolRunner.run( new Configuration(), new AsyncSubmit(), args); System.exit(ret); } public int run(String[] args) throws Exception { Job job = Job.getInstance(getConf()); job.setNumReduceTasks(0); FileInputFormat.addInputPath(job, new Path(args[0])); FileOutputFormat.setOutputPath(job, new Path(args[1])); job.waitForCompletion(false); return job.isComplete() ? 1 : 0; } } ©2014 Cloudera, Inc. All rights reserved.17
18.
18 A Sinking Job
(Moral) ©2014 Cloudera, Inc. All rights reserved.18 • Read the API docs! • Sometimes the obvious meanings of methods and parameters aren’t correct • Parameter for waitForCompletion() controls whether status output is printed • Driver does wait for job to exit but does not print all the job status information
19.
19 Do-over public class MaxMap extends
Mapper<LongWritable, Text,Text,IntWritable> { Text k = new Text(); IntWritable v = new IntWritable(); protected void map(LongWritable key, Text val, Context c) … { String[] parts = val.toString().split(","); k.set(parts[0]); v.set(Integer.parseInt(parts[1])); c.write(k, v); } } public class MaxReduceRedux extends Reducer<Text,Text, Text,IntWritable> { protected void reduce(Text key, Iterable<IntWritable> values, Context c) … { int max = 0; for (IntWritable v: values) if (v.get() > max) max = v.get(); c.write(key, new IntWritable(max)); } } ©2014 Cloudera, Inc. All rights reserved.19
20.
20 Do-over The data: A,1 A,5 The results: a)
A 5 b) A 1 c) A 1 A 5 d) The job fails ©2014 Cloudera, Inc. All rights reserved.20
21.
21 Do-over public class MaxMap extends
Mapper<LongWritable, Text,Text,IntWritable> { Text k = new Text(); IntWritable v = new IntWritable(); protected void map(LongWritable key, Text val, Context c) … { String[] parts = val.toString().split(","); k.set(parts[0]); v.set(Integer.parseInt(parts[1])); c.write(k, v); } } public class MaxReduceRedux extends Reducer<Text,Text, Text,IntWritable> { protected void reduce(Text key, Iterable<IntWritable> values, Context c) … { int max = 0; for (IntWritable v: values) if (v.get() > max) max = v.get(); c.write(key, new IntWritable(max)); } } ©2014 Cloudera, Inc. All rights reserved.21 A 1 A 5
22.
22 Do-over The data: A,1 A,5 The results: a)
A 5 b) A 1 c) A 1 A 5 d) The job fails ©2014 Cloudera, Inc. All rights reserved.22
23.
23 Do-over (Answer) The data: A,1 A,5 The
results: a) A 5 b) A 1 c) A 1 A 5 d) The job fails ©2014 Cloudera, Inc. All rights reserved.23
24.
24 Do-over (Problem) public class
MaxMap extends Mapper<LongWritable, Text,Text,IntWritable> { Text k = new Text(); IntWritable v = new IntWritable(); protected void map(LongWritable key, Text val, Context c) … { String[] parts = val.toString().split(","); k.set(parts[0]); v.set(Integer.parseInt(parts[1])); c.write(k, v); } } public class MaxReduceRedux extends Reducer<Text,Text, Text,IntWritable> { protected void reduce(Text key, Iterable<IntWritable> values, Context c) … { int max = 0; for (IntWritable v: values) if (v.get() > max) max = v.get(); c.write(key, new IntWritable(max)); } } ©2014 Cloudera, Inc. All rights reserved.24
25.
25 Do-over (Moral) ©2014 Cloudera,
Inc. All rights reserved.25 • Mismatched signatures can lead to unexpected behaviors because of exposed base class method implementations • ALWAYS use @Override!
26.
26 Joining Forces hive> DESCRIBE
table1; OK id int phone string state string Time taken: 0.236 seconds hive> DESCRIBE table2; OK id int city string state string Time taken: 0.116 seconds hive> CREATE TABLE table3 AS SELECT table2.*,table1.phone,table1.state AS s FROM table1 JOIN table2 ON (table1.id == table2.id); … hive> EXPORT TABLE table3 TO '/user/cloudera/table3.csv'; … hive> exit $ hadoop fs –cat table3.csv | head -1 | tr , 'n' | wc –l ©2014 Cloudera, Inc. All rights reserved.26
27.
27 Joining Forces The data: hive>
SELECT * FROM table1; OK 1 6506506500 CA 2 2282282280 MS Time taken: 1.006 seconds hive> SELECT * FROM table2; OK 1 Palo Alto CA 2 Gautier MS Time taken: 1.202 seconds The results: a) 5 b) 4 c) 1 d) The join fails ©2014 Cloudera, Inc. All rights reserved.27
28.
28 Joining Forces hive> DESCRIBE
table1; OK id int phone string state string Time taken: 0.236 seconds hive> DESCRIBE table2; OK id int city string state string Time taken: 0.116 seconds hive> CREATE TABLE table3 AS SELECT table2.*,table1.phone,table1.state AS s FROM table1 JOIN table2 ON (table1.id == table2.id); … hive> EXPORT TABLE table3 TO '/user/cloudera/table3.csv'; … hive> exit $ hadoop fs –cat table3.csv | head -1 | tr , 'n' | wc –l ©2014 Cloudera, Inc. All rights reserved.28 1 6506506500 CA 2 2282282280 MS 1 Palo Alto CA 2 Gautier MS
29.
29 Joining Forces The data: hive>
SELECT * FROM table1; OK 1 6506506500 CA 2 2282282280 MS Time taken: 1.006 seconds hive> SELECT * FROM table2; OK 1 Palo Alto CA 2 Gautier MS Time taken: 1.202 seconds The results: a) 5 b) 4 c) 1 d) The join fails ©2014 Cloudera, Inc. All rights reserved.29
30.
30 Joining Forces (Answer) The
data: hive> SELECT * FROM table1; OK 1 6506506500 CA 2 2282282280 MS Time taken: 1.006 seconds hive> SELECT * FROM table2; OK 1 Palo Alto CA 2 Gautier MS Time taken: 1.202 seconds The results: a) 5 b) 4 c) 1 d) The join fails ©2014 Cloudera, Inc. All rights reserved.30
31.
31 Joining Forces (Problem) hive>
DESCRIBE table1; OK id int phone string state string Time taken: 0.236 seconds hive> DESCRIBE table2; OK id int city string state string Time taken: 0.116 seconds hive> CREATE TABLE table3 AS SELECT table2.*,table1.phone,table1.state AS s FROM table1 JOIN table2 ON (table1.id == table2.id); … hive> EXPORT TABLE table3 TO '/user/cloudera/table3.csv'; … hive> exit $ hadoop fs –cat table3.csv | head -1 | tr , 'n' | wc –l ©2014 Cloudera, Inc. All rights reserved.31
32.
32 Joining Forces (Moral) ©2014
Cloudera, Inc. All rights reserved.32 • Hive’s default delimiter is 0x01 (CTRL-A) • Easy to assume export will use a sane delimiter – it doesn’t • Incidentally, Hive’s join rules are pretty sane and work as you’d expect
33.
33 Close Enough public class
MaxMap extends Mapper<LongWritable, Text,Text,IntWritable> { Text k = new Text(); IntWritable v = new IntWritable(); protected void map(LongWritable key, Text val, Context c) … { String[] parts = val.toString().split(","); k.set(parts[0]); v.set(Integer.parseInt(parts[1])); c.write(k, v); } } public class Top20Reduce extends Reducer<Text,IntWritable, Text,IntWritable> { protected void reduce(Text key, Iterable<IntWritable> values, Context c) … { float max = 0.0f; for (IntWritable v: values) if (v.get() > max) max = v.get(); max *= 0.8f; for (IntWritable v: values) if (v.get() >= max) c.write(key, v); } } ©2014 Cloudera, Inc. All rights reserved.33
34.
34 Close Enough The data: A,1 A,5 A,4 The
results: a) b) A 5 c) A 5 A 4 d) The job fails ©2014 Cloudera, Inc. All rights reserved.34
35.
35 Close Enough public class
MaxMap extends Mapper<LongWritable, Text,Text,IntWritable> { Text k = new Text(); IntWritable v = new IntWritable(); protected void map(LongWritable key, Text val, Context c) … { String[] parts = val.toString().split(","); k.set(parts[0]); v.set(Integer.parseInt(parts[1])); c.write(k, v); } } public class Top20Reduce extends Reducer<Text,IntWritable, Text,IntWritable> { protected void reduce(Text key, Iterable<IntWritable> values, Context c) … { float max = 0.0f; for (IntWritable v: values) if (v.get() > max) max = v.get(); max *= 0.8f; for (IntWritable v: values) if (v.get() >= max) c.write(key, v); } } ©2014 Cloudera, Inc. All rights reserved.35 A 1 A 5 A 4
36.
36 Close Enough The data: A,1 A,5 A,4 The
results: a) b) A 5 c) A 5 A 4 d) The job fails ©2014 Cloudera, Inc. All rights reserved.36
37.
37 Close Enough (Answer) The
data: A,1 A,5 A,4 The results: a) b) A 5 c) A 5 A 4 d) The job fails ©2014 Cloudera, Inc. All rights reserved.37
38.
38 Close Enough (Problem) public
class MaxMap extends Mapper<LongWritable, Text,Text,IntWritable> { Text k = new Text(); IntWritable v = new IntWritable(); protected void map(LongWritable key, Text val, Context c) … { String[] parts = val.toString().split(","); k.set(parts[0]); v.set(Integer.parseInt(parts[1])); c.write(k, v); } } public class Top20Reduce extends Reducer<Text,IntWritable, Text,IntWritable> { protected void reduce(Text key, Iterable<IntWritable> values, Context c) … { float max = 0.0f; for (IntWritable v: values) if (v.get() > max) max = v.get(); max *= 0.8f; for (IntWritable v: values) if (v.get() >= max) c.write(key, v); } } ©2014 Cloudera, Inc. All rights reserved.38
39.
39 Close Enough (Moral) ©2014
Cloudera, Inc. All rights reserved.39 • For scalability reasons, the values iterable is single-shot • Subsequent iterators iterate over an empty collection • Store values (not Writables!) in the first pass • Better yet, restructure the logic to avoid storing all values in memory
40.
40 Overbyte public class MinLineMap extends
Mapper<LongWritable, Text,Text,Text> { Text k = new Text(); protected void map(LongWritable key, Text value, Context c) … { String val = value.toString(); k.set(val.substring(0, 1)); c.write(k, value); } } public class MinLineReduce extends Reducer<Text,Text, Text,IntWritable> { protected void reduce(Text key, Iterable<Text> values, Context c) … { int min = Integer.MAX_VALUE; for (Text v: values) if (v.getBytes().length < min) min = v.getBytes().length; c.write(key, new IntWritable(min)); } } ©2014 Cloudera, Inc. All rights reserved.40
41.
41 Overbyte The data: Hadoop Spark Hive Sqoop2 The results: a)
H 4 S 5 b) H 6 S 5 c) H 6 S 6 d) The job fails ©2014 Cloudera, Inc. All rights reserved.41
42.
42 Overbyte public class MinLineMap extends
Mapper<LongWritable, Text,Text,Text> { Text k = new Text(); protected void map(LongWritable key, Text value, Context c) … { String val = value.toString(); k.set(val.substring(0, 1)); c.write(k, value); } } public class MinLineReduce extends Reducer<Text,Text, Text,IntWritable> { protected void reduce(Text key, Iterable<Text> values, Context c) … { int min = Integer.MAX_VALUE; for (Text v: values) if (v.getBytes().length < min) min = v.getBytes().length; c.write(key, new IntWritable(min)); } } ©2014 Cloudera, Inc. All rights reserved.42 Hadoop Spark Hive Sqoop2
43.
43 Overbyte The data: Hadoop Spark Hive Sqoop2 The results: a)
H 4 S 5 b) H 6 S 5 c) H 6 S 6 d) The job fails ©2014 Cloudera, Inc. All rights reserved.43
44.
44 Overbyte (Answer) The data: Hadoop Spark Hive Sqoop2 The
results: a) H 4 S 5 b) H 6 S 5 c) H 6 S 6 d) The job fails ©2014 Cloudera, Inc. All rights reserved.44
45.
45 Overbyte (Problem) public class
MinLineMap extends Mapper<LongWritable, Text,Text,Text> { Text k = new Text(); protected void map(LongWritable key, Text value, Context c) … { String val = value.toString(); k.set(val.substring(0, 1)); c.write(k, value); } } public class MinLineReduce extends Reducer<Text,Text, Text,IntWritable> { protected void reduce(Text key, Iterable<Text> values, Context c) … { int min = Integer.MAX_VALUE; for (Text v: values) if (v.getBytes().length < min) min = v.getBytes().length; c.write(key, new IntWritable(min)); } } ©2014 Cloudera, Inc. All rights reserved.45
46.
46 Overbyte (Moral) ©2014 Cloudera,
Inc. All rights reserved.46 • Writables get reused in loops • In addition, Text.getBytes() reuses byte array allocated by previous calls • Net result is wrongness • Text.getLength() is the correct way to get the length of a Text.
47.
47 What We Learned ©2014
Cloudera, Inc. All rights reserved.47 • Beware of reuse of Writables • Always use @Override so your compiler can help you • Don’t assume you know what a method does because of the name or parameters – read the docs! • Sometimes scalability is inconvenient
48.
48 One Closing Note ©2014
Cloudera, Inc. All rights reserved.48 • Hadoop is still not easy • Being good takes effort and experience • Recognizing Hadoop talent can be hard • Cloudera’s is working to make Hadoop talent easier to recognize through certification http://cloudera.com/content/cloudera/en/training/cert ification.html
49.
49 ©2014 Cloudera,
Inc. All rights reserved. Aaron Myers & Daniel Templeton