SlideShare a Scribd company logo
1 of 11
Hive – SerDe and LazySerDe ,[object Object],[object Object]
Where is SerDe? File on HDFS Hierarchical Object Writable Stream Stream Hierarchical Object Map Output File Writable Writable Writable Writable Writable Hierarchical Object File on HDFS User Script Hierarchical Object Hierarchical Object Hive Operator Hive Operator SerDe FileFormat  / Hadoop Serialization Mapper Reducer ObjectInspector imp 1.0 3 54 Imp 0.2 1 33 clk 2.2 8 212 Imp 0.7 2 22  thrift_record<…> thrift_record<…> thrift_record<…> thrift_record<…> BytesWritable(3F647200) Text(‘imp 1.0 3 54’) // UTF8 encoded Java Object Object of a Java Class Standard Object Use ArrayList for struct and array Use HashMap for map LazyObject Lazily-deserialized
SerDe, ObjectInspector and TypeInfo Hierarchical Object Writable Writable Struct int string list struct map string string Hierarchical Object String Object TypeInfo BytesWritable(3F647200) Text(‘ a=av:b=bv 23 1:2=4:5 abcd ’) class HO { HashMap<String, String> a, Integer b, List<ClassC> c, String d; } Class ClassC { Integer a, Integer b; } List ( HashMap(“a”    “av”, “b”    “bv”), 23, List(List(1,null),List(2,4),List(5,null)), “ abcd” ) int int HashMap(“a”    “av”, “b”    “bv”), HashMap<String, String> a, “ av” getType ObjectInspector1 getFieldOI getStructField getType ObjectInspector2 getMapValueOI getMapValue deserialize SerDe serialize getOI getType ObjectInspector3
LazySimpleSerDe components LazyStruct LazyInteger LazyString LazyMap LazyString LazyString LazyString LazyString LazyStructOI(“ “) LazyArrayOI(“:”) LazyMapOI(“:”,”=“) StandardIntegerOI StandardStringOI StandardStringOI byte[] data Hierarchical Object / LazyObject One Per SerDe instance LazyObjectInspector Singleton byte[](‘ a=av:b=bv 23 1:2=4:5 abcd ’) LazyStruct LazyStructOI(“=“) StandardIntegerOI LazyStruct LazyArray LazyInteger LazyInteger LazyInteger LazyInteger
LazyPrimitive ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
LazyNonPrimitive ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Why another SerDe? ,[object Object],[object Object],[object Object],[object Object],[object Object]
Features of LazySimpleSerDe ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Profiling result of a mapper ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
TypeInfo String specification ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Future of SerDe ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

More Related Content

What's hot

Mapreduce in Search
Mapreduce in SearchMapreduce in Search
Mapreduce in SearchAmund Tveit
 
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant) Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant) BigDataEverywhere
 
HCatalog Hadoop Summit 2011
HCatalog Hadoop Summit 2011HCatalog Hadoop Summit 2011
HCatalog Hadoop Summit 2011Hortonworks
 
Introduction to Pig & Pig Latin | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Pig & Pig Latin | Big Data Hadoop Spark Tutorial | CloudxLabIntroduction to Pig & Pig Latin | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Pig & Pig Latin | Big Data Hadoop Spark Tutorial | CloudxLabCloudxLab
 
Scalding - Hadoop Word Count in LESS than 70 lines of code
Scalding - Hadoop Word Count in LESS than 70 lines of codeScalding - Hadoop Word Count in LESS than 70 lines of code
Scalding - Hadoop Word Count in LESS than 70 lines of codeKonrad Malawski
 
Future of HCatalog - Hadoop Summit 2012
Future of HCatalog - Hadoop Summit 2012Future of HCatalog - Hadoop Summit 2012
Future of HCatalog - Hadoop Summit 2012Hortonworks
 
Introduction to Scalding and Monoids
Introduction to Scalding and MonoidsIntroduction to Scalding and Monoids
Introduction to Scalding and MonoidsHugo Gävert
 
Declarative Internal DSLs in Lua: A Game Changing Experience
Declarative Internal DSLs in Lua: A Game Changing ExperienceDeclarative Internal DSLs in Lua: A Game Changing Experience
Declarative Internal DSLs in Lua: A Game Changing ExperienceAlexander Gladysh
 
Intro To Cascading
Intro To CascadingIntro To Cascading
Intro To CascadingNate Murray
 
Writing Hadoop Jobs in Scala using Scalding
Writing Hadoop Jobs in Scala using ScaldingWriting Hadoop Jobs in Scala using Scalding
Writing Hadoop Jobs in Scala using ScaldingToni Cebrián
 
EuroPython 2015 - Big Data with Python and Hadoop
EuroPython 2015 - Big Data with Python and HadoopEuroPython 2015 - Big Data with Python and Hadoop
EuroPython 2015 - Big Data with Python and HadoopMax Tepkeev
 
Should I Use Scalding or Scoobi or Scrunch?
Should I Use Scalding or Scoobi or Scrunch? Should I Use Scalding or Scoobi or Scrunch?
Should I Use Scalding or Scoobi or Scrunch? DataWorks Summit
 
Implementing a many-to-many Relationship with Slick
Implementing a many-to-many Relationship with SlickImplementing a many-to-many Relationship with Slick
Implementing a many-to-many Relationship with SlickHermann Hueck
 
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj TalkSpark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj TalkZalando Technology
 
Search Engine Building with Lucene and Solr (So Code Camp San Diego 2014)
Search Engine Building with Lucene and Solr (So Code Camp San Diego 2014)Search Engine Building with Lucene and Solr (So Code Camp San Diego 2014)
Search Engine Building with Lucene and Solr (So Code Camp San Diego 2014)Kai Chan
 
Lightning fast analytics with Spark and Cassandra
Lightning fast analytics with Spark and CassandraLightning fast analytics with Spark and Cassandra
Lightning fast analytics with Spark and CassandraRustam Aliyev
 
Scalding: Reaching Efficient MapReduce
Scalding: Reaching Efficient MapReduceScalding: Reaching Efficient MapReduce
Scalding: Reaching Efficient MapReduceLivePerson
 
2014 holden - databricks umd scala crash course
2014   holden - databricks umd scala crash course2014   holden - databricks umd scala crash course
2014 holden - databricks umd scala crash courseHolden Karau
 
Object Relational Mapping in PHP
Object Relational Mapping in PHPObject Relational Mapping in PHP
Object Relational Mapping in PHPRob Knight
 
Avro, la puissance du binaire, la souplesse du JSON
Avro, la puissance du binaire, la souplesse du JSONAvro, la puissance du binaire, la souplesse du JSON
Avro, la puissance du binaire, la souplesse du JSONAlexandre Victoor
 

What's hot (20)

Mapreduce in Search
Mapreduce in SearchMapreduce in Search
Mapreduce in Search
 
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant) Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
 
HCatalog Hadoop Summit 2011
HCatalog Hadoop Summit 2011HCatalog Hadoop Summit 2011
HCatalog Hadoop Summit 2011
 
Introduction to Pig & Pig Latin | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Pig & Pig Latin | Big Data Hadoop Spark Tutorial | CloudxLabIntroduction to Pig & Pig Latin | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Pig & Pig Latin | Big Data Hadoop Spark Tutorial | CloudxLab
 
Scalding - Hadoop Word Count in LESS than 70 lines of code
Scalding - Hadoop Word Count in LESS than 70 lines of codeScalding - Hadoop Word Count in LESS than 70 lines of code
Scalding - Hadoop Word Count in LESS than 70 lines of code
 
Future of HCatalog - Hadoop Summit 2012
Future of HCatalog - Hadoop Summit 2012Future of HCatalog - Hadoop Summit 2012
Future of HCatalog - Hadoop Summit 2012
 
Introduction to Scalding and Monoids
Introduction to Scalding and MonoidsIntroduction to Scalding and Monoids
Introduction to Scalding and Monoids
 
Declarative Internal DSLs in Lua: A Game Changing Experience
Declarative Internal DSLs in Lua: A Game Changing ExperienceDeclarative Internal DSLs in Lua: A Game Changing Experience
Declarative Internal DSLs in Lua: A Game Changing Experience
 
Intro To Cascading
Intro To CascadingIntro To Cascading
Intro To Cascading
 
Writing Hadoop Jobs in Scala using Scalding
Writing Hadoop Jobs in Scala using ScaldingWriting Hadoop Jobs in Scala using Scalding
Writing Hadoop Jobs in Scala using Scalding
 
EuroPython 2015 - Big Data with Python and Hadoop
EuroPython 2015 - Big Data with Python and HadoopEuroPython 2015 - Big Data with Python and Hadoop
EuroPython 2015 - Big Data with Python and Hadoop
 
Should I Use Scalding or Scoobi or Scrunch?
Should I Use Scalding or Scoobi or Scrunch? Should I Use Scalding or Scoobi or Scrunch?
Should I Use Scalding or Scoobi or Scrunch?
 
Implementing a many-to-many Relationship with Slick
Implementing a many-to-many Relationship with SlickImplementing a many-to-many Relationship with Slick
Implementing a many-to-many Relationship with Slick
 
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj TalkSpark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
 
Search Engine Building with Lucene and Solr (So Code Camp San Diego 2014)
Search Engine Building with Lucene and Solr (So Code Camp San Diego 2014)Search Engine Building with Lucene and Solr (So Code Camp San Diego 2014)
Search Engine Building with Lucene and Solr (So Code Camp San Diego 2014)
 
Lightning fast analytics with Spark and Cassandra
Lightning fast analytics with Spark and CassandraLightning fast analytics with Spark and Cassandra
Lightning fast analytics with Spark and Cassandra
 
Scalding: Reaching Efficient MapReduce
Scalding: Reaching Efficient MapReduceScalding: Reaching Efficient MapReduce
Scalding: Reaching Efficient MapReduce
 
2014 holden - databricks umd scala crash course
2014   holden - databricks umd scala crash course2014   holden - databricks umd scala crash course
2014 holden - databricks umd scala crash course
 
Object Relational Mapping in PHP
Object Relational Mapping in PHPObject Relational Mapping in PHP
Object Relational Mapping in PHP
 
Avro, la puissance du binaire, la souplesse du JSON
Avro, la puissance du binaire, la souplesse du JSONAvro, la puissance du binaire, la souplesse du JSON
Avro, la puissance du binaire, la souplesse du JSON
 

Similar to Hive SerDe and LazySerDe overview

VelocityGraph Introduction
VelocityGraph IntroductionVelocityGraph Introduction
VelocityGraph IntroductionMats Persson
 
A really really fast introduction to PySpark - lightning fast cluster computi...
A really really fast introduction to PySpark - lightning fast cluster computi...A really really fast introduction to PySpark - lightning fast cluster computi...
A really really fast introduction to PySpark - lightning fast cluster computi...Holden Karau
 
Mapreduce by examples
Mapreduce by examplesMapreduce by examples
Mapreduce by examplesAndrea Iacono
 
devLink - What's New in C# 4?
devLink - What's New in C# 4?devLink - What's New in C# 4?
devLink - What's New in C# 4?Kevin Pilch
 
Allura - an Open Source MongoDB Based Document Oriented SourceForge
Allura - an Open Source MongoDB Based Document Oriented SourceForgeAllura - an Open Source MongoDB Based Document Oriented SourceForge
Allura - an Open Source MongoDB Based Document Oriented SourceForgeRick Copeland
 
Alpine academy apache spark series #1 introduction to cluster computing wit...
Alpine academy apache spark series #1   introduction to cluster computing wit...Alpine academy apache spark series #1   introduction to cluster computing wit...
Alpine academy apache spark series #1 introduction to cluster computing wit...Holden Karau
 
Introduction to jQuery
Introduction to jQueryIntroduction to jQuery
Introduction to jQueryGunjan Kumar
 
Introduction to Client-Side Javascript
Introduction to Client-Side JavascriptIntroduction to Client-Side Javascript
Introduction to Client-Side JavascriptJulie Iskander
 
Scalable and Flexible Machine Learning With Scala @ LinkedIn
Scalable and Flexible Machine Learning With Scala @ LinkedInScalable and Flexible Machine Learning With Scala @ LinkedIn
Scalable and Flexible Machine Learning With Scala @ LinkedInVitaly Gordon
 
Ida python intro
Ida python introIda python intro
Ida python intro小静 安
 
PofEAA and SQLAlchemy
PofEAA and SQLAlchemyPofEAA and SQLAlchemy
PofEAA and SQLAlchemyInada Naoki
 
.NET Database Toolkit
.NET Database Toolkit.NET Database Toolkit
.NET Database Toolkitwlscaudill
 
Serializing EMF models with Xtext
Serializing EMF models with XtextSerializing EMF models with Xtext
Serializing EMF models with Xtextmeysholdt
 
Introduction To Groovy 2005
Introduction To Groovy 2005Introduction To Groovy 2005
Introduction To Groovy 2005Tugdual Grall
 
Js info vis_toolkit
Js info vis_toolkitJs info vis_toolkit
Js info vis_toolkitnikhilyagnic
 

Similar to Hive SerDe and LazySerDe overview (20)

VelocityGraph Introduction
VelocityGraph IntroductionVelocityGraph Introduction
VelocityGraph Introduction
 
A really really fast introduction to PySpark - lightning fast cluster computi...
A really really fast introduction to PySpark - lightning fast cluster computi...A really really fast introduction to PySpark - lightning fast cluster computi...
A really really fast introduction to PySpark - lightning fast cluster computi...
 
Big Data Analytics Part2
Big Data Analytics Part2Big Data Analytics Part2
Big Data Analytics Part2
 
Mapreduce by examples
Mapreduce by examplesMapreduce by examples
Mapreduce by examples
 
Dynamic Python
Dynamic PythonDynamic Python
Dynamic Python
 
devLink - What's New in C# 4?
devLink - What's New in C# 4?devLink - What's New in C# 4?
devLink - What's New in C# 4?
 
Allura - an Open Source MongoDB Based Document Oriented SourceForge
Allura - an Open Source MongoDB Based Document Oriented SourceForgeAllura - an Open Source MongoDB Based Document Oriented SourceForge
Allura - an Open Source MongoDB Based Document Oriented SourceForge
 
Alpine academy apache spark series #1 introduction to cluster computing wit...
Alpine academy apache spark series #1   introduction to cluster computing wit...Alpine academy apache spark series #1   introduction to cluster computing wit...
Alpine academy apache spark series #1 introduction to cluster computing wit...
 
Introduction to jQuery
Introduction to jQueryIntroduction to jQuery
Introduction to jQuery
 
JavaEE Spring Seam
JavaEE Spring SeamJavaEE Spring Seam
JavaEE Spring Seam
 
Introduction to Client-Side Javascript
Introduction to Client-Side JavascriptIntroduction to Client-Side Javascript
Introduction to Client-Side Javascript
 
Scalable and Flexible Machine Learning With Scala @ LinkedIn
Scalable and Flexible Machine Learning With Scala @ LinkedInScalable and Flexible Machine Learning With Scala @ LinkedIn
Scalable and Flexible Machine Learning With Scala @ LinkedIn
 
Pune Clojure Course Outline
Pune Clojure Course OutlinePune Clojure Course Outline
Pune Clojure Course Outline
 
Ida python intro
Ida python introIda python intro
Ida python intro
 
PofEAA and SQLAlchemy
PofEAA and SQLAlchemyPofEAA and SQLAlchemy
PofEAA and SQLAlchemy
 
.NET Database Toolkit
.NET Database Toolkit.NET Database Toolkit
.NET Database Toolkit
 
Java
JavaJava
Java
 
Serializing EMF models with Xtext
Serializing EMF models with XtextSerializing EMF models with Xtext
Serializing EMF models with Xtext
 
Introduction To Groovy 2005
Introduction To Groovy 2005Introduction To Groovy 2005
Introduction To Groovy 2005
 
Js info vis_toolkit
Js info vis_toolkitJs info vis_toolkit
Js info vis_toolkit
 

Recently uploaded

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 

Recently uploaded (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 

Hive SerDe and LazySerDe overview

  • 1.
  • 2. Where is SerDe? File on HDFS Hierarchical Object Writable Stream Stream Hierarchical Object Map Output File Writable Writable Writable Writable Writable Hierarchical Object File on HDFS User Script Hierarchical Object Hierarchical Object Hive Operator Hive Operator SerDe FileFormat / Hadoop Serialization Mapper Reducer ObjectInspector imp 1.0 3 54 Imp 0.2 1 33 clk 2.2 8 212 Imp 0.7 2 22 thrift_record<…> thrift_record<…> thrift_record<…> thrift_record<…> BytesWritable(3F647200) Text(‘imp 1.0 3 54’) // UTF8 encoded Java Object Object of a Java Class Standard Object Use ArrayList for struct and array Use HashMap for map LazyObject Lazily-deserialized
  • 3. SerDe, ObjectInspector and TypeInfo Hierarchical Object Writable Writable Struct int string list struct map string string Hierarchical Object String Object TypeInfo BytesWritable(3F647200) Text(‘ a=av:b=bv 23 1:2=4:5 abcd ’) class HO { HashMap<String, String> a, Integer b, List<ClassC> c, String d; } Class ClassC { Integer a, Integer b; } List ( HashMap(“a”  “av”, “b”  “bv”), 23, List(List(1,null),List(2,4),List(5,null)), “ abcd” ) int int HashMap(“a”  “av”, “b”  “bv”), HashMap<String, String> a, “ av” getType ObjectInspector1 getFieldOI getStructField getType ObjectInspector2 getMapValueOI getMapValue deserialize SerDe serialize getOI getType ObjectInspector3
  • 4. LazySimpleSerDe components LazyStruct LazyInteger LazyString LazyMap LazyString LazyString LazyString LazyString LazyStructOI(“ “) LazyArrayOI(“:”) LazyMapOI(“:”,”=“) StandardIntegerOI StandardStringOI StandardStringOI byte[] data Hierarchical Object / LazyObject One Per SerDe instance LazyObjectInspector Singleton byte[](‘ a=av:b=bv 23 1:2=4:5 abcd ’) LazyStruct LazyStructOI(“=“) StandardIntegerOI LazyStruct LazyArray LazyInteger LazyInteger LazyInteger LazyInteger
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.