SlideShare uma empresa Scribd logo
1 de 17
Introduction to Pig Xiafei.qiu@PCA
Nested Data Model Field, Tuple, Bag, Map
Normal Operators Arithmetic Operators X = FOREACH A GENERATE f1, f2, f1 % f2; Boolean Operators X = FILTER A BY (f1 == 8) OR (NOT (f2+f3 > f1)); Cast operators X = FOREACH B GENERATE group, (chararray) COUNT(A) AS total; Comparison Operators X = FILTER A BY (f1 matches '.*apache.*') OR (NOT (f2+f3 > f1)); Flatten Operator Tuple: remove a level of nesting Bag :remove a level of nesting, may cause cross product 
Normal Operators
Relational Operators LOADa bag of tuples A = LOAD 'data' [USING function] [AS schema];  STORE A = STORE alias INTO 'directory' [USING function]; FOREACHtuple in the bag, produce a new tuple A = FOREACH queries GENERATE uid, expandQuery(query); FILTERa bag to produce a subset of it A = FILTER queries BY uidneq ‘bot’ OR notBot(uid);
Relational Operators COGROUP/GROUPone or less than 127 relations alias = GROUP … by …, … by… {group: int, A: {name: chararray,age: int,gpa: float}} (18,{(John,18,4.0F),(Joe,18,3.8F)})
Relational Operators JOIN(inner/outer) Replicated Joins one or more relations are small enough to fit into main memory.  Skewed Joins computes a histogram of the key space and uses this data to allocate reducers for a given key.  Merge Joins Sorted, perform join on map phase
Relational Operators
Relational Operators ORDERalias by filed DESC/ASC Unstable SPLITalias INTO alias IF …, alias IF … CROSS cross product X = CROSS A, B; DISTINCT Removes duplicate tuples in a relation. X = DISTINCT A; LIMIT LIMITE A 3; SAMPLE SAMPLE alias size; IMPORT Import other .pig file DEFINE Define a Pig macro.
Built In Eval Function AVG/MAX/MIN/SUM  on a single column of a bag; group it first COUNT/ COUNT_STAR  number of elements in a bag; COUNT_STAR  counts null CONCAT DIFF IsEmpty SIZE TOKENIZE
Other Built In Function Load/Store Functions Math Functions String Functions
Map-Reduce Plan Compilation Compile each GROUP into distinct Map-Reduce job Push commands between LOAD and GROUP to the Map Side Commands between subsequent GROUP Gi and Gi+1 pushed into the Reduce  Side of Gi
Map-Reduce Plan Compilation ORDER is compiled into two map-reduce jobs. MR1: sample the key space MR2: sort
User Defined Function Simple Eval Function public class UPPER extends EvalFunc<String>{  public String exec(Tuple input) throws IOException {     // .......  }}
User Defined Function Aggregate Functions Algebraic Interface they can be computed incrementally in a distributed fashion. Accumulator Interface designed to decrease memory usage
Accumulator Interface public interface Accumulator <T> {    public void accumulate(Tuple b) throws IOException;      public T getValue();      public void cleanup();}
Aggregate Functions public interface Algebraic{        public String getInitial();        public String getIntermed();        public String getFinal();}

Mais conteúdo relacionado

Mais procurados

STACK ( LIFO STRUCTURE) - Data Structure
STACK ( LIFO STRUCTURE) - Data StructureSTACK ( LIFO STRUCTURE) - Data Structure
STACK ( LIFO STRUCTURE) - Data StructureYaksh Jethva
 
Mapreduce: Theory and implementation
Mapreduce: Theory and implementationMapreduce: Theory and implementation
Mapreduce: Theory and implementationSri Prasanna
 
Csa stack
Csa stackCsa stack
Csa stackPCTE
 
Presentation topic is stick data structure
Presentation topic is stick data structurePresentation topic is stick data structure
Presentation topic is stick data structureAizazAli21
 
Simple Linear Regression with R
Simple Linear Regression with RSimple Linear Regression with R
Simple Linear Regression with RJerome Gomes
 
Stacks Implementation and Examples
Stacks Implementation and ExamplesStacks Implementation and Examples
Stacks Implementation and Examplesgreatqadirgee4u
 
Stack organization
Stack organizationStack organization
Stack organizationchauhankapil
 
Data Visualization With R: Introduction
Data Visualization With R: IntroductionData Visualization With R: Introduction
Data Visualization With R: IntroductionRsquared Academy
 
Data Visualization With R: Learn To Combine Multiple Graphs
Data Visualization With R: Learn To Combine Multiple GraphsData Visualization With R: Learn To Combine Multiple Graphs
Data Visualization With R: Learn To Combine Multiple GraphsRsquared Academy
 

Mais procurados (20)

STACK ( LIFO STRUCTURE) - Data Structure
STACK ( LIFO STRUCTURE) - Data StructureSTACK ( LIFO STRUCTURE) - Data Structure
STACK ( LIFO STRUCTURE) - Data Structure
 
Introduction To Stack
Introduction To StackIntroduction To Stack
Introduction To Stack
 
Stack project
Stack projectStack project
Stack project
 
Algo>Stacks
Algo>StacksAlgo>Stacks
Algo>Stacks
 
Stack of Data structure
Stack of Data structureStack of Data structure
Stack of Data structure
 
Stack
StackStack
Stack
 
Stack - Data Structure
Stack - Data StructureStack - Data Structure
Stack - Data Structure
 
Mapreduce: Theory and implementation
Mapreduce: Theory and implementationMapreduce: Theory and implementation
Mapreduce: Theory and implementation
 
Stack push pop
Stack push popStack push pop
Stack push pop
 
Stack a Data Structure
Stack a Data StructureStack a Data Structure
Stack a Data Structure
 
Csa stack
Csa stackCsa stack
Csa stack
 
Presentation topic is stick data structure
Presentation topic is stick data structurePresentation topic is stick data structure
Presentation topic is stick data structure
 
stack presentation
stack presentationstack presentation
stack presentation
 
Simple Linear Regression with R
Simple Linear Regression with RSimple Linear Regression with R
Simple Linear Regression with R
 
Stack and queue
Stack and queueStack and queue
Stack and queue
 
Stacks Implementation and Examples
Stacks Implementation and ExamplesStacks Implementation and Examples
Stacks Implementation and Examples
 
Stack organization
Stack organizationStack organization
Stack organization
 
Stacks in c++
Stacks in c++Stacks in c++
Stacks in c++
 
Data Visualization With R: Introduction
Data Visualization With R: IntroductionData Visualization With R: Introduction
Data Visualization With R: Introduction
 
Data Visualization With R: Learn To Combine Multiple Graphs
Data Visualization With R: Learn To Combine Multiple GraphsData Visualization With R: Learn To Combine Multiple Graphs
Data Visualization With R: Learn To Combine Multiple Graphs
 

Semelhante a Introduction to pig

ADT STACK and Queues
ADT STACK and QueuesADT STACK and Queues
ADT STACK and QueuesBHARATH KUMAR
 
Map reduce (from Google)
Map reduce (from Google)Map reduce (from Google)
Map reduce (from Google)Sri Prasanna
 
Functional Programming for OO Programmers (part 2)
Functional Programming for OO Programmers (part 2)Functional Programming for OO Programmers (part 2)
Functional Programming for OO Programmers (part 2)Calvin Cheng
 
Distributed Computing Seminar - Lecture 2: MapReduce Theory and Implementation
Distributed Computing Seminar - Lecture 2: MapReduce Theory and ImplementationDistributed Computing Seminar - Lecture 2: MapReduce Theory and Implementation
Distributed Computing Seminar - Lecture 2: MapReduce Theory and Implementationtugrulh
 
Stacks,queues,linked-list
Stacks,queues,linked-listStacks,queues,linked-list
Stacks,queues,linked-listpinakspatel
 
Mapfilterreducepresentation
MapfilterreducepresentationMapfilterreducepresentation
MapfilterreducepresentationManjuKumara GH
 
Functional Programming Concepts for Imperative Programmers
Functional Programming Concepts for Imperative ProgrammersFunctional Programming Concepts for Imperative Programmers
Functional Programming Concepts for Imperative ProgrammersChris
 
Type class survival guide
Type class survival guideType class survival guide
Type class survival guideMark Canlas
 
Data structures stacks
Data structures   stacksData structures   stacks
Data structures stacksmaamir farooq
 
Fp in scala part 1
Fp in scala part 1Fp in scala part 1
Fp in scala part 1Hang Zhao
 
Composition birds-and-recursion
Composition birds-and-recursionComposition birds-and-recursion
Composition birds-and-recursionDavid Atchley
 
Functional Pearls 4 (YAPC::EU::2009 remix)
Functional Pearls 4 (YAPC::EU::2009 remix)Functional Pearls 4 (YAPC::EU::2009 remix)
Functional Pearls 4 (YAPC::EU::2009 remix)osfameron
 

Semelhante a Introduction to pig (20)

Lec_4_1_IntrotoPIG.pptx
Lec_4_1_IntrotoPIG.pptxLec_4_1_IntrotoPIG.pptx
Lec_4_1_IntrotoPIG.pptx
 
4.1-Pig.pptx
4.1-Pig.pptx4.1-Pig.pptx
4.1-Pig.pptx
 
Hadoop Pig
Hadoop PigHadoop Pig
Hadoop Pig
 
ADT STACK and Queues
ADT STACK and QueuesADT STACK and Queues
ADT STACK and Queues
 
Map reduce (from Google)
Map reduce (from Google)Map reduce (from Google)
Map reduce (from Google)
 
Functional Programming for OO Programmers (part 2)
Functional Programming for OO Programmers (part 2)Functional Programming for OO Programmers (part 2)
Functional Programming for OO Programmers (part 2)
 
Lec2 Mapred
Lec2 MapredLec2 Mapred
Lec2 Mapred
 
Distributed Computing Seminar - Lecture 2: MapReduce Theory and Implementation
Distributed Computing Seminar - Lecture 2: MapReduce Theory and ImplementationDistributed Computing Seminar - Lecture 2: MapReduce Theory and Implementation
Distributed Computing Seminar - Lecture 2: MapReduce Theory and Implementation
 
Stacks,queues,linked-list
Stacks,queues,linked-listStacks,queues,linked-list
Stacks,queues,linked-list
 
Mapfilterreducepresentation
MapfilterreducepresentationMapfilterreducepresentation
Mapfilterreducepresentation
 
Functional Programming Concepts for Imperative Programmers
Functional Programming Concepts for Imperative ProgrammersFunctional Programming Concepts for Imperative Programmers
Functional Programming Concepts for Imperative Programmers
 
Type class survival guide
Type class survival guideType class survival guide
Type class survival guide
 
Data structures stacks
Data structures   stacksData structures   stacks
Data structures stacks
 
stacks and queues
stacks and queuesstacks and queues
stacks and queues
 
Fp in scala part 1
Fp in scala part 1Fp in scala part 1
Fp in scala part 1
 
Composition birds-and-recursion
Composition birds-and-recursionComposition birds-and-recursion
Composition birds-and-recursion
 
Pig
PigPig
Pig
 
Functional Pearls 4 (YAPC::EU::2009 remix)
Functional Pearls 4 (YAPC::EU::2009 remix)Functional Pearls 4 (YAPC::EU::2009 remix)
Functional Pearls 4 (YAPC::EU::2009 remix)
 
Zenith it-hadoop-training
Zenith it-hadoop-trainingZenith it-hadoop-training
Zenith it-hadoop-training
 
Pig workshop
Pig workshopPig workshop
Pig workshop
 

Último

How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 

Último (20)

How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

Introduction to pig