Cascalog
                   Logic programming for Hadoop

                             Bertrand Dechoux   October 13, 2012

Saturday, October 13, 2012
MapReduce: what about you?

 Python
      ▶   map(function, iterable, ...)
      ▶   reduce(function,iterable[, initializer])


 Perl
      ▶   map BLOCK LIST
      ▶   reduce BLOCK LIST


 Ruby
      ▶   map {|item| block} -> new_ary / collect {|item| block} -> new_ary
      ▶   reduce(initial,sym) -> obj / inject(initial,sym) -> obj


 Smalltalk
      ▶   collect:aBlock=TheArray
      ▶   inject: thisValue into: binaryBlock


 PHP
      ▶   array array_map ( callable $callback, array $arr1 [, array $...])
      ▶   mixed array_reduce (array $input, callable $function [, mixed $initial = NULL])
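
As a minimal sketch of the two primitives listed above, here is a word count built only from `map` and `reduce` in Python. In Python 3, `reduce` lives in `functools` rather than in the built-ins. The function name `count_words` and the sample lines are illustrative assumptions, not part of the original slides.

```python
from functools import reduce


def count_words(lines):
    """Count word occurrences using only map and reduce."""
    # map: turn each line into its list of words
    words_per_line = map(str.split, lines)

    # reduce: fold each list of words into a running dictionary of counts
    def add_line(counts, words):
        for word in words:
            counts[word] = counts.get(word, 0) + 1
        return counts

    return reduce(add_line, words_per_line, {})


if __name__ == "__main__":
    print(count_words(["to be or not", "to be"]))
```

Everything here runs in a single process; the next slides are about what changes when the same two steps are distributed by Hadoop.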




Hadoop MapReduce: the theory




 Map
      ▶   Map(k1,v1) -> list(k2,v2)



 Reduce
      ▶   Reduce(k2, list (v2)) -> list(k3,v3)




Hadoop MapReduce: the theory




 Map
      ▶ Map(k1,v1) -> list(k2,v2)
      ▶ SortByKey(list(k2,v2)) -> list(k2,v2)



 Reduce
      ▶ MergeByKey(list,list,...) -> list(k2,list(v2))
      ▶ Reduce(k2, list (v2)) -> list(k3,v3)
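
The four signatures above can be simulated in a few lines of plain Python. This is only a single-process sketch under simplifying assumptions (one reducer, no partitioning, everything in memory), not how Hadoop itself is implemented; the names `map_phase`, `reduce_phase` and `run_word_count` are hypothetical.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def map_phase(records, mapper):
    """Map(k1, v1) -> list(k2, v2), then SortByKey on the mapper's output."""
    pairs = [pair for k1, v1 in records for pair in mapper(k1, v1)]
    return sorted(pairs, key=itemgetter(0))


def reduce_phase(sorted_runs, reducer):
    """MergeByKey over the sorted runs, then Reduce(k2, list(v2))."""
    # Merge the already sorted outputs of every mapper into one sorted stream.
    merged = heapq.merge(*sorted_runs, key=itemgetter(0))
    output = []
    # Consecutive pairs with the same key form one (k2, list(v2)) group.
    for k2, group in groupby(merged, key=itemgetter(0)):
        output.extend(reducer(k2, [v2 for _, v2 in group]))
    return output


def word_count_map(offset, line):
    return [(word, 1) for word in line.split()]


def word_count_reduce(word, counts):
    return [(word, sum(counts))]


def run_word_count(splits):
    # Each split stands for the input of one mapper.
    runs = [map_phase(split, word_count_map) for split in splits]
    return reduce_phase(runs, word_count_reduce)


if __name__ == "__main__":
    print(run_word_count([[(0, "a b a")], [(0, "b c")]]))
```

Sorting on the map side is what lets the reduce side merge the runs cheaply instead of sorting everything again.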




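Hadoop MapReduce: in practice

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    // Emits (word, 1) for every token of every input line.
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }

    // Sums the counts received for each word.
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Constructor as on the original slide; Hadoop 2 and later
        // prefer Job.getInstance(conf, "wordcount").
        Job job = new Job(conf, "wordcount");
        // Lets the cluster locate the jar that contains Map and Reduce.
        job.setJarByClass(WordCount.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        // args[0] is the input path, args[1] the output path.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}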


Cascading: necessary abstractions




Cascading : ‘field algebra’ ?!




Cascalog
                   Logic programming for Hadoop

 (my-predicate ?var1 42 ?var3 :> ?var4 ?var5)




Cascalog : select ... from ...


 (?<- (stdout) [?person] (person ?person))


 (?<- (stdout) [?person ?age] (age ?person ?age))


 (?<- (stdout) [?age] (age _ ?age))


 (?<- (stdout) [?person] (age ?person 42))


Cascalog: select ... from ... where

 (?<- (stdout) [?person ?age]
      (age ?person ?age)
      (< ?age 30))
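
For readers more at home with imperative code, the query above behaves roughly like the following filter over `(person, age)` tuples. This is a plain-Python analogy, not Cascalog's execution model; the sample rows in `AGE` and the function name are made up for illustration.

```python
# Hypothetical sample data standing in for the `age` source.
AGE = [("alice", 28), ("bob", 33), ("chris", 40), ("david", 25)]


def persons_younger_than(age_rows, limit):
    """Keep the (person, age) rows whose age is below the limit."""
    return [(person, age) for person, age in age_rows if age < limit]


if __name__ == "__main__":
    print(persons_younger_than(AGE, 30))
```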




Cascalog: select ... as ... from ...

 (?<- (stdout) [?person ?junior]
      (age ?person ?age)
      (< ?age 30 :> ?junior))




Cascalog: select count(*) from ... group by ...

 (?<- (stdout) [?count]
      (age _ _)
      (c/count ?count))




Cascalog: select count(*) from ... group by ...

 (?<- (stdout) [?junior ?count]
      (age _ ?age)
      (< ?age 30 :> ?junior)
      (c/count ?count))
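
In the query above, the output variable that is not aggregated (`?junior`) plays the role of the grouping key. A rough plain-Python analogy, with hypothetical sample rows and a hypothetical function name, could look like this:

```python
from collections import Counter

# Hypothetical sample data standing in for the `age` source.
AGE = [("alice", 28), ("bob", 33), ("chris", 40), ("david", 25)]


def count_by_junior(age_rows, limit=30):
    """Group rows by the boolean `age < limit` and count each group."""
    return dict(Counter(age < limit for _, age in age_rows))


if __name__ == "__main__":
    print(count_by_junior(AGE))
```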




Cascalog: select ... from ... join ...

 (?<- (stdout) [?person ?age ?gender]
      (age ?person ?age)
      (gender ?person ?gender))
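
Reusing `?person` in both predicates is what expresses the join: only persons present in both sources appear in the result. The sketch below shows the same inner join in plain Python, assuming at most one gender row per person; the sample data and the function name are illustrative only.

```python
# Hypothetical sample data standing in for the `age` and `gender` sources.
AGE = [("alice", 28), ("bob", 33), ("chris", 40)]
GENDER = [("alice", "f"), ("bob", "m"), ("emily", "f")]


def join_on_person(age_rows, gender_rows):
    """Inner join of (person, age) and (person, gender) on person."""
    # Assumption: each person appears at most once in gender_rows.
    gender_by_person = dict(gender_rows)
    return [
        (person, age, gender_by_person[person])
        for person, age in age_rows
        if person in gender_by_person
    ]


if __name__ == "__main__":
    print(join_on_person(AGE, GENDER))
```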




Cascalog: select ... from ... (select ...)

 (let [many-follows (<- [?person]
                        (follows ?person _)
                        (c/count ?count)
                        (> ?count 2))]
   (?<- (stdout) [?personA ?personB]
        (many-follows ?personA)
        (many-follows ?personB)
        (follows ?personA ?personB)))
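
The subquery first selects the persons who follow more than two others, then the outer query keeps the follow relations where both ends belong to that set. A plain-Python analogy, with hypothetical sample data and a hypothetical function name, might read:

```python
from collections import Counter

# Hypothetical sample data: (follower, followed) pairs.
FOLLOWS = [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "a"), ("b", "c"), ("b", "d"),
    ("c", "a"),
]


def follows_between_heavy_followers(follows, threshold=2):
    """Keep the pairs whose two members each follow more than `threshold` people."""
    # Equivalent of the many-follows subquery: count per follower, then filter.
    counts = Counter(follower for follower, _ in follows)
    many_follows = {person for person, n in counts.items() if n > threshold}
    # Equivalent of the outer query: both ends must be in many_follows.
    return [(a, b) for a, b in follows if a in many_follows and b in many_follows]


if __name__ == "__main__":
    print(follows_between_heavy_followers(FOLLOWS))
```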




Cascalog: defining your own functions

 (defn toUpperCase [person] (.toUpperCase person))

 (?<- (stdout) [?PERSON]
      (person ?person)
      (toUpperCase ?person :> ?PERSON))




A conclusion?

 'New' datastores, 'new' kinds of querying
      ▶   Cascalog, RDF, Datomic, Neo4j ...

 An affinity between the functional paradigm
      ▶ and data processing?
      ▶ What about you? Cascalog, but also...

                             PIG
                             ...
http://blog.xebia.fr/author/bdechoux/

                             @BertrandDechoux




                                  ?
OSDC.fr 2012 :: Cascalog: logic programming for Hadoop
