Save time by parallelizing in R

Maxime Tô

June 12, 2012
Parallelizing in R

       We use the snow package here:
       http://www.sfu.ca/~sblay/R/snow.html
This presentation is based on my own experience with R. I do not know
whether this approach is optimal, but it has saved me a lot of time...
   How does parallel computing work?
      With the snow package, we open as many R sessions as the
      number of nodes we choose:
      library(snow)
      cl <- makeCluster(3, type = "SOCK")

       The clusterEvalQ() function executes R code on all
       sessions:

    clusterEvalQ(cl, ls())
   > clusterEvalQ(cl, 1 + 1)
   [[1]]
   [1] 2
   [[2]]
   [1] 2
   [[3]]
   [1] 2

       Nodes can also be addressed individually by subsetting the cluster:

   > clusterEvalQ(cl[1], a <- 1)
   > clusterEvalQ(cl[2], a <- 2)
   > clusterEvalQ(cl[3], a <- 3)
   > clusterEvalQ(cl, a)
   [[1]]
   [1] 1

   [[2]]
   [1] 2

   [[3]]
   [1] 3


       The snow package comes with parallelized versions of many
       common R functions, such as parLapply and parApply, but they
       are not always efficient:

   > a <- matrix(rnorm(10000000), ncol = 1000)
   > system.time(apply(a, 1, sum))
      user  system elapsed
      0.27    0.02    0.28
   > system.time(parApply(cl, a, 1, sum))
      user  system elapsed
      0.67    0.39    1.09




   Parallel code is not always faster:
       Data must be serialized, sent to the nodes, and unserialized,
       which always takes some time
       If the data is huge, R may need a while to copy it...
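
To get a feel for this overhead, one can time the serialization step alone. The following is a minimal sketch in base R, assuming a smaller matrix than the one in the slides; the variable names `bytes`, `b`, `t_ser`, and `t_sum` are illustrative and not part of the original presentation.

```r
# Assumption: a 10,000 x 100 matrix, smaller than the slides' 10,000 x 1,000 one
a <- matrix(rnorm(1e6), ncol = 100)

# serialize() turns an R object into a raw vector; this is roughly the
# payload that has to travel to a worker session
t_ser <- system.time(bytes <- serialize(a, NULL))

# The actual computation, for comparison
t_sum <- system.time(rs <- apply(a, 1, sum))

length(bytes)            # payload size in bytes (a bit more than 8 per double)
b <- unserialize(bytes)  # the receiving side pays a similar cost to rebuild it
identical(a, b)          # the round trip preserves the object

t_ser["elapsed"]
t_sum["elapsed"]
```

When the serialization time is comparable to the computation time, sending the data on every call is likely to cancel out the gain from running in parallel.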

       One solution is to export the data to the nodes first and then
       execute the code on each node:

   > #### First export: each node receives its own block of rows
   > rows <- clusterSplit(cl, 1:10000)
   > for (cc in 1:3){
   + aa <- a[rows[[cc]],]
   + clusterExport(cl[cc], "aa")
   + }
   > #### Then execute
   >
   > system.time(do.call("c",
   + clusterEvalQ(cl, apply(aa, 1, sum))))
      user  system elapsed
      0.00    0.00    0.16


   Of course, exporting the data first is not necessarily optimal...
   but in many cases it is useful:
       When many computations must be run on the same dataset
       For any iterative method:
            Bootstrap
            Iterative estimation: ML, GMM, etc.
       The idea is to export the data once and then execute the code
       on the different nodes as often as needed
       Exporting data is the costly step. Combining the results is
       often quite easy (sum, c, cbind, etc.)
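
The bootstrap case can be sketched as follows. This is an illustration under stated assumptions, not the author's code: it uses the `parallel` package that ships with R (which provides snow-derived functions under the same names) instead of snow itself, two local workers, and a toy vector `x`; `clusterSetRNGStream()` is a `parallel` function used here so that the workers draw different resamples.

```r
library(parallel)  # assumption: stand-in for snow, same function names

set.seed(42)
x  <- rnorm(1e5)   # toy dataset (assumption)
cl <- makeCluster(2)                     # 2 local workers (assumption)

# Costly step, done once: every worker receives a full copy of x
clusterExport(cl, "x", envir = environment())

# Give each worker its own random-number stream so the resamples differ
clusterSetRNGStream(cl, iseed = 123)

# Cheap step, repeatable: each worker resamples its local copy 100 times
boot <- clusterEvalQ(cl, replicate(100, mean(sample(x, replace = TRUE))))

boot_means <- do.call(c, boot)           # combining the results is a simple c()
stopCluster(cl)

length(boot_means)                       # 2 workers x 100 replicates
sd(boot_means)                           # bootstrap standard error of the mean
```

Only the call to `clusterEvalQ()` needs to be repeated for further computations on the same data; the export is not paid again.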
A simple problem




      We want to estimate a probit model
      ML estimation is iterative: each iteration needs the partial
      derivatives that make up the gradient and the Hessian matrix
      With numerical derivatives, the objective function must
      therefore be evaluated many times
      Reducing the time of one evaluation greatly reduces the total
      estimation time...
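
To make the evaluation count concrete, here is a small sketch of a central-difference gradient that counts how often the objective is called. The function `num_grad` and the toy objective `f` are hypothetical and only serve the illustration; real optimizers may use different schemes.

```r
# Central-difference gradient that also counts objective evaluations
num_grad <- function(f, theta, h = 1e-5) {
  calls <- 0
  g <- numeric(length(theta))
  for (k in seq_along(theta)) {
    e <- replace(numeric(length(theta)), k, h)  # step along coordinate k
    g[k] <- (f(theta + e) - f(theta - e)) / (2 * h)
    calls <- calls + 2                          # two evaluations per parameter
  }
  list(gradient = g, calls = calls)
}

# Hypothetical toy objective with its maximum at (1, 2, -0.5)
f <- function(theta) -sum((theta - c(1, 2, -0.5))^2)

res <- num_grad(f, c(0, 0, 0))
res$calls      # 2 evaluations for each of the 3 parameters
res$gradient   # analytic gradient at the origin is (2, 4, -1)
```

With three parameters, one gradient already costs six evaluations, and a numerical Hessian costs more, so every second saved per evaluation is multiplied many times over an estimation run.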
The probit model



   The model is given by:

   $$ Y^* = X\beta + \varepsilon $$
   $$ Y = \mathbf{1}\{Y^* > 0\} $$

   The individual contribution to the likelihood is then:

   $$ L = \Phi(X\beta)^{Y} \, \Phi(-X\beta)^{(1 - Y)} $$
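
Taking logs of the individual contribution and summing over observations gives the objective that is evaluated in the code that follows. The index i over the n observations is left implicit in the slide and is added here as notation:

```latex
\ell(\beta) = \sum_{i=1}^{n} \Big[ Y_i \log \Phi(X_i\beta) + (1 - Y_i) \log \Phi(-X_i\beta) \Big]
```

Because the log-likelihood is a sum over observations, it can be computed on separate blocks of data and the partial sums added afterwards, which is what makes the problem easy to parallelize.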
A very simple problem

   > n       <- 5000000
   > param   <- c(1,2,-.5)
   > X1      <- rnorm(n)
   > X2      <- rnorm(n, mean = 1, sd = 2)
   > # Latent variable Y* and observed binary outcome Y
   > Ys      <- param[1] + param[2] * X1 +
   + param[3] * X2 + rnorm(n)
   > Y <- Ys > 0
   > # Probit log-likelihood, summed over all observations
   > probit <- function(para, y, x1, x2){
   + mu <- para[1] + para[2] * x1 + para[3] * x2
   + sum(pnorm(mu, log.p = TRUE)*y + pnorm(-mu, log.p = TRUE)*(1 - y))
   + }
   >
   > system.time(test1 <- probit(param, Y, X1, X2))
      user  system elapsed
      1.72    0.08    1.80
Make a parallel version



   We build a parallel version of our program in the following
   steps:
    1. Create the cluster
    2. Divide the data among the nodes
    3. Write the likelihood
    4. Evaluate the likelihood on each node
    5. Collect the results
Divide data:
> nn <- clusterSplit(cl, 1:n)
> for (cc in 1:3){
+ YY <- Y[nn[[cc]]]
+ XX1 <- X1[nn[[cc]]]
+ XX2 <- X2[nn[[cc]]]
+ clusterExport(cl[cc], c("YY", "XX1", "XX2"))
+ }
> clusterExport(cl, "probit")
> clusterEvalQ(cl, ls())
[[1]]
[1] "probit" "XX1"    "XX2"    "YY"

[[2]]
[1] "probit" "XX1"   "XX2"    "YY"

[[3]]
[1] "probit" "XX1"   "XX2"    "YY"
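Write a new version of the likelihood:
> # gets() assigns the value v to the name n in a node's global environment
>   gets<-function(n, v) {
+   assign(n,v, envir=.GlobalEnv);NULL
+   }
> # lik() sends the parameters to every node, then sums the partial likelihoods
>   lik <- function(para){
+   clusterCall(cl, gets ,"para", para)
+   do.call("sum",
+       clusterEvalQ(cl, probit(para, YY, XX1, XX2)))
+   }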
Execute and compare the results:
> system.time(test2 <- lik(param)) ## 1.5 sec
   user  system elapsed
   0.00    0.00    0.78
> c(test1, test2) ## Same results
[1] -1432674 -1432674
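
The five steps can also be condensed into one script. The sketch below is a variant under several assumptions, not the author's code: it uses the `parallel` package that ships with R (which offers snow-derived functions under the same names), two local workers, a much smaller n, and a hypothetical helper `worker_lik` that receives the parameters as an argument of `clusterCall()` instead of going through `gets()`.

```r
library(parallel)  # assumption: stand-in for snow, same function names

set.seed(1)
n     <- 10000                       # far smaller than the slides' 5,000,000
param <- c(1, 2, -0.5)
X1    <- rnorm(n)
X2    <- rnorm(n, mean = 1, sd = 2)
Y     <- (param[1] + param[2] * X1 + param[3] * X2 + rnorm(n)) > 0

probit <- function(para, y, x1, x2) {
  mu <- para[1] + para[2] * x1 + para[3] * x2
  sum(pnorm(mu, log.p = TRUE) * y + pnorm(-mu, log.p = TRUE) * (1 - y))
}

# 1. Create the cluster (2 local workers, an assumption)
cl <- makeCluster(2)

# 2. Divide the data: each worker receives only its own block
idx <- clusterSplit(cl, seq_len(n))
for (cc in seq_along(cl)) {
  YY  <- Y[idx[[cc]]]
  XX1 <- X1[idx[[cc]]]
  XX2 <- X2[idx[[cc]]]
  clusterExport(cl[cc], c("YY", "XX1", "XX2"), envir = environment())
}
clusterExport(cl, "probit", envir = environment())

# 3. Write the likelihood. worker_lik (hypothetical helper) runs on a worker;
#    its environment is set to the global one so that YY, XX1, XX2 and probit
#    are looked up in the worker's own workspace.
worker_lik <- function(para) probit(para, YY, XX1, XX2)
environment(worker_lik) <- globalenv()

# 4. and 5. Evaluate on each worker and add up the partial sums
lik <- function(para) do.call(sum, clusterCall(cl, worker_lik, para))

serial_ll   <- probit(param, Y, X1, X2)
parallel_ll <- lik(param)
stopCluster(cl)

all.equal(serial_ll, parallel_ll)    # the two values agree up to rounding
```

Passing `para` as an argument sends only three numbers per call, which keeps the repeated step cheap in the same way as the `gets()` approach.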
Conclusion




      Running R code in parallel can save a lot of time...
      But misusing the parallel functions can also be costly...
      Of course, for a real probit problem, use the glm() function...
      Don’t forget to close the nodes:
      > stopCluster(cl)
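
For reference, the glm() route mentioned in the conclusion might look like the following sketch. It simulates data the same way as the slides but with a much smaller n (an assumption made to keep the run short); the object name `fit` is illustrative.

```r
set.seed(1)
n     <- 20000                       # assumption: far smaller than 5,000,000
param <- c(1, 2, -0.5)
X1    <- rnorm(n)
X2    <- rnorm(n, mean = 1, sd = 2)
Y     <- (param[1] + param[2] * X1 + param[3] * X2 + rnorm(n)) > 0

# Probit regression: binomial family with a probit link
fit <- glm(Y ~ X1 + X2, family = binomial(link = "probit"))

coef(fit)      # estimates should land near param
logLik(fit)    # maximized log-likelihood
```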

Injustice - Developers Among Us (SciFiDevCon 2024)
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 

Parallel R in snow (english after 2nd slide)

  • 1. Save time by parallelizing in R. Maxime Tô, June 12, 2012
  • 2. Parallelizing in R. We use the snow package here: http://www.sfu.ca/~sblay/R/snow.html
  • 3. This presentation is based on my own practice of R. I do not know if it is optimal, but it has saved me a lot of time.
  • 4. How does parallel computing work? With the snow package, we open as many R sessions as the number of nodes we choose:
      library(snow)
      cl <- makeCluster(3, type = "SOCK")
  • 5. The clusterEvalQ() function executes R code on all sessions:
      > clusterEvalQ(cl, ls())
      > clusterEvalQ(cl, 1 + 1)
      [[1]]
      [1] 2
      [[2]]
      [1] 2
      [[3]]
      [1] 2
  • 6. Nodes can be addressed individually:
      > clusterEvalQ(cl[1], a <- 1)
      > clusterEvalQ(cl[2], a <- 2)
      > clusterEvalQ(cl[3], a <- 3)
      > clusterEvalQ(cl, a)
      [[1]]
      [1] 1
      [[2]]
      [1] 2
      [[3]]
      [1] 3
  • 7. The snow package comes with parallelized versions of common R functions, such as parLapply and parApply, which are not always efficient:
      > a <- matrix(rnorm(10000000), ncol = 1000)
      > system.time(apply(a, 1, sum))
         user  system elapsed
         0.27    0.02    0.28
      > system.time(parApply(cl, a, 1, sum))
         user  system elapsed
         0.67    0.39    1.09
  • 8. Using parallel code is not always efficient:
      • It always takes some time to serialize and unserialize data.
      • If the data is huge, R may need some time to copy it.
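The serialization cost can be observed without any cluster. The following is a minimal sketch using only base R; the matrix size is an arbitrary assumption chosen for illustration, and the timings will differ from one machine to another.

```r
# Hypothetical data: 1e6 doubles, about 8 MB in memory.
a <- matrix(rnorm(1e6), ncol = 100)

# serialize() produces the raw byte vector that would be sent
# over a socket connection to a worker session.
t_ser <- system.time(raw_a <- serialize(a, NULL))
n_bytes <- length(raw_a)

# unserialize() is the matching step on the receiving side.
t_unser <- system.time(b <- unserialize(raw_a))

n_bytes / 1e6      # message size in MB (slightly above 8)
identical(a, b)    # the round trip preserves the data
t_ser["elapsed"] + t_unser["elapsed"]  # overhead paid before any computation
```

Every call that ships `a` to the nodes pays this overhead again, which is why a cheap operation such as a row sum may run slower in parallel than serially.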
  • 9. One solution is to first export the data to all nodes and then execute the code on each node:
      > #### First export:
      > rows <- clusterSplit(cl, 1:10000)
      > for (cc in 1:3){
      +   aa <- a[rows[[cc]],]
      +   clusterExport(cl[cc], "aa")
      + }
      > #### Then execute:
      > system.time(do.call("c", clusterEvalQ(cl, apply(aa, 1, sum))))
         user  system elapsed
         0.00    0.00    0.16
  • 10. Of course, it is not necessarily optimal to always export the data first, but in many cases it is useful:
      • when many computations run on the same dataset;
      • for any iterative method: bootstrap, iterative estimation (ML, GMM, etc.).
      The idea is to first export the data and then execute the code on the different nodes. Exporting the data is the costly step. Combining the results is often quite easy (sum, c, cbind, etc.).
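A bootstrap illustrates the export-once, compute-many pattern. The sketch below is an assumption-laden illustration, not the author's code: it uses R's bundled `parallel` package (which provides functions with the same names as snow) rather than snow itself, and the data `x`, the two workers and the 100 replicates per worker are all hypothetical choices.

```r
library(parallel)  # assumption: R's bundled 'parallel' package instead of snow

cl <- makeCluster(2)

# Hypothetical data set; exporting it is the costly step and happens once.
set.seed(1)
x <- rnorm(1000)
clusterExport(cl, "x", envir = environment())

# Give each worker its own reproducible random-number stream.
clusterSetRNGStream(cl, iseed = 42)

# Cheap step: every worker draws 100 bootstrap samples of x and returns
# their means; c() combines the per-worker vectors.
boot_means <- do.call("c", clusterEvalQ(cl, {
  replicate(100, mean(sample(x, replace = TRUE)))
}))

stopCluster(cl)

length(boot_means)  # 2 workers x 100 replicates = 200
sd(boot_means)      # bootstrap estimate of the standard error of mean(x)
```

Only the short vector of means travels back from each worker, so the combination step with `c` stays cheap compared with the initial export.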
  • 11. A simple problem. We want to estimate a probit model. ML estimation is iterative: the gradient and the Hessian matrix require partial derivatives, so the objective function must be evaluated many times to obtain numerical derivatives. Reducing the time of one iteration therefore reduces the total estimation time a lot.
  • 12. The probit model. The model is given by:
      $Y^* = X\beta + \varepsilon$
      $Y = 1\{Y^* > 0\}$
      The individual contribution to the likelihood is then:
      $L = \Phi(X\beta)^{Y} \, \Phi(-X\beta)^{1-Y}$
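Taking logs of the individual contribution and summing over observations gives the objective function that is evaluated at each iteration. The derivation below is a short sketch assuming independent observations; the observation index $i$, the number of nodes $K$ and the index sets $I_k$ are notation introduced here for illustration.

```latex
\log L(\beta)
  = \sum_{i=1}^{n} \Bigl[ Y_i \log \Phi(X_i \beta)
      + (1 - Y_i) \log \Phi(-X_i \beta) \Bigr]
  = \sum_{k=1}^{K} \; \sum_{i \in I_k} \Bigl[ Y_i \log \Phi(X_i \beta)
      + (1 - Y_i) \log \Phi(-X_i \beta) \Bigr]
```

Here $I_1, \dots, I_K$ partition $\{1, \dots, n\}$, one block per node. Each inner sum depends only on the data held by node $k$, so the nodes can compute their parts independently and the total is recovered with a plain sum.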
  • 13. A very simple problem:
      > n <- 5000000
      > param <- c(1, 2, -.5)
      > X1 <- rnorm(n)
      > X2 <- rnorm(n, mean = 1, sd = 2)
      > Ys <- param[1] + param[2] * X1 +
      +   param[3] * X2 + rnorm(n)
      > Y <- Ys > 0
      > probit <- function(para, y, x1, x2){
      +   mu <- para[1] + para[2] * x1 + para[3] * x2
      +   sum(pnorm(mu, log.p = TRUE) * y + pnorm(-mu, log.p = TRUE) * (1 - y))
      + }
      > system.time(test1 <- probit(param, Y, X1, X2))
         user  system elapsed
         1.72    0.08    1.80
  • 14. Make a parallel version. We build a parallel version of our program in the following steps:
      1. Create the cluster
      2. Divide the data over the nodes
      3. Write the likelihood
      4. Execute the likelihood on each node
      5. Collect the results
  • 15. Divide the data:
      > nn <- clusterSplit(cl, 1:n)
      > for (cc in 1:3){
      +   YY <- Y[nn[[cc]]]
      +   XX1 <- X1[nn[[cc]]]
      +   XX2 <- X2[nn[[cc]]]
      +   clusterExport(cl[cc], c("YY", "XX1", "XX2"))
      + }
      > clusterExport(cl, "probit")
      > clusterEvalQ(cl, ls())
      [[1]]
      [1] "probit" "XX1" "XX2" "YY"
      [[2]]
      [1] "probit" "XX1" "XX2" "YY"
      [[3]]
      [1] "probit" "XX1" "XX2" "YY"
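  • 16. Write a new version of the likelihood:
      > # Helper run on each node: store value v under name n in the node's global environment
      > gets <- function(n, v) {
      +   assign(n, v, envir = .GlobalEnv); NULL
      + }
      > # Send the current parameters to every node, then sum the per-node log-likelihoods
      > lik <- function(para){
      +   clusterCall(cl, gets, "para", get("para"))
      +   do.call("sum",
      +     clusterEvalQ(cl, probit(para, YY, XX1, XX2)))
      + }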
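One possible variation passes the parameter vector as a function argument instead of assigning it on each node. This is only a sketch under several assumptions: it uses R's bundled `parallel` package rather than snow, two workers, a smaller simulated sample than in the slides, and a hypothetical per-node function named `node_lik`.

```r
library(parallel)  # assumption: R's bundled 'parallel' package instead of snow

probit <- function(para, y, x1, x2) {
  mu <- para[1] + para[2] * x1 + para[3] * x2
  sum(pnorm(mu, log.p = TRUE) * y + pnorm(-mu, log.p = TRUE) * (1 - y))
}

# Smaller hypothetical sample than in the slides, to keep the sketch fast.
set.seed(1)
n <- 100000
param <- c(1, 2, -0.5)
X1 <- rnorm(n)
X2 <- rnorm(n, mean = 1, sd = 2)
Y <- (param[1] + param[2] * X1 + param[3] * X2 + rnorm(n)) > 0

cl <- makeCluster(2)

# Export one slice of the data to each node, as in the slides.
idx <- clusterSplit(cl, seq_len(n))
for (k in seq_along(idx)) {
  YY <- Y[idx[[k]]]
  XX1 <- X1[idx[[k]]]
  XX2 <- X2[idx[[k]]]
  clusterExport(cl[k], c("YY", "XX1", "XX2"), envir = environment())
}
clusterExport(cl, "probit", envir = environment())

# node_lik (hypothetical name) is defined on each node and reads the
# node's own slice from its global environment.
invisible(clusterEvalQ(cl, node_lik <- function(p) probit(p, YY, XX1, XX2)))

# Only the small parameter vector is sent at each evaluation.
lik <- function(para) {
  do.call("sum", clusterCall(cl, "node_lik", para))
}

serial_value <- probit(param, Y, X1, X2)
parallel_value <- lik(param)
stopCluster(cl)

all.equal(serial_value, parallel_value)
```

Calling the node function by name keeps each message small: the function body and the data already live on the nodes, so one evaluation only transfers `para` and one number per node.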
  • 17. Execute and compare the results:
      > system.time(test2 <- lik(param)) ## 1.5 sec
         user  system elapsed
         0.00    0.00    0.78
      > c(test1, test2) ## Same results
      [1] -1432674 -1432674
  • 18. Conclusion. By using parallel versions of R code, one may save a lot of time. Misusing R packages may also be costly. Of course, for a probit problem, use the glm() function. Don't forget to close the nodes:
      > stopCluster(cl)
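As a cross-check of the remark about glm(), a hand-written probit likelihood can be maximized with optim() and compared with the built-in fit. This is a serial sketch under stated assumptions: the sample size, the seed, the starting values and the BFGS method are arbitrary choices for illustration, not taken from the slides.

```r
probit <- function(para, y, x1, x2) {
  mu <- para[1] + para[2] * x1 + para[3] * x2
  sum(pnorm(mu, log.p = TRUE) * y + pnorm(-mu, log.p = TRUE) * (1 - y))
}

# Hypothetical simulated data with the same coefficients as in the slides.
set.seed(1)
n <- 20000
X1 <- rnorm(n)
X2 <- rnorm(n, mean = 1, sd = 2)
Y <- as.numeric((1 + 2 * X1 - 0.5 * X2 + rnorm(n)) > 0)

# optim() minimizes, so it receives the negative mean log-likelihood.
# Each BFGS iteration evaluates this function several times to build
# numerical derivatives, which is the cost the parallel version reduces.
neg_ll <- function(p) -probit(p, Y, X1, X2) / n
fit_optim <- optim(c(0, 0, 0), neg_ll, method = "BFGS")

# Built-in probit fit for comparison.
fit_glm <- glm(Y ~ X1 + X2, family = binomial(link = "probit"))

round(cbind(optim = fit_optim$par, glm = coef(fit_glm)), 3)
fit_optim$counts  # number of function and gradient evaluations used
```

The two columns should agree closely, and `fit_optim$counts` gives a rough idea of how many likelihood evaluations a single estimation requires.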