RUBY AND R


Chang Sau Sheong
Director, Applied Research, HP Labs Singapore


1   © Copyright 2010 Hewlett-Packard Development Company, L.P.
About HP Labs



HP LABS
– Exploratory and advanced research group for Hewlett-Packard
– Global organization that tackles complex challenges facing our customers and society over the next decade
– Pushes the frontiers of fundamental science
– HQ in Palo Alto



HP LABS AROUND THE WORLD

– Palo Alto
– Bristol
– St. Petersburg
– Haifa
– Beijing
– Bangalore
– Singapore




HP LABS SINGAPORE
– Set up in February 2010
– Focus on Cloud Computing
– Research
     • Exploratory research
     • Researchers
     • Change the state of the art
     • Working closely with the academic community
– Applied Research
     • Applied research
     • Innovators
     • Take the research to the next stage
     • Work closely with customers and business units



Ruby and R



R: a programming language and platform for statistical computing, licensed under the GPL


Strengths in statistical processing and data visualization

Extensive library of statistical computing packages (CRAN) written by statisticians



Statistics is not just for statisticians


– Recommendation engine
– Speech recognition
– Fingerprint identification
– Spam detection
– Card fraud detection
– Financial forecasting
– Face recognition
– Data mining
– OCR
– Credit scoring

CRAN
– Almost 2000 packages, mostly created by statisticians
     • BiodiversityR – GUI for biodiversity and community ecology analysis
     • Emu – analyze speech patterns
     • GenABEL – study human genome
     • Quantmod – quantitative financial modeling framework
     • fTrading – technical trading analysis
     • Cyclones – cyclone identification
     • DOSim – disease analysis toolkit for gene set
     • Agricolae – statistical procedures for agricultural research


EXAMPLE R CODE
– EPL data from football-data.co.uk
– Show home/away goals distribution for the 2011 season




Why Ruby and R?



Stand on shoulders of giants


– Ruby
     • Human-focused programming!
     • Better general-purpose programming capabilities
     • Great frameworks!
     • Great libraries (20,000+ gems in RubyGems)
– R
     • Focus on statistical computing/crunching
     • Lots of packages written by domain experts/statisticians
     • Great graphing libraries

Ruby and R integration


RINRUBY
– 100% Ruby
– Uses pipes to send commands and evals
– Uses TCP/IP Sockets to send and retrieve data
– Pros:
     •   Doesn't require anything but R
     •   Works flawlessly on Windows
     •   Works with Ruby 1.8, 1.9 and JRuby 1.5
     •   Entire API is tested

– Cons:
     •   VERY SLOW in assigning
     •   Very limited datatypes: only Vector and Matrix
     •   Not released since 2009
     •   Poor documentation


RSRUBY
– C Extension for Ruby, linked to R's shared library
– Pros:
     •   Blazing speed! 5-10 times faster than Rserve and 100-1000 times faster than RinRuby.
     •   Seamless integration with Ruby. Every method and object is treated like a Ruby object

– Cons:
     •   Transformation between R and Ruby types isn't trivial
     •   Dependent on operating system, Ruby implementation and R version
     •   Not available for alternative implementations of Ruby (e.g. JRuby)
     •   Not released since 2009
     •   Poor documentation




RSERVE
– 100% Ruby
– Uses TCP/IP sockets to interchange data and commands
– Requires Rserve installed on the server machine
– Accessed from Ruby through the Rserve-Ruby-Client library
– Pros:
     •   Works with Ruby 1.8, 1.9 and JRuby 1.5
     •   Sessions allow data to be processed asynchronously
     •   Fast: 5-10 times faster than RinRuby
     •   Most recently updated (Jan 2011)

– Cons:
     •   Requires Rserve
     •   Limited features on Windows
     •   Poor documentation



RAPACHE/RRACK
– Web service based
– Run R scripts as web services, consumed by Ruby front-end apps
– Pros:
     •   Modular and separate (no direct integration)
     •   Can be scalable, ‘cloud’-ready

– Cons:
     •   Requires rApache/rRack
     •   rRack is very new (not accepted by CRAN yet, as of today!) and requires R 2.13 (just released a few weeks ago)
     •   rApache works with the Apache web server only
     •   Communications overhead for smaller integrations




Let’s look at some code! (I’m going to use Rserve)




Text classification



TEXT CLASSIFICATION
– Automatically sorting a set of documents into different categories from a predefined set
– Classic uses:
     • Spam filtering
     • Email prioritization
[Diagram labels: Training data, Test data, Classifier, category]


TEXT CLASSIFIER CODE

 Prepare




Train the classifier by counting the frequency of each word in the document




Get word count
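
The code on this slide is not reproduced in the transcript, so here is a minimal sketch of a word-count step in plain Ruby. The method name `word_count` and the `STOP_WORDS` list are assumptions made for illustration, not the presenter's code. Stemming is left out, so words stay whole rather than being reduced to stems such as "googl" or "servic".

```ruby
# Hypothetical stop-word list; the deck does not show the one actually used.
STOP_WORDS = %w[a an and are as at be by for from in is it of on or that the to was with].freeze

# Count how often each word occurs in a piece of text.
# Text is lower-cased, punctuation is deleted (so "stopbadware.org" becomes
# "stopbadwareorg"), the result is split on whitespace, and stop words are skipped.
def word_count(text)
  counts = Hash.new(0)
  text.downcase.gsub(/[^a-z0-9\s]/, "").split.each do |word|
    counts[word] += 1 unless STOP_WORDS.include?(word)
  end
  counts
end

counts = word_count("Google flagged the site, and the site warned users of Google search.")
counts.each { |word, n| puts "#{word}: #{n}" }
```

A stemming library would be applied to each word before counting if stemmed keys are wanted.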




What you get
     {"check"=>1, "result"=>3, "marissa"=>1, "experi"=>1,
     "click"=>1, "engin"=>1, "simpli"=>1, "mistakenli"=>1,
     "pick"=>1, "prevent"=>1, "40"=>1, "regularli"=>1, "place"=>1,
     "user"=>5, "prefer"=>1, "malevol"=>1, "access"=>1,
     "robust"=>1, "servic"=>1, "fault"=>1, "malici"=>1, "list"=>2,
     "hand"=>1, "internet"=>1, "attribut"=>1, "instal"=>1,
     "file"=>1, "unabl"=>1, "vice"=>1, "stopbadwareorg"=>2,
     "merit"=>1, "decid"=>1, "flag"=>2, "saturdai"=>2, "hit"=>2,
     "offici"=>1, "error"=>3, "work"=>1, "site"=>5, "happen"=>2,
     "incid"=>1, "technic"=>1, "advis"=>1, "put"=>1, "human"=>3,
     "harm"=>2, "softwar"=>1, "ms"=>1, "affect"=>1, "carefulli"=>1,
     "product"=>1, "presid"=>1, "complaint"=>1, "potenti"=>2,
     "googl"=>6, "comput"=>2, "peopl"=>1, "investig"=>2,
     "consum"=>1, "danger"=>2, "period"=>1, "wrote"=>2,
     "search"=>7, "ascertain"=>1, "blog"=>1, "warn"=>2,
     "problem"=>1, "updat"=>2, "minut"=>1, "mayer"=>2}




Generate training data for prediction
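
One way this step could look, sketched in plain Ruby under stated assumptions. The method name `training_rows` and the input shape (an array of category and word-count pairs) are hypothetical. Picking the top words per category is an inference from the training data, where the column `year` appears twice. This sketch drops such duplicates, which the original evidently did not.

```ruby
# Build comma-separated training rows from per-document word counts.
# documents: array of [category, { word => count }] pairs (assumed shape).
# top_n:     how many of each category's most frequent words become columns.
# Words are assumed to contain no commas or quotes, so a plain join is enough.
def training_rows(documents, top_n)
  # Sum word counts per category.
  totals = Hash.new { |hash, category| hash[category] = Hash.new(0) }
  documents.each do |category, counts|
    counts.each { |word, n| totals[category][word] += n }
  end

  # Top words of each category, most frequent first (ties broken alphabetically).
  per_category = totals.values.flat_map do |category_totals|
    category_totals.sort_by { |word, n| [-n, word] }.first(top_n).map(&:first)
  end
  vocabulary = per_category.uniq

  header = ["category", *vocabulary]
  rows = documents.map do |category, counts|
    [category, *vocabulary.map { |word| counts.fetch(word, 0) }]
  end
  [header, *rows].map { |row| row.join(",") }
end

docs = [
  ["interesting",     { "googl" => 6, "search" => 7, "user" => 5 }],
  ["not_interesting", { "econom" => 3, "crisi" => 3, "user" => 1 }]
]
puts training_rows(docs, 2)
```

Each output line after the header is one training document: its category followed by the count of every vocabulary word in that document.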




Training data



category,googl,report,search,user,review,court,mckinnon,year,internet,microsoft,site,softwar,warn,browser,oper,expert,rise,lawyer,digit,extradit,sharpli,error,group,result,system,rebel,econom,presid,crisi,find,year,accus,global,obama,china,civilian,shrink,hous,wall,street,quarter,white,heavi,lehman,economi,session,ey,time,davo,human
not_interesting,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
not_interesting,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,5,0,2,0,0,0,3,0,0,0,3,1,0,0,0,0,0,3,0,0,0,0,0,0,2
not_interesting,0,1,0,0,0,0,0,2,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,3,0,3,1,2,0,2,0,0,0,0,0,0,0,0,0,0,3,1,3,1,0,2,0
not_interesting,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
not_interesting,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,2,0,0,1,2,1,4,0,0,2,0,0,0,2,0,0,0,0,2,0,1,0
not_interesting,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,0,0,0,0,3,3,0,0,0,0,0,0,0,2,0,0
not_interesting,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,2,0,0,2,0,0,2,1,0,0,2,1,0,0,2,0,0,1,0,0
interesting,6,0,7,5,0,0,0,0,1,0,5,1,2,0,0,0,0,0,0,0,0,3,0,3,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3
interesting,0,7,0,0,2,0,0,0,0,0,0,0,1,0,0,1,0,0,3,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
interesting,0,1,0,0,0,0,0,3,3,1,0,1,1,1,0,3,3,0,1,0,3,0,1,0,2,0,1,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,3,0
interesting,0,0,0,0,3,5,5,0,0,0,0,0,0,0,0,0,1,4,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
interesting,6,0,1,1,0,0,0,0,0,0,0,1,0,0,4,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
interesting,0,0,0,2,0,0,0,2,1,4,0,2,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0

– Header row: the top 25 most frequent words in the training dataset
– Each line represents one trained document
– First column: categories set when the classifier is created
– Each number indicates the number of times the word appears in that document


Test data



category,googl,report,search,user,review,court,mckinnon,year,internet,microsoft,site,softwar,warn,browser,oper,expert,rise,lawyer,digit,extradit,sharpli,error,group,result,system,rebel,econom,presid,crisi,find,year,accus,global,obama,china,civilian,shrink,hous,wall,street,quarter,white,heavi,lehman,economi,session,ey,time,davo,human
category,0,0,0,2,0,0,0,2,1,4,0,2,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0
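
A test row has to line up column for column with the training header. The sketch below shows one way to build such a row in plain Ruby. The method name `test_row` is hypothetical, and treating the literal label `category` in the first column as a placeholder for the unknown class is an assumption read off the data above.

```ruby
# Turn one new document's word counts into a row that matches the
# header of an existing training file.
# header_line: the first line of the training data, e.g. "category,googl,report".
# counts:      { word => count } for the document to classify.
# label:       placeholder written into the first column (assumed convention).
def test_row(header_line, counts, label = "category")
  words = header_line.strip.split(",").drop(1) # skip the label column
  [label, *words.map { |word| counts.fetch(word, 0) }].join(",")
end

header = "category,googl,report,search,user"
puts header
puts test_row(header, { "user" => 2, "search" => 1, "unseen" => 4 })
# prints: category,0,0,1,2
```

Words that are not in the header, such as `unseen` here, are simply dropped, because the model has no column for them.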

Using different classification models
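
The model code itself is not in this transcript. As a rough sketch of why swapping models is cheap once the data is in CSV form, the Ruby below only builds the R script text for each model. It does not talk to R. The package choices (e1071, randomForest, nnet), the file names, the `size = 5` setting and the names `MODELS` and `r_script` are all assumptions for illustration.

```ruby
# R package, fitting call and prediction call for each model type.
# These pairings are assumptions; the deck does not show which packages it used.
MODELS = {
  naive_bayes: {
    package: "e1071",
    fit:     "naiveBayes(category ~ ., data = train)",
    predict: "predict(model, test)"
  },
  svm: {
    package: "e1071",
    fit:     "svm(category ~ ., data = train)",
    predict: "predict(model, test)"
  },
  random_forest: {
    package: "randomForest",
    fit:     "randomForest(category ~ ., data = train)",
    predict: "predict(model, test)"
  },
  neural_network: {
    package: "nnet",
    fit:     "nnet(category ~ ., data = train, size = 5)",
    predict: 'predict(model, test, type = "class")'
  }
}.freeze

# Return R source text that trains the chosen model on one CSV file and
# predicts the category of every row in another. File names are placeholders.
def r_script(model, train_csv: "train.csv", test_csv: "test.csv")
  spec = MODELS.fetch(model)
  <<~R
    library(#{spec[:package]})
    train <- read.csv("#{train_csv}")
    train$category <- factor(train$category)
    test <- read.csv("#{test_csv}")
    model <- #{spec[:fit]}
    as.character(#{spec[:predict]})
  R
end

puts r_script(:svm)
```

Only the package and the two model calls change between model types. The resulting text would then be handed to R through whichever bridge is in use. With Rserve, the CSV paths must be readable by the R process on the server.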


NAÏVE BAYES




SVM




RANDOM FOREST




NEURAL NETWORKS




Using the classifier



RESOURCES
– HP Labs Worldwide: http://www.hpl.hp.com/
– R Project: http://www.r-project.org/
– RsRuby: https://github.com/alexgutteridge/rsruby
– RinRuby: http://rinruby.ddahl.org/
– Rserve: http://www.rforge.net/Rserve/
– Rserve-Ruby-Client: https://github.com/clbustos/Rserve-Ruby-client
– rApache: http://rapache.net/index.html
– rRack: https://github.com/jeffreyhorner/rRack/


Thank you

 sausheong@hp.com
 http://twitter.com/sausheong
 http://blog.saush.com