Adrian Cole / Cloudsoft


       Big Blobs: moving big data in
       and out of the cloud

Wednesday, November 2, 2011
Adrian Cole (@jclouds)
    founded jclouds in March 2009
    chief evangelist at Cloudsoft




Agenda




    • intro to jclouds blobstore
    • Omixon case study
    • awkward silence (or Q/A)




Portable APIs


               BlobStore          LoadBalancer


               Compute            Table


       Provider-Specific Hooks

       Embeddable


      Over 30 Tested Providers!


Who’s integrating?




Blob Storage



                      global name space
                      key, value with metadata
                      sites on demand
                      unlimited size

Blob Storage

    Set<String> containers = namespacesInMyAccount;

    Map<String, InputStream> keyValues = contentsOfContainer;
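
One way to read the two declarations above: an account is a set of container names, and each container maps keys to values. The following is a minimal in-memory sketch of that model in Java. The class TinyBlobStore and its methods are hypothetical names made up for illustration, not the jclouds API, and metadata is left out for brevity.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical in-memory model of the blob storage idea; not the jclouds API.
public final class TinyBlobStore {
    // container name -> (blob key -> blob bytes)
    private final Map<String, Map<String, byte[]>> containers = new TreeMap<>();

    /** Creates the container if it does not exist yet. */
    public void createContainer(String name) {
        containers.putIfAbsent(name, new TreeMap<>());
    }

    /** Stores a value under a key inside an existing container. */
    public void putBlob(String container, String key, byte[] value) {
        Map<String, byte[]> blobs = containers.get(container);
        if (blobs == null) {
            throw new IllegalArgumentException("no such container: " + container);
        }
        blobs.put(key, value);
    }

    /** The "Set<String> containers" view of the account. */
    public Set<String> containerNames() {
        return containers.keySet();
    }

    /** The keys of the "Map<String, ...>" view of one container. */
    public Set<String> keysIn(String container) {
        Map<String, byte[]> blobs = containers.get(container);
        if (blobs == null) {
            throw new IllegalArgumentException("no such container: " + container);
        }
        return blobs.keySet();
    }
}
```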




Blob Storage
    [Diagram] Account: adrian@googlestorage
              Containers: Love Letters, Movies
              Blobs in Movies: The One, Shrek, Goonies, The Blob
              putBlob adds the blob Tron to Movies, with metadata:
                 3d = true
                 url = http://disney.go.com/tron




java overview                        github jclouds/jclouds


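 import org.jclouds.blobstore.BlobStore;
 import org.jclouds.blobstore.BlobStoreContext;
 import org.jclouds.blobstore.BlobStoreContextFactory;
 import org.jclouds.blobstore.domain.Blob;

 // init: connect to the provider with your credentials
 BlobStoreContext context = new BlobStoreContextFactory().createContext("s3",
                                                        accesskeyid,
                                                        secret);
 BlobStore blobStore = context.getBlobStore();

 // create container (a null location lets the provider pick its default)
 blobStore.createContainerInLocation(null, "adriansmovies");

 // add blob
 Blob blob = blobStore.blobBuilder("sushi.avi").payload(file).build();
 blobStore.putBlob("adriansmovies", blob);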




clojure overview                 github jclouds/jclouds



 (use 'org.jclouds.blobstore2)

 (def *blobstore* (blobstore "azureblob" account key))

 (create-container *blobstore* "movies")
 (put-blob *blobstore* "movies"
   (blob "tron.mp4" :payload tron-file))




Big data pipelines with
            Scale-out on the cloud

                             @tiborkisstibor




bioinformatic pipelines
     Usually require high CPU

     Continuously increasing data volumes

     Complex algorithms on top of large datasets




bioinformatics SaaS




challenges of SaaS building
       Hadoop cluster startup/shutdown
         - Cluster starting problems
         - Automatic cluster shutdown strategies
       Hadoop cluster monitoring on the cloud
       System monitoring
       Consumption based monitoring
       Data transfer paths
       AWS Import -> S3 -> hdfs -> S3 -> AWS Export
       ACL settings for client's buckets
       S3 <=> hdfs transfers

where did we start?
          30GB file @max 16MB/s upload to S3
                                               32 minutes
          1TB file @max 16MB/s upload to S3
                                               18.2 hours



where did we end up?
          30GB file @max 100MB/s upload to S3
                                                 5 minutes (down from 32)
          1TB file @max 100MB/s upload to S3
                                                 2.9 hours (down from 18.2)
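
The before and after figures follow from dividing size by sustained rate. 30GB at 16MB/s is 30 × 1024 / 16 = 1920 seconds, or 32 minutes. The 18.2 and 2.9 hour figures match a 1TB file (1024 × 1024 MB), not a petabyte. A small Java sketch of the arithmetic, assuming binary units (1GB = 1024MB); the class and method names are made up for illustration:

```java
import java.util.Locale;

// Back-of-the-envelope upload times. Assumes binary units (1GB = 1024MB)
// and a constant sustained transfer rate.
public final class TransferTime {
    /** Seconds needed to move sizeMB megabytes at rateMBps megabytes per second. */
    public static double seconds(double sizeMB, double rateMBps) {
        return sizeMB / rateMBps;
    }

    public static void main(String[] args) {
        double thirtyGB = 30 * 1024.0;   // 30GB expressed in MB
        double oneTB = 1024 * 1024.0;    // 1TB expressed in MB
        for (double rate : new double[] {16, 100}) {
            System.out.printf(Locale.ROOT, "30GB @ %.0fMB/s: %.1f minutes%n",
                    rate, seconds(thirtyGB, rate) / 60);
            System.out.printf(Locale.ROOT, "1TB  @ %.0fMB/s: %.1f hours%n",
                    rate, seconds(oneTB, rate) / 3600);
        }
    }
}
```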



How did we get there?

         Add multi-part upload support
         Optimize slicing
         Optimize parallel upload strategy
         Find big guns



Multi-Part upload
         Large blobs cannot be sent in a single request in most
         BlobStores (e.g. 5GB max in S3).
         Large transfers are likely to fail at inconvenient positions,
         and cannot be resumed.
         Multi-part uploads allow you to send slices of a
         payload, which the server assembles later.
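
The idea of sending slices can be sketched as a plan of (part number, offset, length) triples that together cover the payload. This is a minimal Java illustration, not jclouds code; PartPlanner and Part are hypothetical names. Part numbers start at 1, as they do in S3.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a payload of known length into upload parts.
public final class PartPlanner {
    /** One slice of the payload: its number, where it starts, how many bytes it covers. */
    public static final class Part {
        public final int number;
        public final long offset;
        public final long length;

        Part(int number, long offset, long length) {
            this.number = number;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Returns consecutive parts of at most partSize bytes covering totalLength bytes. */
    public static List<Part> plan(long totalLength, long partSize) {
        if (totalLength < 0 || partSize <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and part size > 0");
        }
        List<Part> parts = new ArrayList<>();
        int number = 1;
        for (long offset = 0; offset < totalLength; offset += partSize) {
            // The last part may be shorter than the others.
            long length = Math.min(partSize, totalLength - offset);
            parts.add(new Part(number++, offset, length));
        }
        return parts;
    }
}
```

If one part fails, only that part's range has to be sent again, which is what makes retry practical for large transfers.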



Slicing
       Each upload part must advance to the appropriate
       position in the source payload efficiently.


          Payload slice(Payload input, long offset, long length);


       e.g. NettyPayloadSlicer uses ChunkedFileInputStream
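
For a file-backed payload, "advancing efficiently" can mean seeking to the offset instead of reading and discarding bytes. The sketch below shows one way to do that with plain java.nio. It is an illustration only, not how jclouds or NettyPayloadSlicer is implemented; FileSlicer and its nested Limited class are hypothetical names, and it works on a Path rather than a jclouds Payload.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical file slicer: exposes bytes [offset, offset + length) as a stream.
public final class FileSlicer {
    /** Opens a stream over one slice of the file. Closing it closes the channel too. */
    public static InputStream slice(Path file, long offset, long length) throws IOException {
        FileChannel channel = FileChannel.open(file, StandardOpenOption.READ);
        channel.position(offset); // seek directly; nothing before the offset is read
        return new Limited(Channels.newInputStream(channel), length);
    }

    /** Stops reporting data once the configured number of bytes has been consumed. */
    private static final class Limited extends FilterInputStream {
        private long remaining;

        Limited(InputStream in, long limit) {
            super(in);
            this.remaining = limit;
        }

        @Override
        public int read() throws IOException {
            if (remaining <= 0) return -1;
            int b = super.read();
            if (b >= 0) remaining--;
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            if (len == 0) return 0;
            if (remaining <= 0) return -1;
            int n = super.read(buf, off, (int) Math.min(len, remaining));
            if (n > 0) remaining -= n;
            return n;
        }

        @Override
        public long skip(long n) throws IOException {
            if (n <= 0 || remaining <= 0) return 0;
            long skipped = super.skip(Math.min(n, remaining));
            remaining -= skipped;
            return skipped;
        }
    }
}
```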




Slicing Algorithm
       A Blob can be sliced into a maximum number of parts,
       and these parts have min and max sizes.
       up to 3.2GB, converge on 32MB parts
       then increase part size, approaching the max (5GB)
       then continue at max part size or overflow
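
The three phases above can be approximated with a simplified part-size chooser. This is one reading of the slide, not the jclouds slicing algorithm. The 32MB and 5GB values come from the slide; the target of 100 parts is an assumption inferred from 3.2GB / 32MB, and the 10,000-part ceiling is assumed from S3's documented limit. PartSizer is a hypothetical name.

```java
// Simplified, hypothetical part-size chooser; the real jclouds algorithm may differ.
public final class PartSizer {
    static final long MB = 1024L * 1024L;
    static final long DEFAULT_PART_SIZE = 32 * MB;   // from the slide
    static final long MAX_PART_SIZE = 5 * 1024 * MB; // 5GB, from the slide
    static final int TARGET_PARTS = 100;             // assumption: 3.2GB / 32MB
    static final int MAX_PARTS = 10_000;             // assumption: S3 part limit

    /** Picks a part size for a blob of the given length, or throws on overflow. */
    public static long partSize(long length) {
        if (length < 0) {
            throw new IllegalArgumentException("negative length");
        }
        long size;
        if (length <= DEFAULT_PART_SIZE * TARGET_PARTS) {
            // Phase 1: small enough, stay at the default part size.
            size = DEFAULT_PART_SIZE;
        } else {
            // Phase 2: grow the part size so the part count stays near the target,
            // Phase 3: but never beyond the provider's maximum part size.
            long grown = (length + TARGET_PARTS - 1) / TARGET_PARTS; // ceiling division
            size = Math.min(grown, MAX_PART_SIZE);
        }
        long parts = (length + size - 1) / size;
        if (parts > MAX_PARTS) {
            throw new IllegalArgumentException("overflow: blob needs " + parts + " parts");
        }
        return size;
    }
}
```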




Upload Strategy

       Start sequential, stabilize, then parallelize

       SequentialMultipartUploadStrategy
       Simpler, less likely to fail, easier to retry, little to optimize outside chunk size

       ParallelMultipartUploadStrategy
       Much better throughput, but need to optimize degree of parallelism, retries & error handling
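
The trade-off between the two strategies can be sketched with a plain ExecutorService. This is not the code of SequentialMultipartUploadStrategy or ParallelMultipartUploadStrategy. UploadStrategies and PartUploader are hypothetical names, and the uploader stands in for whatever call actually sends one part.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of sequential vs parallel part upload with retries.
public final class UploadStrategies {
    /** Hypothetical hook: sends one part and returns a receipt (e.g. an ETag). */
    public interface PartUploader {
        String upload(int partNumber) throws Exception;
    }

    /** Tries a single part up to maxAttempts times, rethrowing the last failure. */
    public static String uploadWithRetry(PartUploader uploader, int part, int maxAttempts)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return uploader.upload(part);
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    /** One part at a time: simple, and the only knob is the part size. */
    public static List<String> sequential(PartUploader uploader, int parts, int maxAttempts)
            throws Exception {
        List<String> receipts = new ArrayList<>();
        for (int part = 1; part <= parts; part++) {
            receipts.add(uploadWithRetry(uploader, part, maxAttempts));
        }
        return receipts;
    }

    /** Up to 'degree' parts in flight at once; receipts come back in part order. */
    public static List<String> parallel(PartUploader uploader, int parts, int maxAttempts,
            int degree) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(degree);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int part = 1; part <= parts; part++) {
                final int p = part;
                Callable<String> task = () -> uploadWithRetry(uploader, p, maxAttempts);
                pending.add(pool.submit(task));
            }
            List<String> receipts = new ArrayList<>();
            for (Future<String> f : pending) {
                receipts.add(f.get()); // a failed part surfaces here as ExecutionException
            }
            return receipts;
        } finally {
            pool.shutdownNow(); // stop remaining work if any part failed for good
        }
    }
}
```

The degree parameter is the main tuning knob: too low leaves bandwidth unused, too high can trigger throttling or more failed parts to retry.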



What’s the top-speed?




Is this as good as it gets?

             10GigE should be able to do 1280MB/s
             cc1.4xlarge has been measured up to ~560MB/s local
             but we’re only getting ~100MB/s sustained




So, where do we go now?
           zero copy transfer
           more work on slice algorithms
           tools and integrations (ex. hdfs)


           add implementations for other blobstores



Wanna play?
    // multipart() is statically imported from org.jclouds.blobstore.options.PutOptions.Builder
    blobStore.putBlob("movies", blob, multipart());



    (put-blob *blobstore* "movies" blob
                          :multipart? true)


    or just visit github jclouds-examples
                                blobstore-largeblob
                                blobstore-hdfs

Questions?
                            github jclouds-examples


   @jclouds @tiborkisstibor
                     adrian@cloudsoftcorp.com



Java Tech & Tools | Big Blobs: Moving Big Data In and Out of the Cloud | Adrian Cole

  • 1. Adrian Cole / Cloudsoft Big Blobs: moving big data in and out of the cloud Wednesday, November 2, 11
  • 2. Adrian Cole (@jclouds): founded jclouds March 2009; chief evangelist at Cloudsoft
  • 3. Agenda: intro to jclouds blobstore; Omixon case study; awkward silence (or Q/A)
  • 4. Portable APIs: BlobStore, LoadBalancer, Compute, Table. Provider-Specific Hooks. Embeddable. Over 30 Tested Providers!
  • 6. Blob Storage: global name space; key, value with metadata; sites on demand; unlimited size
  • 7. Blob Storage: Set<String> containers = namespacesInMyAccount; Map<String, InputStream> keyValues = contentsOfContainer
  • 8. Blob Storage (diagram): adrian@googlestorage; containers: Love Letters, Movies; blobs: The One, Shrek, Goonies, The Blob; putBlob Tron with metadata 3d = true, url = http://disney.go.com/tron
  • 9. java overview (github jclouds/jclouds)
      // init
      context = new BlobStoreContextFactory().createContext("s3", accesskeyid, secret);
      blobStore = context.getBlobStore();
      // create container
      blobStore.createContainerInLocation(null, "adriansmovies");
      // add blob
      blob = blobStore.blobBuilder("sushi.avi").payload(file).build();
      blobStore.putBlob("adriansmovies", blob);
  • 10. clojure overview (github jclouds/jclouds)
      (use 'org.jclouds.blobstore2)
      (def *blobstore* (blobstore "azureblob" account key))
      (create-container *blobstore* "movies")
      (put-blob *blobstore* "movies" (blob "tron.mp4" :payload tron-file))
  • 11. Big data pipelines with Scale-out on the cloud (@tiborkisstibor)
  • 12. bioinformatic pipelines: usually requires high CPU; continuously increasing data volumes; complex algorithms on top of large datasets
  • 13. bioinformatics SaaS
  • 14. challenges of SaaS building: Hadoop cluster startup/shutdown (cluster starting problems, automatic cluster shutdown strategies); Hadoop cluster monitoring on the cloud (system monitoring, consumption based monitoring); data transfer paths (AWS Import -> S3 -> hdfs -> S3 -> AWS Export, ACL settings for client's buckets, S3 <=> hdfs transfers)
  • 15. where did we start? 30GB file @ max 16MB/s upload to S3: 32 minutes; 1TB file @ max 16MB/s upload to S3: 18.2 hours
  • 16. where did we end up? 30GB file @ max 100MB/s upload to S3: 5 minutes (down from 32); 1TB file @ max 100MB/s upload to S3: 2.9 hours (down from 18.2)
  • 17. How did we get there? Add multi-part upload support; optimize slicing; optimize parallel upload strategy; find big guns
  • 18. Multi-Part upload: Large Blobs cannot be sent in a single request in most BlobStores (ex. 5GB max in S3). Large X-fers are likely to fail at inconvenient positions, and without resume. Multi-part uploads allow you to send slices of a payload, which the server assembles later.
  • 19. Slicing: Each upload part must advance to the appropriate position in the source payload efficiently. Payload slice(Payload input, long offset, long length); ex. NettyPayloadSlicer uses ChunkedFileInputStream
  • 20. Slicing Algorithm: A Blob can be sliced into a maximum number of parts, and these parts have min and max sizes. Up to 3.2GB, converge 32M parts; then increase part size approaching max (5GB); then continue at max part size or overflow.
  • 21. Upload Strategy: Start sequential, stabilize, then parallelize. SequentialMultipartUploadStrategy: simpler, less likely to fail, easier to retry, little to optimize outside chunk size. ParallelMultipartUploadStrategy: much better throughput, but need to optimize degree, retries & error handling.
  • 23. What’s the top-speed?
  • 24. Is this as good as it gets? 10GigE should be able to do 1280MB/s; cc1.4xlarge has been measured up to ~560MB/s local; but we’re only getting ~100MB/s sustained
  • 25. So, where do we go now? zero copy transfer; more work on slice algorithms; tools and integrations (ex. hdfs); add implementations for other blobstores
  • 26. Wanna play? blobStore.putBlob("movies", blob, multipart()); (put-blob *blobstore* "movies" blob :multipart? true) or just visit github jclouds-examples: blobstore-largeblob, blobstore-hdfs
  • 27. Questions? github jclouds-examples, @jclouds, @tiborkisstibor, adrian@cloudsoftcorp.com
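
The slicing rules on slide 20 can be read as a small part-size calculation. The following is a minimal sketch of that reading, not the jclouds implementation: the class name `SliceSketch`, the target of 100 parts (chosen so that 100 x 32MB matches the 3.2GB threshold on the slide), and the 10,000-part ceiling (S3's part-count limit) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the part-size rules listed on slide 20; not jclouds code.
class SliceSketch {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    static final long DEFAULT_PART_SIZE = 32 * MB; // "32M parts" from the slide
    static final long MAX_PART_SIZE = 5 * GB;      // "max (5GB)" from the slide
    static final int TARGET_PARTS = 100;           // assumption: 100 x 32MB gives the "3.2GB" threshold
    static final int MAX_PARTS = 10_000;           // assumption: S3's part-count limit

    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    /** Picks a part size for a blob of the given length in bytes. */
    static long partSize(long length) {
        if (length <= DEFAULT_PART_SIZE * TARGET_PARTS) {
            return DEFAULT_PART_SIZE;                     // small blobs: fixed 32MB parts
        }
        long grown = ceilDiv(length, TARGET_PARTS);       // hold the part count, grow the parts
        if (grown <= MAX_PART_SIZE) {
            return grown;
        }
        if (ceilDiv(length, MAX_PART_SIZE) > MAX_PARTS) { // "overflow": too big even at max part size
            throw new IllegalArgumentException("blob too large for one multipart upload");
        }
        return MAX_PART_SIZE;                             // huge blobs: as many 5GB parts as needed
    }

    /** Returns {offset, length} pairs covering the blob; the last part may be shorter. */
    static List<long[]> slices(long length) {
        List<long[]> parts = new ArrayList<>();
        if (length <= 0) {
            return parts;
        }
        long size = partSize(length);
        for (long offset = 0; offset < length; offset += size) {
            parts.add(new long[] {offset, Math.min(size, length - offset)});
        }
        return parts;
    }

    public static void main(String[] args) {
        for (long length : new long[] {1 * GB, 30 * GB, 600 * GB}) {
            System.out.println(length / GB + "GB -> " + slices(length).size()
                    + " parts of up to " + partSize(length) / MB + "MB");
        }
    }
}
```

Each `{offset, length}` pair corresponds to one call of the `slice(Payload input, long offset, long length)` signature quoted on slide 19, so a sequential or parallel upload strategy could iterate over the list and upload one part per entry.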