SeqWare on the Cloud:
Porting a Genome Center’s Infrastructure
        to Amazon Web Services

               Brian O'Connor

          SeqWare Software Architect &
         Manager for Software Engineering

      The Ontario Institute for Cancer Research
Effective Scaling

[Diagram: an Effective System at the center, surrounded by three scaling axes]
●   Integration & Sharing
●   Expertise
●   Compute & Storage

SeqWare was designed to scale in these ways
Effective Scaling

[Same diagram as the previous slide, with a "Query Engine Poster" label added beside Integration & Sharing]

SeqWare was designed to scale in these ways
The Open Source SeqWare Project

[Architecture diagram with the following components]
●   SeqWare Portal
●   SeqWare Web Service
●   SeqWare MetaDB
●   SeqWare Pipeline
●   SeqWare Query Engine
●   Local Cluster
●   Cloud

[Diagram labels: Big Data, Small Data]
Distinguishing Features of SeqWare

[Diagram labels: Firehose, Taverna; Open Source/Community, Commercial]

●   Infrastructure Toolkit
●   Developer Framework
●   Automation
●   Environment-Agnostic
●   Tailored for Big Projects
●   User-Created Workflows
●   Packaging Format
●   Provenance Tracking
●   Fault Tolerant
●   Tools-Agnostic
●   Open Source
Projects Using SeqWare

Project                                  Analysis types                   Scale
UNC Lineberger Cancer Center             RNASeq                           1.5 TBase, 927 samples,
  + local projects                                                        982 “lanes”
Ontario Institute for Cancer Research    Exome, Whole Genome,             38 TBase, 1,522 samples,
  + local projects                       Targeted Re-Sequencing, RNASeq   2,297 “lanes”
Iceman, HuRef 300x, Others...            Whole Genome                     9 genomes, a 300x genome
Clinical Sequencing                      Targeted Resequencing            Hundreds of patient samples
Plant Genome Assembly                    Whole Genome                     2 genomes, JBrowse on iPad
Scaling Expertise:
        Analyzing Illumina Data @ OICR
●   Rolled out SeqWare at OICR in September 2011

●   Goal: to deploy SeqWare and streamline
    production analysis through automation
    ●   4 groups working together
    ●   SeqWare Workflows for
        –   Large projects and common tasks
        –   Projects with “public uploads”
SeqWare at OICR

[Architecture diagram repeated over five slides: SeqWare Portal, SeqWare Web Service, SeqWare MetaDB, SeqWare Pipeline, SeqWare Query Engine, Local Cluster, Cloud; labels: Big Data, Small Data]

Labels added on successive slides:
●   Software Engineering
●   Pipeline & Tool Evaluation
●   Sequencing Facility, LIMS
●   Production Informatics, “deciders”, User + Data =
OICR Production Workflows
Multiple groups contributed, including the new Pipeline and Evaluation Team

[Workflow listing in two panels: In Production; Staging, Testing, & Development Workflows]
OICR SeqWare Results

[Chart: Samples over Time (~2 years); annotation: 38 Trillion Bases Aligned]

●   Automated key components in production
●   What about sharing our infrastructure?
●   To the Cloud!
“The Cloud”
Want to share infrastructure without sharing infrastructure

Core Services                  Transfer & Web Services    Other Nifty Services
Elastic Compute Cloud (EC2)    Elastic Beanstalk          Glacier
Simple Storage Service (S3)    Import/Export              Elastic MapReduce
Linux tools for disk and       Direct Connect             DynamoDB
file encryption                                           HBase

Hardware through API calls
Scaling Computation:
         Analyzing SOLiD Data on Amazon

●   Life Collaborations Division had 9 human
    genomes such as the Iceman's genome and
    HuRef resequenced at high depth

●   Goal: to deploy SeqWare infrastructure on
    AWS and analyze data in a scalable way
    ●   Without building infrastructure
    ●   Using open source tools
SeqWare Infrastructure on EC2

[Architecture diagram in three groups]
●   User: Workflow Bundle, Config, Import, Result Files (BAM, VCF, Reports)
●   SeqWare Interfaces: Command Line Tools, SeqWare Web Service, SeqWare Portal
●   Amazon Instance or Cluster: SeqWare MetaDB, Instance or Cluster Launcher, Amazon S3, Amazon EC2, SeqWare Pipeline
Workflow Outputs
●   Results via Project Website on EC2: http://icemangenome.net
●   Variants Loaded in JBrowse Genome Browser on Elastic Beanstalk
●   Variants in Database and Files in S3: Variant Database, BAM, VCF, Annotated VCF
Results
●   Cloud delivered fantastic computational and
    storage scalability

●   Analyzed 9 human genomes, one at 300x!

●   Costs
    ●   8 node HPC cluster, about 4 days
    ●   30x coverage genome was ~$1000 (<$15/GBase)
    ●   ~$150 per exome ($10/GBase)
    ●   ~$50/month/genome storage, website, & browser
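
The per-GBase figures above can be sanity-checked with a little arithmetic. The sketch below is illustrative only and not part of the original talk: it assumes a human genome of roughly 3 GBase (the slide does not state a size), and the function and variable names are hypothetical.

```python
# Back-of-the-envelope check of the per-GBase costs quoted on the Results slide.
# Assumption (not stated on the slide): a human genome is about 3 GBase,
# so 30x coverage corresponds to roughly 90 GBase of sequence.

HUMAN_GENOME_GBASE = 3.0  # assumed genome size in GBase


def cost_per_gbase(total_cost_usd: float, gbase: float) -> float:
    """Return the cost in USD per GBase of sequence analyzed."""
    if gbase <= 0:
        raise ValueError("gbase must be positive")
    return total_cost_usd / gbase


def implied_gbase(total_cost_usd: float, usd_per_gbase: float) -> float:
    """Return the amount of sequence implied by a total cost and a unit cost."""
    if usd_per_gbase <= 0:
        raise ValueError("usd_per_gbase must be positive")
    return total_cost_usd / usd_per_gbase


# 30x genome at ~$1000: unit cost under the assumed genome size.
genome_gbase = 30 * HUMAN_GENOME_GBASE
genome_rate = cost_per_gbase(1000.0, genome_gbase)

# Exome at ~$150 and $10/GBase: amount of sequence those two numbers imply.
exome_gbase = implied_gbase(150.0, 10.0)

print(f"30x genome: {genome_gbase:.0f} GBase at about ${genome_rate:.2f}/GBase")
print(f"Exome at $10/GBase: about {exome_gbase:.0f} GBase of sequence")
```

Under that assumption the 30x genome works out to about $11 per GBase, which is consistent with the slide's "<$15/GBase", and the exome figures imply about 15 GBase of sequence per exome.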
The Future of SeqWare
●   Scalability
    ●   Cloud-based cluster launching (StarCluster/CloudMan)
    ●   Release encryption and distributed filesystem tools
    ●   Better documentation and easier setup
●   Expertise
    ●   Simplify pipeline language(s) and development process
    ●   Release OICR public workflows
●   Integration
    ●   Expand NoSQL variant/annotation database
    ●   Support for other tools like Galaxy
Availability
●   SeqWare available at:
    http://seqware.github.com, @SeqWare

[Graphic labels: Virtual Box & AMI]

●   Brian O'Connor
    boconnor@oicr.on.ca
Acknowledgements
●   SeqWare @ OICR
    ●   Morgan Taschuk, Denis Yuen, Yong Liang
●   OICR SeqProdBio
    ●   Tim Beck, Zheng Zha, Tony DeBat
●   OICR Bioinformatics Core
    ●   Francis Ouellette, Zhibin Lu
●   SeqWare @ UNC
    ●   Neil Hayes, Sara Grimm, Stuart Jefferys, Matt Solloway, and the Lineberger group
●   Tim Harkins
●   Barry Merriman
●   Jason Warner
●   Kevin McKernan
●   Vincent Ferretti
●   Lincoln Stein
