De novo assembly, a
multi-technology approach:
Illumina, PacBio, and OpGen
PhD. Francesco Vezzi
Senior Bioinformatician, NGI-Stockholm
Both Stockholm and Uppsala nodes
Instrument                                 Count
Illumina HiSeq 2000/2500                   16
Illumina MiSeq                             3
Life Technologies SOLiD 5500xl             4
Life Technologies SOLiD 5500wildfire       2
Life Technologies Ion Torrent              2
Life Technologies Ion Proton               6
Life Technologies Sanger ABI3730           2
Pacific Biosciences RSII                   1
Argus Whole Genome Mapping System          1
One of 3 best-equipped sequencing sites in Europe
In this talk
Illumina (Stockholm):
• 100/150 bp paired reads (low error rate)
• 900/200 Gbp in 6/2 day(s)
PacBio (Uppsala):
• 8.5 Kbp reads (max 30 Kbp, high error rate)
• 375 Mbp (1 SMRT Cell) in 10 hours
OpGen Argus System (Stockholm):
• ~300 Kbp maps
• 10 Gbp in ~1 day
Optical Maps
• Restriction Map
◦ Representation of the cut sites on a
given DNA molecule to provide spatial
information of genetic loci
• An enzyme is selected and used
to cut the molecules. This
provides a 2D representation of
the molecule structure
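
The restriction-map idea above can be sketched in a few lines of Python. This is a minimal illustration, not the Argus software: it scans a sequence for an enzyme recognition site and reports the cut positions and the resulting fragment lengths. The talk does not name the enzyme; BamHI (site GGATCC, cutting after the first G) is used here purely as an assumption, and `restriction_map` and the toy sequence are hypothetical.

```python
def restriction_map(sequence, site, cut_offset=0):
    """Return (cut_positions, fragment_lengths) for a linear molecule.

    cut_offset is the distance from the start of the recognition site
    to the position where the enzyme cuts.
    """
    sequence = sequence.upper()
    site = site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        # Continue one base further so overlapping sites are also found.
        start = sequence.find(site, start + 1)
    # Fragment lengths are the distances between consecutive cuts,
    # with both molecule ends included as boundaries.
    boundaries = [0] + cuts + [len(sequence)]
    fragments = [b - a for a, b in zip(boundaries, boundaries[1:])]
    return cuts, fragments


# Toy molecule with two BamHI sites (assumed enzyme, for illustration only).
toy = "AAAAGGATCCTTTTTTGGATCCAA"
cuts, fragments = restriction_map(toy, "GGATCC", cut_offset=1)
print("cut positions:", cuts)
print("fragment lengths:", fragments)
```

The ordered list of fragment lengths is the "map": an optical map measures the same quantity on a real molecule, with sizing error, so the two can be compared.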
Optical Maps: workflow
Step                                     Time
DNA extraction directly from culture     3-8h
Quality control of extracted material    1h
Prepare a chip                           1.5h
Run Argus System                         1h
Data assembly                            2-8h
Closing genomes with Optical Maps
De novo reconstructs parts
missing in the reference strain
Correctly assembles long tandem
repeats
De Novo assembly
(Illumina, PacBio)
Set of unordered and
unoriented contigs
Optical Map
Contigs
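
How an unordered, unoriented contig could be anchored on an optical map can be sketched as below. This is a toy under strong assumptions, not the algorithm used by the OpGen software: the contig has already been digested in silico into fragment lengths, only ungapped placements are tried, and missing or false cut sites are ignored. The function names and the fragment sizes (in Kbp) are hypothetical.

```python
def placement_score(map_fragments, contig_fragments, offset):
    """Mean relative size difference for a placement starting at `offset`."""
    window = map_fragments[offset:offset + len(contig_fragments)]
    diffs = [abs(m - c) / m for m, c in zip(window, contig_fragments)]
    return sum(diffs) / len(diffs)


def place_contig(map_fragments, contig_fragments):
    """Return (score, offset, orientation) of the best ungapped placement.

    Both orientations are tried, because the contig strand is unknown.
    Returns None if the contig has more fragments than the map.
    """
    best = None
    candidates = (("+", contig_fragments), ("-", contig_fragments[::-1]))
    for orientation, frags in candidates:
        for offset in range(len(map_fragments) - len(frags) + 1):
            score = placement_score(map_fragments, frags, offset)
            if best is None or score < best[0]:
                best = (score, offset, orientation)
    return best


# Hypothetical optical map and contig fragments, sizes in Kbp.
optical_map = [12.0, 30.5, 8.2, 45.0, 19.7, 27.3, 60.1]
contig = [27.0, 20.0, 44.0]  # roughly map fragments 3..5, reversed

score, offset, orientation = place_contig(optical_map, contig)
print(f"offset={offset} orientation={orientation} score={score:.3f}")
```

Repeating this for every contig yields an order and orientation along each chromosome map; a production aligner would use dynamic programming to tolerate sizing errors and missed cuts.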
Case Study: Combining all the technologies
~15 Mbp genome sequenced at High Coverage with:
• Illumina HiSeq:
• 500X PE libraries (180bp and 650bp insert)
• 150X MP library (3Kbp)
• 150X MP library (7Kbp)
• PacBio
• 50/60X with reads longer than 2Kbp
• OpGen
• 3 chips (only one worked really well)
• 300X coverage
• Average map length 320Kbp
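
As a rough sanity check on these numbers, coverage can be converted into raw data volume with coverage × genome size. The sketch below uses only figures quoted in the talk (~15 Mbp genome, 375 Mbp per SMRT Cell); it assumes the 500X figure is the total over both paired-end libraries and that the nominal SMRT Cell yield is all usable, which is optimistic given the 2 Kbp read-length filter. Function names are hypothetical.

```python
import math

GENOME_SIZE_BP = 15_000_000        # ~15 Mbp genome (case study)
SMRT_CELL_YIELD_BP = 375_000_000   # nominal yield of 1 SMRT Cell (from the talk)


def bases_needed(coverage, genome_size=GENOME_SIZE_BP):
    """Raw bases required to reach a given fold coverage."""
    return coverage * genome_size


def smrt_cells_needed(coverage):
    """SMRT Cells required, assuming every sequenced base is usable."""
    return math.ceil(bases_needed(coverage) / SMRT_CELL_YIELD_BP)


illumina_coverage = 500 + 150 + 150  # PE + two MP libraries (assumed totals)
print("Illumina Gbp:", bases_needed(illumina_coverage) / 1e9)
print("SMRT Cells for 50X / 60X:", smrt_cells_needed(50), smrt_cells_needed(60))
print("OpGen Gbp at 300X:", bases_needed(300) / 1e9)
```

Under these assumptions the whole Illumina data set is a small fraction of a single HiSeq run, while the PacBio coverage needs only a few SMRT Cells.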
Assembly Strategy
https://github.com/vezzi/de_novo_scilife
Semi-automated pipeline for de novo assembly:
• Global configuration file → tools and system configuration
• Sample configuration file → samples description
3 modules:
1. QC-module (Illumina only):
• Adaptor removal, kmer-analysis, fastqc, (insert size estimation)
2. Assemble-module (Illumina only):
• Runs specified assemblers and outputs executed commands
3. Validation-module:
• FRCbam, coverage analysis, GC-analysis, (N50)
I NEED USERS/FEEDBACK/CONTRIBUTIONS
QC-Module
Kmer analysis:
• Samples complexity
• Error rate
• Heterozygosity
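
The k-mer analysis above boils down to counting every k-mer in the reads and looking at the abundance histogram (the k-mer spectrum). The following is a minimal pure-Python sketch of that idea, not the tool the pipeline actually calls; the function names and toy reads are hypothetical, and real data would need a dedicated k-mer counter and canonical (strand-independent) k-mers.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer in the reads, skipping k-mers that contain N."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def kmer_spectrum(counts):
    """Map abundance -> number of distinct k-mers seen that many times."""
    return Counter(counts.values())


# Toy reads; the third one carries a single-base difference.
reads = ["ACGTACGT", "CGTACGTA", "ACGTTCGT"]
spectrum = kmer_spectrum(kmer_counts(reads, k=4))
for abundance in sorted(spectrum):
    print(abundance, spectrum[abundance])
```

In a real spectrum, a large peak at very low abundance typically reflects sequencing errors, the main peak sits near the k-mer coverage, and a secondary peak at about half that coverage is a common sign of heterozygosity.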
[Figure: Insert Size Histogram for All_Reads in file lib_3000.bam; x-axis: Insert Size, y-axis: Count; series: FR, RF, TANDEM]
FASTQC
Adaptor removal
Alignment (partial assembly)
Assemble-Module
Illumina only:
• SOAPdenovo
• MaSuRCA
• Allpaths-LG
PacBio only:
• HGAP
• CABOG
Hybrid:
• PB-jelly (HAH)
>5000
Assembler     #scaffolds  totalLength  maxContigLength  N50     N80     percentageNs
Allpaths-LG   227         14513103     596012           139364  57619   15%
MaSuRCA       163         18549484     1188669          526519  282507  2%
HGAP          290         14399273     763592           142483  37117   0%
PB-Jelly      179         14718213     747750           195225  85127   13%
• Try-and-fail process
• Automated pipeline developed in order to
streamline these analyses
• MaSuRCA surprisingly the “best” assembler
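
For readers unfamiliar with the N50 and N80 columns: Nx is the length of the shortest sequence in the smallest set of longest sequences that together cover at least x% of the total assembly length. A minimal sketch of that definition follows; the function name and the toy lengths are hypothetical, and this is not the code used by the Validation-module.

```python
def nx(lengths, percent):
    """Return the Nx value (e.g. percent=50 for N50) of a list of lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer arithmetic avoids floating-point edge cases at the threshold.
        if running * 100 >= percent * total:
            return length
    return 0  # empty input


toy_contigs = [100, 80, 60, 40, 20]  # hypothetical contig lengths, total 300
print("N50:", nx(toy_contigs, 50))
print("N80:", nx(toy_contigs, 80))
```

Because Nx depends only on lengths, a misassembled scaffold can inflate it, which is why the table is complemented by FRCbam and coverage analysis in the Validation-module.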
Validation-Module
[Figure: per-assembler panels for MaSuRCA, HGAP, and PB-Jelly (HAH)]
Validation-Module: FRCbam
PacBio-only assembly is
clearly outperforming
the others
Optical Maps
PacBio produces the best assembly; however, 290 contigs are produced.
Optical Maps made it possible to obtain
the 2D representation of the 7
chromosomes.
N.B. chromosome number was
one of the biological questions of
this project!!!
But much more can be done!!!
Incredible tool to finish (or almost finish) genomes
Assemblies                       % contigs placed  Total size of placed contigs  % size placed contigs  % genome covered
PacBio+OpGen                     94.12             11578995                      97%                    77.05
Allpaths+OpGen                   71.88             10692027                      84%                    52.88
Allpaths+MaSuRCA+OpGen           80.65             27506424                      92%                    69.64
Allpaths+PacBio+OpGen            82.32             22271022                      91%                    83.05
MaSuRCA+PacBio+OpGen             94.44             28393392                      98%                    83.79
Allpaths+MaSuRCA+PacBio+OpGen    85.42             39085419                      94%                    87.39
Combining all the technologies
Conclusions – Take home message
Attempt to automate de novo assembly process:
• https://github.com/vezzi/de_novo_scilife
• Not 100% automated
Illumina, PacBio, Hybrid assemblies:
• PacBio alone seems to produce the best assemblies
• Hybrid assembly seems unable to correct merged-assembly
problems
Mixing technologies is always a good idea:
• Possibility to compensate technological biases
• Makes it possible to produce better assemblies
Thanks
https://github.com/vezzi/de_novo_scilife

SeRC: de novo assembly workshop. Francesco Vezzi
