Luca Cozzuto
Bioinformatics Core Facility
vectorQC
A pipeline for the assembly
and annotation of vectors
Background
A vector is a DNA molecule used as a vehicle to carry foreign genetic material into a
host cell, where it can be replicated and/or expressed.
The vector itself is generally a DNA sequence that consists of an insert (transgene) and
a larger sequence that serves as the "backbone" of the vector.
Once inside the host cell, a vector can be used for:
• Amplification (cloning vector)
• Expression (expression vector)
Background
A vector is composed of different elements:
• Origin of replication
• Cloning sites: one or more targets for restriction enzymes
• Reporter genes: genes whose function is activated or inactivated by a
successful insertion, colouring the positive colonies
• Antibiotic resistance: for selecting only the
colonies containing the vector
• Promoter
• …
Figure: the pBR322 plasmid. Source: Wikipedia
The problem
Vectors are now a basic tool in biotechnology, and it is quite common for a lab or
facility to keep a library of vectors.
With each passing year, the risk of mislabelling, construct degradation and
contamination increases.
Quality control of the integrity of both the
vector backbone and the inserted DNA
could help avoid wasted time and
money and reduce errors.
Solution
• Biomolecular Screening & Protein Technologies Unit
• Genomics Unit
• Bioinformatics Unit
Solution
Pool of vectors → Massive sequencing → Analysis (reproducible pipeline) → Result (report and map of each vector; database)
The pipeline: vectorQC
• Quality trimming and assembly: fragmented DNA → scaffolds / whole constructs
• Annotation of features: scaffolds / whole constructs, plus the DB of features and the list of inserts → annotations
• Generating maps
• Generating report and sequences
vectorQC
Quality control and trimming
• FastQC: QC of the initial and trimmed reads
• Skewer: trimming of the raw reads
Read assembly
• FLASH: merging of overlapping reads (optional)
• SPAdes: assembly, which is then corrected by a custom script to account for circularity
• Custom script: randomly joins the scaffolds into a single molecule
Annotation
• BLAST: annotating features and, where possible, detecting the DNA insert
• restrict (EMBOSS): detecting restriction enzyme sites
• Circular Genome Viewer: generating the maps
• MultiQC: collecting the results in a comprehensive report
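The slides do not show how the custom circularity script works. As a rough illustration only, the sketch below assumes that a circular construct assembled as a linear contig ends with a copy of its own start, and trims that repeated end. The function name `trim_circular_overlap` and the `min_overlap` parameter are hypothetical and are not taken from vectorQC.

```python
from typing import Tuple


def trim_circular_overlap(seq: str, min_overlap: int = 20) -> Tuple[str, int]:
    """Return (trimmed sequence, overlap length).

    Assumption for this sketch: if the last n bases are identical to the
    first n bases (n >= min_overlap), the contig is treated as circular and
    the repeated end is removed. This is NOT the vectorQC script itself.
    """
    seq = seq.upper()
    # An overlap longer than half the contig is not considered here.
    max_n = len(seq) // 2
    # Search from the longest candidate overlap down to the minimum allowed.
    for n in range(max_n, min_overlap - 1, -1):
        if seq[:n] == seq[-n:]:
            return seq[:-n], n
    # No terminal repeat found: return the contig unchanged.
    return seq, 0


if __name__ == "__main__":
    core = "ATGCCGTAAGCTTGGATCCAGT"
    contig = core + core[:10]  # linear contig whose end repeats its start
    trimmed, overlap = trim_circular_overlap(contig, min_overlap=5)
    print(trimmed == core, overlap)
```

Exact string matching is the simplest possible choice; a real assembly may need to tolerate mismatches in the overlapping ends.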
vectorQC
Available resources
• Database of features: from the PlasMapper tool, but can be expanded
• Database of restriction enzymes: REBASE
Custom resources
• Insert list: custom FASTA file with the names of the inserts
https://github.com/biocorecrg/vectorQC
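To make the insert list concrete, here is a minimal sketch of reading such a FASTA file into a name-to-sequence mapping. It assumes the first word of each header line is the insert name; the function name `parse_fasta` and the example records are illustrative and do not come from the vectorQC repository.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {insert name: sequence}.

    Assumption for this sketch: the insert name is the first
    whitespace-separated word after '>' on each header line.
    """
    records: Dict[str, List[str]] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            parts = line[1:].split()
            if not parts:
                raise ValueError("FASTA header without a name")
            name = parts[0]
            if name in records:
                raise ValueError(f"duplicate insert name: {name}")
            records[name] = []
        elif name is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[name].append(line.upper())
    # Join the sequence lines collected for each record.
    return {key: "".join(chunks) for key, chunks in records.items()}


if __name__ == "__main__":
    # Hypothetical insert list with made-up sequences.
    example = [">insertA demo record", "atgg", "TGA", "", ">insertB", "ATGGTG"]
    print(parse_fasta(example))
```

Rejecting duplicate names is a design choice for this sketch: an insert list with two records of the same name would make the annotation ambiguous.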
Good practices
• Continuous integration
• Docker image on Docker Hub with automated builds
Next developments
• Improving the assembly: removing low-coverage contigs
• Comparison with a reference: if a reference is provided, check the concordance of the
contigs with it
• Detection of variants: SNP / indel calling against the reference, if provided
https://github.com/biocorecrg/vectorQC
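One possible way to approach the low-coverage filtering is sketched below. It relies on the SPAdes contig naming scheme (`NODE_<n>_length_<len>_cov_<coverage>`) and keeps only contigs whose coverage is at least a fraction of the best-covered contig. The function name `filter_low_coverage` and the 10% default threshold are assumptions made for illustration, not a planned vectorQC feature.

```python
import re
from typing import List

# SPAdes names contigs like "NODE_1_length_5200_cov_310.5".
_SPADES_HEADER = re.compile(r"NODE_\d+_length_(\d+)_cov_([\d.]+)")


def filter_low_coverage(headers: List[str], min_fraction: float = 0.1) -> List[str]:
    """Keep headers whose coverage is >= min_fraction of the top coverage.

    Assumption for this sketch: a contig covered far less than the
    best-covered contig is likely contamination or assembly noise.
    """
    parsed = []
    for header in headers:
        match = _SPADES_HEADER.search(header)
        if match is None:
            raise ValueError(f"not a SPAdes-style header: {header}")
        parsed.append((header, float(match.group(2))))
    if not parsed:
        return []
    top = max(cov for _, cov in parsed)
    return [header for header, cov in parsed if cov >= min_fraction * top]


if __name__ == "__main__":
    # Made-up headers for illustration.
    contigs = [
        "NODE_1_length_5200_cov_310.5",
        "NODE_2_length_800_cov_295.0",
        "NODE_3_length_300_cov_2.1",
    ]
    print(filter_low_coverage(contigs))
```

A relative threshold is used rather than an absolute one because sequencing depth can differ widely between vectors in a pool.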
Thank you!
Toni Hermoso Pulido
Julia Ponomarenko
Sarah Bonnin
Jochen Hecht (Genomics Unit)
Carlo Carolis (BS&PT Unit)
