‘the complete set of phylogenetic trees derived from the proteome of an organism’
Sicheritz-Pontén and Andersson, 2001. Nuc. Acids Res. 29: 545

genome-wide events + gene family-specific events

August 2012. At Daitoku-ji Temple, Kyoto
[Figure: three alternative tree hypotheses (A, B and C), each drawn for human, chicken, shark, cyclostomes (lamprey, hagfish), amphioxus and tunicate]

- Mol. phylogeny of 55 gene families: Kuraku et al., 2009. MBE
- Globin gene phylogeny: Hoffmann et al., 2010. PNAS
- Sea lamprey genome analysis: Smith, Kuraku et al., 2013. Nature Genetics

- Composition of Hox/Dlx clusters: Neidert et al., 2001. PNAS; Irvine et al., 2002. J Exp Zool B; Force et al., 2002. J Exp Zool B, etc.
- Mol. phylogeny of 33 gene families: Escriva et al., 2002. MBE
- Amphioxus genome: Putnam et al., 2008. Nature

- ParaHox clusters: Furlong et al., 2007. MBE
Kuraku and Kuratani, 2011

Tree inference: heuristic ML search under JTT+G4; node support given as ML-BP/NJ-BP (bootstrap probabilities)
(Kuraku & Kuratani, 2011. Genome Biol. Evol.)

(cf. hidden paralogy)
Genome Resource & Analysis Unit
Center for Developmental Biology, RIKEN, Kobe, Japan

Informatics / Modern sequencing / Molecular Developmental Biology
Sanger sequencing, Cell sorting with FACS, clone distribution, etc.

Illumina HiSeq 1500: ~150 bp reads in Rapid Run mode
Installed in November 2011
Not only sequencing

Kuraku et al., 2013. Nucleic Acids Res.

Amemiya et al., 2013
Our experiences at GRAS
・Main applications: RNA-seq & ChIP-seq
・Diverse non-model organisms for RNA-seq
・Troubleshooting with tight wet-dry communication
・Many requests with limited sample amounts
For retrieving a complete genome and the original transcriptome

・Sequencers ‘can’ produce ‘data’ from problematic samples
Low-quality DNA/RNA, contamination, over-amplification, …

・Check pricing and service contents carefully
e.g. How many reads do you need?

・Longer Illumina reads are not necessarily beneficial
~150 bp on HiSeq & ~300 bp on MiSeq (as of September 2013)
Prep of libraries with longer inserts
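
The question ‘How many reads do you need?’ can be answered roughly before asking for a quote. The Python sketch below is an illustration, not part of the talk. It assumes the usual relation depth = (number of reads × read length) / genome size. The function name `reads_needed`, the 1 Gb genome size and the 50x target depth are hypothetical example values.

```python
def reads_needed(genome_size_bp, target_depth, read_length_bp, paired=True):
    """Reads (or read pairs, if paired=True) needed for a target average depth.

    Rearranged from: depth = (reads x read length) / genome size.
    All arguments are integers; the result is rounded up.
    """
    bases_needed = genome_size_bp * target_depth
    bases_per_unit = read_length_bp * (2 if paired else 1)
    # Integer ceiling division, so no floating-point rounding is involved.
    return -(-bases_needed // bases_per_unit)


# Hypothetical example: 1 Gb genome, 50x depth, 150 bp paired-end reads.
pairs = reads_needed(1_000_000_000, 50, 150, paired=True)
print(f"{pairs:,} read pairs")
```

This only gives an average depth. Repeats, GC bias and duplicated reads reduce the usable depth, so the figure is better treated as a lower bound when comparing service offers.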
Species             | Sequenced at    | Gene model by         | Sequencing technology | Published in       | # of authors | Started in
sea lamprey         | Wash. Univ.     | Yandell lab / Ensembl | Sanger                | Nat. Genet. (2013) | 59           | 2005?
soft-shelled turtle | BGI             | BGI / Ensembl         | Illumina              | Nat. Genet. (2013) | 34           | 2010
coelacanth          | Broad Institute | Broad / Ensembl       | Illumina              | Nature (2013)      | 91           | 2011
Sequenced at Wash. Univ. Genome Institute

International consortium: Smith, Kuraku, et al. 2013. Nature Genetics
Contributed analysis
- Vertebrate ‘new genes’
- GC & codon usage bias
- Myelin-associated genes

In-house annotation effort
- Trained gene prediction setting available at Augustus web server
- GC-content & codon usage bias: Qiu et al., 2011. BMC Genomics
- Horizontal gene transfer: Kuraku et al., 2012. Genome Biol. Evol.
http://www.ensembl.org/Petromyzon_marinus/Info/Index

Coding genes: 10,415

Incomplete genome assembly: Pax6 missing
Incomplete gene annotation: Fgf8/17-A missing
(as of September 2013; release 73)
Amino acid composition
Methods: Correspondence analysis (CA) for frequencies of 20 amino acids
Deviation of ‘gene model’ in lamprey genome
Smith, Kuraku, et al. 2013. Nature Genetics
Codon usage bias
Methods: RSCU (Sharp et al., 1986) and ENc (Wright, 1990)
[Figure: species compared: sea lamprey, stickleback, Tetraodon, Takifugu, platypus, medaka, dog, human, mouse, ghost shark, zebrafish, chicken, anole lizard, opossum, X. tropicalis]

Heavy use of GC-rich codons
Qiu et al., 2011. BMC Genomics
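
RSCU (relative synonymous codon usage) is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below illustrates that definition only. It is not the pipeline used in the cited papers, it assumes the standard genetic code, and the function name `rscu` and the toy sequence are hypothetical. ENc is not computed here.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons into synonymous families, keyed by amino acid.
SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMS[aa].append(codon)


def rscu(cds):
    """Return {codon: RSCU} for amino acids observed in a coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)  # skips codons with N etc.
    values = {}
    for aa, family in SYNONYMS.items():
        if aa == "*" or len(family) == 1:
            continue                           # stop codons, Met and Trp carry no signal
        total = sum(counts[c] for c in family)
        if total == 0:
            continue                           # amino acid absent from this sequence
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values


# Toy example: four alanine codons, GCC used twice.
print(rscu("GCCGCCGCGGCT"))
```

An RSCU above 1 means the codon is used more often than expected under equal usage. Heavy use of GC-rich codons would show up as high values for codons ending in G or C.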
Genomic DNA
→ Sequencing: Sanger, 454, Illumina, and/or PacBio (problematic: heterochromatin etc.)
Raw reads
→ Assembly (problematic: repeats, regions with low depth)
Genome assembly (contigs/scaffolds)
→ Gene prediction, after ‘training’ (problematic: ‘unusual’ genes)
‘Gene model’ (protein-coding sequences)

Reference: transcriptome, annotated genes in GenBank
Assessing the genome assembly (cf. Assemblathon2 - Bradnam et al., 2013)

‘NG50’ instead of N50
CEGMA (Parra et al., 2007) – coverage of CEGs
CGAL, REAPR, ALE – evaluation by identifying misassemblies

QUAST – computation of assembly summary
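
The difference between N50 and ‘NG50’ can be made concrete in a few lines of Python. This is a minimal sketch, not code from the talk. NG50 is taken in the Assemblathon sense, where an estimated genome size replaces the total assembly length. The function name `nx50` and the contig lengths are made up for illustration.

```python
def nx50(lengths, genome_size=None):
    """N50 of a set of contig/scaffold lengths, or NG50 if genome_size is given.

    N50:  largest length L such that sequences of length >= L together
          cover at least half of the total assembly length.
    NG50: same, but half of the estimated genome size is the target.
    Returns None if the assembly is too small to reach the target.
    """
    total = genome_size if genome_size is not None else sum(lengths)
    target = total / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return None


contigs = [80, 70, 50, 40, 30, 20]            # hypothetical lengths (kb)
print("N50 :", nx50(contigs))                 # half of 290 reached at the 70 kb contig
print("NG50:", nx50(contigs, genome_size=400))
```

Because N50 is relative to the assembly's own size, an assembly that omits large parts of the genome can still report a high N50. NG50 puts assemblies of the same species on a common scale.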
Species             | Assembly release     | # of CEGs found (including ‘partial’) | Published?
human               | GRCh37 (hg19)        | 248 | First draft in 2001
mouse               | GRCm38 (mm10)        | 239 | First draft in 2002
X. tropicalis       | JGI_4.2              | 239 | Hellsten et al., 2010
coelacanth          | LatCal1              | 236 | Amemiya et al., 2013
spotted gar         | LepOcu1              | 235 | unpublished
soft-shelled turtle | PelSin_1.0           | 232 | Wang et al., 2013
anole lizard        | AnoCar2.0            | 231 | Alföldi et al., 2011
zebrafish           | Zv9                  | 230 | Howe et al., 2013
chicken             | galGal4              | 220 |
chicken             | WASHUC2.63 (galGal3) | 210 | First draft in 2004
Japanese lamprey    | LetCam1              | 199 | Mehta et al., 2013
sea lamprey         | PerMar1              | 172 | Smith et al., 2013
little skate        | version2             | 77  | unpublished
elephant shark      | (1.4x)               | 58  | Venkatesh et al., 2007

Total: 248 core eukaryotic genes (CEGs)
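
The CEG counts above are easier to compare as a percentage of the 248 genes. The short Python sketch below does that conversion for a few of the listed assemblies. The counts are copied from the table, while the dictionary layout and the function name `completeness` are illustrative choices rather than CEGMA output.

```python
TOTAL_CEGS = 248  # size of the core eukaryotic gene set used by CEGMA

# CEGs found (including 'partial'), copied from the table above.
cegs_found = {
    "human (GRCh37)": 248,
    "coelacanth (LatCal1)": 236,
    "chicken (galGal4)": 220,
    "Japanese lamprey (LetCam1)": 199,
    "sea lamprey (PerMar1)": 172,
    "little skate (version2)": 77,
    "elephant shark (1.4x)": 58,
}


def completeness(found, total=TOTAL_CEGS):
    """Percentage of the CEG set detected in an assembly."""
    return 100.0 * found / total


for name, found in sorted(cegs_found.items(), key=lambda kv: -kv[1]):
    print(f"{name:28s} {found:3d}/{TOTAL_CEGS}  {completeness(found):5.1f}%")
```

On this scale the sea lamprey assembly recovers about 69% of the CEGs and the low-coverage elephant shark assembly about 23%.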

Assessing the gene model: ‘Annotation Turnover’ and ‘AED’ (Annotation Edit Distance; Eilbeck et al., 2009)
Also, run CEGMA to check transcript diversity?
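
As a rough illustration of AED, the Python sketch below compares two versions of one gene annotation at the nucleotide level. It assumes the definition AED = 1 − (SN + SP) / 2, where SN and SP are the shared bases divided by the bases annotated in each version. The function names and the exon coordinates are hypothetical, and real tools work on annotation files rather than coordinate lists.

```python
def covered_positions(exons):
    """Set of genomic positions covered by half-open exon intervals [start, end)."""
    positions = set()
    for start, end in exons:
        positions.update(range(start, end))
    return positions


def aed(annotation_a, annotation_b):
    """Annotation edit distance between two exon lists for the same locus.

    0.0 means the two annotations cover exactly the same bases;
    1.0 means they share no bases at all.
    """
    a = covered_positions(annotation_a)
    b = covered_positions(annotation_b)
    if not a or not b:
        raise ValueError("both annotations must cover at least one base")
    shared = len(a & b)
    sensitivity = shared / len(a)   # fraction of A also present in B
    specificity = shared / len(b)   # fraction of B also present in A
    return 1 - (sensitivity + specificity) / 2


# Hypothetical gene: the newer release shortens the second exon.
old_release = [(0, 100), (200, 300)]
new_release = [(0, 100), (200, 250)]
print(aed(old_release, new_release))
```

Using sets of positions keeps the sketch short. For whole genomes an interval-based overlap calculation would be more memory-efficient.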
– Nakamura et al., 2013
- Phylogenetic properties of the species of interest
e.g. ploidy level, distance to close relatives, …

www.genomesize.com, www.timetree.org

- Any clue about its molecular attributes?
e.g. GC-content, repeats, intron/UTR length, …
Use existing resources at SRA and Sanger traces at NCBI dbEST
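
Once a few existing sequences have been downloaded, GC-content is among the simplest of these attributes to check. The Python sketch below is illustrative only: `gc_content` and `gc3` are hypothetical helper names, the input sequence is a toy example, and `gc3` assumes an in-frame coding sequence.

```python
def gc_content(seq):
    """Fraction of G+C among unambiguous bases, or None if there are none."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return None
    return (seq.count("G") + seq.count("C")) / acgt


def gc3(cds):
    """GC-content at third codon positions of an in-frame coding sequence."""
    third_positions = cds.upper()[2::3]   # indices 2, 5, 8, ... of complete codons
    return gc_content(third_positions)


toy_cds = "ATGGCCGCGTAA"                  # hypothetical 4-codon sequence
print(f"GC  = {gc_content(toy_cds):.2f}")
print(f"GC3 = {gc3(toy_cds):.2f}")
```

A GC3 value well above the overall GC-content indicates a preference for GC-rich synonymous codons, which is relevant when training gene predictors on a new species.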
- Genome or transcriptome to sequence?
Any existing or emerging resources?

- RNA-seq: sequence identification or quantification?
- Sample prep mostly determines the fate of the project
Quantification with Qubit; rRNA removal controlled with BioAnalyzer
Replication > Depth (Rapaport et al., 2013. Genome Biol.)

- Rigorous QC of prepared libraries before sequencing
ChIP-qPCR before ChIP-seq
- Fostering more productive sequencing facilities in Japan
GRAS accepts visits from facility managers/staff

- Education of researchers with dual (wet/dry) capabilities
‘A sequencer or a bioinformatician?’
Learning material: ‘Unix & Perl for Biologists’ by Korf Lab
http://korflab.ucdavis.edu/unix_and_Perl/

- Importing latest information from overseas
→ shigehiro-kuraku@cdb.riken.jp

Presentation at ZSJ 2013 by Shigehiro Kuraku
