Transcript reconstruction algorithms available in the
Trinity RNA-Seq package
Daniel Standage
Brendel Group, Indiana University

4 Mar 2014

Daniel Standage (Brendel Group @ IU)

Trinity Assembly

4 Mar 2014

1 / 24
Introduction

RNA-Seq

RNA-Seq

Examination of transcriptomes
deep
effective
affordable

Introduction

RNA-Seq

RNA-Seq

High throughput comes at the expense of
contiguity.

Introduction

RNA-Seq

RNA-Seq

High throughput comes at the expense of
contiguity...well, at least for now.

Introduction

Assembly with Trinity

Transcriptome assembly

In the absence of full-length transcript sequences,
reconstruct full-length sequences from fragments.

Introduction

Assembly with Trinity

Trinity RNA-Seq

Introduction

Assembly with Trinity

Trinity RNA-Seq

Now with 3 transcript reconstruction modes!
Butterfly (default)
--PasaFly
--CuffFly

Introduction

Assembly with Trinity

Review outline

Trinity algorithm
PASA algorithm
Cufflinks algorithm
Discussion

Trinity

Inchworm

Step 1: Inchworm

Assemble unique contigs representing transcript
subsequences.
Often produces dominant isoform in full length, and then just unique
portions of alternative isoforms.

Trinity

Inchworm

Inchworm procedure

1. Create a dictionary of k-mers (k = 25)
2. Remove k-mers containing probable errors (based on coverage?)
3. Select the most frequently occurring k-mer
4. Build a contig by extending the k-mer (find the most frequent k-mer with a k − 1 bp overlap, extend by 1 bp), then remove that k-mer from the dictionary
5. Repeat the previous step until the contig cannot be extended further, then report the contig
6. Repeat steps 3-5 until all k-mers are exhausted
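
The greedy procedure above can be sketched in a few lines of Python. This is a toy illustration, not Inchworm's actual implementation: the function names, the `min_count` error filter, and the choice to extend the seed in both directions are assumptions made for the example, and reverse complements are ignored.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Step 1: dictionary of k-mer -> number of occurrences in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def extend(contig, counts, k, forward=True):
    """Steps 4-5: greedily extend by 1 bp using the most frequent k-mer
    that overlaps the contig end by k - 1 bp; used k-mers are removed."""
    while True:
        if forward:
            overlap = contig[-(k - 1):]
            candidates = [overlap + base for base in "ACGT"]
        else:
            overlap = contig[:k - 1]
            candidates = [base + overlap for base in "ACGT"]
        candidates = [c for c in candidates if c in counts]
        if not candidates:
            return contig
        best = max(candidates, key=lambda c: counts[c])
        del counts[best]
        contig = contig + best[-1] if forward else best[0] + contig

def inchworm(reads, k=25, min_count=1):
    """Toy greedy assembler; requires k >= 2."""
    counts = kmer_counts(reads, k)
    # Step 2 (assumed form): drop low-count k-mers as probable errors.
    counts = Counter({km: n for km, n in counts.items() if n >= min_count})
    contigs = []
    while counts:                            # step 6
        seed = max(counts, key=counts.get)   # step 3
        del counts[seed]
        contig = extend(seed, counts, k, forward=True)
        contig = extend(contig, counts, k, forward=False)
        contigs.append(contig)
    return contigs
```

For example, `inchworm(["ACGTTGCA", "GTTG"], k=4)` seeds on the doubly covered k-mer `GTTG` and extends it in both directions.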

Trinity

Chrysalis

Step 2: Chrysalis

Group Inchworm contigs, construct de Bruijn
graph for each cluster.
Each connected component of the graph corresponds to one or more genes
with shared sequence.

Trinity

Chrysalis

Chrysalis procedure

1. Group contigs if they share perfect overlap of k − 1 bp (with reads supporting the overlap)
2. Build de Bruijn graph with k − 1 word size for nodes, k for edges; edges weighted by supporting reads
3. Assign each read to component with which it shares the largest number of k-mers
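
A rough Python sketch of the three steps, under simplifying assumptions that are not from the talk: contigs are grouped whenever they share any (k − 1)-mer (the read-support check is skipped), the de Bruijn graph is built directly from reads, and all function names are hypothetical.

```python
from collections import Counter

def kmers(seq, k):
    """Set of all k-length words in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def cluster_contigs(contigs, k):
    """Step 1 (simplified): union-find grouping of contigs that share
    a (k - 1)-mer. Returns clusters as lists of contig indices."""
    parent = list(range(len(contigs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    owner = {}  # (k - 1)-mer -> index of the first contig containing it
    for idx, contig in enumerate(contigs):
        for word in kmers(contig, k - 1):
            if word in owner:
                parent[find(idx)] = find(owner[word])
            else:
                owner[word] = idx
    clusters = {}
    for idx in range(len(contigs)):
        clusters.setdefault(find(idx), []).append(idx)
    return list(clusters.values())

def de_bruijn_edges(reads, k):
    """Step 2: nodes are (k - 1)-mers, edges are k-mers, and each edge
    weight is the number of times the k-mer occurs in the reads."""
    edges = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges[(kmer[:-1], kmer[1:])] += 1
    return edges

def assign_read(read, contigs, clusters, k):
    """Step 3: index of the cluster sharing the most k-mers with the read."""
    read_kmers = kmers(read, k)

    def shared(cluster):
        cluster_kmers = set().union(*(kmers(contigs[i], k) for i in cluster))
        return len(read_kmers & cluster_kmers)

    return max(range(len(clusters)), key=lambda c: shared(clusters[c]))
```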

Trinity

Butterfly

Step 3: Butterfly

Traverse read-supported paths in each subgraph,
enumerate plausible sequences.

Trinity

Butterfly

Butterfly procedure

1. Graph simplification: merge consecutive nodes in linear paths, pruning minor deviations
2. Plausible path scoring: identify paths in graph with read support
   - Initialize DP table with source nodes (no incoming edges)
   - Fill in table by extending path prefixes by one node
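
The path-scoring idea (start from source nodes, grow every path prefix by one node at a time) might look like the sketch below. It is deliberately reduced: "read support" is modeled as a per-edge read count with a hypothetical `min_support` cutoff, the graph is assumed to be acyclic, and the graph simplification step is not shown. Butterfly's real scoring is more involved.

```python
def enumerate_paths(edge_support, min_support=1):
    """edge_support: dict mapping (u, v) -> number of supporting reads.
    Returns every source-to-sink path that uses only supported edges."""
    supported = [e for e, n in edge_support.items() if n >= min_support]
    nodes = {n for edge in supported for n in edge}
    successors = {n: [] for n in nodes}
    has_incoming = set()
    for u, v in supported:
        successors[u].append(v)
        has_incoming.add(v)

    # Initialize the table with source nodes (no incoming edges).
    prefixes = [[n] for n in sorted(nodes - has_incoming)]
    complete = []
    # Fill in the table by extending each path prefix by one node.
    while prefixes:
        extended = []
        for path in prefixes:
            nexts = sorted(successors[path[-1]])
            if not nexts:
                complete.append(path)  # reached a sink
            extended.extend(path + [v] for v in nexts)
        prefixes = extended
    return complete
```

With the toy graph `{("A","B"): 5, ("B","C"): 5, ("B","D"): 1, ("C","E"): 4, ("D","E"): 1}`, raising `min_support` from 1 to 2 drops the weakly supported branch through `D`.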

PASA

PASA

Program to Assemble Spliced Alignments
designed for ESTs and FL-cDNAs (pre-NGS era)
works on sequence alignments
computes consensus spliced alignments

PASA

PASA algorithm

Input: a set of spliced cDNA alignments A
Output: for each alignment a ∈ A, the largest assembly containing a

1. Sort alignments
2. Test overlapping alignments for compatibility
3. Build DP table, backtrace to find maximal assembly A∗
4. If ∃ a ∉ A∗, build reciprocal DP table, trace to enumerate additional assemblies
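
The compatibility test in step 2 can be illustrated with a small sketch. It assumes the usual notion that two spliced alignments are compatible when they overlap and have identical introns within the overlapping region. The dictionary layout, inclusive coordinates, and function name are hypothetical, and transcribed orientation is ignored.

```python
def compatible(a, b):
    """a, b: dicts with 'start', 'end' (inclusive genomic coordinates) and
    'introns', a list of (start, end) pairs. Returns True if the alignments
    overlap and agree on every intron that touches the overlap."""
    lo = max(a["start"], b["start"])
    hi = min(a["end"], b["end"])
    if lo > hi:
        return False  # no overlap, nothing to compare

    def introns_in_overlap(aln):
        return {(s, e) for s, e in aln["introns"] if e >= lo and s <= hi}

    return introns_in_overlap(a) == introns_in_overlap(b)

# Toy alignments (coordinates invented for illustration).
x = {"start": 1, "end": 500, "introns": [(101, 200)]}
y = {"start": 50, "end": 600, "introns": [(101, 200), (401, 450)]}
z = {"start": 250, "end": 700, "introns": [(401, 450)]}
w = {"start": 1, "end": 300, "introns": [(101, 200)]}
```

Here `x` and `y` conflict because `x` has exon sequence where `y` has an intron, while `w` is compatible with both `x` and `z` even though `x` and `z` are incompatible with each other. Compatibility is pairwise, not transitive.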

PASA

PASA algorithm

Recurrences

$L_a = \max_b \{ C_a,\ L_b + C_{a/b} \}$
$R_a = \max_b \{ C_a,\ R_b + C_{a/b} \}$

$L_a$, $R_a$: maximum number of cDNAs in an assembly that contains alignment a, starting from left and right (respectively)
$C_a$: number of a-compatible alignments in the span of a
$C_{a/b}$: number of a-compatible alignments in the span of a but not in the span of b
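
A quick arithmetic illustration of the left-to-right recurrence, using made-up values (not from the talk) for an alignment a with two candidate predecessors $b_1$ and $b_2$:

```latex
\begin{align*}
C_a &= 3, \quad (L_{b_1}, C_{a/b_1}) = (4, 2), \quad (L_{b_2}, C_{a/b_2}) = (5, 0) \\
L_a &= \max\{\, C_a,\; L_{b_1} + C_{a/b_1},\; L_{b_2} + C_{a/b_2} \,\} \\
    &= \max\{\, 3,\; 4 + 2,\; 5 + 0 \,\} = 6
\end{align*}
```

Extending the assembly that ends at $b_1$ wins here because a contributes two alignments that the span of $b_1$ does not already cover.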

PASA

PASA algorithm

Cufflinks

Cufflinks

designed for short transcript reads (NGS era)
works on read alignments (mappings)
identifies fewest number of transcripts that “explain” the read
mappings

Cufflinks

Cufflinks algorithm
Input: overlap graph G of mapped reads
Output: a minimum path cover of G, with each path corresponding to a single assembled transcript

1. Alignments divided into non-overlapping loci
2. Erroneous read alignments removed
3. Compute transitive reduction of G
4. Construct bipartite graph G∗ from transitive closure of G, with edges weighted by coverage to “phase” distant exons
5. Compute minimum-cost maximum matching in G∗, which corresponds to a minimum path cover of G
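
The link between bipartite matching and path cover can be sketched as follows. This is an unweighted illustration only: it ignores the coverage-based edge costs that Cufflinks uses, assumes G is a DAG with nodes numbered 0..n−1, and uses hypothetical function names. It relies on the standard result that the minimum number of paths equals n minus the size of a maximum matching in the bipartite graph built from the transitive closure.

```python
def transitive_closure(n, edges):
    """reach[i] = set of nodes reachable from i (Warshall-style, DAG input)."""
    reach = [set() for _ in range(n)]
    for u, v in edges:
        reach[u].add(v)
    for k in range(n):
        for i in range(n):
            if k in reach[i]:
                reach[i] |= reach[k]
    return reach

def max_matching(n, adj):
    """Maximum bipartite matching via augmenting paths.
    Left and right sides are both copies of the nodes 0..n-1."""
    match_right = [None] * n

    def try_augment(u, seen):
        for v in sorted(adj[u]):
            if v in seen:
                continue
            seen.add(v)
            if match_right[v] is None or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in range(n))

def min_path_cover_size(n, edges):
    """Fewest paths (transcripts) needed to cover every node (read)."""
    return n - max_matching(n, transitive_closure(n, edges))
```

In the diamond-shaped toy graph `[(0, 1), (0, 2), (1, 3), (2, 3)]`, nodes 1 and 2 are mutually unreachable, so at least two paths are needed. A simple chain needs only one.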

Discussion

Three different construction approaches

Butterfly: enumerate all plausible transcripts with minimal read
support
PASA: for each alignment, find largest assembly (transcript)
containing the alignment
Cufflinks: find the minimal assembl(y|ies) that explain the data,
using read coverage to “phase” distant exons

Discussion

Next time: comparison of 8 Trinity assemblies

Four assembly settings
Butterfly
--PasaFly
--CuffFly
Butterfly, --min_kmer_cov 2

Two input data sets
Groomed data
Groomed data with digital normalization

Discussion

Next time: comparison of 8 Trinity assemblies

Hypotheses

(transcripts per assembly)

Butterfly > PasaFly > CuffFly
Diginorm > No diginorm

Discussion

Thank you!

