Synthetic gene design with a large number of hidden stops
Authors: Phan, V., Saha, S., Pandey, A., Wong, T-Y
Published in: Intl. Journal of Data Mining and Bioinformatics, Vol. 4, No. 4, 2010
Presented by: Khaled Monsoor, Bioinformatics Masters Program, The University of Memphis
Mail: kmonsoor@memphis.edu
Date: Nov 05, 2010
Overview
  Why?
  How?
  Result?
  Conclusion
Sleeping is a waste of precious time. Stay awake, like him …
What is the paper about?
  Can we “redesign” genes to include more hidden stops?
  How can clever computer algorithms help us?
Why do we need to?
It is now feasible to construct artificial genomes. Researchers at the J. Craig Venter Institute artificially created the genome of Mycoplasma genitalium, completed in 2010 …
Can we increase the efficiency of protein synthesis in “designed” genes?
How to increase efficiency …
  … by terminating them early
  … very long non-functional proteins
Universal Genetic Code
  Has evolved over millions of years
  A protein is a sequence of amino acids
  Encodes 20 (twenty) amino acids
Universal Genetic Code
Translation
mRNA: ATG.TCC.AAA.CCT
Protein: M S K P
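The translation step can be sketched in a few lines of Python. This is an illustrative sketch, not code from the paper or the slides: the names `CODON_TABLE` and `translate` are made up here, and the table is the standard genetic code written in the DNA alphabet (T instead of U), as the slides do.

```python
# Standard genetic code in the DNA alphabet (T instead of U), as used on the slides.
# '*' marks a stop codon. Names here are illustrative, not taken from the paper.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate a coding sequence codon by codon; trailing partial codons are ignored."""
    seq = seq.replace(".", "").upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


print(translate("ATG.TCC.AAA.CCT"))  # MSKP
```

The dots between codons are only visual separators, so the function strips them before reading triplets.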
Triplets representing P (Proline)
CCT, CCC, CCA, CCG all represent P (Proline).
A mutation in the 3rd position does not change the amino acid.
Deletion/Insertion is dangerous
A deletion creates a frame shift, which changes the entire downstream sequence.
RNA: … CAT.CAT.CAT.CAT …
Protein: … HHHH … (chain of Histidine)
Deletion of the 3rd character (T): CAC.ATC.ATC.AT
Protein: HII … something totally different!
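The frame shift in this example can be reproduced with a short Python sketch. It is only an illustration under the same assumptions as the slide (standard genetic code, DNA alphabet); `delete_base` and `translate` are hypothetical helper names.

```python
# Standard genetic code in the DNA alphabet; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate complete codons only; a trailing partial codon is dropped."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def delete_base(seq, position):
    """Remove the base at a 1-based position, shifting everything after it."""
    return seq[:position - 1] + seq[position:]


original = "CATCATCATCAT"
shifted = delete_base(original, 3)  # drop the first T

print(translate(original))  # HHHH
print(shifted)              # CACATCATCAT
print(translate(shifted))   # HII
```

Every codon after the deleted base is read differently, which is why one lost character changes the whole downstream protein.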
Like them … :-(
Regular Expression for a Protein
(start) (codon)^k (stop)
Start: ATG
Stop: TAA, TAG, TGA
Codon: any triplet not equal to TAA, TAG, or TGA
Example: ATG.ACC.AAT.CGG.TAA
Stop codon (but hidden): TGA, read across the ATG.ACC boundary
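The pattern (start)(codon)^k(stop) maps directly onto a regular expression in Python's `re` module. The sketch below is one possible encoding, assuming an upper-case DNA alphabet of A, C, G, T; `GENE_PATTERN` and `is_gene` are illustrative names.

```python
import re

# (start)(codon)^k(stop): ATG, then any number of non-stop triplets, then one stop codon.
# The negative lookahead keeps an in-frame stop from being consumed as an ordinary codon.
GENE_PATTERN = re.compile(r"ATG(?:(?!TAA|TAG|TGA)[ACGT]{3})*(?:TAA|TAG|TGA)")


def is_gene(seq):
    """True if the whole sequence has the form start, codons, stop."""
    return GENE_PATTERN.fullmatch(seq.replace(".", "").upper()) is not None


print(is_gene("ATG.ACC.AAT.CGG.TAA"))  # True
print(is_gene("ATG.TAA.ACC.TAA"))      # False: an in-frame stop appears too early
```

The hidden TGA in the example does not affect the match, because the expression only ever reads triplets in the original frame.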
Why is a hidden stop good?
Hidden stops can protect against frame shifts by terminating the resulting translation early.
Without hidden stops, frame shifts can produce very long non-functional proteins, which waste time, amino acid resources (money), and ATP (energy), and may even yield a deadly toxin.
Ref: Seligmann and Pollock, DNA and Cell Biology, 2004
Goal
Constraints: none, matching GC content, or matching codon usage
Example: protein is MSDSKED
Hidden Stops
Consider the protein MSDSKED. Both sequences encode this protein:
(1) ATG.AGT.GAT.AGT.AAA.GAA.GAC.TAA
(2) ATG.TCC.GAT.TCG.AAA.GAA.GAC.TAA
Sequence (1) is better! It has 4 hidden stops!
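The count of 4 can be checked with a short Python sketch. It assumes that a hidden stop is any TAA, TAG, or TGA triplet read in the +1 or +2 frame, which matches the example above; `hidden_stops` is an illustrative name, not a function from the paper.

```python
STOPS = {"TAA", "TAG", "TGA"}


def hidden_stops(seq):
    """Count stop triplets that start at an off-frame position (+1 or +2 frame)."""
    seq = seq.replace(".", "").upper()
    return sum(
        seq[i:i + 3] in STOPS
        for i in range(len(seq) - 2)
        if i % 3 != 0  # positions 0, 3, 6, ... belong to the real reading frame
    )


print(hidden_stops("ATG.AGT.GAT.AGT.AAA.GAA.GAC.TAA"))  # 4
print(hidden_stops("ATG.TCC.GAT.TCG.AAA.GAA.GAC.TAA"))  # 0
```

The terminating TAA is in frame, so it is not counted as hidden.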
Algorithm for No Constraint
Goal: …


Maximizing hidden stop codon on gene design

  • 38. Dynamic Programming approach. Idea: the optimal design of the whole sequence is built from optimal designs of partial sequences. $H(i, j)$ = score (number of hidden stops) of the optimal design up to the $i$-th amino acid, $A_i$, when $A_i$ is coded by its $j$-th codon
  • 39. Optimal substructure of the algorithm. This formula can be computed recursively (in linear time, $O(n)$): $H(i, j) = \max_k \{ H(i-1, k) + I_{kj} \}$, maximizing over all codons $k$ coding the previous amino acid, $A_{i-1}$. $I_{kj} = 1$ if the $k$-th codon of $A_{i-1}$ followed by the $j$-th codon of $A_i$ contains a hidden stop codon, and 0 otherwise
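The recurrence from slides 38 and 39 can be sketched in Python as follows. This is an illustrative reading of the slides, not the authors' implementation: `design`, `junction_stop`, and `CODONS_FOR` are hypothetical names, '*' stands for the terminating stop codon, and for brevity the table stores whole partial sequences instead of back-pointers.

```python
# Codons grouped by the amino acid they encode (standard code, DNA alphabet).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_FOR = {}
for i, a in enumerate(BASES):
    for j, b in enumerate(BASES):
        for k, c in enumerate(BASES):
            CODONS_FOR.setdefault(AMINO_ACIDS[16 * i + 4 * j + k], []).append(a + b + c)

STOPS = {"TAA", "TAG", "TGA"}


def junction_stop(prev, cur):
    """I_kj: 1 if a stop triplet spans the boundary between two adjacent codons."""
    pair = prev + cur
    return int(pair[1:4] in STOPS or pair[2:5] in STOPS)


def design(protein):
    """Return (maximum number of hidden stops, one optimal coding sequence)."""
    # H maps each codon j of the current residue to (H(i, j), best sequence ending in j).
    H = {codon: (0, codon) for codon in CODONS_FOR[protein[0]]}
    for residue in protein[1:]:
        H = {
            cur: max(
                (score + junction_stop(prev, cur), seq + cur)
                for prev, (score, seq) in H.items()
            )
            for cur in CODONS_FOR[residue]
        }
    return max(H.values())


score, seq = design("MSDSKED*")
print(score)  # 4
```

Each residue has at most six codons, so every step compares a bounded number of codon pairs and the number of steps grows linearly with protein length. A real implementation would keep back-pointers rather than copying strings.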
  • 40. Strategy: Back Translation (Protein → DNA). This is a 1-to-many mapping. Back translation should: satisfy constraints imposed by host genomes; serve a specific design purpose
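To make the 1-to-many mapping of slide 40 concrete, the sketch below counts how many DNA sequences encode a given protein under the standard genetic code. The names `CODONS_FOR` and `count_encodings` are illustrative, and the count ignores any host-genome constraints.

```python
from math import prod

# Codons grouped by the amino acid they encode (standard code, DNA alphabet).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_FOR = {}
for i, a in enumerate(BASES):
    for j, b in enumerate(BASES):
        for k, c in enumerate(BASES):
            CODONS_FOR.setdefault(AMINO_ACIDS[16 * i + 4 * j + k], []).append(a + b + c)


def count_encodings(protein):
    """Number of distinct coding sequences for a protein: product of codon choices."""
    return prod(len(CODONS_FOR[residue]) for residue in protein)


print(count_encodings("MSDSKED"))   # 576
print(count_encodings("MSDSKED*"))  # 1728, counting the three possible stop codons
```

Even a seven-residue protein has hundreds of encodings, which is the design freedom the algorithms exploit.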
  • 42. Constrained by GC Content. GC content = number of G and C bases in the sequence. GC content relates to the stability of DNA. Algorithm’s objectives: first maximize the number of hidden stops, then match the GC content of the host genome
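A small Python sketch of the GC measure from slide 42. One assumption differs from the slide wording: the function returns the G and C count divided by sequence length, so sequences of different lengths can be compared. `gc_content` is an illustrative name.

```python
def gc_content(seq):
    """Fraction of bases that are G or C (assumption: reported as a fraction, not a count)."""
    seq = seq.replace(".", "").upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


many_stops = "ATG.AGT.GAT.AGT.AAA.GAA.GAC.TAA"  # 4 hidden stops
no_stops = "ATG.TCC.GAT.TCG.AAA.GAA.GAC.TAA"    # 0 hidden stops

print(round(gc_content(many_stops), 3))  # 0.292
print(gc_content(no_stops))              # 0.375
```

The two encodings of MSDSKED differ in GC content, which illustrates why matching the host genome is treated as a second objective after maximizing hidden stops.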
  • 43. Algorithm considering GC content Constraint and “Fitting” approach
  • 46. Still “better” than wild-type genes
  • 47. For a particular amino acid, triplets are not distributed uniformly. For leucine, codon CUG is used 51% of the time in E. coli.
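Codon usage of the kind quoted on slide 47 can be estimated from any coding sequence. The sketch below is illustrative only: `codon_usage` is a hypothetical helper, the input is an invented toy sequence rather than E. coli data, and codons are written with T where the slide writes U (CTG corresponds to CUG).

```python
from collections import Counter

# Standard genetic code in the DNA alphabet; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_usage(seq):
    """Share of each observed codon among all codons for the same amino acid."""
    seq = seq.replace(".", "").upper()
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    totals = Counter()
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {codon: n / totals[CODON_TABLE[codon]] for codon, n in counts.items()}


usage = codon_usage("CTG.CTG.CTA.TTA")  # toy sequence: four leucine codons
print(usage["CTG"])  # 0.5
```

A design constrained by codon usage would try to keep these shares close to those measured in the host's own genes.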
  • 48. Algorithm considering Codon Usage Constraints
  • 54. Comparison: (1) “Wild type” (genes from NCBI); (2) random gene (constrained by codon usage of the wild type); (3) “optimal” design with no constraint (maximum number of hidden stop codons); (4) design constrained by GC content of the wild type; (5) design constrained by codon usage of the wild type
  • 55. Genes for re-design study . . .
  • 56. Overall comparison of all approaches: number of hidden stop codons
  • 62. Any questions? As a lagging grad student, I’ll try my best to answer …
  • 63. Thank you for attending this boring presentation … oh

Editor's Notes

  1. H = Histidine, I = Isoleucine