XPRIME: A Novel Motif Searching Method

Presentation prepared for the WNAR conference held at Portland State University in 2009

  1. XPRIME: A Novel Motif Searching Method. Rachel L. Poulsen, Department of Statistics, Brigham Young University. June 15, 2009.
  3. Introduction
     • DNA contains the genetic instructions that uniquely define an organism.
     • RNA is created to carry genetic instructions from the DNA to the rest of the cell.
     • The process of DNA "talking" to the rest of the cell is called transcription.
  6. Transcription (diagram: DNA, RNA)
  8. Position Weight Matrix (PWM) (Hertz et al. 1990)
     ETS1 TF binding motif:

     Position      1      2      3      4      5      6      7      8
     A         0.067  0.333  0.000  0.000  1.000  0.533  0.267  0.067
     C         0.933  0.600  0.000  0.000  0.000  0.133  0.067  0.400
     G         0.000  0.000  1.000  1.000  0.000  0.000  0.667  0.000
     T         0.000  0.067  0.000  0.000  0.000  0.333  0.000  0.533
  9. Sequence Logos (figure: DNA binding motif for the ETS1 TF)
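A PWM and its sequence logo carry the same information, so it can help to see how one maps to the other. The sketch below takes the ETS1 matrix from the PWM slide and computes the consensus sequence and the per-position information content in bits (2 minus the Shannon entropy), which is the usual stack height in a DNA sequence logo. The function names are illustrative and no small-sample correction is applied; the slides do not say how the logo was drawn.

```python
import math

# ETS1 PWM as given on the PWM slide: one row per base, one column per position.
ETS1_PWM = {
    "A": [0.067, 0.333, 0.0, 0.0, 1.0, 0.533, 0.267, 0.067],
    "C": [0.933, 0.600, 0.0, 0.0, 0.0, 0.133, 0.067, 0.400],
    "G": [0.000, 0.000, 1.0, 1.0, 0.0, 0.000, 0.667, 0.000],
    "T": [0.000, 0.067, 0.0, 0.0, 0.0, 0.333, 0.000, 0.533],
}


def consensus(pwm):
    """Most probable base at each motif position."""
    width = len(pwm["A"])
    return "".join(max(pwm, key=lambda base: pwm[base][j]) for j in range(width))


def information_content(pwm):
    """Bits per position: 2 - Shannon entropy (assumes no small-sample correction)."""
    width = len(pwm["A"])
    bits = []
    for j in range(width):
        entropy = -sum(
            pwm[base][j] * math.log2(pwm[base][j])
            for base in pwm
            if pwm[base][j] > 0
        )
        bits.append(2.0 - entropy)
    return bits


if __name__ == "__main__":
    print(consensus(ETS1_PWM))
    print([round(b, 2) for b in information_content(ETS1_PWM)])
```

Positions 3 to 5 are fully conserved in this matrix, so they reach the maximum of 2 bits; the flanking positions are shorter stacks in the logo.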
  14. De Novo Motif Searching
      • Regular expression enumeration
          1. Actual count vs. expected count
          2. Dictionary-based sequence model (Bussemaker et al. 2000)
      • PWM updating
          1. MEME (Bailey et al. 1995)
          2. Gibbs Motif Sampler (GMS) (Lawrence et al. 1993)
          3. BioProspector (Liu et al. 2001)
          4. AlignACE (Roth et al. 1998)
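The "actual count vs. expected count" idea can be illustrated with a small sketch. It counts overlapping occurrences of a word and compares that with the count expected if bases were drawn independently at their observed frequencies. The independent-base background is an assumption made here for illustration; the slides do not specify the null model, and the function names are hypothetical.

```python
from collections import Counter


def observed_count(seq, word):
    """Number of (possibly overlapping) occurrences of word in seq."""
    w = len(word)
    return sum(1 for i in range(len(seq) - w + 1) if seq[i:i + w] == word)


def expected_count(seq, word):
    """Expected occurrences under an independent-base model (an assumption)."""
    freqs = Counter(seq)
    n = len(seq)
    prob = 1.0
    for base in word:
        prob *= freqs[base] / n
    return (n - len(word) + 1) * prob


if __name__ == "__main__":
    seq = "GGAAGGAAGGAA"
    word = "GGAA"
    obs = observed_count(seq, word)
    exp = expected_count(seq, word)
    print(obs, exp, obs / exp)
```

A word whose observed count is far above its expected count is a candidate motif; enumeration methods repeat this comparison over all words of a given width.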
  15. Known Motif Search
      1. GREP
      2. Database search with scoring function (Hertz et al. 1990)
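A GREP-style search for a known motif amounts to matching a regular expression against the sequence. The sketch below uses Python's `re` module with a lookahead so that overlapping hits are reported. The pattern `GGA[AT]` is only an illustration, loosely read off positions 3 to 6 of the ETS1 matrix; it is not a pattern given in the slides.

```python
import re


def find_motif(seq, pattern):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # The zero-width lookahead lets the regex engine advance one base at a time.
    lookahead = re.compile(f"(?=({pattern}))")
    return [match.start() for match in lookahead.finditer(seq)]


if __name__ == "__main__":
    print(find_motif("TTCCGGAAGTGGATCC", "GGA[AT]"))
```

A regular expression gives a yes/no answer per site, whereas the scoring-function approach ranks sites by how well they fit a PWM.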
  19. XPRIME: An Improved Method
      • TRANSFAC (Matys et al. 2003)
      • Information pulled from in vitro experiments and the literature
      • Most methods justify results using TRANSFAC
      • XPRIME incorporates prior information
      • XPRIME can search for both de novo motifs and known motifs simultaneously
  23. Notation and Data
      Indices:
      • w: width of motif
      • L: length of sequence
      • m: motif indicator
      • i: position in sequence
      • j: position in motif
      • s: indicates sequence
      The data, $z_s$:
      $$ z_s = (y_{is}, \Delta_{1i}, \Delta_{2i}, \cdots, \Delta_{(m+1)i}) $$
      • $y_i$ represents the position (w-mer)
      • $\Delta_{mi}$ indicates whether $y_i$ belongs to motif $m$ or not
      • $\Delta_{(m+1)i}$ indicates whether $y_i$ belongs to the background motif or not
  24. The Scoring Function
      $$ \text{MotifScore} = f(y) = \prod_{j=1}^{w} \sum_{i \in \{A,C,G,T\}} p_{ij} \, I(y_j = i) $$
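Because the indicator picks out exactly one base per position, the scoring function reduces to multiplying the PWM entry of the observed base at each position. A minimal sketch, reusing the ETS1 matrix from the PWM slide; `motif_score` and `best_window` are illustrative names, not part of XPRIME.

```python
# ETS1 PWM from the PWM slide: pwm[base][j] is p_ij for base i at position j.
ETS1_PWM = {
    "A": [0.067, 0.333, 0.0, 0.0, 1.0, 0.533, 0.267, 0.067],
    "C": [0.933, 0.600, 0.0, 0.0, 0.0, 0.133, 0.067, 0.400],
    "G": [0.000, 0.000, 1.0, 1.0, 0.0, 0.000, 0.667, 0.000],
    "T": [0.000, 0.067, 0.0, 0.0, 0.0, 0.333, 0.000, 0.533],
}


def motif_score(wmer, pwm):
    """Product over motif positions of the probability of the observed base."""
    width = len(pwm["A"])
    if len(wmer) != width:
        raise ValueError(f"expected a {width}-mer, got length {len(wmer)}")
    score = 1.0
    for j, base in enumerate(wmer):
        score *= pwm[base][j]
    return score


def best_window(seq, pwm):
    """Slide the PWM along seq; return (start, score) of the best-scoring w-mer."""
    width = len(pwm["A"])
    starts = range(len(seq) - width + 1)
    start = max(starts, key=lambda i: motif_score(seq[i:i + width], pwm))
    return start, motif_score(seq[start:start + width], pwm)


if __name__ == "__main__":
    print(motif_score("CCGGAAGT", ETS1_PWM))
    print(best_window("TTCCGGAAGTGG", ETS1_PWM))
```

Any w-mer containing a base with probability 0 at some position scores exactly 0, which is why implementations often add pseudo-counts or work with log scores; neither is done here.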
  26. Methods: Complete Data Likelihood
      An (m+1)-component mixture model:
      $$ L(\theta \mid z) = \prod_{i=1}^{L_s} C(y_i) \, [r_1 f_1(y_i)]^{\Delta_{1i}} \, [r_2 f_2(y_i)]^{\Delta_{2i}} \cdots [r_{m+1} f_{m+1}(y_i)]^{\Delta_{(m+1)i}} $$
      where each $f(y)$ is the MotifScore equation.
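Taking logs of the complete-data likelihood turns the product into a sum, which is the form usually worked with in practice. The component index $k$ is introduced here only for compactness; the other symbols are those of the slide.

```latex
\log L(\theta \mid z)
  = \sum_{i=1}^{L_s} \left[ \log C(y_i)
  + \sum_{k=1}^{m+1} \Delta_{ki} \left( \log r_k + \log f_k(y_i) \right) \right]
```

Since exactly one $\Delta_{ki}$ equals 1 for each $i$, every position contributes the log weight and log score of the single component it is assigned to.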
  27. Methods: Priors
      • $f_{m+1}(y)$ is fixed a priori
      • The $\Delta_{(m+1)i}$'s are missing a priori
      • $f_1(y), \cdots, f_m(y)$ have product Dirichlet priors such that
        $$ \pi(f_m(y)) \propto \prod_{j=1}^{L} \prod_{k \in (A,C,G,T)} p_{mjk}^{\,a_{p_{mij}} - 1} $$
      • $r$ also has a Dirichlet prior
        $$ \pi(r) \propto \prod_{i=1}^{M} r_i^{\,a_{r_i} - 1} $$
  31. Methods: Gibbs Algorithm
      1. Draw the $\Delta$'s from a multinomial distribution:
         $$ p_{\Delta} \propto r_M \, f_M(y) $$
      2. Draw $r$ from a Dirichlet distribution:
         $$ \alpha_r = \sum_{i=1}^{L} \Delta_{Mi} + a_M $$
      3. Draw $p_{mij}$ from a Dirichlet distribution:
         $$ \alpha_{p_{mij}} = \sum_{i=1}^{L} \sum_{k=\{A,C,G,T\}} \Delta_{mi} \, I(y_{ij} = k) + a_{p_{mij}} $$
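The three draws can be sketched in a few lines of standard-library Python. This is a simplified illustration under stated assumptions, not the XPRIME implementation: it handles one motif plus a fixed background component, treats the supplied w-mers as independent, and uses hypothetical names (`gibbs_motif`, `prior_counts`, `a_r`). Dirichlet draws are built from normalized gamma variates.

```python
import random

BASES = "ACGT"


def dirichlet(alphas, rng):
    """Dirichlet draw via normalized Gamma(alpha, 1) variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def score(wmer, pwm):
    """MotifScore: product of the observed base's probability at each position."""
    s = 1.0
    for j, base in enumerate(wmer):
        s *= pwm[j][base]
    return s


def gibbs_motif(wmers, prior_counts, background, a_r=(1.0, 1.0), n_iter=200, seed=0):
    """One motif (component 0) plus a fixed background (component 1).

    prior_counts[j][base]: positive Dirichlet pseudo-counts for motif position j.
    background[base]: fixed, positive background base probabilities.
    """
    rng = random.Random(seed)
    width = len(prior_counts)
    bg_pwm = [background] * width
    # Start the motif PWM at its prior mean and the weights at 50/50 (assumption).
    pwm = [{b: pc[b] / sum(pc.values()) for b in BASES} for pc in prior_counts]
    r = [0.5, 0.5]
    delta = [1] * len(wmers)
    for _ in range(n_iter):
        # Step 1: draw each indicator with probability proportional to r_m * f_m(y).
        delta = [
            rng.choices([0, 1], weights=[r[0] * score(y, pwm), r[1] * score(y, bg_pwm)])[0]
            for y in wmers
        ]
        # Step 2: draw r from a Dirichlet with parameters (component counts + prior).
        n_motif = delta.count(0)
        r = dirichlet([n_motif + a_r[0], len(wmers) - n_motif + a_r[1]], rng)
        # Step 3: draw each motif column from a Dirichlet (base counts + prior).
        pwm = []
        for j in range(width):
            counts = dict(prior_counts[j])
            for y, d in zip(wmers, delta):
                if d == 0:
                    counts[y[j]] += 1
            pwm.append(dict(zip(BASES, dirichlet([counts[b] for b in BASES], rng))))
    return pwm, r, delta


if __name__ == "__main__":
    # Toy data: planted GGAA sites mixed with unrelated 4-mers (illustrative only).
    wmers = ["GGAA"] * 20 + ["TTTT", "CCCC", "ACGT", "TGCA", "CATG"] * 4
    prior = [{b: (5.0 if b == c else 1.0) for b in BASES} for c in "GGAA"]
    background = {b: 0.25 for b in BASES}
    pwm, r, delta = gibbs_motif(wmers, prior, background)
    print(r)
    print([max(col, key=col.get) for col in pwm])
```

Here the informative pseudo-counts play the role of prior knowledge about a known motif; flat pseudo-counts would correspond to a de novo search. A full sampler would also keep the draws from every iteration rather than only the last state.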
  32. An Example: ETS1
      We hypothesize that ETS1 has a specific binding site.
      The data:
      1. ETS1 only
      2. GABP only
      3. ETS1 and GABP
  33. ETS1 Binding Motifs (figure panels)
      (a) ETS1 from TRANSFAC
      (b) ETS1 from ETS1 only
      (c) ETS1 from GABP only
      (d) ETS1 from ETS1/GABP
  34. Justification of Prior Information (figure: Pete Hollenhorst sequence logo)
  35. Justification of Prior Information
      • Figure: Motif found without prior specification
      • Figure: Motif found with prior specification
  40. Conclusions and Future Research
      • XPRIME successfully searches for de novo and known motifs
      • Evidence found suggesting ETS1 has its own binding motif
      • Hidden Markov Models and the forward-backward algorithm
      • Prior information on r
