Finding Planted (l, d)-Motifs in Parallel
                   using Random Projection on GPUs

                                     Jhoirene Barasi Clemente

                                  Algorithms and Complexity Laboratory
                                    Department of Computer Science
                                  University of the Philippines-Diliman
                                         jbclemente@up.edu.ph

                                          March 31, 2012




J.B. Clemente (ACLab, DCS, UPD)               CUDA-FMURP                  March 31, 2012   1 / 88
Overview




     Introduction
     Definitions and Notations
     Finding Motifs using Random Projection (FMURP)
     Parallel Implementations of CUDA-FMURP
     Results and Analysis
     Conclusion




Introduction




     In this work, we are interested in solving the Planted (l, d)-Motif
     Problem using the Finding Motifs using Random Projection (FMURP)
     algorithm.
     The focus of this study is the parallelization of FMURP: we present
     three versions of the parallel algorithm and discuss the correctness
     of the parallelization.
     We implement two of these parallel algorithms on GPUs and present
     both theoretical and measured performance analyses.





Introduction




  A DNA motif is defined as a nucleic acid sequence pattern that has some
  biological significance, such as being a DNA binding site for a regulatory
  protein, i.e., a transcription factor [Das,2007].



Introduction




                                  DNA Sequences as Strings





Introduction




The pattern is fairly short (5 to 20 base pairs (bp) long) and is known to recur
      in different genes or several times within a gene [Rombauts,1999].





Notations


             Set of t sequences S.

Example 1 (Sequences S = {S0 , S1 , . . . , S(t−1) })
S0   :   C   G   G   G   G   C   T   A   T   G   G   A   A   C   T   G   G   G   T   C    G   T   C   A   C   A   T   T   C   C   C   C   T    T   T   C   G   A   T   A
S1   :   T   T   T   G   A   G   G   G   T   G   C   C   C   A   A   T   A   A   A   T    G   C   C   A   C   T   C   C   A   A   A   G   C    G   G   A   C   A   A   A
S2   :   G   G   A   T   G   C   A   A   C   T   G   A   T   G   C   C   G   T   T   T    G   A   C   G   A   C   C   T   A   A   A   T   C    A   A   C   G   G   C   C
S3   :   A   A   G   G   A   T   G   C   A   A   C   T   C   C   A   G   G   A   G   C    G   C   C   T   T   T   G   C   T   G   G   T   T    C   T   A   C   C   T   G
S4   :   A   A   T   T   T   T   C   T   A   A   A   A   A   G   A   T   T   A   T   A    A   T   G   T   C   G   G   T   C   C   A   T   G    C   A   A   C   T   T   C
S5   :   C   T   G   C   T   G   T   A   C   A   A   C   T   G   A   G   A   T   C   A    T   G   C   T   G   C   A   T   G   C   A   A   C    T   T   T   C   A   A   C
S6   :   T   A   C   A   T   G   A   T   C   T   T   T   T   G   A   T   G   C   A   A    C   G   T   G   G   A   T   G   A   G   G   G   A    A   T   G   A   T   G   C



Set of sequences S = {S0 , S1 , S2 , S3 , S4 , S5 , S6 }
defined over ΣDNA = {A, C, T, G},
where each sequence Si in S has length ni = 40 for all i ∈ {0, . . . , (t − 1)}




Notations


      An l-mer is a string of length l defined over ΣDNA .
      To denote an l-mer in S, we use
      Si,j , where i ∈ {0, 1, . . . , (t − 1)} is the sequence number
      and j ∈ {0, 1, . . . , (n − l)} is the starting position in Si .

Example 2 (Si,j in S)
For instance, an 8-mer S0,7 is

                                   ATGGAACT


S0 : C G G G G C T A T G G A A C T G G G T C G T C A C A T T C C C C T T T C G A T A
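The Si,j notation can be illustrated with a short Python sketch. The helper name `lmer` is hypothetical (it does not come from the slides), and S0 is written without the spaces used in Example 1.

```python
# S0 from Example 1, written as a plain string.
S = ["CGGGGCTATGGAACTGGGTCGTCACATTCCCCTTTCGATA"]


def lmer(S, i, j, l):
    """Return S_{i,j}: the l symbols of sequence i starting at 0-based position j."""
    if not 0 <= j <= len(S[i]) - l:
        raise ValueError("starting position must lie in {0, ..., n - l}")
    return S[i][j:j + l]


print(lmer(S, 0, 7, 8))  # the 8-mer S_{0,7} of Example 2
```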





Notations




     Let s = (a0 , a1 , . . . , a(t−1) ) be a vector of starting positions in S,
     where ai ∈ {0, 1, . . . , (n − l)}.
     Let A(s) denote the alignment formed by the l-mers in the set
     {S0,a0 , S1,a1 , . . . , S(t−1),a(t−1) }.





Notations

Example 3 (Alignment matrix A(s))
Suppose we have a starting position vector s = (7, 18, 2, 4, 30, 26, 14)

                                                  S0,7 :         A           T       G        G          A       A           C       T
                                                 S1,18 :         A           T       G        C          C       A           C       T
                                                  S2,2 :         A           T       G        C          A       A           C       T
                                 A(s)             S3,4 :         A           T       G        C          A       A           C       T
                                                 S4,30 :         A           T       G        C          A       A           C       T
                                                 S5,26 :         A           T       G        C          A       A           C       T
                                                 S6,14 :         A           T       G        C          A       A           C       G

S0   :   C   G   G   G   G   C   T   A   T   G   G   A   A   C   T   G   G   G   T   C   G   T   C   A   C   A   T   T   C   C   C   C   T   T   T   C   G   A   T    A
S1   :   T   T   T   G   A   G   G   G   T   G   C   C   C   A   A   T   A   A   A   T   G   C   C   A   C   T   C   C   A   A   A   G   C   G   G   A   C   A   A    A
S2   :   G   G   A   T   G   C   A   A   C   T   G   A   T   G   C   C   G   T   T   T   G   A   C   G   A   C   C   T   A   A   A   T   C   A   A   C   G   G   C    C
S3   :   A   A   G   G   A   T   G   C   A   A   C   T   C   C   A   G   G   A   G   C   G   C   C   T   T   T   G   C   T   G   G   T   T   C   T   A   C   C   T    G
S4   :   A   A   T   T   T   T   C   T   A   A   A   A   A   G   A   T   T   A   T   A   A   T   G   T   C   G   G   T   C   C   A   T   G   C   A   A   C   T   T    C
S5   :   C   T   G   C   T   G   T   A   C   A   A   C   T   G   A   G   A   T   C   A   T   G   C   T   G   C   A   T   G   C   A   A   C   T   T   T   C   A   A    C
S6   :   T   A   C   A   T   G   A   T   C   T   T   T   T   G   A   T   G   C   A   A   C   G   T   G   G   A   T   G   A   G   G   G   A   A   T   G   A   T   G    C




Notations

      A profile matrix P(s) of dimension (|ΣDNA | × l) is derived
      from the frequency of each letter in each column of A(s).

Example 4 (Profile Matrix P(s))
                                    S0,7 :   A     T        G       G       A   A   C   T
                                   S1,18 :   A     T        G       C       C   A   C   T
                                    S2,2 :   A     T        G       C       A   A   C   T
                       A(s)         S3,4 :   A     T        G       C       A   A   C   T
                                   S4,30 :   A     T        G       C       A   A   C   T
                                   S5,26 :   A     T        G       C       A   A   C   T
                                   S6,14 :   A     T        G       C       A   A   C   G

                                     A:      7     0        0        0      6   7   0   0
                                     T:      0     7        0        0      0   0   0   6
                       P(s)          C:      0     0        0        6      1   0   7   0
                                     G:      0     0        7        1      0   0   0   1
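The construction of P(s) from A(s) can be sketched in Python as follows. The function name `profile` and the dictionary layout are illustrative assumptions; the alignment rows are those of Example 4.

```python
ALPHABET = "ATCG"  # row order used in Example 4

# Rows of the alignment matrix A(s) from Example 4.
alignment = [
    "ATGGAACT",
    "ATGCCACT",
    "ATGCAACT",
    "ATGCAACT",
    "ATGCAACT",
    "ATGCAACT",
    "ATGCAACG",
]


def profile(alignment, alphabet=ALPHABET):
    """Count, for every column of the alignment, how often each letter occurs."""
    l = len(alignment[0])
    return {a: [sum(row[c] == a for row in alignment) for c in range(l)]
            for a in alphabet}


P = profile(alignment)
for a in ALPHABET:
    print(a, P[a])
```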


Notations

      From P(s), we define MP(s) (j), where 0 ≤ j ≤ (l − 1), to be the maximum
      entry in the jth column of the profile matrix.

Example 5 (MP(s),j )
                                    S0,7 :   A     T        G       G       A   A   C   T
                                   S1,18 :   A     T        G       C       C   A   C   T
                                    S2,2 :   A     T        G       C       A   A   C   T
                       A(s)         S3,4 :   A     T        G       C       A   A   C   T
                                   S4,30 :   A     T        G       C       A   A   C   T
                                   S5,26 :   A     T        G       C       A   A   C   T
                                   S6,14 :   A     T        G       C       A   A   C   G

                                     A:      7     0        0        0      6   7   0   0
                                     T:      0     7        0        0      0   0   0   6
                       P(s)          C:      0     0        0        6      1   0   7   0
                                     G:      0     0        7        1      0   0   0   1


Notations
      A consensus string is an l-mer, where each of its elements is the
      nucleotide base corresponding to MP(s) (i).

Example 6 (Consensus String)
                                    S0,7 :    A      T      G    G   A   A   C       T
                                   S1,18 :    A      T      G    C   C   A   C       T
                                    S2,2 :    A      T      G    C   A   A   C       T
              A(s)                  S3,4 :    A      T      G    C   A   A   C       T
                                   S4,30 :    A      T      G    C   A   A   C       T
                                   S5,26 :    A      T      G    C   A   A   C       T
                                   S6,14 :    A      T      G    C   A   A   C       G

                                     A:       7      0      0    0   6   7   0        0
                                     T:       0      7      0    0   0   0   0        6
              P(s)                   C:       0      0      0    6   1   0   7        0
                                     G:       0      0      7    1   0   0   0        1

              Consensus String                A      T      G    C   A   A   C       T

Notations

      We define Score(s, S) to be

          Score(s, S) = \sum_{i=0}^{l-1} M_{P(s)}(i).                        (1)


Example 7 (Consensus Score())
                                   A:   7     0        0       0       6   7    0   0
                                   T:   0     7        0       0       0   0    0   6
                           P(s)    C:   0     0        0       6       1   0    7   0
                                   G:   0     0        7       1       0   0    0   1


                   Score(s, S) = 7 + 7 + 7 + 6 + 6 + 7 + 7 + 6 = 53
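As a small check of the column maxima, the consensus string, and the score, the following Python sketch starts from the profile matrix of Example 7. The function names are hypothetical and chosen only for this illustration.

```python
# Profile matrix P(s) from Example 7, one list of column counts per letter.
P = {
    "A": [7, 0, 0, 0, 6, 7, 0, 0],
    "T": [0, 7, 0, 0, 0, 0, 0, 6],
    "C": [0, 0, 0, 6, 1, 0, 7, 0],
    "G": [0, 0, 7, 1, 0, 0, 0, 1],
}


def column_maxima(P):
    """M_{P(s)}(j): the largest count in column j of the profile."""
    l = len(next(iter(P.values())))
    return [max(P[a][j] for a in P) for j in range(l)]


def consensus(P):
    """The letter attaining the column maximum, for every column."""
    l = len(next(iter(P.values())))
    return "".join(max(P, key=lambda a: P[a][j]) for j in range(l))


def score(P):
    """Score(s, S): the sum of the column maxima."""
    return sum(column_maxima(P))


print(column_maxima(P), consensus(P), score(P))
```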



Motif Finding Problem



Definition 8 (Motif Finding Problem [Pevzner,2004])

INPUT:
      A motif length l
      A set of t sequences S = {S0 , S1 , S2 , . . . , S(t−1) },
      where each Si is of length ni
OUTPUT:
      An array of starting positions s = (a0 , a1 , . . . , a(t−1) )
      maximizing consensus Score(s,S)





Naive MFP Solver [Pevzner,2004]

Input: DNA (sequences), motif length l
Output: Starting position s and consensus string corresponding to s
  1   For each possible starting position in S,
      i.e. s ∈ {(0, 0, . . . , 0), . . . , ((n − l), (n − l) . . . , (n − l))}.
          1   Get alignment A(s).
          2   Compute P(s).
          3   Evaluate Score(s, S).
  2   From s with the maximum Score, get the consensus string.
  3   Output consensus string.
Step 1 iterates (n − l + 1)^t times, because the set of all possible
starting position vectors is

                     { s = (a0 , a1 , . . . , a(t−1) ) : ai ∈ {0, . . . , (n − l)} for all i }.
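The exhaustive search can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the solver from [Pevzner,2004]: the function names and the three toy sequences are made up, and ties are broken by keeping the first best vector encountered.

```python
from itertools import product


def score_and_consensus(lmers, alphabet="ACGT"):
    """Return (Score, consensus string) for a list of aligned l-mers."""
    total, cons = 0, []
    for column in zip(*lmers):
        counts = {a: column.count(a) for a in alphabet}
        best = max(alphabet, key=counts.get)
        total += counts[best]
        cons.append(best)
    return total, "".join(cons)


def naive_mfp(S, l):
    """Try every starting position vector s and keep the best-scoring one."""
    ranges = [range(len(Si) - l + 1) for Si in S]
    best = (-1, None, None)
    for s in product(*ranges):  # (n - l + 1)^t vectors when all lengths equal n
        lmers = [Si[a:a + l] for Si, a in zip(S, s)]
        sc, cons = score_and_consensus(lmers)
        if sc > best[0]:
            best = (sc, s, cons)
    return best


# Hypothetical toy input: t = 3 sequences of length n = 6, motif length l = 3.
S = ["ACGTAC", "TTACGT", "GACGTT"]
best_score, best_s, best_consensus = naive_mfp(S, 3)
print(best_score, best_s, best_consensus)
```

Even on this toy input the loop visits 4^3 = 64 vectors, which shows why the exponent t makes the approach impractical for realistic inputs.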




Definitions


Definition 9 (Challenge Problem [Pevzner,2000])
INPUT:
      Motif length l = 15,
      Expected mismatches d = 4,
      t = 20 DNA sequences, each with ni = 600 nucleotide bases
OUTPUT:
      A consensus string M from an alignment A(s), where each l-mer Si,ai
      in A(s) satisfies

                                          dE (M, Si,ai ) = 4,
      for all i ∈ {0, . . . , (t − 1)}.




Why challenging?

Suppose we have A(s),

   S0,a0      A  C  T  T  G  G  G  G  C  A  A  G  A  G  G
   S1,a1      G  G  A  C  G  G  G  G  C  A  G  A  C  T  G
   S2,a2      A  C  T  T  G  C  T  A  A  A  G  A  C  T  G
   S3,a3      A  C  T  G  C  G  G  G  C  A  C  A  G  T  G
   S4,a4      A  C  C  T  G  G  G  T  C  G  T  A  C  T  G
    A:        4  0  1  0  0  0  0  1  1  4  1  4  1  0  0
    C:        0  4  1  1  1  1  0  0  4  0  1  0  3  0  0
    T:        0  0  3  3  0  0  1  1  0  0  1  0  0  4  0
    G:        1  1  0  1  4  4  4  3  0  1  2  1  1  1  5
   Consensus  A  C  T  T  G  G  G  G  C  A  G  A  C  T  G

                                       dE (S0,a0 , S1,a1 ) = 2d = 8
Score(s, S) = 4 + 4 + 3 + 3 + 4 + 4 + 4 + 3 + 4 + 4 + 2 + 4 + 3 + 4 + 5 = 55
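The difficulty can be checked numerically with a short Python sketch. It assumes that dE counts mismatched positions (Hamming distance); the function name `hamming` is hypothetical, and the strings are the alignment rows and consensus string shown above.

```python
def hamming(u, v):
    """Number of positions at which two equal-length strings differ."""
    if len(u) != len(v):
        raise ValueError("strings must have equal length")
    return sum(a != b for a, b in zip(u, v))


M = "ACTTGGGGCAGACTG"  # consensus string of the alignment
rows = [
    "ACTTGGGGCAAGAGG",  # S_{0,a0}
    "GGACGGGGCAGACTG",  # S_{1,a1}
    "ACTTGCTAAAGACTG",  # S_{2,a2}
    "ACTGCGGGCACAGTG",  # S_{3,a3}
    "ACCTGGGTCGTACTG",  # S_{4,a4}
]

to_consensus = [hamming(M, r) for r in rows]
between_first_two = hamming(rows[0], rows[1])
print(to_consensus, between_first_two)
```

Each row is within d = 4 of the consensus, yet two rows can be as far as 2d = 8 apart, so pairwise comparison of occurrences carries only a weak signal.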


Definitions

Definition 10 (Planted (l, d)-Motif Finding Problem [Tompa,2001])

INPUT:
      Motif length l,
      Expected number of mismatches d, and
      A set of t sequences S = {S0 , S1 , S2 , . . . , S(t−1) }, where each Si is of
      length ni
OUTPUT:


      A consensus string M from an alignment A(s), where each l-mer Si,ai
      in A(s) satisfies
                                dE (M, Si,ai ) = d,
      for all i ∈ {0, . . . , (t − 1)}.


Solutions for Planted (l, d)-Motif Finding




     SP-STAR [Pevzner,2000]
     Winnower [Pevzner,2000]
     Random Projection [Tompa,2001]
     Aggregation [Mohammed,2004]
     GibbsDST [Shida,2006]




Finding Motifs using Random Projection (FMURP)


INPUT: Set of sequences S, motif length l, expected mismatches d, projection
dimension k, and bucket threshold δ
OUTPUT: Motif
  1   Projection
          1   Get all l-mers Si,j in S.
          2   Get the projection hI (Si,j ) of each Si,j in S.
          3   Hash each Si,j to the bucket with identifier hI (Si,j ).
          4   Get the enriched buckets.
  2   Refine each enriched bucket using EM.
  3   Refine each enriched bucket using SP-STARσ.
  4   Maximize the score to output the best motif.








Definition 11 (Random Projection)
Given an l-mer Si,j , a projection dimension k, and a set
I ⊂ L = {0, . . . , (l − 1)} with |I| = k, whose elements are chosen randomly
from L and sorted in increasing order, the k-dimensional projection of Si,j is
                     hI (Si,j ) = Si,j (I0 ), Si,j (I1 ), . . . , Si,j (I(k−1) ),
where hI (Si,j ) is a k-mer and Ii denotes the ith element of I.
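Definition 11 can be sketched in Python as follows. The names `random_positions` and `project` are hypothetical, and drawing I with `random.sample` is one possible way to choose k distinct positions, assumed here for illustration.

```python
import random


def random_positions(l, k, rng=random):
    """Draw k distinct positions from {0, ..., l - 1}, sorted in increasing order."""
    return sorted(rng.sample(range(l), k))


def project(lmer, I):
    """h_I(lmer): the k-mer formed by the symbols of lmer at the positions in I."""
    return "".join(lmer[p] for p in I)


I = random_positions(8, 3, random.Random(0))  # seeded only for reproducibility
print(I, project("ATGGAACT", I))
print(project("ATGGAACT", [0, 3, 5]))
```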





FMURP: Example


Example 12
Given a set of DNA sequences S, pattern length l = 4, projection dimension
k = 2, and bucket threshold δ = 3.

                            S0 :      C      G       G        T   C   A   G   G
                            S1 :      T      T       C        G   A   C   A   T
                            S2 :      A      C       G        A   T   G   A   A
                          Figure: Set of t = 3 sequences each with n = 8


Let I = {0, 1}.
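The projection and hashing steps for Example 12 can be traced with the Python sketch below. Variable names are illustrative, and a bucket is treated as enriched when it holds at least δ l-mers.

```python
from collections import defaultdict

# Sequences and parameters of Example 12.
S = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]
l, I, delta = 4, [0, 1], 3

# Hash every l-mer S_{i,j} into the bucket named by its projection h_I(S_{i,j}).
buckets = defaultdict(list)
for i, Si in enumerate(S):
    for j in range(len(Si) - l + 1):
        lmer = Si[j:j + l]
        key = "".join(lmer[p] for p in I)
        buckets[key].append((i, j))

# Keep only the buckets that reach the threshold.
enriched = {key: members for key, members in buckets.items()
            if len(members) >= delta}
print(enriched)
```

With these inputs only the bucket with identifier CG reaches the threshold, containing S0,0, S1,2, and S2,1.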




[Figures: Projection]

Parallel Motif Finding using Random Projection

How do we parallelize FMURP?

Sequential FMURP:
  1   Projection
          1   Get all l-mers Si,j in S.
          2   Get projection hI (Si,j ) for each Si,j in S.
          3   Hash each Si,j to buckets with identifier hI (Si,j ).
          4   Get enriched buckets.
  2   Refine each enriched bucket using EM
  3   Refine each enriched bucket using SP-STARσ
  4   Maximize score to output best motif

Parallel FMURP:
  1   Projection
          1   Get all l-mers Si,j in S in parallel.
          2   Get projection hI (Si,j ) for each Si,j in S in parallel.
          3   Hash each Si,j to buckets with identifier hI (Si,j ) in parallel.
          4   Get enriched buckets in parallel.
  2   Refine each enriched bucket using EM in parallel
  3   Refine each enriched bucket using SP-STARσ in parallel
  4   Maximize score to output best motif.


Parallel Algorithms for Motif Finding




     CUDA-MEME
     CUDA-Gibbs Sampling





CUDA




Parallel Motif Finding using Random Projection (CUDA-FMURP v1)

Computing Framework




          Figure: Flowchart showing the processes done in the CPU and GPU


CUDA-FMURP v1




Figure: The thread ID is denoted by an ordered pair (i, j), 0 ≤ i ≤ w and 0 ≤ j ≤ v, where v is
the maximum number of threads per block and w is the number of allocated thread blocks in the
grid. The algorithm uses a total of x = t · (n − l + 1) threads that are linearly arranged on the GPU.
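One plausible way to map a linearly arranged thread to the l-mer it handles is sketched below in Python. The mapping (a flat index `block * v + thread`, split by `divmod`) and the function name are assumptions made for illustration, with 0-based block and thread indices; the slides do not spell out the exact indexing.

```python
import math


def thread_to_lmer(block, thread, v, t, n, l):
    """Map (block, thread) to (sequence index, start position), or None if idle."""
    per_sequence = n - l + 1          # number of l-mers in one sequence
    tid = block * v + thread          # flat, linearly arranged thread ID
    if tid >= t * per_sequence:       # surplus threads in the last block
        return None
    return divmod(tid, per_sequence)


# Sizes of Example 12 with a hypothetical block size of v = 4 threads.
t, n, l, v = 3, 8, 4, 4
x = t * (n - l + 1)                   # total number of working threads
w = math.ceil(x / v)                  # blocks needed to cover x threads
covered = [thread_to_lmer(b, th, v, t, n, l)
           for b in range(w) for th in range(v)]
print(x, w, covered)
```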





CUDA-FMURP v1
INPUT: Set of sequences S, motif length l, expected mismatches d, projection dimension k,
and bucket threshold δ
OUTPUT: Motif
  1   On the CPU, generate k random positions for projection.
      Let this be the set I = {î | î ∈ {0, . . . , (l − 1)}} with |I| = k.
  2   On the GPU, for each thread tid in {0, . . . , (x − 1)},
          1   Get the projection hI (Si,j ) of each Si,j in S.
          2   Convert each k-mer hI (Si,j ) to its corresponding integer representation k*_{i,j}.
          3   Perform a linear search over all k*_{i,j} to determine which l-mers
              are ‘hashed’ to the same bucket. The tids of the matching k*_{i,j} are recorded
              instead of the actual l-mers.
  3   On the CPU, identify the set of enriched buckets
      and prune duplicates in preparation for EM refinement.
  4   On the GPU, for each tid in {0, . . . , (e − 1)},
          1   Perform EM refinement for each enriched bucket.
          2   Perform SP-STARσ for each enriched bucket.
          3   Maximize the σ score to output the best motif.
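Steps 2.3 and 3 can be simulated sequentially with the Python sketch below. It is not GPU code: each list element stands in for one thread, the function names are hypothetical, and the integer keys are made-up values rather than results from the slides.

```python
def linear_search_buckets(keys):
    """For every thread tid, record the tids whose integer key equals keys[tid]."""
    return [tuple(u for u, key_u in enumerate(keys) if key_u == key_tid)
            for key_tid in keys]


def enriched_buckets(matches, delta):
    """CPU step: keep buckets with at least delta members and prune duplicates."""
    return sorted({m for m in matches if len(m) >= delta})


# Hypothetical integer keys k*_{i,j}, one per thread.
keys = [9, 2, 9, 5, 2, 9]
matches = linear_search_buckets(keys)
print(matches)
print(enriched_buckets(matches, delta=3))
```

Threads 0, 2, and 5 each discover the same bucket, which is why the CPU step has to prune duplicates before refinement.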

Integer Conversion
Step 2.2 maps each hI (Si,j ) to its corresponding integer representation
k*_{i,j}. Given a k-mer obtained from projection, the corresponding integer is
computed using the following mapping. Let us define

    f : ΣDNA → {0, 1, 2, 3},    A ↦ 0,  C ↦ 1,  G ↦ 2,  T ↦ 3,

where each symbol in the DNA alphabet is mapped to a unique integer.
For a string v of length k,

    f* : Σ+DNA → Z+ ∪ {0},    f*(v) = \sum_{i=0}^{k-1} f(v_i) 4^i,

where vi denotes the symbol at the ith position, counted from the least
significant digit, and the integer representation takes values in the
non-negative integers.
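The conversion can be sketched in Python as follows. The name `f_star` is hypothetical, and the sketch assumes that the rightmost symbol of the k-mer is the least significant digit, which the slides do not state explicitly.

```python
# The symbol mapping f from the definition above.
F = {"A": 0, "C": 1, "G": 2, "T": 3}


def f_star(v):
    """Base-4 value of the k-mer v, rightmost symbol least significant (assumed)."""
    return sum(F[c] * 4 ** i for i, c in enumerate(reversed(v)))


for kmer in ("AA", "AC", "CG", "TT"):
    print(kmer, f_star(kmer))
```

Under this convention a k-mer of length 2 maps into {0, ..., 15}, for instance CG gives 1 · 4 + 2 = 6.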

CUDA-Projection v1: Example
Given a set of DNA sequences, pattern length l = 4, projection dimension
k = 2, and bucket threshold δ = 3. Projection in parallel is shown as follows




[Figure: CUDA-Projection v1: Integer Conversion example]
[Figure: CUDA-Projection: Parallel Integer Conversion Example]
[Figure: CUDA-Projection: Getting enriched buckets]
[Figure: CUDA-EM]
[Figure: CUDA-SP-STARσ]

Correctness v1

The uniqueness of the representation defined by f* follows from the
results below.
Let Σk = {0, 1, 2, . . . , k − 1}, and let Ck be the regular language

    C_k = {ε} ∪ (Σ_k − {0}) Σ_k^*.

Theorem 4.1 (Fundamental Theorem of base-k Representation
[Allouche,2003])
Let k ≥ 2 be an integer. Then every non-negative integer has a unique
representation of the form

    N = \sum_{i=0}^{t} a_i k^i,

where a_t ≠ 0 and 0 ≤ a_i < k for 0 ≤ i ≤ t.






In the case of our representation f*, the base is k = 4 and ai = f (vi ), where
vi ∈ ΣDNA . Note that the mapping f is one-to-one and onto by definition. Thus
we have the following:
Proposition 4.1

f* provides a unique representation of hI (Si,j ), for each i, j, and element of I.
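The uniqueness claim can be checked by brute force for a small projection dimension. The Python sketch below redefines a hypothetical `f_star` (rightmost symbol least significant, an assumed convention) and verifies that the 4^k k-mers of a fixed length k receive 4^k distinct integers.

```python
from itertools import product

F = {"A": 0, "C": 1, "G": 2, "T": 3}


def f_star(v):
    """Base-4 value of the k-mer v, rightmost symbol least significant (assumed)."""
    return sum(F[c] * 4 ** i for i, c in enumerate(reversed(v)))


k = 3  # small projection dimension chosen only for this check
codes = [f_star("".join(p)) for p in product("ACGT", repeat=k)]
print(len(codes), len(set(codes)), min(codes), max(codes))
```

The check is for a fixed length k. Because A maps to 0, strings of different lengths can share a value (for example C and AC), so the length has to be held constant, as it is for projections of dimension k.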




 J.B. Clemente (ACLab, DCS, UPD)                        CUDA-FMURP                                     March 31, 2012     44 / 88
Parallel Motif Finding using Random Projection   Parallel Motifs Finding using Random Projection (CUDA-FMURP v1)

Correctness v1

 1   In CPU, generate k random positions for projection.
     Let this be the set I = {î | î ∈ {0, . . . , (l − 1)}} with |I| = k.
 2   In GPU, for each thread tid in {0, . . . , (x − 1)},
         1   Get h_I(S_{i,j}) for each S_{i,j} in S.
         2   Convert each k-mer h_I(S_{i,j}) to its corresponding integer representation k*_{i,j}.
         3   Perform a linear search over all k*_{i,j}s to determine which l-mers
             are 'hashed' in the same bucket. The tids of the matched k*_{i,j}s are noted
             instead of the actual l-mers.
 3   In CPU, identify the set of enriched buckets,
     and prune duplicates in preparation for EM refinement.
 4   In GPU, for each tid in {0, . . . , (e − 1)},
         1   Perform EM refinement for each enriched bucket.
         2   Perform SP-STARσ for each enriched bucket.
         3   Maximize σ score to output best motif.
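
Steps 1 to 3 above can be sketched sequentially in Python, with each loop iteration standing in for one GPU thread. This is only an illustration under stated assumptions: the function names are hypothetical, the digit assignment for f is assumed, all sequences are assumed to have the same length n, and the input reuses the three sequences of the projection example in the extra slides.

```python
def project(lmer, positions):
    """h_I: keep only the symbols of the l-mer at the positions in I."""
    return "".join(lmer[p] for p in positions)


def encode(kmer):
    """Base-4 integer for a k-mer (the digit assignment is an assumption)."""
    digits = {"A": 0, "C": 1, "G": 2, "T": 3}
    value = 0
    for symbol in kmer:
        value = value * 4 + digits[symbol]
    return value


def v1_enriched_buckets(seqs, l, positions, delta):
    per_seq = len(seqs[0]) - l + 1      # l-mers per sequence, n - l + 1
    x = len(seqs) * per_seq             # total number of l-mers (threads)

    # Steps 2.1-2.2: each iteration plays the role of one thread tid.
    keys = []
    for tid in range(x):
        i, j = divmod(tid, per_seq)
        keys.append(encode(project(seqs[i][j:j + l], positions)))

    # Step 2.3: every thread scans all keys and records the matching tids.
    matches = [tuple(q for q in range(x) if keys[q] == keys[tid])
               for tid in range(x)]

    # Step 3: keep buckets with at least delta members and prune duplicates.
    return sorted({m for m in matches if len(m) >= delta})


seqs = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]
enriched = v1_enriched_buckets(seqs, l=4, positions=(0, 1), delta=3)
```

The per-thread linear scan is what makes each thread do O(x) work, and it is also why the same bucket is found once per member and must be pruned on the CPU.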



Correctness v1

We have to show that the set of enriched buckets EB obtained in FMURP is
equivalent to the set of enriched buckets ĒB obtained in CUDA-FMURP v1.

                    EB = {B | |B| ≥ δ}.
Two elements S_{i,j} and S_{i',j'} belong to the same bucket B if they satisfy the
relation R defined below.
Definition 13 (Relation R)
        (S_{i,j}, S_{i',j'}) ∈ B   ⇔   (S_{i,j}, S_{i',j'}) ∈ R
        (S_{i,j}, S_{i',j'}) ∈ R   ⇔   h_I(S_{i,j}) = h_I(S_{i',j'})

Proposition 4.2
        R is an equivalence relation.



Correctness v1
In CUDA-FMURP v1, the set of enriched buckets is defined as

                    ĒB = {B̄ | |B̄| ≥ δ},

where B̄ is a bucket in CUDA-FMURP and two elements p and q belong to
the same bucket B̄ if they satisfy the relation R̄ defined below.
Definition 14 (Relation R̄)
        (p, q) ∈ B̄   ⇔   (p, q) ∈ R̄
        (p, q) ∈ R̄   ⇔   k*_{i,j} = k*_{ī,j̄}

where i = ⌊p/(n − l + 1)⌋, j = p mod (n − l + 1), ī = ⌊q/(n − l + 1)⌋, and
j̄ = q mod (n − l + 1).

Lemma 15
        Relations R and R̄ are equivalent.


Correctness
Note that the elements of B are l-mers S_{i,j}, while the elements of B̄ are
integers p ∈ {0, . . . , (x − 1)}. Using the equations

        tid = i × (n − l + 1) + j                    (2)

        i = ⌊tid / (n − l + 1)⌋                      (3)

        j = tid mod (n − l + 1)                      (4)

we can retrieve the l-mer S_{i,j} corresponding to tid and vice versa. The theorem
below follows from the fact that R and R̄ are equivalent.
Theorem 4.2
        The sets of enriched buckets EB and ĒB are equivalent.
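
Equations (2) to (4) can be checked with a short Python sketch. The function names `to_tid` and `from_tid` are hypothetical, the parameter values are taken from the first row of the dataset table, and x = t(n − l + 1) is inferred from Equation (2) rather than stated on this slide.

```python
def to_tid(i, j, n, l):
    """Equation (2): flatten the pair (i, j) into a thread id."""
    return i * (n - l + 1) + j


def from_tid(tid, n, l):
    """Equations (3) and (4): recover (i, j) by floor division and remainder."""
    return divmod(tid, n - l + 1)


t, n, l = 20, 600, 10
x = t * (n - l + 1)          # number of l-mers, and of thread ids

# Every thread id maps to one (i, j) pair and back again.
round_trip_ok = all(to_tid(*from_tid(tid, n, l), n, l) == tid
                    for tid in range(x))
```

The round trip succeeds for every tid because 0 ≤ j < n − l + 1, which is exactly the condition under which floor division and remainder invert Equation (2).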


CUDA-FMURP v2
INPUT: Set of sequences S, motif length l, expected mismatches d, projection
dimension k, and bucket threshold δ
OUTPUT: Motif
  1   In CPU, generate k random positions for projection.
      Let this be the set I = {î | î ∈ {0, . . . , (l − 1)}} with |I| = k.
  2   In GPU, for each thread tid in {0, . . . , (x − 1)},
          1   Get h_I(S_{i,j}) for all S_{i,j} in S,
              where i ∈ {0, . . . , (t − 1)} and j ∈ {0, . . . , (n − l)}.
          2   Convert each k-mer h_I(S_{i,j}) to its corresponding
              integer representation k*_{i,j}.
  3   In CPU, hash the list of k*_{i,j}s.
  4   In CPU, identify the set of enriched buckets.
  5   In GPU, for each tid in {0, . . . , (e − 1)},
          1   Perform EM refinement for each enriched bucket.
          2   Perform SP-STARσ for each enriched bucket.
          3   Maximize σ score to output best motif.
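
Steps 3 and 4 of v2 can be sketched in Python, where a built-in dictionary stands in for the CPU hash table. The function name `enriched_buckets` is hypothetical, and the key list is an assumed input: it corresponds to the projection example in the extra slides with I = {0, 1} under an assumed digit assignment A=0, C=1, G=2, T=3.

```python
from collections import defaultdict


def enriched_buckets(keys, delta):
    """Group thread ids by integer key and keep buckets of size >= delta."""
    table = defaultdict(list)
    for tid, key in enumerate(keys):
        table[key].append(tid)
    return {key: tids for key, tids in table.items() if len(tids) >= delta}


# One key per thread id, as the GPU step would return them.
keys = [6, 10, 11, 13, 4, 15, 13, 6, 8, 1, 1, 6, 8, 3, 14]
enriched = enriched_buckets(keys, delta=3)
```

Compared with the per-thread linear search of v1, each key is visited once here, so no duplicate buckets need to be pruned afterwards.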
Hash Table in CPU


To resolve collisions between two items with different k*_{i,j}s, linear probing is
implemented.
Suppose we want to hash item p with key k*_{i',j'} and find that position h(k*_{i',j'}) is not
empty, i.e., ∃ k*_{i,j} such that h(k*_{i,j}) = h(k*_{i',j'}) and k*_{i,j} ≠ k*_{i',j'}.

We then have to look for an empty position in the table where we can place item p.
We explore positions

        h'(k*_{i',j'}, i) = (h(k*_{i,j}) + i) mod x

for i from 0 to (m − 1), until an empty hash table position is found.
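
The probing scheme can be sketched in Python as follows. This is an illustration under assumptions: the hash function h is not specified on the slide, so h(key) = key mod table size is assumed, and a single table size is used for both the modulus (written x on the slide) and the probe bound (written m − 1). The function name `insert` is hypothetical.

```python
def insert(table, key, item):
    """Place item under key, probing linearly from the home slot."""
    size = len(table)
    home = key % size                    # assumed hash function h
    for step in range(size):
        slot = (home + step) % size
        if table[slot] is None:          # empty position found
            table[slot] = (key, [item])
            return slot
        if table[slot][0] == key:        # same key, so same bucket
            table[slot][1].append(item)
            return slot
    raise RuntimeError("hash table is full")


table = [None] * 5
slots = [insert(table, 6, 0), insert(table, 11, 1), insert(table, 6, 2)]
```

Keys 6 and 11 share the same home slot in a table of size 5, so the second insert moves on to the next position, while the third insert finds its key already present and joins that bucket.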





CUDA-FMURP v3
INPUT: Set of sequences S, motif length l, expected mismatches d, projection
dimension k, and bucket threshold δ
OUTPUT: Motif
  1   In CPU, generate k random positions for projection.
      Let this be the set I = {î | î ∈ {0, . . . , (l − 1)}} with |I| = k.
  2   In GPU, for each thread tid in {0, . . . , (t − 1)},
          1   Get h_I(S_{tid,j}) for all S_{tid,j} in S,
              where j ∈ {0, . . . , (n − l)}.
          2   Convert each k-mer h_I(S_{tid,j}) to its corresponding
              integer representation k*_{tid,j}.
  3   In CPU, hash the list of k*_{i,j}s.
  4   In CPU, identify the set of enriched buckets.
  5   In GPU, for each tid in {0, . . . , (e − 1)},
          1   Perform EM refinement for each enriched bucket.
          2   Perform SP-STARσ for each enriched bucket.
          3   Maximize σ score to output best motif.
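
Step 2 of v3 differs from v2 in that one thread handles a whole sequence instead of a single l-mer. A sequential Python sketch, in which each outer iteration stands in for one of the t threads, might look as follows. The function name `v3_keys`, the digit assignment for f, and the example input are assumptions made for illustration.

```python
def v3_keys(seqs, l, positions):
    """Compute the integer keys of all l-mers, one 'thread' per sequence."""
    digits = {"A": 0, "C": 1, "G": 2, "T": 3}    # assumed mapping f
    keys = [None] * len(seqs)
    for tid, seq in enumerate(seqs):             # thread tid owns sequence tid
        row = []
        for j in range(len(seq) - l + 1):        # all l-mers S_{tid,j}
            key = 0
            for p in positions:                  # project and encode in one pass
                key = key * 4 + digits[seq[j + p]]
            row.append(key)
        keys[tid] = row
    return keys


keys = v3_keys(["CGGTCAGG", "TTCGACAT", "ACGATGAA"], l=4, positions=(0, 1))
```

Each thread now performs n − l + 1 projections, so fewer threads are needed at the cost of more work per thread.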
[Figure: CUDA-Projection v3]

[Figure: Integer Conversion]

Result and Analysis

Running Time and Space Complexity




 Algorithm            Time           Space               Number of Processors
 FMURP                O(x log x)     O(x)                1
 SEQ-FMURP            O(x^2)         O(e(n − l + 1))     1
 CUDA-FMURP v1        O(x)           O(e(n − l + 1))     x
 CUDA-FMURP v2        O(x)           O(e(n − l + 1))     x
 CUDA-FMURP v3        O(x)           O(e(n − l + 1))     t

Table: Total running time and space complexity of the three parallel versions of
CUDA-FMURP in comparison with the two sequential implementations.





Speedup and Efficiency


FMURP: O(x log x)
Speedup is the ratio of the sequential running time to the parallel running
time:

        SP = Sequential / Parallel

The speedups SP_1, SP_2, and SP_3 for CUDA-FMURP versions 1 to 3,
respectively, are compared below.

        SP_1 = SP_2 = SP_3 = O(x log x) / O(x) = O(log x)





Speedup and Efficiency
Processor efficiency is computed from the speedup SP and the
number of processors used, p̂:

        EP = (1/p̂) · SP

The efficiencies EP_1, EP_2, and EP_3 for CUDA-FMURP versions 1 to
3, respectively, are compared below.

        EP_1 = (1/x) · O(log x) = (log x)/x          (5)
        EP_2 = (1/x) · O(log x) = (log x)/x          (6)
        EP_3 = (1/t) · O(log x) = (log x)/t          (7)

        EP_1 = EP_2 < EP_3
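
To get a feel for the magnitudes involved, the expressions above can be evaluated for the first dataset row (t = 20, n = 600, l = 10). This is a rough sketch only: it ignores the constant factors hidden by the O-notation, assumes base-2 logarithms, and takes x = t(n − l + 1) as implied by Equation (2). The variable names are hypothetical.

```python
import math

t, n, l = 20, 600, 10
x = t * (n - l + 1)                 # number of l-mers

speedup = math.log2(x)              # growth term of O(log x)
eff_x_threads = speedup / x         # versions 1 and 2 use x processors
eff_t_threads = speedup / t         # version 3 uses t processors

print(f"x = {x}")
print(f"efficiency with x processors ~ {eff_x_threads:.5f}")
print(f"efficiency with t processors ~ {eff_t_threads:.5f}")
```

Since t is much smaller than x, version 3 obtains the same asymptotic speedup with far fewer processors, which is what the final inequality expresses.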

Dataset

                          t         n         l       d        Instances generated
                         20        600       10       2                100
                         20        600       11       2                100
                         20        600       12       3                100
                         20        600       13       3                100
                         20        600       14       4                100
                         20        600       15       4                100
                         20        600       16       5                100
                         20        600       17       5                100
                         20        600       18       6                100
                         20        600       19       6                100
Table: Summary of the generated dataset used to determine the accuracy of
CUDA-FMURP. For each instance generated, the OOPS search model is
assumed, that is, each sequence contains exactly one occurrence of the planted motif.



Accuracy

  t      n        l     d    FMURP      FMURP∗           SEQ-FMURP   CUDA-FMURP          m
 20     600      10     2      13         100                98          98             72
 20     600      11     2      99         100                100         100            16
 20     600      12     3      3          96                 83          83             259
 20     600      13     3      81         100                100         100            62
 20     600      14     4      1          86                 79          79             645
 20     600      15     4      49         100                100         100            172
 20     600      16     5      0          77                 53          53            1292
 20     600      17     5      19         98                 98          98             378
 20     600      18     6      0          82                 38          38            2217
 20     600      19     6      9          98                 94          94             711

Table: Number of correctly identified planted motifs over 100
random input instances. For each of the instances, parameters k = 7 and s = 4 are
used. The column labelled FMURP∗ is based on the results presented in
[Tompa,2001] using the dataset they generated.


Machine Setups




 System specifications       Values
 Host processors (procs)     2 × Intel Quad-core 2.26GHz
 Total number of cores       8
 Max host RAM                12GB
 Device/s (GPU/s)            2 × NVIDIA GT120
 Compute capability          1.1
 CUDA Cores/GPU              4 (multiprocs) × 8 (cores/proc) = 32
 GPU clock rate              1.40 GHz
 Memory clock rate           500 MHz
 Max device global memory    512MB
 Operating system            Mac OS X 10.6.8
 CUDA version                3.2

 System specifications       Values
 Host processors (procs)     Core(TM) i7-2600 CPU 3.40GHz
 Total number of cores       4 × 2 (hyperthreaded) = 8
 Max host RAM                8GB
 Device/s (GPU/s)            1 × NVIDIA GeForce GTX 580
 Compute capability          2.0
 CUDA Cores/GPU              16 (multiprocs) × 32 (cores/proc) = 512
 GPU clock rate              1.54 GHz
 Memory clock rate           2004 MHz
 Max device global memory    1535MB
 Operating system            64-bit Ubuntu 10.0.4
 CUDA version                4.1




[Figure: Actual speed of CUDA-Projection v3 with respect to CUDA-Projection v1]

[Figure: Actual speed of CUDA-FMURP v1 and CUDA-Projection v3]

[Figure: Actual speed result, Setup 1]

[Figure: Memory requirement]

[Figure: Actual speed comparison and speedup of CUDA-FMURP v1 with respect to SEQ-FMURP and FMURP using Setup 2]

Conclusion



In this work, we presented three versions of parallel algorithms for FMURP.

 Algorithm                   Processors   SP wrt FMURP     SP wrt SEQ-FMURP    Efficiency
 CUDA-FMURP v1                   x            O(log x)             O(x)        (log x/x)
 CUDA-FMURP v2                   x            O(log x)             O(x)        (log x/x)
 CUDA-FMURP v3                   t            O(log x)             O(x)         (log x/t)

We implemented CUDA-FMURP v1 and CUDA-FMURP v2 and achieved
maximum actual speedups of 6.8 and 6.6, respectively, with respect to
SEQ-FMURP.




References
   J.P. Allouche and J. Shallit, “Automatic Sequences: Theory Applications
   and Generalizations”, Cambridge University Press, Chapter 3:
   Numeration Systems, pp. 70-73, 2003
   P. Pevzner and S. H. Sze, “Combinatorial Approaches to Finding Subtle
   Signals in DNA Sequences”, Proceedings of 8th Int. Conf. Intelligent
   Systems for Molecular Biology (ISMB), pp. 269-78, 2000
   J. Buhler and M. Tompa, “Finding Motifs Using Random Projections”,
   RECOMB ’01 Proceedings of the fifth annual international conference on
   Computational biology, 2001
   D. Kirk and W. Hwu, “Programming Massively Parallel Processors: A Hands
   On Approach”, 1st ed., MA, USA: Morgan Kaufmann, 2010
   M. Harris, “Mapping computational concepts to GPUs”, ACM
   SIGGRAPH 2005 Courses, NY, USA, 2005
   N. Jones and P. Pevzner, “An Introduction to Bioinformatics Algorithms”,
   Massachusetts Institute of Technology Press, 2004
Extra Slides

Finding Motifs using Random Projection (FMURP)

INPUT: Set of sequences S, motif length l, expected mismatches d, projection
dimension k, and bucket threshold δ
OUTPUT: Motif
  1   Projection
          1   Generate k random positions for projection.
              Let this be the set I = {î | î ∈ {0, . . . , (l − 1)}} with |I| = k.
          2   For each S_{i,j} in S,
                  1   Get h_I(S_{i,j}) from all S_{i,j} in S,
                      where i ∈ {0, . . . , (t − 1)} and j ∈ {0, . . . , (n − l)}.
                  2   Sort the S_{i,j}s with respect to h_I(S_{i,j}).
                  3   Perform a linear search over all h_I(S_{i,j})s to determine which l-mers
                      are 'hashed' in the same bucket.
  2   Refine each enriched bucket using Expectation Maximization (EM)
  3   Refine each enriched bucket using SP-STARσ
  4   Maximize score to output best motif


Projection: Example
Given a set of DNA sequences S, pattern length l = 4, projection dimension
k = 2, and bucket threshold δ = 3.

                           S0 :    C   G    G         T    C       A   G   G
                           S1 :    T   T    C         G    A       C   A   T
                           S2 :    A   C    G         A    T       G   A   A
                         Figure: Set of t = 3 sequences each with n = 8


      We generate the set of k random positions used in the projection.
      Suppose we obtain the set I = {0, 1}.
      For all Si,j in S, we compute hI (Si,j ) using the random positions in I generated
      in step 1.
      To hash each Si,j to its bucket, the list of Si,j s with their hI (Si,j )s is
      sorted lexicographically by hI (Si,j ), so that l-mers with the same
      projection become adjacent in the sorted list.
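
The projection step of this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `lmers` and `project` are hypothetical and do not come from the slides, and the sort is done sequentially rather than on a GPU.

```python
# Illustrative sketch; helper names are assumptions, not the author's code.
S = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]  # S0, S1, S2 from the figure
l = 4                                      # pattern length
I = (0, 1)                                 # projection positions, |I| = k = 2


def lmers(seqs, l):
    """Yield (i, j, S_ij) for every start position j of every sequence S_i."""
    for i, seq in enumerate(seqs):
        for j in range(len(seq) - l + 1):
            yield i, j, seq[j:j + l]


def project(lmer, positions):
    """h_I: keep only the characters at the sampled positions."""
    return "".join(lmer[p] for p in positions)


# One row per l-mer: (h_I(S_ij), label, S_ij), sorted lexicographically by h_I.
rows = sorted(
    ((project(w, I), f"S{i},{j}", w) for i, j, w in lmers(S, l)),
    key=lambda row: row[0],
)

for h, label, w in rows:
    print(label, w, h)
```

Because Python's sort is stable, l-mers with the same projection keep their enumeration order; the order inside a bucket does not matter for the later steps.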

Projection: Example
   Label      Si,j      hI (Si,j )             Label      Sorted Si,j   Sorted hI (Si,j )
    S0,0     CGGT         CG                    S2,0         ACGA             AC
    S0,1     GGTC         GG                    S1,4         ACAT             AC
    S0,2     GTCA         GT                    S2,3         ATGA             AT
    S0,3     TCAG         TC                    S0,4         CAGG             CA
    S0,4     CAGG         CA                    S0,0         CGGT             CG
    S1,0     TTCG         TT                    S2,1         CGAT             CG
    S1,1     TCGA         TC                    S1,2         CGAC             CG
    S1,2     CGAC         CG                    S1,3         GACA             GA
    S1,3     GACA         GA                    S2,2         GATG             GA
    S1,4     ACAT         AC                    S0,1         GGTC             GG
    S2,0     ACGA         AC                    S0,2         GTCA             GT
    S2,1     CGAT         CG                    S1,1         TCGA             TC
    S2,2     GATG         GA                    S0,3         TCAG             TC
    S2,3     ATGA         AT                    S2,4         TGAA             TG
    S2,4     TGAA         TG                    S1,0         TTCG             TT
Figure: Illustration showing the set of hI (Si,j )s computed from step 2, together with the sorted list.

Projection: Example
     To get the list of buckets, we perform a linear search over the sorted
     hI (Si,j )s and group together the Si,j s that share the same hI (Si,j ).

                      hI (Si,j )   Count               Si,j
                        AC           2           {ACGA, ACAT}
                        AT           1              {ATGA}
                        CA           1              {CAGG}
                        CG           3        {CGGT, CGAT, CGAC}
                        GA           2           {GACA, GATG}
                        GG           1              {GGTC}
                        GT           1              {GTCA}
                        TC           2           {TCGA, TCAG}
                        TG           1              {TGAA}
                        TT           1              {TTCG}
                            Figure: Buckets obtained from Projection
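
The linear search over the sorted list can be written as a single left-to-right pass that opens a new bucket whenever the projection changes. The sketch below is a sequential illustration; the function name `buckets_by_linear_scan` is an assumption made here, not a name from the slides.

```python
def buckets_by_linear_scan(sorted_pairs):
    """Group (h, lmer) pairs, already sorted by h, with one left-to-right pass."""
    buckets = []                      # list of (h, [l-mers]) in sorted order
    for h, w in sorted_pairs:
        if buckets and buckets[-1][0] == h:
            buckets[-1][1].append(w)  # same projection as the previous entry
        else:
            buckets.append((h, [w]))  # projection changed: open a new bucket
    return buckets


S = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]
l, I = 4, (0, 1)
# (projection, l-mer) pairs for every start position, sorted by projection.
pairs = sorted(
    ("".join(s[j + p] for p in I), s[j:j + l])
    for s in S
    for j in range(len(s) - l + 1)
)
buckets = buckets_by_linear_scan(pairs)
for h, ws in buckets:
    print(h, len(ws), ws)
```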


Projection: Example
     From the set of buckets obtained, we identify those that contain at
     least δ hashed l-mers and consider them enriched.
     With δ = 3, only the bucket with hI (Si,j ) = CG is enriched.

                      hI (Si,j )   Count               Si,j
                        CG           3        {CGGT, CGAT, CGAC}
                            Figure: Enriched bucket obtained from Projection
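
Selecting enriched buckets is a simple threshold test on bucket size. A minimal sketch, assuming the buckets are held in a dictionary keyed by projection (the l-mers listed are those read off S0, S1, S2 in the example; the function name `enriched` is hypothetical):

```python
# Buckets of the example, keyed by projection h_I.
buckets = {
    "AC": ["ACGA", "ACAT"], "AT": ["ATGA"], "CA": ["CAGG"],
    "CG": ["CGGT", "CGAT", "CGAC"], "GA": ["GACA", "GATG"],
    "GG": ["GGTC"], "GT": ["GTCA"], "TC": ["TCGA", "TCAG"],
    "TG": ["TGAA"], "TT": ["TTCG"],
}


def enriched(buckets, delta):
    """Keep only the buckets holding at least delta l-mers."""
    return {h: ws for h, ws in buckets.items() if len(ws) >= delta}


print(enriched(buckets, delta=3))
```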


Expectation Maximization (EM)

INPUT: Motif model θ0 from one enriched bucket, maximum number of
iterations y, and threshold for convergence δEM
OUTPUT: Motif model θy
  1   For j in {1, . . . , y} or until convergence
          1   (E-step) For every l-mer Si,ai in each sequence Si ,
              compute E(Si,ai |θj ) given the current motif model.
          2   (M-step) For each Si in S,
              choose the starting position ai ∈ {0, . . . , (n − l)} that maximizes
              E(Si,ai |θj ), and collect these positions in s.
          3   (Test for Convergence) Compute L(θj ). Compare the previous
              likelihood L(θj−1 ) to the current L(θj ).
              If the difference is at most the threshold δEM , stop iterating.
          4   (Update step) From the alignment given by the starting position vector s
              identified in the M-step,
              compute the motif model θj+1 .



EM: Example
For an enriched bucket obtained from Projection, EM performs the following
operations.
      From EB , get the alignment made by the hashed l-mers.
                                  C G G T
                                  C G A C
                                  C G A T
      From this alignment, a profile matrix is computed.
                                            C      G      G      T
                                            C      G      A      C
                                            C      G      A      T
                                   A:       0      0      2      0
                                   C:       3      0      0      1
                                   G:       0      3      1      0
                                   T:       0      0      0      2
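
Computing the profile matrix amounts to counting each symbol per column of the alignment. A short sketch (the function name `profile_counts` is an assumption made for this illustration):

```python
def profile_counts(alignment, alphabet="ACGT"):
    """Count how often each symbol occurs in each column of the alignment."""
    l = len(alignment[0])
    return {
        a: [sum(w[c] == a for w in alignment) for c in range(l)]
        for a in alphabet
    }


alignment = ["CGGT", "CGAC", "CGAT"]   # l-mers hashed to the enriched bucket
counts = profile_counts(alignment)
for a, row in counts.items():
    print(f"{a}: {row}")
```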


EM: Example

     Normalize the profile matrix obtained.
                            A: 0.00 0.00 0.33 0.00
                            C: 1.00 0.00 0.00 0.33
                            G: 0.00 1.00 0.66 0.00
                            T: 0.00 0.00 0.00 0.66
     To avoid zero values for Pr(Si,j |θ), [Tompa,2001] apply a Laplace
     correction. For each row corresponding to a symbol, say a, the
     probability pa that symbol a appears in the sequence is added to every
     entry of that row. Since all symbols in ΣDNA have a uniform frequency
     distribution, 0.25 is added to each row.
                                  A:   0.25           0.25     0.58        0.25
                                  C:   1.25           0.25     0.25        0.58
                                  G:   0.25           1.25     0.91        0.25
                                  T:   0.25           0.25     0.25        0.91


EM: Example

     Normalize the matrix obtained and let the resulting matrix be the initial
     motif model θ0 .
                                  A:   0.125          0.125     0.290       0.125
                                  C:   0.625          0.125     0.125       0.290
                                  G:   0.125          0.625     0.455       0.125
                                  T:   0.125          0.125     0.125       0.455
     For each Si in S, find the position j ∈ {0, . . . , (n − l)} for which E(Si,j |θ0 ) is
     maximum. For instance, let us identify the l-mer in sequence S0 with
     maximum expectation E(S0,j |θ0 ).

        E(S0,0 |θ0 )    =    E(CGGT|θ0 )   =      ((0.625)(0.625)(0.455)(0.455))/(0.25^4)            =   20.703
        E(S0,1 |θ0 )    =    E(GGTC|θ0 )   =      ((0.125)(0.625)(0.125)(0.290))/(0.25^4)            =   00.725
        E(S0,2 |θ0 )    =    E(GTCA|θ0 )   =      ((0.125)(0.125)(0.125)(0.125))/(0.25^4)            =   00.063
        E(S0,3 |θ0 )    =    E(TCAG|θ0 )   =      ((0.125)(0.125)(0.290)(0.125))/(0.25^4)            =   00.145
        E(S0,4 |θ0 )    =    E(CAGG|θ0 )   =      ((0.625)(0.125)(0.455)(0.125))/(0.25^4)            =   01.138

     Of all S0,j s in S0 , the l-mer S0,0 obtains the highest expectation.
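
The expectation of an l-mer is the product of the model entries selected by its symbols, divided by the background probability 0.25 raised to the power l. The sketch below uses the rounded θ0 entries as printed on the slide; the names `THETA0` and `expectation` are assumptions made for this illustration.

```python
# theta_0 as printed on the slide (rounded entries).
THETA0 = {
    "A": [0.125, 0.125, 0.290, 0.125],
    "C": [0.625, 0.125, 0.125, 0.290],
    "G": [0.125, 0.625, 0.455, 0.125],
    "T": [0.125, 0.125, 0.125, 0.455],
}


def expectation(lmer, theta, background=0.25):
    """Product over positions of theta[symbol][position] / background."""
    e = 1.0
    for c, a in enumerate(lmer):
        e *= theta[a][c] / background
    return e


S0, l = "CGGTCAGG", 4
scores = [expectation(S0[j:j + l], THETA0) for j in range(len(S0) - l + 1)]
best_j = max(range(len(scores)), key=scores.__getitem__)
for j, e in enumerate(scores):
    print(f"E(S0,{j}) = {e:.3f}")
print("best start position in S0:", best_j)
```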


EM: Example

      The set of l-mers with the highest expectation in each sequence
      defines another alignment, as in Step 1. From this set of l-mers, we can
      obtain the next motif model θ1 .
                                      S0,0 :      C       G     G       T      : 20.70
                                      S1,2 :      C       G     A       C      : 08.41
                                      S2,1 :      C       G     A       T      : 13.20
      We compute the likelihood of a motif model θy using the best
      expectations.

                                   L(θ) = 20.70 + 08.41 + 13.20 = 42.31

      Update the motif model θ0 to get θ1 , using the set of l-mers from each
      sequence that maximize the expectation.
      Stop iteration if L(θy ) − L(θy−1 ) ≤ δEM .
The output of EM in this example is the consensus string CGAT.
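
One pass of the M-step, the likelihood, and the resulting consensus string can be reproduced with a short sketch. It assumes the rounded θ0 from the slides and takes the consensus as the most frequent symbol per column of the selected l-mers; the helper names are hypothetical.

```python
from collections import Counter

THETA0 = {
    "A": [0.125, 0.125, 0.290, 0.125],
    "C": [0.625, 0.125, 0.125, 0.290],
    "G": [0.125, 0.625, 0.455, 0.125],
    "T": [0.125, 0.125, 0.125, 0.455],
}


def expectation(lmer, theta, background=0.25):
    e = 1.0
    for c, a in enumerate(lmer):
        e *= theta[a][c] / background
    return e


def best_lmer(seq, theta, l):
    """M-step for one sequence: the l-mer with maximum expectation."""
    candidates = (seq[j:j + l] for j in range(len(seq) - l + 1))
    return max(candidates, key=lambda w: expectation(w, theta))


S = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]
best = [best_lmer(s, THETA0, 4) for s in S]
likelihood = sum(expectation(w, THETA0) for w in best)
# Most frequent symbol in each column of the new alignment.
consensus = "".join(Counter(col).most_common(1)[0][0] for col in zip(*best))
print(best, round(likelihood, 2), consensus)
```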

SP-STARσ

INPUT: Consensus string M from θy and expected mismatches d
OUTPUT: Refined consensus string M∗
  1   For j in {1, . . . , y} or until convergence
          1   Compute Sb , the set containing, for each sequence, the l-mer that
              has the least edit distance from M.

                                   Sb = {Si,j | dE (M, Si,j ) is minimum ∀ Si,j in Si }

          2   Compute the score σ(Sb ), which is equal to the number of l-mers in
              Sb such that
                                          dE (M, Si,j ) ≤ d
          3   Compute the consensus string M′ from the alignment made by Sb .
          4   Compute Sb′ from M′ .
          5   Compute σ(Sb′ ).
          6   If σ(Sb′ ) > σ(Sb ), continue iteration using M = M′ ,
              else M∗ = M.
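
The refinement loop above can be sketched in Python. Two assumptions are made here: dE is computed as the number of mismatched positions (Hamming distance), and ties between equally close l-mers are broken by taking the leftmost one. All function names are hypothetical, and the loop is sequential rather than a GPU kernel.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def closest_lmers(M, seqs):
    """S_b: for each sequence, the l-mer with the least distance from M."""
    l = len(M)
    return [
        min((s[j:j + l] for j in range(len(s) - l + 1)),
            key=lambda w: hamming(M, w))
        for s in seqs
    ]


def sigma(M, Sb, d):
    """Number of l-mers in S_b within distance d of M."""
    return sum(hamming(M, w) <= d for w in Sb)


def consensus(lmers):
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*lmers))


def sp_star(M, seqs, d, max_iter=100):
    for _ in range(max_iter):
        Sb = closest_lmers(M, seqs)
        M_new = consensus(Sb)
        Sb_new = closest_lmers(M_new, seqs)
        if sigma(M_new, Sb_new, d) > sigma(M, Sb, d):
            M = M_new          # score improved: keep refining
        else:
            break              # no improvement: M is the refined motif
    return M


S = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]
print(sp_star("CGAT", S, d=1))
```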


SP-STARσ: Example


Using M = CGAT and expected mismatches d = 1.
      Compute Sb . For S0 , the closest S0,j is identified as follows.
                  dE (M, S0,0 )       =        dE (CGAT, CGGT)                      =    1
                  dE (M, S0,1 )       =        dE (CGAT, GGTC)                      =    3
                  dE (M, S0,2 )       =        dE (CGAT, GTCA)                      =    4
                  dE (M, S0,3 )       =        dE (CGAT, TCAG)                      =    3
                  dE (M, S0,4 )       =        dE (CGAT, CAGG)                      =    3
      The set Sb contains

                                     Sb = {S0,0 , S1,2 , S2,1 }
                                   Sb = {CGGT, CGAC, CGAT}
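
The distances listed for S0 can be checked with a few lines of Python, assuming dE counts mismatched positions (this assumption reproduces the values 1, 3, 4, 3, 3 shown above):

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


M, S0 = "CGAT", "CGGTCAGG"
l = len(M)
distances = [hamming(M, S0[j:j + l]) for j in range(len(S0) - l + 1)]
best_j = distances.index(min(distances))   # leftmost closest l-mer
print(distances, "closest l-mer:", S0[best_j:best_j + l])
```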




SP-STARσ: Example

     The score for Sb is
                                              σ(Sb ) = 3
     because the least edit distances in the three sequences are 1, 1, and 0.
     That is, all 3 sequences satisfy
                                   dE (M, Si,j ) ≤ 1
     The consensus string from Sb is M′ = CGAT.
     Sb′ obtained from M′ is the same as Sb .

                                     Sb′ = {S0,0 , S1,2 , S2,1 }

                                  Sb′ = {CGGT, CGAC, CGAT}
     Since σ(Sb′ ) = σ(Sb ),
     M∗ = M = CGAT.

Parallel Random Projection for Motif Discovery on GPUs

  • 1. Finding Planted (l, d)-Motifs in Parallel using Random Projection on GPUs Jhoirene Barasi Clemente Algorithms and Complexity Laboratory Department of Computer Science University of the Philippines-Diliman jbclemente@up.edu.ph March 31, 2012 J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 1 / 88
  • 2. Overview Overview Introduction Definitions and Notations Finding Motifs using Random Projection (FMURP) Parallel Implementations of CUDA-FMURP Results and Analysis Conclusion J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 2 / 88
  • 3. Introduction In this work, we are interested in solving Planted (l, d)-Motif Problem using Random Projection (FMURP). The focus of this study is on parallelization of FMURP, where we present three versions of the parallel algorithm. Correctness of the parallelization is also discussed. We implement two of these parallel algorithms on GPUs. Theoretical and actual performance analyses are also presented. J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 3 / 88
  • 4. Introduction Introduction A DNA motif is defined as a nucleic acid sequence pattern that has some biological significance such as being DNA binding sites for a regulatory protein. i.e., a transcription factor [Das,2007]. J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 4 / 88
  • 5. Introduction Introduction DNA Sequences as Strings J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 5 / 88
  • 6. Introduction Introduction The pattern is fairly short (5 to 20 base-pairs (bp) long) and is known to recur in different genes or several times within gene [Rombauts,1999]. J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 6 / 88
  • 7. Introduction Notations Notations Set of t sequences S. Example 1 (Sequences S = {S0 , S1 , . . . , S(t−1) }) S0 : C G G G G C T A T G G A A C T G G G T C G T C A C A T T C C C C T T T C G A T A S1 : T T T G A G G G T G C C C A A T A A A T G C C A C T C C A A A G C G G A C A A A S2 : G G A T G C A A C T G A T G C C G T T T G A C G A C C T A A A T C A A C G G C C S3 : A A G G A T G C A A C T C C A G G A G C G C C T T T G C T G G T T C T A C C T G S4 : A A T T T T C T A A A A A G A T T A T A A T G T C G G T C C A T G C A A C T T C S5 : C T G C T G T A C A A C T G A G A T C A T G C T G C A T G C A A C T T T C A A C S6 : T A C A T G A T C T T T T G A T G C A A C G T G G A T G A G G G A A T G A T G C Set of sequences S = {S0 , S1 , S2 , S3 , S4 , S5 , S6 } defined over ΣDNA = {A, C, T, G}, where each sequence Si in S has length ni = 40 for all i ∈ {0, . . . , (t − 1)} J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 7 / 88
  • 8. Introduction Notations Notations An l-mer is a string of length l defined over ΣDNA . To denote an l-mer in S, we use Si,j , where i ∈ {0, 1, . . . , (t − 1)} is the sequence number and j ∈ {0, 1, . . . , (n − l)} is the starting position in Si . Example 2 (Si,j in S) For instance, an 8-mer S0,7 is ATGGAACT S0 : C G G G G C T A T G G A A C T G G G T C G T C A C A T T C C C C T T T C G A T A J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 8 / 88
  • 9. Introduction Notations Notations Let s = (a0 , a1 , . . . , a(t−1) ) be the set of starting positions in S, where ai ∈ {0, 1, . . . , (n − l)}. Let A(s) denotes the alignment made by l-mers in the set {S0,a0 , S1,a1 , . . . , S(t−1),a(t−1) }. J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 9 / 88
  • 10. Introduction Notations Notations Example 3 (Alignment matrix A(s)) Suppose we have a starting position vector s = (7, 18, 2, 4, 30, 26, 14) S0,7 : A T G G A A C T S1,18 : A T G C C A C T S2,2 : A T G C A A C T A(s) S3,4 : A T G C A A C T S4,30 : A T G C A A C T S5,26 : A T G C A A C T S6,14 : A T G C A A C G S0 : C G G G G C T A T G G A A C T G G G T C G T C A C A T T C C C C T T T C G A T A S1 : T T T G A G G G T G C C C A A T A A A T G C C A C T C C A A A G C G G A C A A A S2 : G G A T G C A A C T G A T G C C G T T T G A C G A C C T A A A T C A A C G G C C S3 : A A G G A T G C A A C T C C A G G A G C G C C T T T G C T G G T T C T A C C T G S4 : A A T T T T C T A A A A A G A T T A T A A T G T C G G T C C A T G C A A C T T C S5 : C T G C T G T A C A A C T G A G A T C A T G C T G C A T G C A A C T T T C A A C S6 : T A C A T G A T C T T T T G A T G C A A C G T G G A T G A G G G A A T G A T G C J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 10 / 88
  • 11. Introduction Notations Notations A profile matrix P(s) with dimension equal to (|ΣDNA | × l) is derived from the frequency of each letter in each column of the A(s). Example 4 (Profile Matrix P(s)) S0,7 : A T G G A A C T S1,18 : A T G C C A C T S2,2 : A T G C A A C T A(s) S3,4 : A T G C A A C T S4,30 : A T G C A A C T S5,26 : A T G C A A C T S6,14 : A T G C A A C G A: 7 0 0 0 6 7 0 0 T: 0 7 0 0 0 0 0 6 P(s) C: 0 0 0 6 1 0 7 0 G: 0 0 7 1 0 0 0 1 J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 11 / 88
  • 12. Introduction Notations Notations From P(s), we define MP(s) (j), where 0 ≤ j ≤ (l − 1), be the maximum number at jth column of the profile matrix. Example 5 (MP(s),j ) S0,7 : A T G G A A C T S1,18 : A T G C C A C T S2,2 : A T G C A A C T A(s) S3,4 : A T G C A A C T S4,30 : A T G C A A C T S5,26 : A T G C A A C T S6,14 : A T G C A A C G A: 7 0 0 0 6 7 0 0 T: 0 7 0 0 0 0 0 6 P(s) C: 0 0 0 6 1 0 7 0 G: 0 0 7 1 0 0 0 1 J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 12 / 88
  • 13. Introduction Notations Notations A consensus string is an l-mer, where each of its elements is the nucleotide base corresponding to MP(s) (i). Example 6 (Consensus String) S0,7 : A T G G A A C T S1,18 : A T G C C A C T S2,2 : A T G C A A C T A(s) S3,4 : A T G C A A C T S4,30 : A T G C A A C T S5,26 : A T G C A A C T S6,14 : A T G C A A C G A: 7 0 0 0 6 7 0 0 T: 0 7 0 0 0 0 0 6 P(s) C: 0 0 0 6 1 0 7 0 G: 0 0 7 1 0 0 0 1 Consensus String A T G C A A C T J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 13 / 88
  • 14. Introduction Notations Notations We define the Score(s,S) to be equal to l−1 Score(s, S) = MP(s) (i). (1) i=0 Example 7 (Consensus Score()) A: 7 0 0 0 6 7 0 0 T: 0 7 0 0 0 0 0 6 P(s) C: 0 0 0 6 1 0 7 0 G: 0 0 7 1 0 0 0 1 Score(s, S) = 7 + 7 + 7 + 6 + 6 + 7 + 7 + 6 = 53 J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 14 / 88
  • 15. Introduction Notations Notations We define the Score(s,S) to be equal to l−1 Score(s, S) = MP(s) (i). (1) i=0 Example 7 (Consensus Score()) A: 7 0 0 0 6 7 0 0 T: 0 7 0 0 0 0 0 6 P(s) C: 0 0 0 6 1 0 7 0 G: 0 0 7 1 0 0 0 1 Score(s, S) = 7 + 7 + 7 + 6 + 6 + 7 + 7 + 6 = 53 J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 14 / 88
  • 16. Introduction Motif Finding Problem Motif Finding Problem Definition 8 (Motif Finding Problem [Pevzner,2004]) INPUT: A motif length l A set of t sequences S = {S0 , S1 , S2 , . . . , S(t−1) }, where each Si is of length ni OUTPUT: An array of starting positions s = (a0 , a1 , . . . , a(t−1) ) maximizing consensus Score(s,S) J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 15 / 88
  • 17. Introduction: Motif Finding Problem, Naive MFP Solver [Pevzner,2004]
    Input: DNA (sequences), motif length $l$
    Output: Starting positions $s$ and the consensus string corresponding to $s$
    1. For each possible array of starting positions in S, i.e. $s \in \{(0, 0, \dots, 0), \dots, (n-l, n-l, \dots, n-l)\}$:
       1. Get the alignment $A(s)$.
       2. Compute $P(s)$.
       3. Evaluate Score(s, S).
    2. From the $s$ with the maximum Score, get the consensus string.
    3. Output the consensus string.
    Step 1 iterates $(n-l+1)^t$ times, because the possible starting positions are all $s = (a_0, a_1, \dots, a_{t-1})$ with $a_i \in \{0, \dots, n-l\}$ for every $i$.
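The naive solver can be sketched as an exhaustive search over all $(n-l+1)^t$ tuples of starting positions. The code below is a minimal illustration, not the author's implementation: the names `naive_mfp` and `score` and the three toy sequences (with the 4-mer ACGT inserted by hand) are assumptions made for the example.

```python
from collections import Counter
from itertools import product


def score(lmers):
    """Consensus score: sum of the largest count in every column."""
    return sum(max(Counter(column).values()) for column in zip(*lmers))


def naive_mfp(seqs, l):
    """Try every tuple of starting positions and keep the best-scoring one."""
    def lmers_at(starts):
        return [seq[a:a + l] for seq, a in zip(seqs, starts)]

    # One range of start positions per sequence: (n - l + 1)^t candidates in total.
    candidates = product(*(range(len(seq) - l + 1) for seq in seqs))
    best = max(candidates, key=lambda starts: score(lmers_at(starts)))
    consensus = "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*lmers_at(best))
    )
    return best, score(lmers_at(best)), consensus


seqs = ["AAACGTAA", "CCACGTCC", "ACGTGGGG"]  # toy input, not from the slides
print(naive_mfp(seqs, l=4))  # ((2, 2, 0), 12, 'ACGT')
```

Even for this toy input the search visits 5^3 = 125 alignments, which shows why the exhaustive approach does not scale to t = 20 and n = 600.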
  • 19. Introduction: Planted (l, d)-Motif Finding Problem, Definitions
    Definition 9 (Challenge Problem [Pevzner,2000])
    INPUT:
      Motif length $l = 15$,
      Expected mismatches $d$,
      20 DNA sequences, each with $n_i = 600$ nucleotide bases
    OUTPUT:
      A consensus string $M$ from an alignment $A(s)$, where each l-mer $S_{i,a_i}$ in $A(s)$ has $d_E(M, S_{i,a_i}) = 4$, for all $i \in \{0, \dots, t-1\}$.
  • 20. Introduction: Planted (l, d)-Motif Finding Problem, Why challenging?
    Suppose we have $A(s)$:
      S_{0,a_0} : A C T T G G G G C A A G A G G
      S_{1,a_1} : G G A C G G G G C A G A C T G
      S_{2,a_2} : A C T T G C T A A A G A C T G
      S_{3,a_3} : A C T G C G G G C A C A G T G
      S_{4,a_4} : A C C T G G G T C G T A C T G
    Profile:
      A: 4 0 1 0 0 0 0 1 1 4 1 4 1 0 0
      C: 0 4 1 1 1 1 0 0 4 0 1 0 3 0 0
      T: 0 0 3 3 0 0 1 1 0 0 1 0 0 4 0
      G: 1 1 0 1 4 4 4 3 0 1 2 1 1 1 5
    Consensus: A C T T G G G G C A G A C T G
    $d_E(S_{0,a_0}, S_{1,a_1}) = 2d = 8$
    Score(s, S) = 4 + 4 + 3 + 3 + 4 + 4 + 4 + 3 + 4 + 4 + 2 + 4 + 3 + 4 + 5 = 55
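The difficulty shown above is that each planted instance differs from the motif in d positions, so two instances can differ from each other in up to 2d positions. The sketch below checks this on the alignment from the slide, assuming that $d_E$ counts position-wise mismatches (a Hamming-style distance); the function name `mismatches` is hypothetical.

```python
def mismatches(u, v):
    """Number of positions at which two equal-length strings differ."""
    if len(u) != len(v):
        raise ValueError("strings must have equal length")
    return sum(a != b for a, b in zip(u, v))


motif = "ACTTGGGGCAGACTG"
lmers = [
    "ACTTGGGGCAAGAGG",
    "GGACGGGGCAGACTG",
    "ACTTGCTAAAGACTG",
    "ACTGCGGGCACAGTG",
    "ACCTGGGTCGTACTG",
]
print([mismatches(motif, s) for s in lmers])  # [4, 4, 4, 4, 4]
print(mismatches(lmers[0], lmers[1]))         # 8
```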
  • 21. Introduction: Planted (l, d)-Motif Finding Problem, Definitions
    Definition 10 (Planted (l, d)-Motif Finding Problem [Tompa,2001])
    INPUT:
      Motif length $l$,
      Expected number of mismatches $d$, and
      A set of $t$ sequences $S = \{S_0, S_1, S_2, \dots, S_{t-1}\}$, where each $S_i$ is of length $n_i$
    OUTPUT:
      A consensus string $M$ from an alignment $A(s)$, where each l-mer $S_{i,a_i}$ in $A(s)$ has $d_E(M, S_{i,a_i}) = d$, for all $i \in \{0, \dots, t-1\}$.
  • 22. Introduction: Planted (l, d)-Motif Finding Problem, Solutions for Planted (l, d)-Motif Finding
    SP-STAR [Pevzner,2000]
    Winnower [Pevzner,2000]
    Random Projection [Tompa,2001]
    Aggregation [Mohammed,2004]
    GibbsDST [Shida,2006]
  • 23. Finding Motifs using Random Projection (FMURP)
    INPUT: Set of sequences $S$, motif length $l$, expected mismatches $d$, projection dimension $k$, and bucket threshold $\delta$
    OUTPUT: Motif
    1. Projection
       1. Get all l-mers $S_{i,j}$ in $S$.
       2. Get the projection $h_I(S_{i,j})$ for each $S_{i,j}$ in $S$.
       3. Hash each $S_{i,j}$ to the bucket with identifier $h_I(S_{i,j})$.
       4. Get the enriched buckets.
    2. Refine each enriched bucket using EM.
    3. Refine each enriched bucket using SP-STARσ.
    4. Maximize the score to output the best motif.
  • 24. Finding Motifs using Random Projection (FMURP)
    Definition 11 (Random Projection)
    Given an l-mer $S_{i,j}$, a projection dimension $k$, and a set $I \subset L = \{0, \dots, l-1\}$ with $|I| = k$, whose elements are chosen randomly from $L$ and sorted in increasing order, a k-dimensional projection of $S_{i,j}$ is
      $h_I(S_{i,j}) = S_{i,j}(I_0), S_{i,j}(I_1), \dots, S_{i,j}(I_{k-1})$,
    where $h_I(S_{i,j})$ is a k-mer and $I_i$ denotes the $i$-th element of $I$.
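Definition 11 translates almost directly into code. The following is a minimal sketch; the names `random_positions` and `project` are hypothetical, and drawing the positions with `random.sample` is one possible way to choose $I$, not necessarily the one used by the author.

```python
import random


def random_positions(l, k, rng=random):
    """Draw k distinct positions from {0, ..., l-1}, sorted increasingly (the set I)."""
    return sorted(rng.sample(range(l), k))


def project(lmer, positions):
    """h_I: keep only the symbols of the l-mer at the positions in I."""
    return "".join(lmer[p] for p in positions)


print(project("CGGT", [0, 1]))  # CG
print(project("ACGT", [1, 3]))  # CT
```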
  • 25. Finding Motifs using Random Projection (FMURP): Example
    Example 12
    Given a set of DNA sequences $S$, pattern length $l = 4$, projection dimension $k = 2$, and bucket threshold $\delta = 3$.
      S_0 : C G G T C A G G
      S_1 : T T C G A C A T
      S_2 : A C G A T G A A
    Figure: Set of $t = 3$ sequences, each with $n = 8$
    Let $I = \{0, 1\}$.
  • 26. Finding Motifs using Random Projection (FMURP): Projection [figure]
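The projection phase of Example 12 can be sketched with a dictionary of buckets keyed by $h_I(S_{i,j})$. This is an illustration under stated assumptions: the function name `enriched_buckets` is hypothetical, each l-mer is identified by its pair (i, j), and a bucket counts as enriched when it holds at least $\delta$ l-mers.

```python
from collections import defaultdict


def enriched_buckets(seqs, l, positions, delta):
    """Hash every l-mer by its projection and keep buckets with >= delta members."""
    buckets = defaultdict(list)
    for i, seq in enumerate(seqs):
        for j in range(len(seq) - l + 1):
            key = "".join(seq[j + p] for p in positions)  # h_I(S_{i,j})
            buckets[key].append((i, j))
    return {key: members for key, members in buckets.items() if len(members) >= delta}


seqs = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]
print(enriched_buckets(seqs, l=4, positions=[0, 1], delta=3))
# {'CG': [(0, 0), (1, 2), (2, 1)]}
```

With $I = \{0, 1\}$ and $\delta = 3$, only the bucket CG is enriched: it contains one l-mer from each of the three sequences.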
  • 30. Parallel Motif Finding using Random Projection: How do we parallelize FMURP?
    Sequential FMURP:
    1. Projection
       1. Get all l-mers $S_{i,j}$ in $S$.
       2. Get the projection $h_I(S_{i,j})$ for each $S_{i,j}$ in $S$.
       3. Hash each $S_{i,j}$ to the bucket with identifier $h_I(S_{i,j})$.
       4. Get the enriched buckets.
    2. Refine each enriched bucket using EM.
    3. Refine each enriched bucket using SP-STARσ.
    4. Maximize the score to output the best motif.
    Parallel FMURP:
    1. Projection
       1. Get all l-mers $S_{i,j}$ in $S$ in parallel.
       2. Get the projection $h_I(S_{i,j})$ for each $S_{i,j}$ in $S$ in parallel.
       3. Hash each $S_{i,j}$ to the bucket with identifier $h_I(S_{i,j})$ in parallel.
       4. Get the enriched buckets in parallel.
    2. Refine each enriched bucket using EM in parallel.
    3. Refine each enriched bucket using SP-STARσ in parallel.
    4. Maximize the score to output the best motif.
  • 31. Parallel Motif Finding using Random Projection: Parallel Algorithms for Motif Finding
    CUDA-MEME
    CUDA-Gibbs Sampling
  • 32. Parallel Motif Finding using Random Projection: CUDA [figure]
  • 33. CUDA-FMURP v1: Computing Framework
    Figure: Flowchart showing the processes done in the CPU and GPU
  • 34. CUDA-FMURP v1
    Figure: The thread ID is denoted by an ordered pair $(i, j)$, $0 \le i < w$ and $0 \le j < v$, where $v$ is the maximum number of threads per block and $w$ is the number of allocated thread blocks in the grid. The algorithm uses a total of $x = t \cdot (n - l + 1)$ threads that are linearly arranged in the GPU.
  • 35. CUDA-FMURP v1
    INPUT: Set of sequences $S$, motif length $l$, expected mismatches $d$, projection dimension $k$, and bucket threshold $\delta$
    OUTPUT: Motif
    1. In CPU, generate $k$ random positions for projection. Let this be the set $I = \{\hat{i} \mid \hat{i} \in \{0, \dots, l-1\}\}$ with $|I| = k$.
    2. In GPU, for each thread tid in $\{0, \dots, x-1\}$:
       1. Get the $h_I(S_{i,j})$s for each $S_{i,j}$ in $S$.
       2. Convert each k-mer $h_I(S_{i,j})$ to its corresponding integer representation $k^*_{i,j}$.
       3. Perform a linear search over all $k^*_{i,j}$s to determine which l-mers are 'hashed' in the same bucket. The tids of the matched $k^*_{i,j}$s are noted instead of the actual l-mers.
    3. In CPU, identify the set of enriched buckets, and prune duplicates in preparation for EM refinement.
    4. In GPU, for each tid in $\{0, \dots, e-1\}$:
       1. Perform EM refinement for each enriched bucket.
       2. Perform SP-STARσ for each enriched bucket.
       3. Maximize the σ score to output the best motif.
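Steps 2.3 and 3 of CUDA-FMURP v1 can be simulated sequentially to see what each thread produces. The sketch below is plain Python, not CUDA: the list comprehension stands in for the x GPU threads, the function name `v1_buckets` is hypothetical, and the `keys` list holds the integer codes of the projections from Example 12 (l = 4, I = {0, 1}), computed by hand under the mapping A=0, C=1, G=2, T=3 with the leftmost symbol as the most significant digit.

```python
def v1_buckets(keys, delta):
    """Simulate the per-thread linear search, then the CPU-side pruning."""
    x = len(keys)
    # "GPU" phase: thread tid scans all x keys and records the tids that share its key.
    matches = [
        tuple(q for q in range(x) if keys[q] == keys[tid])
        for tid in range(x)
    ]
    # "CPU" phase: keep lists with at least delta members and drop duplicates.
    return sorted({m for m in matches if len(m) >= delta})


# Integer keys for tids 0..14 of Example 12 (tid = i * 5 + j).
keys = [6, 10, 11, 13, 4, 15, 13, 6, 8, 1, 1, 6, 8, 3, 14]
print(v1_buckets(keys, delta=3))  # [(0, 7, 11)]
```

Every thread that lands in the same bucket reports the same list of tids, which is why the duplicate lists have to be pruned before refinement.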
  • 36. CUDA-FMURP v1: Integer Conversion
    Step 2.2 converts each $h_I(S_{i,j})$ to its corresponding integer representation $k^*_{i,j}$. Given a unique k-mer from the projection, the corresponding integer is computed using the following mapping. Let us define
      $f : \Sigma_{DNA} \to \{0, 1, 2, 3\}$, with $A \mapsto 0$, $C \mapsto 1$, $G \mapsto 2$, $T \mapsto 3$,
    where each symbol in the DNA alphabet is mapped to a unique integer. For a string $v$ of length $k$,
      $f^* : \Sigma_{DNA}^{+} \to \mathbb{Z}^{+} \cup \{0\}$, with $v \mapsto \sum_{i=0}^{k-1} f(v_i)\, 4^i$,
    where $v_i$ denotes the symbol at the $i$-th position counted from the least significant digit, and the integer representation takes values only in the non-negative integers.
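The mapping $f^*$ is a base-4 encoding, which the sketch below implements together with its inverse. The function names are hypothetical, and treating the rightmost symbol as the least significant digit is an assumption about the convention used on the slide.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # the mapping f


def kmer_to_int(kmer):
    """f*: read the k-mer as a base-4 number (rightmost symbol least significant)."""
    value = 0
    for symbol in kmer:
        value = value * 4 + CODE[symbol]
    return value


def int_to_kmer(value, k):
    """Inverse of f* for k-mers of a known length k."""
    symbols = []
    for _ in range(k):
        value, digit = divmod(value, 4)
        symbols.append("ACGT"[digit])
    return "".join(reversed(symbols))


print(kmer_to_int("CG"))   # 6
print(int_to_kmer(6, 2))   # CG
```

For a fixed length k the encoding is one-to-one, so comparing integers is equivalent to comparing projected k-mers.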
  • 37. CUDA-FMURP v1: CUDA-Projection v1, Example
    Given a set of DNA sequences, pattern length $l = 4$, projection dimension $k = 2$, and bucket threshold $\delta = 3$, projection in parallel is shown as follows. [figure]
  • 38. CUDA-FMURP v1: CUDA-Projection v1, Integer Conversion example [figure]
  • 39. CUDA-FMURP v1: CUDA-Projection, Parallel Integer Conversion Example [figure]
  • 40. CUDA-FMURP v1: CUDA-Projection, Getting enriched buckets [figure]
  • 42. CUDA-FMURP v1: CUDA-EM [figure]
  • 43. CUDA-FMURP v1: CUDA-SP-STARσ [figure]
  • 44. CUDA-FMURP v1: Correctness v1
    1. In CPU, generate $k$ random positions for projection. Let this be the set $I = \{\hat{i} \mid \hat{i} \in \{0, \dots, l-1\}\}$ with $|I| = k$.
    2. In GPU, for each thread tid in $\{0, \dots, x-1\}$:
       1. Get the $h_I(S_{i,j})$s for each $S_{i,j}$ in $S$.
       2. Convert each k-mer $h_I(S_{i,j})$ to its corresponding integer representation $k^*_{i,j}$.
       3. Perform a linear search over all $k^*_{i,j}$s to determine which l-mers are 'hashed' in the same bucket. The tids of the matched $k^*_{i,j}$s are noted instead of the actual l-mers.
    3. In CPU, identify the set of enriched buckets, and prune duplicates in preparation for EM refinement.
    4. In GPU, for each tid in $\{0, \dots, e-1\}$:
       1. Perform EM refinement for each enriched bucket.
       2. Perform SP-STARσ for each enriched bucket.
       3. Maximize the σ score to output the best motif.
  • 45. CUDA-FMURP v1: Correctness v1
    The uniqueness of the representation we defined using $f^*$ follows from the results below. Let $\Sigma_k = \{0, 1, 2, \dots, k-1\}$, and let $C_k$ be the regular language
      $C_k = \{\epsilon\} \cup (\Sigma_k - \{0\})\,\Sigma_k^{*}$.
    Theorem 4.1 (Fundamental Theorem of base-k Representation [Allouche,2003])
    Let $k \ge 2$ be an integer. Then every non-negative integer has a unique representation of the form
      $N = \sum_{i=0}^{t} a_i k^i$,
    where $a_t \ne 0$ and $0 \le a_i < k$ for $0 \le i \le t$.
  • 46. CUDA-FMURP v1: Correctness v1
    In the case of our representation $f^*$, we have $k = 4$ and $a_i = f(v_i)$, where $v_i \in \Sigma_{DNA}$. Note that the mapping $f$ is one-to-one and onto by definition. Thus we have the following:
    Proposition 4.1
    $f^*$ provides a unique representation of $h_I(S_{i,j})$, for each $i$, $j$, and element of $I$.
  • 48. CUDA-FMURP v1: Correctness v1
    We have to show that the set of enriched buckets $EB$ obtained in FMURP is equivalent to the set of enriched buckets $\overline{EB}$ obtained in CUDA-FMURP v1.
      $EB = \{B \mid |B| \ge \delta\}$.
    Two elements $S_{i,j}$ and $S_{i',j'}$ belong to the same bucket $B$ if they satisfy the relation $R$ defined below.
    Definition 13 (Relation R)
      $(S_{i,j}, S_{i',j'}) \in B \Leftrightarrow (S_{i,j}, S_{i',j'}) \in R$
      $(S_{i,j}, S_{i',j'}) \in R \Leftrightarrow h_I(S_{i,j}) = h_I(S_{i',j'})$
    Proposition 4.2
    $R$ is an equivalence relation.
  • 52. CUDA-FMURP v1: Correctness v1
    In CUDA-FMURP v1, the set of enriched buckets is defined as
      $\overline{EB} = \{\bar{B} \mid |\bar{B}| \ge \delta\}$,
    where $\bar{B}$ is a bucket in CUDA-FMURP, and two elements $p$ and $q$ belong to the same bucket $\bar{B}$ if they satisfy the relation $\bar{R}$ defined below.
    Definition 14 (Relation $\bar{R}$)
      $(p, q) \in \bar{B} \Leftrightarrow (p, q) \in \bar{R}$
      $(p, q) \in \bar{R} \Leftrightarrow k^*_{i,j} = k^*_{\bar{i},\bar{j}}$
    where $i = \lfloor p/(n-l+1) \rfloor$, $j = p \bmod (n-l+1)$, $\bar{i} = \lfloor q/(n-l+1) \rfloor$, and $\bar{j} = q \bmod (n-l+1)$.
    Lemma 15
    Relations $R$ and $\bar{R}$ are equivalent.
  • 55. CUDA-FMURP v1: Correctness
    Note that the elements of $B$ are $S_{i,j}$s, while the elements of $\bar{B}$ are integers $p \in \{0, \dots, x-1\}$. Using the equations
      $tid = i \times (n - l + 1) + j$   (2)
      $i = \lfloor tid / (n - l + 1) \rfloor$   (3)
      $j = tid \bmod (n - l + 1)$   (4)
    we can retrieve the l-mer $S_{i,j}$ corresponding to tid and vice versa. The theorem below follows from the fact that $R$ and $\bar{R}$ are equivalent.
    Theorem 4.2
    The sets of enriched buckets $EB$ and $\overline{EB}$ are equivalent.
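Equations (2) to (4) describe a bijection between thread IDs and (sequence, position) pairs. The sketch below checks the round trip for the sizes of Example 12; the function names `to_tid` and `from_tid` are hypothetical, and all sequences are assumed to share the same length n.

```python
def to_tid(i, j, n, l):
    """Equation (2): linear thread ID of the l-mer S_{i,j}."""
    return i * (n - l + 1) + j


def from_tid(tid, n, l):
    """Equations (3) and (4): recover (i, j) from the thread ID."""
    return divmod(tid, n - l + 1)


n, l = 8, 4
print(to_tid(1, 2, n, l))   # 7
print(from_tid(11, n, l))   # (2, 1)
```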
  • 57. CUDA-FMURP v2
    INPUT: Set of sequences $S$, motif length $l$, expected mismatches $d$, projection dimension $k$, and bucket threshold $\delta$
    OUTPUT: Motif
    1. In CPU, generate $k$ random positions for projection. Let this be the set $I = \{\hat{i} \mid \hat{i} \in \{0, \dots, l-1\}\}$ with $|I| = k$.
    2. In GPU, for each thread tid in $\{0, \dots, x-1\}$:
       1. Get the $h_I(S_{i,j})$s for all $S_{i,j}$s in $S$, where $i \in \{0, \dots, t-1\}$ and $j \in \{0, \dots, n-l\}$.
       2. Convert each k-mer $h_I(S_{i,j})$ to its corresponding integer representation $k^*_{i,j}$.
    3. In CPU, hash the list of $k^*_{i,j}$s.
    4. In CPU, identify the set of enriched buckets.
    5. In GPU, for each tid in $\{0, \dots, e-1\}$:
       1. Perform EM refinement for each enriched bucket.
       2. Perform SP-STARσ for each enriched bucket.
       3. Maximize the σ score to output the best motif.
  • 59. CUDA-FMURP v2: Hash Table in CPU [figure]
  • 63. CUDA-FMURP v2: Hash Table in CPU
    To resolve collisions between two items with different $k^*_{i,j}$s, linear probing is implemented. Suppose we want to hash item $p$ with key $k^*_{i',j'}$ and find that position $h(k^*_{i',j'})$ is not empty, i.e. $\exists\, k^*_{i,j}$ such that $h(k^*_{i,j}) = h(k^*_{i',j'})$ and $k^*_{i,j} \ne k^*_{i',j'}$. We then have to look for an empty position in the table where we can place item $p$. We explore positions
      $h'(k^*_{i',j'}, i) = (h(k^*_{i,j}) + i) \bmod x$
    for $i$ from $0$ to $(m - 1)$, until an empty hash table position is found.
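The CPU-side hashing with linear probing can be sketched as follows. This is an illustration under explicit assumptions, not the author's implementation: the hash function h(key) = key mod m, the single table size m used for both the modulus and the probe limit, the slot layout (key, list of item ids), and the function name `build_table` are all hypothetical. The `keys` list holds the integer codes of the projections from the l = 4, I = {0, 1} example, indexed by tid.

```python
def build_table(keys, m):
    """Insert item ids into an m-slot table, resolving collisions by linear probing."""
    table = [None] * m  # each slot is None or (key, [item ids sharing that key])
    for item, key in enumerate(keys):
        for i in range(m):
            pos = (key % m + i) % m          # probe sequence h'(key, i)
            if table[pos] is None:           # empty slot: start a new bucket
                table[pos] = (key, [item])
                break
            if table[pos][0] == key:         # same key: item joins the bucket
                table[pos][1].append(item)
                break
        else:
            raise RuntimeError("hash table is full")
    return table


def enriched(table, delta):
    """Buckets holding at least delta items, keyed by their integer code."""
    return {slot[0]: slot[1] for slot in table if slot and len(slot[1]) >= delta}


keys = [6, 10, 11, 13, 4, 15, 13, 6, 8, 1, 1, 6, 8, 3, 14]
table = build_table(keys, m=11)
print(enriched(table, delta=3))  # {6: [0, 7, 11]}
```

With m = 11, key 15 collides with key 4 at slot 4 and is moved to slot 5, while key 14 probes past four occupied slots before settling in slot 7.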
  • 66. CUDA-FMURP v3
    INPUT: Set of sequences $S$, motif length $l$, expected mismatches $d$, projection dimension $k$, and bucket threshold $\delta$
    OUTPUT: Motif
    1. In CPU, generate $k$ random positions for projection. Let this be the set $I = \{\hat{i} \mid \hat{i} \in \{0, \dots, l-1\}\}$ with $|I| = k$.
    2. In GPU, for each thread tid in $\{0, \dots, t-1\}$:
       1. Get the $h_I(S_{tid,j})$s for all $S_{tid,j}$s in $S$, where $j \in \{0, \dots, n-l\}$.
       2. Convert each k-mer $h_I(S_{tid,j})$ to its corresponding integer representation $k^*_{tid,j}$.
    3. In CPU, hash the list of $k^*_{i,j}$s.
    4. In CPU, identify the set of enriched buckets.
    5. In GPU, for each tid in $\{0, \dots, e-1\}$:
       1. Perform EM refinement for each enriched bucket.
       2. Perform SP-STARσ for each enriched bucket.
       3. Maximize the σ score to output the best motif.
  • 68. CUDA-FMURP v3: CUDA-Projection v3 [figure]
  • 70. CUDA-FMURP v3: Integer Conversion [figure]
  • 71. Result and Analysis: Running Time and Space Complexity
    Algorithm        Time          Space              Number of processors
    FMURP            O(x log x)    O(x)               1
    SEQ-FMURP        O(x^2)        O(e(n − l + 1))    1
    CUDA-FMURP v1    O(x)          O(e(n − l + 1))    x
    CUDA-FMURP v2    O(x)          O(e(n − l + 1))    x
    CUDA-FMURP v3    O(x)          O(e(n − l + 1))    t
    Table: Total running time and space complexity of the three parallel algorithms for CUDA-FMURP in comparison with the two sequential implementations.
  • 72. Result and Analysis: Speedup and Efficiency
    FMURP: O(x log x)
    Speedup is the ratio of the sequential running time to the parallel running time:
      $S_P = \dfrac{\text{Sequential}}{\text{Parallel}}$
    The speedups $S_{P_1}$, $S_{P_2}$, and $S_{P_3}$ for CUDA-FMURP versions 1 to 3, respectively, compare as follows:
      $S_{P_1} = S_{P_2} = S_{P_3} = \dfrac{O(x \log x)}{O(x)} = O(\log x)$
  • 73. Result and Analysis: Speedup and Efficiency
    The processor efficiency is computed from the speedup $S_P$ and the number of processors used, $\hat{p}$:
      $E_P = \dfrac{1}{\hat{p}} \cdot S_P$
    The efficiencies $E_{P_1}$, $E_{P_2}$, and $E_{P_3}$ for CUDA-FMURP versions 1 to 3, respectively, compare as follows:
      $E_{P_1} = \dfrac{1}{x} \cdot O(\log x) = \dfrac{\log x}{x}$   (5)
      $E_{P_2} = \dfrac{1}{x} \cdot O(\log x) = \dfrac{\log x}{x}$   (6)
      $E_{P_3} = \dfrac{1}{t} \cdot O(\log x) = \dfrac{\log x}{t}$   (7)
      $E_{P_1} = E_{P_2} < E_{P_3}$
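To get a feel for Equations (5) to (7), the sketch below plugs in the sizes of the challenge problem (t = 20, n = 600, l = 15). It only evaluates the growth terms and ignores the constants hidden by the O-notation; the base-2 logarithm and the name `thread_count` are assumptions made for the illustration.

```python
import math


def thread_count(t, n, l):
    """x = t * (n - l + 1): one thread per l-mer, as in CUDA-FMURP v1 and v2."""
    return t * (n - l + 1)


t, n, l = 20, 600, 15
x = thread_count(t, n, l)
speedup_term = math.log2(x)          # growth term of O(x log x) / O(x)
efficiency_term = {
    "v1": speedup_term / x,          # x processors
    "v2": speedup_term / x,          # x processors
    "v3": speedup_term / t,          # t processors
}
print(x)  # 11720
```

Because v3 uses only t processors instead of x, its efficiency term is larger by a factor of x / t = n − l + 1.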
  • 74. Result and Analysis: Dataset
    t     n     l     d    Instances generated
    20    600   10    2    100
    20    600   11    2    100
    20    600   12    3    100
    20    600   13    3    100
    20    600   14    4    100
    20    600   15    4    100
    20    600   16    5    100
    20    600   17    5    100
    20    600   18    6    100
    20    600   19    6    100
    Table: Summary of the generated dataset used to determine the accuracy of CUDA-FMURP. For each generated instance, the OOPS search model is assumed, that is, each sequence contains exactly one occurrence of the planted motif.
  • 75. Result and Analysis: Dataset, Accuracy
    t     n     l     d    FMURP   FMURP∗   SEQ-FMURP   CUDA-FMURP   m
    20    600   10    2    13      100      98          98           72
    20    600   11    2    99      100      100         100          16
    20    600   12    3    3       96       83          83           259
    20    600   13    3    81      100      100         100          62
    20    600   14    4    1       86       79          79           645
    20    600   15    4    49      100      100         100          172
    20    600   16    5    0       77       53          53           1292
    20    600   17    5    19      98       98          98           378
    20    600   18    6    0       82       38          38           2217
    20    600   19    6    9       98       94          94           711
    Table: Number of correctly identified planted motifs over 100 random input instances. For each instance, the parameters k = 7 and s = 4 are used. The column labelled FMURP∗ is based on the results presented in [Tompa,2001] using the dataset they generated.
  • 76. Result and Analysis: Machine Setups
    System specification        Values (left table)                        Values (right table)
    Host processors (procs)     Core(TM) i7-2600 CPU 3.40GHz               2 × Intel Quad-core 2.26GHz
    Total number of cores       4 × 2 (hyperthreaded) = 8                  8
    Max host RAM                8GB                                        12GB
    Device/s (GPU/s)            1 × NVIDIA GeForce GTX 580                 2 × NVIDIA GT120
    Compute capability          2.0                                        1.1
    CUDA Cores/GPU              16 (multiprocs) × 32 (cores/proc) = 512    4 (multiprocs) × 8 (cores/proc) = 32
    GPU clock rate              1.54 GHz                                   1.40 GHz
    Memory clock rate           2004 MHz                                   500 MHz
    Max device global memory    1535MB                                     512MB
    Operating system            64-bit Ubuntu 10.0.4                       Mac OS X 10.6.8
    CUDA version                4.1                                        3.2
  • 77. Result and Analysis: Actual Speedup
    Figure: Actual speed of CUDA-Projection v3 with respect to CUDA-Projection v1
  • 78. Result and Analysis: Actual Speedup
    Figure: Actual speed of CUDA-FMURP v1 and CUDA-Projection v3
  • 79. Result and Analysis: Actual Speedup
    Figure: Actual Speed Result, Setup 1
  • 80. Result and Analysis: Actual Speedup
    Figure: Memory Requirement
  • 81. Result and Analysis: Actual Speedup
    Figure: Actual speed comparison and speedup of CUDA-FMURP v1 with respect to SEQ-FMURP and FMURP using Setup 2
  • 82. Conclusion
    In this work, we presented three versions of parallel algorithms for FMURP.
    Algorithm        Processors   S_P wrt FMURP   S_P wrt SEQ-FMURP   Efficiency
    CUDA-FMURP v1    x            O(log x)        O(x)                (log x)/x
    CUDA-FMURP v2    x            O(log x)        O(x)                (log x)/x
    CUDA-FMURP v3    t            O(log x)        O(x)                (log x)/t
    We implemented CUDA-FMURP v1 and CUDA-FMURP v2 and achieved maximum actual speedups of 6.8 and 6.6, respectively, with respect to SEQ-FMURP.
  • 84. References
    [Allouche,2003] J.P. Allouche and J. Shallit, “Automatic Sequences: Theory, Applications and Generalizations”, Cambridge University Press, Chapter 3: Numeration Systems, pp. 70-73, 2003.
    [Pevzner,2000] P. Pevzner and S.H. Sze, “Combinatorial Approaches to Finding Subtle Signals in DNA Sequences”, Proceedings of the 8th Int. Conf. on Intelligent Systems for Molecular Biology (ISMB), pp. 269-78, 2000.
    [Tompa,2001] J. Buhler and M. Tompa, “Finding Motifs Using Random Projections”, RECOMB '01: Proceedings of the Fifth Annual International Conference on Computational Biology, 2001.
    D. Kirk and W. Hwu, “Programming Massively Parallel Processors: A Hands-On Approach”, 1st ed., MA, USA: Morgan Kaufmann, 2010.
    M. Harris, “Mapping Computational Concepts to GPUs”, ACM SIGGRAPH 2005 Courses, NY, USA, 2005.
    [Pevzner,2004] N. Jones and P. Pevzner, “An Introduction to Bioinformatics Algorithms”, Massachusetts Institute of Technology Press, 2004.
  • 85. Extra Slides: Finding Motifs using Random Projection (FMURP)
      INPUT: Set of sequences S, motif length l, expected mismatches d, projection dimension k, and bucket threshold δ
      OUTPUT: Motif
      1 Projection
          1 Generate k random positions for projection. Let this be the set I = {î | î ∈ {0, ..., (l − 1)}} with |I| = k.
          2 For each S_{i,j} in S:
              1 Get h_I(S_{i,j}) for all S_{i,j} in S, where i ∈ {0, ..., (t − 1)} and j ∈ {0, ..., (n − l)}.
              2 Sort the S_{i,j}s with respect to h_I(S_{i,j}).
              3 Perform a linear search over all h_I(S_{i,j})s to determine which l-mers are 'hashed' into the same bucket.
      2 Refine each enriched bucket using Expectation Maximization (EM)
      3 Refine each enriched bucket using SP-STARσ
      4 Maximize the score to output the best motif
  • 86. Extra Slides: Projection. Projection: Example. Given a set of DNA sequences S, pattern length l = 4, projection dimension k = 2, and bucket threshold δ = 3.
      S_0 : C G G T C A G G
      S_1 : T T C G A C A T
      S_2 : A C G A T G A A
      Figure: Set of t = 3 sequences, each with n = 8
    We generate the set of k random positions used in the actual projection. Suppose we have the set I = {0, 1}. For all S_{i,j} in S, we get h_I(S_{i,j}) using the random positions in I generated in step 1. To hash the S_{i,j}s to their corresponding buckets using h_I(S_{i,j}), the list defined above is sorted lexicographically by h_I(S_{i,j}), keeping each hash together with its corresponding S_{i,j}. This yields the sorted list.
  • 87. Extra Slides: Projection. Projection: Example.
      Label    S_{i,j}   h_I(S_{i,j})   |   Label    Sorted S_{i,j}   Sorted h_I(S_{i,j})
      S_{0,0}  CGGT      CG             |   S_{2,0}  ACGA             AC
      S_{0,1}  GGTC      GG             |   S_{1,4}  ACAT             AC
      S_{0,2}  GTCA      GT             |   S_{2,3}  ATGA             AT
      S_{0,3}  TCAG      TC             |   S_{0,4}  CAGG             CA
      S_{0,4}  CAGG      CA             |   S_{0,0}  CGGT             CG
      S_{1,0}  TTCG      TT             |   S_{2,1}  CGAT             CG
      S_{1,1}  TCGA      TC             |   S_{1,2}  CGAC             CG
      S_{1,2}  CGAC      CG             |   S_{1,3}  GACA             GA
      S_{1,3}  GACA      GA             |   S_{2,2}  GATG             GA
      S_{1,4}  ACAT      AC             |   S_{0,1}  GGTC             GG
      S_{2,0}  ACGA      AC             |   S_{0,2}  GTCA             GT
      S_{2,1}  CGAT      CG             |   S_{1,1}  TCGA             TC
      S_{2,2}  GATG      GA             |   S_{0,3}  TCAG             TC
      S_{2,3}  ATGA      AT             |   S_{2,4}  TGAA             TG
      S_{2,4}  TGAA      TG             |   S_{1,0}  TTCG             TT
      Figure: The set of h_I(S_{i,j})s computed from step 2 (left) and the sorted list (right)
  • 88. Extra Slides: Projection. Projection: Example. To get the list of buckets, we perform a linear search over the h_I(S_{i,j})s and group the S_{i,j}s that share the same h_I(S_{i,j}).
      h_I(S_{i,j})   Count   S_{i,j}
      AC             2       {ACGA, ACAT}
      AT             1       {ATGA}
      CA             1       {CAGG}
      CG             3       {CGGT, CGAT, CGAC}
      GA             2       {GACA, GATG}
      GG             1       {GGTC}
      GT             1       {GTCA}
      TC             2       {TCGA, TCAG}
      TG             1       {TGAA}
      TT             1       {TTCG}
      Figure: Buckets obtained from Projection
  • 89. Extra Slides: Projection. Projection: Example. From the set of buckets obtained, we identify those that contain at least δ hashed l-mers and consider them enriched. With δ = 3, only bucket CG, which holds {CGGT, CGAT, CGAC}, is enriched.
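The projection example can be reproduced with a short Python sketch. It is an illustration only, not the CUDA-FMURP implementation: the function names `project` and `find_buckets` are hypothetical, and the sort-then-scan step is expressed with `itertools.groupby` instead of an explicit loop.

```python
from itertools import groupby

def project(lmer, positions):
    """h_I: keep only the characters of the l-mer at the chosen positions."""
    return "".join(lmer[p] for p in sorted(positions))

def find_buckets(seqs, l, positions):
    """Hash every l-mer by its projection, sort by hash, then scan linearly."""
    hashed = [(project(s[j:j + l], positions), s[j:j + l])
              for s in seqs
              for j in range(len(s) - l + 1)]
    # Sort on the hash only; Python's sort is stable, so l-mers with equal
    # hashes stay in their original (sequence, position) order.
    hashed.sort(key=lambda pair: pair[0])
    # groupby performs the linear scan over runs of equal hashes.
    return {h: [lmer for _, lmer in group]
            for h, group in groupby(hashed, key=lambda pair: pair[0])}

S = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]
delta = 3  # bucket threshold from the example
buckets = find_buckets(S, l=4, positions=[0, 1])
enriched = {h: b for h, b in buckets.items() if len(b) >= delta}
print(enriched)
```

On the three example sequences this yields ten buckets, of which only CG reaches the threshold. The order of l-mers inside a bucket depends on the sort used and may differ from the order shown on the slide.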
  • 91. Extra Slides: Expectation Maximization (EM)
      INPUT: Motif model θ_0 from one enriched bucket, maximum number of iterations y, and threshold for convergence δ_EM
      OUTPUT: Motif model θ_y
      1 For j in {1, ..., y} or until convergence:
          1 (E-step) For every l-mer in each sequence S_i, compute E(S_{i,a_i} | θ_j) given the current motif model.
          2 (M-step) For all S_i in S, get the vector of starting positions s such that each a_i ∈ s maximizes E(S_{i,a_i} | θ_j) over a_i ∈ {0, ..., (n − l)}.
          3 (Test for convergence) Compute L(θ_j). Compare the previous likelihood L(θ_{j−1}) with the current L(θ_j). If the difference satisfies the threshold δ_EM, stop the iteration.
          4 (Update step) From the alignment defined by the starting-position vector s identified in the M-step, get the motif model θ_{j+1}.
  • 92. Extra Slides: Expectation Maximization (EM). EM: Example. From the set of enriched buckets from Projection, EM performs the following operations. From EB, get the alignment made by the hashed l-mers.
      C G G T
      C G A C
      C G A T
    From this alignment, a profile matrix is computed.
      A: 0 0 2 0
      C: 3 0 0 1
      G: 0 3 1 0
      T: 0 0 0 2
  • 93. Extra Slides: Expectation Maximization (EM). EM: Example. Normalize the profile matrix obtained.
      A: 0.00 0.00 0.66 0.00
      C: 1.00 0.00 0.00 0.33
      G: 0.00 1.00 0.33 0.00
      T: 0.00 0.00 0.00 0.66
    To avoid zero values for Pr(S_{i,j} | θ), [Tompa,2001] performed Laplace correction. For each row corresponding to a symbol a, the probability p_a that symbol a appears in the sequence is added to every entry of that row. Since all symbols in Σ_DNA have a uniform frequency distribution, 0.25 is added to each row.
      A: 0.25 0.25 0.91 0.25
      C: 1.25 0.25 0.25 0.58
      G: 0.25 1.25 0.58 0.25
      T: 0.25 0.25 0.25 0.91
  • 94. Extra Slides: Expectation Maximization (EM). EM: Example. Normalize the matrix obtained and let the resulting matrix be the initial motif model θ_0.
      A: 0.125 0.125 0.455 0.125
      C: 0.625 0.125 0.125 0.290
      G: 0.125 0.625 0.290 0.125
      T: 0.125 0.125 0.125 0.455
    For each S_i in S, find the j ∈ {0, ..., (n − l)} for which E(S_{i,j} | θ_0) is maximum. For instance, let us identify the l-mer in sequence S_0 with the maximum expectation E(S_{0,j} | θ_0).
      E(S_{0,0} | θ_0) = E(CGGT | θ_0) = ((0.625)(0.625)(0.290)(0.455)) / (0.25^4) = 13.195
      E(S_{0,1} | θ_0) = E(GGTC | θ_0) = ((0.125)(0.625)(0.125)(0.290)) / (0.25^4) = 0.725
      E(S_{0,2} | θ_0) = E(GTCA | θ_0) = ((0.125)(0.125)(0.125)(0.125)) / (0.25^4) = 0.063
      E(S_{0,3} | θ_0) = E(TCAG | θ_0) = ((0.125)(0.125)(0.455)(0.125)) / (0.25^4) = 0.228
      E(S_{0,4} | θ_0) = E(CAGG | θ_0) = ((0.625)(0.125)(0.290)(0.125)) / (0.25^4) = 0.725
    Of all the S_{0,j}s in S_0, l-mer S_{0,0} obtains the highest expectation.
  • 95. Extra Slides: Expectation Maximization (EM). EM: Example. The set of l-mers with the highest expectation in each sequence defines another alignment, as in Step 1. From this set of l-mers, we can obtain the next motif model θ_1.
      S_{0,0} : C G G T : 13.20
      S_{1,2} : C G A C : 13.20
      S_{2,1} : C G A T : 20.70
    We compute the likelihood of a motif model θ_y using the best expectations.
      L(θ) = 13.20 + 13.20 + 20.70 = 47.10
    Update the motif model θ_0 to get θ_1, using the set of l-mers from each sequence that maximize the expectation. Stop the iteration if L(θ_y) − L(θ_{y−1}) ≤ δ_EM. The output of EM in this example is the consensus string CGAT.
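The first EM iteration of this example can be sketched in Python as follows. This is a hedged illustration, not the authors' code: the helper names (`motif_model`, `expectation`, `em_step`) are hypothetical, and the sketch works with exact fractions (1/3, 2/3) instead of two-decimal values, so its numbers will not match hand-rounded figures digit for digit.

```python
ALPHABET = "ACGT"
BACKGROUND = 0.25  # uniform background probability of each DNA symbol

def lmers(seq, l):
    """All l-mers S_{i,j} of one sequence, j = 0 .. n - l."""
    return [seq[j:j + l] for j in range(len(seq) - l + 1)]

def motif_model(alignment):
    """Profile counts -> column frequencies -> Laplace correction -> renormalize."""
    l = len(alignment[0])
    corrected = {
        a: [sum(x[p] == a for x in alignment) / len(alignment) + BACKGROUND
            for p in range(l)]
        for a in ALPHABET
    }
    col_sums = [sum(corrected[a][p] for a in ALPHABET) for p in range(l)]
    return {a: [corrected[a][p] / col_sums[p] for p in range(l)]
            for a in ALPHABET}

def expectation(lmer, theta):
    """E(lmer | theta): product of model probabilities over background^l."""
    e = 1.0
    for p, a in enumerate(lmer):
        e *= theta[a][p] / BACKGROUND
    return e

def em_step(seqs, theta, l):
    """Pick the best l-mer in each sequence and sum their expectations."""
    best = [max(lmers(s, l), key=lambda x: expectation(x, theta)) for s in seqs]
    likelihood = sum(expectation(x, theta) for x in best)
    return best, likelihood

S = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]
theta0 = motif_model(["CGGT", "CGAC", "CGAT"])  # l-mers of the enriched bucket
best, likelihood = em_step(S, theta0, l=4)
theta1 = motif_model(best)  # update step: model for the next iteration
print(best, round(likelihood, 2))
```

The selected l-mers are S_{0,0} = CGGT, S_{1,2} = CGAC and S_{2,1} = CGAT. Since they coincide with the l-mers that produced θ_0, the next model θ_1 equals θ_0 and the iteration has converged.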
  • 97. Extra Slides: Expectation Maximization (EM). SP-STARσ
      INPUT: Consensus string M from θ_y and expected mismatches d
      OUTPUT: Refined consensus string M*
      1 For j in {1, ..., y} or until convergence:
          1 Compute S_b, the set containing, for each sequence, the l-mer with the least edit distance from M: S_b = {S_{i,j} | d_E(M, S_{i,j}) is minimum ∀ S_{i,j} in S_i}.
          2 Compute the score σ(S_b), which is the number of l-mers in S_b (one per sequence) such that d_E(M, S_{i,j}) ≤ d.
          3 Compute the consensus string M′ from the alignment made by S_b.
          4 Compute S_b′ from M′.
          5 Compute σ(S_b′).
          6 If σ(S_b′) > σ(S_b), continue the iteration using M = M′; else M* = M.
  • 98. Extra Slides: Expectation Maximization (EM). SP-STARσ: Example. Use M = CGAT and expected mismatches d = 1. Compute S_b. For S_0, the closest S_{0,j} is identified as follows.
      d_E(M, S_{0,0}) = d_E(CGAT, CGGT) = 1
      d_E(M, S_{0,1}) = d_E(CGAT, GGTC) = 3
      d_E(M, S_{0,2}) = d_E(CGAT, GTCA) = 4
      d_E(M, S_{0,3}) = d_E(CGAT, TCAG) = 3
      d_E(M, S_{0,4}) = d_E(CGAT, CAGG) = 3
    The set S_b is
      S_b = {S_{0,0}, S_{1,2}, S_{2,1}} = {CGGT, CGAC, CGAT}
  • 99. Extra Slides: Expectation Maximization (EM). SP-STARσ: Example. The score for S_b is σ(S_b) = 3 because the least edit distances in the three sequences are 1, 1, and 0. That is, all 3 sequences satisfy d_E(M, S_{i,j}) ≤ 1. The consensus string from S_b is M′ = CGAT. S_b′ computed from M′ is the same as S_b.
      S_b′ = {S_{0,0}, S_{1,2}, S_{2,1}} = {CGGT, CGAC, CGAT}
    Since σ(S_b′) = σ(S_b), M* = M = CGAT.
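The SP-STARσ refinement loop can be sketched in Python under two stated assumptions. First, d_E is computed here as the number of position-wise mismatches between two equal-length strings, which reproduces the distances listed in the example. Second, ties in the consensus and in the closest l-mer are broken by first occurrence. All function names and the `max_iter` parameter are hypothetical.

```python
from collections import Counter

def mismatches(a, b):
    """Assumed d_E: count of positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def closest_lmers(seqs, motif):
    """S_b: for each sequence, the l-mer with the least distance from the motif."""
    l = len(motif)
    return [min((s[j:j + l] for j in range(len(s) - l + 1)),
                key=lambda x: mismatches(motif, x))
            for s in seqs]

def score(sb, motif, d):
    """sigma(S_b): number of chosen l-mers within d mismatches of the motif."""
    return sum(mismatches(motif, x) <= d for x in sb)

def consensus(alignment):
    """Most frequent symbol in each column of the alignment."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*alignment))

def sp_star(seqs, motif, d, max_iter=100):
    """Refine the motif while the score strictly improves."""
    sb = closest_lmers(seqs, motif)
    for _ in range(max_iter):
        candidate = consensus(sb)                  # M'
        sb_new = closest_lmers(seqs, candidate)    # S_b'
        if score(sb_new, candidate, d) > score(sb, motif, d):
            motif, sb = candidate, sb_new
        else:
            break
    return motif, sb

S = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]
refined, sb = sp_star(S, "CGAT", d=1)
print(refined, sb, score(sb, refined, 1))
```

For the example input the loop stops after one pass, because the candidate consensus is again CGAT and the score stays at 3.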