SlideShare uma empresa Scribd logo
1 de 46
Baixar para ler offline
Compressed Membership in Automata with
                            Compressed Labels

                                              Markus Lohrey
                                            Christian Mathissen

                                             University of Leipzig


                                              June 16, 2011




Lohrey, Mathissen (University of Leipzig)     Compressed Membership   June 16, 2011   1 / 12
Motivation

  Try to develop algorithms that directly work on compressed data.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   2 / 12
Motivation

  Try to develop algorithms that directly work on compressed data.
  Goal: Beat straightforward decompress and manipulate strategy.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   2 / 12
Motivation

  Try to develop algorithms that directly work on compressed data.
  Goal: Beat straightforward decompress and manipulate strategy.
  In this talk: focus on compressed strings




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   2 / 12
Motivation

  Try to develop algorithms that directly work on compressed data.
  Goal: Beat straightforward decompress and manipulate strategy.
  In this talk: focus on compressed strings
  Applications:
     ◮    all domains, where massive string data arise and have to be
          processed, e.g. bioinformatics




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   2 / 12
Motivation

  Try to develop algorithms that directly work on compressed data.
  Goal: Beat straightforward decompress and manipulate strategy.
  In this talk: focus on compressed strings
  Applications:
     ◮    all domains, where massive string data arise and have to be
          processed, e.g. bioinformatics
     ◮    large (and highly compressible) strings often occur as intermediate
          data structures (e.g. in computational group theory, program analysis,
          verification).




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   2 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.
  A straight-line program (SLP) over the alphabet Γ is a sequence of
  definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for
  some j, k < i .




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.
  A straight-line program (SLP) over the alphabet Γ is a sequence of
  definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for
  some j, k < i .
  Alternatively: a context-free grammar that generates exactly one string.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.
  A straight-line program (SLP) over the alphabet Γ is a sequence of
  definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for
  some j, k < i .
  Alternatively: a context-free grammar that generates exactly one string.
  We write val(A) for the unique word generated by A.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.
  A straight-line program (SLP) over the alphabet Γ is a sequence of
  definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for
  some j, k < i .
  Alternatively: a context-free grammar that generates exactly one string.
  We write val(A) for the unique word generated by A.
  The size of an SLP A = (Ai := αi )1≤i ≤n is |A| = n.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.
  A straight-line program (SLP) over the alphabet Γ is a sequence of
  definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for
  some j, k < i .
  Alternatively: a context-free grammar that generates exactly one string.
  We write val(A) for the unique word generated by A.
  The size of an SLP A = (Ai := αi )1≤i ≤n is |A| = n.
  One may have |val(A)| = 2n .



Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.
  A straight-line program (SLP) over the alphabet Γ is a sequence of
  definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for
  some j, k < i .
  Alternatively: a context-free grammar that generates exactly one string.
  We write val(A) for the unique word generated by A.
  The size of an SLP A = (Ai := αi )1≤i ≤n is |A| = n.
  One may have |val(A)| = 2n .
  Thus, an SLP A can be seen as a compressed representation of val(A).

Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Example: A = (A1 := b,                    A2 := a,      Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7).




Lohrey, Mathissen (University of Leipzig)    Compressed Membership               June 16, 2011   4 / 12
Straight-line programs
  Example: A = (A1 := b,                    A2 := a,      Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7).
  Then val(A) = abaababaabaab:




Lohrey, Mathissen (University of Leipzig)    Compressed Membership               June 16, 2011   4 / 12
Straight-line programs
  Example: A = (A1 := b,                    A2 := a,      Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7).
  Then val(A) = abaababaabaab:

                                     A3 = A2 A1 = ab
                                     A4 = A3 A2 = aba
                                     A5 = A4 A3 = abaab
                                     A6 = A5 A4 = abaababa
                                     A7 = A6 A5 = abaababaabaab




Lohrey, Mathissen (University of Leipzig)    Compressed Membership               June 16, 2011   4 / 12
Straight-line programs
  Example: A = (A1 := b,                    A2 := a,      Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7).
  Then val(A) = abaababaabaab:

                                     A3 = A2 A1 = ab
                                     A4 = A3 A2 = aba
                                     A5 = A4 A3 = abaab
                                     A6 = A5 A4 = abaababa
                                     A7 = A6 A5 = abaababaabaab

  Relationship to dictionary-based compression (Rytter):




Lohrey, Mathissen (University of Leipzig)    Compressed Membership               June 16, 2011   4 / 12
Straight-line programs
  Example: A = (A1 := b,                    A2 := a,      Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7).
  Then val(A) = abaababaabaab:

                                     A3 = A2 A1 = ab
                                     A4 = A3 A2 = aba
                                     A5 = A4 A3 = abaab
                                     A6 = A5 A4 = abaababa
                                     A7 = A6 A5 = abaababaabaab

  Relationship to dictionary-based compression (Rytter):
     ◮    From an SLP A one can compute in polynomial time LZ77(val(A)).




Lohrey, Mathissen (University of Leipzig)    Compressed Membership               June 16, 2011   4 / 12
Straight-line programs
  Example: A = (A1 := b,                    A2 := a,      Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7).
  Then val(A) = abaababaabaab:

                                     A3 = A2 A1 = ab
                                     A4 = A3 A2 = aba
                                     A5 = A4 A3 = abaab
                                     A6 = A5 A4 = abaababa
                                     A7 = A6 A5 = abaababaabaab

  Relationship to dictionary-based compression (Rytter):
     ◮    From an SLP A one can compute in polynomial time LZ77(val(A)).
     ◮    From LZ77(w ) one can compute in polynomial time an SLP A with
          val(A) = w .

Lohrey, Mathissen (University of Leipzig)    Compressed Membership               June 16, 2011   4 / 12
Algorithms on SLP-compressed strings

  The mother of all algorithms on SLP -compressed strings:




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   5 / 12
Algorithms on SLP-compressed strings

  The mother of all algorithms on SLP -compressed strings:


  Plandowski 1994
  The following problem can be solved in polynomial time:
  INPUT: SLPs A, B
  QUESTION: val(A) = val(B)?




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   5 / 12
Algorithms on SLP-compressed strings

  The mother of all algorithms on SLP -compressed strings:


  Plandowski 1994
  The following problem can be solved in polynomial time:
  INPUT: SLPs A, B
  QUESTION: val(A) = val(B)?


  Note: The decompress-and-compare strategy does not work here.
  We cannot compute val(A) and val(B) in polynomial time.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   5 / 12
Algorithms on SLP-compressed strings

  The mother of all algorithms on SLP -compressed strings:


  Plandowski 1994
  The following problem can be solved in polynomial time:
  INPUT: SLPs A, B
  QUESTION: val(A) = val(B)?


  Note: The decompress-and-compare strategy does not work here.
  We cannot compute val(A) and val(B) in polynomial time.

  Plandowski’s algorithm uses combinatorics on words, in particular the
  Periodicity-Lemma of Fine and Wilf.


Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   5 / 12
Improvements of Plandowski’s result


  Gasieniec, Karpinski, Miyazaki, Plandowski, Rytter, Shinohara,
  Takeda (mid 90’s)
  The following problem can be solved in polynomial time
  (fully compressed pattern matching):
  INPUT: SLPs P, T
  QUESTION: Is val(P) a factor of val(T), i.e., ∃x, y : val(T) = x val(P) y ?




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   6 / 12
Improvements of Plandowski’s result


  Gasieniec, Karpinski, Miyazaki, Plandowski, Rytter, Shinohara,
  Takeda (mid 90’s)
  The following problem can be solved in polynomial time
  (fully compressed pattern matching):
  INPUT: SLPs P, T
  QUESTION: Is val(P) a factor of val(T), i.e., ∃x, y : val(T) = x val(P) y ?


  The best known algorithm has a running time of O(|P| · |T|2 )
  (Lifshits 2006).




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   6 / 12
Generalization: Compressed automata
  A compressed automata is an ordinary finite automaton, where every
  transition is labelled with an SLP.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   7 / 12
Generalization: Compressed automata
  A compressed automata is an ordinary finite automaton, where every
  transition is labelled with an SLP.

  Example: The compressed automaton
           Σ                Σ

                      q              A      q

  accepts all words that have val(A) as a factor.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   7 / 12
Generalization: Compressed automata
  A compressed automata is an ordinary finite automaton, where every
  transition is labelled with an SLP.

  Example: The compressed automaton
           Σ                Σ

                      q              A        q

  accepts all words that have val(A) as a factor.

  The size |A| of the compressed automaton A
  (with the set ∆ of transition triples) is

                                            |A| =                |A|.
                                                    (p,A,q)∈∆


Lohrey, Mathissen (University of Leipzig)     Compressed Membership     June 16, 2011   7 / 12
Compressed membership for compressed automata
  Compressed membership for compressed automata:
  INPUT: A compressed automaton A and an SLP B.
  QUESTION: val(B) ∈ L(A)?




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   8 / 12
Compressed membership for compressed automata
  Compressed membership for compressed automata:
  INPUT: A compressed automaton A and an SLP B.
  QUESTION: val(B) ∈ L(A)?


  Plandowski, Rytter 1999

     ◮    Compressed membership for compressed automata belongs to
          PSPACE.
     ◮    Compressed membership for compressed automata over a unary
          alphabet is NP-complete.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   8 / 12
Compressed membership for compressed automata
  Compressed membership for compressed automata:
  INPUT: A compressed automaton A and an SLP B.
  QUESTION: val(B) ∈ L(A)?


  Plandowski, Rytter 1999

     ◮    Compressed membership for compressed automata belongs to
          PSPACE.
     ◮    Compressed membership for compressed automata over a unary
          alphabet is NP-complete.


  Plandowski and Rytter conjectured that compressed membership for
  compressed automata is NP-complete (for every alphabet size).
Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   8 / 12
Some combinatorics on words

  The period of the word w ∈ Σ∗ is the smallest number p such that
  w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p
  (period(w ) = |w | if such a p does not exist).




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   9 / 12
Some combinatorics on words

  The period of the word w ∈ Σ∗ is the smallest number p such that
  w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p
  (period(w ) = |w | if such a p does not exist).
                       |w |
  Let order(w ) = ⌊ period(w ) ⌋.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   9 / 12
Some combinatorics on words

  The period of the word w ∈ Σ∗ is the smallest number p such that
  w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p
  (period(w ) = |w | if such a p does not exist).
                       |w |
  Let order(w ) = ⌊ period(w ) ⌋.

  Example: Let w = abbabbab = (abb)2 ab.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   9 / 12
Some combinatorics on words

  The period of the word w ∈ Σ∗ is the smallest number p such that
  w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p
  (period(w ) = |w | if such a p does not exist).
                       |w |
  Let order(w ) = ⌊ period(w ) ⌋.

  Example: Let w = abbabbab = (abb)2 ab.
  Then period(w ) = 3 and order(w ) = 2.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   9 / 12
Some combinatorics on words

  The period of the word w ∈ Σ∗ is the smallest number p such that
  w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p
  (period(w ) = |w | if such a p does not exist).
                       |w |
  Let order(w ) = ⌊ period(w ) ⌋.

  Example: Let w = abbabbab = (abb)2 ab.
  Then period(w ) = 3 and order(w ) = 2.

  For a compressed automaton A, let:

             order(A) = max{order(val(A)) | A occurs as a label in A}
           period(A) = max{period(val(A)) | A occurs as a label in A}



Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   9 / 12
First main result


  Theorem 1
  For a compressed automaton A and an SLP B, we can check
  val(B) ∈ L(A) in time polynomial in (i) |A|, (ii) |B|, and (iii) order(A).




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   10 / 12
First main result


  Theorem 1
  For a compressed automaton A and an SLP B, we can check
  val(B) ∈ L(A) in time polynomial in (i) |A|, (ii) |B|, and (iii) order(A).


  We use the following combinatorial fact:


     Let u, v ∈ Σ∗ and let p be a position in v . Then there exist at most
     order(u) many occurrences of u in v that “touch” position p.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   10 / 12
Second main result


  Theorem 2
  For a compressed automaton A and an SLP B, we can check
  val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|,
  and (iii) period(A).




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   11 / 12
Second main result


  Theorem 2
  For a compressed automaton A and an SLP B, we can check
  val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|,
  and (iii) period(A).

  Generalizes the result of Plandowski & Rytter for a unary alphabet
  (NP-completeness of compressed membership for compressed automata).




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   11 / 12
Second main result


  Theorem 2
  For a compressed automaton A and an SLP B, we can check
  val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|,
  and (iii) period(A).

  Generalizes the result of Plandowski & Rytter for a unary alphabet
  (NP-completeness of compressed membership for compressed automata).
  A word w = an has period 1!




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   11 / 12
Second main result


  Theorem 2
  For a compressed automaton A and an SLP B, we can check
  val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|,
  and (iii) period(A).

  Generalizes the result of Plandowski & Rytter for a unary alphabet
  (NP-completeness of compressed membership for compressed automata).
  A word w = an has period 1!
  The theorem is proven by a reduction to the case of a unary alphabet.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   11 / 12
Open problems and a conjecture
  Conjecture 1
  Compressed membership for compressed automata is NP-complete.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   12 / 12
Open problems and a conjecture
  Conjecture 1
  Compressed membership for compressed automata is NP-complete.


  Conjecture 2
  If val(B) ∈ L(A) for an SLP B and a compressed automaton A, then there
  exists an accepting run of A on val(B) (viewed as a word over the set of
  transition triples of A), which can be generated by an SLP of size
  poly(|B|, |A|).




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   12 / 12
Open problems and a conjecture
  Conjecture 1
  Compressed membership for compressed automata is NP-complete.


  Conjecture 2
  If val(B) ∈ L(A) for an SLP B and a compressed automaton A, then there
  exists an accepting run of A on val(B) (viewed as a word over the set of
  transition triples of A), which can be generated by an SLP of size
  poly(|B|, |A|).

  Conjecture 2 implies Conjecture 1:
     ◮    Guess nondeterministically an SLP C of polynomial size.
     ◮    Check (in deterministic polynomial time), whether val(C) is an
          accepting run of A on val(B).

Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   12 / 12

Mais conteúdo relacionado

Mais de CSR2011

Csr2011 june18 09_30_shpilka
Csr2011 june18 09_30_shpilkaCsr2011 june18 09_30_shpilka
Csr2011 june18 09_30_shpilkaCSR2011
 
Csr2011 june18 12_00_nguyen
Csr2011 june18 12_00_nguyenCsr2011 june18 12_00_nguyen
Csr2011 june18 12_00_nguyenCSR2011
 
Csr2011 june18 11_00_tiskin
Csr2011 june18 11_00_tiskinCsr2011 june18 11_00_tiskin
Csr2011 june18 11_00_tiskinCSR2011
 
Csr2011 june18 11_30_remila
Csr2011 june18 11_30_remilaCsr2011 june18 11_30_remila
Csr2011 june18 11_30_remilaCSR2011
 
Csr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanovCsr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanovCSR2011
 
Csr2011 june17 16_30_blin
Csr2011 june17 16_30_blinCsr2011 june17 16_30_blin
Csr2011 june17 16_30_blinCSR2011
 
Csr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCsr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCSR2011
 
Csr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCsr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCSR2011
 
Csr2011 june17 12_00_morin
Csr2011 june17 12_00_morinCsr2011 june17 12_00_morin
Csr2011 june17 12_00_morinCSR2011
 
Csr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyiCsr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyiCSR2011
 
Csr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonatiCsr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonatiCSR2011
 
Csr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatovCsr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatovCSR2011
 
Csr2011 june17 15_15_kaminski
Csr2011 june17 15_15_kaminskiCsr2011 june17 15_15_kaminski
Csr2011 june17 15_15_kaminskiCSR2011
 
Csr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatovCsr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatovCSR2011
 
Csr2011 june17 12_00_morin
Csr2011 june17 12_00_morinCsr2011 june17 12_00_morin
Csr2011 june17 12_00_morinCSR2011
 
Csr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyiCsr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyiCSR2011
 
Csr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonatiCsr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonatiCSR2011
 
Csr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCsr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCSR2011
 
Csr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanovCsr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanovCSR2011
 
Csr2011 june16 11_30_georgiadis
Csr2011 june16 11_30_georgiadisCsr2011 june16 11_30_georgiadis
Csr2011 june16 11_30_georgiadisCSR2011
 

Mais de CSR2011 (20)

Csr2011 june18 09_30_shpilka
Csr2011 june18 09_30_shpilkaCsr2011 june18 09_30_shpilka
Csr2011 june18 09_30_shpilka
 
Csr2011 june18 12_00_nguyen
Csr2011 june18 12_00_nguyenCsr2011 june18 12_00_nguyen
Csr2011 june18 12_00_nguyen
 
Csr2011 june18 11_00_tiskin
Csr2011 june18 11_00_tiskinCsr2011 june18 11_00_tiskin
Csr2011 june18 11_00_tiskin
 
Csr2011 june18 11_30_remila
Csr2011 june18 11_30_remilaCsr2011 june18 11_30_remila
Csr2011 june18 11_30_remila
 
Csr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanovCsr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanov
 
Csr2011 june17 16_30_blin
Csr2011 june17 16_30_blinCsr2011 june17 16_30_blin
Csr2011 june17 16_30_blin
 
Csr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCsr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhanin
 
Csr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCsr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhanin
 
Csr2011 june17 12_00_morin
Csr2011 june17 12_00_morinCsr2011 june17 12_00_morin
Csr2011 june17 12_00_morin
 
Csr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyiCsr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyi
 
Csr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonatiCsr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonati
 
Csr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatovCsr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatov
 
Csr2011 june17 15_15_kaminski
Csr2011 june17 15_15_kaminskiCsr2011 june17 15_15_kaminski
Csr2011 june17 15_15_kaminski
 
Csr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatovCsr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatov
 
Csr2011 june17 12_00_morin
Csr2011 june17 12_00_morinCsr2011 june17 12_00_morin
Csr2011 june17 12_00_morin
 
Csr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyiCsr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyi
 
Csr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonatiCsr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonati
 
Csr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCsr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhanin
 
Csr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanovCsr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanov
 
Csr2011 june16 11_30_georgiadis
Csr2011 june16 11_30_georgiadisCsr2011 june16 11_30_georgiadis
Csr2011 june16 11_30_georgiadis
 

Último

How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?IES VE
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsSafe Software
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationIES VE
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarPrecisely
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXTarek Kalaji
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 

Último (20)

How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 
20230104 - machine vision
20230104 - machine vision20230104 - machine vision
20230104 - machine vision
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 

Csr2011 june16 17_00_lohrey

  • 1. Compressed Membership in Automata with Compressed Labels Markus Lohrey Christian Mathissen University of Leipzig June 16, 2011 Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 1 / 12
  • 2. Motivation Try to develop algorithms that directly work on compressed data. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 2 / 12
  • 3. Motivation Try to develop algorithms that directly work on compressed data. Goal: Beat straightforward decompress and manipulate strategy. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 2 / 12
  • 4. Motivation Try to develop algorithms that directly work on compressed data. Goal: Beat straightforward decompress and manipulate strategy. In this talk: focus on compressed strings Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 2 / 12
  • 5. Motivation Try to develop algorithms that directly work on compressed data. Goal: Beat straightforward decompress and manipulate strategy. In this talk: focus on compressed strings Applications: ◮ all domains, where massive string data arise and have to be processed, e.g. bioinformatics Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 2 / 12
  • 6. Motivation Try to develop algorithms that directly work on compressed data. Goal: Beat straightforward decompress and manipulate strategy. In this talk: focus on compressed strings Applications: ◮ all domains, where massive string data arise and have to be processed, e.g. bioinformatics ◮ large (and highly compressible) strings often occur as intermediate data structures (e.g. in computational group theory, program analysis, verification). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 2 / 12
  • 7. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 8. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Straight-line programs are a general representation for compressed strings, which covers most dictionary-based algorithms. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 9. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Straight-line programs are a general representation for compressed strings, which covers most dictionary-based algorithms. A straight-line program (SLP) over the alphabet Γ is a sequence of definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for some j, k < i . Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 10. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Straight-line programs are a general representation for compressed strings, which covers most dictionary-based algorithms. A straight-line program (SLP) over the alphabet Γ is a sequence of definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for some j, k < i . Alternatively: a context-free grammar that generates exactly one string. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 11. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Straight-line programs are a general representation for compressed strings, which covers most dictionary-based algorithms. A straight-line program (SLP) over the alphabet Γ is a sequence of definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for some j, k < i . Alternatively: a context-free grammar that generates exactly one string. We write val(A) for the unique word generated by A. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 12. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Straight-line programs are a general representation for compressed strings, which covers most dictionary-based algorithms. A straight-line program (SLP) over the alphabet Γ is a sequence of definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for some j, k < i . Alternatively: a context-free grammar that generates exactly one string. We write val(A) for the unique word generated by A. The size of an SLP A = (Ai := αi )1≤i ≤n is |A| = n. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 13. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Straight-line programs are a general representation for compressed strings, which covers most dictionary-based algorithms. A straight-line program (SLP) over the alphabet Γ is a sequence of definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for some j, k < i . Alternatively: a context-free grammar that generates exactly one string. We write val(A) for the unique word generated by A. The size of an SLP A = (Ai := αi )1≤i ≤n is |A| = n. One may have |val(A)| = 2n . Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 14. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Straight-line programs are a general representation for compressed strings, which covers most dictionary-based algorithms. A straight-line program (SLP) over the alphabet Γ is a sequence of definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for some j, k < i . Alternatively: a context-free grammar that generates exactly one string. We write val(A) for the unique word generated by A. The size of an SLP A = (Ai := αi )1≤i ≤n is |A| = n. One may have |val(A)| = 2n . Thus, an SLP A can be seen as a compressed representation of val(A). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 15. Straight-line programs Example: A = (A1 := b, A2 := a, Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 4 / 12
  • 16. Straight-line programs Example: A = (A1 := b, A2 := a, Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7). Then val(A) = abaababaabaab: Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 4 / 12
  • 17. Straight-line programs Example: A = (A1 := b, A2 := a, Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7). Then val(A) = abaababaabaab: A3 = A2 A1 = ab A4 = A3 A2 = aba A5 = A4 A3 = abaab A6 = A5 A4 = abaababa A7 = A6 A5 = abaababaabaab Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 4 / 12
  • 18. Straight-line programs Example: A = (A1 := b, A2 := a, Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7). Then val(A) = abaababaabaab: A3 = A2 A1 = ab A4 = A3 A2 = aba A5 = A4 A3 = abaab A6 = A5 A4 = abaababa A7 = A6 A5 = abaababaabaab Relationship to dictionary-based compression (Rytter): Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 4 / 12
  • 19. Straight-line programs Example: A = (A1 := b, A2 := a, Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7). Then val(A) = abaababaabaab: A3 = A2 A1 = ab A4 = A3 A2 = aba A5 = A4 A3 = abaab A6 = A5 A4 = abaababa A7 = A6 A5 = abaababaabaab Relationship to dictionary-based compression (Rytter): ◮ From an SLP A one can compute in polynomial time LZ77(val(A)). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 4 / 12
  • 20. Straight-line programs Example: A = (A1 := b, A2 := a, Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7). Then val(A) = abaababaabaab: A3 = A2 A1 = ab A4 = A3 A2 = aba A5 = A4 A3 = abaab A6 = A5 A4 = abaababa A7 = A6 A5 = abaababaabaab Relationship to dictionary-based compression (Rytter): ◮ From an SLP A one can compute in polynomial time LZ77(val(A)). ◮ From LZ77(w ) one can compute in polynomial time an SLP A with val(A) = w . Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 4 / 12
  • 21. Algorithms on SLP-compressed strings The mother of all algorithms on SLP -compressed strings: Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 5 / 12
  • 22. Algorithms on SLP-compressed strings The mother of all algorithms on SLP -compressed strings: Plandowski 1994 The following problem can be solved in polynomial time: INPUT: SLPs A, B QUESTION: val(A) = val(B)? Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 5 / 12
  • 23. Algorithms on SLP-compressed strings The mother of all algorithms on SLP -compressed strings: Plandowski 1994 The following problem can be solved in polynomial time: INPUT: SLPs A, B QUESTION: val(A) = val(B)? Note: The decompress-and-compare strategy does not work here. We cannot compute val(A) and val(B) in polynomial time. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 5 / 12
  • 24. Algorithms on SLP-compressed strings The mother of all algorithms on SLP -compressed strings: Plandowski 1994 The following problem can be solved in polynomial time: INPUT: SLPs A, B QUESTION: val(A) = val(B)? Note: The decompress-and-compare strategy does not work here. We cannot compute val(A) and val(B) in polynomial time. Plandowski’s algorithm uses combinatorics on words, in particular the Periodicity-Lemma of Fine and Wilf. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 5 / 12
  • 25. Improvements of Plandowski’s result Gasieniec, Karpinski, Miyazaki, Plandowski, Rytter, Shinohara, Takeda (mid 90’s) The following problem can be solved in polynomial time (fully compressed pattern matching): INPUT: SLPs P, T QUESTION: Is val(P) a factor of val(T), i.e., ∃x, y : val(T) = x val(P) y ? Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 6 / 12
  • 26. Improvements of Plandowski’s result Gasieniec, Karpinski, Miyazaki, Plandowski, Rytter, Shinohara, Takeda (mid 90’s) The following problem can be solved in polynomial time (fully compressed pattern matching): INPUT: SLPs P, T QUESTION: Is val(P) a factor of val(T), i.e., ∃x, y : val(T) = x val(P) y ? The best known algorithm has a running time of O(|P| · |T|2 ) (Lifshits 2006). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 6 / 12
  • 27. Generalization: Compressed automata A compressed automata is an ordinary finite automaton, where every transition is labelled with an SLP. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 7 / 12
  • 28. Generalization: Compressed automata A compressed automata is an ordinary finite automaton, where every transition is labelled with an SLP. Example: The compressed automaton Σ Σ q A q accepts all words that have val(A) as a factor. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 7 / 12
  • 29. Generalization: Compressed automata A compressed automata is an ordinary finite automaton, where every transition is labelled with an SLP. Example: The compressed automaton Σ Σ q A q accepts all words that have val(A) as a factor. The size |A| of the compressed automaton A (with the set ∆ of transition triples) is |A| = |A|. (p,A,q)∈∆ Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 7 / 12
  • 30. Compressed membership for compressed automata Compressed membership for compressed automata: INPUT: A compressed automaton A and an SLP B. QUESTION: val(B) ∈ L(A)? Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 8 / 12
  • 31. Compressed membership for compressed automata Compressed membership for compressed automata: INPUT: A compressed automaton A and an SLP B. QUESTION: val(B) ∈ L(A)? Plandowski, Rytter 1999 ◮ Compressed membership for compressed automata belongs to PSPACE. ◮ Compressed membership for compressed automata over a unary alphabet is NP-complete. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 8 / 12
  • 32. Compressed membership for compressed automata Compressed membership for compressed automata: INPUT: A compressed automaton A and an SLP B. QUESTION: val(B) ∈ L(A)? Plandowski, Rytter 1999 ◮ Compressed membership for compressed automata belongs to PSPACE. ◮ Compressed membership for compressed automata over a unary alphabet is NP-complete. Plandowski and Rytter conjectured that compressed membership for compressed automata is NP-complete (for every alphabet size). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 8 / 12
  • 33. Some combinatorics on words The period of the word w ∈ Σ∗ is the smallest number p such that w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p (period(w ) = |w | if such a p does not exist). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 9 / 12
  • 34. Some combinatorics on words The period of the word w ∈ Σ∗ is the smallest number p such that w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p (period(w ) = |w | if such a p does not exist). |w | Let order(w ) = ⌊ period(w ) ⌋. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 9 / 12
  • 35. Some combinatorics on words The period of the word w ∈ Σ∗ is the smallest number p such that w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p (period(w ) = |w | if such a p does not exist). |w | Let order(w ) = ⌊ period(w ) ⌋. Example: Let w = abbabbab = (abb)2 ab. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 9 / 12
  • 36. Some combinatorics on words The period of the word w ∈ Σ∗ is the smallest number p such that w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p (period(w ) = |w | if such a p does not exist). |w | Let order(w ) = ⌊ period(w ) ⌋. Example: Let w = abbabbab = (abb)2 ab. Then period(w ) = 3 and order(w ) = 2. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 9 / 12
  • 37. Some combinatorics on words The period of the word w ∈ Σ∗ is the smallest number p such that w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p (period(w ) = |w | if such a p does not exist). |w | Let order(w ) = ⌊ period(w ) ⌋. Example: Let w = abbabbab = (abb)2 ab. Then period(w ) = 3 and order(w ) = 2. For a compressed automaton A, let: order(A) = max{order(val(A)) | A occurs as a label in A} period(A) = max{period(val(A)) | A occurs as a label in A} Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 9 / 12
  • 38. First main result Theorem 1 For a compressed automaton A and an SLP B, we can check val(B) ∈ L(A) in time polynomial in (i) |A|, (ii) |B|, and (iii) order(A). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 10 / 12
  • 39. First main result Theorem 1 For a compressed automaton A and an SLP B, we can check val(B) ∈ L(A) in time polynomial in (i) |A|, (ii) |B|, and (iii) order(A). We use the following combinatorial fact: Let u, v ∈ Σ∗ and let p be a position in v . Then there exist at most order(u) many occurrences of u in v that “touch” position p. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 10 / 12
  • 40. Second main result Theorem 2 For a compressed automaton A and an SLP B, we can check val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|, and (iii) period(A). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 11 / 12
  • 41. Second main result Theorem 2 For a compressed automaton A and an SLP B, we can check val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|, and (iii) period(A). Generalizes the result of Plandowski & Rytter for a unary alphabet (NP-completeness of compressed membership for compressed automata). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 11 / 12
  • 42. Second main result Theorem 2 For a compressed automaton A and an SLP B, we can check val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|, and (iii) period(A). Generalizes the result of Plandowski & Rytter for a unary alphabet (NP-completeness of compressed membership for compressed automata). A word w = an has period 1! Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 11 / 12
  • 43. Second main result Theorem 2 For a compressed automaton A and an SLP B, we can check val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|, and (iii) period(A). Generalizes the result of Plandowski & Rytter for a unary alphabet (NP-completeness of compressed membership for compressed automata). A word w = an has period 1! The theorem is proven by a reduction to the case of a unary alphabet. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 11 / 12
  • 44. Open problems and a conjecture Conjecture 1 Compressed membership for compressed automata is NP-complete. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 12 / 12
  • 45. Open problems and a conjecture Conjecture 1 Compressed membership for compressed automata is NP-complete. Conjecture 2 If val(B) ∈ L(A) for an SLP B and a compressed automaton A, then there exists an accepting run of A on val(B) (viewed as a word over the set of transition triples of A), which can be generated by an SLP of size poly(|B|, |A|). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 12 / 12
  • 46. Open problems and a conjecture Conjecture 1 Compressed membership for compressed automata is NP-complete. Conjecture 2 If val(B) ∈ L(A) for an SLP B and a compressed automaton A, then there exists an accepting run of A on val(B) (viewed as a word over the set of transition triples of A), which can be generated by an SLP of size poly(|B|, |A|). Conjecture 2 implies Conjecture 1: ◮ Guess nondeterministically an SLP C of polynomial size. ◮ Check (in deterministic polynomial time), whether val(C) is an accepting run of A on val(B). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 12 / 12