SlideShare uma empresa Scribd logo
1 de 32
Baixar para ler offline
Auctions for Distributed (and Possibly Parallel)
                   Matchings

                               E. Jason Riedy
                              jason@acm.org1

                                EECS Department
                         University of California, Berkeley


                            17 December, 2008



1
    Thanks to FBF for funding this Cerfacs visit.
    Jason Riedy (UCB)            Distributed Auctions         17 Dec, 2008   1 / 28
Outline


1   Introduction

2   Linear Assignment Problem (LAP): Mathematical Form

3   Auction Algorithms

4   Distributed Auctions

5   Prospects for Parallel Matching




     Jason Riedy (UCB)       Distributed Auctions   17 Dec, 2008   2 / 28
Motivation: Ever Larger Ax = b
Systems Ax = b are growing larger, more difficult
   Omega3P: n = 7.5 million with τ = 300 million entries
   Quantum Mechanics: precondition with blocks of dimension
   200-350 thousand
   Large barrier-based optimization problems: Many solves, similar
   structure, increasing condition number

   Huge systems are generated, solved, and analyzed automatically.
   Large, highly unsymmetric systems need scalable parallel solvers.
   Low-level routines: No expert in the loop!

   Use static pivoting to decouple symbolic, numeric phases.
   Perturb the factorization and refine the solution to recover
   accuracy.
   Jason Riedy (UCB)       Distributed Auctions        17 Dec, 2008   3 / 28
Sparse Matrix to Bipartite Graph to Pivots
          Col 1Col 2Col 3Col 4                                             Col 1Col 2Col 3Col 4


Row 1                            Row 1                     Col 1   Row 2

Row 2                            Row 2                     Col 2   Row 3

Row 3                            Row 3                     Col 3   Row 1

Row 4                            Row 4                     Col 4   Row 4




Bipartite model
        Each row and column is a vertex.
        Each explicit entry is an edge.
        Want to chose “largest” entries for pivots.
        Maximum weight complete bipartite matching:
                           linear assignment problem
        Jason Riedy (UCB)           Distributed Auctions                       17 Dec, 2008   4 / 28
Mathematical Form
“Just” a linear optimization problem:
          B n × n matrix of benefits in ∪ {−∞}, often c + log2 |A|
          X n × n permutation matrix: the matching
     pr , πc dual variables, will be price and profit
     1r , 1c unit entry vectors corresponding to rows, cols

 Lin. assignment prob.                          Dual problem

   maximize             Tr B T X                  minimize 1T pr + 1T πc
                                                            r       c
    X∈ n×n                                                pr ,πc

  subject to X 1c = 1r ,                        subject to pr 1T + 1r πc ≥ B.
                                                               c
                                                                       T

                    X T 1r = 1c , and
                    X ≥ 0.

    Jason Riedy (UCB)              Distributed Auctions            17 Dec, 2008   5 / 28
Mathematical Form
“Just” a linear optimization problem:
          B n × n matrix of benefits in ∪ {−∞}, often c + log2 |A|
          X n × n permutation matrix: the matching
     pr , πc dual variables, will be price and profit
     1r , 1c unit entry vectors corresponding to rows, cols

 Lin. assignment prob.                          Dual problem
                                                Implicit form:
   maximize             Tr B T X
    X∈ n×n                                        minimize 1T pr
                                                            r
                                                          pr
  subject to X 1c = 1r ,
                                                               +         max(B(i, j)
                    X T 1r = 1c , and                                    i∈R
                                                                   j∈C
                    X ≥ 0.                                           − pr (j)).
    Jason Riedy (UCB)              Distributed Auctions                   17 Dec, 2008   5 / 28
Do We Need a Special Method?
 The LAP:                                      Standard form:

   maximize            Tr B T X                                min c T x
    X∈    n×n                                                   x
                                                                e
  subject to       X 1c = 1r ,                           subject to   Ax = b, and
                   X T 1r = 1c , and                                  x ≥ 0.
                   X ≥ 0.
                                               A: 2n × τ vertex-edge matrix

   Network optimization kills simplex methods.
          (“Smoothed analysis” does not apply.)
   Interior point needs to round the solution.
          (And needs to solve Ax = b for a much larger A.)
   Combinatorial methods should be faster.
          (But unpredictable!)
   Jason Riedy (UCB)              Distributed Auctions                     17 Dec, 2008   6 / 28
Properties from Optimization


Complementary slackness
                       X                T
                           (pr 1T + 1r πc − B) = 0.
                                c



   If (i, j) is in the matching (X (i, j) = 0), then
   pr (i) + πc (j) = B(i, j).
   Used to chose matching edges and modify dual variables in
   combinatorial algorithms.




   Jason Riedy (UCB)           Distributed Auctions   17 Dec, 2008   7 / 28
Properties from Optimization
Relaxed problem
Introduce a parameter µ, two interpretations:
     from a barrier function related to X ≥ 0, or
     from the auction algorithm (later).
Then
          Tr B T X ∗ ≤ 1T pr + 1T πc ≤ Tr B T X ∗ + (n − 1)µ,
                        r       c

or the computed dual value (and hence computed primal matching) is
within (n − 1)µ of the optimal primal.
     Very useful for finding approximately optimal matchings.

Feasibility bound
Starting from zero prices:
                  pr (i) ≤ (n − 1)(µ + finite range of B)
    Jason Riedy (UCB)          Distributed Auctions        17 Dec, 2008   8 / 28
Algorithms for Solving the LAP
Goal: A parallel algorithm that justifies buying big machines.
Acceptable: A distributed algorithm; matrix is on many nodes.

Choices
    Simplex or continuous / interior-point
           Plain simplex blows up, network simplex difficult to parallelize.
           Rounding for interior point often falls back on matching.
           (Optimal IP algorithm: Goldberg, Plotkin, Shmoys, Tardos.
           Needs factorization.)
    Augmenting-path based (Mc64: Duff and Koster)
           Based on depth- or breadth-first search.
           Both are P-complete, inherently sequential (Greenlaw, Reif).
    Auctions (Bertsekas, et al.)
           Only length-1 alternating paths; global sync for duals.
    Jason Riedy (UCB)           Distributed Auctions          17 Dec, 2008   9 / 28
Auction Algorithms
   Discussion will be column-major.
   General structure:
      1   Each unmatched column finds the “best” row, places a bid.
                  The dual variable pr holds the prices.
                  The profit πc is implicit. (No significant FP errors!)
                  Each entry’s value: benefit B(i, j)− price p(i).
                  A bid maximally increases the price of the most valuable row.
      2   Bids are reconciled.
                  Highest proposed price wins, forms a match.
                  Loser needs to re-bid.
                  Some versions need tie-breaking; here least column.
      3   Repeat.
                  Eventually everyone will be matched, or
                  some price will be too high.
   Seq. implementation in ∼40–50 lines, can compete with Mc64
   Some corner cases to handle. . .
   Jason Riedy (UCB)               Distributed Auctions           17 Dec, 2008   10 / 28
The Bid-Finding Loop
For each unmatched column:
                                                          Price
                  Row Index
                   Row Entry



                               value = entry − price
                               Save largest and second−largest
                               Bid price incr: diff. in values


Differences from sparse matrix-vector products
    Not all columns, rows used every iteration.
    Hence output price updates are scattered.
    More local work per entry

    Jason Riedy (UCB)              Distributed Auctions           17 Dec, 2008   11 / 28
The Bid-Finding Loop
For each unmatched column:
                                                          Price
                  Row Index
                   Row Entry



                               value = entry − price
                               Save largest and second−largest
                               Bid price incr: diff. in values


Little points
    Increase bid price by µ to avoid loops
           Needs care in floating-point for small µ.
    Single adjacent row → ∞ price
           Affects feasibility test, computing dual
    Jason Riedy (UCB)              Distributed Auctions           17 Dec, 2008   11 / 28
Termination


   Once a row is matched, it stays matched.
          A new bid may swap it to another column.
          The matching (primal) increases monotonically.
   Prices only increase.
          The dual does not change when a row is newly matched.
          But the dual may decrease when a row is taken.
          The dual decreases monotonically.
   Subtle part: If the dual doesn’t decrease. . .
          It’s ok. Can show the new edge begins an augmenting path that
          increases the matching or an alternating path that decreases the
          dual.



   Jason Riedy (UCB)          Distributed Auctions          17 Dec, 2008   12 / 28
Successive Approximation (µ-scaling)

Complication #1
   Simple auctions aren’t really competitive with Mc64.
   Start with a rough approximation (large µ) and refine.
   Called -scaling in the literature, but µ-scaling is better.
   Preserve the prices pr at each step, but clear the matching.
   Note: Do not clear matches associated with ∞ prices!
   Equivalent to finding diagonal scaling Dr ADc and matching
   again on the new B.
   Problem: Performance strongly depends on initial scaling.
   Also depends strongly on hidden parameters.


   Jason Riedy (UCB)       Distributed Auctions        17 Dec, 2008   13 / 28
Performance Varies on the Same Data!

   Group                Name                realT     intT    real      int
   FEMLAB               poisson3Db          0.014    0.014   0.014     0.013
   GHS indef            ncvxqp5             0.475    0.605   0.476     0.608
   Hamm                 scircuit            0.058    0.018   0.058     0.031
   Schenk IBMSDS        bm matrix 2         1.446    2.336   1.089     1.367
   Schenk IBMSDS        matrix 9            4.955    6.453   3.091     5.401
   Schenk ISEI          barrier2-4          2.915    5.678   6.363     7.699
   Zhao                 Zhao2               1.227    2.726   0.686     1.450
   Vavasis              av41092             5.417    5.172   4.038     6.220
   Hollinger            g7jac200            0.654 2.557 0.848 2.656
   Hollinger            g7jac200sc          0.356 1.505 0.371 0.410

On a Core 2, 2.133ghz. Note: Mc64 performance is in the same range.
    Jason Riedy (UCB)         Distributed Auctions              17 Dec, 2008   14 / 28
Setting / Lowering Parallel Expectations

Performance scalability?
   Originally proposed (early 1990s) when
      cpu speed ≈ memory speed ≈ network speed ≈ slow.
   Now:
        cpu speed     memory latency > network latency.
   Latency dominates matching algorithms (auction and others).
   Communication patterns are very irregular.
   Latency (and software overhead) is not improving. . .

Scaled back goal
       It suffices to not slow down much on distributed data.


   Jason Riedy (UCB)       Distributed Auctions      17 Dec, 2008   15 / 28
Basic Idea: Run Local Auctions, Treat as Bids
                                         1
                                         0                   0
                                                             1
         0 0
         1 1
         1 1
         0 0
       0000000
       1111111
         1 1
         0 0
       1111111
       0000000
         0 0
         1 1
                                     111
                                     000
                                     000
                                     111
                                         1
                                         0
                                         0
                                         1
                                         1
                                         0
                                         0
                                         1
                                                         111
                                                         000
                                                         111
                                                         000
                                                             1
                                                             0
                                                             1
                                                             0
                                                             1
                                                             0
                                                             1
                                                             0
                                                                           00000
                                                                           11111
0000
1111
         0 0
         1 1
         1 1
         0 0             0000
                         1111            0
                                         1
                                         0
                                         1        00
                                                   111
                                                   000
                                                  11         0
                                                             1
                                                             0
                                                             1000
                                                             11
                                                             00
                                                              111
                         1111
                       ⇒ 0000                     11
                                                   111
                                                   000
                                                  00         00
                                                             11
                                                              000
                                                              111
         1 1
         0 0                             0
                                         1                   0
                                                             1
0000
1111     1 1
         0 0
         0 0
         1 1
                         0000
                         1111
                                         0
                                         1
                                         1
                                         0
                                                   000
                                                  00
                                                  11
                                                   111
                                                             0
                                                             1
                                                             11
                                                             00
                                                             0
                                                             1
                                                              000
                                                              111
1111
0000     1 1
         0 0
         1 1
         0 0
                         0000
                         1111
                                         0
                                         1
                                                   111
                                                  11
                                                  00
                                                   000
                                                             0
                                                             1
                                                             11
                                                             00
                                                              111
                                                              000
1111
0000
1111
0000
         0 0
         1 1
          B
         0 0
         1 1
         0 0
         1 1             0000
                         1111
                                         1
                                         0
                                         1
                                         0
                                         0
                                         1
                                         1
                                         0         000
                                                  11
                                                  00
                                                   111
                                                             1
                                                             0
                                                             0
                                                             1
                                                             11
                                                             00
                                                             0
                                                             1
                                                             1
                                                             0000
                                                              111
1111
0000
         1 1
         0 0
         0 0
         1 1
         1 1
         0 0
         1 1
         0 0
                         0000
                         1111            1
                                         0
                                         1
                                         0
                                         0
                                         1
                                         1
                                         0
                                                  11
                                                   000
                                                   111
                                                  00         11
                                                             00
                                                             0
                                                             1
                                                             0
                                                             1
                                                             1
                                                             0
                                                             1
                                                             0
                                                              111
                                                              000
         1 1
         0 0                             1
                                      P1 0                P2 0
                                                             1              P3

   Slice the matrix into pieces, run local auctions.
   The winning local bids are the slices’ bids.
   Merge. . . (“And then a miracle occurs. . .”)
   Need to keep some data in sync for termination.

   Jason Riedy (UCB)       Distributed Auctions             17 Dec, 2008    16 / 28
Basic Idea: Run Local Auctions, Treat as Bids
                                            1
                                            0                   0
                                                                1
         0 0
         1 1
         1 1
         0 0
       0000000
       1111111
         1 1
         0 0
       1111111
       0000000
         0 0
         1 1
                                        111
                                        000
                                        000
                                        111
                                            1
                                            0
                                            0
                                            1
                                            1
                                            0
                                            0
                                            1
                                                            111
                                                            000
                                                            111
                                                            000
                                                                1
                                                                0
                                                                1
                                                                0
                                                                1
                                                                0
                                                                1
                                                                0
                                                                              00000
                                                                              11111
0000
1111
         0 0
         1 1
         1 1
         0 0             0000
                         1111               0
                                            1
                                            0
                                            1        00
                                                      111
                                                      000
                                                     11         0
                                                                1
                                                                0
                                                                1000
                                                                11
                                                                00
                                                                 111
                         1111
                       ⇒ 0000                        11
                                                      111
                                                      000
                                                     00         00
                                                                11
                                                                 000
                                                                 111
         1 1
         0 0                                0
                                            1                   0
                                                                1
0000
1111     1 1
         0 0
         0 0
         1 1
                         0000
                         1111
                                            0
                                            1
                                            1
                                            0
                                                      000
                                                     00
                                                     11
                                                      111
                                                                0
                                                                1
                                                                11
                                                                00
                                                                0
                                                                1
                                                                 000
                                                                 111
1111
0000     1 1
         0 0
         1 1
         0 0
                         0000
                         1111
                                            0
                                            1
                                                      111
                                                     11
                                                     00
                                                      000
                                                                0
                                                                1
                                                                11
                                                                00
                                                                 111
                                                                 000
1111
0000
1111
0000
         0 0
         1 1
          B
         0 0
         1 1
         0 0
         1 1             0000
                         1111
                                            1
                                            0
                                            1
                                            0
                                            0
                                            1
                                            1
                                            0         000
                                                     11
                                                     00
                                                      111
                                                                1
                                                                0
                                                                0
                                                                1
                                                                11
                                                                00
                                                                0
                                                                1
                                                                1
                                                                0000
                                                                 111
1111
0000
         1 1
         0 0
         0 0
         1 1
         1 1
         0 0
         1 1
         0 0
                         0000
                         1111               1
                                            0
                                            1
                                            0
                                            0
                                            1
                                            1
                                            0
                                                     11
                                                      000
                                                      111
                                                     00         11
                                                                00
                                                                0
                                                                1
                                                                0
                                                                1
                                                                1
                                                                0
                                                                1
                                                                0
                                                                 111
                                                                 000
         1 1
         0 0                                1
                                         P1 0                P2 0
                                                                1              P3

   Can be memory scalable: Compact the local pieces.
   Have not experimented with simple SMP version.
          Sequential performance is limited by the memory system.
   Note: Could be useful for multicore w/local memory.


   Jason Riedy (UCB)          Distributed Auctions             17 Dec, 2008    16 / 28
Ring Around the Auction
        Fits nodes-of-smp architectures well (Itanium2/Myrinet).
        Needs O( largest # of rows ) data, may be memory scalable.
        Initial, sparse matrices cannot be spread across too many
        processors. . . (Below: At least 500 cols per proc.)
                                                                                                                                     q
                                          Auction                                                                             Auction v. root
                                                                                                                                                     q

                                                                                                                                                     q




                                                                                                                      q



                                                                                                              q
                   8   q                                                                                  8   q
                       q




                                                                                      Re−scaled speedup
                   6                                                                                      6
         Speedup




                                                                                                                      q              q
                                                                                                                  q
                                                                                                              q   q
                                                                                                              q       q
                                                                                                                                     q
                                                                                                                      q
                                                                                                                  q   q
                                                                                                              q       q
                   4       q
                               q
                                                                                                          4       q
                                                                                                                  q
                           q
                                                                                                              q   q
                       q   q                                                                                  q   q                                  q
                                                                                                                                     q                         q
                                                                                                              q
                       q       q                                                                                      q
                                                                                                              q   q                  q
                       q
                       q   q                                                                                  q                      q               q
                       q                      q                                                               q   q
                           q                                                                                  q   q
                                                                                                                  q                                            q
                   2   q
                       q   q                                                                              2   q
                                                                                                              q       q
                           q   q                              q                                                                                      q
                       q
                       q       q                                                                              q   q   q                              q
                       q       q
                           q                  q               q                                                   q
                           q                                                                                                                         q
                           q                                                                                          q              q
                       q   q   q
                               q                                                                              q   q                                  q
                                              q
                                              q                                                                   q
                       q   q                  q               q                                                       q              q
                               q
                               q              q               q        q                                              q
                       q                                                                                                             q
                                                                                                                                     q
                               q              q               q                                                   q   q
                           q   q                                       q                                              q              q               q         q
                               q              q               q
                                                              q        q                                                             q               q         q
                               q
                               q              q
                                              q               q        q                                              q
                                                                                                                      q              q
                                                                                                                                     q               q
                                                                                                                                                     q
                                                              q                                                                                      q

                                   5                 10           15                                                      5                 10            15
                                       Number of Processors                                                                   Number of Processors




On the Citris Itanium2 cluster with Myrinet.
       Jason Riedy (UCB)                                               Distributed Auctions                                                          17 Dec, 2008   17 / 28
Other Communication Schemes

Blind all-to-all
    Send all the bids to everyone.
    A little slower than the ring.
    May need O(# local rows · p) intermediate data; not memory
    scalable.
    Needs O(p) communication; not processor scalable

Tree reduction
    Reduction operation: MPI MAXLOC on array of { price, column }.
    Much slower than everything else.
    Each node checks result, handles unmatching locally.
    Needs O(n) intermediate data; not memory scalable.

    Jason Riedy (UCB)      Distributed Auctions      17 Dec, 2008   18 / 28
Diving in: Lessons from Examples


   Three different matrices, four different perf. profiles.
   All are for the ring style.
   Only up to 16 processors; that’s enough to cause problems.
   Performance dependencies:
          mostly the # of comm. phases, and
          a little on the total # of entries scanned along the critical path.
   The latter decreases with more processors, as it must.
   But the former is wildly unpredictable.




   Jason Riedy (UCB)           Distributed Auctions           17 Dec, 2008   19 / 28
Diving in: Example That Scales
Matrix Nemeth/nemeth22
    n = 9 506, nent = 1 358 832
    Entries roughly even across nodes.
    This one is fast and scales.
    But 8 processors is difficult to explain.

              # Proc     Time     # Comm # Ent. Scanned
                   1    0.2676          0    839 482 003
                   2    0.0741         55  1 636 226 414
                   4    0.0307         20    412 509 573
                   8    0.0176         27    105 229 945
                  16    0.0153         21     27 770 769

On jacquard.nersc.gov, Opteron and Infiniband.
    Jason Riedy (UCB)            Distributed Auctions   17 Dec, 2008   20 / 28
Diving in: Example That Does Not Scale

Matrix GHS indef/ncvxqp5
   n = 62 500, nent = 424 966
   Entries roughly even across nodes.
   This one is superlinear in the wrong sense.

            # Proc   Time # Comm                  # Ent. Scanned
                 1   0.910         0                  989 373 986
                 2   1.128    65 370                1 934 162 133
                 4   2.754    63 228                1 458 840 434
                 8 15.924    216 178                  748 628 941
                16 177.282 1 353 734                   96 183 742


   Jason Riedy (UCB)       Distributed Auctions               17 Dec, 2008   21 / 28
Diving in: Example That Confuses Everything

Matrix Vavasis/av41092
   n = 41 092, nent = 16 839 024
   Entries roughly even across nodes.
   Performance ∼ # Comm
   What?!?!
                            Not transposed:
             # Proc     Time # Comm #                Ent. Scanned
                  1     9.042          0             1 760 564 335
                  2     5.328     24 248             2 094 140 729
                  4     6.218     57 553             1 742 989 035
                  8    15.480    209 393             1 109 156 585
                 16    68.908    675 635               321 907 160

   Jason Riedy (UCB)          Distributed Auctions              17 Dec, 2008   22 / 28
Diving in: Example That Confuses Everything

Matrix Vavasis/av41092
   n = 41 092, nent = 16 839 024
   Entries roughly even across nodes.
   Performance ∼ # Comm
   What?!?!
                          Transposed:
           # Proc     Time # Comm # Ent. Scanned
                1   10.044          0 2 010 702 016
                2   10.832    887 047 1 776 252 776
                4   41.417 1 475 564  1 974 921 328
                8   18.929    249 947   844 718 754
               16 (forever)

   Jason Riedy (UCB)       Distributed Auctions   17 Dec, 2008   22 / 28
So What Happened?

   Matrix av41092 has one large strongly connected component.
          (The square blocks in a Dulmage-Mendelsohn decomposition.)
   The SCC spans all the processors.
   Every edge in an SCC is a part of some complete matching.
   Horrible performance from:
          starting along a non-max-weight matching,
          making it almost complete,
          then an edge-by-edge search for nearby matchings,
          requiring a communication phase almost per edge.
   Conjecture: This type of performance land-mine will affect any
   0-1 combinatorial algorithm.


   Jason Riedy (UCB)          Distributed Auctions        17 Dec, 2008   23 / 28
Improvements?

   Rearranging deck chairs: few-to-few communication
          Build a directory of which nodes share rows: collapsed BB T .
          Send only to/from those neighbors.
          Minor improvement over MPI Allgatherv for a huge effort.
          Still too fragile to trust perf. results
   Improving communication may not be worth it. . .
          The real problem is the number of comm. phases.
          If diagonal is the matching, everything is overhead.
          Or if there’s a large SCC. . .
   Another alternative: Multiple algorithms at once.
          Run Bora U¸ar’s alg. on one set of nodes, auction on another,
                     c
          transposed auction on another, . . .
          Requires some painful software engineering.

   Jason Riedy (UCB)           Distributed Auctions          17 Dec, 2008   24 / 28
Forward-Reverse Auctions
Improving the algorithm
Forward-reverse auctions alternate directions.
    Start column-major.
    Once there has been some progress, but progress stops, switch
    to row-major.
    Switch back when stalled after making some progress.

    Much less sensitive to initial scaling.
    Does not need µ-scaling, so trivial cases should be faster.
    But this require the transpose.
           Few-to-few communication very nearly requires the transpose
           already. . .
           Later stages (symbolic factorization) also require some
           transpose information. . .
    Jason Riedy (UCB)          Distributed Auctions        17 Dec, 2008   25 / 28
So, Could This Ever Be Parallel?

Doubtful?
   For a given matrix-processor layout, constructing a matrix
   requiring O(n) communication is pretty easy for combinatorial
   algorithms.
          Force almost every local action to be undone at every step.
          Non-fractional combinatorial algorithms are too restricted.
   Using less-restricted optimization methods is promising, but far
   slower sequentially.
          Existing algs (Goldberg, et al.) are PRAM with n3 processors.
          General purpose methods: Cutting planes, successive SDPs
          Someone clever might find a parallel rounding algorithm.
          Solving the fractional LAP quickly would become a matter of
          finding a magic preconditioner. . .
          Maybe not a good thing for a direct method?

   Jason Riedy (UCB)          Distributed Auctions         17 Dec, 2008   26 / 28
So, Could This Ever Be Parallel?



Another possibility?
    If we could quickly compute the dual by scaling. . .
    Use the complementary slackness condition to produce a much
    smaller, unweighted problem.
    Solve that on one node?
    May be a practical alternative.




   Jason Riedy (UCB)      Distributed Auctions      17 Dec, 2008   27 / 28
Questions?
(I’m currently working on an Octave-based, parallel forw/rev auction
to see if it may help...)




    Jason Riedy (UCB)       Distributed Auctions       17 Dec, 2008   28 / 28

Mais conteúdo relacionado

Mais procurados

Matrix Models of 2D String Theory in Non-trivial Backgrounds
Matrix Models of 2D String Theory in Non-trivial BackgroundsMatrix Models of 2D String Theory in Non-trivial Backgrounds
Matrix Models of 2D String Theory in Non-trivial BackgroundsUtrecht University
 
B. Sazdovic - Noncommutativity and T-duality
B. Sazdovic - Noncommutativity and T-dualityB. Sazdovic - Noncommutativity and T-duality
B. Sazdovic - Noncommutativity and T-dualitySEENET-MTP
 
Bouguet's MatLab Camera Calibration Toolbox for Stereo Camera
Bouguet's MatLab Camera Calibration Toolbox for Stereo CameraBouguet's MatLab Camera Calibration Toolbox for Stereo Camera
Bouguet's MatLab Camera Calibration Toolbox for Stereo CameraYuji Oyamada
 
Mesh Processing Course : Mesh Parameterization
Mesh Processing Course : Mesh ParameterizationMesh Processing Course : Mesh Parameterization
Mesh Processing Course : Mesh ParameterizationGabriel Peyré
 
Mesh Processing Course : Differential Calculus
Mesh Processing Course : Differential CalculusMesh Processing Course : Differential Calculus
Mesh Processing Course : Differential CalculusGabriel Peyré
 
Exchange confirm
Exchange confirmExchange confirm
Exchange confirmNBER
 
T. Popov - Drinfeld-Jimbo and Cremmer-Gervais Quantum Lie Algebras
T. Popov - Drinfeld-Jimbo and Cremmer-Gervais Quantum Lie AlgebrasT. Popov - Drinfeld-Jimbo and Cremmer-Gervais Quantum Lie Algebras
T. Popov - Drinfeld-Jimbo and Cremmer-Gervais Quantum Lie AlgebrasSEENET-MTP
 
CMA-ES with local meta-models
CMA-ES with local meta-modelsCMA-ES with local meta-models
CMA-ES with local meta-modelszyedb
 
Mesh Processing Course : Geodesics
Mesh Processing Course : GeodesicsMesh Processing Course : Geodesics
Mesh Processing Course : GeodesicsGabriel Peyré
 
M. Visinescu - Higher Order First Integrals, Killing Tensors, Killing-Maxwell...
M. Visinescu - Higher Order First Integrals, Killing Tensors, Killing-Maxwell...M. Visinescu - Higher Order First Integrals, Killing Tensors, Killing-Maxwell...
M. Visinescu - Higher Order First Integrals, Killing Tensors, Killing-Maxwell...SEENET-MTP
 
Adaptive Signal and Image Processing
Adaptive Signal and Image ProcessingAdaptive Signal and Image Processing
Adaptive Signal and Image ProcessingGabriel Peyré
 
CVPR2010: higher order models in computer vision: Part 1, 2
CVPR2010: higher order models in computer vision: Part 1, 2CVPR2010: higher order models in computer vision: Part 1, 2
CVPR2010: higher order models in computer vision: Part 1, 2zukun
 
Image Processing 3
Image Processing 3Image Processing 3
Image Processing 3jainatin
 
Learning with Nets and Meshes
Learning with Nets and MeshesLearning with Nets and Meshes
Learning with Nets and MeshesDon Sheehy
 
Linear-Size Approximations to the Vietoris-Rips Filtration - Presented at Uni...
Linear-Size Approximations to the Vietoris-Rips Filtration - Presented at Uni...Linear-Size Approximations to the Vietoris-Rips Filtration - Presented at Uni...
Linear-Size Approximations to the Vietoris-Rips Filtration - Presented at Uni...Don Sheehy
 

Mais procurados (20)

Matrix Models of 2D String Theory in Non-trivial Backgrounds
Matrix Models of 2D String Theory in Non-trivial BackgroundsMatrix Models of 2D String Theory in Non-trivial Backgrounds
Matrix Models of 2D String Theory in Non-trivial Backgrounds
 
B. Sazdovic - Noncommutativity and T-duality
B. Sazdovic - Noncommutativity and T-dualityB. Sazdovic - Noncommutativity and T-duality
B. Sazdovic - Noncommutativity and T-duality
 
Bouguet's MatLab Camera Calibration Toolbox for Stereo Camera
Bouguet's MatLab Camera Calibration Toolbox for Stereo CameraBouguet's MatLab Camera Calibration Toolbox for Stereo Camera
Bouguet's MatLab Camera Calibration Toolbox for Stereo Camera
 
Mesh Processing Course : Mesh Parameterization
Mesh Processing Course : Mesh ParameterizationMesh Processing Course : Mesh Parameterization
Mesh Processing Course : Mesh Parameterization
 
Mo u quantified
Mo u   quantifiedMo u   quantified
Mo u quantified
 
Mesh Processing Course : Differential Calculus
Mesh Processing Course : Differential CalculusMesh Processing Course : Differential Calculus
Mesh Processing Course : Differential Calculus
 
Exchange confirm
Exchange confirmExchange confirm
Exchange confirm
 
Simplex3
Simplex3Simplex3
Simplex3
 
T. Popov - Drinfeld-Jimbo and Cremmer-Gervais Quantum Lie Algebras
T. Popov - Drinfeld-Jimbo and Cremmer-Gervais Quantum Lie AlgebrasT. Popov - Drinfeld-Jimbo and Cremmer-Gervais Quantum Lie Algebras
T. Popov - Drinfeld-Jimbo and Cremmer-Gervais Quantum Lie Algebras
 
Savage-Dickey paradox
Savage-Dickey paradoxSavage-Dickey paradox
Savage-Dickey paradox
 
CMA-ES with local meta-models
CMA-ES with local meta-modelsCMA-ES with local meta-models
CMA-ES with local meta-models
 
Mesh Processing Course : Geodesics
Mesh Processing Course : GeodesicsMesh Processing Course : Geodesics
Mesh Processing Course : Geodesics
 
BIRS 12w5105 meeting
BIRS 12w5105 meetingBIRS 12w5105 meeting
BIRS 12w5105 meeting
 
M. Visinescu - Higher Order First Integrals, Killing Tensors, Killing-Maxwell...
M. Visinescu - Higher Order First Integrals, Killing Tensors, Killing-Maxwell...M. Visinescu - Higher Order First Integrals, Killing Tensors, Killing-Maxwell...
M. Visinescu - Higher Order First Integrals, Killing Tensors, Killing-Maxwell...
 
UCB 2012-02-28
UCB 2012-02-28UCB 2012-02-28
UCB 2012-02-28
 
Adaptive Signal and Image Processing
Adaptive Signal and Image ProcessingAdaptive Signal and Image Processing
Adaptive Signal and Image Processing
 
CVPR2010: higher order models in computer vision: Part 1, 2
CVPR2010: higher order models in computer vision: Part 1, 2CVPR2010: higher order models in computer vision: Part 1, 2
CVPR2010: higher order models in computer vision: Part 1, 2
 
Image Processing 3
Image Processing 3Image Processing 3
Image Processing 3
 
Learning with Nets and Meshes
Learning with Nets and MeshesLearning with Nets and Meshes
Learning with Nets and Meshes
 
Linear-Size Approximations to the Vietoris-Rips Filtration - Presented at Uni...
Linear-Size Approximations to the Vietoris-Rips Filtration - Presented at Uni...Linear-Size Approximations to the Vietoris-Rips Filtration - Presented at Uni...
Linear-Size Approximations to the Vietoris-Rips Filtration - Presented at Uni...
 

Destaque

Micro-benchmarking the Tera MTA
Micro-benchmarking the Tera MTAMicro-benchmarking the Tera MTA
Micro-benchmarking the Tera MTAJason Riedy
 
ICASSP 2012: Analysis of Streaming Social Networks and Graphs on Multicore Ar...
ICASSP 2012: Analysis of Streaming Social Networks and Graphs on Multicore Ar...ICASSP 2012: Analysis of Streaming Social Networks and Graphs on Multicore Ar...
ICASSP 2012: Analysis of Streaming Social Networks and Graphs on Multicore Ar...Jason Riedy
 
Updating PageRank for Streaming Graphs
Updating PageRank for Streaming GraphsUpdating PageRank for Streaming Graphs
Updating PageRank for Streaming GraphsJason Riedy
 
Sparse Data Structures for Weighted Bipartite Matching
Sparse Data Structures for Weighted Bipartite Matching Sparse Data Structures for Weighted Bipartite Matching
Sparse Data Structures for Weighted Bipartite Matching Jason Riedy
 
STING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel PlatformsSTING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel PlatformsJason Riedy
 
Scalable and Efficient Algorithms for Analysis of Massive, Streaming Graphs
Scalable and Efficient Algorithms for Analysis of Massive, Streaming GraphsScalable and Efficient Algorithms for Analysis of Massive, Streaming Graphs
Scalable and Efficient Algorithms for Analysis of Massive, Streaming GraphsJason Riedy
 
SIAM Annual Meeting 2012: Streaming Graph Analytics for Massive Graphs
SIAM Annual Meeting 2012: Streaming Graph Analytics for Massive GraphsSIAM Annual Meeting 2012: Streaming Graph Analytics for Massive Graphs
SIAM Annual Meeting 2012: Streaming Graph Analytics for Massive GraphsJason Riedy
 
Graph Exploitation Seminar, 2011
Graph Exploitation Seminar, 2011Graph Exploitation Seminar, 2011
Graph Exploitation Seminar, 2011Jason Riedy
 
STING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel PlatformsSTING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel PlatformsJason Riedy
 
Graph Analysis Trends and Opportunities -- CMG Performance and Capacity 2014
Graph Analysis Trends and Opportunities -- CMG Performance and Capacity 2014Graph Analysis Trends and Opportunities -- CMG Performance and Capacity 2014
Graph Analysis Trends and Opportunities -- CMG Performance and Capacity 2014Jason Riedy
 
Graph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear AlgebraGraph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear AlgebraJason Riedy
 
Future of LAPACK and ScaLAPACK
Future of LAPACK and ScaLAPACKFuture of LAPACK and ScaLAPACK
Future of LAPACK and ScaLAPACKJason Riedy
 
Professional Teaching and CTE License
Professional Teaching and CTE LicenseProfessional Teaching and CTE License
Professional Teaching and CTE LicenseChristopher S. Kelley
 
Mathematical model of ESP
Mathematical model of ESPMathematical model of ESP
Mathematical model of ESPAbhijeet Das
 
IDENTIFICAZIONE STRUTTURALE DEL COMPORTAMENTO SPERIMENTALE DI CENTINE INNOVAT...
IDENTIFICAZIONE STRUTTURALE DEL COMPORTAMENTO SPERIMENTALE DI CENTINE INNOVAT...IDENTIFICAZIONE STRUTTURALE DEL COMPORTAMENTO SPERIMENTALE DI CENTINE INNOVAT...
IDENTIFICAZIONE STRUTTURALE DEL COMPORTAMENTO SPERIMENTALE DI CENTINE INNOVAT...StroNGER2012
 

Destaque (20)

Micro-benchmarking the Tera MTA
Micro-benchmarking the Tera MTAMicro-benchmarking the Tera MTA
Micro-benchmarking the Tera MTA
 
ICASSP 2012: Analysis of Streaming Social Networks and Graphs on Multicore Ar...
ICASSP 2012: Analysis of Streaming Social Networks and Graphs on Multicore Ar...ICASSP 2012: Analysis of Streaming Social Networks and Graphs on Multicore Ar...
ICASSP 2012: Analysis of Streaming Social Networks and Graphs on Multicore Ar...
 
Updating PageRank for Streaming Graphs
Updating PageRank for Streaming GraphsUpdating PageRank for Streaming Graphs
Updating PageRank for Streaming Graphs
 
Sparse Data Structures for Weighted Bipartite Matching
Sparse Data Structures for Weighted Bipartite Matching Sparse Data Structures for Weighted Bipartite Matching
Sparse Data Structures for Weighted Bipartite Matching
 
STING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel PlatformsSTING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
 
Scalable and Efficient Algorithms for Analysis of Massive, Streaming Graphs
Scalable and Efficient Algorithms for Analysis of Massive, Streaming GraphsScalable and Efficient Algorithms for Analysis of Massive, Streaming Graphs
Scalable and Efficient Algorithms for Analysis of Massive, Streaming Graphs
 
SIAM Annual Meeting 2012: Streaming Graph Analytics for Massive Graphs
SIAM Annual Meeting 2012: Streaming Graph Analytics for Massive GraphsSIAM Annual Meeting 2012: Streaming Graph Analytics for Massive Graphs
SIAM Annual Meeting 2012: Streaming Graph Analytics for Massive Graphs
 
Graph Exploitation Seminar, 2011
Graph Exploitation Seminar, 2011Graph Exploitation Seminar, 2011
Graph Exploitation Seminar, 2011
 
STING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel PlatformsSTING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
 
Graph Analysis Trends and Opportunities -- CMG Performance and Capacity 2014
Graph Analysis Trends and Opportunities -- CMG Performance and Capacity 2014Graph Analysis Trends and Opportunities -- CMG Performance and Capacity 2014
Graph Analysis Trends and Opportunities -- CMG Performance and Capacity 2014
 
Graph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear AlgebraGraph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear Algebra
 
Future of LAPACK and ScaLAPACK
Future of LAPACK and ScaLAPACKFuture of LAPACK and ScaLAPACK
Future of LAPACK and ScaLAPACK
 
Web meets culture
Web meets cultureWeb meets culture
Web meets culture
 
13 primavara
13 primavara13 primavara
13 primavara
 
Survey Kebutuhan Pengguna Perpustakaan BAPUSIPDA JABAR
Survey Kebutuhan Pengguna Perpustakaan BAPUSIPDA JABARSurvey Kebutuhan Pengguna Perpustakaan BAPUSIPDA JABAR
Survey Kebutuhan Pengguna Perpustakaan BAPUSIPDA JABAR
 
Professional Teaching and CTE License
Professional Teaching and CTE LicenseProfessional Teaching and CTE License
Professional Teaching and CTE License
 
Mathematical model of ESP
Mathematical model of ESPMathematical model of ESP
Mathematical model of ESP
 
Andrew Saunders Resume
Andrew Saunders Resume Andrew Saunders Resume
Andrew Saunders Resume
 
Colorado Principal License
Colorado Principal LicenseColorado Principal License
Colorado Principal License
 
IDENTIFICAZIONE STRUTTURALE DEL COMPORTAMENTO SPERIMENTALE DI CENTINE INNOVAT...
IDENTIFICAZIONE STRUTTURALE DEL COMPORTAMENTO SPERIMENTALE DI CENTINE INNOVAT...IDENTIFICAZIONE STRUTTURALE DEL COMPORTAMENTO SPERIMENTALE DI CENTINE INNOVAT...
IDENTIFICAZIONE STRUTTURALE DEL COMPORTAMENTO SPERIMENTALE DI CENTINE INNOVAT...
 

Semelhante a Auctions for Distributed (and Possibly Parallel) Matchings

Matrix Computations in Machine Learning
Matrix Computations in Machine LearningMatrix Computations in Machine Learning
Matrix Computations in Machine Learningbutest
 
Lecture6
Lecture6Lecture6
Lecture6voracle
 
Kernel based models for geo- and environmental sciences- Alexei Pozdnoukhov –...
Kernel based models for geo- and environmental sciences- Alexei Pozdnoukhov –...Kernel based models for geo- and environmental sciences- Alexei Pozdnoukhov –...
Kernel based models for geo- and environmental sciences- Alexei Pozdnoukhov –...Beniamino Murgante
 
Munich07 Foils
Munich07 FoilsMunich07 Foils
Munich07 FoilsAntonini
 
Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3Fabian Pedregosa
 
Higher-order Factorization Machines(第5回ステアラボ人工知能セミナー)
Higher-order Factorization Machines(第5回ステアラボ人工知能セミナー)Higher-order Factorization Machines(第5回ステアラボ人工知能セミナー)
Higher-order Factorization Machines(第5回ステアラボ人工知能セミナー)STAIR Lab, Chiba Institute of Technology
 
On recent improvements in the conic optimizer in MOSEK
On recent improvements in the conic optimizer in MOSEKOn recent improvements in the conic optimizer in MOSEK
On recent improvements in the conic optimizer in MOSEKedadk
 
ISI MSQE Entrance Question Paper (2008)
ISI MSQE Entrance Question Paper (2008)ISI MSQE Entrance Question Paper (2008)
ISI MSQE Entrance Question Paper (2008)CrackDSE
 
Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.Leonid Zhukov
 
Planning Under Uncertainty With Markov Decision Processes
Planning Under Uncertainty With Markov Decision ProcessesPlanning Under Uncertainty With Markov Decision Processes
Planning Under Uncertainty With Markov Decision Processesahmad bassiouny
 
Electrical dictionary.1
Electrical dictionary.1Electrical dictionary.1
Electrical dictionary.1sameeksha9
 
Alternating direction
Alternating directionAlternating direction
Alternating directionDerek Pang
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)Christian Robert
 
Edge linking via Hough transform.ppt
Edge linking via Hough transform.pptEdge linking via Hough transform.ppt
Edge linking via Hough transform.pptSivaSankar306103
 

Semelhante a Auctions for Distributed (and Possibly Parallel) Matchings (20)

Matrix Computations in Machine Learning
Matrix Computations in Machine LearningMatrix Computations in Machine Learning
Matrix Computations in Machine Learning
 
Lecture6
Lecture6Lecture6
Lecture6
 
Kernel based models for geo- and environmental sciences- Alexei Pozdnoukhov –...
Kernel based models for geo- and environmental sciences- Alexei Pozdnoukhov –...Kernel based models for geo- and environmental sciences- Alexei Pozdnoukhov –...
Kernel based models for geo- and environmental sciences- Alexei Pozdnoukhov –...
 
Munich07 Foils
Munich07 FoilsMunich07 Foils
Munich07 Foils
 
Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3
 
Higher-order Factorization Machines(第5回ステアラボ人工知能セミナー)
Higher-order Factorization Machines(第5回ステアラボ人工知能セミナー)Higher-order Factorization Machines(第5回ステアラボ人工知能セミナー)
Higher-order Factorization Machines(第5回ステアラボ人工知能セミナー)
 
On recent improvements in the conic optimizer in MOSEK
On recent improvements in the conic optimizer in MOSEKOn recent improvements in the conic optimizer in MOSEK
On recent improvements in the conic optimizer in MOSEK
 
ISI MSQE Entrance Question Paper (2008)
ISI MSQE Entrance Question Paper (2008)ISI MSQE Entrance Question Paper (2008)
ISI MSQE Entrance Question Paper (2008)
 
Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.
 
03raster 1
03raster 103raster 1
03raster 1
 
Planning Under Uncertainty With Markov Decision Processes
Planning Under Uncertainty With Markov Decision ProcessesPlanning Under Uncertainty With Markov Decision Processes
Planning Under Uncertainty With Markov Decision Processes
 
Electrical dictionary.1
Electrical dictionary.1Electrical dictionary.1
Electrical dictionary.1
 
Unit 3
Unit 3Unit 3
Unit 3
 
Unit 3
Unit 3Unit 3
Unit 3
 
NN-Ch7.PDF
NN-Ch7.PDFNN-Ch7.PDF
NN-Ch7.PDF
 
Optimization tutorial
Optimization tutorialOptimization tutorial
Optimization tutorial
 
Alternating direction
Alternating directionAlternating direction
Alternating direction
 
Rpra1
Rpra1Rpra1
Rpra1
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)
 
Edge linking via Hough transform.ppt
Edge linking via Hough transform.pptEdge linking via Hough transform.ppt
Edge linking via Hough transform.ppt
 

Mais de Jason Riedy

Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFJason Riedy
 
LAGraph 2021-10-13
LAGraph 2021-10-13LAGraph 2021-10-13
LAGraph 2021-10-13Jason Riedy
 
Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFJason Riedy
 
Graph analysis and novel architectures
Graph analysis and novel architecturesGraph analysis and novel architectures
Graph analysis and novel architecturesJason Riedy
 
GraphBLAS and Emus
GraphBLAS and EmusGraphBLAS and Emus
GraphBLAS and EmusJason Riedy
 
Reproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to ArchitectureReproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to ArchitectureJason Riedy
 
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...Jason Riedy
 
ICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to ArchitectureICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to ArchitectureJason Riedy
 
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisJason Riedy
 
Novel Architectures for Applications in Data Science and Beyond
Novel Architectures for Applications in Data Science and BeyondNovel Architectures for Applications in Data Science and Beyond
Novel Architectures for Applications in Data Science and BeyondJason Riedy
 
Characterization of Emu Chick with Microbenchmarks
Characterization of Emu Chick with MicrobenchmarksCharacterization of Emu Chick with Microbenchmarks
Characterization of Emu Chick with MicrobenchmarksJason Riedy
 
CRNCH 2018 Summit: Rogues Gallery Update
CRNCH 2018 Summit: Rogues Gallery UpdateCRNCH 2018 Summit: Rogues Gallery Update
CRNCH 2018 Summit: Rogues Gallery UpdateJason Riedy
 
Augmented Arithmetic Operations Proposed for IEEE-754 2018
Augmented Arithmetic Operations Proposed for IEEE-754 2018Augmented Arithmetic Operations Proposed for IEEE-754 2018
Augmented Arithmetic Operations Proposed for IEEE-754 2018Jason Riedy
 
Graph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New ArchitecturesGraph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New ArchitecturesJason Riedy
 
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsCRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsJason Riedy
 
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsCRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsJason Riedy
 
A New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph AnalysisA New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph AnalysisJason Riedy
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs Jason Riedy
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming GraphsHigh-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming GraphsJason Riedy
 
Network Challenge: Error and Sensitivity Analysis
Network Challenge: Error and Sensitivity AnalysisNetwork Challenge: Error and Sensitivity Analysis
Network Challenge: Error and Sensitivity AnalysisJason Riedy
 

Mais de Jason Riedy (20)

Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoF
 
LAGraph 2021-10-13
LAGraph 2021-10-13LAGraph 2021-10-13
LAGraph 2021-10-13
 
Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoF
 
Graph analysis and novel architectures
Graph analysis and novel architecturesGraph analysis and novel architectures
Graph analysis and novel architectures
 
GraphBLAS and Emus
GraphBLAS and EmusGraphBLAS and Emus
GraphBLAS and Emus
 
Reproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to ArchitectureReproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to Architecture
 
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
 
ICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to ArchitectureICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to Architecture
 
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
 
Novel Architectures for Applications in Data Science and Beyond
Novel Architectures for Applications in Data Science and BeyondNovel Architectures for Applications in Data Science and Beyond
Novel Architectures for Applications in Data Science and Beyond
 
Characterization of Emu Chick with Microbenchmarks
Characterization of Emu Chick with MicrobenchmarksCharacterization of Emu Chick with Microbenchmarks
Characterization of Emu Chick with Microbenchmarks
 
CRNCH 2018 Summit: Rogues Gallery Update
CRNCH 2018 Summit: Rogues Gallery UpdateCRNCH 2018 Summit: Rogues Gallery Update
CRNCH 2018 Summit: Rogues Gallery Update
 
Augmented Arithmetic Operations Proposed for IEEE-754 2018
Augmented Arithmetic Operations Proposed for IEEE-754 2018Augmented Arithmetic Operations Proposed for IEEE-754 2018
Augmented Arithmetic Operations Proposed for IEEE-754 2018
 
Graph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New ArchitecturesGraph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New Architectures
 
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsCRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
 
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsCRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
 
A New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph AnalysisA New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph Analysis
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming GraphsHigh-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs
 
Network Challenge: Error and Sensitivity Analysis
Network Challenge: Error and Sensitivity AnalysisNetwork Challenge: Error and Sensitivity Analysis
Network Challenge: Error and Sensitivity Analysis
 

Último

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 

Último (20)

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

Auctions for Distributed (and Possibly Parallel) Matchings

  • 1. Auctions for Distributed (and Possibly Parallel) Matchings E. Jason Riedy jason@acm.org1 EECS Department University of California, Berkeley 17 December, 2008 1 Thanks to FBF for funding this Cerfacs visit. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 1 / 28
  • 2. Outline 1 Introduction 2 Linear Assignment Problem (LAP): Mathematical Form 3 Auction Algorithms 4 Distributed Auctions 5 Prospects for Parallel Matching Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 2 / 28
  • 3. Motivation: Ever Larger Ax = b Systems Ax = b are growing larger, more difficult Omega3P: n = 7.5 million with τ = 300 million entries Quantum Mechanics: precondition with blocks of dimension 200-350 thousand Large barrier-based optimization problems: Many solves, similar structure, increasing condition number Huge systems are generated, solved, and analyzed automatically. Large, highly unsymmetric systems need scalable parallel solvers. Low-level routines: No expert in the loop! Use static pivoting to decouple symbolic, numeric phases. Perturb the factorization and refine the solution to recover accuracy. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 3 / 28
  • 4. Sparse Matrix to Bipartite Graph to Pivots Col 1Col 2Col 3Col 4 Col 1Col 2Col 3Col 4 Row 1 Row 1 Col 1 Row 2 Row 2 Row 2 Col 2 Row 3 Row 3 Row 3 Col 3 Row 1 Row 4 Row 4 Col 4 Row 4 Bipartite model Each row and column is a vertex. Each explicit entry is an edge. Want to chose “largest” entries for pivots. Maximum weight complete bipartite matching: linear assignment problem Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 4 / 28
  • 5. Mathematical Form “Just” a linear optimization problem: B n × n matrix of benefits in ∪ {−∞}, often c + log2 |A| X n × n permutation matrix: the matching pr , πc dual variables, will be price and profit 1r , 1c unit entry vectors corresponding to rows, cols Lin. assignment prob. Dual problem maximize Tr B T X minimize 1T pr + 1T πc r c X∈ n×n pr ,πc subject to X 1c = 1r , subject to pr 1T + 1r πc ≥ B. c T X T 1r = 1c , and X ≥ 0. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 5 / 28
  • 6. Mathematical Form “Just” a linear optimization problem: B n × n matrix of benefits in ∪ {−∞}, often c + log2 |A| X n × n permutation matrix: the matching pr , πc dual variables, will be price and profit 1r , 1c unit entry vectors corresponding to rows, cols Lin. assignment prob. Dual problem Implicit form: maximize Tr B T X X∈ n×n minimize 1T pr r pr subject to X 1c = 1r , + max(B(i, j) X T 1r = 1c , and i∈R j∈C X ≥ 0. − pr (j)). Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 5 / 28
  • 7. Do We Need a Special Method? The LAP: Standard form: maximize Tr B T X min c T x X∈ n×n x e subject to X 1c = 1r , subject to Ax = b, and X T 1r = 1c , and x ≥ 0. X ≥ 0. A: 2n × τ vertex-edge matrix Network optimization kills simplex methods. (“Smoothed analysis” does not apply.) Interior point needs to round the solution. (And needs to solve Ax = b for a much larger A.) Combinatorial methods should be faster. (But unpredictable!) Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 6 / 28
  • 8. Properties from Optimization Complementary slackness X T (pr 1T + 1r πc − B) = 0. c If (i, j) is in the matching (X (i, j) = 0), then pr (i) + πc (j) = B(i, j). Used to chose matching edges and modify dual variables in combinatorial algorithms. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 7 / 28
  • 9. Properties from Optimization Relaxed problem Introduce a parameter µ, two interpretations: from a barrier function related to X ≥ 0, or from the auction algorithm (later). Then Tr B T X ∗ ≤ 1T pr + 1T πc ≤ Tr B T X ∗ + (n − 1)µ, r c or the computed dual value (and hence computed primal matching) is within (n − 1)µ of the optimal primal. Very useful for finding approximately optimal matchings. Feasibility bound Starting from zero prices: pr (i) ≤ (n − 1)(µ + finite range of B) Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 8 / 28
  • 10. Algorithms for Solving the LAP Goal: A parallel algorithm that justifies buying big machines. Acceptable: A distributed algorithm; matrix is on many nodes. Choices Simplex or continuous / interior-point Plain simplex blows up, network simplex difficult to parallelize. Rounding for interior point often falls back on matching. (Optimal IP algorithm: Goldberg, Plotkin, Shmoys, Tardos. Needs factorization.) Augmenting-path based (Mc64: Duff and Koster) Based on depth- or breadth-first search. Both are P-complete, inherently sequential (Greenlaw, Reif). Auctions (Bertsekas, et al.) Only length-1 alternating paths; global sync for duals. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 9 / 28
  • 11. Auction Algorithms Discussion will be column-major. General structure: 1 Each unmatched column finds the “best” row, places a bid. The dual variable pr holds the prices. The profit πc is implicit. (No significant FP errors!) Each entry’s value: benefit B(i, j)− price p(i). A bid maximally increases the price of the most valuable row. 2 Bids are reconciled. Highest proposed price wins, forms a match. Loser needs to re-bid. Some versions need tie-breaking; here least column. 3 Repeat. Eventually everyone will be matched, or some price will be too high. Seq. implementation in ∼40–50 lines, can compete with Mc64 Some corner cases to handle. . . Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 10 / 28
  • 12. The Bid-Finding Loop For each unmatched column: Price Row Index Row Entry value = entry − price Save largest and second−largest Bid price incr: diff. in values Differences from sparse matrix-vector products Not all columns, rows used every iteration. Hence output price updates are scattered. More local work per entry Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 11 / 28
  • 13. The Bid-Finding Loop For each unmatched column: Price Row Index Row Entry value = entry − price Save largest and second−largest Bid price incr: diff. in values Little points Increase bid price by µ to avoid loops Needs care in floating-point for small µ. Single adjacent row → ∞ price Affects feasibility test, computing dual Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 11 / 28
  • 14. Termination Once a row is matched, it stays matched. A new bid may swap it to another column. The matching (primal) increases monotonically. Prices only increase. The dual does not change when a row is newly matched. But the dual may decrease when a row is taken. The dual decreases monotonically. Subtle part: If the dual doesn’t decrease. . . It’s ok. Can show the new edge begins an augmenting path that increases the matching or an alternating path that decreases the dual. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 12 / 28
  • 15. Successive Approximation (µ-scaling) Complication #1 Simple auctions aren’t really competitive with Mc64. Start with a rough approximation (large µ) and refine. Called -scaling in the literature, but µ-scaling is better. Preserve the prices pr at each step, but clear the matching. Note: Do not clear matches associated with ∞ prices! Equivalent to finding diagonal scaling Dr ADc and matching again on the new B. Problem: Performance strongly depends on initial scaling. Also depends strongly on hidden parameters. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 13 / 28
  • 16. Performance Varies on the Same Data! Group Name realT intT real int FEMLAB poisson3Db 0.014 0.014 0.014 0.013 GHS indef ncvxqp5 0.475 0.605 0.476 0.608 Hamm scircuit 0.058 0.018 0.058 0.031 Schenk IBMSDS bm matrix 2 1.446 2.336 1.089 1.367 Schenk IBMSDS matrix 9 4.955 6.453 3.091 5.401 Schenk ISEI barrier2-4 2.915 5.678 6.363 7.699 Zhao Zhao2 1.227 2.726 0.686 1.450 Vavasis av41092 5.417 5.172 4.038 6.220 Hollinger g7jac200 0.654 2.557 0.848 2.656 Hollinger g7jac200sc 0.356 1.505 0.371 0.410 On a Core 2, 2.133ghz. Note: Mc64 performance is in the same range. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 14 / 28
  • 17. Setting / Lowering Parallel Expectations Performance scalability? Originally proposed (early 1990s) when cpu speed ≈ memory speed ≈ network speed ≈ slow. Now: cpu speed memory latency > network latency. Latency dominates matching algorithms (auction and others). Communication patterns are very irregular. Latency (and software overhead) is not improving. . . Scaled back goal It suffices to not slow down much on distributed data. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 15 / 28
  • 18. Basic Idea: Run Local Auctions, Treat as Bids 1 0 0 1 0 0 1 1 1 1 0 0 0000000 1111111 1 1 0 0 1111111 0000000 0 0 1 1 111 000 000 111 1 0 0 1 1 0 0 1 111 000 111 000 1 0 1 0 1 0 1 0 00000 11111 0000 1111 0 0 1 1 1 1 0 0 0000 1111 0 1 0 1 00 111 000 11 0 1 0 1000 11 00 111 1111 ⇒ 0000 11 111 000 00 00 11 000 111 1 1 0 0 0 1 0 1 0000 1111 1 1 0 0 0 0 1 1 0000 1111 0 1 1 0 000 00 11 111 0 1 11 00 0 1 000 111 1111 0000 1 1 0 0 1 1 0 0 0000 1111 0 1 111 11 00 000 0 1 11 00 111 000 1111 0000 1111 0000 0 0 1 1 B 0 0 1 1 0 0 1 1 0000 1111 1 0 1 0 0 1 1 0 000 11 00 111 1 0 0 1 11 00 0 1 1 0000 111 1111 0000 1 1 0 0 0 0 1 1 1 1 0 0 1 1 0 0 0000 1111 1 0 1 0 0 1 1 0 11 000 111 00 11 00 0 1 0 1 1 0 1 0 111 000 1 1 0 0 1 P1 0 P2 0 1 P3 Slice the matrix into pieces, run local auctions. The winning local bids are the slices’ bids. Merge. . . (“And then a miracle occurs. . .”) Need to keep some data in sync for termination. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 16 / 28
  • 19. Basic Idea: Run Local Auctions, Treat as Bids 1 0 0 1 0 0 1 1 1 1 0 0 0000000 1111111 1 1 0 0 1111111 0000000 0 0 1 1 111 000 000 111 1 0 0 1 1 0 0 1 111 000 111 000 1 0 1 0 1 0 1 0 00000 11111 0000 1111 0 0 1 1 1 1 0 0 0000 1111 0 1 0 1 00 111 000 11 0 1 0 1000 11 00 111 1111 ⇒ 0000 11 111 000 00 00 11 000 111 1 1 0 0 0 1 0 1 0000 1111 1 1 0 0 0 0 1 1 0000 1111 0 1 1 0 000 00 11 111 0 1 11 00 0 1 000 111 1111 0000 1 1 0 0 1 1 0 0 0000 1111 0 1 111 11 00 000 0 1 11 00 111 000 1111 0000 1111 0000 0 0 1 1 B 0 0 1 1 0 0 1 1 0000 1111 1 0 1 0 0 1 1 0 000 11 00 111 1 0 0 1 11 00 0 1 1 0000 111 1111 0000 1 1 0 0 0 0 1 1 1 1 0 0 1 1 0 0 0000 1111 1 0 1 0 0 1 1 0 11 000 111 00 11 00 0 1 0 1 1 0 1 0 111 000 1 1 0 0 1 P1 0 P2 0 1 P3 Can be memory scalable: Compact the local pieces. Have not experimented with simple SMP version. Sequential performance is limited by the memory system. Note: Could be useful for multicore w/local memory. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 16 / 28
  • 20. Ring Around the Auction Fits nodes-of-smp architectures well (Itanium2/Myrinet). Needs O( largest # of rows ) data, may be memory scalable. Initial, sparse matrices cannot be spread across too many processors. . . (Below: At least 500 cols per proc.) q Auction Auction v. root q q q q 8 q 8 q q Re−scaled speedup 6 6 Speedup q q q q q q q q q q q q q 4 q q 4 q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q 2 q q q 2 q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q 5 10 15 5 10 15 Number of Processors Number of Processors On the Citris Itanium2 cluster with Myrinet. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 17 / 28
  • 21. Other Communication Schemes Blind all-to-all Send all the bids to everyone. A little slower than the ring. May need O(# local rows · p) intermediate data; not memory scalable. Needs O(p) communication; not processor scalable Tree reduction Reduction operation: MPI MAXLOC on array of { price, column }. Much slower than everything else. Each node checks result, handles unmatching locally. Needs O(n) intermediate data; not memory scalable. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 18 / 28
  • 22. Diving in: Lessons from Examples Three different matrices, four different perf. profiles. All are for the ring style. Only up to 16 processors; that’s enough to cause problems. Performance dependencies: mostly the # of comm. phases, and a little on the total # of entries scanned along the critical path. The latter decreases with more processors, as it must. But the former is wildly unpredictable. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 19 / 28
  • 23. Diving in: Example That Scales Matrix Nemeth/nemeth22 n = 9 506, nent = 1 358 832 Entries roughly even across nodes. This one is fast and scales. But 8 processors is difficult to explain. # Proc Time # Comm # Ent. Scanned 1 0.2676 0 839 482 003 2 0.0741 55 1 636 226 414 4 0.0307 20 412 509 573 8 0.0176 27 105 229 945 16 0.0153 21 27 770 769 On jacquard.nersc.gov, Opteron and Infiniband. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 20 / 28
  • 24. Diving in: Example That Does Not Scale Matrix GHS indef/ncvxqp5 n = 62 500, nent = 424 966 Entries roughly even across nodes. This one is superlinear in the wrong sense. # Proc Time # Comm # Ent. Scanned 1 0.910 0 989 373 986 2 1.128 65 370 1 934 162 133 4 2.754 63 228 1 458 840 434 8 15.924 216 178 748 628 941 16 177.282 1 353 734 96 183 742 Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 21 / 28
  • 25. Diving in: Example That Confuses Everything Matrix Vavasis/av41092 n = 41 092, nent = 16 839 024 Entries roughly even across nodes. Performance ∼ # Comm What?!?! Not transposed: # Proc Time # Comm # Ent. Scanned 1 9.042 0 1 760 564 335 2 5.328 24 248 2 094 140 729 4 6.218 57 553 1 742 989 035 8 15.480 209 393 1 109 156 585 16 68.908 675 635 321 907 160 Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 22 / 28
  • 26. Diving in: Example That Confuses Everything Matrix Vavasis/av41092 n = 41 092, nent = 16 839 024 Entries roughly even across nodes. Performance ∼ # Comm What?!?! Transposed: # Proc Time # Comm # Ent. Scanned 1 10.044 0 2 010 702 016 2 10.832 887 047 1 776 252 776 4 41.417 1 475 564 1 974 921 328 8 18.929 249 947 844 718 754 16 (forever) Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 22 / 28
  • 27. So What Happened? Matrix av41092 has one large strongly connected component. (The square blocks in a Dulmage-Mendelsohn decomposition.) The SCC spans all the processors. Every edge in an SCC is a part of some complete matching. Horrible performance from: starting along a non-max-weight matching, making it almost complete, then an edge-by-edge search for nearby matchings, requiring a communication phase almost per edge. Conjecture: This type of performance land-mine will affect any 0-1 combinatorial algorithm. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 23 / 28
  • 28. Improvements? Rearranging deck chairs: few-to-few communication Build a directory of which nodes share rows: collapsed BB T . Send only to/from those neighbors. Minor improvement over MPI Allgatherv for a huge effort. Still too fragile to trust perf. results Improving communication may not be worth it. . . The real problem is the number of comm. phases. If diagonal is the matching, everything is overhead. Or if there’s a large SCC. . . Another alternative: Multiple algorithms at once. Run Bora U¸ar’s alg. on one set of nodes, auction on another, c transposed auction on another, . . . Requires some painful software engineering. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 24 / 28
  • 29. Forward-Reverse Auctions Improving the algorithm Forward-reverse auctions alternate directions. Start column-major. Once there has been some progress, but progress stops, switch to row-major. Switch back when stalled after making some progress. Much less sensitive to initial scaling. Does not need µ-scaling, so trivial cases should be faster. But this require the transpose. Few-to-few communication very nearly requires the transpose already. . . Later stages (symbolic factorization) also require some transpose information. . . Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 25 / 28
  • 30. So, Could This Ever Be Parallel? Doubtful? For a given matrix-processor layout, constructing a matrix requiring O(n) communication is pretty easy for combinatorial algorithms. Force almost every local action to be undone at every step. Non-fractional combinatorial algorithms are too restricted. Using less-restricted optimization methods is promising, but far slower sequentially. Existing algs (Goldberg, et al.) are PRAM with n3 processors. General purpose methods: Cutting planes, successive SDPs Someone clever might find a parallel rounding algorithm. Solving the fractional LAP quickly would become a matter of finding a magic preconditioner. . . Maybe not a good thing for a direct method? Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 26 / 28
  • 31. So, Could This Ever Be Parallel? Another possibility? If we could quickly compute the dual by scaling. . . Use the complementary slackness condition to produce a much smaller, unweighted problem. Solve that on one node? May be a practical alternative. Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 27 / 28
  • 32. Questions? (I’m currently working on an Octave-based, parallel forw/rev auction to see if it may help...) Jason Riedy (UCB) Distributed Auctions 17 Dec, 2008 28 / 28