I gave this talk at the 24th International Meeting on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2013) on Menorca (Spain).
A paper covering the analyses of this talk (and some more!) has been submitted.
Also, in the talk, I refer to the previous speaker at the conference, my advisor Markus Nebel - corresponding results can be found in an earlier talk of mine:
slideshare.net/sebawild/average-case-analysis-of-java-7s-dual-pivot-quicksort
Check my website for preprints of papers and my other talks:
wwwagak.cs.uni-kl.de/sebastian-wild.html
Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm
1. Quickselect under
Yaroslavskiy’s Dual-Pivoting Algorithm
Sebastian Wild Markus E. Nebel Hosam Mahmoud
wild@cs.uni-kl.de
Computer Science Department
28 May 2013
AofA 2013
Menorca
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 1 / 17
2. Selection
Consider the following problem
The Selection Problem
Given an unsorted array A of n data elements,
find the m-th smallest element.
Applications
estimate distribution of data collection in statistics
world rankings in sports
Algorithms
trivial solution: sort(A); return A[m]; O(n log n)
linear worst case possible: Median-of-Medians (Blum et al.)
but rather slow in practice (compared to sorting!)
Hoare 1961: Quicksort idea usable for selection
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 2 / 17
3. From Quicksort to Quickselect
Sort whole array by Quicksort
Markus Nebel just showed:
Yaroslavskiy’s partitioning can speed up Quicksort
Does success of dual partitioning in sorting carry over to selection?
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
4. From Quicksort to Quickselect
Can prune many branches
Markus Nebel just showed:
Yaroslavskiy’s partitioning can speed up Quicksort
Does success of dual partitioning in sorting carry over to selection?
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
5. From Quicksort to Quickselect
Can prune many branches Dual Pivot Quicksort
Markus Nebel just showed:
Yaroslavskiy’s partitioning can speed up Quicksort
Does success of dual partitioning in sorting carry over to selection?
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
6. From Quicksort to Quickselect
Can prune many branches Prune even more branches
Markus Nebel just showed:
Yaroslavskiy’s partitioning can speed up Quicksort
Does success of dual partitioning in sorting carry over to selection?
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
7. From Quicksort to Quickselect
Can prune many branches Prune even more branches
Markus Nebel just showed:
Yaroslavskiy’s partitioning can speed up Quicksort
Does success of dual partitioning in sorting carry over to selection?
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
8. Notation
Analyse (random) number of data comparisons C
(m)
n
to select from random permutation
P < Q: (random) ranks of the two pivots
subarray sizes determined by P and Q:
left P rightQmiddle
J1 := P − 1 J2 := Q − P − 1 J3 := n − Q
selection continues recursively in
left subarray, if m < P
middle subarray, if P < m < Q
right subarray, if Q < m
or terminates if m = P or m = Q
corresponding events
E1 := {m < P}
E2 := {P < m < Q}
E3 := {Q < m}
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 4 / 17
9. Notation
Analyse (random) number of data comparisons C
(m)
n
to select from random permutation
P < Q: (random) ranks of the two pivots
subarray sizes determined by P and Q:
left P rightQmiddle
J1 := P − 1 J2 := Q − P − 1 J3 := n − Q
selection continues recursively in
left subarray, if m < P
middle subarray, if P < m < Q
right subarray, if Q < m
or terminates if m = P or m = Q
corresponding events
E1 := {m < P}
E2 := {P < m < Q}
E3 := {Q < m}
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 4 / 17
10. Notation
Analyse (random) number of data comparisons C
(m)
n
to select from random permutation
P < Q: (random) ranks of the two pivots
subarray sizes determined by P and Q:
left P rightQmiddle
J1 := P − 1 J2 := Q − P − 1 J3 := n − Q
selection continues recursively in
left subarray, if m < P
middle subarray, if P < m < Q
right subarray, if Q < m
or terminates if m = P or m = Q
corresponding events
E1 := {m < P}
E2 := {P < m < Q}
E3 := {Q < m}
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 4 / 17
11. Quickselect Recurrence
Yaroslavskiy’s partitioning preserves randomness
can set up recurrence for the distribution of C
(m)
n :
C
(m)
n
D
= Tn +
C
(m)
J1
if m lies in left subarray
C
(m−P)
J2
if m lies in middle subarray
C
(m−Q)
J3
if m lies in right subarray
= Tn + 1E1
· C
(m)
J1
+ 1E2
· C
(m−P)
J2
+ 1E3
· C
(m−Q)
J3
Tn is the (random) number of comparisons during partitioning
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 5 / 17
12. Getting Rid of m
Complication: second parameter m in recurrence
C
(m)
n
D
= Tn + 1E1
· C
(m)
J1
+ 1E2
· C
(m−P)
J2
+ 1E3
· C
(m−Q)
J3
consider simpler quantities
1 Random Rank: m = Mn
D
= Uniform {1, . . . , n}
abbreviate ¯Cn := C
(Mn)
n
← focus in talk
¯Cn
D
= Tn+ 1E1
¯CJ1
+ 1E2
¯CJ2
+ 1E3
¯CJ3
2 Extreme Rank: m = 1
abbreviate ˆCn := C
(1)
n
ˆCn
D
= Tn + ˆCP−1
explicit solution for expectations possible (generating functions)
But: Requires tedious computations
More elegant shortcut to asymptotics of E[ ¯Cn] and E[ ˆCn]?
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 6 / 17
13. Getting Rid of m
Complication: second parameter m in recurrence
C
(m)
n
D
= Tn + 1E1
· C
(m)
J1
+ 1E2
· C
(m−P)
J2
+ 1E3
· C
(m−Q)
J3
consider simpler quantities
1 Random Rank: m = Mn
D
= Uniform {1, . . . , n}
abbreviate ¯Cn := C
(Mn)
n
← focus in talk
¯Cn
D
= Tn+ 1E1
¯CJ1
+ 1E2
¯CJ2
+ 1E3
¯CJ3
2 Extreme Rank: m = 1
abbreviate ˆCn := C
(1)
n
ˆCn
D
= Tn + ˆCP−1
explicit solution for expectations possible (generating functions)
But: Requires tedious computations
More elegant shortcut to asymptotics of E[ ¯Cn] and E[ ˆCn]?
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 6 / 17
14. Getting Rid of m
Complication: second parameter m in recurrence
C
(m)
n
D
= Tn + 1E1
· C
(m)
J1
+ 1E2
· C
(m−P)
J2
+ 1E3
· C
(m−Q)
J3
consider simpler quantities
1 Random Rank: m = Mn
D
= Uniform {1, . . . , n}
abbreviate ¯Cn := C
(Mn)
n
← focus in talk
¯Cn
D
= Tn+ 1E1
¯CJ1
+ 1E2
¯CJ2
+ 1E3
¯CJ3
2 Extreme Rank: m = 1
abbreviate ˆCn := C
(1)
n
ˆCn
D
= Tn + ˆCP−1
explicit solution for expectations possible (generating functions)
But: Requires tedious computations
More elegant shortcut to asymptotics of E[ ¯Cn] and E[ ˆCn]?
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 6 / 17
15. Convergence
Idea: Computation easy if we can drop index n
i. e. when recurrence “in equilibrium”
For that, we need stochastic convergence of ¯Cn
0 100 200 300 400 500 600
99.9% massn = 10
n = 20
n = 30
n = 40
¯Cn
But:
E[ ¯Cn] grows with n
high mass region moves with n
¯Cn does not converge
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 7 / 17
16. Convergence
Idea: Computation easy if we can drop index n
i. e. when recurrence “in equilibrium”
Try normalized variables ¯C∗
n :=
¯Cn
n
instead:
0 2 4 6 8 10 12 14 16 18
99.9% massn = 10
n = 20
n = 30
n = 40
¯Cn/n
E[ ¯C∗
n] seems constant
range of high mass looks bounded
¯C∗
n might converge
Works because of property of Quickselect: E[ ¯Cn] = O Var( ¯Cn)
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 7 / 17
17. The Contraction Method
Rewrite recurrence: (Write as sum)
¯Cn
D
= Tn + 1E1
¯CJ1
+ 1E2
¯CJ2
+ 1E3
¯CJ3
Contraction Method:
Convergence of coefficients + contraction condition
implies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗
n → T∗
2 1Ei
Ji
n → A∗
i
3
3
i=1 E |A∗
i | < 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
18. The Contraction Method
Rewrite recurrence: (Divide by n)
¯Cn
D
= Tn +
3
i=1
1Ei
¯CJi
Contraction Method:
Convergence of coefficients + contraction condition
implies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗
n → T∗
2 1Ei
Ji
n → A∗
i
3
3
i=1 E |A∗
i | < 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
19. The Contraction Method
Rewrite recurrence: (Rearrange)
¯Cn
n
D
=
Tn
n
+
3
i=1
1Ei
¯CJi
n
·
Ji
Ji
Contraction Method:
Convergence of coefficients + contraction condition
implies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗
n → T∗
2 1Ei
Ji
n → A∗
i
3
3
i=1 E |A∗
i | < 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
20. The Contraction Method
Rewrite recurrence: (use normalized variables)
¯Cn
n
D
=
Tn
n
+
3
i=1
1Ei
Ji
n
·
¯CJi
Ji
Contraction Method:
Convergence of coefficients + contraction condition
implies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗
n → T∗
2 1Ei
Ji
n → A∗
i
3
3
i=1 E |A∗
i | < 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
21. The Contraction Method
Recurrence for ¯C∗
n
¯C∗
n
D
= T∗
n +
3
i=1
1Ei
Ji
n
· ¯C∗
Ji
with T∗
n :=
Tn
n
Contraction Method:
Convergence of coefficients + contraction condition
implies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗
n → T∗
2 1Ei
Ji
n → A∗
i
3
3
i=1 E |A∗
i | < 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
22. The Contraction Method
Recurrence for ¯C∗
n
¯C∗
n
D
= T∗
n +
3
i=1
1Ei
Ji
n
· ¯C∗
Ji
↓ ↓
T∗ A∗
i
Contraction Method:
Convergence of coefficients + contraction condition
implies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗
n → T∗
2 1Ei
Ji
n → A∗
i
3
3
i=1 E |A∗
i | < 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
23. The Contraction Method
Recurrence for ¯C∗
n
¯C∗
n
D
= T∗
n +
3
i=1
1Ei
Ji
n
· ¯C∗
Ji
↓ ↓
3
i=1
E |A∗
i | < 1
T∗ A∗
i
Contraction Method:
Convergence of coefficients + contraction condition
implies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗
n → T∗
2 1Ei
Ji
n → A∗
i
3
3
i=1 E |A∗
i | < 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
24. The Contraction Method
Recurrence for ¯C∗
n
¯C∗
n
D
= T∗
n +
3
i=1
1Ei
Ji
n
· ¯C∗
Ji
↓ ↓ ↓ ↓
3
i=1
E |A∗
i | < 1
¯C∗ D
= T∗ +
3
i=1
A∗
i · ¯C∗
Contraction Method:
Convergence of coefficients + contraction condition
implies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗
n → T∗
2 1Ei
Ji
n → A∗
i
3
3
i=1 E |A∗
i | < 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
25. The Contraction Method
“Equilibrium” equation for ¯C∗
¯C∗ D
= T∗ +
3
i=1
A∗
i · ¯C∗
Contraction Method:
Convergence of coefficients + contraction condition
implies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗
n → T∗
2 1Ei
Ji
n → A∗
i
3
3
i=1 E |A∗
i | < 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
26. The Contraction Method
“Equilibrium” equation for ¯C∗
(Take expectation on both sides)
¯C∗ D
= T∗ +
3
i=1
A∗
i · ¯C∗
Contraction Method:
Convergence of coefficients + contraction condition
implies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗
n → T∗
2 1Ei
Ji
n → A∗
i
3
3
i=1 E |A∗
i | < 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
27. The Contraction Method
“Equilibrium” equation for ¯C∗
(and solve for E ¯C∗)
E ¯C∗ = E T∗+
3
i=1
E A∗
i · E ¯C∗
Contraction Method:
Convergence of coefficients + contraction condition
implies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗
n → T∗
2 1Ei
Ji
n → A∗
i
3
3
i=1 E |A∗
i | < 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
28. The Contraction Method
“Equilibrium” equation for ¯C∗
(and solve for E ¯C∗)
E ¯C∗ =
E T∗
1 − 3
i=1 E A∗
i
Contraction Method:
Convergence of coefficients + contraction condition
implies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗
n → T∗
2 1Ei
Ji
n → A∗
i
3
3
i=1 E |A∗
i | < 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
29. The Contraction Method
“Equilibrium” equation for ¯C∗
(and solve for E ¯C∗)
E ¯C∗ =
E T∗
1 − 3
i=1 E A∗
i
Contraction Method:
Convergence of coefficients + contraction condition
implies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗
n → T∗
2 1Ei
Ji
n → A∗
i . . . upcoming
3
3
i=1 E |A∗
i | < 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
30. Probabilistic Model
Need “convenient” probabilistic model:
same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)
0 1
first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S
D
= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J
D
= Multinomial(n − 2; S1, S2,S3)
3 Continue recursively
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
31. Probabilistic Model
Need “convenient” probabilistic model:
same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)
0 1
first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S
D
= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J
D
= Multinomial(n − 2; S1, S2,S3)
3 Continue recursively
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
32. Probabilistic Model
Need “convenient” probabilistic model:
same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)
0 1U1 U2
first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S
D
= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J
D
= Multinomial(n − 2; S1, S2,S3)
3 Continue recursively
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
33. Probabilistic Model
Need “convenient” probabilistic model:
same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)
0 1U1 U2
first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S
D
= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J
D
= Multinomial(n − 2; S1, S2,S3)
3 Continue recursively
0 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
34. Probabilistic Model
Need “convenient” probabilistic model:
same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)
0 1U1 U2
first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S
D
= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J
D
= Multinomial(n − 2; S1, S2,S3)
3 Continue recursively
0 1
S1 S2 S3
induces pivot values: U1 = S1, U2 = S1 + S2
0
0.5
1 0 0.5 1
0
0.5
1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
35. Probabilistic Model
Need “convenient” probabilistic model:
same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)
0 1U1 U2
first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S
D
= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J
D
= Multinomial(n − 2; S1, S2,S3)
3 Continue recursively
0 1
S1 S2 S3
U1 U2
induces pivot values: U1 = S1, U2 = S1 + S2
0
0.5
1 0 0.5 1
0
0.5
1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
36. Probabilistic Model
Need “convenient” probabilistic model:
same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)
0 1U1 U2
first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S
D
= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J
D
= Multinomial(n − 2; S1, S2,S3)
3 Continue recursively
0 1
S1 S2 S3
U1 U2
J1 J2 J3
S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large)
n − 2 elements to draw conditional on S
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
37. Probabilistic Model
Need “convenient” probabilistic model:
same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)
0 1U1 U2
first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S
D
= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J
D
= Multinomial(n − 2; S1, S2,S3)
3 Continue recursively
0 1
S1 S2 S3
U1 U2
J1 J2 J3
S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large)
n − 2 elements to draw conditional on S
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
38. Probabilistic Model
Need “convenient” probabilistic model:
same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)
0 1U1 U2
first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S
D
= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J
D
= Multinomial(n − 2; S1, S2,S3)
3 Continue recursively
0 1
S1 S2 S3
U1 U2
J1 J2 J3
S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large)
n − 2 elements to draw conditional on S
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
39. Probabilistic Model
Need “convenient” probabilistic model:
same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)
0 1U1 U2
first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S
D
= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J
D
= Multinomial(n − 2; S1, S2,S3)
3 Continue recursively
0 1
S1 S2 S3
U1 U2
J1 J2 J3
S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large)
n − 2 elements to draw conditional on S
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
40. Probabilistic Model
Need “convenient” probabilistic model:
same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)
0 1U1 U2
first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S
D
= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J
D
= Multinomial(n − 2; S1, S2,S3)
3 Continue recursively
0 1U1U1 U2 U2
Continue of sub-interval with smaller n
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
41. Probabilistic Model
Need “convenient” probabilistic model:
same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)
0 1U1 U2
first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S
D
= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J
D
= Multinomial(n − 2; S1, S2,S3)
3 Continue recursively
0 1U1U1 U2 U2
Continue of sub-interval with smaller n
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
42. Distribution of Tn — Repetition
1 T∗
n → T∗
Invariant: < p
→
> qg
←
p ◦ q k
→
k’s range K G
Tn = number of comparisons in first partitioning step
∼ n one for every element
+ J2 second for all medium elements
+ l @ K second for large elements at Ck
+ s@ G second for small elements at Cg
Markus Nebel just showed:
l @ K
D
= Hypergeometric(n − 2; J3 , J1 + J2 )
conditional on J
s@ G
D
= Hypergeometric(n − 2; J1 , J3 )
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 10 / 17
43. Distribution of Tn — Repetition
1 T∗
n → T∗
Invariant: < p
→
> qg
←
p ◦ q k
→
k’s range K G
Tn = number of comparisons in first partitioning step
∼ n one for every element
+ J2 second for all medium elements
+ l @ K second for large elements at Ck
+ s@ G second for small elements at Cg
Markus Nebel just showed:
l @ K
D
= Hypergeometric(n − 2; J3 , J1 + J2 )
conditional on J
s@ G
D
= Hypergeometric(n − 2; J1 , J3 )
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 10 / 17
44. Distribution of Tn — Repetition
1 T∗
n → T∗
Invariant: < p
→
> qg
←
p ◦ q k
→
k’s range K G
Tn = number of comparisons in first partitioning step
∼ n one for every element
+ J2 second for all medium elements
+ l @ K second for large elements at Ck
+ s@ G second for small elements at Cg
Markus Nebel just showed:
l @ K
D
= Hypergeometric(n − 2; J3 , J1 + J2 )
conditional on J
s@ G
D
= Hypergeometric(n − 2; J1 , J3 )
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 10 / 17
45. Convergence of T∗
n
1 T∗
n → T∗
T∗
n =
Tn
n
= 1 +
J2
n
+
l @ K
n
+
s@ G
n
J2
n
→ S2 by law of large numbers, since E[Ji | S] = n · Si
Multinomial and Hypergeometric concentrated around mean
l @ K
n
→ E
l @ K
n
S by Chernoff bound arguments
= E
E l @ K | J
n
S = E
J3 ( J1 + J2 )
n(n − 2)
S
∼ S3 · (S1 + S2)
s@ G
n
→ S1 · S3 similar
T∗
n → T∗ = 1 + S2 + S3(2S1 + S2)
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 11 / 17
46. Convergence of T∗
n
1 T∗
n → T∗
T∗
n =
Tn
n
= 1 +
J2
n
+
l @ K
n
+
s@ G
n
J2
n
→ S2 by law of large numbers, since E[Ji | S] = n · Si
Multinomial and Hypergeometric concentrated around mean
l @ K
n
→ E
l @ K
n
S by Chernoff bound arguments
= E
E l @ K | J
n
S = E
J3 ( J1 + J2 )
n(n − 2)
S
∼ S3 · (S1 + S2)
s@ G
n
→ S1 · S3 similar
T∗
n → T∗ = 1 + S2 + S3(2S1 + S2)
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 11 / 17
47. Convergence of T∗
n
1 T∗
n → T∗
T∗
n =
Tn
n
= 1 +
J2
n
+
l @ K
n
+
s@ G
n
J2
n
→ S2 by law of large numbers, since E[Ji | S] = n · Si
Multinomial and Hypergeometric concentrated around mean
l @ K
n
→ E
l @ K
n
S by Chernoff bound arguments
= E
E l @ K | J
n
S = E
J3 ( J1 + J2 )
n(n − 2)
S
∼ S3 · (S1 + S2)
s@ G
n
→ S1 · S3 similar
T∗
n → T∗ = 1 + S2 + S3(2S1 + S2)
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 11 / 17
48. Convergence of Coefficients & Contraction Condition
2 1Ei
Ji
n → A∗
i
standard computation:
1E1
J1
n
→ 1F1
· S1 with F1 = {V < S1}
i = 2, 3 similar
3
3
i=1 E |A∗
i | < 1
E 1F1
S1 = 1
6
by symmetry:
3
i=1
E |A∗
i | = 1
2 < 1
We (finally) proved convergence ¯C∗
n → ¯C∗.
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 12 / 17
49. Convergence of Coefficients & Contraction Condition
2 1Ei
Ji
n → A∗
i
standard computation:
1E1
J1
n
→ 1F1
· S1 with F1 = {V < S1}
i = 2, 3 similar
3
3
i=1 E |A∗
i | < 1
E 1F1
S1 = 1
6
by symmetry:
3
i=1
E |A∗
i | = 1
2 < 1
We (finally) proved convergence ¯C∗
n → ¯C∗.
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 12 / 17
50. Convergence of Coefficients & Contraction Condition
2 1Ei
Ji
n → A∗
i
standard computation:
1E1
J1
n
→ 1F1
· S1 with F1 = {V < S1}
i = 2, 3 similar
3
3
i=1 E |A∗
i | < 1
E 1F1
S1 = 1
6
by symmetry:
3
i=1
E |A∗
i | = 1
2 < 1
We (finally) proved convergence ¯C∗
n → ¯C∗.
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 12 / 17
51. Results — Expectations
Computing expectations and plugging in . . .
E T∗ = 19
12
Recall:
E ¯C∗ =
E T∗
1 − 3
i=1 E A∗
i3
i=1 E A∗
i = 1
2
E ¯C∗ = 19
6 = 3.16
L1 convergence E ¯Cn ∼ 3.16n
similarly: E ˆCn ∼ 2.375n
Comparisons Swaps
0
1
2
3
+5.55%
+100%
random rank
Classic Quickselect
Yaroslavskiy Select
Comparisons Swaps
0
1
2
+18.8%
+125%
extreme rank
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17
52. Results — Expectations
Computing expectations and plugging in . . .
E T∗ = 19
12
Recall:
E ¯C∗ =
E T∗
1 − 3
i=1 E A∗
i3
i=1 E A∗
i = 1
2
E ¯C∗ = 19
6 = 3.16
L1 convergence E ¯Cn ∼ 3.16n
similarly: E ˆCn ∼ 2.375n
Comparisons Swaps
0
1
2
3
+5.55%
+100%
random rank
Classic Quickselect
Yaroslavskiy Select
Comparisons Swaps
0
1
2
+18.8%
+125%
extreme rank
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17
53. Results — Expectations
Computing expectations and plugging in . . .
E T∗ = 19
12
Recall:
E ¯C∗ =
E T∗
1 − 3
i=1 E A∗
i3
i=1 E A∗
i = 1
2
E ¯C∗ = 19
6 = 3.16
L1 convergence E ¯Cn ∼ 3.16n
similarly: E ˆCn ∼ 2.375n
Comparisons Swaps
0
1
2
3
+5.55%
+100%
random rank
Classic Quickselect
Yaroslavskiy Select
Comparisons Swaps
0
1
2
+18.8%
+125%
extreme rank
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17
54. Results — Expectations
Computing expectations and plugging in . . .
E T∗ = 19
12
Recall:
E ¯C∗ =
E T∗
1 − 3
i=1 E A∗
i3
i=1 E A∗
i = 1
2
E ¯C∗ = 19
6 = 3.16
L1 convergence E ¯Cn ∼ 3.16n
similarly: E ˆCn ∼ 2.375n
Comparisons Swaps
0
1
2
3
+5.55%
+100%
random rank
Classic Quickselect
Yaroslavskiy Select
Comparisons Swaps
0
1
2
+18.8%
+125%
extreme rank
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17
55. Pivot Sampling
successful optimization of Quicksort/-select: median of three
Here: generalized pivot selection scheme
1 Choose random sample of k elements
2 Sort the sample
3 Select pivots s. t. in sorted sample
t1 smaller ,
t2 between and
t3 larger than pivots
Example with
k = 8 and t = (3, 2, 1):
U1 U2
t1 t2 t3
Our stochastic model nicely generalizes to pivot sampling:
only the distribution of spacings S changes!
S
D
= Dirichlet(t1 + 1, t2 + 1, t3 + 1) density: f(s) =
s
t1
1 s
t2
2 s
t3
3
B
contraction conditions remain valid
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 14 / 17
56. Pivot Sampling
successful optimization of Quicksort/-select: median of three
Here: generalized pivot selection scheme
1 Choose random sample of k elements
2 Sort the sample
3 Select pivots s. t. in sorted sample
t1 smaller ,
t2 between and
t3 larger than pivots
Example with
k = 8 and t = (3, 2, 1):
U1 U2
t1 t2 t3
Our stochastic model nicely generalizes to pivot sampling:
only the distribution of spacings S changes!
S
D
= Dirichlet(t1 + 1, t2 + 1, t3 + 1) density: f(s) =
s
t1
1 s
t2
2 s
t3
3
B
contraction conditions remain valid
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 14 / 17
57. Results — Pivot Sampling
General result: ¯Cn ∼
1 + t2+1
k+1 +
(2(t1+1)+(t2+1))(t3+1)
(k+1)2
1 − 3
i=1
(ti+1)2
(k+1)2
· n
inconvenient for interpretation . . .
Limiting case for large sample:
ti
k
→ αi as k → ∞
= take exact quantiles as pivots
¯Cn ∼
1 + α2 + (2α1 + α2)α3
1 − 3
i=1α2
i
· n
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
α1
α2
optimal choice is not symmetric
classic Quickselect with median of 2t + 1:
2.5n for t = 1 and 2.3n for t = 2
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17
58. Results — Pivot Sampling
General result: ¯Cn ∼
1 + t2+1
k+1 +
(2(t1+1)+(t2+1))(t3+1)
(k+1)2
1 − 3
i=1
(ti+1)2
(k+1)2
· n
inconvenient for interpretation . . .
Limiting case for large sample:
ti
k
→ αi as k → ∞
= take exact quantiles as pivots
¯Cn ∼
1 + α2 + (2α1 + α2)α3
1 − 3
i=1α2
i
· n
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
α1
α2
optimal choice is not symmetric
classic Quickselect with median of 2t + 1:
2.5n for t = 1 and 2.3n for t = 2
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17
59. Results — Pivot Sampling
General result: ¯Cn ∼
1 + t2+1
k+1 +
(2(t1+1)+(t2+1))(t3+1)
(k+1)2
1 − 3
i=1
(ti+1)2
(k+1)2
· n
inconvenient for interpretation . . .
Limiting case for large sample:
ti
k
→ αi as k → ∞
= take exact quantiles as pivots
¯Cn ∼
1 + α2 + (2α1 + α2)α3
1 − 3
i=1α2
i
· n
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
α1
α2
optimal choice is not symmetric
classic Quickselect with median of 2t + 1:
2.5n for t = 1 and 2.3n for t = 2
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17
60. Results — Pivot Sampling
General result: ¯Cn ∼
1 + t2+1
k+1 +
(2(t1+1)+(t2+1))(t3+1)
(k+1)2
1 − 3
i=1
(ti+1)2
(k+1)2
· n
inconvenient for interpretation . . .
Limiting case for large sample:
ti
k
→ αi as k → ∞
= take exact quantiles as pivots
¯Cn ∼
1 + α2 + (2α1 + α2)α3
1 − 3
i=1α2
i
· n
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
2.4654
2.5
α1
α2
optimal choice is not symmetric
classic Quickselect with median of 2t + 1:
2.5n for t = 1 and 2.3n for t = 2
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17
61. Conclusion
Practitioners’ Lesson:
Yaroslavskiy’s dual-pivoting not helpful for selection
Analyzers’ Lesson:
expected costs of Quickselect derivable by contraction method
pivot sampling included naturally
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 16 / 17
62. References & Further Reading
M. Blum, R. W. Floyd, V. Pratt, R. L. Rivest and R. E. Tarjan.
Time bounds for selection.
J. of Comp. and System Sc., 7(4):448–461, 1973.
C. A. R. Hoare.
Algorithm 65: Find.
Communications of the ACM, 4(7):321–322, July 1961.
H.-K. Hwang and T.-H. Tsai.
Quickselect and Dickman function.
Combinatorics, Probability and Computing, 11, 2000.
Hosam M. Mahmoud.
Sorting: A distribution theory.
John Wiley & Sons, Hoboken, NJ, USA, 2000.
H. M. Mahmoud, R. Modarres and R. T. Smythe.
Analysis of quickselect : an algorithm for order statistics.
Informatique théorique et appl., 29(4):255–276, 1995.
C. Martínez, D. Panario and A. Viola.
Adaptive sampling strategies for quickselects.
ACM Transactions on Algorithms, 6(3):1–45, June 2010.
C. Martínez and S. Roura.
Optimal Sampling Strategies in Quicksort and Quickselect.
SIAM Journal on Computing, 31(3):683, 2001.
R. Neininger.
On a multivariate contraction method for random
recursive structures with applications to Quicksort.
Random Structures & Alg., 19(3-4):498–524, 2001.
R. Neininger and L. Rüschendorf.
A General Limit Theorem for Recursive Algorithms and
Combinatorial Structures.
The Annals of Applied Probability, 14(1):378–418, 2004.
A. Panholzer and H. Prodinger.
A generating functions approach for the analysis of
grand averages for multiple QUICKSELECT.
Random Structures & Alg., 13(3-4):189–209, 1998.
U. Rösler and L. Rüschendorf.
The contraction method for recursive algorithms.
Algorithmica, 29(1):3–33, 2001.
S. Wild.
Java 7’s Dual Pivot Quicksort.
Master thesis, University of Kaiserslautern, 2012.
S. Wild, M. E. Nebel and H. Mahmoud.
Analysis of Quickselect under
Yaroslavskiy’s Dual-Pivoting Algorithm.
Algorithmica, in preparation.
S. Wild, M. E. Nebel and R. Neininger.
Average Case and Distributional Analysis of
Java 7’s Dual Pivot Quicksort.
ACM Transactions on Algorithms, submitted.
V. Yaroslavskiy.
Dual-Pivot Quicksort.
2009.
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 17 / 17