In topological inference, the goal is to extract information about a shape, given only a sample of points from it. There are many approaches to this problem, but the one we focus on is persistent homology. We get a view of the data at different scales by imagining the points are balls and consider different radii. The shape information we want comes in the form of a persistence diagram, which describes the components, cycles, bubbles, etc in the space that persist over a range of different scales.
To actually compute a persistence diagram in the geometric setting, previous work required complexes of size n^O(d). We reduce this complexity to O(n) (hiding some large constants depending on d) by using ideas from mesh generation.
This talk will not assume any knowledge of topology. This is joint work with Gary Miller, Benoit Hudson, and Steve Oudot.
5. Distance functions add shape to data.
dP (x) = min |x − p|
p∈P
α
P = dP [0, α]
−1
= ball(p, α)
p∈P
x
6. If you know the “scale” of the data, the
situation is often easier.
7. There may not be a right scale at which to
look at the data for two reasons.
1 No single scale captures the shape.
2 Interesting features appear at several scales.
The Persistence Approach:
Look at every scale and then see what persists.
8. Homology gives a quantitative description
of qualitative aspects of shape, i.e.
components, holes, voids.
Reduces to linear algebra for
simplicial complexes.
0th Homology group is the
nullspace of the Laplacian.
9. Persistent Homology tracks homology
across changes in scale.
The input to the persistence
algorithms is a filtration.
10. A filtration is a growing sequence of topological spaces.
{Fα }α≥0 ∀α ≤ β, Fα ⊆ Fβ
We care about two different types of filtrations.
1 Sublevel sets of well-behaved functions.
2 Filtered simplicial complexes.
11. The distance between two diagrams is the
bottleneck of a matching.
p2
q2
∞
dB = max |pi − qi |∞
i
q1 p3
p1
q3
12. Approximate persistence diagrams have features that are born
and die within a constant factor of the birth and death times of their
corresponding features.
This is just the bottleneck distance
of the log-scale diagrams.
log a − log b < ε
a
log b < ε
a
b <1+ε
13. There are two phases, one is geometric the
other is topological.
Geometry Topology
(linear algebra)
Build a filtration, i.e. Compute the
a filtered complex. persistence diagram
(Run the Persistence Algorithm).
We’ll focus on this side. Running time is
polynomial in the
size of the complex.
14. Idea 1: Use the Delaunay Triangulation
Good: It works, (alpha-complex filtration).
Bad: It can have size nO(d).
15. Idea 2: Connect up everything close.
Čech Filtration: Add a k-simplex for every k+1
points that have a smallest enclosing ball of
radius at most .
Rips Filtration: Add a k-simplex for every k+1
points that have all pairwise distances at
most .
Still nd, but we can quit early.
16. Topology is not Topography
Sublevel sets
(But in our case, there are some similarities)
18. Our Idea: Build a quality mesh.
2
We can build meshes of size 2 O(d )n.
19. The α-mesh filtration
1. Build a mesh M.
2. Assign birth times to
vertices based on distance to P
(special case points very close to P).
3. For each simplex s of Del(M),
let birth(s) be the min birth
time of its vertices.
4. Feed this filtered complex to
the persistence algorithm.
20. Approximation via interleaving.
Definition:
Two filtrations, {Pα } and {Qα } are
ε-interleaved if Pα−ε ⊆ Qα ⊆ Pα+ε
for all α.
Theorem [Chazal et al, ’09]:
If {Pα } and {Qα } are ε-interleaved then
their persistence diagrams are ε-close in
the bottleneck distance.
21. The Voronoi filtration interleaves with the
offset filtration.
Theorem:
α/ρ αρ
For all α > rP ,
VM ⊂ P α ⊂ VM ,
where rP is minimum distance between
any pair of points in P .
Finer refinement yields
a tighter interleaving.
23. If it’s so easy, why didn’t anyone think of
this before?
Theorem [Hudson, Miller, Phillips, ’06]:
A quality mesh of a point set can
be constructed in O(n log ∆) time,
where ∆ is the spread.
Theorem [Miller, Phillips, Sheehy, ’08]:
A quality mesh of a well-paced
point set has size O(n).
24. The Results Approximation ratio Complex Size
1. Build a mesh M. Previous
Over-refine it. Work 1 nO(d)
Use linear-size meshing.
2. Assign birth times to Simple
vertices based on distance to P mesh ρ 2 O(d2 )
n log ∆
filtration
(special case points very close
to P).**
Over-refine −O(d2 )
1+ε n log ∆
3. For each simplex s of Del(M),
ε
the mesh
let birth(s) be the min birth
time of its vertices.
Linear-Size −O(d2 )
1 + ε + 3θ (εθ) n
4. Feed this filtered complex to Meshing
the persistence algorithm.