The power and Arnoldi methods in an algebra of circulants
1. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 1 / 29
The power and Arnoldi methods in an algebra of circulants
David F. Gleich
Computer Science
Purdue University
CCAM Seminar
April 19th, 2013
In collaboration with
Chen Greif and Jim Varah (UBC)
Supported by a research grant from NSERC
and the Sandia National Labs John von Neumann fellowship
2. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 2 / 29
Introduction
Kilmer, Martin, and Perrone (2008) presented a circulant
algebra: a set of operations that generalize matrix algebra to
three-way data and provided an SVD.
The essence of this approach amounts to viewing
three-dimensional objects as two-dimensional arrays (i.e.,
matrices) of one-dimensional arrays (i.e., vectors).
Braman (2010) developed spectral
and other decompositions.
We have extended this algebra with
the ingredients required for iterative
methods such as the power method
and Arnoldi method, and have char-
acterized the behavior of these algo-
rithms.
3. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 3 / 29
A look at the power method
Require A, x(0), τ
x(0) ← x(0) x(0) −1
for k = 1, . . . , until convergence
y(k) ← Ax(k−1)
α(k) ← y(k)
x(k) ← y(k)α(k)−1
if sign(
(k)
1 )x(k) − sign(
(k−1)
1 )x(k−1) < τ
return x(k)
end if
end for
Require a scalar inverse, norm, absolute value,...
4. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 4 / 29
Three-way arrays
Given an m × n × k table of data,
we view this data as an m × n ma-
trix where each “scalar” is a vector
of length k.
A ∈ Km×n
k
We denote the space of length-k scalars as Kk.
These scalars interact like circulant matrices.
5. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 5 / 29
Circulants
Circulant matrices are a commutative, closed class under the
standard matrix operations.
α1 αk . . . α2
α2 α1
...
...
...
...
... αk
αk . . . α2 α1
We'll see more of their properties shortly!
6. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 6 / 29
The circ operation
We denote the space of length-k scalars as Kk.
These scalars interact like circulant matrices.
α = {α1 ... αk } ∈ Kk.
α ↔ circ(α) ≡
α1 αk . . . α2
α2 α1
...
...
...
...
... αk
αk . . . α2 α1
.
α+β ↔ circ(α)+circ(β) and α◦β ↔ circ(α)circ(β);
0 = {0 0 ... 0} 1 = {1 0 ... 0}
Kk is the ring of length-k circulants.
7. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 7 / 29
The circ operation on matrices
A ◦ x =
n
j=1
A1,j ◦ j
...
n
j=1
Am,j ◦ j
↔
circ(A1,1) ... circ(A1,n)
...
...
...
circ(Am,1) ... circ(Am,n)
circ(1)
...
circ(n)
.
Define
circ(A) ≡
circ(A1,1) ... circ(A1,n)
...
...
...
circ(Am,1) ... circ(Am,n)
circ(x) ≡
circ(1)
...
circ(n)
A ◦ x ↔ circ(A)circ(x) matrix-vector products.
x ◦ α ↔ circ(x)circ(α) vector-scalar products
This is equivalent to Kilmer, Martin, Perrone (2008).
8. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 8 / 29
A look at the power method
Require A, x(0), τ −→ A, x(0), τ
x(0) ← x(0) x(0) −1
−→ x(0) ◦ x(0) −1
↔ circ(x(0))circ( x(0) )−1
for k = 1, . . . , until convergence
y(k) ← Ax(k−1) −→ y(k) ← A ◦ x(k−1) ↔ circ(A)circ(x(k−1))
α(k) ← y(k) −→ . . .
x(k) ← y(k)α(k)−1
if sign(
(k)
1 )x(k) − sign(
(k−1)
1 )x(k−1) < τ
return x(k)
end if
end for
Require a scalar inverse , norm (?), absolute value (?) ,...
9. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 9 / 29
Circulants and Fourier transforms
Let C be a k × k circulant matrix. Then the eigenvector matrix
of C is given by the k × k discrete Fourier transform matrix F,
where
Fj =
1
k
ω(−1)(j−1)
and ω = e2πι/k.
This matrix is complex symmetric, FT
= F, and unitary,
F∗
= F−1
. Thus, C = FDF∗
, D = dig(λ1, . . . , λk).
Multiplying a vector by F or F∗
can be accomplished via
the fast Fourier transform in O(k log k) time instead of
O(k2) for the typical matrix-vector product algorithm.
Computing the matrix D can be done in time O(k log k) as
well.
d = fft(a)
10. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 10 / 29
cft and icft
We define the “Circulant Fourier Transform” or cft
cft : α ∈ Kk → Ck×k
and its inverse
icft : Ck×k
→ Kk
as follows:
cft(α) ≡
ˆα1
...
ˆαk
= F∗
circ(α)F,
icft
ˆα1
...
ˆαk
≡ α ↔ F cft(α)F∗
,
where ˆαj are the eigenvalues of circ(α) as produced in the
Fourier transform. These transformations satisfy
icft(cft(α)) = α and provide a convenient way of moving
between operations in Kk to the more familiar environment of
diagonal matrices in Ck×k.
11. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 11 / 29
Operations
Let α, β ∈ Kk. Note that
α + β = icft(cft(α) + cft(β)), and
α ◦ β = icft(cft(α) cft(β)).
In the Fourier space – the output of the cft operation – these
operations are both O(k) time because they occur between
diagonal matrices. These simplifications generalize to
matrix-based operations too. For example,
A ◦ x = icft(cft(A) cft(x)).
12. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 12 / 29
Operations (cont.)
In the Fourier space, this system is a series of independent
matrix vector products:
cft(A) cft(x) =
ˆA1
...
ˆAk
ˆx1
...
ˆxk
=
ˆA1 ˆx1
...
ˆAk ˆxk
.
We use ˆAj and ˆxj to denote the blocks of Fourier coefficients, or
equivalently, circulant eigenvalues. This formulation takes
O(mnk log k + nk log k)
cft and icft
+ O(kmn)
matvecs
operations instead of O(mnk2) using the circ formulation.
13. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 13 / 29
Operations (cont.)
More operations are simplified in the Fourier space too. Let
cft(α) = dig [ˆα1, ..., ˆαk]. Because the ˆαj values are the
eigenvalues of circ(α), we have:
abs(α) = icft(dig [| ˆα1|, ..., | ˆαk|]),
α = icft(dig [ˆα1, ..., ˆαk]) = icft(cft(α)∗
), and
angle(α) = icft(dig [ˆα1/| ˆα1|, ..., ˆαk/| ˆαk|]).
14. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 14 / 29
Decompositional interpretation of cft
Algebraically, the cft operation for a matrix A ∈ Km×n
k is
cft(A) = Pm(m ⊗ F∗
)circ(A)(n ⊗ F)PT
n
,
where Pm and Pn are permutation matrices. We can
equivalently write this directly in terms of the eigenvalues of
each of the circulant blocks of circ(A):
cft(A) ≡
ˆA1
...
ˆAk
, ˆAj =
λ
1,1
j ... λ
1,n
j
...
...
...
λ
m,1
j ... λ
m,n
j
,
where λ
r,s
1 , . . . , λ
r,s
k are the diagonal elements of cft(Ar,s). The
inverse operation icft, takes a block diagonal matrix and
returns the matrix in Km×n
k :
icft(A) ↔ (m ⊗ F)PT
m
APn(n ⊗ F∗
).
15. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 15 / 29
Back to gure
17. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 17 / 29
A look at the power method
Require A, x(0), τ
x(0) ← x(0) x(0) −1
for k = 1, . . . , until convergence
y(k) ← Ax(k−1)
α(k) ← y(k)
x(k) ← y(k)α(k)−1
if sign(
(k)
1 )x(k) − sign(
(k−1)
1 )x(k−1) < τ
return x(k)
end if
end for
Require a scalar inverse , norm , absolute value ,...
18. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 18 / 29
These operations can now be
straightforwardly de ned
inverse of a scalar:
α−1
↔ circ(α)−1
.
more generally, function of a scalar:
ƒ(α) ↔ ƒ(circ(α))
angle:
angle() || = , angle(α) ↔ circ(abs(α))−1
circ(α).
The norm of a vector in Kn
k
produces a scalar in Kk:
x ↔ (circ(x)∗
circ(x))1/2
=
n
=1
circ()∗
circ()
1/2
.
Inner product:
〈x, y〉 ↔ circ(y)∗
circ(x).
19. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 19 / 29
Example
Run the power method on
{2 3 1} {0 0 0}
{0 0 0} {3 1 1}
Result
20. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 19 / 29
Example
Run the power method on
{2 3 1} {0 0 0}
{0 0 0} {3 1 1}
Result λ = (1/3) {10 4 4}
22. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 21 / 29
Canonical set
There are more eigenvalues
λ5 = icft(dig [6 -ι 3 2]) λ6 = icft(dig [6 2 ι 3])
λ7 = icft(dig [5 -ι 3 2]) λ8 = icft(dig [5 2 ι 3]),
altogether polynomial number, exceeds dimension of matrix.
Definition. A canonical set of eigenvalues and eigenvectors is
a set of minimum size, ordered such that
abs(λ1) ≥ abs(λ2) ≥ . . . ≥ abs(λk), which contains the
information to reproduce any eigenvalue or eigenvector of A
In this case, the only canonical set is {(λ1, x1), (λ2, x2)}. (Need
two, and have abs(λ1) ≥ abs(λ2).)
23. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 22 / 29
Keeping in real...
Let A ∈ Kn×n
k be real-valued with diagonalizable ˆAj matrices. If
k is odd, then the eigendecomposition X ◦ Λ ◦ X−1
is real-valued
if and only if ˆA1 has real-valued eigenvalues. If k is even, then
X ◦ Λ ◦ X−1
is real-valued if and only if ˆA1 and ˆAk/2+1 have
real-valued eigenvalues.
24. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 23 / 29
The power method converges
Let A ∈ Kn×n
k have a canonical set of eigenvalues λ1, . . . , λn
where |λ1| > |λ2|, then the power method in the circulant
algebra convergences to an eigenvector x1 with eigenvalue λ1.
Where we use the ordering ...
α < β ↔ cft(α) < cft(β) elementwise
25. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 24 / 29
The Arnoldi process
Let A be an n × n matrix with real valued entries. Then the
Arnoldi method is a technique to build an orthogonal basis
for the Krylov subspace Kt(A, v) = span{v, Av, . . . , At−1
v},
where v is an initial vector.
We have the decomposition
AQt = Qt+1Ht+1,t
where Qt is an n × t matrix, and Ht+1,t is a (t + 1) × t upper
Hessenberg matrix.
Using our repertoire of operations, the Arnoldi method in
the circulant algebra is equivalent to individual Arnoldi
processes on each matrix ˆAj.
Equivalent to a block Arnoldi process.
Using the cft and icft operations, we produce an Arnoldi
factorization:
A ◦ Qt = Qt+1 ◦ Ht+1,t.
26. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 25 / 29
Example
Consider
−Δ(, y) = ƒ(, y) (, 0) = (, 1), (0, y) = y(1, y) = 0
for (, y) ∈ [0, 1] × [0, 1] with a uniform mesh and the standard
5-point discrete Laplacian:
−Δ(, yj) ≈ −(−1, yj) − (, yj−1)
+ 4(, yj) − (+1, yj) − (, yj+1).
Apply the boundary conditions and organizing the unknowns of
in y-major order.
An approximate solution is given by solving an
N(N − 1) × N(N − 1) block-tridiagonal, circulant-block system.
27. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 26 / 29
The Linear System
C −
− C
...
...
... −
− C
A
(1, ·)
(2, ·)
...
(N−1, ·)
=
f(1, ·)
f(2, ·)
...
f(N−1, ·)
f
,
C =
4 −1 −1
−1 4
...
...
... −1
−1 −1 4
N×N
,
That is, A = f, or A ◦ = f, where A is an N − 1 × N − 1 matrix
of KN elements, and f have compatible sizes, and
A = circ(A), = vec(), f = vec(f).
28. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 27 / 29
The canonical eigenvalues of A are
λj = {4+2 cos(jπ/N),−1,0,...,0,−1} .
To see this result, let λ(μ) = {μ,−1,0,...,0,−1} . Then
(A − λ(μ) ◦ ) =
(4 − μ) ◦ 1 −1 ◦ 1
−1 ◦ 1 (4 − μ) ◦ 1
...
...
... −1 ◦ 1
−1 ◦ 1 (4 − μ) ◦ 1
.
29. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 28 / 29
2000 4000 6000 8000
10
−15
10
−10
10
−5
10
0
2 +2 cos( 2π/n)
2 +2 cos( π/n)
2 i
6 +2 cos( 2π/n)
6 +2 cos( π/n)
2 i
6 +2 cos( 2π/n)
6 +2 cos( π/n)
i
iteration
magnitude
Eigenvalue Error
Eigenvector Change
Figure: The convergence behavior of the power
method in the circulant algebra. The gray lines show
the error in the each eigenvalue in Fourier space.
These curves track the predictions made based on
the eigenvalues as discussed in the text. The red
line shows the magnitude of the change in the
eigenvector. We use this as the stopping criteria. It
also decays as predicted by the ratio of eigenvalues.
The blue fit lines have been visually adjusted to
match the behavior in the convergence tail.
0 10 20 30 40 50
10
−15
10
−10
10
−5
10
0
Arnoldi iteration
Magnitude
Absolute error
Residual magnitude
Figure: The convergence behavior of a GMRES
procedure using the circulant Arnoldi process. The
gray lines show the error in each Fourier component
and the red line shows the magnitude of the
residual. We observe poor convergence in one
Fourier component; until the Arnoldi basis captures
all of the eigenvalues after N/2 + 1 = 26 iterations.
These results show how the two computations are
performing individual power methods or Arnoldi
processes in Fourier space.
30. 40 60 80 100 120
40
60
80
mm
David F. Gleich (Purdue) CCAM Seminar 29 / 29
The End
Paper available online from http://www.cs.ubc.ca/˜greif:
“The power and Arnoldi methods in an algebra of circulants”,
David Gleich, Chen Greif and Jim Varah
Thank you!