The power and Arnoldi methods in an algebra of circulants

David F. Gleich
Computer Science
Purdue University
CCAM Seminar
April 19th, 2013
In collaboration with
Chen Greif and Jim Varah (UBC)
Supported by a research grant from NSERC
and the Sandia National Labs John von Neumann fellowship

Introduction
Kilmer, Martin, and Perrone (2008) presented a circulant
algebra: a set of operations that generalizes matrix algebra to
three-way data and provides an SVD.
The essence of this approach amounts to viewing
three-dimensional objects as two-dimensional arrays (i.e.,
matrices) of one-dimensional arrays (i.e., vectors).
Braman (2010) developed spectral and other decompositions.
We have extended this algebra with the ingredients required for
iterative methods such as the power method and the Arnoldi method,
and have characterized the behavior of these algorithms.

A look at the power method
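Require $A$, $x^{(0)}$, $\tau$
$x^{(0)} \leftarrow x^{(0)} \|x^{(0)}\|^{-1}$
for $k = 1, \ldots,$ until convergence
  $y^{(k)} \leftarrow A x^{(k-1)}$
  $\alpha^{(k)} \leftarrow \|y^{(k)}\|$
  $x^{(k)} \leftarrow y^{(k)} (\alpha^{(k)})^{-1}$
  if $\|\operatorname{sign}(x_1^{(k)})\, x^{(k)} - \operatorname{sign}(x_1^{(k-1)})\, x^{(k-1)}\| < \tau$
    return $x^{(k)}$
  end if
end for
This requires a scalar inverse, a norm, an absolute value, ...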
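The steps above can be sketched for an ordinary real matrix as follows. This is a minimal illustration, not the authors' code: the function name `power_method`, the `max_iter` cap, the Rayleigh-quotient eigenvalue estimate, and the 2 × 2 test matrix are all assumptions added for the example.

```python
import numpy as np

def power_method(A, x0, tol=1e-10, max_iter=1000):
    """Power method for an ordinary matrix, following the slide's steps."""
    x = x0 / np.linalg.norm(x0)              # x(0) <- x(0) ||x(0)||^-1
    for _ in range(max_iter):
        y = A @ x                            # y(k) <- A x(k-1)
        alpha = np.linalg.norm(y)            # alpha(k) <- ||y(k)||
        x_new = y / alpha                    # x(k) <- y(k) alpha(k)^-1
        # Compare iterates after fixing the sign of the first entry.
        # np.sign(0) is 0, so this sketch assumes x_1 stays nonzero.
        change = np.linalg.norm(np.sign(x_new[0]) * x_new - np.sign(x[0]) * x)
        x = x_new
        if change < tol:
            break
    return x, x @ A @ x                      # eigenvector, Rayleigh quotient

# Hypothetical test matrix (not from the slides).
A = np.array([[2.0, 1.0], [1.0, 3.0]])
x, lam = power_method(A, np.array([1.0, 1.0]))
```

Every step uses only a norm, a scalar inverse, and a sign, which is why those are the operations that need a counterpart in the circulant algebra.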

Three-way arrays
Given an m × n × k table of data, we view this data as an m × n
matrix where each “scalar” is a vector of length k:
$A \in K_k^{m \times n}$.
We denote the space of length-k scalars as $K_k$.
These scalars interact like circulant matrices.

Circulants
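Circulant matrices are a commutative, closed class under the
standard matrix operations.
$$
\begin{bmatrix}
\alpha_1 & \alpha_k & \cdots & \alpha_2 \\
\alpha_2 & \alpha_1 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \alpha_k \\
\alpha_k & \cdots & \alpha_2 & \alpha_1
\end{bmatrix}
$$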
We'll see more of their properties shortly!

The circ operation
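We denote the space of length-k scalars as $K_k$.
These scalars interact like circulant matrices.
$\alpha = \{\alpha_1 \; \ldots \; \alpha_k\} \in K_k$.
$$
\alpha \leftrightarrow \operatorname{circ}(\alpha) \equiv
\begin{bmatrix}
\alpha_1 & \alpha_k & \cdots & \alpha_2 \\
\alpha_2 & \alpha_1 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \alpha_k \\
\alpha_k & \cdots & \alpha_2 & \alpha_1
\end{bmatrix}.
$$
$\alpha + \beta \leftrightarrow \operatorname{circ}(\alpha) + \operatorname{circ}(\beta)$ and $\alpha \circ \beta \leftrightarrow \operatorname{circ}(\alpha)\operatorname{circ}(\beta)$;
$0 = \{0 \; 0 \; \ldots \; 0\}$, $1 = \{1 \; 0 \; \ldots \; 0\}$.
$K_k$ is the ring of length-k circulants.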
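As a small illustration of the ring operations, the sketch below builds circ(α) explicitly and forms α ∘ β as the first column of circ(α) circ(β). The helper names `circ` and `mul` are assumptions made for this example; the scalars are taken from the example used later in the talk.

```python
import numpy as np

def circ(alpha):
    """k x k circulant matrix whose first column is alpha."""
    alpha = np.asarray(alpha)
    k = len(alpha)
    # Entry (i, j) is alpha[(i - j) mod k].
    idx = (np.arange(k)[:, None] - np.arange(k)[None, :]) % k
    return alpha[idx]

def mul(alpha, beta):
    """alpha ∘ beta: the first column of circ(alpha) circ(beta)."""
    # The first column of circ(beta) is beta itself.
    return circ(alpha) @ np.asarray(beta)

alpha = np.array([2, 3, 1])
beta = np.array([8, -2, 0])
one = np.array([1, 0, 0])    # multiplicative identity of K_3

product = mul(alpha, beta)
```

Because circulant matrices commute, `mul(alpha, beta)` and `mul(beta, alpha)` agree, and `circ(product)` equals `circ(alpha) @ circ(beta)`.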

The circ operation on matrices
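$$
A \circ x =
\begin{bmatrix}
\sum_{j=1}^{n} A_{1,j} \circ x_j \\
\vdots \\
\sum_{j=1}^{n} A_{m,j} \circ x_j
\end{bmatrix}
\leftrightarrow
\begin{bmatrix}
\operatorname{circ}(A_{1,1}) & \cdots & \operatorname{circ}(A_{1,n}) \\
\vdots & \ddots & \vdots \\
\operatorname{circ}(A_{m,1}) & \cdots & \operatorname{circ}(A_{m,n})
\end{bmatrix}
\begin{bmatrix}
\operatorname{circ}(x_1) \\
\vdots \\
\operatorname{circ}(x_n)
\end{bmatrix}.
$$
Define
$$
\operatorname{circ}(A) \equiv
\begin{bmatrix}
\operatorname{circ}(A_{1,1}) & \cdots & \operatorname{circ}(A_{1,n}) \\
\vdots & \ddots & \vdots \\
\operatorname{circ}(A_{m,1}) & \cdots & \operatorname{circ}(A_{m,n})
\end{bmatrix},
\qquad
\operatorname{circ}(x) \equiv
\begin{bmatrix}
\operatorname{circ}(x_1) \\
\vdots \\
\operatorname{circ}(x_n)
\end{bmatrix}.
$$
$A \circ x \leftrightarrow \operatorname{circ}(A)\operatorname{circ}(x)$: matrix-vector products.
$x \circ \alpha \leftrightarrow \operatorname{circ}(x)\operatorname{circ}(\alpha)$: vector-scalar products.
This is equivalent to Kilmer, Martin, and Perrone (2008).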

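Require $A$, $x^{(0)}$, $\tau$ ⟶ $A$, $x^{(0)}$, $\tau$
$x^{(0)} \leftarrow x^{(0)} \|x^{(0)}\|^{-1}$
  ⟶ $x^{(0)} \circ \|x^{(0)}\|^{-1}$
  ↔ $\operatorname{circ}(x^{(0)}) \operatorname{circ}(\|x^{(0)}\|)^{-1}$
for $k = 1, \ldots,$ until convergence
  $y^{(k)} \leftarrow A x^{(k-1)}$ ⟶ $y^{(k)} \leftarrow A \circ x^{(k-1)}$ ↔ $\operatorname{circ}(A)\operatorname{circ}(x^{(k-1)})$
  $\alpha^{(k)} \leftarrow \|y^{(k)}\|$ ⟶ . . .
  $x^{(k)} \leftarrow y^{(k)} (\alpha^{(k)})^{-1}$
  if $\|\operatorname{sign}(x_1^{(k)})\, x^{(k)} - \operatorname{sign}(x_1^{(k-1)})\, x^{(k-1)}\| < \tau$
    return $x^{(k)}$
  end if
end for
Require a scalar inverse, norm (?), absolute value (?), ...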

Circulants and Fourier transforms
Let C be a k × k circulant matrix. Then the eigenvector matrix
of C is given by the k × k discrete Fourier transform matrix F,
where
$$ F_{ij} = \frac{1}{\sqrt{k}}\, \omega^{(i-1)(j-1)} \quad \text{and} \quad \omega = e^{2\pi\iota/k}. $$
This matrix is complex symmetric, $F^T = F$, and unitary,
$F^* = F^{-1}$. Thus, $C = F D F^*$, $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_k)$.
Multiplying a vector by $F$ or $F^*$ can be accomplished via
the fast Fourier transform in $O(k \log k)$ time instead of
$O(k^2)$ for the typical matrix-vector product algorithm.
Computing the matrix D can be done in time $O(k \log k)$ as
well:
d = fft(a)
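The diagonalization can be checked numerically. The sketch below assumes that `a` is the first column of C and that `fft` follows NumPy's sign convention (`np.fft.fft`); the particular circulant is chosen only for illustration.

```python
import numpy as np

k = 5
a = np.array([4.0, -1.0, 0.0, 0.0, -1.0])       # first column of C
idx = (np.arange(k)[:, None] - np.arange(k)[None, :]) % k
C = a[idx]                                       # circulant matrix circ(a)

# Unitary DFT matrix: F[i, j] = omega^(i*j) / sqrt(k) with 0-based i, j,
# i.e. omega^((i-1)(j-1)) / sqrt(k) in the 1-based notation of the slide.
omega = np.exp(2j * np.pi / k)
F = omega ** np.outer(np.arange(k), np.arange(k)) / np.sqrt(k)

D = F.conj().T @ C @ F                           # F^* C F, expected diagonal
d = np.fft.fft(a)                                # eigenvalues in O(k log k)
```

With this convention the diagonal of `D` matches `d` entry by entry, so the eigenvalues never need to be computed from the dense matrix.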

cft and icft
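We define the “Circulant Fourier Transform” or cft
$$ \operatorname{cft} : \alpha \in K_k \to \mathbb{C}^{k \times k} $$
and its inverse
$$ \operatorname{icft} : \mathbb{C}^{k \times k} \to K_k $$
as follows:
$$
\operatorname{cft}(\alpha) \equiv
\begin{bmatrix} \hat\alpha_1 & & \\ & \ddots & \\ & & \hat\alpha_k \end{bmatrix}
= F^* \operatorname{circ}(\alpha) F,
\qquad
\operatorname{icft}\!\left(
\begin{bmatrix} \hat\alpha_1 & & \\ & \ddots & \\ & & \hat\alpha_k \end{bmatrix}
\right) \equiv \alpha \leftrightarrow F \operatorname{cft}(\alpha) F^*,
$$
where $\hat\alpha_j$ are the eigenvalues of circ(α) as produced in the
Fourier transform. These transformations satisfy
icft(cft(α)) = α and provide a convenient way of moving
between operations in $K_k$ and the more familiar environment of
diagonal matrices in $\mathbb{C}^{k \times k}$.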

Operations
Let α, β ∈ Kk. Note that
α + β = icft(cft(α) + cft(β)), and
α ◦ β = icft(cft(α) cft(β)).
In the Fourier space – the output of the cft operation – these
operations are both O(k) time because they occur between
diagonal matrices. These simpliﬁcations generalize to
matrix-based operations too. For example,
A ◦ x = icft(cft(A) cft(x)).

Operations (cont.)
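In the Fourier space, this system is a series of independent
matrix-vector products:
$$
\operatorname{cft}(A)\operatorname{cft}(x) =
\begin{bmatrix} \hat A_1 & & \\ & \ddots & \\ & & \hat A_k \end{bmatrix}
\begin{bmatrix} \hat x_1 \\ \vdots \\ \hat x_k \end{bmatrix}
=
\begin{bmatrix} \hat A_1 \hat x_1 \\ \vdots \\ \hat A_k \hat x_k \end{bmatrix}.
$$
We use $\hat A_j$ and $\hat x_j$ to denote the blocks of Fourier coefficients, or
equivalently, circulant eigenvalues. This formulation takes
$O(mnk \log k + nk \log k)$ operations for the cft and icft
plus $O(kmn)$ operations for the matvecs,
instead of $O(mnk^2)$ operations using the circ formulation.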
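The two formulations can be compared directly. In this sketch, a matrix in $K_k^{m \times n}$ is stored as an array of shape (m, n, k) and a vector as shape (n, k); that storage layout, the helper names, and the random test data are assumptions for illustration.

```python
import numpy as np

def circ(alpha):
    """k x k circulant matrix whose first column is alpha."""
    k = len(alpha)
    idx = (np.arange(k)[:, None] - np.arange(k)[None, :]) % k
    return np.asarray(alpha)[idx]

def matvec_circ(A, x):
    """A ∘ x through the block matrix circ(A); O(m n k^2) work."""
    m, n, k = A.shape
    big = np.block([[circ(A[i, j]) for j in range(n)] for i in range(m)])
    # Stacking the length-k scalars of x gives the first column of circ(x).
    return (big @ x.reshape(n * k)).reshape(m, k)

def matvec_fft(A, x):
    """A ∘ x as k independent m x n matvecs on Fourier coefficients."""
    A_hat = np.fft.fft(A, axis=2)           # A_hat[:, :, j] is one Fourier block
    x_hat = np.fft.fft(x, axis=1)           # x_hat[:, j] is the matching vector
    y_hat = np.einsum("mnj,nj->mj", A_hat, x_hat)
    return np.fft.ifft(y_hat, axis=1).real  # real because A and x are real

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2, 4))          # m = 3, n = 2, k = 4
x = rng.standard_normal((2, 4))
y_circ = matvec_circ(A, x)
y_fft = matvec_fft(A, x)
```

Both routes give the same vector in $K_k^m$; the FFT route never forms the mk × nk matrix.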

Operations (cont.)
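More operations are simplified in the Fourier space too. Let
$\operatorname{cft}(\alpha) = \operatorname{diag}[\hat\alpha_1, \ldots, \hat\alpha_k]$. Because the $\hat\alpha_j$ values are the
eigenvalues of circ(α), we have:
$\operatorname{abs}(\alpha) = \operatorname{icft}(\operatorname{diag}[|\hat\alpha_1|, \ldots, |\hat\alpha_k|])$,
$\bar\alpha = \operatorname{icft}(\operatorname{diag}[\bar{\hat\alpha}_1, \ldots, \bar{\hat\alpha}_k]) = \operatorname{icft}(\operatorname{cft}(\alpha)^*)$, and
$\operatorname{angle}(\alpha) = \operatorname{icft}(\operatorname{diag}[\hat\alpha_1/|\hat\alpha_1|, \ldots, \hat\alpha_k/|\hat\alpha_k|])$.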
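These three definitions translate almost line for line into FFT calls. The sketch stores only the diagonal of cft(α) as a vector; the function names `abs_k`, `conj_k`, and `angle_k` are assumptions, and `angle_k` additionally assumes that no eigenvalue of circ(α) is zero.

```python
import numpy as np

def cft(alpha):
    """Diagonal of cft(alpha): the eigenvalues of circ(alpha)."""
    return np.fft.fft(alpha)

def icft(alpha_hat):
    """Inverse transform, from eigenvalues back to a scalar in K_k."""
    return np.fft.ifft(alpha_hat)

def abs_k(alpha):
    return icft(np.abs(cft(alpha)))

def conj_k(alpha):
    return icft(np.conj(cft(alpha)))

def angle_k(alpha):
    a_hat = cft(alpha)
    return icft(a_hat / np.abs(a_hat))      # assumes every eigenvalue is nonzero

alpha = np.array([2.0, 3.0, 1.0])
# abs(alpha) ∘ angle(alpha) should reproduce alpha.
recombined = icft(cft(abs_k(alpha)) * cft(angle_k(alpha)))
```

For a real scalar, the conjugate corresponds to the transpose of circ(α), so `conj_k` returns the first row of circ(α).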

Decompositional interpretation of cft
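Algebraically, the cft operation for a matrix $A \in K_k^{m \times n}$ is
$$ \operatorname{cft}(A) = P_m (I_m \otimes F^*) \operatorname{circ}(A) (I_n \otimes F) P_n^T, $$
where $P_m$ and $P_n$ are permutation matrices. We can
equivalently write this directly in terms of the eigenvalues of
each of the circulant blocks of circ(A):
$$
\operatorname{cft}(A) \equiv
\begin{bmatrix} \hat A_1 & & \\ & \ddots & \\ & & \hat A_k \end{bmatrix},
\qquad
\hat A_j =
\begin{bmatrix}
\lambda_j^{1,1} & \cdots & \lambda_j^{1,n} \\
\vdots & \ddots & \vdots \\
\lambda_j^{m,1} & \cdots & \lambda_j^{m,n}
\end{bmatrix},
$$
where $\lambda_1^{r,s}, \ldots, \lambda_k^{r,s}$ are the diagonal elements of $\operatorname{cft}(A_{r,s})$. The
inverse operation, icft, takes a block diagonal matrix and
returns the matrix in $K_k^{m \times n}$:
$$ \operatorname{icft}(A) \leftrightarrow (I_m \otimes F) P_m^T A P_n (I_n \otimes F^*). $$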

Back to figure

Example
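Let
$$ A = \begin{bmatrix} \{2 \; 3 \; 1\} & \{8 \; {-2} \; 0\} \\ \{-2 \; 0 \; 2\} & \{3 \; 1 \; 1\} \end{bmatrix}. $$
The results of the circ and cft operations are:
$$
\operatorname{circ}(A) =
\left[\begin{array}{rrr|rrr}
2 & 1 & 3 & 8 & 0 & -2 \\
3 & 2 & 1 & -2 & 8 & 0 \\
1 & 3 & 2 & 0 & -2 & 8 \\ \hline
-2 & 2 & 0 & 3 & 1 & 1 \\
0 & -2 & 2 & 1 & 3 & 1 \\
2 & 0 & -2 & 1 & 1 & 3
\end{array}\right],
$$
$$
(I \otimes F^*)\operatorname{circ}(A)(I \otimes F) =
\left[\begin{array}{ccc|ccc}
6 & & & 6 & & \\
& -\sqrt{3}\iota & & & 9 + \sqrt{3}\iota & \\
& & \sqrt{3}\iota & & & 9 - \sqrt{3}\iota \\ \hline
0 & & & 5 & & \\
& -3 + \sqrt{3}\iota & & & 2 & \\
& & -3 - \sqrt{3}\iota & & & 2
\end{array}\right],
$$
$$
\operatorname{cft}(A) =
\left[\begin{array}{cc|cc|cc}
6 & 6 & & & & \\
0 & 5 & & & & \\ \hline
& & -\sqrt{3}\iota & 9 + \sqrt{3}\iota & & \\
& & -3 + \sqrt{3}\iota & 2 & & \\ \hline
& & & & \sqrt{3}\iota & 9 - \sqrt{3}\iota \\
& & & & -3 - \sqrt{3}\iota & 2
\end{array}\right].
$$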
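The blocks in this example can be reproduced with one FFT along the scalar dimension. The sketch assumes NumPy's FFT sign convention and stores A as an array of shape (2, 2, 3); the variable names are illustrative.

```python
import numpy as np

# A[r, s] is the length-3 scalar in row r, column s of the example.
A = np.array([[[2, 3, 1], [8, -2, 0]],
              [[-2, 0, 2], [3, 1, 1]]], dtype=float)

# Eigenvalues of every circulant block circ(A[r, s]) at once.
A_hat = np.fft.fft(A, axis=2)

# The three diagonal blocks of cft(A), one per Fourier index.
blocks = [A_hat[:, :, j] for j in range(3)]
```

Gathering entry (r, s) of every block's eigenvalue list into one 2 × 2 matrix per Fourier index is exactly the permutation that separates cft(A) from $(I \otimes F^*)\operatorname{circ}(A)(I \otimes F)$.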

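These operations can now be straightforwardly defined.
Inverse of a scalar:
$$ \alpha^{-1} \leftrightarrow \operatorname{circ}(\alpha)^{-1}. $$
More generally, function of a scalar:
$$ f(\alpha) \leftrightarrow f(\operatorname{circ}(\alpha)). $$
Angle:
$$ \operatorname{angle}(x)\,|x| = x, \qquad \operatorname{angle}(\alpha) \leftrightarrow \operatorname{circ}(\operatorname{abs}(\alpha))^{-1}\operatorname{circ}(\alpha). $$
The norm of a vector in $K_k^n$ produces a scalar in $K_k$:
$$ \|x\| \leftrightarrow \left(\operatorname{circ}(x)^* \operatorname{circ}(x)\right)^{1/2} = \left(\sum_{i=1}^{n} \operatorname{circ}(x_i)^* \operatorname{circ}(x_i)\right)^{1/2}. $$
Inner product:
$$ \langle x, y \rangle \leftrightarrow \operatorname{circ}(y)^* \operatorname{circ}(x). $$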
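In Fourier space these definitions reduce to elementwise operations on eigenvalues. The sketch below is one way to compute them for real data; the names `inv_k`, `norm_k`, and `inner_k`, the (n, k) storage of a vector, and the sample vector are assumptions for illustration.

```python
import numpy as np

def circ(alpha):
    """k x k circulant matrix whose first column is alpha."""
    k = len(alpha)
    idx = (np.arange(k)[:, None] - np.arange(k)[None, :]) % k
    return np.asarray(alpha)[idx]

def inv_k(alpha):
    """alpha^{-1}; assumes every eigenvalue of circ(alpha) is nonzero."""
    return np.fft.ifft(1.0 / np.fft.fft(alpha)).real

def norm_k(x):
    """Norm of x (n scalars of length k); the result is one length-k scalar."""
    x_hat = np.fft.fft(x, axis=1)
    # Ordinary 2-norm of each Fourier coefficient vector, then transform back.
    return np.fft.ifft(np.linalg.norm(x_hat, axis=0)).real

def inner_k(x, y):
    """<x, y> <-> circ(y)^* circ(x), computed per Fourier component."""
    x_hat, y_hat = np.fft.fft(x, axis=1), np.fft.fft(y, axis=1)
    return np.fft.ifft(np.sum(np.conj(y_hat) * x_hat, axis=0)).real

x = np.array([[2.0, 3.0, 1.0], [3.0, 1.0, 1.0]])   # a vector in K_3^2
nrm = norm_k(x)
gram = sum(circ(xi).T @ circ(xi) for xi in x)      # circ(x)^* circ(x)
```

The `.real` calls rely on the inputs being real, in which case the transformed quantities are conjugate-symmetric and the inverse FFT has no imaginary part beyond rounding error.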

Example
Run the power method on
{2 3 1} {0 0 0}
{0 0 0} {3 1 1}
Result λ = (1/3) {10 4 4}

Example
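$$ A = \begin{bmatrix} \{2 \; 3 \; 1\} & \{0 \; 0 \; 0\} \\ \{0 \; 0 \; 0\} & \{3 \; 1 \; 1\} \end{bmatrix} $$
$$
\hat A_1 = \begin{bmatrix} 6 & 0 \\ 0 & 5 \end{bmatrix}, \quad
\hat A_2 = \begin{bmatrix} -\iota\sqrt{3} & 0 \\ 0 & 2 \end{bmatrix}, \quad
\hat A_3 = \begin{bmatrix} \iota\sqrt{3} & 0 \\ 0 & 2 \end{bmatrix}.
$$
$\lambda_1 = \operatorname{icft}(\operatorname{diag}[6 \;\; 2 \;\; 2]) = (1/3)\{10 \; 4 \; 4\}$
$\lambda_2 = \operatorname{icft}(\operatorname{diag}[5 \;\; {-\iota\sqrt{3}} \;\; \iota\sqrt{3}]) = (1/3)\{5 \; 8 \; 2\}$
$\lambda_3 = \operatorname{icft}(\operatorname{diag}[6 \;\; {-\iota\sqrt{3}} \;\; \iota\sqrt{3}]) = \{2 \; 3 \; 1\}$
$\lambda_4 = \operatorname{icft}(\operatorname{diag}[5 \;\; 2 \;\; 2]) = \{3 \; 1 \; 1\}$.
The corresponding eigenvectors are
$$
x_1 = \begin{bmatrix} \{1/3 \; 1/3 \; 1/3\} \\ \{2/3 \; {-1/3} \; {-1/3}\} \end{bmatrix}; \quad
x_2 = \begin{bmatrix} \{2/3 \; {-1/3} \; {-1/3}\} \\ \{1/3 \; 1/3 \; 1/3\} \end{bmatrix};
$$
$$
x_3 = \begin{bmatrix} \{1 \; 0 \; 0\} \\ \{0 \; 0 \; 0\} \end{bmatrix}; \quad
x_4 = \begin{bmatrix} \{0 \; 0 \; 0\} \\ \{1 \; 0 \; 0\} \end{bmatrix}.
$$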
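The eigenvalues in this example can be checked by applying the inverse FFT to each choice of one eigenvalue per Fourier block. This is a verification sketch under NumPy's FFT convention; the variable names are assumptions.

```python
import numpy as np

A = np.zeros((2, 2, 3))
A[0, 0] = [2, 3, 1]
A[1, 1] = [3, 1, 1]

A_hat = np.fft.fft(A, axis=2)                     # one 2 x 2 block per Fourier index
# Each block is diagonal here, so its eigenvalues are its diagonal entries.
block_eigs = [np.diag(A_hat[:, :, j]) for j in range(3)]

s3 = np.sqrt(3.0)
lam1 = np.fft.ifft([6, 2, 2]).real
lam2 = np.fft.ifft([5, -1j * s3, 1j * s3]).real
lam3 = np.fft.ifft([6, -1j * s3, 1j * s3]).real
lam4 = np.fft.ifft([5, 2, 2]).real

# Fourier coefficients of the eigenvector x1 paired with lam1.
x1 = np.array([[1/3, 1/3, 1/3], [2/3, -1/3, -1/3]])
x1_hat = np.fft.fft(x1, axis=1)
```

Column j of `x1_hat` is an ordinary eigenvector of the j-th block, with eigenvalue 6, 2, and 2 respectively, which is what makes `x1` an eigenvector for `lam1` in the algebra.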

Canonical set
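There are more eigenvalues:
$\lambda_5 = \operatorname{icft}(\operatorname{diag}[6 \;\; {-\iota\sqrt{3}} \;\; 2])$, $\lambda_6 = \operatorname{icft}(\operatorname{diag}[6 \;\; 2 \;\; \iota\sqrt{3}])$,
$\lambda_7 = \operatorname{icft}(\operatorname{diag}[5 \;\; {-\iota\sqrt{3}} \;\; 2])$, $\lambda_8 = \operatorname{icft}(\operatorname{diag}[5 \;\; 2 \;\; \iota\sqrt{3}])$.
Altogether there is a polynomial number of eigenvalues, which
exceeds the dimension of the matrix.
Definition. A canonical set of eigenvalues and eigenvectors is
a set of minimum size, ordered such that
$\operatorname{abs}(\lambda_1) \ge \operatorname{abs}(\lambda_2) \ge \ldots \ge \operatorname{abs}(\lambda_k)$, which contains the
information to reproduce any eigenvalue or eigenvector of A.
In this case, the only canonical set is $\{(\lambda_1, x_1), (\lambda_2, x_2)\}$. (We need
two pairs, and we have $\operatorname{abs}(\lambda_1) \ge \operatorname{abs}(\lambda_2)$.)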

Keeping in real...
Let $A \in K_k^{n \times n}$ be real-valued with diagonalizable $\hat A_j$ matrices. If
k is odd, then the eigendecomposition $X \circ \Lambda \circ X^{-1}$ is real-valued
if and only if $\hat A_1$ has real-valued eigenvalues. If k is even, then
$X \circ \Lambda \circ X^{-1}$ is real-valued if and only if $\hat A_1$ and $\hat A_{k/2+1}$ have
real-valued eigenvalues.

The power method converges
Let $A \in K_k^{n \times n}$ have a canonical set of eigenvalues $\lambda_1, \ldots, \lambda_n$
where $|\lambda_1| > |\lambda_2|$. Then the power method in the circulant
algebra converges to an eigenvector $x_1$ with eigenvalue $\lambda_1$.
Here we use the ordering
$\alpha < \beta \leftrightarrow \operatorname{cft}(\alpha) < \operatorname{cft}(\beta)$ elementwise.
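The convergence statement can be tried on the 2 × 2 example from the earlier slides. The sketch below runs the iteration on Fourier coefficients, where each algebra operation acts independently on every Fourier component. It is a simplified illustration rather than the authors' implementation: a fixed iteration count replaces the convergence test, the eigenvalue is estimated with a Rayleigh quotient, and the starting vector is an arbitrary choice.

```python
import numpy as np

def power_method_circulant(A, x0, iters=300):
    """Power method for A of shape (n, n, k) and x0 of shape (n, k)."""
    A_hat = np.fft.fft(A, axis=2)
    x_hat = np.fft.fft(x0, axis=1)
    x_hat = x_hat / np.linalg.norm(x_hat, axis=0)       # x ∘ ||x||^-1
    for _ in range(iters):
        y_hat = np.einsum("mnj,nj->mj", A_hat, x_hat)   # y <- A ∘ x
        alpha_hat = np.linalg.norm(y_hat, axis=0)       # alpha <- ||y||
        x_hat = y_hat / alpha_hat                       # x <- y ∘ alpha^-1
    # Rayleigh quotient, one value per Fourier component.
    y_hat = np.einsum("mnj,nj->mj", A_hat, x_hat)
    lam_hat = np.sum(np.conj(x_hat) * y_hat, axis=0)
    return np.fft.ifft(lam_hat), np.fft.ifft(x_hat, axis=1)

A = np.zeros((2, 2, 3))
A[0, 0] = [2, 3, 1]
A[1, 1] = [3, 1, 1]
x0 = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])       # assumed starting vector
lam, x = power_method_circulant(A, x0)
```

In each Fourier component the iteration picks out the eigenvalue of largest magnitude (6, 2, and 2 here), which is why the result is the first canonical eigenvalue rather than one of the diagonal scalars of A.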

The Arnoldi process
Let A be an n × n matrix with real-valued entries. Then the
Arnoldi method is a technique to build an orthogonal basis
for the Krylov subspace $K_t(A, v) = \operatorname{span}\{v, Av, \ldots, A^{t-1}v\}$,
where v is an initial vector.
We have the decomposition
$$ A Q_t = Q_{t+1} H_{t+1,t}, $$
where $Q_t$ is an n × t matrix, and $H_{t+1,t}$ is a (t + 1) × t upper
Hessenberg matrix.
Using our repertoire of operations, the Arnoldi method in
the circulant algebra is equivalent to individual Arnoldi
processes on each matrix $\hat A_j$.
This is equivalent to a block Arnoldi process.
Using the cft and icft operations, we produce an Arnoldi
factorization:
$$ A \circ Q_t = Q_{t+1} \circ H_{t+1,t}. $$
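The per-component view can be sketched by running a textbook Arnoldi process on every Fourier block separately. The `arnoldi` helper (modified Gram-Schmidt, no breakdown handling), the random test data, and the sizes are assumptions for illustration; mapping the factors back into the algebra with icft is left out to keep the example short.

```python
import numpy as np

def arnoldi(M, v, t):
    """Return Q (n x (t+1)) and H ((t+1) x t) with M @ Q[:, :t] == Q @ H."""
    n = len(v)
    Q = np.zeros((n, t + 1), dtype=complex)
    H = np.zeros((t + 1, t), dtype=complex)
    Q[:, 0] = v / np.linalg.norm(v)
    for j in range(t):
        w = M @ Q[:, j]
        for i in range(j + 1):                  # modified Gram-Schmidt
            H[i, j] = np.vdot(Q[:, i], w)       # vdot conjugates its first argument
            w = w - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)         # assumes no breakdown (nonzero norm)
        Q[:, j + 1] = w / H[j + 1, j]
    return Q, H

rng = np.random.default_rng(1)
n, k, t = 6, 4, 3
A = rng.standard_normal((n, n, k))              # a matrix in K_k^{n x n}
v = rng.standard_normal((n, k))                 # starting vector in K_k^n

A_hat = np.fft.fft(A, axis=2)
v_hat = np.fft.fft(v, axis=1)
# One independent Arnoldi process per Fourier block.
factors = [arnoldi(A_hat[:, :, j], v_hat[:, j], t) for j in range(k)]
```

Each pair in `factors` satisfies the ordinary Arnoldi relation for its own block, and stacking those relations block-diagonally is the Fourier-space form of $A \circ Q_t = Q_{t+1} \circ H_{t+1,t}$.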

Example
Consider
$$ -\Delta u(x, y) = f(x, y), \qquad u(x, 0) = u(x, 1), \qquad u(0, y) = u(1, y) = 0 $$
for $(x, y) \in [0, 1] \times [0, 1]$ with a uniform mesh and the standard
5-point discrete Laplacian:
$$
-\Delta u(x_i, y_j) \approx -u(x_{i-1}, y_j) - u(x_i, y_{j-1})
+ 4u(x_i, y_j) - u(x_{i+1}, y_j) - u(x_i, y_{j+1}).
$$
We apply the boundary conditions and organize the unknowns of
u in y-major order.
An approximate solution u is then given by solving an
N(N − 1) × N(N − 1) block-tridiagonal, circulant-block system.

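The Linear System
$$
\underbrace{\begin{bmatrix}
C & -I & & \\
-I & C & \ddots & \\
& \ddots & \ddots & -I \\
& & -I & C
\end{bmatrix}}_{\mathbf{A}}
\underbrace{\begin{bmatrix}
u(x_1, \cdot) \\ u(x_2, \cdot) \\ \vdots \\ u(x_{N-1}, \cdot)
\end{bmatrix}}_{\mathbf{u}}
=
\underbrace{\begin{bmatrix}
f(x_1, \cdot) \\ f(x_2, \cdot) \\ \vdots \\ f(x_{N-1}, \cdot)
\end{bmatrix}}_{\mathbf{f}},
\qquad
C = \begin{bmatrix}
4 & -1 & & -1 \\
-1 & 4 & \ddots & \\
& \ddots & \ddots & -1 \\
-1 & & -1 & 4
\end{bmatrix}_{N \times N}.
$$
That is, $\mathbf{A}\mathbf{u} = \mathbf{f}$, or $A \circ u = f$, where $A$ is an (N − 1) × (N − 1) matrix
of $K_N$ elements, $u$ and $f$ have compatible sizes, and
$\mathbf{A} = \operatorname{circ}(A)$, $\mathbf{u} = \operatorname{vec}(u)$, $\mathbf{f} = \operatorname{vec}(f)$.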

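The canonical eigenvalues of A are
$$ \lambda_j = \{4 + 2\cos(j\pi/N), \; -1, \; 0, \; \ldots, \; 0, \; -1\}. $$
To see this result, let $\lambda(\mu) = \{\mu, -1, 0, \ldots, 0, -1\}$. Then
$$
(A - \lambda(\mu) \circ I) =
\begin{bmatrix}
(4 - \mu) \circ 1 & -1 \circ 1 & & \\
-1 \circ 1 & (4 - \mu) \circ 1 & \ddots & \\
& \ddots & \ddots & -1 \circ 1 \\
& & -1 \circ 1 & (4 - \mu) \circ 1
\end{bmatrix}.
$$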
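This formula can be checked numerically in Fourier space, where each block of A is an ordinary tridiagonal matrix. The sketch assumes a small N chosen for illustration, and it compares eigenvalues as sets because the ordering within each Fourier component is not fixed by the formula.

```python
import numpy as np

N = 6
c = np.zeros(N)
c[[0, 1, -1]] = [4.0, -1.0, -1.0]        # diagonal scalar {4, -1, 0, ..., 0, -1}
c_hat = np.fft.fft(c).real               # equals 4 - 2 cos(2 pi j / N)

def fourier_block(j):
    """(N-1) x (N-1) tridiagonal block: c_hat[j] on the diagonal, -1 beside it."""
    off = -np.ones(N - 2)
    return c_hat[j] * np.eye(N - 1) + np.diag(off, 1) + np.diag(off, -1)

def canonical_eig(i):
    """lambda_i = {4 + 2 cos(i pi / N), -1, 0, ..., 0, -1}."""
    lam = c.copy()
    lam[0] = 4.0 + 2.0 * np.cos(i * np.pi / N)
    return lam

# Row i-1 holds the Fourier coefficients of lambda_i for i = 1, ..., N-1.
lam_hat = np.array([np.fft.fft(canonical_eig(i)).real for i in range(1, N)])
```

For every Fourier index j, the sorted eigenvalues of `fourier_block(j)` coincide with the sorted column `lam_hat[:, j]`.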

[Figure: semilog plot of magnitude versus iteration, with series
"Eigenvalue Error" and "Eigenvector Change". Fit-line labels:
((2 + 2cos(2π/n)) / (2 + 2cos(π/n)))^(2i),
((6 + 2cos(2π/n)) / (6 + 2cos(π/n)))^(2i), and
((6 + 2cos(2π/n)) / (6 + 2cos(π/n)))^i.]
Figure: The convergence behavior of the power
method in the circulant algebra. The gray lines show
the error in each eigenvalue in Fourier space.
These curves track the predictions made based on
the eigenvalues as discussed in the text. The red
line shows the magnitude of the change in the
eigenvector. We use this as the stopping criterion. It
also decays as predicted by the ratio of eigenvalues.
The blue fit lines have been visually adjusted to
match the behavior in the convergence tail.
[Figure: semilog plot of magnitude versus Arnoldi iteration, with
series "Absolute error" and "Residual magnitude".]
Figure: The convergence behavior of a GMRES
procedure using the circulant Arnoldi process. The
gray lines show the error in each Fourier component
and the red line shows the magnitude of the
residual. We observe poor convergence in one
Fourier component until the Arnoldi basis captures
all of the eigenvalues after N/2 + 1 = 26 iterations.
These results show how the two computations are
performing individual power methods or Arnoldi
processes in Fourier space.

The End
Paper available online from http://www.cs.ubc.ca/~greif:
“The power and Arnoldi methods in an algebra of circulants”,
David Gleich, Chen Greif and Jim Varah
Thank you!
