Optimization Models
EE 127 / EE 227AT
Laurent El Ghaoui
EECS department
UC Berkeley
Spring 2015
LECTURE 2
Vectors and Functions
Mathematicians are like
Frenchmen: whatever you say to
them, they translate into their own
language, and turn it into
something entirely different.
Goethe
Outline
1 Introduction
Basics
Examples
Vector spaces
2 Inner product, angle, orthogonality
3 Projections
4 Functions and maps
Hyperplanes and halfspaces
Gradients
Introduction
A vector is a collection of numbers, arranged in a column or a row, which can be
thought of as the coordinates of a point in n-dimensional space.
Equipping vectors with sum and scalar multiplication allows us to define notions such
as independence, span, subspaces, and dimension. Further, the scalar product
introduces a notion of angle between two vectors, and induces the concept of
length, or norm.
Via the scalar product, we can also view a vector as a linear function. We can
compute the projection of a vector onto a line defined by another vector, onto a
plane, or more generally onto a subspace.
Projections can be viewed as a first elementary optimization problem (finding the
point in a given set at minimum distance from a given point), and they constitute a
basic ingredient in many processing and visualization techniques for
high-dimensional data.
Basics
Notation
We usually write vectors in column format:
x = [x_1, x_2, . . . , x_n]^T  (the components x_1, . . . , x_n stacked in a column).
Element x_i is said to be the i-th component (or the i-th element, or entry) of vector
x, and the number n of components is usually referred to as the dimension of x.
When the components of x are real numbers, i.e. x_i ∈ R, then x is a real vector of
dimension n, which we indicate with the notation x ∈ R^n.
We shall seldom need complex vectors, which are collections of complex numbers
x_i ∈ C, i = 1, . . . , n. We denote the set of such vectors by C^n.
To transform a column-vector x in row format and vice versa, we define an
operation called transpose, denoted with a superscript T:
x^T = [x_1  x_2  · · ·  x_n];   (x^T)^T = x.
Examples
Example 1 (Bag-of-words representations of text)
Consider the following text:
“A (real) vector is just a collection of real numbers, referred to as
the components (or, elements) of the vector; R^n denotes the set
of all vectors with n elements. If x ∈ R^n denotes a vector, we use
subscripts to denote elements, so that x_i is the i-th component of
x. Vectors are arranged in a column, or a row. If x is a column
vector, x^T denotes the corresponding row vector, and vice-versa.”
Row vector c = [5, 3, 3, 4] contains the number of times each word in the list
V = {vector, elements, of, the} appears in the above paragraph.
Dividing each entry in c by the total number of occurrences of words in the list (15,
in this example), we obtain a vector x = [1/3, 1/5, 1/5, 4/15] of relative word
frequencies.
Frequency-based representation of text documents (bag-of-words).
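To make the computation concrete, here is a small Python sketch of the counting-and-normalizing step; the helper name bag_of_words and the simple letters-only tokenization rule are illustrative choices, not part of the slides.

```python
import re
from collections import Counter

def bag_of_words(text, vocabulary):
    """Return raw counts and relative frequencies of `vocabulary` words in `text`."""
    words = re.findall(r"[a-z\-]+", text.lower())
    counts = Counter(w for w in words if w in vocabulary)
    c = [counts[w] for w in vocabulary]
    total = sum(c)
    x = [ci / total for ci in c] if total > 0 else c
    return c, x

# Hypothetical usage with the slide's word list V = {vector, elements, of, the}:
# c, x = bag_of_words(paragraph, ["vector", "elements", "of", "the"])
```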
Examples
Example 2 (Time series)
A time series represents the evolution in (discrete) time of a physical or economical
quantity.
If x(k), k = 1, . . . , T, describes the numerical value of the quantity of interest at
time k, then the whole time series, over the time horizon from 1 to T, can be
represented as a T-dimensional vector x containing all the values of x(k), for k = 1
to k = T, that is
x = [x(1) x(2) · · · x(T)]^T ∈ R^T.
Figure: Adjusted close price of the Dow Jones Industrial Average Index, over a 66-day period
from April 19, 2012 to July 20, 2012.
Example 3 (Images)
We are given a gray-scale image where each pixel has a certain value representing the
level of darkness. We can arrange the image as a vector of pixels.
Figure: Row vector representation of an image.
Vector spaces
The operations of sum, difference and scalar multiplication are defined in an
obvious way for vectors: for any two vectors v^(1), v^(2) having an equal number of
elements, the sum v^(1) + v^(2) is simply the vector whose components are the sums of
the corresponding components of the addends, and the same holds for the difference.
If v is a vector and α is a scalar (i.e., a real or complex number), then αv is
obtained by multiplying each component of v by α. If α = 0, then αv is the zero
vector, or origin.
A vector space, X, is obtained by equipping vectors with the operations of addition
and multiplication by a scalar.
A simple example of a vector space is X = R^n, the space of n-tuples of real
numbers. A less obvious example is the set of single-variable polynomials of a given
degree.
Subspaces and span
A nonempty subset V of a vector space X is called a subspace of X if, for any
scalars α, β,
x, y ∈ V  ⟹  αx + βy ∈ V.
In other words, V is “closed” under addition and scalar multiplication.
A linear combination of a set of vectors S = {x^(1), . . . , x^(m)} in a vector space X is
a vector of the form α_1 x^(1) + · · · + α_m x^(m), where α_1, . . . , α_m are given scalars.
The set of all possible linear combinations of the vectors in S = {x^(1), . . . , x^(m)}
forms a subspace, which is called the subspace generated by S, or the span of S,
denoted with span(S).
Given two subspaces X, Y in R^n, the direct sum of X, Y, which we denote by
X ⊕ Y, is the set of vectors of the form x + y, with x ∈ X, y ∈ Y. It is readily
checked that X ⊕ Y is itself a subspace.
Bases and dimensions
A collection x^(1), . . . , x^(m) of vectors in a vector space X is said to be linearly
independent if no vector in the collection can be expressed as a linear combination
of the others. This is the same as the condition
Σ_{i=1}^m α_i x^(i) = 0  ⟹  α = 0.
Given a subspace S of a vector space X, a basis of S is a set B of vectors of
minimal cardinality, such that span(B) = S. The cardinality of a basis is called the
dimension of S.
If we have a basis {x^(1), . . . , x^(d)} for a subspace S, then we can write any element
in the subspace as a linear combination of elements in the basis. That is, any x ∈ S
can be written as
x = Σ_{i=1}^d α_i x^(i),
for appropriate scalars α_i.
Affine sets
An affine set is a set of the form
A = {x ∈ X : x = v + x^(0), v ∈ V},
where x^(0) is a given point and V is a given subspace of X. Subspaces are just
affine spaces containing the origin.
Geometrically, an affine set is a flat passing through x^(0). The dimension of an
affine set A is defined as the dimension of its generating subspace V.
A line is a one-dimensional affine set. The line through x_0 along direction u is the
set
L = {x ∈ X : x = x_0 + v, v ∈ span(u)},
where in this case span(u) = {λu : λ ∈ R}.
Euclidean length
The Euclidean length of a vector x ∈ R^n is the square root of the sum of squares of
the components of x, that is
Euclidean length of x := √(x_1^2 + x_2^2 + · · · + x_n^2).
This formula is an obvious extension to the multidimensional case of the
Pythagoras theorem in R^2.
The Euclidean length represents the actual distance to be “travelled” for reaching
point x from the origin 0, along the most direct way (the straight line passing
through 0 and x).
Basics
Norms and ℓ_p norms
A norm on a vector space X is a real-valued function with special properties that
maps any element x ∈ X into a real number ‖x‖.
Definition 1
A function from X to R is a norm if
‖x‖ ≥ 0 ∀x ∈ X, and ‖x‖ = 0 if and only if x = 0;
‖x + y‖ ≤ ‖x‖ + ‖y‖, for any x, y ∈ X (triangle inequality);
‖αx‖ = |α| ‖x‖, for any scalar α and any x ∈ X.
ℓ_p norms are defined as
‖x‖_p := ( Σ_{k=1}^n |x_k|^p )^{1/p},  1 ≤ p < ∞.
Basics
Norms and ℓ_p norms
For p = 2 we obtain the standard Euclidean length
‖x‖_2 := √( Σ_{k=1}^n x_k^2 ),
and for p = 1 we obtain the sum-of-absolute-values length
‖x‖_1 := Σ_{k=1}^n |x_k|.
The limit case p = ∞ defines the ℓ_∞ norm (max absolute value norm, or
Chebyshev norm)
‖x‖_∞ := max_{k=1,...,n} |x_k|.
The cardinality of a vector x (its number of nonzero entries) is often called the ℓ_0
(pseudo) norm and denoted with ‖x‖_0.
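As a quick numerical illustration (the example vector is arbitrary), these norms can be evaluated with NumPy:

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0])

l1   = np.sum(np.abs(x))         # same as np.linalg.norm(x, 1)      -> 6.0
l2   = np.sqrt(np.sum(x**2))     # same as np.linalg.norm(x, 2)      -> 3.7417...
linf = np.max(np.abs(x))         # same as np.linalg.norm(x, np.inf) -> 3.0
l0   = np.count_nonzero(x)       # cardinality, the "l0 pseudo-norm" -> 3
```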
Inner product
An inner product on a (real) vector space X is a real-valued function which maps
any pair of elements x, y ∈ X into a scalar denoted as ⟨x, y⟩. The inner product
satisfies the following axioms: for any x, y, z ∈ X and scalar α,
⟨x, x⟩ ≥ 0;
⟨x, x⟩ = 0 if and only if x = 0;
⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩;
⟨αx, y⟩ = α⟨x, y⟩;
⟨x, y⟩ = ⟨y, x⟩.
A vector space equipped with an inner product is called an inner product space.
The standard inner product defined in R^n is the “row-column” product of two
vectors
⟨x, y⟩ = x^T y = Σ_{k=1}^n x_k y_k.
The inner product induces a norm: ‖x‖ = √⟨x, x⟩.
Angle between vectors
The angle between x and y is defined via the relation
cos θ = x^T y / (‖x‖_2 ‖y‖_2).
When x^T y = 0, the angle between x and y is θ = ±90°, i.e., x, y are orthogonal.
When the angle θ is 0°, or ±180°, then x is aligned with y, that is y = αx, for
some scalar α, i.e., x and y are parallel. In this situation |x^T y| achieves its
maximum value |α| ‖x‖_2^2.
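A short NumPy check of this definition on two arbitrary vectors (the clipping only guards against rounding error before arccos):

```python
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])

cos_theta = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
theta_deg = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))   # 45 degrees here
```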
Cauchy–Schwarz and Hölder inequalities
Since |cos θ| ≤ 1, it follows from the angle equation that
|x^T y| ≤ ‖x‖_2 ‖y‖_2,
and this inequality is known as the Cauchy–Schwarz inequality.
A generalization of this inequality involves general ℓ_p norms and is known as the
Hölder inequality.
For any vectors x, y ∈ R^n and for any p, q ≥ 1 such that 1/p + 1/q = 1, it holds
that
|x^T y| ≤ Σ_{k=1}^n |x_k y_k| ≤ ‖x‖_p ‖y‖_q.
Maximization of inner product over norm balls
Our first optimization problem:
max_{‖x‖_p ≤ 1} x^T y.
For p = 2:
x*_2 = y / ‖y‖_2,
hence max_{‖x‖_2 ≤ 1} x^T y = ‖y‖_2.
For p = ∞:
x*_∞ = sgn(y),
and max_{‖x‖_∞ ≤ 1} x^T y = Σ_{i=1}^n |y_i| = ‖y‖_1.
For p = 1:
[x*_1]_i = sgn(y_i) if i = m, and 0 otherwise, i = 1, . . . , n,
where m is an index at which |y_i| is largest.
We thus have max_{‖x‖_1 ≤ 1} x^T y = max_i |y_i| = ‖y‖_∞.
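A small numerical confirmation of the three maximizers above, on an arbitrary test vector y:

```python
import numpy as np

y = np.array([3.0, -1.0, 2.0])

# p = 2: maximizer y / ||y||_2, optimal value ||y||_2
x2 = y / np.linalg.norm(y)
assert np.isclose(x2 @ y, np.linalg.norm(y))

# p = infinity: maximizer sgn(y), optimal value ||y||_1
xinf = np.sign(y)
assert np.isclose(xinf @ y, np.linalg.norm(y, 1))

# p = 1: put sgn(y_m) at an index m of largest |y_i|, optimal value ||y||_inf
x1 = np.zeros_like(y)
m = np.argmax(np.abs(y))
x1[m] = np.sign(y[m])
assert np.isclose(x1 @ y, np.linalg.norm(y, np.inf))
```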
Orthogonal vectors
Generalizing the concept of orthogonality to generic inner product spaces, we say
that two vectors x, y in an inner product space X are orthogonal if ⟨x, y⟩ = 0.
Orthogonality of two vectors x, y ∈ X is symbolized by x ⊥ y.
Nonzero vectors x^(1), . . . , x^(d) are said to be mutually orthogonal if ⟨x^(i), x^(j)⟩ = 0
whenever i ≠ j. In words, each vector is orthogonal to all other vectors in the
collection.
Proposition 1
Mutually orthogonal vectors are linearly independent.
A collection of vectors S = {x^(1), . . . , x^(d)} is said to be orthonormal if, for
i, j = 1, . . . , d,
⟨x^(i), x^(j)⟩ = 0 if i ≠ j, and 1 if i = j.
In words, S is orthonormal if every element has unit norm, and all elements are
orthogonal to each other. A collection of orthonormal vectors S forms an
orthonormal basis for the span of S.
Orthogonal complement
A vector x ∈ X is orthogonal to a subset S of an inner product space X if x ⊥ s for
all s ∈ S.
The set of vectors in X that are orthogonal to S is called the orthogonal
complement of S, and it is denoted with S⊥.
Theorem 1 (Orthogonal decomposition)
If S is a subspace of an inner-product space X, then any vector x ∈ X can be written in
a unique way as the sum of an element in S and one in the orthogonal complement S⊥:
X = S ⊕ S⊥
for any subspace S ⊆ X.
Projections
The idea of projection is central in optimization, and it corresponds to the problem
of finding a point on a given set that is closest (in norm) to a given point.
Given a vector x in an inner product space X (say, e.g., X = R^n) and a closed set
S ⊆ X, the projection of x onto S, denoted as Π_S(x), is defined as the point in S
at minimal distance from x:
Π_S(x) = arg min_{y ∈ S} ‖y − x‖,
where the norm used here is the norm induced by the inner product, that is
‖y − x‖ = √⟨y − x, y − x⟩.
This simply reduces to the Euclidean norm, when using the standard inner product,
in which case the projection is called Euclidean projection.
Projections
Theorem 2 (Projection Theorem)
Let X be an inner product space, let x be a given element in X, and let S be a subspace
of X. Then, there exists a unique vector x* ∈ S which is the solution to the problem
min_{y ∈ S} ‖y − x‖.
Moreover, a necessary and sufficient condition for x* being the optimal solution for this
problem is that
x* ∈ S,  (x − x*) ⊥ S.
Projections
Corollary 1 (Projection on an affine set)
Let X be an inner product space, let x be a given element in X, and let A = x^(0) + S be
the affine set obtained by translating a given subspace S by a given vector x^(0). Then,
there exists a unique vector x* ∈ A which is the solution to the problem
min_{y ∈ A} ‖y − x‖.
Moreover, a necessary and sufficient condition for x* to be the optimal solution for this
problem is that
x* ∈ A,  (x − x*) ⊥ S.
Projections
Euclidean projection of a point onto a line
Let p ∈ R^n be a given point. We want to compute the Euclidean projection p* of p
onto a line L = {x_0 + span(u)}, ‖u‖_2 = 1:
p* = arg min_{x ∈ L} ‖x − p‖_2.
Since any point x ∈ L can be written as x = x_0 + v, for some v ∈ span(u), the
above problem is equivalent to finding a value v* for v, such that
v* = arg min_{v ∈ span(u)} ‖v − (p − x_0)‖_2.
Projections
Euclidean projection of a point onto a line
Writing z := p − x_0, the solution must satisfy the orthogonality condition (z − v*) ⊥ u. Recalling that
v* = λ*u and u^T u = ‖u‖_2^2 = 1, we hence have
u^T z − u^T v* = 0  ⟺  u^T z − λ* = 0  ⟺  λ* = u^T z = u^T(p − x_0).
The optimal point p* is thus given by
p* = x_0 + v* = x_0 + λ*u = x_0 + (u^T(p − x_0)) u.
The squared distance from p to the line is
‖p − p*‖_2^2 = ‖p − x_0‖_2^2 − λ*^2 = ‖p − x_0‖_2^2 − (u^T(p − x_0))^2.
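The resulting formula is straightforward to code; a minimal NumPy sketch (the function name and test data are illustrative choices):

```python
import numpy as np

def project_onto_line(p, x0, u):
    """Euclidean projection of p onto the line {x0 + t*u : t in R}."""
    u = u / np.linalg.norm(u)       # enforce the unit-norm assumption on u
    lam = u @ (p - x0)              # lambda* = u^T (p - x0)
    return x0 + lam * u

p  = np.array([2.0, 3.0])
x0 = np.array([0.0, 1.0])
u  = np.array([1.0, 0.0])
print(project_onto_line(p, x0, u))  # -> [2. 1.]
```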
Projections
Example: Senate voting data
The votes in the 2004–2006 US Senate form an n × m matrix X, with n = #
Senators, m = # bills.
Projections
Scoring senators
Let’s find a way to score senators, by assigning to each a value
f(x) = u^T x + v,
where x ∈ R^m is the vector of votes for a given senator (corresponding to a row of the
voting matrix), and u ∈ R^m, v ∈ R.
Without loss of generality, we enforce that the average value of f across Senators is
zero, so that
f(x) = u^T(x − x̂),
where x̂ ∈ R^m is the vector of average votes (across Senators).
Again WLOG, we can always scale u so that ‖u‖_2 = 1. In that case, the values of f
correspond to the projections of the Senators’ points x_i onto the line passing through x̂
with direction given by u.
Projections
Example: Senate voting data
Random direction vs. average direction (u ∼ the all-ones vector).
Clearly some directions are more informative than others . . .
Projections
Euclidean projection of a point onto a hyperplane
A hyperplane is an affine set defined as
H = {z ∈ R^n : a^T z = b},
where a ≠ 0 is called a normal direction of the hyperplane, since for any two vectors
z_1, z_2 ∈ H it holds that (z_1 − z_2) ⊥ a.
Given p ∈ R^n, we want to determine the Euclidean projection p* of p onto H.
The projection theorem requires p − p* to be orthogonal to H. Since a is a
direction orthogonal to H, the condition (p − p*) ⊥ H is equivalent to saying that
p − p* = αa, for some α ∈ R.
Projections
Euclidean projection of a point onto a hyperplane
To find α, consider that p* ∈ H, thus a^T p* = b, and multiply the previous
equation on the left by a^T, obtaining
a^T p − b = α ‖a‖_2^2,
whereby
α = (a^T p − b) / ‖a‖_2^2,
and
p* = p − ((a^T p − b) / ‖a‖_2^2) a.
The distance from p to H is
‖p − p*‖_2 = |α| · ‖a‖_2 = |a^T p − b| / ‖a‖_2.
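A minimal NumPy sketch of this projection formula (function name and test data are illustrative):

```python
import numpy as np

def project_onto_hyperplane(p, a, b):
    """Euclidean projection of p onto H = {z : a^T z = b}, with a != 0."""
    alpha = (a @ p - b) / (a @ a)            # alpha = (a^T p - b) / ||a||_2^2
    p_star = p - alpha * a
    dist = abs(a @ p - b) / np.linalg.norm(a)
    return p_star, dist

p = np.array([1.0, 2.0, 3.0])
a = np.array([1.0, 1.0, 1.0])
p_star, dist = project_onto_hyperplane(p, a, b=3.0)
print(p_star, dist)                          # p_star satisfies a @ p_star == 3
```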
Projections
Projection on a vector span
Suppose we have a basis for a subspace S ⊆ X, that is
S = span(x^(1), . . . , x^(d)).
Given x ∈ X, the Projection Theorem states that the unique projection x* of x
onto S is characterized by (x − x*) ⊥ S.
Since x* ∈ S, we can write x* as some (unknown) linear combination of the
elements in the basis of S, that is
x* = Σ_{i=1}^d α_i x^(i).
Then (x − x*) ⊥ S  ⟺  ⟨x − x*, x^(k)⟩ = 0, k = 1, . . . , d, which gives the Gram
system of linear equations
Σ_{i=1}^d α_i ⟨x^(k), x^(i)⟩ = ⟨x^(k), x⟩,  k = 1, . . . , d.
Solving this system of linear equations provides the coefficients α, and hence the
desired x*.
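As an illustration, a small NumPy sketch that builds and solves this Gram system for a basis given as a list of vectors (names and test data are ours, not from the slides):

```python
import numpy as np

def project_onto_span(x, basis):
    """Project x onto span(basis) by solving the Gram system for the alpha_i."""
    A = np.column_stack(basis)       # columns are the basis vectors x^(1), ..., x^(d)
    G = A.T @ A                      # Gram matrix: G[k, i] = <x^(k), x^(i)>
    rhs = A.T @ x                    # right-hand side: <x^(k), x>
    alpha = np.linalg.solve(G, rhs)  # coefficients alpha_1, ..., alpha_d
    return A @ alpha                 # x* = sum_i alpha_i x^(i)

x = np.array([1.0, 2.0, 3.0])
basis = [np.array([1.0, 0.0, 0.0]), np.array([1.0, 1.0, 0.0])]
x_star = project_onto_span(x, basis)
print(x_star)   # -> [1. 2. 0.]; (x - x_star) is orthogonal to the span
```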
Projections
Projection onto the span of orthonormal vectors
If we have an orthonormal basis for a subspace S = span(S), then it is immediate
to obtain the projection x* of x onto that subspace.
This is due to the fact that, in this case, the Gram system of equations immediately
gives the coefficients
α_k = ⟨x^(k), x⟩,  k = 1, . . . , d.
Therefore, we have that
x* = Σ_{i=1}^d ⟨x^(i), x⟩ x^(i).
Given a basis S = {x^(1), . . . , x^(d)} for a subspace S = span(S), there are numerical
procedures to construct an orthonormal basis for the same subspace (e.g., the
Gram–Schmidt procedure and QR factorization).
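Continuing the sketch above, an orthonormal basis can be obtained from a QR factorization, after which the projection is a single matrix product; again an illustrative sketch:

```python
import numpy as np

basis = [np.array([1.0, 0.0, 0.0]), np.array([1.0, 1.0, 0.0])]
A = np.column_stack(basis)
Q, _ = np.linalg.qr(A)         # columns of Q: an orthonormal basis of span(basis)

x = np.array([1.0, 2.0, 3.0])
x_star = Q @ (Q.T @ x)         # x* = sum_i <q^(i), x> q^(i)
print(x_star)                  # same projection as obtained from the Gram system
```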
Functions and maps
A function takes a vector argument in R^n, and returns a unique value in R.
We use the notation
f : R^n → R
to refer to a function with “input” space R^n. The “output” space for functions is R.
For example, the function f : R^2 → R with values
f(x) = √((x_1 − y_1)^2 + (x_2 − y_2)^2)
gives the Euclidean distance from the point (x_1, x_2) to a given point (y_1, y_2).
We allow functions to take infinity values. The domain of a function f, denoted
dom f, is defined as the set of points where the function is finite.
Functions and maps
We usually reserve the term map to refer to vector-valued functions.
That is, maps are functions that return a vector of values rather than a single scalar.
We use the notation
f : R^n → R^m
to refer to a map with input space R^n and output space R^m.
The components of the map f are the (scalar-valued) functions f_i, i = 1, . . . , m.
Sets related to functions
Consider a function f : R^n → R.
The graph and the epigraph of a function f : R^n → R are both subsets of R^{n+1}.
The graph of f is the set of input-output pairs that f can attain, that is:
graph f = {(x, f(x)) ∈ R^{n+1} : x ∈ R^n}.
The epigraph, denoted epi f, describes the set of input-output pairs that f can
achieve, as well as “anything above”:
epi f = {(x, t) ∈ R^{n+1} : x ∈ R^n, t ≥ f(x)}.
Sets related to functions
A level set (or contour line) is the set of points that achieve exactly some value for
the function f. For t ∈ R, the t-level set of the function f is defined as
C_f(t) = {x ∈ R^n : f(x) = t}.
The t-sublevel set of f is the set of points that achieve at most a certain value for f:
L_f(t) = {x ∈ R^n : f(x) ≤ t}.
Linear and affine functions
Linear functions are functions that preserve scaling and addition of the input
argument.
A function f : R^n → R is linear if and only if
∀x ∈ R^n and α ∈ R, f(αx) = αf(x);
∀x_1, x_2 ∈ R^n, f(x_1 + x_2) = f(x_1) + f(x_2).
A function f is affine if and only if the function f̃(x) = f(x) − f(0) is linear (affine
= linear + constant).
Consider the functions f_1, f_2, f_3 : R^2 → R defined below:
f_1(x) = 3.2 x_1 + 2 x_2,
f_2(x) = 3.2 x_1 + 2 x_2 + 0.15,
f_3(x) = 0.001 x_2^2 + 2.3 x_1 + 0.3 x_2.
The function f_1 is linear; f_2 is affine; f_3 is neither linear nor affine (f_3 is a quadratic
function).
Linear and affine functions
Linear or affine functions can be conveniently defined by means of the standard
inner product.
A function f : R^n → R is affine if and only if it can be expressed as
f(x) = a^T x + b,
for some unique pair (a, b), with a ∈ R^n and b ∈ R.
The function is linear if and only if b = 0.
Vector a ∈ R^n can thus be viewed as a (linear) map from the “input” space R^n to
the “output” space R.
For any affine function f, we can obtain a and b as follows: b = f(0), and
a_i = f(e_i) − b, i = 1, . . . , n.
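This recipe for recovering (a, b) is easy to check numerically; here is a small sketch using the affine function f_2 from the previous slide (the helper name is ours):

```python
import numpy as np

def recover_affine(f, n):
    """Recover (a, b) of an affine f : R^n -> R via b = f(0), a_i = f(e_i) - b."""
    b = f(np.zeros(n))
    a = np.array([f(np.eye(n)[i]) - b for i in range(n)])
    return a, b

f2 = lambda x: 3.2 * x[0] + 2 * x[1] + 0.15   # the affine f2 from the previous slide
a, b = recover_affine(f2, 2)
print(a, b)                                   # -> [3.2 2. ] 0.15
```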
Hyperplanes and halfspaces
A hyperplane in R^n is a set of the form
H = {x ∈ R^n : a^T x = b},
where a ∈ R^n, a ≠ 0, and b ∈ R are given.
Equivalently, we can think of hyperplanes as the level sets of linear functions.
When b = 0, the hyperplane is simply the set of points that are orthogonal to a
(i.e., H is an (n − 1)-dimensional subspace).
Hyperplanes and halfspaces
A hyperplane H separates the whole space into two regions:
H_− = {x : a^T x ≤ b},  H_{++} = {x : a^T x > b}.
These regions are called halfspaces (H_− is a closed halfspace, H_{++} is an open
halfspace).
The halfspace H_− is the region delimited by the hyperplane H = {x : a^T x = b} and
lying in the direction opposite to vector a. Similarly, the halfspace H_{++} is the region
lying above (i.e., in the direction of a) the hyperplane.
Gradients
The gradient of a function f : R^n → R at a point x where f is differentiable,
denoted with ∇f(x), is a column vector of first derivatives of f with respect to
x_1, . . . , x_n:
∇f(x) = [∂f(x)/∂x_1  · · ·  ∂f(x)/∂x_n]^T.
When n = 1 (there is only one input variable), the gradient is simply the derivative.
An affine function f : R^n → R, represented as f(x) = a^T x + b, has a very simple
gradient: ∇f(x) = a.
Example 4
The distance function ρ(x) = ‖x − p‖_2 = √(Σ_{i=1}^n (x_i − p_i)^2) has gradient
∇ρ(x) = (1 / ‖x − p‖_2) (x − p).
Affine approximation of nonlinear functions
A non-linear function f : R^n → R can be approximated locally via an affine
function, using a first-order Taylor series expansion.
Specifically, if f is differentiable at point x_0, then for all points x in a neighborhood
of x_0, we have that
f(x) = f(x_0) + ∇f(x_0)^T (x − x_0) + ε(x),
where the error term ε(x) goes to zero faster than first order, as x → x_0, that is
lim_{x → x_0} ε(x) / ‖x − x_0‖_2 = 0.
In practice, this means that for x sufficiently close to x_0, we can write the
approximation
f(x) ≈ f(x_0) + ∇f(x_0)^T (x − x_0).
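A quick numerical check of this first-order approximation, reusing the distance function of Example 4 (the point p, base point x_0, and step are arbitrary):

```python
import numpy as np

p  = np.array([1.0, 0.0])
x0 = np.array([3.0, 4.0])
rho      = lambda x: np.linalg.norm(x - p)            # rho(x) = ||x - p||_2
grad_rho = lambda x: (x - p) / np.linalg.norm(x - p)  # its gradient (Example 4)

dx = np.array([1e-4, -2e-4])                          # a small step away from x0
exact  = rho(x0 + dx)
approx = rho(x0) + grad_rho(x0) @ dx                  # first-order approximation
print(exact - approx)                                 # error is second order in ||dx||
```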
Geometric interpretation of the gradient
The gradient of a function can be interpreted in the context of the level sets.
Indeed, geometrically, the gradient of f at a point x_0 is a vector ∇f(x_0)
perpendicular to the contour line of f at level α = f(x_0), pointing from x_0 outward
from the α-sublevel set (that is, it points towards higher values of the function).
Geometric interpretation of the gradient
The gradient ∇f(x_0) also represents the direction along which the function has the
maximum rate of increase (steepest ascent direction).
Let v be a unit direction vector (i.e., ‖v‖_2 = 1), let ε ≥ 0, and consider moving
away at distance ε from x_0 along direction v, that is, consider a point x = x_0 + εv.
We have that
f(x_0 + εv) ≈ f(x_0) + ε ∇f(x_0)^T v,  for ε → 0,
or, equivalently,
lim_{ε → 0} (f(x_0 + εv) − f(x_0)) / ε = ∇f(x_0)^T v.
Whenever ε > 0 and v is such that ∇f(x_0)^T v > 0, then f is increasing along the
direction v, for small ε.
The inner product ∇f(x_0)^T v measures the rate of variation of f at x_0, along
direction v, and it is usually referred to as the directional derivative of f along v.
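A small numerical check that the finite-difference ratio approaches the directional derivative, again for the distance function of Example 4 (the direction v is arbitrary):

```python
import numpy as np

p  = np.array([1.0, 0.0])
x0 = np.array([3.0, 4.0])
rho      = lambda x: np.linalg.norm(x - p)
grad_rho = lambda x: (x - p) / np.linalg.norm(x - p)

v = np.array([1.0, 1.0]) / np.sqrt(2.0)     # a unit direction
eps = 1e-6
finite_diff = (rho(x0 + eps * v) - rho(x0)) / eps
directional = grad_rho(x0) @ v              # directional derivative grad f(x0)^T v
print(finite_diff, directional)             # the two values agree closely
```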
Geometric interpretation of the gradient
The rate of variation is thus zero if v is orthogonal to ∇f(x_0): along such a
direction the function value remains constant (to first order), that is, this direction
is tangent to the contour line of f at x_0.
On the contrary, the rate of variation is maximal when v is parallel to ∇f(x_0), hence
along the direction normal to the contour line at x_0.

Mais conteúdo relacionado

Mais procurados

vector space and subspace
vector space and subspacevector space and subspace
vector space and subspace
2461998
 
MASSS_Presentation_20160209
MASSS_Presentation_20160209MASSS_Presentation_20160209
MASSS_Presentation_20160209
Yimin Wu
 

Mais procurados (20)

Partitions
PartitionsPartitions
Partitions
 
vector space and subspace
vector space and subspacevector space and subspace
vector space and subspace
 
05 linear transformations
05 linear transformations05 linear transformations
05 linear transformations
 
01.01 vector spaces
01.01 vector spaces01.01 vector spaces
01.01 vector spaces
 
Linear vector space
Linear vector spaceLinear vector space
Linear vector space
 
Inner product spaces
Inner product spacesInner product spaces
Inner product spaces
 
Metric space
Metric spaceMetric space
Metric space
 
Ch01
Ch01Ch01
Ch01
 
Solution set 3
Solution set 3Solution set 3
Solution set 3
 
Vector space
Vector spaceVector space
Vector space
 
Series solutions at ordinary point and regular singular point
Series solutions at ordinary point and regular singular pointSeries solutions at ordinary point and regular singular point
Series solutions at ordinary point and regular singular point
 
Eigenvalue problems .ppt
Eigenvalue problems .pptEigenvalue problems .ppt
Eigenvalue problems .ppt
 
Numerical Solution of Ordinary Differential Equations
Numerical Solution of Ordinary Differential EquationsNumerical Solution of Ordinary Differential Equations
Numerical Solution of Ordinary Differential Equations
 
Independence, basis and dimension
Independence, basis and dimensionIndependence, basis and dimension
Independence, basis and dimension
 
HERMITE SERIES
HERMITE SERIESHERMITE SERIES
HERMITE SERIES
 
Probability Formula sheet
Probability Formula sheetProbability Formula sheet
Probability Formula sheet
 
Math for Intelligent Systems - 01 Linear Algebra 01 Vector Spaces
Math for Intelligent Systems - 01 Linear Algebra 01  Vector SpacesMath for Intelligent Systems - 01 Linear Algebra 01  Vector Spaces
Math for Intelligent Systems - 01 Linear Algebra 01 Vector Spaces
 
Unidad i limites y continuidad
Unidad i    limites y continuidadUnidad i    limites y continuidad
Unidad i limites y continuidad
 
Notes on eigenvalues
Notes on eigenvaluesNotes on eigenvalues
Notes on eigenvalues
 
MASSS_Presentation_20160209
MASSS_Presentation_20160209MASSS_Presentation_20160209
MASSS_Presentation_20160209
 

Semelhante a 2 vectors notes

Chapter 12 Section 12.1 Three-Dimensional Coordinate Sys
Chapter 12 Section 12.1  Three-Dimensional Coordinate SysChapter 12 Section 12.1  Three-Dimensional Coordinate Sys
Chapter 12 Section 12.1 Three-Dimensional Coordinate Sys
EstelaJeffery653
 
The Euclidean Spaces (elementary topology and sequences)
The Euclidean Spaces (elementary topology and sequences)The Euclidean Spaces (elementary topology and sequences)
The Euclidean Spaces (elementary topology and sequences)
JelaiAujero
 
Polyhedral and algebraic method in computational geometry
Polyhedral and algebraic method in computational geometryPolyhedral and algebraic method in computational geometry
Polyhedral and algebraic method in computational geometry
Springer
 
Lecture13p.pdf.pdfThedeepness of freedom are threevalues.docx
Lecture13p.pdf.pdfThedeepness of freedom are threevalues.docxLecture13p.pdf.pdfThedeepness of freedom are threevalues.docx
Lecture13p.pdf.pdfThedeepness of freedom are threevalues.docx
croysierkathey
 
Lecture13p.pdf.pdfThedeepness of freedom are threevalues.docx
Lecture13p.pdf.pdfThedeepness of freedom are threevalues.docxLecture13p.pdf.pdfThedeepness of freedom are threevalues.docx
Lecture13p.pdf.pdfThedeepness of freedom are threevalues.docx
jeremylockett77
 

Semelhante a 2 vectors notes (20)

Beginning direct3d gameprogrammingmath03_vectors_20160328_jintaeks
Beginning direct3d gameprogrammingmath03_vectors_20160328_jintaeksBeginning direct3d gameprogrammingmath03_vectors_20160328_jintaeks
Beginning direct3d gameprogrammingmath03_vectors_20160328_jintaeks
 
Lecture_note2.pdf
Lecture_note2.pdfLecture_note2.pdf
Lecture_note2.pdf
 
Lesson 6 differentials parametric-curvature
Lesson 6 differentials parametric-curvatureLesson 6 differentials parametric-curvature
Lesson 6 differentials parametric-curvature
 
Chapter 12 Section 12.1 Three-Dimensional Coordinate Sys
Chapter 12 Section 12.1  Three-Dimensional Coordinate SysChapter 12 Section 12.1  Three-Dimensional Coordinate Sys
Chapter 12 Section 12.1 Three-Dimensional Coordinate Sys
 
Fixed points theorem on a pair of random generalized non linear contractions
Fixed points theorem on a pair of random generalized non linear contractionsFixed points theorem on a pair of random generalized non linear contractions
Fixed points theorem on a pair of random generalized non linear contractions
 
Mathematics for Deep Learning (1)
Mathematics for Deep Learning (1)Mathematics for Deep Learning (1)
Mathematics for Deep Learning (1)
 
Solution to schrodinger equation with dirac comb potential
Solution to schrodinger equation with dirac comb potential Solution to schrodinger equation with dirac comb potential
Solution to schrodinger equation with dirac comb potential
 
The Euclidean Spaces (elementary topology and sequences)
The Euclidean Spaces (elementary topology and sequences)The Euclidean Spaces (elementary topology and sequences)
The Euclidean Spaces (elementary topology and sequences)
 
Polyhedral and algebraic method in computational geometry
Polyhedral and algebraic method in computational geometryPolyhedral and algebraic method in computational geometry
Polyhedral and algebraic method in computational geometry
 
Multivriada ppt ms
Multivriada   ppt msMultivriada   ppt ms
Multivriada ppt ms
 
On Application of Power Series Solution of Bessel Problems to the Problems of...
On Application of Power Series Solution of Bessel Problems to the Problems of...On Application of Power Series Solution of Bessel Problems to the Problems of...
On Application of Power Series Solution of Bessel Problems to the Problems of...
 
Optimization Methods for Machine Learning and Engineering: Optimization in Ve...
Optimization Methods for Machine Learning and Engineering: Optimization in Ve...Optimization Methods for Machine Learning and Engineering: Optimization in Ve...
Optimization Methods for Machine Learning and Engineering: Optimization in Ve...
 
Ph 101-9 QUANTUM MACHANICS
Ph 101-9 QUANTUM MACHANICSPh 101-9 QUANTUM MACHANICS
Ph 101-9 QUANTUM MACHANICS
 
Vectorspace in 2,3and n space
Vectorspace in 2,3and n spaceVectorspace in 2,3and n space
Vectorspace in 2,3and n space
 
03_AJMS_209_19_RA.pdf
03_AJMS_209_19_RA.pdf03_AJMS_209_19_RA.pdf
03_AJMS_209_19_RA.pdf
 
03_AJMS_209_19_RA.pdf
03_AJMS_209_19_RA.pdf03_AJMS_209_19_RA.pdf
03_AJMS_209_19_RA.pdf
 
Lecture13p.pdf.pdfThedeepness of freedom are threevalues.docx
Lecture13p.pdf.pdfThedeepness of freedom are threevalues.docxLecture13p.pdf.pdfThedeepness of freedom are threevalues.docx
Lecture13p.pdf.pdfThedeepness of freedom are threevalues.docx
 
Lecture13p.pdf.pdfThedeepness of freedom are threevalues.docx
Lecture13p.pdf.pdfThedeepness of freedom are threevalues.docxLecture13p.pdf.pdfThedeepness of freedom are threevalues.docx
Lecture13p.pdf.pdfThedeepness of freedom are threevalues.docx
 
Chapter 3 Output Primitives
Chapter 3 Output PrimitivesChapter 3 Output Primitives
Chapter 3 Output Primitives
 
Vcla ppt ch=vector space
Vcla ppt ch=vector spaceVcla ppt ch=vector space
Vcla ppt ch=vector space
 

Último

CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
gajnagarg
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
amitlee9823
 
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
gajnagarg
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
karishmasinghjnh
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
amitlee9823
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 

Último (20)

CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 

2 vectors notes

  • 1. Optimization Models EE 127 / EE 227AT Laurent El Ghaoui EECS department UC Berkeley Spring 2015 Sp’15 1 / 46 LECTURE 2 Vectors and Functions Mathematicians are like Frenchmen: whatever you say to them, they translate into their own language, and turn it into something entirely di↵erent. Goethe Sp’15 2 / 46 Outline 1 Introduction Basics Examples Vector spaces 2 Inner product, angle, orthogonality 3 Projections 4 Functions and maps Hyperplanes and halfspaces Gradients Sp’15 3 / 46 Introduction A vector is a collection of numbers, arranged in a column or a row, which can be thought of as the coordinates of a point in n-dimensional space. Equipping vectors with sum and scalar multiplication allows to define notions such as independence, span, subspaces, and dimension. Further, the scalar product introduces a notion of angle between two vectors, and induces the concept of length, or norm. Via the scalar product, we can also view a vector as a linear function. We can compute the projection of a vector onto a line defined by another vector, onto a plane, or more generally onto a subspace. Projections can be viewed as a first elementary optimization problem (finding the point in a given set at minimum distance from a given point), and they constitute a basic ingredient in many processing and visualization techniques for high-dimensional data. Sp’15 4 / 46 Notes Notes Notes Notes
  • 2. Basics Notation We usually write vectors in column format: x = 2 6 6 6 4 x1 x2 ... xn 3 7 7 7 5 . Element xi is said to be the i-th component (or the i-th element, or entry) of vector x, and the number n of components is usually referred to as the dimension of x. When the components of x are real numbers, i.e. xi 2 R, then x is a real vector of dimension n, which we indicate with the notation x 2 Rn . We shall seldom need complex vectors, which are collections of complex numbers xi 2 C, i = 1, . . . , n. We denote the set of such vectors by Cn . To transform a column-vector x in row format and vice versa, we define an operation called transpose, denoted with a superscript > : x> = ⇥ x1 x2 · · · xn ⇤ ; x>> = x. Sp’15 5 / 46 Examples Example 1 (Bag-of-words representations of text) Consider the following text: “A (real) vector is just a collection of real numbers, referred to as the components (or, elements) of the vector; Rn denotes the set of all vectors with n elements. If x 2 Rn denotes a vector, we use subscripts to denote elements, so that xi is the i-th component of x. Vectors are arranged in a column, or a row. If x is a column vector, x> denotes the corresponding row vector, and vice-versa.” Row vector c = [5, 3, 3, 4] contains the number of times each word in the list V = {vector, elements, of, the} appears in the above paragraph. Dividing each entry in c by the total number of occurrences of words in the list (15, in this example), we obtain a vector x = [1/3, 1/5, 1/5, 4/15] of relative word frequencies. Frequency-based representation of text documents (bag-of-words). Sp’15 6 / 46 Examples Example 2 (Time series) A time series represents the evolution in (discrete) time of a physical or economical quantity. If x(k), k = 1, . . . , T, describes the numerical value of the quantity of interest at time k, then the whole time series, over the time horizon from 1 to T, can be represented as a T-dimensional vector x containing all the values of x(k), for k = 1 to k = T, that is x = [x(1) x(2) · · · x(T)]> 2 RT . Adjusted close price of the Dow Jones Industrial Average Index, over a 66 days period from April 19, 2012 to July 20, 2012. 10 20 30 40 50 60 1.22 1.24 1.26 1.28 1.3 1.32 1.34 1 66 Sp’15 7 / 46 Example 3 (Images) We are given a gray-scale image where each pixel has a certain value representing the level of darkness. We can arrange the image as a vector of pixels. . . . Figure : Row vector representation of an image. Sp’15 8 / 46 Notes Notes Notes Notes
  • 3. Vector spaces The operations of sum, di↵erence and scalar multiplication are defined in an obvious way for vectors: for any two vectors v(1) , v(2) having equal number of elements, we have that the sum v(1) + v(2) is simply a vector having as components the sum of the corresponding components of the addends, and the same holds for the di↵erence. If v is a vector and ↵ is a scalar (i.e., a real or complex number), then ↵v is obtained multiplying each component of v by ↵. If ↵ = 0, then ↵v is the zero vector, or origin. A vector space, X, is obtained by equipping vectors with the operations of addition and multiplication by a scalar. A simple example of a vector space is X = Rn , the space of n-tuples of real numbers. A less obvious example is the set of single-variable polynomials of a given degree. Sp’15 9 / 46 Subspaces and span A nonempty subset V of a vector space X is called a subspace of X if, for any scalars ↵, , x, y 2 V ) ↵x + y 2 V. In other words, V is “closed” under addition and scalar multiplication. A linear combination of a set of vectors S = {x(1) , . . . , x(m) } in a vector space X is a vector of the form ↵1x(1) + · · · + ↵mx(m) , where ↵1, . . . , ↵m are given scalars. The set of all possible linear combinations of the vectors in S = {x(1) , . . . , x(m) } forms a subspace, which is called the subspace generated by S, or the span of S, denoted with span(S). Given two subspaces X, Y in Rn , the direct sum of X, Y, which we denote by X Y, is the set of vectors of the form x + y, with x 2 X, y 2 Y. It is readily checked that X Y is itself a subspace. Sp’15 10 / 46 Bases and dimensions A collection x(1) , . . . , x(m) of vectors in a vector space X is said to be linearly independent if no vector in the collection can be expressed as a linear combination of the others. This is the same as the condition mX i=1 ↵i x(i) = 0 =) ↵ = 0. Given a subspace S of a vector space X, a basis of S is a set B of vectors of minimal cardinality, such that span(B) = S. The cardinality of a basis is called the dimension of S. If we have a basis {x(1) , . . . , x(d) } for a subspace S, then we can write any element in the subspace as a linear combination of elements in the basis. That is, any x 2 S can be written as x = dX i=1 ↵i x(i) , for appropriate scalars ↵i Sp’15 11 / 46 A ne sets An a ne set is a set of the form A = {x 2 X : x = v + x(0) , v 2 V}, where x(0) is a given point and V is a given subspace of X. Subspaces are just a ne spaces containing the origin. Geometrically, an a ne set is a flat passing through x(0) . The dimension of an a ne set A is defined as the dimension of its generating subspace V. A line is a one-dimensional a ne set. The line through x0 along direction u is the set L = {x 2 X : x = x0 + v, v 2 span(u)}, where in this case span(u) = { u : 2 R}. Sp’15 12 / 46 Notes Notes Notes Notes
  • 4. Euclidean length The Euclidean length of a vector x 2 Rn is the square-root of the sum of squares of the components of x, that is Euclidean length of x . = q x2 1 + x2 2 + · · · + x2 n . This formula is an obvious extension to the multidimensional case of the Pythagoras theorem in R2 . The Euclidean length represents the actual distance to be “travelled” for reaching point x from the origin 0, along the most direct way (the straight line passing through 0 and x). Sp’15 13 / 46 Basics Norms and `p norms A norm on a vector space X is a real-valued function with special properties that maps any element x 2 X into a real number kxk. Definition 1 A function from X to R is a norm, if kxk 0 8x 2 X, and kxk = 0 if and only if x = 0; kx + yk  kxk + kyk, for any x, y 2 X (triangle inequality); k↵xk = |↵|kxk, for any scalar ↵ and any x 2 X. `p norms are defined as kxkp . = nX k=1 |xk |p !1/p , 1  p < 1. Sp’15 14 / 46 Basics Norms and `p norms For p = 2 we obtain the standard Euclidean length kxk2 . = v u u t nX k=1 x2 k , or p = 1 we obtain the sum-of-absolute-values length kxk1 . = nX k=1 |xk |. The limit case p = 1 defines the `1 norm (max absolute value norm, or Chebyshev norm) kxk1 . = max k=1,...,n |xk |. The cardinality of a vector x is often called the `0 (pseudo) norm and denoted with kxk0. Sp’15 15 / 46 Inner product An inner product on a (real) vector space X is a real-valued function which maps any pair of elements x, y 2 X into a scalar denoted as hx, yi. The inner product satisfies the following axioms: for any x, y, z 2 X and scalar ↵ hx, xi 0; hx, xi = 0 if and only if x = 0; hx + y, zi = hx, zi + hy, zi; h↵x, yi = ↵hx, yi; hx, yi = hy, xi. A vector space equipped with an inner product is called an inner product space. The standard inner product defined in Rn is the “row-column” product of two vectors hx, yi = x> y = nX k=1 xk yk . The inner product induces a norm: kxk = p hx, xi. Sp’15 16 / 46 Notes Notes Notes Notes
  • 5. Angle between vectors 0 The angle between x and y is defined via the relation cos ✓ = x> y kxk2kyk2 . When x> y = 0, the angle between x and y is ✓ = ±90 , i.e., x, y are orthogonal. When the angle ✓ is 0 , or ±180 , then x is aligned with y, that is y = ↵x, for some scalar ↵, i.e., x and y are parallel. In this situation |x> y| achieves its maximum value |↵|kxk2 2. Sp’15 17 / 46 Cauchy-Schwartz and H¨older inequality Since | cos ✓|  1, it follows from the angle equation that |x> y|  kxk2kyk2, and this inequality is known as the Cauchy-Schwartz inequality. A generalization of this inequality involves general `p norms and it is known as the H¨older inequality. For any vectors x, y 2 Rn and for any p, q 1 such that 1/p + 1/q = 1, it holds that |x> y|  nX k=1 |xk yk |  kxkpkykq. Sp’15 18 / 46 Maximization of inner product over norm balls Our first optimization problem: max kxkp1 x> y. For p = 2: x⇤ 2 = y kyk2 , hence maxkxk21 x> y = kyk2. For p = 1: x⇤ 1 = sgn(y), and maxkxk11 x> y = Pn i=1 |yi | = kyk1. For p = 1: [x⇤ 1 ]i = ⇢ sgn(yi ) if i = m 0 otherwise , i = 1, . . . , n. We thus have maxkxk11 x> y = maxi |yi | = kyk1. Sp’15 19 / 46 Orthogonal vectors Generalizing the concept of orthogonality to generic inner product spaces, we say that two vectors x, y in an inner product space X are orthogonal if hx, yi = 0. Orthogonality of two vectors x, y 2 X is symbolized by x ? y. Nonzero vectors x(1) , . . . , x(d) are said to be mutually orthogonal if hx(i) , x(j) i = 0 whenever i 6= j. In words, each vector is orthogonal to all other vectors in the collection. Proposition 1 Mutually orthogonal vectors are linearly independent. A collection of vectors S = {x(1) , . . . , x(d) } is said to be orthonormal if, for i, j = 1, . . . , d, hx(i) , x(j) i = ⇢ 0 if i 6= j, 1 if i = j. In words, S is orthonormal if every element has unit norm, and all elements are orthogonal to each other. A collection of orthonormal vectors S forms an orthonormal basis for the span of S. Sp’15 20 / 46 Notes Notes Notes Notes
  • 6. Orthogonal complement A vector x 2 X is orthogonal to a subset S of an inner product space X if x ? s for all s 2 S. The set of vectors in X that are orthogonal to S is called the orthogonal complement of S, and it is denoted with S? ; 0 Theorem 1 (Orthogonal decomposition) If S is a subspace of an inner-product space X, then any vector x 2 X can be written in a unique way as the sum of an element in S and one in the orthogonal complement S? : X = S S? for any subspace S ✓ X. Sp’15 21 / 46 Projections The idea of projection is central in optimization, and it corresponds to the problem of finding a point on a given set that is closest (in norm) to a given point. Given a vector x in an inner product space X (say, e.g., X = Rn ) and a closed set S ✓ X, the projection of x onto S, denoted as ⇧S (x), is defined as the point in S at minimal distance from x: ⇧S (x) = arg min y2S ky xk, where the norm used here is the norm induced by the inner product, that is ky xk = p hy x, y xi. This simply reduces to the Euclidean norm, when using the standard inner product, in which case the projection is called Euclidean projection. Sp’15 22 / 46 Projections Theorem 2 (Projection Theorem) Let X be an inner product space, let x be a given element in X, and let S be a subspace of X. Then, there exists a unique vector x⇤ 2 S which is solution to the problem min y2S ky xk. Moreover, a necessary and su cient condition for x⇤ being the optimal solution for this problem is that x⇤ 2 S, (x x⇤ ) ? S. 0 Sp’15 23 / 46 Projections Corollary 1 (Projection on a ne set) Let X be an inner product space, let x be a given element in X, and let A = x(0) + S be the a ne set obtained by translating a given subspace S by a given vector x(0) . Then, there exists a unique vector x⇤ 2 A which is solution to the problem min y2A ky xk. Moreover, a necessary and su cient condition for x⇤ to be the optimal solution for this problem is that x⇤ 2 A, (x x⇤ ) ? S. 0 Sp’15 24 / 46 Notes Notes Notes Notes
  • 7. Projections Euclidean projection of a point onto a line Let p 2 Rn be a given point. We want to compute the Euclidean projection p⇤ of p onto a line L = {x0 + span(u)}, kuk2 = 1: p⇤ = arg min x2L kx pk2. Since any point x 2 L can be written as x = x0 + v, for some v 2 span(u), the above problem is equivalent to finding a value v⇤ for v, such that v⇤ = arg min v2span(u) kv (p x0)k2. Sp’15 25 / 46 Projections Euclidean projection of a point onto a line The solution must satisfy the orthogonality condition (z v⇤ )?u. Recalling that v⇤ = ⇤ u and u> u = kuk2 2 = 1, we hence have u> z u> v⇤ = 0 , u> z ⇤ = 0 , ⇤ = u> z = u> (p x0). The optimal point p⇤ is thus given by p⇤ = x0 + v⇤ = x0 + ⇤ u = x0 + u> (p x0)u, The squared distance from p to the line is kp p⇤ k2 2 = kp x0k2 2 ⇤2 = kp x0k2 2 (u> (p x0))2 . Sp’15 26 / 46 Projections Example: Senate voting data The above is a n ⇥ m matrix X of votes in the 2004-2006 US Senate, with n = # Senators, m = # bills. Sp’15 27 / 46 Projections Scoring senators Let’s find a way to score senators, by assigning to each a value f (x) = u> x + v where x 2 Rm is the vector of votes for a given senator (corresponds to a row in the voting matrix), and u 2 Rm , v 2 R. Without loss of generality, we enforce that the average values of f across Senators is zero, so that f (x) = u> (x ˆx), with ˆx 2 Rm is the vector of average votes (across Senators). Again WLOG, we can always scale u so that kuk2 = 1. In that case, the values of f correspond to the projections of the Senators points xi on the line passing through ˆx with direction given by u. Sp’15 28 / 46 Notes Notes Notes Notes
  • 8. Projections Example: Senate voting data Random direction Average direction (u ⇠ all ones) Clearly some directions are more informative than others . . . Sp’15 29 / 46 Projections Euclidean projection of a point onto an hyperplane A hyperplane is an a ne set defined as H = {z 2 Rn : a> z = b}, where a 6= 0 is called a normal direction of the hyperplane, since for any two vectors z1, z2 2 H it holds that (z1 z2)?a. Given p 2 Rn we want to determine the Euclidean projection p⇤ of p onto H. The projection theorem requires p p⇤ to be orthogonal to H. Since a is a direction orthogonal to H, the condition (p p⇤ )?H is equivalent to saying that p p⇤ = ↵a, for some ↵ 2 R. Sp’15 30 / 46 Projections Euclidean projection of a point onto an hyperplane To find ↵, consider that p⇤ 2 H, thus a> p⇤ = b, and multiply the previous equation on the left by a> , obtaining a> p b = ↵kak2 2 whereby ↵ = a> p b kak2 2 , and p⇤ = p a> p b kak2 2 a. The distance from p to H is kp p⇤ k2 = |↵| · kak2 = |a> p b| kak2 . Sp’15 31 / 46 Projections Projection on a vector span Suppose we have a basis for a subspace S ✓ X, that is S = span(x(1) , . . . , x(d) ). Given x 2 X, the Projection Theorem states that the unique projection x⇤ of x onto S is characterized by (x x⇤ ) ? S. Since x⇤ 2 S, we can write x⇤ as some (unknown) linear combination of the elements in the basis of S, that is x⇤ = dX i=1 ↵i x(i) . Then (x x⇤ ) ? S , hx x⇤ , x(k) i = 0, k = 1, . . . , d: dX i=1 ↵i hx(k) , x(i) i = hx(k) , xi, k = 1, . . . , d. Solving this system of linear equations provides the coe cients ↵, and hence the desired x⇤ . Sp’15 32 / 46 Notes Notes Notes Notes
Projections
Projection onto the span of orthonormal vectors

If we have an orthonormal basis for a subspace S = span(S), then it is immediate to obtain the projection x* of x onto that subspace. This is due to the fact that, in this case, the Gram system of equations immediately gives the coefficients

    α_k = ⟨x^(k), x⟩, k = 1, . . . , d.

Therefore, we have that

    x* = Σ_{i=1}^d ⟨x^(i), x⟩ x^(i).

Given a basis S = {x^(1), . . . , x^(d)} for a subspace S = span(S), there are numerical procedures to construct an orthonormal basis for the same subspace (e.g., the Gram–Schmidt procedure and QR factorization).

Sp'15 33 / 46

Functions and maps

A function takes a vector argument in R^n and returns a unique value in R. We use the notation

    f : R^n → R

to refer to a function with "input" space R^n. The "output" space for functions is R.
For example, the function f : R^2 → R with values f(x) = sqrt((x1 − y1)^2 + (x2 − y2)^2) gives the Euclidean distance from the point (x1, x2) to a given point (y1, y2).
We allow functions to take infinite values. The domain of a function f, denoted dom f, is defined as the set of points where the function is finite.

Sp'15 34 / 46

Functions and maps

We usually reserve the term map to refer to vector-valued functions, that is, functions that return a vector of values rather than a single scalar. We use the notation

    f : R^n → R^m

to refer to a map with input space R^n and output space R^m. The components of the map f are the (scalar-valued) functions f_i, i = 1, . . . , m.

Sp'15 35 / 46

Sets related to functions

[Figure: graph and epigraph of a scalar function of one variable.]

Consider a function f : R^n → R. The graph and the epigraph of f are both subsets of R^{n+1}.
The graph of f is the set of input–output pairs that f can attain, that is,

    graph f = {(x, f(x)) ∈ R^{n+1} : x ∈ R^n}.

The epigraph, denoted epi f, describes the set of input–output pairs that f can achieve, as well as "anything above":

    epi f = {(x, t) ∈ R^{n+1} : x ∈ R^n, t ≥ f(x)}.

Sp'15 36 / 46
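To illustrate the two routes to the same projection discussed above — solving the Gram system versus using an orthonormal basis — here is a short sketch assuming NumPy, with an arbitrary basis as example data; numpy.linalg.qr stands in for the Gram–Schmidt/QR step mentioned in the slide.

```python
import numpy as np

np.random.seed(1)
A = np.random.randn(6, 3)        # columns x^(1), ..., x^(3): a basis of S (example data)
x = np.random.randn(6)           # point to be projected

# Route 1: solve the Gram system  sum_i alpha_i <x^(k), x^(i)> = <x^(k), x>.
G = A.T @ A                      # Gram matrix, G[k, i] = <x^(k), x^(i)>
rhs = A.T @ x                    # rhs[k] = <x^(k), x>
alpha = np.linalg.solve(G, rhs)
x_star_gram = A @ alpha

# Route 2: orthonormal basis via QR, then x* = sum_i <q^(i), x> q^(i).
Q, _ = np.linalg.qr(A)           # columns of Q: orthonormal basis of S
x_star_qr = Q @ (Q.T @ x)

print(np.allclose(x_star_gram, x_star_qr))   # True: both give the same projection
```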
Sets related to functions

[Figure: contour lines (level sets) of a function of two variables.]

A level set (or contour line) is the set of points that achieve exactly some value of the function f. For t ∈ R, the t-level set of the function f is defined as

    C_f(t) = {x ∈ R^n : f(x) = t}.

The t-sublevel set of f is the set of points that achieve at most a certain value of f:

    L_f(t) = {x ∈ R^n : f(x) ≤ t}.

Sp'15 37 / 46

Linear and affine functions

Linear functions are functions that preserve scaling and addition of the input argument. A function f : R^n → R is linear if and only if ∀x ∈ R^n and α ∈ R, f(αx) = αf(x); and ∀x1, x2 ∈ R^n, f(x1 + x2) = f(x1) + f(x2).
A function f is affine if and only if the function f̃(x) = f(x) − f(0) is linear (affine = linear + constant).
Consider the functions f1, f2, f3 : R^2 → R defined below:

    f1(x) = 3.2 x1 + 2 x2,
    f2(x) = 3.2 x1 + 2 x2 + 0.15,
    f3(x) = 0.001 x2^2 + 2.3 x1 + 0.3 x2.

The function f1 is linear; f2 is affine; f3 is neither linear nor affine (f3 is a quadratic function).

Sp'15 38 / 46

Linear and affine functions

Linear or affine functions can be conveniently defined by means of the standard inner product. A function f : R^n → R is affine if and only if it can be expressed as

    f(x) = a^T x + b,

for some unique pair (a, b), with a ∈ R^n and b ∈ R. The function is linear if and only if b = 0.
Vector a ∈ R^n can thus be viewed as a (linear) map from the "input" space R^n to the "output" space R.
For any affine function f, we can obtain a and b as follows: b = f(0), and a_i = f(e_i) − b, i = 1, . . . , n, where e_i is the i-th standard basis vector.

Sp'15 39 / 46

Hyperplanes and halfspaces

A hyperplane in R^n is a set of the form

    H = {x ∈ R^n : a^T x = b},

where a ∈ R^n, a ≠ 0, and b ∈ R are given.
Equivalently, we can think of hyperplanes as the level sets of linear functions.
When b = 0, the hyperplane is simply the set of points that are orthogonal to a (i.e., H is an (n − 1)-dimensional subspace).

Sp'15 40 / 46
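As a small numerical check of the recipe b = f(0), a_i = f(e_i) − b, the sketch below (assuming NumPy) recovers the representation of the affine function f2 from the earlier slide using function evaluations only.

```python
import numpy as np

def f2(x):
    # Affine example from the slide: f2(x) = 3.2*x1 + 2*x2 + 0.15.
    return 3.2 * x[0] + 2.0 * x[1] + 0.15

n = 2
b = f2(np.zeros(n))                                      # b = f(0)
a = np.array([f2(np.eye(n)[i]) - b for i in range(n)])   # a_i = f(e_i) - b

print(a, b)                      # recovers a = [3.2, 2.0], b = 0.15
x = np.array([1.5, -2.0])
print(f2(x), a @ x + b)          # the two values agree
```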
Hyperplanes and halfspaces

A hyperplane H separates the whole space into two regions:

    H−− = {x : a^T x ≤ b},    H++ = {x : a^T x > b}.

These regions are called halfspaces (H−− is a closed halfspace, H++ is an open halfspace).
The halfspace H−− is the region delimited by the hyperplane H = {x : a^T x = b} and lying in the direction opposite to vector a. Similarly, the halfspace H++ is the region lying above the hyperplane (i.e., in the direction of a).

Sp'15 41 / 46

Gradients

The gradient of a function f : R^n → R at a point x where f is differentiable, denoted ∇f(x), is the column vector of first derivatives of f with respect to x1, . . . , xn:

    ∇f(x) = [∂f(x)/∂x1 · · · ∂f(x)/∂xn]^T.

When n = 1 (there is only one input variable), the gradient is simply the derivative.
An affine function f : R^n → R, represented as f(x) = a^T x + b, has a very simple gradient: ∇f(x) = a.

Example 4
The distance function ρ(x) = ||x − p||_2 = sqrt(Σ_{i=1}^n (x_i − p_i)^2) has gradient

    ∇ρ(x) = (x − p) / ||x − p||_2.

Sp'15 42 / 46

Affine approximation of nonlinear functions

A nonlinear function f : R^n → R can be approximated locally via an affine function, using a first-order Taylor series expansion.
Specifically, if f is differentiable at a point x0, then for all points x in a neighborhood of x0 we have that

    f(x) = f(x0) + ∇f(x0)^T (x − x0) + ε(x),

where the error term ε(x) goes to zero faster than first order as x → x0, that is,

    lim_{x → x0} ε(x) / ||x − x0||_2 = 0.

In practice, this means that for x sufficiently close to x0 we can write the approximation

    f(x) ≈ f(x0) + ∇f(x0)^T (x − x0).

Sp'15 43 / 46

Geometric interpretation of the gradient

The gradient of a function can be interpreted in the context of the level sets. Geometrically, the gradient of f at a point x0 is a vector ∇f(x0) perpendicular to the contour line of f at level α = f(x0), pointing from x0 outward from the α-sublevel set (that is, it points towards higher values of the function).

[Figures: surface plot and contour lines of a function in the (x1, x2) plane, with gradients drawn perpendicular to the contour lines.]

Sp'15 44 / 46
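The sketch below (assuming NumPy; the points p and x0 are made-up test data) checks the gradient of the distance function from Example 4 against finite differences, and shows that the error of the first-order affine approximation vanishes faster than ||x − x0||_2.

```python
import numpy as np

p = np.array([1.0, -1.0, 2.0])                         # fixed point (example data)
rho = lambda x: np.linalg.norm(x - p)                  # rho(x) = ||x - p||_2
grad_rho = lambda x: (x - p) / np.linalg.norm(x - p)   # gradient from Example 4

x0 = np.array([3.0, 0.5, -1.0])

# Central finite differences agree with the closed-form gradient.
h = 1e-6
fd = np.array([(rho(x0 + h * e) - rho(x0 - h * e)) / (2 * h) for e in np.eye(3)])
print(np.allclose(fd, grad_rho(x0), atol=1e-5))        # True

# First-order Taylor error divided by ||x - x0||_2 goes to zero as x -> x0.
for t in [1e-1, 1e-2, 1e-3]:
    x = x0 + t * np.ones(3)
    approx = rho(x0) + grad_rho(x0) @ (x - x0)
    print(t, abs(rho(x) - approx) / np.linalg.norm(x - x0))
```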
Geometric interpretation of the gradient

The gradient ∇f(x0) also represents the direction along which the function has the maximum rate of increase (steepest ascent direction).
Let v be a unit direction vector (i.e., ||v||_2 = 1), let ε ≥ 0, and consider moving away from x0 at distance ε along direction v, that is, consider a point x = x0 + εv. We have that

    f(x0 + εv) ≈ f(x0) + ε ∇f(x0)^T v, for ε → 0,

or, equivalently,

    lim_{ε → 0} ( f(x0 + εv) − f(x0) ) / ε = ∇f(x0)^T v.

Whenever ε > 0 and v is such that ∇f(x0)^T v > 0, then f is increasing along the direction v, for small ε.
The inner product ∇f(x0)^T v measures the rate of variation of f at x0 along direction v, and it is usually referred to as the directional derivative of f along v.

Sp'15 45 / 46

Geometric interpretation of the gradient

The rate of variation is thus zero if v is orthogonal to ∇f(x0): along such a direction the function value remains constant (to first order), that is, this direction is tangent to the contour line of f at x0.
On the contrary, the rate of variation is maximal when v is parallel to ∇f(x0), hence along the normal direction to the contour line at x0.

Sp'15 46 / 46
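Finally, a short sketch (assuming NumPy; the function f and the point x0 are made up for illustration) that estimates directional derivatives by finite differences and confirms that the rate of increase is maximal along ∇f(x0) and roughly zero along a direction tangent to the contour line.

```python
import numpy as np

# A smooth example function and its gradient (both made up for illustration).
f = lambda x: np.sin(x[0]) + x[0] * x[1]**2
grad_f = lambda x: np.array([np.cos(x[0]) + x[1]**2, 2 * x[0] * x[1]])

x0 = np.array([0.5, 1.0])
g = grad_f(x0)

def directional_derivative(v, eps=1e-6):
    v = v / np.linalg.norm(v)                  # unit direction
    return (f(x0 + eps * v) - f(x0)) / eps     # finite-difference estimate

v_steepest = g / np.linalg.norm(g)                        # along the gradient
v_tangent = np.array([-g[1], g[0]]) / np.linalg.norm(g)   # orthogonal to the gradient

print(directional_derivative(v_steepest), np.linalg.norm(g))  # maximal rate = ||grad||
print(directional_derivative(v_tangent))                      # ~ 0: tangent to the contour
v = np.array([1.0, 1.0])
print(directional_derivative(v), g @ v / np.linalg.norm(v))   # matches grad^T v / ||v||
```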