2. Outline
1 Why Linear Algebra
Why and What?
A Little Bit of History
2 The Beginning
Fields
3 Vector Space
Introduction
Some Notes on Notation
Use of Linear Algebra in Regression...
Sub-spaces and Linear Combinations
Recognizing Sub-spaces
Combinations
4 Basis and Dimensions
Basis
Coordinates
Basis and Dimensions
5 Application in Machine Learning
Feature Vector
Least Squared Error
4. Introduction
What is this class about?
It is clear that the use of mathematics is essential for the data mining and
machine learning fields.
Therefore...
Understanding mathematical modeling is part of the deal...
If you want to be
A Good Data Scientist!!!
7. Example
Imagine
A web surfer moves from one web page to another web page...
Question: How do you model this?
You can use a graph!!!
(Figure: a graph whose nodes, web pages 1-14, are connected by links.)
10. Thus
We can build a matrix

M = \begin{pmatrix} P_{11} & P_{12} & \cdots & P_{1N} \\ P_{21} & P_{22} & \cdots & P_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ P_{N1} & P_{N2} & \cdots & P_{NN} \end{pmatrix}    (1)

Thus, it is possible to obtain certain information by looking at the eigenvectors and eigenvalues.
These vectors v_\lambda and values \lambda have the property that

M v_\lambda = \lambda v_\lambda    (2)
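A small sketch (not part of the original slides) may make equation (2) concrete: the snippet below builds a toy column-stochastic transition matrix for three pages and approximates its dominant eigenvector by power iteration, which is the idea behind PageRank. The matrix values are invented for the illustration, and the damping factor used by the real algorithm is omitted.

```python
import numpy as np

# Toy column-stochastic transition matrix for 3 pages (illustrative values):
# entry M[i, j] is the probability of jumping from page j to page i.
M = np.array([
    [0.0, 0.5, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0],
])

# Power iteration: repeatedly applying M drives any starting distribution
# toward the eigenvector v with Mv = 1 * v, which ranks the pages.
v = np.ones(3) / 3
for _ in range(100):
    v = M @ v
    v /= v.sum()            # keep v a probability distribution

print("ranking vector:", v)               # approximately [0.444, 0.222, 0.333]
print("Mv close to v:", np.allclose(M @ v, v))
```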
12. This is the Basis of PageRank in Google
For example
Look at this example...
14. About 4000 years ago
Babylonians knew how to solve the following kind of system
ax + by = c
dx + ey = f
As always, the first steps in any field of knowledge tend to be slow.
It was only after the deaths of Plato and Aristotle that the Chinese (Nine
Chapters on the Mathematical Art, ca. 200 B.C.) were able to solve 3 × 3
systems,
by working an "elimination method"
similar to the one devised by Gauss 2000 years later for general systems.
17. Not only that
The Matrix
Gauss implicitly defined the concept of a matrix as a linear transformation
in his book "Disquisitiones."
The Final Definition of a Matrix
It was introduced by Cayley in two papers, in 1850 and 1858, which allowed
him to prove the important Cayley-Hamilton Theorem.
There is quite a lot more
Kleiner, I., A History of Abstract Algebra (Birkhäuser Boston, 2007).
20. Matrices can help to represent many things
They are important for many calculations, such as

a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1,
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2,
\vdots
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = b_m.

It is clear
We would like to collect those linear equations in a compact structure that
allows for simpler manipulation.
22. Therefore, we have
For example

x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \quad b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix} \quad \text{and} \quad A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}

Using a little notation

Ax = b
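As a concrete illustration of the compact form Ax = b, here is a minimal sketch that sets up a small 3 × 3 system with made-up coefficients and solves it with NumPy:

```python
import numpy as np

# A small made-up system written in the compact form Ax = b.
A = np.array([[ 2.0,  1.0, -1.0],
              [-3.0, -1.0,  2.0],
              [-2.0,  1.0,  2.0]])
b = np.array([8.0, -11.0, -3.0])

# Solve for x (here A is square and invertible, so the solution is unique).
x = np.linalg.solve(A, b)

print("x =", x)                          # expected: [ 2.  3. -1.]
print("Ax close to b:", np.allclose(A @ x, b))
```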
25. Introduction
As always, we start with a simple fact
Everything is an element in a set.
For example
The set of real numbers R.
The set of n-tuples in R^n.
The set of complex numbers C.
29. Definition
We shall say that K is a field if it satisfies the following conditions for addition

Property: Formalism
Addition is commutative: x + y = y + x for all x, y ∈ K
Addition is associative: x + (y + z) = (x + y) + z for all x, y, z ∈ K
Existence of 0: x + 0 = x for every x ∈ K
Existence of the additive inverse: for every x ∈ K there exists −x ∈ K such that x + (−x) = 0
30. Furthermore
We have the following properties for the product

Property: Formalism
Product is commutative: xy = yx for all x, y ∈ K
Product is associative: x(yz) = (xy)z for all x, y, z ∈ K
Existence of 1: 1x = x1 = x for every x ∈ K
Existence of the multiplicative inverse: for every x ≠ 0 there exists x^{-1} (or 1/x) in K such that x x^{-1} = 1
Multiplication is distributive over addition: x(y + z) = xy + xz for all x, y, z ∈ K
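To make the field axioms concrete, here is a minimal sketch, assuming the standard fact that the integers modulo a prime form a field: every nonzero element of Z_7 has a multiplicative inverse, and distributivity holds. The prime p = 7 and the helper names add, mul, inv are illustrative choices, not from the slides.

```python
# Arithmetic in the finite field Z_p (integers modulo a prime p).
p = 7  # illustrative prime

def add(x, y):
    return (x + y) % p

def mul(x, y):
    return (x * y) % p

def inv(x):
    # Multiplicative inverse via Fermat's little theorem: x^(p-2) mod p.
    assert x % p != 0, "0 has no multiplicative inverse"
    return pow(x, p - 2, p)

# Every nonzero element has an inverse, as the field axioms require.
for x in range(1, p):
    assert mul(x, inv(x)) == 1

# Distributivity also holds: x(y + z) = xy + xz.
assert mul(3, add(4, 6)) == add(mul(3, 4), mul(3, 6))
print("Z_7 satisfies the checked field axioms")
```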
31. Therefore
Examples
1 For example, the reals R and the complex numbers C.
2 In addition, we have the rationals Q too.
The elements of the field will also be called numbers
Thus, we will use these ideas to define a Vector Space V over a field K.
34. Then, we get a crazy moment
How do we relate these numbers to obtain certain properties?
We have then the vector and matrix structures for this...

\begin{pmatrix} a_{11} & \cdots & \cdots & a_{1n} \\ a_{21} & \cdots & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & \cdots & \cdots & a_{nn} \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \end{pmatrix}
36. Vector Space V
Definition
A vector space V over the field K is a set of objects which can be added
and multiplied by elements of K.
Where
The sum of two elements of V is again an element of V .
The product of an element of V by an element of K is an element of
V .
40. Properties
We have then
1 Given elements u, v, w of V , we have (u + v) + w = u + (v + w).
2 There is an element of V , denoted by O, such that
O + u = u + O = u for all elements u of V .
3 Given an element u of V , there exists an element −u in V such that
u + (−u) = O.
4 For all elements u, v of V , we have u + v = v + u.
5 For all elements u of V , we have 1 · u = u.
6 If c is a number, then c (u + v) = cu + cv.
7 If a, b are two numbers, then (ab) v = a (bv).
8 If a, b are two numbers, then (a + b) v = av + bv.
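These axioms can be checked numerically for the familiar case V = R^n over K = R. The short sketch below verifies each property on random sample vectors with NumPy; it is a sanity check on examples, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
u, v, w = rng.normal(size=(3, 5))   # three random vectors in R^5
a, b = 2.5, -1.3                    # two scalars from K = R

# Associativity and commutativity of addition (properties 1 and 4).
assert np.allclose((u + v) + w, u + (v + w))
assert np.allclose(u + v, v + u)

# Zero vector and additive inverse (properties 2 and 3).
zero = np.zeros(5)
assert np.allclose(u + zero, u) and np.allclose(u + (-u), zero)

# Scalar properties 5-8.
assert np.allclose(1 * u, u)
assert np.allclose(a * (u + v), a * u + a * v)
assert np.allclose((a * b) * v, a * (b * v))
assert np.allclose((a + b) * v, a * v + b * v)
print("All checked axioms hold for these samples")
```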
49. Notation
First, we write u + (−v)
as u − v.
For O
We will sometimes write 0.
The elements of the field K
They may be called numbers or scalars.
54. Therefore
We can represent these relations as vectors

\begin{pmatrix} \text{Square Feet} \\ \text{Price} \end{pmatrix} = \begin{pmatrix} 2104 \\ 400 \end{pmatrix}, \begin{pmatrix} 1800 \\ 460 \end{pmatrix}, \begin{pmatrix} 1600 \\ 300 \end{pmatrix}, \ldots

Thus, we can start using
All the tools that Linear Algebra can provide!!!
56. Thus
We can fit a line/hyperplane to forecast prices
57. Thus, Our Objective
To find such a hyperplane
To forecast the price of a house given its surface area!!!
Here is where "Learning" comes in
Basically, the process defined in Machine Learning!!!
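A tiny sketch of this objective, using only the three sample (square feet, price) pairs listed above and NumPy's least-squares routine; treating the prices as thousands of dollars is an assumption made for the example.

```python
import numpy as np

# The three sample points from the slides: (square feet, price).
sqft  = np.array([2104.0, 1800.0, 1600.0])
price = np.array([400.0, 460.0, 300.0])   # assumed to be in thousands

# Fit the line price ~ a * sqft + b by least squares.
X = np.column_stack([sqft, np.ones_like(sqft)])   # design matrix [x, 1]
(a, b), *_ = np.linalg.lstsq(X, price, rcond=None)

print(f"price ~ {a:.3f} * sqft + {b:.1f}")
print("forecast for 1950 sqft:", a * 1950 + b)
```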
60. Sub-spaces
Definition
Let V be a vector space and W ⊆ V. Then W is a subspace if:
1 If v, w ∈ W, then v + w ∈ W.
2 If v ∈ W and c ∈ K, then cv ∈ W.
3 The element 0 ∈ V is also an element of W.
65. Some ways of recognizing Sub-spaces
Theorem
A non-empty subset W of V is a subspace of V if and only if for each pair
of vectors v, w ∈ W and each scalar c ∈ K the vector cv + w ∈ W.
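As an illustration of this criterion, the sketch below checks numerically that the plane W = {(x, y, z) ∈ R^3 : x + y + z = 0} is closed under cv + w for random samples; the choice of W and the tolerance are made up for the example.

```python
import numpy as np

def in_W(u, tol=1e-9):
    # Membership test for W = {(x, y, z) in R^3 : x + y + z = 0}.
    return abs(u.sum()) < tol

def random_element_of_W(rng):
    # Pick x, y freely and set z = -(x + y) so the vector lies in W.
    x, y = rng.normal(size=2)
    return np.array([x, y, -(x + y)])

rng = np.random.default_rng(1)
for _ in range(1000):
    v, w = random_element_of_W(rng), random_element_of_W(rng)
    c = rng.normal()
    assert in_W(c * v + w)   # the criterion cv + w in W holds

print("cv + w stayed in W for all sampled v, w, c")
```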
68. Linear Combinations
Definition
Let V be an arbitrary vector space, and let v_1, v_2, ..., v_n ∈ V and
x_1, x_2, ..., x_n ∈ K. Then, an expression of the form
x_1 v_1 + x_2 v_2 + ... + x_n v_n    (3)
is called a linear combination of v_1, v_2, ..., v_n.
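A linear combination is easy to compute explicitly; here is a minimal sketch in R^3 with invented vectors and coefficients.

```python
import numpy as np

# Vectors v1, v2, v3 in R^3 and scalars x1, x2, x3 (illustrative values).
v1 = np.array([1.0, 0.0, 2.0])
v2 = np.array([0.0, 1.0, -1.0])
v3 = np.array([3.0, 1.0, 0.0])
x1, x2, x3 = 2.0, -1.0, 0.5

# The linear combination x1*v1 + x2*v2 + x3*v3, as in equation (3).
combo = x1 * v1 + x2 * v2 + x3 * v3
print("x1*v1 + x2*v2 + x3*v3 =", combo)   # [ 3.5 -0.5  5. ]
```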
69. Classic Examples
Endmember Representation in Hyperspectral Images
Look at the board
Geometric Representation of addition of forces in Physics
Look at the board!!
71. Properties and Definitions
Theorem
Let V be a vector space over the field K. The intersection of any
collection of sub-spaces of V is a subspace of V .
Definition
Let S be a set of vectors in a vector space V .
The sub-space spanned by S is defined as the intersection W of all
sub-spaces of V which contain S.
When S is a finite set of vectors, S = {v1, v2, . . . , vn}, we shall
simply call W the sub-space spanned by the vectors v1, v2, . . . , vn.
75. We get the following Theorem
Theorem
The subspace spanned by a non-empty set S (S ≠ ∅) is the set of all linear
combinations of vectors in S.
77. Linear Independence
Definition
Let V be a vector space over a field K, and let v_1, v_2, ..., v_n ∈ V. We
say that v_1, v_2, ..., v_n are linearly dependent over K if there are elements
a_1, a_2, ..., a_n ∈ K, not all equal to 0, such that
a_1 v_1 + a_2 v_2 + ... + a_n v_n = O
Thus
If there are no such numbers, then we say that v_1, v_2, ..., v_n
are linearly independent.
We have the following
Example!!!
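One practical way to test linear independence of concrete vectors is to stack them into a matrix and compare its rank with the number of vectors; the sketch below does this with NumPy on two small made-up sets.

```python
import numpy as np

def linearly_independent(vectors):
    # The vectors are independent iff the matrix having them as rows has full row rank.
    A = np.vstack(vectors)
    return np.linalg.matrix_rank(A) == len(vectors)

v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = np.array([1.0, 1.0, 2.0])   # v3 = v1 + v2, so this set is dependent

print(linearly_independent([v1, v2]))       # True
print(linearly_independent([v1, v2, v3]))   # False
```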
80. Basis
Definition
If elements v1, v2, ..., vn generate V and in addition are linearly
independent, then {v1, v2, ..., vn} is called a basis of V. In other words,
the elements v1, v2, ..., vn form a basis of V .
Examples
The Classic Ones!!!
83. Coordinates
Theorem
Let V be a vector space. Let v1, v2, ..., vn be linearly independent elements
of V. Let x1, . . . , xn and y1, . . . , yn be numbers. Suppose that we have
x1v1 + x2v2 + · · · + xnvn = y1v1 + y2v2 + · · · + ynvn (4)
Then, xi = yi for all i = 1, . . . , n.
84. Coordinates
Let V be a vector space, and let {v1, v2, ..., vn} be a basis of V
For all v ∈ V , v = x1v1 + x2v2 + · · · + xnvn.
Thus, this n-tuple is uniquely determined by v
We call (x1, x2, . . . , xn) the coordinates of v with respect to the basis.
The n-tuple X = (x1, x2, . . . , xn)
It is the coordinate vector of v with respect to the basis {v1, v2, ..., vn}.
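Computing the coordinates of a vector with respect to a basis amounts to solving a linear system: put the basis vectors as the columns of a matrix B and solve Bx = v. The basis and the vector below are invented for the illustration.

```python
import numpy as np

# A (made-up) basis of R^3, written as the columns of B.
b1 = np.array([1.0, 0.0, 0.0])
b2 = np.array([1.0, 1.0, 0.0])
b3 = np.array([1.0, 1.0, 1.0])
B = np.column_stack([b1, b2, b3])

v = np.array([2.0, 3.0, 4.0])

# Coordinates x of v with respect to {b1, b2, b3}: solve B x = v.
x = np.linalg.solve(B, v)
print("coordinates:", x)                                     # [-1. -1.  4.]
print("check:", np.allclose(x[0]*b1 + x[1]*b2 + x[2]*b3, v))
```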
88. Properties of a Basis
Theorem - (Limit in the size of the basis)
Let V be a vector space over a field K with a basis {v1, v2, ..., vm}. Let
w1, w2, ..., wn be elements of V , and assume that n > m. Then
w1, w2, ..., wn are linearly dependent.
Examples
Matrix Space
Canonical Space vectors
etc
91. Some Basic Definitions
We will define the dimension of a vector space V over K
As the number of elements in the basis.
Denoted by dimK V , or simply dim V
Therefore
A vector space with a basis consisting of a finite number of elements, or
the zero vector space, is called finite dimensional.
Now
Is this number unique?
94. Maximal Set of Linearly Independent Elements
Theorem
Let V be a vector space, and {v1, v2, ..., vn} a maximal set of linearly
independent elements of V . Then, {v1, v2, ..., vn} is a basis of V .
Theorem
Let V be a vector space of dimension n, and let v1, v2, ..., vn be linearly
independent elements of V . Then, v1, v2, ..., vn constitutes a basis of V .
97. Equality between Basis
Corollary
Let V be a vector space and let W be a subspace. If dim W = dim V
then V = W.
Proof
At the Board...
Corollary
Let V be a vector space of dimension n. Let r be a positive integer with
r < n, and let v1, v2, ..., vr be linearly independent elements of V. Then
one can find elements vr+1, vr+2, ..., vn such that {v1, v2, ..., vn} is a
basis of V .
Proof
At the Board...
101. Finally
Theorem
Let V be a vector space having a basis consisting of n elements. Let W be
a subspace which does not consist of O alone. Then W has a basis, and
the dimension of W is ≤ n.
Proof
At the Board...
104. Feature Vector
Definition
A feature vector is an n-dimensional vector of numerical features that
represents an object.
Why is this important?
This allows us to use linear algebra to represent basic classification
algorithms because
The tuples {(x, y) | x ∈ K^n, y ∈ K} can be easily used to design
specific algorithms.
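A small sketch of this representation: each object becomes a feature vector x ∈ R^n paired with a label y, so a dataset is just a matrix X and a vector y. The feature names and values below are invented for the example.

```python
import numpy as np

# Each row is a feature vector x in R^3 for one house:
# (square feet, number of bedrooms, age in years) -- invented features.
X = np.array([
    [2104.0, 3.0, 10.0],
    [1800.0, 4.0,  2.0],
    [1600.0, 2.0, 25.0],
])
# Labels y in K = R: the price of each house (in thousands, assumed).
y = np.array([400.0, 460.0, 300.0])

# The dataset is the set of tuples (x_i, y_i).
dataset = list(zip(X, y))
print("first tuple:", dataset[0])
print("X has shape", X.shape, "and y has shape", y.shape)
```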
107. Least Squared Error
We need to fit a series of points to a certain function
We want
The general problem: given a set of functions f_1, f_2, ..., f_K, find values of
the coefficients a_1, a_2, ..., a_K such that the linear combination

y = a_1 f_1(x) + \cdots + a_K f_K(x)    (5)

fits the data.
109. Thus
We have that, given the dataset {(x_1, y_1), ..., (x_N, y_N)},

\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i.    (6)

Thus, we have the following problem
A possible high variance in the data itself
Variance

\sigma_x^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2    (7)
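For concreteness, here are equations (6) and (7) computed with NumPy on a few made-up sample values.

```python
import numpy as np

x = np.array([2104.0, 1800.0, 1600.0, 2300.0])   # made-up sample values

# Equation (6): the sample mean.
x_bar = x.sum() / len(x)

# Equation (7): the (population) variance around the mean.
var_x = ((x - x_bar) ** 2).sum() / len(x)

print("mean:", x_bar, " variance:", var_x)
print("matches np.var:", np.isclose(var_x, np.var(x)))
```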
112. Now
Assume
A linear equation y = ax + b; then y − (ax + b) ≈ 0.
We get a series of errors given the following observations
{(x_1, y_1), ..., (x_N, y_N)}:
{y_1 − (a x_1 + b), ..., y_N − (a x_N + b)}.
Then, the mean of the squared errors should be really small (if it is a good fit)

\sigma_{y-(ax+b)}^2 = \frac{1}{N} \sum_{i=1}^{N} (y_i - (a x_i + b))^2    (8)
115. Thus
We can define the error terms E_i(a, b) = y_i − (a x_i + b) and the total squared error

E(a, b) = \sum_{i=1}^{N} E_i(a, b)^2 = \sum_{i=1}^{N} (y_i - (a x_i + b))^2    (9)

We want to minimize the previous equation

\frac{\partial E}{\partial a} = 0, \quad \frac{\partial E}{\partial b} = 0.
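Setting these partial derivatives to zero gives two linear equations in a and b, the normal equations. The sketch below solves them directly and checks the result against NumPy's polyfit; the sample data are invented for the illustration.

```python
import numpy as np

# Invented sample data (x_i, y_i).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

N = len(x)
# Setting dE/da = 0 and dE/db = 0 yields the normal equations:
#   a * sum(x_i^2) + b * sum(x_i) = sum(x_i * y_i)
#   a * sum(x_i)   + b * N        = sum(y_i)
A = np.array([[np.sum(x**2), np.sum(x)],
              [np.sum(x),    N        ]])
rhs = np.array([np.sum(x * y), np.sum(y)])
a, b = np.linalg.solve(A, rhs)

print(f"fitted line: y ~ {a:.3f} x + {b:.3f}")
print("matches np.polyfit:", np.allclose([a, b], np.polyfit(x, y, 1)))
```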
117. Finally
Look at the Board
We need to obtain the necessary equations.