Presentation at the 7th Lithuanian Young Scientists' Conference „Operacijų tyrimas ir taikymai“ (Operations Research and Applications),
„Kompiuterininkų dienos – 2015“, Panevėžys, KTU PTVF, 2013-09-18
4. Multidimensional scaling
• Multidimensional scaling (MDS) is used to
visualize multidimensional data in a
lower-dimensional space, usually a two- or
three-dimensional one.
• Multidimensional scaling is widely applied in
many fields of science, such as economics and
psychology.
5. Dissimilarities and distances in
multidimensional scaling
• Let us assume that every $n$-dimensional vector
$X_i \in R^n$, $i \in \{1, \dots, m\}$, corresponds to a
lower-dimensional vector $Y_i \in R^d$, $d < n$. The distance
between vectors $X_i$ and $X_j$ is called the dissimilarity
and is denoted by $\delta_{ij}$. The distance between
vectors $Y_i$ and $Y_j$ is $d(Y_i, Y_j)$, $i, j = 1, \dots, m$.
Distances between the vectors are usually Minkowski
distances between the points $Y_i$ and $Y_j$:
$$d_p(Y_i, Y_j) = \left( \sum_{k=1}^{d} |Y_{ik} - Y_{jk}|^p \right)^{1/p}.$$
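As a small illustration of the Minkowski distance, the sketch below computes it for two points given as plain Python sequences. The function name `minkowski_distance` and the default `p=2.0` are choices made for this example, not part of the presentation.

```python
def minkowski_distance(y_i, y_j, p=2.0):
    """Minkowski distance d_p between two points of equal dimension."""
    if len(y_i) != len(y_j):
        raise ValueError("points must have the same dimension")
    if p < 1:
        raise ValueError("p must be at least 1")
    # Sum |y_ik - y_jk|^p over the coordinates k, then take the p-th root.
    return sum(abs(a - b) ** p for a, b in zip(y_i, y_j)) ** (1.0 / p)


# p = 1 gives the city-block distance, p = 2 the Euclidean distance.
print(minkowski_distance((0.0, 0.0), (3.0, 4.0), p=1))  # 7.0
print(minkowski_distance((0.0, 0.0), (3.0, 4.0), p=2))  # 5.0
```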
6. Least squares stress function and
relative error
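• With the least squares stress function, the quality
of the images $Y_1, \dots, Y_m$ is measured by
$$S(Y) = \sum_{i<j}^{m} w_{ij} \left( d_p(Y_i, Y_j) - \delta_{ij} \right)^2,$$
where $w_{ij}$ are weights and $d_p$ is the Minkowski
distance between the image points.
• The relative error is also often used:
$$f(Y) = \sqrt{ S(Y) \Big/ \sum_{i<j}^{m} w_{ij} \delta_{ij}^2 }.$$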
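The two quantities can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the relative error is taken as the square root of the stress divided by the weighted sum of squared dissimilarities, all weights default to 1, and the names `minkowski`, `stress` and `relative_error` are made up for the example.

```python
import math


def minkowski(y_i, y_j, p):
    """Minkowski distance between two image points."""
    return sum(abs(a - b) ** p for a, b in zip(y_i, y_j)) ** (1.0 / p)


def stress(Y, delta, w=None, p=2.0):
    """Least squares stress: sum over i < j of w_ij * (d_p(Y_i, Y_j) - delta_ij)^2."""
    m = len(Y)
    total = 0.0
    for i in range(m):
        for j in range(i + 1, m):
            w_ij = 1.0 if w is None else w[i][j]  # unit weights are an assumption
            total += w_ij * (minkowski(Y[i], Y[j], p) - delta[i][j]) ** 2
    return total


def relative_error(Y, delta, w=None, p=2.0):
    """Assumed form: sqrt(stress / sum over i < j of w_ij * delta_ij^2)."""
    m = len(Y)
    norm = sum(
        (1.0 if w is None else w[i][j]) * delta[i][j] ** 2
        for i in range(m)
        for j in range(i + 1, m)
    )
    return math.sqrt(stress(Y, delta, w, p) / norm)


# Three image points in the plane; only the dissimilarity between the
# second and third object (6 instead of the image distance 5) is not matched.
Y = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
delta = [[0.0, 3.0, 4.0], [3.0, 0.0, 6.0], [4.0, 6.0, 0.0]]
print(stress(Y, delta))  # 1.0
print(relative_error(Y, delta))
```

Because the relative error is normalized by the dissimilarities, it can be compared across datasets of different scale, which the raw stress cannot.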
7. Genetic algorithm
Genetic algorithms belong to the class of heuristic algorithms. There are
many variants, but all genetic algorithms share the following traits:
• Every genetic algorithm has a population which consists of individuals.
• Some of the individuals are selected from the old population (the parent
population).
• Crossover is carried out on the selected individuals. Some genes are taken
from one parent and some from the other, so that genes are interchanged
between the parents and one or more new individuals are created. The new
individuals are called offspring.
• Mutation is carried out with a certain probability. During mutation some
genes of the offspring are changed at random.
• Individuals with the worst fitness values are eliminated from the
population.
9. Hybrid genetic algorithm
Randomly generate an initial population of ind individuals
Calculate the relative error of each individual
for gen generations do
    for pop iterations (the size of the population) do
        randomly select two parents from the population
        cross over the two parents to obtain an offspring
        mutate the offspring with some mutation probability
        perform local search on the offspring
        add the offspring to the offspring population
    end do
    merge the parent population with the offspring population
    delete half of the joint population by eliminating the individuals
    with the worst fitness values
end do
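The loop above can be sketched in Python as follows. This is a minimal sketch and not the authors' implementation: the presentation does not specify the operators, so uniform point-wise crossover, Gaussian mutation, and a simple accept-if-better local search are stand-ins chosen for illustration. The fitness is assumed to be the relative error with unit weights and Euclidean image distances, and all function names and parameter defaults are hypothetical.

```python
import math
import random


def relative_error(Y, delta):
    """Relative error with unit weights and Euclidean distances (assumed form)."""
    s = norm = 0.0
    for i in range(len(Y)):
        for j in range(i + 1, len(Y)):
            s += (math.dist(Y[i], Y[j]) - delta[i][j]) ** 2
            norm += delta[i][j] ** 2
    return math.sqrt(s / norm)


def crossover(a, b, rng):
    """Uniform crossover: each image point is copied from one of the parents."""
    return [list(pa) if rng.random() < 0.5 else list(pb) for pa, pb in zip(a, b)]


def mutate(Y, rng, prob, scale):
    """With probability prob, add Gaussian noise to every coordinate."""
    if rng.random() < prob:
        for point in Y:
            for k in range(len(point)):
                point[k] += rng.gauss(0.0, scale)


def local_search(Y, delta, rng, steps, scale):
    """Stand-in local search: keep single-coordinate moves that lower the error."""
    best = relative_error(Y, delta)
    for _ in range(steps):
        i, k = rng.randrange(len(Y)), rng.randrange(len(Y[0]))
        old = Y[i][k]
        Y[i][k] += rng.gauss(0.0, scale)
        err = relative_error(Y, delta)
        if err < best:
            best = err
        else:
            Y[i][k] = old  # reject the move
    return best


def hybrid_ga(delta, d=2, pop=10, gen=20, mut_prob=0.2, scale=0.5,
              ls_steps=30, seed=0):
    """Return (relative error, image points) of the best individual found."""
    rng = random.Random(seed)
    m = len(delta)
    population = []
    for _ in range(pop):
        Y = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(m)]
        population.append((relative_error(Y, delta), Y))
    for _ in range(gen):
        offspring = []
        for _ in range(pop):
            parent_a, parent_b = rng.sample(population, 2)
            child = crossover(parent_a[1], parent_b[1], rng)
            mutate(child, rng, mut_prob, scale)
            err = local_search(child, delta, rng, ls_steps, scale)
            offspring.append((err, child))
        # Merge parents and offspring, then keep the better half.
        joint = sorted(population + offspring, key=lambda item: item[0])
        population = joint[:pop]
    return min(population, key=lambda item: item[0])


# Hypothetical dissimilarities between four objects.
delta = [[0, 1, 2, 2], [1, 0, 1, 2], [2, 1, 0, 1], [2, 2, 1, 0]]
best_err, best_Y = hybrid_ga(delta)
print(round(best_err, 4))
```

Because the parents are merged with the offspring before the worst half is discarded, the best relative error in the population never increases from one generation to the next.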
10. Beverages data views
a) Relative error = 0.1574 (the best relative error)
b) Relative error = 0.1922 (separability of diet drinks)
11. Relative errors for beverages dataset
Run f*     Separable by a line
0.1574     NO
0.1614     NO
0.1664     NO
0.1675     NO
0.1683     NO
0.1743     YES
0.1751     NO
0.1752     NO
0.1753     NO
0.1761     NO
0.1769     NO
0.1774     NO
0.1791     NO
0.1795     NO
0.1798     NO
0.1813     NO
0.1819     NO
0.1821     NO
0.1836     NO
0.1843     NO
0.1851     NO
0.1874     YES
0.1886     YES
0.1893     NO
0.1907     YES
0.1921     YES
0.1922     YES
0.1956     YES
0.1969     NO
0.2071     NO
12. Pharmacological data views
a) Relative error = 0.0517 (the best relative error)
b) Relative error = 0.1709 (separable images of activating and blocking ligands)
13. Relative errors for pharmacological
dataset
Run f*     Separable by a line
0.0517     NO
0.0519     NO
0.0625     NO
0.0816     NO
0.0915     NO
0.0921     NO
0.0945     NO
0.1014     NO
0.1101     NO
0.1116     NO
0.1121     NO
0.1169     NO
0.1221     NO
0.1248     NO
0.1251     NO
0.1253     NO
0.1316     NO
0.1325     NO
0.1329     NO
0.1402     NO
0.1403     NO
0.1431     NO
0.1463     NO
0.1526     NO
0.1531     NO
0.1601     YES
0.1669     YES
0.1709     YES
0.1745     YES
0.1961     YES
14. Conclusions
Experiments were carried out with the beverages
and pharmacological datasets. The image with the
smallest relative error does not always show the
expected separation, whereas images with a higher
relative error sometimes do. It can therefore be
useful to visualize several of the best local minima
instead of the global minimum only.