Data Mining Using Synergies Between Self-Organizing Maps and
Inductive Learning of Fuzzy Rules
MARIO DROBICS, ULRICH BODENHOFER, WERNER WINIWARTER

Software Competence Center Hagenberg
A-4232 Hagenberg, Austria
{mario.drobics,ulrich.bodenhofer,werner.winiwarter}@scch.at
ERICH PETER KLEMENT

Fuzzy Logic Laboratorium Linz-Hagenberg
Johannes Kepler Universität Linz
A-4040 Linz, Austria
klement@flll.uni-linz.ac.at
Abstract
Identifying structures in large data sets raises a number of problems. On the one hand, many methods cannot be applied to larger data sets, while, on the other hand, the results are often hard to interpret. We address these problems by a novel three-stage approach. First, we compute a small representation of the input data using a self-organizing map. This reduces the amount of data and allows us to create two-dimensional plots of the data. Then we use this preprocessed information to identify clusters of similarity. Finally, inductive learning methods are applied to generate sets of fuzzy descriptions of these clusters. This approach is applied to three case studies, including image data and real-world data sets. The results illustrate the generality and intuitiveness of the proposed method.

1. Introduction
In this paper, we address the problem of identifying and describing regions of interest in large data sets. Decision trees [17] or other inductive learning methods which are often used for this task are not always sufficient, or not even applicable. Clustering methods [2], on the other hand, do not offer sufficient insight into the data structure, as their results are often hard to display and interpret.
This contribution is devoted to a novel three-stage fault-tolerant approach to data mining which is able to produce qualitative information from very large data sets.
In the first stage, self-organizing maps (SOMs) [13] are used to compress the data to a reasonable number of nodes which still contain all significant information

0-7803-7078-3/01/$10.00 © 2001 IEEE.

while eliminating possible data faults, such as noise, outliers, and missing values. While, for the purpose of data reduction, other methods may be used as well [7], SOMs have the advantage that they preserve the topology of the data space and allow the results to be displayed in two-dimensional plots [6].
The second step is concerned with identifying significant regions of interest within the SOM nodes using a modified fuzzy c-means algorithm [3, 10]. In our variant, the two main drawbacks of the standard fuzzy c-means algorithm are resolved in a very simple and elegant way. Firstly, the clustering is not influenced by outliers anymore, as they have been eliminated by the SOM in the pre-processing step. Secondly, initialization is handled by computing a crisp Ward clustering [2] first to find initial values for the cluster centers.
In the third and last stage, we create linguistic descriptions of the centers of these clusters which help the analyst to interpret the results. As the number of data samples under consideration has been reduced tremendously by using the SOM nodes, we are able to apply inductive learning methods [16] to find fuzzy descriptions of the clusters. Using the previously found clusters, we can make use of supervised learning methods in an unsupervised environment by considering the cluster membership as the goal parameter. Descriptions are composed using fuzzy predicates of the form "x is/is not/is at least/is at most A", where x is the parameter under consideration, and A is a linguistic expression modeled by a fuzzy set.
We applied this method to several real-world data sets.
Through the combination of the two-dimensional projection of the data using the SOM and the fuzzy descriptions generated, the results are not only significant and
accurate, but also very intuitive.

2. Preprocessing
To reduce computational effort without losing significant information, we use self-organizing maps to create a mapping of the data space. Self-organizing maps use unsupervised competitive learning to create a topology-preserving map of the data.
Let us assume that we are given a data set I consisting of K samples I = {x^1, ..., x^K} from an n-dimensional real space X ⊆ R^n. A SOM is defined as a mapping from the data space X to an m-dimensional discrete output space C (most often m = 2). The output space consists of a finite number N_C of neurons. To each neuron s ∈ C of the output space, we associate a parametric weight vector in the input space w^s = (w_1^s, ..., w_n^s)^T ∈ X.

The mapping

    φ : X → C,  x ↦ φ(x)

defines the neural map from the input space X to the topological structure C; φ(x) is defined as

    φ(x) = argmin_i { ||x − w^i||_V },

with

    ||x||_V^2 = ( Σ_{r=1}^{n} (m_r / V_r) x_r^2 ) / ( Σ_{r=1}^{n} m_r ),

where m_r denotes an individual weight parameter which allows us to take different importances of the input parameters into account. The factor V_r stands for the variance of the values (x_r^1, ..., x_r^K). In other words, the function φ maps every input sample to that neuron of the map whose weight vector is closest to the input with respect to the distance measure ||.||_V.
The goal of the learning process is to reduce the overall distortion error.
An additional benefit of this preprocessing step is noise
reduction. As the weight vector of each node represents
the average of several data records, outliers and small
variations in the input are eliminated.
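As a small illustration of this preprocessing step, the weighted distance measure and the winner selection of the neural map can be sketched as follows (a minimal sketch in Python; the function names and toy data are ours, and full SOM training with neighborhood updates is omitted):

```python
import numpy as np

def weighted_norm_sq(x, m, V):
    # ||x||_V^2 = (sum_r (m_r / V_r) x_r^2) / (sum_r m_r)
    return float(np.sum(m / V * x ** 2) / np.sum(m))

def phi(x, weights, m, V):
    # phi(x): index of the neuron whose weight vector is closest to x
    # with respect to the distance measure ||.||_V
    return int(np.argmin([weighted_norm_sq(x - w, m, V) for w in weights]))

# toy example: two neurons in a 2-dimensional input space
weights = np.array([[0.0, 0.0], [3.0, 4.0]])
m = np.ones(2)   # equal importance of both input parameters
V = np.ones(2)   # unit variances
print(phi(np.array([2.9, 3.8]), weights, m, V))  # -> 1 (closest to [3, 4])
```

With m_r proportional to a parameter's importance and V_r its variance, the same two lines generalize to any number of input dimensions.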

3. Clustering
Now we try to identify possible regions of interest in the map. Regions of interest are characterized by nodes whose weight vectors are close to each other in the data space. Additionally, we use the information provided by the map. Doing so, clusters are generated that are not only homogeneous with respect to the input space, but also within the map. To find regions of interest, we use a modified fuzzy c-means [3] clustering. Using this algorithm, a fuzzy segmentation of the data space is created.
For initializing the fuzzy clustering, we first perform a crisp Ward clustering [2]. Ward clustering is an agglomerative clustering method, which is, however, often not applicable due to its high computational effort. Through the data reduction done in the preprocessing step, we are not only able to apply this clustering method, but to take the topological information of the map into account, too. This is accomplished by merging only nodes/clusters which are neighbors in the map.
The number of regions of interest L may be fixed a priori or determined by the Ward clustering initialization procedure.
To compute the final fuzzy clustering, we use a modified
distance measure, which also takes the weights and the
topological structure of the map into account:

    d(x, y) = d_R(x, y)^{h(φ(x), φ(y))^β},

where h(i, j) denotes the neighborhood relation in C, i.e. a scaling factor from the unit interval which rates the distance in the discrete topological structure of the map. For details about the learning algorithm, see [13].
The mapping provided by a SOM does not only create a set of weight vectors, it furthermore preserves the topology of the data space. As the map has a fixed dimension of m = 2, it is possible to plot each parameter (or cluster) with respect to the map in a very straightforward manner [6].
For our purposes, we will use nets with several hundred nodes. Through this large number of nodes, the overall distortion error can be kept small; however, training may take longer.


The raw distance d_R is defined as

    d_R(x, y) = ( ( Σ_{r=1}^{n} m_r ( |x_r − y_r| / (max_k x_r^k − min_k x_r^k) )^2 ) / ( Σ_{r=1}^{n} m_r ) )^{1/2}.

Note that, due to the division by max_k x_r^k − min_k x_r^k, d_R(x, y) is always a value between 0 and 1 if all values x_r, y_r are in the interval [min_k x_r^k, max_k x_r^k]. The exponent h(φ(x), φ(y))^β corresponds to the influence of the neighborhood relation h(φ(x), φ(y)); the smaller h(φ(x), φ(y)) is (i.e. the larger the distance of node φ(x) and node φ(y) in the map is), the stronger the raw distance d_R(x, y) is stretched. The factor β controls how strong the influence of the topological structure is (we use β = 2).
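The modified distance can be sketched as follows (an illustrative sketch: the paper does not fix the form of the neighborhood relation h, so the Gaussian of the grid distance used here is our assumption, as are the function names):

```python
import numpy as np

def raw_distance(x, y, m, ranges):
    # d_R(x, y): weighted, range-normalized distance, always in [0, 1]
    # when x and y lie inside the per-parameter ranges
    terms = m * (np.abs(x - y) / ranges) ** 2
    return float(np.sqrt(terms.sum() / m.sum()))

def neighborhood(pos_i, pos_j, sigma=1.0):
    # h(i, j): scaling factor in (0, 1]; equals 1 for identical map positions
    # (Gaussian of the grid distance -- our assumption, not from the paper)
    grid_dist_sq = float(np.sum((np.asarray(pos_i) - np.asarray(pos_j)) ** 2))
    return np.exp(-grid_dist_sq / (2.0 * sigma ** 2))

def modified_distance(x, y, pos_x, pos_y, m, ranges, beta=2.0):
    # d(x, y) = d_R(x, y) ** (h(phi(x), phi(y)) ** beta)
    dR = raw_distance(x, y, m, ranges)
    return dR ** (neighborhood(pos_x, pos_y) ** beta)
```

Since d_R lies in [0, 1], a small exponent h^β (nodes far apart in the map) pushes the value towards 1, which is exactly the "stretching" described above.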
The fuzzy clustering is computed using the usual iterative two-step update procedure. First, the degree of membership to which node i belongs to cluster k, denoted C(i, k), is computed as

    C(i, k) = a(i, k) / Σ_{l=1}^{L} a(i, l),   with   a(i, k) = e^{−α · d(w^i, v_k)}.

The factor α ≥ 1 determines the fuzziness of the clustering (1 corresponds to a very fuzzy clustering, while, e.g., a value of 100 gives very crisp clusters; we use α = 40). Then the cluster centers v_k are updated by computing the weighted means of all nodes [10]:

    v_k = ( Σ_{i=1}^{N_C} C(i, k) · w^i ) / ( Σ_{i=1}^{N_C} C(i, k) ).
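The two-step update can be sketched as follows (a simplified sketch: plain Euclidean distance between nodes and centers stands in for the modified distance d, and the function names are ours):

```python
import numpy as np

def update_memberships(nodes, centers, alpha=40.0):
    # C(i, k) = a(i, k) / sum_l a(i, l),  a(i, k) = exp(-alpha * d(w_i, v_k))
    d = np.linalg.norm(nodes[:, None, :] - centers[None, :, :], axis=2)
    a = np.exp(-alpha * d)
    return a / a.sum(axis=1, keepdims=True)   # each row sums to 1

def update_centers(nodes, C):
    # v_k = membership-weighted mean of all nodes
    return (C.T @ nodes) / C.sum(axis=0)[:, None]

def fuzzy_cluster(nodes, centers, alpha=40.0, eps=1e-6, max_iter=100):
    # iterate until the change of the cluster centers falls below eps
    for _ in range(max_iter):
        C = update_memberships(nodes, centers, alpha)
        new_centers = update_centers(nodes, C)
        if np.abs(new_centers - centers).max() < eps:
            break
        centers = new_centers
    return centers, C
```

In the actual method, `centers` would be initialized with the result of the crisp Ward clustering described above.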

Logical operations on the truth values are defined using triangular norms and conorms, i.e. commutative, associative, and non-decreasing binary operations on the unit interval with neutral elements 1 and 0, respectively [12]. In our applications, we restricted ourselves to the Łukasiewicz operations:

    T_L(x, y) = max(x + y − 1, 0)
    S_L(x, y) = min(x + y, 1)

Then conjunctions and disjunctions of fuzzy predicates can be defined as follows:

    t(A(x) ∧ B(x)) = T_L(t(A(x)), t(B(x)))

This procedure is repeated until the change of the cluster centers is lower than a predefined bound ε.

4. Cluster Descriptions
The final step is concerned with finding descriptions of
the regions of interest determined in the clustering step.
Depending on the underlying context of the variable under consideration, we assign natural language expressions like very very low, medium, large, etc. to each variable. These expressions are modeled by fuzzy sets which can either be defined by hand or automatically. When defining the fuzzy sets, in particular when they are generated automatically, it is important to preserve the semantic context of the natural language expressions (e.g. that low corresponds to smaller values than high) in order to achieve interpretable descriptions [5].
To find appropriate fuzzy segmentations of the input domains according to the distribution of the sample data automatically, we use a simple clustering-based algorithm. The cluster centers are first initialized by dividing the input range into a certain number of equally sized intervals. Then these centers are iteratively updated using a modified c-means algorithm [14] with simple neighborhood interaction to preserve the ordering of the clusters. Usually, a few iterations (≤ 10) are sufficient for each domain. The fuzzy sets are then created as trapezoidal membership functions around the centers computed previously.

A fuzzy set A induces a fuzzy predicate

    t(x is A) = μ_A(x),

where t is the function which assigns a truth value to the assertion "x is A". In a straightforward way, a fuzzy set A also induces the following fuzzy predicates [4]:

    t(x is not A) = 1 − μ_A(x)
    t(x is at least A) = sup{ μ_A(y) | y ≤ x }
    t(x is at most A) = sup{ μ_A(y) | y ≥ x }
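On a discretized domain, these induced predicates reduce to running maxima, which can be sketched as follows (the grid, the triangular fuzzy set, and the function names are illustrative):

```python
import numpy as np

def at_least(mu):
    # t(x is at least A) = sup{ mu_A(y) | y <= x }  -> cumulative maximum
    return np.maximum.accumulate(mu)

def at_most(mu):
    # t(x is at most A) = sup{ mu_A(y) | y >= x }  -> reversed cumulative maximum
    return np.maximum.accumulate(mu[::-1])[::-1]

def is_not(mu):
    # t(x is not A) = 1 - mu_A(x)
    return 1.0 - mu

# triangular fuzzy set "medium" sampled on a grid of 5 domain points
mu_medium = np.array([0.0, 0.5, 1.0, 0.5, 0.0])
print(at_least(mu_medium))  # [0.  0.5 1.  1.  1. ]
print(at_most(mu_medium))   # [1.  1.  1.  0.5 0. ]
```

Note that "at least A" and "at most A" are monotone envelopes of A, which is what makes predicates like "size is at least high" well defined.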


    t(A(x) ∨ B(x)) = S_L(t(A(x)), t(B(x)))
Since we want to describe the class membership of the samples, we are interested in fuzzy rules of the form IF A(x) THEN C(x) (for convenience, we refer to such a rule as A → C, although the rule should rather be regarded as a conditional assignment than as a logical implication), where C is a fuzzy predicate that describes the class membership and A may be a compound fuzzy predicate composed of several atomic predicates, e.g.

    IF (age(x) is high) ∧ (size(x) is at least high)
    THEN (class(x) is 3).
The degree of fulfillment of such a statement for a given sample x is defined as t(A(x) ∧ C(x)) (corresponding to the so-called Mamdani inference [15]). The fuzzy set of samples fulfilling a predicate A, which we denote with A(I), is defined as (for x ∈ I)

    μ_{A(I)}(x) = t(A(x)).

Let us define the cardinality of a fuzzy set B of input samples as the sum of μ_B(x), i.e.

    |B| = Σ_{x ∈ I} μ_B(x).

Finally, the cardinality of samples fulfilling a rule A → C can be defined by

    |A(I) ∩ C(I)| = Σ_{x ∈ I} t(A(x) ∧ C(x)) = Σ_{x ∈ I} T_L(t(A(x)), t(C(x))).
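These cardinalities can be computed directly from the truth-value vectors; a minimal sketch (function names are ours):

```python
import numpy as np

def t_norm_lukasiewicz(a, b):
    # T_L(a, b) = max(a + b - 1, 0), applied elementwise
    return np.maximum(a + b - 1.0, 0.0)

def cardinality(mu):
    # |B| = sum of membership degrees over all samples
    return float(np.sum(mu))

def rule_support(t_A, t_C):
    # supp(A -> C) = |A(I) ∩ C(I)| / |I|, with Lukasiewicz intersection
    return cardinality(t_norm_lukasiewicz(t_A, t_C)) / len(t_A)
```

For instance, with t_A = (1.0, 0.8, 0.3) and t_C = (1.0, 0.5, 0.9), the intersection degrees are (1.0, 0.3, 0.2), so the rule cardinality is 1.5 and the support is 0.5.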

To find the most accurate and significant descriptions, we use FS-FOIL [8], a modification of Quinlan's FOIL algorithm [18] for fuzzy rules. It creates a stepwise coverage of each cluster such that not only spheric, but arbitrary clusters can be handled.
We compute descriptions of the individual clusters independently; therefore, the goal predicate C is fixed and a description rule A → C is uniquely characterized by its antecedent predicate A.

The basic algorithm can be summarized as follows:

    open_nodes = cluster;
    S_i = ∅;
    start with the most general predicate {∅}
    do {
        expand the best n previous predicates
        if a rule A → C is accurate & significant {
            add predicate A to rule set S_i
            remove covered nodes from open_nodes
            go on with the most general predicate {∅}
        }
        prune predicates
    } while open_nodes ≠ ∅;
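The covering loop above can be sketched as follows (a strong simplification of FS-FOIL: only single atomic predicates are considered, without the beam of n expanded predicates or conjunctive refinement; the names and thresholds are illustrative):

```python
import numpy as np

def gain(t_A, t_C):
    # G = |A∩C| (log2(|A∩C|/|A|) - log2(|C|/|I|)), Lukasiewicz intersection
    inter = np.maximum(t_A + t_C - 1.0, 0.0).sum()
    if inter == 0 or t_A.sum() == 0:
        return -np.inf
    return inter * (np.log2(inter / t_A.sum()) - np.log2(t_C.sum() / len(t_C)))

def greedy_cover(predicates, t_C, supp_min=0.05, conf_min=0.8):
    # predicates: dict mapping a predicate name to its truth values over I
    open_mask = np.ones(len(t_C))      # 1 while a sample is still uncovered
    rule_set = []
    while True:
        best = max(predicates, key=lambda p: gain(predicates[p] * open_mask, t_C))
        t_A = predicates[best] * open_mask
        inter = np.maximum(t_A + t_C - 1.0, 0.0).sum()
        supp = inter / len(t_C)
        conf = inter / t_A.sum() if t_A.sum() > 0 else 0.0
        if supp < supp_min or conf < conf_min:
            break                      # no accurate & significant rule left
        rule_set.append(best)
        open_mask = np.minimum(open_mask, 1.0 - predicates[best])  # remove covered
        if open_mask.sum() == 0:
            break
    return rule_set
```

The gain and the support/confidence bounds used here correspond to the measures defined in the surrounding text.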
In detail, the expansion of a predicate A is performed by appending a new atomic fuzzy predicate B by means of conjunction. To decide which predicates to expand, an entropy-based information gain measure is used to define a ranking of the set of predicates:

    G = |A(I) ∩ C(I)| · ( log_2( |A(I) ∩ C(I)| / |A(I)| ) − log_2( |C(I)| / |I| ) )

Rules of interest have to be significant and accurate. The significance of a rule A → C is defined by its support

    supp(A → C) = |A(I) ∩ C(I)| / |I|,

5.1. Iris data
We used the well-known Iris data set according to Fisher [9] to illustrate the ability of our approach to identify hidden structures in a data set. The data consist of three classes of flowers, where each flower is characterized by four attributes. Each class is represented by 50 samples.
We generated a SOM with 10 x 10 nodes and a set of 4 clusters to separate the test data. Finally, a compact set of nine intuitive fuzzy rules was created by the above algorithm.
Applying this set of rules to the original data resulted in 10.6% misclassified and 7.3% unclassified samples. Note that, throughout the clustering and labeling process, the goal information has not been taken into account. This means that the proposed method is able to label the goal classes with a correctness of over 80% without even knowing the correct classification.
In the original Iris data set, one goal class is perfectly separated from the others. The remaining two classes, however, have a certain region of overlap in the data space. As a detailed analysis has shown, three of the four generated clusters can be directly associated with one of the goal classes. The fourth cluster exactly contained the region of overlap, which demonstrates that our method is able to preserve the topology of the input space and to detect hidden structures in the data.

the accuracy as its confidence

    conf(A → C) = supp(A → C) / supp(A),

where supp(A) is defined as

    supp(A) = |A(I)| / |I|.

We consider only rules which satisfy reasonable requirements in terms of support and confidence. This is achieved by two lower bounds supp_min and conf_min, respectively [1]. As, for all predicates A, B, and C,

    supp(A → C) ≥ supp(A ∧ B → C),

we can prune the expanded set of predicates by removing all predicates X for which supp(X) < supp_min.
The final degree of membership of a sample x ∈ X to cluster i with respect to the rule set S_i is given by the Łukasiewicz disjunction of all rule antecedents:

    (S_L)_{A ∈ S_i} t(A(x)).
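This final combination is the bounded sum applied over the rule set; a minimal sketch (function names are ours):

```python
import numpy as np

def s_norm_lukasiewicz(a, b):
    # S_L(a, b) = min(a + b, 1)
    return np.minimum(a + b, 1.0)

def cluster_membership(rule_truths):
    # rule_truths: truth values t(A(x)) of each rule antecedent A in S_i
    # for one sample x; fold them with the Lukasiewicz disjunction
    total = 0.0
    for t in rule_truths:
        total = s_norm_lukasiewicz(total, t)
    return float(total)
```

A sample covered by several rules thus accumulates membership up to the cap of 1, e.g. truth values 0.4 and 0.5 combine to 0.9.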

5. Results
Now we will present three results with very different data sets and purposes, demonstrating the flexibility and general applicability of the proposed approach.


5.2. Image segmentation
A typical application of clustering in computer vision is image segmentation; therefore, it seemed interesting to apply our three-stage approach to this problem. Moreover, the possibility to describe segments with natural language expressions gives rise to completely new opportunities in image understanding.
First experiments using pixel coordinates and RGB values showed good results in terms of accuracy; however, the descriptions were rarely intuitive. In order to overcome this problem, we added HSL features (hue, saturation, and lightness), which match the way humans perceive color much more closely, in the final description step. Since humans have a certain understanding of colors and intensities, it is more appropriate to assume predefined fuzzy predicates corresponding to natural language terms describing these properties. Moreover, in this way it can be avoided that the meaning of the descriptions depends on the image under consideration.
The example in Fig. 1 shows an image with 170 x 256
pixels, i.e. we have K = 43520 samples with n = 8 features. First, it was mapped onto a SOM with 10 x 10
nodes. Then four clusters and the corresponding descriptions were computed. On the right-hand side, Fig.
1 shows the computed segmentation. Figure 2 shows the

final rule sets and the cross validation table, which illustrates how well the clusters are described by the rule sets.

Attempts to solve this classification problem with ad hoc structured probability models, binary decision trees, or Bayesian classifier methods were only able to achieve an accuracy of approximately 55% [11].
We computed a SOM with 20 x 20 nodes, which were then grouped to 5 clusters (Fig. 3 shows plots of the class memberships over the SOM). Finally, a set of descriptions for each cluster was created.

Figure 1. Original image (left) and its segmentation (right)

Rule set #0: [blue == B] ∪ ([red <= L] && [blue >= VH])
Rule set #1: [lightness <= dark]
Rule set #2: [lightness >= light]
Rule set #3: [hue == orange] ∪ [hue == red] ∪ [hue == yellow] ∪ ([hue == green] && [lightness == normal])

               Rule #0   Rule #1   Rule #2   Rule #3
  Cluster #0:    97%        0%        0%        0%
  Cluster #1:     0%       98%        0%        0%
  Cluster #2:     7%        0%       98%        0%
  Cluster #3:     2%        0%        0%       99%

Total Error: 0.41% (+0.43% missing)

Figure 2. Rule sets for the image segmentation

Figure 3. Clusters in the yeast data set

To validate our results, we compared the original classification with our clusters and rules. Although we used fewer clusters than goal classes (which were not taken into account), the overall number of samples assigned to the wrong goal class was relatively small (cf. Fig. 4).

[Figure 4: cross validation table of the ten goal classes CYT (463), NUC (429), MIT (244), ME3 (163), ME2 (51), ME1 (44), EXC (35), VAC (30), POX (20), and ERL (5) against the five clusters #0-#4.]

Total Error: 37% (+3% missing)

Figure 4. Cross validation table (goal vs. clusters), brackets indicate cluster-class assignments
The rule sets in Fig. 2 can be interpreted as follows: The first rule set corresponds to the blue sky. The second rule set describes the black pants. The snow is classified by the third rule set. Finally, the jackets are identified by the fourth rule set.

5.3. Yeast data set
As a third test case, we used the UCI yeast data set due to K. Nakai. The data set consists of K = 1484 samples with n = 8 attributes. Each sample belongs to one of ten classes. The data set has a very complex inner structure and a lot of inconsistencies.
The rule sets created for each cluster are shown in Fig. 5. Applying these rules to the original data, and comparing the result with the cluster memberships, an average error of 10.38% with 53.63% missing occurred. The high number of missing samples is due to the complex structure of the clusters (see Fig. 3).

6. Conclusion

In this paper we have shown how the combination of different methods (SOMs, clustering, and inductive


Rule set #0: [gvh >= VH] ∪ ([mcg >= H] && [gvh <= H] && [alm <= H]) ∪ [mit <= M] ∪ ([vac >= VH] && [nuc <= VL])
Rule set #1: [alm >= VH] ∪ ([mit == VVL] && [vac == M])
Rule set #2: [nuc >= M] ∪ ([erl >= VL] && [vac <= VVH] && [nuc >= VL])
Rule set #3: [mcg == VVL] ∪ [alm == VVL]
Rule set #4: [mit >= M]

Figure 5. Generated rule sets for the yeast data set

[5] U. Bodenhofer and P. Bauer. Towards an axiomatic treatment of "interpretability". In Proc. IIZUKA2000, pages 334-339, Iizuka, October 2000.
[6] G. Deboeck and T. Kohonen (Eds.). Visual Explorations in Finance with Self-Organizing Maps. Springer, London, 1998.

learning) can minimize the shortcomings of the individual methods. Through the generic approach, our methods can be applied to a wide variety of problems, including classification, image segmentation, and data mining.
The modular design allows individual components to be changed on demand, or even one step to be omitted; e.g., when labeled data is available, it is possible to use this information as the goal parameter in the inductive learning step, instead of performing unsupervised clustering first.

Acknowledgements
This work has been done in the framework of the Kplus
Competence Center Program which is funded by the
Austrian Government, the Province of Upper Austria,
and the Chamber of Commerce of Upper Austria.

References
[1] R. Agrawal, T. Imielinski, and A. N. Swami. Mining association rules between sets of items in large databases. In P. Buneman and S. Jajodia, editors, Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 207-216, Washington, DC, 1993.
[2] M. R. Anderberg. Cluster Analysis for Applications. Academic Press, 1973.
[3] J. C. Bezdek. Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York, 1981.
[4] U. Bodenhofer. The construction of ordering-based modifiers. In G. Brewka, R. Der, S. Gottwald, and A. Schierwagen, editors, Fuzzy-Neuro Systems '99, pages 55-62. Leipziger Universitätsverlag, 1999.
[7] C. Diamantini and M. Panti. An efficient and scalable data compression approach to classification. ACM SIGKDD Explorations, 2:49-55, 2000.
[8] M. Drobics, W. Winiwarter, and U. Bodenhofer. Interpretation of self-organizing maps with fuzzy rules. In Proc. 12th IEEE Int. Conf. on Tools with Artificial Intelligence (ICTAI'00), pages 304-311. IEEE Press, 2000.
[9] R. A. Fisher. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7:179-188, 1936.
[10] F. Hoeppner, F. Klawonn, R. Kruse, and T. A. Runkler. Fuzzy Cluster Analysis: Methods for Image Recognition, Classification, and Data Analysis. John Wiley & Sons, Chichester, 1999.
[11] P. Horton and K. Nakai. A probabilistic classification system for predicting the cellular localization sites of proteins. In D. J. States, P. Agarwal, T. Gaasterland, L. Hunter, and R. Smith, editors, Proc. 4th Int. Conf. on Intelligent Systems for Molecular Biology, pages 109-115. AAAI Press, Menlo Park, 1996.
[12] E. P. Klement, R. Mesiar, and E. Pap. Triangular Norms, volume 8 of Trends in Logic. Kluwer Academic Publishers, Dordrecht, 2000.
[13] T. Kohonen. Self-Organizing Maps, volume 30 of Information Sciences. Springer, Berlin, second extended edition, 1997.
[14] Y. Linde, A. Buzo, and R. M. Gray. An algorithm for vector quantizer design. IEEE Trans. Commun., 28(1):84-95, 1980.
[15] E. H. Mamdani and S. Assilian. An experiment in linguistic synthesis of fuzzy controllers. Int. J. Man-Mach. Stud., 7:1-13, 1975.
[16] R. S. Michalski, I. Bratko, and M. Kubat. Machine Learning and Data Mining. John Wiley & Sons, Chichester, 1998.
[17] J. R. Quinlan. Induction of decision trees. Machine Learning, 1(1):81-106, 1986.
[18] J. R. Quinlan. Learning logical definitions from relations. Machine Learning, 5(3):239-266, 1990.


Road Segmentation from satellites imagesYoussefKitane
 
Ill-posedness formulation of the emission source localization in the radio- d...
Ill-posedness formulation of the emission source localization in the radio- d...Ill-posedness formulation of the emission source localization in the radio- d...
Ill-posedness formulation of the emission source localization in the radio- d...Ahmed Ammar Rebai PhD
 
SLIDING WINDOW SUM ALGORITHMS FOR DEEP NEURAL NETWORKS
SLIDING WINDOW SUM ALGORITHMS FOR DEEP NEURAL NETWORKSSLIDING WINDOW SUM ALGORITHMS FOR DEEP NEURAL NETWORKS
SLIDING WINDOW SUM ALGORITHMS FOR DEEP NEURAL NETWORKSIJCI JOURNAL
 
Undetermined Mixing Matrix Estimation Base on Classification and Counting
Undetermined Mixing Matrix Estimation Base on Classification and CountingUndetermined Mixing Matrix Estimation Base on Classification and Counting
Undetermined Mixing Matrix Estimation Base on Classification and CountingIJRESJOURNAL
 
COMPARISON OF VOLUME AND DISTANCE CONSTRAINT ON HYPERSPECTRAL UNMIXING
COMPARISON OF VOLUME AND DISTANCE CONSTRAINT ON HYPERSPECTRAL UNMIXINGCOMPARISON OF VOLUME AND DISTANCE CONSTRAINT ON HYPERSPECTRAL UNMIXING
COMPARISON OF VOLUME AND DISTANCE CONSTRAINT ON HYPERSPECTRAL UNMIXINGcsandit
 
Time of arrival based localization in wireless sensor networks a non linear ...
Time of arrival based localization in wireless sensor networks  a non linear ...Time of arrival based localization in wireless sensor networks  a non linear ...
Time of arrival based localization in wireless sensor networks a non linear ...sipij
 
Using Subspace Pursuit Algorithm to Improve Performance of the Distributed Co...
Using Subspace Pursuit Algorithm to Improve Performance of the Distributed Co...Using Subspace Pursuit Algorithm to Improve Performance of the Distributed Co...
Using Subspace Pursuit Algorithm to Improve Performance of the Distributed Co...Polytechnique Montreal
 
An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...
An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...
An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...Adam Fausett
 
GraphSignalProcessingFinalPaper
GraphSignalProcessingFinalPaperGraphSignalProcessingFinalPaper
GraphSignalProcessingFinalPaperChiraz Nafouki
 
A comparison of efficient algorithms for scheduling parallel data redistribution
A comparison of efficient algorithms for scheduling parallel data redistributionA comparison of efficient algorithms for scheduling parallel data redistribution
A comparison of efficient algorithms for scheduling parallel data redistributionIJCNCJournal
 
AUTOMATIC THRESHOLDING TECHNIQUES FOR SAR IMAGES
AUTOMATIC THRESHOLDING TECHNIQUES FOR SAR IMAGESAUTOMATIC THRESHOLDING TECHNIQUES FOR SAR IMAGES
AUTOMATIC THRESHOLDING TECHNIQUES FOR SAR IMAGEScscpconf
 

Semelhante a Drobics, m. 2001: datamining using synergiesbetween self-organising maps and inductive learning on fuzzy rule (20)

MODIFIED LLL ALGORITHM WITH SHIFTED START COLUMN FOR COMPLEXITY REDUCTION
MODIFIED LLL ALGORITHM WITH SHIFTED START COLUMN FOR COMPLEXITY REDUCTIONMODIFIED LLL ALGORITHM WITH SHIFTED START COLUMN FOR COMPLEXITY REDUCTION
MODIFIED LLL ALGORITHM WITH SHIFTED START COLUMN FOR COMPLEXITY REDUCTION
 
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...
 
CLUSTERING HYPERSPECTRAL DATA
CLUSTERING HYPERSPECTRAL DATACLUSTERING HYPERSPECTRAL DATA
CLUSTERING HYPERSPECTRAL DATA
 
Fuzzy c means_realestate_application
Fuzzy c means_realestate_applicationFuzzy c means_realestate_application
Fuzzy c means_realestate_application
 
Path loss prediction
Path loss predictionPath loss prediction
Path loss prediction
 
Exact network reconstruction from consensus signals and one eigen value
Exact network reconstruction from consensus signals and one eigen valueExact network reconstruction from consensus signals and one eigen value
Exact network reconstruction from consensus signals and one eigen value
 
Road Segmentation from satellites images
Road Segmentation from satellites imagesRoad Segmentation from satellites images
Road Segmentation from satellites images
 
Ill-posedness formulation of the emission source localization in the radio- d...
Ill-posedness formulation of the emission source localization in the radio- d...Ill-posedness formulation of the emission source localization in the radio- d...
Ill-posedness formulation of the emission source localization in the radio- d...
 
50120130406039
5012013040603950120130406039
50120130406039
 
SLIDING WINDOW SUM ALGORITHMS FOR DEEP NEURAL NETWORKS
SLIDING WINDOW SUM ALGORITHMS FOR DEEP NEURAL NETWORKSSLIDING WINDOW SUM ALGORITHMS FOR DEEP NEURAL NETWORKS
SLIDING WINDOW SUM ALGORITHMS FOR DEEP NEURAL NETWORKS
 
Undetermined Mixing Matrix Estimation Base on Classification and Counting
Undetermined Mixing Matrix Estimation Base on Classification and CountingUndetermined Mixing Matrix Estimation Base on Classification and Counting
Undetermined Mixing Matrix Estimation Base on Classification and Counting
 
COMPARISON OF VOLUME AND DISTANCE CONSTRAINT ON HYPERSPECTRAL UNMIXING
COMPARISON OF VOLUME AND DISTANCE CONSTRAINT ON HYPERSPECTRAL UNMIXINGCOMPARISON OF VOLUME AND DISTANCE CONSTRAINT ON HYPERSPECTRAL UNMIXING
COMPARISON OF VOLUME AND DISTANCE CONSTRAINT ON HYPERSPECTRAL UNMIXING
 
How to Decide the Best Fuzzy Model in ANFIS
How to Decide the Best Fuzzy Model in ANFIS How to Decide the Best Fuzzy Model in ANFIS
How to Decide the Best Fuzzy Model in ANFIS
 
Time of arrival based localization in wireless sensor networks a non linear ...
Time of arrival based localization in wireless sensor networks  a non linear ...Time of arrival based localization in wireless sensor networks  a non linear ...
Time of arrival based localization in wireless sensor networks a non linear ...
 
Using Subspace Pursuit Algorithm to Improve Performance of the Distributed Co...
Using Subspace Pursuit Algorithm to Improve Performance of the Distributed Co...Using Subspace Pursuit Algorithm to Improve Performance of the Distributed Co...
Using Subspace Pursuit Algorithm to Improve Performance of the Distributed Co...
 
Dycops2019
Dycops2019 Dycops2019
Dycops2019
 
An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...
An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...
An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...
 
GraphSignalProcessingFinalPaper
GraphSignalProcessingFinalPaperGraphSignalProcessingFinalPaper
GraphSignalProcessingFinalPaper
 
A comparison of efficient algorithms for scheduling parallel data redistribution
A comparison of efficient algorithms for scheduling parallel data redistributionA comparison of efficient algorithms for scheduling parallel data redistribution
A comparison of efficient algorithms for scheduling parallel data redistribution
 
AUTOMATIC THRESHOLDING TECHNIQUES FOR SAR IMAGES
AUTOMATIC THRESHOLDING TECHNIQUES FOR SAR IMAGESAUTOMATIC THRESHOLDING TECHNIQUES FOR SAR IMAGES
AUTOMATIC THRESHOLDING TECHNIQUES FOR SAR IMAGES
 

Mais de ArchiLab 7

John michael greer: an old kind of science cellular automata
John michael greer:  an old kind of science cellular automataJohn michael greer:  an old kind of science cellular automata
John michael greer: an old kind of science cellular automataArchiLab 7
 
Coates p: the use of genetic programming for applications in the field of spa...
Coates p: the use of genetic programming for applications in the field of spa...Coates p: the use of genetic programming for applications in the field of spa...
Coates p: the use of genetic programming for applications in the field of spa...ArchiLab 7
 
Coates p: the use of genetic programing in exploring 3 d design worlds
Coates p: the use of genetic programing in exploring 3 d design worldsCoates p: the use of genetic programing in exploring 3 d design worlds
Coates p: the use of genetic programing in exploring 3 d design worldsArchiLab 7
 
Coates p: genetic programming and spatial morphogenesis
Coates p: genetic programming and spatial morphogenesisCoates p: genetic programming and spatial morphogenesis
Coates p: genetic programming and spatial morphogenesisArchiLab 7
 
Coates p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...
Coates  p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...Coates  p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...
Coates p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...ArchiLab 7
 
Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...
Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...
Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...ArchiLab 7
 
Sakanashi, h.; kakazu, y. (1994): co evolving genetic algorithm with filtered...
Sakanashi, h.; kakazu, y. (1994): co evolving genetic algorithm with filtered...Sakanashi, h.; kakazu, y. (1994): co evolving genetic algorithm with filtered...
Sakanashi, h.; kakazu, y. (1994): co evolving genetic algorithm with filtered...ArchiLab 7
 
Dorst, kees and cross, nigel (2001): creativity in the design process co evol...
Dorst, kees and cross, nigel (2001): creativity in the design process co evol...Dorst, kees and cross, nigel (2001): creativity in the design process co evol...
Dorst, kees and cross, nigel (2001): creativity in the design process co evol...ArchiLab 7
 
Daniel hillis 1991: co-evolving parasites sfi artificial life ii
Daniel hillis 1991: co-evolving parasites sfi artificial life iiDaniel hillis 1991: co-evolving parasites sfi artificial life ii
Daniel hillis 1991: co-evolving parasites sfi artificial life iiArchiLab 7
 
P bentley skumar: three ways to grow designs a comparison of evolved embryoge...
P bentley skumar: three ways to grow designs a comparison of evolved embryoge...P bentley skumar: three ways to grow designs a comparison of evolved embryoge...
P bentley skumar: three ways to grow designs a comparison of evolved embryoge...ArchiLab 7
 
P bentley: aspects of evolutionary design by computers
P bentley: aspects of evolutionary design by computersP bentley: aspects of evolutionary design by computers
P bentley: aspects of evolutionary design by computersArchiLab 7
 
M de landa: deleuze and use of ga in architecture
M de landa: deleuze and use of ga in architectureM de landa: deleuze and use of ga in architecture
M de landa: deleuze and use of ga in architectureArchiLab 7
 
Kumar bentley: computational embryology_ past, present and future
Kumar bentley: computational embryology_ past, present and futureKumar bentley: computational embryology_ past, present and future
Kumar bentley: computational embryology_ past, present and futureArchiLab 7
 
Derix 2010: mediating spatial phenomena through computational heuristics
Derix 2010:  mediating spatial phenomena through computational heuristicsDerix 2010:  mediating spatial phenomena through computational heuristics
Derix 2010: mediating spatial phenomena through computational heuristicsArchiLab 7
 
De garis, h. (1999): artificial embryology and cellular differentiation. ch. ...
De garis, h. (1999): artificial embryology and cellular differentiation. ch. ...De garis, h. (1999): artificial embryology and cellular differentiation. ch. ...
De garis, h. (1999): artificial embryology and cellular differentiation. ch. ...ArchiLab 7
 
Yamashiro, d. 2006: efficiency of search performance through visualising sear...
Yamashiro, d. 2006: efficiency of search performance through visualising sear...Yamashiro, d. 2006: efficiency of search performance through visualising sear...
Yamashiro, d. 2006: efficiency of search performance through visualising sear...ArchiLab 7
 
Tanaka, m 1995: ga-based decision support system for multi-criteria optimisa...
Tanaka, m 1995: ga-based decision support system for multi-criteria  optimisa...Tanaka, m 1995: ga-based decision support system for multi-criteria  optimisa...
Tanaka, m 1995: ga-based decision support system for multi-criteria optimisa...ArchiLab 7
 
Rauber, a. 1999: label_som_on the labeling of self-organising maps
Rauber, a. 1999:  label_som_on the labeling of self-organising mapsRauber, a. 1999:  label_som_on the labeling of self-organising maps
Rauber, a. 1999: label_som_on the labeling of self-organising mapsArchiLab 7
 
Kubota, n. 1994: genetic algorithm with age structure and its application to ...
Kubota, n. 1994: genetic algorithm with age structure and its application to ...Kubota, n. 1994: genetic algorithm with age structure and its application to ...
Kubota, n. 1994: genetic algorithm with age structure and its application to ...ArchiLab 7
 
Kubota, n. 1994: genetic algorithm with age structure and its application to ...
Kubota, n. 1994: genetic algorithm with age structure and its application to ...Kubota, n. 1994: genetic algorithm with age structure and its application to ...
Kubota, n. 1994: genetic algorithm with age structure and its application to ...ArchiLab 7
 

Mais de ArchiLab 7 (20)

John michael greer: an old kind of science cellular automata
John michael greer:  an old kind of science cellular automataJohn michael greer:  an old kind of science cellular automata
John michael greer: an old kind of science cellular automata
 
Coates p: the use of genetic programming for applications in the field of spa...
Coates p: the use of genetic programming for applications in the field of spa...Coates p: the use of genetic programming for applications in the field of spa...
Coates p: the use of genetic programming for applications in the field of spa...
 
Coates p: the use of genetic programing in exploring 3 d design worlds
Coates p: the use of genetic programing in exploring 3 d design worldsCoates p: the use of genetic programing in exploring 3 d design worlds
Coates p: the use of genetic programing in exploring 3 d design worlds
 
Coates p: genetic programming and spatial morphogenesis
Coates p: genetic programming and spatial morphogenesisCoates p: genetic programming and spatial morphogenesis
Coates p: genetic programming and spatial morphogenesis
 
Coates p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...
Coates  p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...Coates  p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...
Coates p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...
 
Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...
Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...
Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...
 
Sakanashi, h.; kakazu, y. (1994): co evolving genetic algorithm with filtered...
Sakanashi, h.; kakazu, y. (1994): co evolving genetic algorithm with filtered...Sakanashi, h.; kakazu, y. (1994): co evolving genetic algorithm with filtered...
Sakanashi, h.; kakazu, y. (1994): co evolving genetic algorithm with filtered...
 
Dorst, kees and cross, nigel (2001): creativity in the design process co evol...
Dorst, kees and cross, nigel (2001): creativity in the design process co evol...Dorst, kees and cross, nigel (2001): creativity in the design process co evol...
Dorst, kees and cross, nigel (2001): creativity in the design process co evol...
 
Daniel hillis 1991: co-evolving parasites sfi artificial life ii
Daniel hillis 1991: co-evolving parasites sfi artificial life iiDaniel hillis 1991: co-evolving parasites sfi artificial life ii
Daniel hillis 1991: co-evolving parasites sfi artificial life ii
 
P bentley skumar: three ways to grow designs a comparison of evolved embryoge...
P bentley skumar: three ways to grow designs a comparison of evolved embryoge...P bentley skumar: three ways to grow designs a comparison of evolved embryoge...
P bentley skumar: three ways to grow designs a comparison of evolved embryoge...
 
P bentley: aspects of evolutionary design by computers
P bentley: aspects of evolutionary design by computersP bentley: aspects of evolutionary design by computers
P bentley: aspects of evolutionary design by computers
 
M de landa: deleuze and use of ga in architecture
M de landa: deleuze and use of ga in architectureM de landa: deleuze and use of ga in architecture
M de landa: deleuze and use of ga in architecture
 
Kumar bentley: computational embryology_ past, present and future
Kumar bentley: computational embryology_ past, present and futureKumar bentley: computational embryology_ past, present and future
Kumar bentley: computational embryology_ past, present and future
 
Derix 2010: mediating spatial phenomena through computational heuristics
Derix 2010:  mediating spatial phenomena through computational heuristicsDerix 2010:  mediating spatial phenomena through computational heuristics
Derix 2010: mediating spatial phenomena through computational heuristics
 
De garis, h. (1999): artificial embryology and cellular differentiation. ch. ...
De garis, h. (1999): artificial embryology and cellular differentiation. ch. ...De garis, h. (1999): artificial embryology and cellular differentiation. ch. ...
De garis, h. (1999): artificial embryology and cellular differentiation. ch. ...
 
Yamashiro, d. 2006: efficiency of search performance through visualising sear...
Yamashiro, d. 2006: efficiency of search performance through visualising sear...Yamashiro, d. 2006: efficiency of search performance through visualising sear...
Yamashiro, d. 2006: efficiency of search performance through visualising sear...
 
Tanaka, m 1995: ga-based decision support system for multi-criteria optimisa...
Tanaka, m 1995: ga-based decision support system for multi-criteria  optimisa...Tanaka, m 1995: ga-based decision support system for multi-criteria  optimisa...
Tanaka, m 1995: ga-based decision support system for multi-criteria optimisa...
 
Rauber, a. 1999: label_som_on the labeling of self-organising maps
Rauber, a. 1999:  label_som_on the labeling of self-organising mapsRauber, a. 1999:  label_som_on the labeling of self-organising maps
Rauber, a. 1999: label_som_on the labeling of self-organising maps
 
Kubota, n. 1994: genetic algorithm with age structure and its application to ...
Kubota, n. 1994: genetic algorithm with age structure and its application to ...Kubota, n. 1994: genetic algorithm with age structure and its application to ...
Kubota, n. 1994: genetic algorithm with age structure and its application to ...
 
Kubota, n. 1994: genetic algorithm with age structure and its application to ...
Kubota, n. 1994: genetic algorithm with age structure and its application to ...Kubota, n. 1994: genetic algorithm with age structure and its application to ...
Kubota, n. 1994: genetic algorithm with age structure and its application to ...
 

Último

Best-NO1 Pakistani Amil Baba Real Amil baba In Pakistan Najoomi Baba in Pakis...
Best-NO1 Pakistani Amil Baba Real Amil baba In Pakistan Najoomi Baba in Pakis...Best-NO1 Pakistani Amil Baba Real Amil baba In Pakistan Najoomi Baba in Pakis...
Best-NO1 Pakistani Amil Baba Real Amil baba In Pakistan Najoomi Baba in Pakis...Amil baba
 
LRFD Bridge Design Specifications-AASHTO (2014).pdf
LRFD Bridge Design Specifications-AASHTO (2014).pdfLRFD Bridge Design Specifications-AASHTO (2014).pdf
LRFD Bridge Design Specifications-AASHTO (2014).pdfHctorFranciscoSnchez1
 
Cold War Tensions Increase - 1945-1952.pptx
Cold War Tensions Increase - 1945-1952.pptxCold War Tensions Increase - 1945-1952.pptx
Cold War Tensions Increase - 1945-1952.pptxSamKuruvilla5
 
How to use Ai for UX UI Design | ChatGPT
How to use Ai for UX UI Design | ChatGPTHow to use Ai for UX UI Design | ChatGPT
How to use Ai for UX UI Design | ChatGPTThink 360 Studio
 
Design mental models for managing large-scale dbt projects. March 21, 2024 in...
Design mental models for managing large-scale dbt projects. March 21, 2024 in...Design mental models for managing large-scale dbt projects. March 21, 2024 in...
Design mental models for managing large-scale dbt projects. March 21, 2024 in...Ed Orozco
 
WCM Branding Agency | 210519 - Portfolio Review (F&B) -s.pptx
WCM Branding Agency | 210519 - Portfolio Review (F&B) -s.pptxWCM Branding Agency | 210519 - Portfolio Review (F&B) -s.pptx
WCM Branding Agency | 210519 - Portfolio Review (F&B) -s.pptxHasan S
 
UX Conference on UX Research Trends in 2024
UX Conference on UX Research Trends in 2024UX Conference on UX Research Trends in 2024
UX Conference on UX Research Trends in 2024mikailaoh
 
The future of UX design support tools - talk Paris March 2024
The future of UX design support tools - talk Paris March 2024The future of UX design support tools - talk Paris March 2024
The future of UX design support tools - talk Paris March 2024Alan Dix
 
Construction Documents Checklist before Construction
Construction Documents Checklist before ConstructionConstruction Documents Checklist before Construction
Construction Documents Checklist before ConstructionResDraft
 
Introduce Trauma-Informed Design to Your Organization - CSUN ATC 2024
Introduce Trauma-Informed Design to Your Organization - CSUN ATC 2024Introduce Trauma-Informed Design to Your Organization - CSUN ATC 2024
Introduce Trauma-Informed Design to Your Organization - CSUN ATC 2024Ted Drake
 
Designing for privacy: 3 essential UX habits for product teams
Designing for privacy: 3 essential UX habits for product teamsDesigning for privacy: 3 essential UX habits for product teams
Designing for privacy: 3 essential UX habits for product teamsBlock Party
 
Embroidery design from embroidery magazine
Embroidery design from embroidery magazineEmbroidery design from embroidery magazine
Embroidery design from embroidery magazineRivanEleraki
 
High-Quality Faux Embroidery Services | Cre8iveSkill
High-Quality Faux Embroidery Services | Cre8iveSkillHigh-Quality Faux Embroidery Services | Cre8iveSkill
High-Quality Faux Embroidery Services | Cre8iveSkillCre8iveskill
 
Math Group 3 Presentation OLOLOLOLILOOLLOLOL
Math Group 3 Presentation OLOLOLOLILOOLLOLOLMath Group 3 Presentation OLOLOLOLILOOLLOLOL
Math Group 3 Presentation OLOLOLOLILOOLLOLOLkenzukiri
 
Production of Erythromycin microbiology.pptx
Production of Erythromycin microbiology.pptxProduction of Erythromycin microbiology.pptx
Production of Erythromycin microbiology.pptxb2kshani34
 
Create Funeral Invites Online @ feedvu.com
Create Funeral Invites Online @ feedvu.comCreate Funeral Invites Online @ feedvu.com
Create Funeral Invites Online @ feedvu.comjakyjhon00
 
Mike Tyson Sign The Contract Big Boy Shirt
Mike Tyson Sign The Contract Big Boy ShirtMike Tyson Sign The Contract Big Boy Shirt
Mike Tyson Sign The Contract Big Boy ShirtTeeFusion
 
Khushi sharma undergraduate portfolio...
Khushi sharma undergraduate portfolio...Khushi sharma undergraduate portfolio...
Khushi sharma undergraduate portfolio...khushisharma298853
 
Building+your+Data+Project+on+AWS+-+Luke+Anderson.pdf
Building+your+Data+Project+on+AWS+-+Luke+Anderson.pdfBuilding+your+Data+Project+on+AWS+-+Luke+Anderson.pdf
Building+your+Data+Project+on+AWS+-+Luke+Anderson.pdfsaidbilgen
 

Último (19)

Best-NO1 Pakistani Amil Baba Real Amil baba In Pakistan Najoomi Baba in Pakis...
Best-NO1 Pakistani Amil Baba Real Amil baba In Pakistan Najoomi Baba in Pakis...Best-NO1 Pakistani Amil Baba Real Amil baba In Pakistan Najoomi Baba in Pakis...
Best-NO1 Pakistani Amil Baba Real Amil baba In Pakistan Najoomi Baba in Pakis...
 
LRFD Bridge Design Specifications-AASHTO (2014).pdf
LRFD Bridge Design Specifications-AASHTO (2014).pdfLRFD Bridge Design Specifications-AASHTO (2014).pdf
LRFD Bridge Design Specifications-AASHTO (2014).pdf
 
Cold War Tensions Increase - 1945-1952.pptx
Cold War Tensions Increase - 1945-1952.pptxCold War Tensions Increase - 1945-1952.pptx
Cold War Tensions Increase - 1945-1952.pptx
 
How to use Ai for UX UI Design | ChatGPT
How to use Ai for UX UI Design | ChatGPTHow to use Ai for UX UI Design | ChatGPT
How to use Ai for UX UI Design | ChatGPT
 
Design mental models for managing large-scale dbt projects. March 21, 2024 in...
Design mental models for managing large-scale dbt projects. March 21, 2024 in...Design mental models for managing large-scale dbt projects. March 21, 2024 in...
Design mental models for managing large-scale dbt projects. March 21, 2024 in...
 
WCM Branding Agency | 210519 - Portfolio Review (F&B) -s.pptx
WCM Branding Agency | 210519 - Portfolio Review (F&B) -s.pptxWCM Branding Agency | 210519 - Portfolio Review (F&B) -s.pptx
WCM Branding Agency | 210519 - Portfolio Review (F&B) -s.pptx
 
UX Conference on UX Research Trends in 2024
UX Conference on UX Research Trends in 2024UX Conference on UX Research Trends in 2024
UX Conference on UX Research Trends in 2024
 
The future of UX design support tools - talk Paris March 2024
The future of UX design support tools - talk Paris March 2024The future of UX design support tools - talk Paris March 2024
The future of UX design support tools - talk Paris March 2024
 
Construction Documents Checklist before Construction
Construction Documents Checklist before ConstructionConstruction Documents Checklist before Construction
Construction Documents Checklist before Construction
 
Introduce Trauma-Informed Design to Your Organization - CSUN ATC 2024
Introduce Trauma-Informed Design to Your Organization - CSUN ATC 2024Introduce Trauma-Informed Design to Your Organization - CSUN ATC 2024
Introduce Trauma-Informed Design to Your Organization - CSUN ATC 2024
 
Designing for privacy: 3 essential UX habits for product teams
Designing for privacy: 3 essential UX habits for product teamsDesigning for privacy: 3 essential UX habits for product teams
Designing for privacy: 3 essential UX habits for product teams
 
Embroidery design from embroidery magazine
Embroidery design from embroidery magazineEmbroidery design from embroidery magazine
Embroidery design from embroidery magazine
 
High-Quality Faux Embroidery Services | Cre8iveSkill
High-Quality Faux Embroidery Services | Cre8iveSkillHigh-Quality Faux Embroidery Services | Cre8iveSkill
High-Quality Faux Embroidery Services | Cre8iveSkill
 
Math Group 3 Presentation OLOLOLOLILOOLLOLOL
Math Group 3 Presentation OLOLOLOLILOOLLOLOLMath Group 3 Presentation OLOLOLOLILOOLLOLOL
Math Group 3 Presentation OLOLOLOLILOOLLOLOL
 
Production of Erythromycin microbiology.pptx
Production of Erythromycin microbiology.pptxProduction of Erythromycin microbiology.pptx
Production of Erythromycin microbiology.pptx
 
Create Funeral Invites Online @ feedvu.com
Create Funeral Invites Online @ feedvu.comCreate Funeral Invites Online @ feedvu.com
Create Funeral Invites Online @ feedvu.com
 
Mike Tyson Sign The Contract Big Boy Shirt
Mike Tyson Sign The Contract Big Boy ShirtMike Tyson Sign The Contract Big Boy Shirt
Mike Tyson Sign The Contract Big Boy Shirt
 
Khushi sharma undergraduate portfolio...
Khushi sharma undergraduate portfolio...Khushi sharma undergraduate portfolio...
Khushi sharma undergraduate portfolio...
 
Building+your+Data+Project+on+AWS+-+Luke+Anderson.pdf
Building+your+Data+Project+on+AWS+-+Luke+Anderson.pdfBuilding+your+Data+Project+on+AWS+-+Luke+Anderson.pdf
Building+your+Data+Project+on+AWS+-+Luke+Anderson.pdf
 

Drobics, m. 2001: datamining using synergiesbetween self-organising maps and inductive learning on fuzzy rule

In the first stage, self-organizing maps (SOMs) [13] are used to compress the data to a reasonable number of nodes which still contain all significant information while eliminating possible data faults, such as noise, outliers, and missing values. While other methods may be used for data reduction as well [7], SOMs have the advantage that they preserve the topology of the data space and allow the results to be displayed in two-dimensional plots [6].

The second stage is concerned with identifying significant regions of interest within the SOM nodes using a modified fuzzy c-means algorithm [3, 10]. In our variant, the two main drawbacks of the standard fuzzy c-means algorithm are resolved in a very simple and elegant way. Firstly, the clustering is no longer influenced by outliers, as they have been eliminated by the SOM in the preprocessing step. Secondly, initialization is handled by first computing a crisp Ward clustering [2] to find initial values for the cluster centers.

In the third and last stage, we create linguistic descriptions of the centers of these clusters which help the analyst to interpret the results. As the number of data samples under consideration has been reduced tremendously by using the SOM nodes, we are able to apply inductive learning methods [16] to find fuzzy descriptions of the clusters. Using the previously found clusters, we can make use of supervised learning methods in an unsupervised environment by considering the cluster membership as goal parameter. Descriptions are composed using fuzzy predicates of the form "x is/is not/is at least/is at most A", where x is the parameter under consideration and A is a linguistic expression modeled by a fuzzy set. We applied this method to several real-world data sets.

0-7803-7078-3/01/$10.00 © 2001 IEEE.
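The second-stage clustering can be sketched in a few lines. The sketch below is a minimal illustration, not the authors' implementation: it seeds standard fuzzy c-means with centers from a crisp Ward clustering (here via SciPy's hierarchical clustering), which is the initialization strategy described above. The helper names `ward_init` and `fuzzy_c_means` and all parameter defaults are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def ward_init(nodes, c):
    """Initial cluster centers from a crisp Ward clustering of the SOM nodes."""
    labels = fcluster(linkage(nodes, method="ward"), t=c, criterion="maxclust")
    return np.array([nodes[labels == k].mean(axis=0) for k in range(1, c + 1)])

def fuzzy_c_means(nodes, c, m=2.0, n_iter=100, tol=1e-6):
    """Standard fuzzy c-means on SOM node weight vectors, seeded by Ward centers."""
    centers = ward_init(nodes, c)
    for _ in range(n_iter):
        # squared distance of every node to every current center
        d2 = ((nodes[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        d2 = np.maximum(d2, 1e-12)          # guard against division by zero
        # membership update: u_ik proportional to d_ik^(-2/(m-1))
        u = d2 ** (-1.0 / (m - 1.0))
        u /= u.sum(axis=1, keepdims=True)
        # center update: membership-weighted mean of the nodes
        um = u ** m
        new_centers = (um.T @ nodes) / um.sum(axis=0)[:, None]
        if np.abs(new_centers - centers).max() < tol:
            centers = new_centers
            break
        centers = new_centers
    return centers, u
```

Because the SOM has already absorbed outliers into node averages, this plain membership update needs no extra robustness machinery.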
Through the combination of the two-dimensional projection of the data using the SOM and the fuzzy descriptions generated, the results are not only significant and accurate, but also very intuitive. Page:l780
2. Preprocessing

To reduce computational effort without losing significant information, we use self-organizing maps to create a mapping of the data space. Self-organizing maps use unsupervised competitive learning to create a topology-preserving map of the data.

Let us assume that we are given a data set $I$ consisting of $K$ samples $I = \{x^1, \dots, x^K\}$ from an $n$-dimensional real space $\mathcal{X} \subseteq \mathbb{R}^n$. A SOM is defined as a mapping from the data space $\mathcal{X}$ to an $m$-dimensional discrete output space $\mathcal{C}$ (most often $m = 2$). The output space consists of a finite number $N_{\mathcal{C}}$ of neurons. To each neuron $s \in \mathcal{C}$ of the output space, we associate a parametric weight vector in the input space $w^s = (w^s_1, \dots, w^s_n)^T \in \mathcal{X}$. The mapping

$\phi : \mathcal{X} \to \mathcal{C}, \quad x \mapsto \phi(x)$

defines the neural map from the input space $\mathcal{X}$ to the topological structure $\mathcal{C}$; $\phi(x)$ is defined as

$\phi(x) = \arg\min_i \{ \| x - w^i \|_V \},$

with

$\| x \|^2_V = \frac{\sum_{r=1}^{n} \frac{m_r}{V_r}\, x_r^2}{\sum_{r=1}^{n} m_r},$

where $m_r$ denotes an individual weight parameter which allows us to take different importances of the input parameters into account, and the factor $V_r$ stands for the variance of the values $(x^1_r, \dots, x^K_r)$. In other words, the function $\phi$ maps every input sample to that neuron of the map whose weight vector is closest to the input with respect to the distance measure $\| \cdot \|_V$. The goal of the learning process is to reduce the overall distortion error. For details about the learning algorithm, see [13].

The mapping provided by a SOM does not only create a set of weight vectors; it furthermore preserves the topology of the data space. As the map has a fixed dimension of $m = 2$, it is possible to plot each parameter (or cluster) with respect to the map in a very straightforward manner [6]. For our purposes, we use nets with several hundred nodes. Through this large number of nodes, the overall distortion error can be kept small; however, training may take longer.

An additional benefit of this preprocessing step is noise reduction. As the weight vector of each node represents the average of several data records, outliers and small variations in the input are eliminated.

3. Clustering

Now we try to identify possible regions of interest in the map. Regions of interest are characterized by nodes whose weight vectors are close to each other in the data space. Additionally, we use the information provided by the map. In doing so, clusters are generated that are not only homogeneous with respect to the input space, but also within the map.

To find regions of interest, we use a modified fuzzy c-means clustering [3]. Using this algorithm, a fuzzy segmentation of the data space is created. For initializing the fuzzy clustering, we first perform a crisp Ward clustering [2]. Ward clustering is an agglomerative clustering method which is, however, often not applicable due to its high computational effort. Through the data reduction done in the preprocessing step, we are not only able to apply this clustering method, but also to take the topological information of the map into account. This is accomplished by merging only nodes/clusters which are neighbors in the map. The number of regions of interest $L$ may be fixed a priori or determined by the Ward clustering initialization procedure.

To compute the final fuzzy clustering, we use a modified distance measure which also takes the weights and the topological structure of the map into account:

$d(x, y) = \left( \frac{\sum_{r=1}^{n} m_r \left( \frac{|x_r - y_r|}{\max_k x^k_r - \min_k x^k_r} \right)^2}{\sum_{r=1}^{n} m_r} \right)^{h(\phi(x),\,\phi(y))^\beta} = d_R(x, y)^{h(\phi(x),\,\phi(y))^\beta},$

where $h(i, j)$ denotes the neighborhood relation in $\mathcal{C}$, i.e. a scaling factor from the unit interval which rates the distance in the discrete topological structure of the map. Note that, due to the division by $\max_k x^k_r - \min_k x^k_r$, the raw distance $d_R(x, y)$ is always a value between 0 and 1 if all values $x_r$, $y_r$ are in the interval $[\min_k x^k_r, \max_k x^k_r]$.
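The modified distance measure can be sketched directly: the raw distance $d_R$ is a weighted, range-normalized mean of squared coordinate differences, and the map topology enters through the exponent $h(\phi(x), \phi(y))^\beta$. The code below is only an illustration, not the authors' implementation; in particular, the Gaussian-shaped neighborhood kernel `h` is an assumed placeholder, since the paper does not fix its concrete form here.

```python
import math

def raw_distance(x, y, weights, ranges):
    """Range-normalized, weighted squared distance d_R(x, y) in [0, 1]."""
    num = sum(w * (abs(a - b) / r) ** 2
              for a, b, w, r in zip(x, y, weights, ranges))
    return num / sum(weights)

def modified_distance(x, y, weights, ranges, node_x, node_y, beta=2.0):
    """d(x, y) = d_R(x, y) ** (h(i, j) ** beta), where node_x, node_y are
    the 2-D grid positions of phi(x) and phi(y). The kernel h below is an
    assumed Gaussian-like choice mapping grid distance into (0, 1]."""
    grid_dist = math.dist(node_x, node_y)   # distance of the nodes in the map
    h = math.exp(-grid_dist ** 2 / 2.0)     # scaling factor from the unit interval
    return raw_distance(x, y, weights, ranges) ** (h ** beta)
```

Because $d_R \in [0, 1]$, a small exponent (distant map nodes) pushes the value toward 1, i.e. stretches the raw distance, exactly as described below.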
The exponent $h(\phi(x), \phi(y))^\beta$ corresponds to the influence of the neighborhood relation $h(\phi(x), \phi(y))$: the smaller $h(\phi(x), \phi(y))$ is (i.e. the larger the distance of node $\phi(x)$ and node $\phi(y)$ in the map is), the more the raw distance $d_R(x, y)$ is stretched. The factor $\beta$ controls how strong the influence of the topological structure is (we use $\beta = 2$).

The fuzzy clustering is computed using the usual iterative two-step update procedure. First, the degree of membership to which node $i$ belongs to cluster $k$, denoted $C(i, k)$, is computed as

$C(i, k) = \frac{a(i, k)}{\sum_{l=1}^{L} a(i, l)}, \quad \text{with} \quad a(i, k) = e^{-\alpha \cdot d(w^i, v_k)}.$

The factor $\alpha \geq 1$ determines the fuzziness of the clustering (1 corresponds to a very fuzzy clustering, while, e.g., a value of 100 gives very crisp clusters; we use $\alpha = 40$). Then the cluster centers $v_k$ are updated by computing the weighted means of all nodes [10]:

$v_k = \frac{\sum_{i} C(i, k)\, w^i}{\sum_{i} C(i, k)}.$

This procedure is repeated until the change of the cluster centers is lower than a predefined bound $\varepsilon$.

4. Cluster Descriptions

The final step is concerned with finding descriptions of the regions of interest determined in the clustering step. Depending on the underlying context of the variable under consideration, we assign natural language expressions like very low, medium, large, etc. to each variable. These expressions are modeled by fuzzy sets which can either be defined by hand or automatically. When defining the fuzzy sets, in particular when they are generated automatically, it is important to preserve the semantic context of the natural language expressions (e.g. that low corresponds to smaller values than high) in order to achieve interpretable descriptions [5].

To find appropriate fuzzy segmentations of the input domains according to the distribution of the sample data automatically, we use a simple clustering-based algorithm. The cluster centers are first initialized by dividing the input range into a certain number of equally sized intervals. Then these centers are iteratively updated using a modified c-means algorithm [14] with simple neighborhood interaction to preserve the ordering of the clusters. Usually, a few iterations ($\leq 10$) are sufficient for each domain. The fuzzy sets are then created as trapezoidal membership functions around the centers computed previously.

A fuzzy set $A$ induces a fuzzy predicate $t(x \text{ is } A) = \mu_A(x)$, where $t$ is the function which assigns a truth value to the assertion "x is A". In a straightforward way, a fuzzy set $A$ also induces the following fuzzy predicates [4]:

$t(x \text{ is not } A) = 1 - \mu_A(x)$
$t(x \text{ is at least } A) = \sup \{ \mu_A(y) \mid y \leq x \}$
$t(x \text{ is at most } A) = \sup \{ \mu_A(y) \mid y \geq x \}$

Logical operations on the truth values are defined using triangular norms and conorms, i.e. commutative, associative, and non-decreasing binary operations on the unit interval with neutral elements 1 and 0, respectively [12]. In our applications, we restricted ourselves to the Łukasiewicz operations:

$T_L(x, y) = \max(x + y - 1, 0)$
$S_L(x, y) = \min(x + y, 1)$

Then conjunctions and disjunctions of fuzzy predicates can be defined as follows:

$t(A(x) \wedge B(x)) = T_L(t(A(x)), t(B(x)))$
$t(A(x) \vee B(x)) = S_L(t(A(x)), t(B(x)))$

Since we want to describe the class membership of the samples, we are interested in fuzzy rules of the form IF $A(x)$ THEN $C(x)$ (for convenience, we refer to such a rule as $A \to C$, although the rule should rather be regarded as a conditional assignment than as a logical implication), where $C$ is a fuzzy predicate that describes the class membership and $A$ may be a compound fuzzy predicate composed of several atomic predicates, e.g.

IF (age(x) is high) $\wedge$ (size(x) is at least high) THEN (class(x) is 3).

The degree of fulfillment of such a statement for a given sample $x$ is defined as $t(A(x) \wedge C(x))$ (corresponding to the so-called Mamdani inference [15]). The fuzzy set of samples fulfilling a predicate $A$, which we denote with $A(I)$, is defined as (for $x \in I$)

$\mu_{A(I)}(x) = t(A(x)).$

Let us define the cardinality of a fuzzy set $B$ of input samples as the sum of $\mu_B(x)$, i.e.

$|B| = \sum_{x \in I} \mu_B(x).$

Finally, the cardinality of samples fulfilling a rule $A \to C$ can be defined by

$|A(I) \cap C(I)| = \sum_{x \in I} t(A(x) \wedge C(x)) = \sum_{x \in I} T_L(t(A(x)), t(C(x))).$

To find the most accurate and significant descriptions, we use FS-FOIL [8], a modification of Quinlan's FOIL algorithm [18] for fuzzy rules.
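The induced predicates and the Łukasiewicz connectives translate almost literally into code. The sketch below is ours, not the authors' implementation: the suprema in "at least"/"at most" are approximated on a sampled grid of the domain, and the triangular membership function is just an example shape (the paper uses trapezoidal sets).

```python
def triangular(a, b, c):
    """Example membership function with support (a, c) and peak at b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

def is_not(mu):                 # t(x is not A) = 1 - mu_A(x)
    return lambda x: 1.0 - mu(x)

def at_least(mu, grid):         # t(x is at least A) = sup{mu_A(y) | y <= x}
    return lambda x: max(mu(y) for y in grid if y <= x)

def at_most(mu, grid):          # t(x is at most A) = sup{mu_A(y) | y >= x}
    return lambda x: max(mu(y) for y in grid if y >= x)

def t_norm(x, y):               # Lukasiewicz t-norm T_L
    return max(x + y - 1.0, 0.0)

def t_conorm(x, y):             # Lukasiewicz t-conorm S_L
    return min(x + y, 1.0)
```

For example, with `medium = triangular(0.0, 5.0, 10.0)`, the predicate "x is at least medium" stays at full truth for all x beyond the peak, which is exactly what makes such one-sided predicates useful in rule antecedents.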
It creates a stepwise coverage of each cluster such that not only spheric, but arbitrary clusters can be handled. We compute descriptions of the individual clusters independently; therefore, the goal predicate $C$ is fixed and a description rule $A \to C$ is uniquely characterized by its antecedent predicate $A$.
The basic algorithm can be summarized as follows:

    open nodes := cluster; S_i := ∅
    start with the most general predicate ∅
    do {
        expand the best n previous predicates
        if a rule A → C is accurate & significant {
            add predicate A to the rule set S_i
            remove covered nodes from open nodes
            go on with the most general predicate ∅
        }
        prune predicates
    } while open nodes ≠ ∅

In detail, the expansion of a predicate $A$ is performed by appending a new atomic fuzzy predicate $B$ by means of conjunction. To decide which predicates to expand, an entropy-based information gain measure is used to define a ranking of the set of predicates:

$G = |A(I) \cap C(I)| \cdot \left( \log_2 \frac{|A(I) \cap C(I)|}{|A(I)|} - \log_2 \frac{|C(I)|}{|I|} \right)$

Rules of interest have to be significant and accurate. The significance of a rule $A \to C$ is defined by its support

$\mathrm{supp}(A \to C) = \frac{|A(I) \cap C(I)|}{|I|},$

the accuracy by its confidence

$\mathrm{conf}(A \to C) = \frac{\mathrm{supp}(A \to C)}{\mathrm{supp}(A)},$

where $\mathrm{supp}(A)$ is defined as

$\mathrm{supp}(A) = \frac{|A(I)|}{|I|}.$

We consider only rules which satisfy reasonable requirements in terms of support and confidence. This is achieved by two lower bounds $\mathrm{supp}_{\min}$ and $\mathrm{conf}_{\min}$, respectively [1]. As, for all predicates $A$, $B$, and $C$,

$\mathrm{supp}(A \to C) \geq \mathrm{supp}(A \wedge B \to C),$

we can prune the expanded set of predicates by removing all predicates $X$ for which $\mathrm{supp}(X) < \mathrm{supp}_{\min}$. The final degree of membership of a sample $x \in \mathcal{X}$ to cluster $i$ with respect to the rule set $S_i$ is given by the $S_L$-disjunction

$\mathop{S_L}_{A \in S_i} A(x).$

5. Results

We now present results for three very different data sets and purposes, demonstrating the flexibility and general applicability of the proposed approach.

5.1. Iris data

We used the well-known Iris data set according to Fisher [9] to illustrate the ability of our approach to identify hidden structures in a data set. The data consist of three classes of flowers, where each flower is characterized by four attributes. Each class is represented by 50 samples. We generated a SOM with 10 x 10 nodes and a set of 4 clusters to separate the test data. Finally, a compact set of nine intuitive fuzzy rules was created by the above algorithm. Applying this set of rules to the original data resulted in 10.6% misclassified and 7.3% unclassified samples. Note that, throughout the clustering and labeling process, the goal information has not been taken into account. This means that the proposed method is able to label the goal classes with a correctness of over 80% without even knowing the correct classification.

In the original Iris data set, one goal class is perfectly separated from the others. The remaining two classes, however, have a certain region of overlap in the data space. As a detailed analysis has shown, three of the four generated clusters can be directly associated with one of the goal classes. The fourth cluster contained exactly the region of overlap, which demonstrates that our method is able to preserve the topology of the input space and to detect hidden structures in the data.

5.2. Image segmentation

A typical application of clustering in computer vision is image segmentation; therefore, it seemed interesting to apply our three-stage approach to this problem. Moreover, the possibility to describe segments with natural language expressions gives rise to completely new opportunities in image understanding. First experiments using pixel coordinates and RGB values showed good results in terms of accuracy; however, the descriptions were rarely intuitive. In order to overcome this problem, we added HSL features (hue, saturation, and lightness), which match the way humans perceive color much more closely, in the final description step. Since humans have a certain understanding of colors and intensities, it is more appropriate to assume predefined fuzzy predicates corresponding to natural language terms describing these properties.
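The rule-quality measures used by the search can be computed directly from per-sample truth degrees; the following is a minimal sketch (the function names are ours, not an FS-FOIL API), using the Łukasiewicz conjunction for $|A(I) \cap C(I)|$ as defined above.

```python
import math

def t_norm(x, y):
    """Lukasiewicz t-norm T_L used for the conjunction A(x) AND C(x)."""
    return max(x + y - 1.0, 0.0)

def support(a_deg, c_deg):
    """supp(A -> C) = |A(I) ∩ C(I)| / |I|, from per-sample truth degrees."""
    return sum(t_norm(a, c) for a, c in zip(a_deg, c_deg)) / len(a_deg)

def confidence(a_deg, c_deg):
    """conf(A -> C) = supp(A -> C) / supp(A)."""
    return sum(t_norm(a, c) for a, c in zip(a_deg, c_deg)) / sum(a_deg)

def gain(a_deg, c_deg):
    """Entropy-based gain G ranking candidate antecedent predicates."""
    ac = sum(t_norm(a, c) for a, c in zip(a_deg, c_deg))  # |A(I) ∩ C(I)|
    a, c, n = sum(a_deg), sum(c_deg), len(a_deg)
    return ac * (math.log2(ac / a) - math.log2(c / n))
```

In the crisp special case (all degrees 0 or 1) these reduce to the familiar association-rule support and confidence [1], which makes the lower bounds supp_min and conf_min easy to interpret.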
In this way, moreover, it can be avoided that the meaning of the descriptions depends on the image under consideration.

The example in Fig. 1 shows an image with 170 x 256 pixels, i.e. we have K = 43520 samples with n = 8 features. First, it was mapped onto a SOM with 10 x 10 nodes. Then four clusters and the corresponding descriptions were computed. On the right-hand side, Fig. 1 shows the computed segmentation. Figure 2 shows the
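Extracting the HSL features from RGB pixels needs no special machinery; for instance, Python's standard `colorsys` module provides the conversion (note that it works in HLS order, i.e. it returns hue, lightness, saturation). This is only an illustrative sketch of such a feature-extraction step, not the authors' pipeline.

```python
import colorsys

def hsl_features(r, g, b):
    """Hue, saturation, and lightness for an RGB pixel with channels in 0..255."""
    h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    return h, s, l  # reorder HLS -> HSL
```

Appending these three values to the two pixel coordinates and three RGB channels gives 2 + 3 + 3 = 8 features per sample, matching the n = 8 used in the example.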
final rule sets and the cross-validation table, which illustrates how well the clusters are described by the rule sets.

[Figure 1: Original image (left) and its segmentation (right)]

[Figure 2: Rule sets for the image segmentation; cross-validating the rules against the clusters yields a total error of 0.41% (+0.43% missing)]

The rule sets in Fig. 2 can be interpreted as follows: the first rule set corresponds to the blue sky; the second rule set describes the black pants; the snow is classified by the third rule set; finally, the jackets are identified by the fourth rule set.

5.3. Yeast data set

As a third test case, we used the UCI yeast data set due to K. Nakai. The data set consists of K = 1484 samples with n = 8 attributes. Each sample belongs to one of ten classes. The data set has a very complex inner structure and a lot of inconsistencies. Attempts to solve this classification problem with ad hoc structured probability models, binary decision trees, or Bayesian classifier methods were only able to achieve an accuracy of approximately 55% [11]. We computed a SOM with 20 x 20 nodes, which were then grouped into 5 clusters (Fig. 3 shows plots of the class memberships over the SOM). Finally, a set of descriptions for each cluster was created.

[Figure 3: Clusters in the yeast data set]

To validate our results, we compared the original classification with our clusters and rules. Although we used fewer clusters than goal classes (which were not taken into account), the overall number of samples assigned to the wrong goal class was relatively small (cf. Fig. 4).

[Figure 4: Cross-validation table (goal classes vs. clusters); brackets indicate cluster-class assignments; total error 37% (+3% missing)]

The rule sets created for each cluster are shown in Fig. 5. Applying these rules to the original data and comparing the result with the cluster memberships, an average error of 10.38% with 53.63% missing occurred. The high number of missing samples is due to the complex structure of the clusters (see Fig. 3).

6. Conclusion

In this paper we have shown how the combination of different methods (SOMs, clustering, and inductive
learning) can minimize the shortcomings of the individual methods. Through the generic approach, our methods can be applied to a wide variety of problems, including classification, image segmentation, and data mining. The modular design allows changing individual components on demand, or even omitting one step; e.g., when labeled data is available, it is possible to use this information as the goal parameter in the inductive learning step, instead of performing unsupervised clustering first.

[Figure 5: Generated rule sets for the yeast data set]

Acknowledgements

This work has been done in the framework of the Kplus Competence Center Program, which is funded by the Austrian Government, the Province of Upper Austria, and the Chamber of Commerce of Upper Austria.

References

[1] R. Agrawal, T. Imielinski, and A. N. Swami. Mining association rules between sets of items in large databases. In P. Buneman and S. Jajodia, editors, Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 207-216, Washington, DC, 1993.
[2] M. R. Anderberg. Cluster Analysis for Applications. Academic Press, 1973.
[3] J. C. Bezdek. Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York, 1981.
[4] U. Bodenhofer. The construction of ordering-based modifiers. In G. Brewka, R. Der, S. Gottwald, and A. Schierwagen, editors, Fuzzy-Neuro Systems '99, pages 55-62. Leipziger Universitätsverlag, 1999.
[5] U. Bodenhofer and P. Bauer. Towards an axiomatic treatment of "interpretability". In Proc. IIZUKA2000, pages 334-339, Iizuka, October 2000.
[6] G. Deboeck and T. Kohonen, editors. Visual Explorations in Finance with Self-Organizing Maps. Springer, London, 1998.
[7] C. Diamantini and M. Panti. An efficient and scalable data compression approach to classification. ACM SIGKDD Explorations, 2:49-55, 2000.
[8] M. Drobics, W. Winiwarter, and U. Bodenhofer. Interpretation of self-organizing maps with fuzzy rules. In Proc. 12th IEEE Int. Conf. on Tools with Artificial Intelligence (ICTAI'00), pages 304-311. IEEE Press, 2000.
[9] R. A. Fisher. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7:179-188, 1936.
[10] F. Hoeppner, F. Klawonn, R. Kruse, and T. A. Runkler. Fuzzy Cluster Analysis: Methods for Image Recognition, Classification, and Data Analysis. John Wiley & Sons, Chichester, 1999.
[11] P. Horton and K. Nakai. A probabilistic classification system for predicting the cellular localization sites of proteins. In D. J. States, P. Agarwal, T. Gaasterland, L. Hunter, and R. Smith, editors, Proc. 4th Int. Conf. on Intelligent Systems for Molecular Biology, pages 109-115. AAAI Press, Menlo Park, 1996.
[12] E. P. Klement, R. Mesiar, and E. Pap. Triangular Norms, volume 8 of Trends in Logic. Kluwer Academic Publishers, Dordrecht, 2000.
[13] T. Kohonen. Self-Organizing Maps, volume 30 of Information Sciences. Springer, Berlin, second extended edition, 1997.
[14] Y. Linde, A. Buzo, and R. M. Gray. An algorithm for vector quantizer design. IEEE Trans. Commun., 28(1):84-95, 1980.
[15] E. H. Mamdani and S. Assilian. An experiment in linguistic synthesis with a fuzzy logic controller. Int. J. Man-Mach. Stud., 7:1-13, 1975.
[16] R. S. Michalski, I. Bratko, and M. Kubat. Machine Learning and Data Mining. John Wiley & Sons, Chichester, 1998.
[17] J. R. Quinlan. Induction of decision trees. Machine Learning, 1(1):81-106, 1986.
[18] J. R. Quinlan. Learning logical definitions from relations. Machine Learning, 5(3):239-266, 1990.