Improving the correlation hunting in a large quantity of SOM component planes
1. Improving the correlation hunting in a large quantity of SOM component planes Classification of agro-ecological variables related with productivity in the sugar cane culture. Miguel BARRETO, Andrés Pérez-Uribe. MINISTERIO DE AGRICULTURA Y DESARROLLO RURAL, asocaña
2. Self Organizing Maps Self-organizing maps (SOMs) can be seen as a data visualization technique that reduces the dimensionality of data through a self-organizing clustering algorithm. The problem that data visualization attempts to solve is that humans cannot visualize high-dimensional data. These techniques improve the understanding of high-dimensional data by presenting it in a low-dimensional space. A SOM does this by placing points that are close in the high-dimensional space close together in the low-dimensional space. From a computational point of view, the self-organizing model is both a projection method, which maps a high-dimensional data space into a low-dimensional space (dimensionality reduction), and a clustering method, so that similar data samples tend to be mapped to nearby neurons.
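The projection and clustering behaviour described above can be illustrated with a minimal sketch of a rectangular SOM trained by the classic online update rule. This is not the implementation used in the study: the function names, the Gaussian neighbourhood, the linear decay schedules and all default hyperparameters are assumptions chosen for brevity.

```python
import math
import random


def best_matching_unit(codebook, x):
    """Return (row, col) of the unit whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(codebook):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(x, w))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a small SOM; codebook[r][c] is the weight vector of unit (r, c)."""
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0  # assumed initial neighbourhood radius
    codebook = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
                for _ in range(rows)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # radius shrinks, floor 0.5
            br, bc = best_matching_unit(codebook, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood measured on the map grid
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * sigma * sigma))
                    w = codebook[r][c]
                    for k in range(dim):
                        w[k] += lr * h * (x[k] - w[k])
            step += 1
    return codebook
```

Calling `train_som(data, 3, 3)` on a list of equal-length numeric vectors returns a 3x3 grid of weight vectors; samples that are close in the input space tend to share the same or neighbouring best matching units.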
3. Component planes To improve the analysis of the relationships between variables and/or their influence on the outputs of the system, it is possible to slice the self-organizing map in order to visualize its so-called component planes. [Figure: the map's weight vectors (Vector 1, Vector 2, Vector n) sliced into component planes; labels shown: Ra1AS, P1AS, TMAS, V1]
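Slicing a map into component planes amounts to reading the same weight index from every unit. The sketch below assumes the codebook is stored as a nested list `codebook[row][col][variable]`; the 2x2 codebook and its values are invented for illustration, and only the variable names are taken from the slide.

```python
def component_plane(codebook, k):
    """Return the rows x cols grid holding the k-th weight of every map unit."""
    return [[unit[k] for unit in row] for row in codebook]


# Hypothetical 2x2 codebook; the numbers are made up for illustration.
variables = ["Ra1AS", "P1AS", "TMAS"]
codebook = [
    [[0.1, 0.9, 0.5], [0.2, 0.8, 0.5]],
    [[0.3, 0.7, 0.4], [0.4, 0.6, 0.4]],
]

# One plane per variable, each with the same shape as the map grid.
planes = {name: component_plane(codebook, k) for k, name in enumerate(variables)}
```

Each plane can then be drawn as a heat map, so that variables with similar patterns across the map become visible side by side.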
4. Example: Clustering of the SOM easily reveals distinct gene expression patterns: results of a reanalysis of a lymphoma study (Junbai Wang et al., 2002) a) 42 DLBCL samples; in the SOM color scale, red indicates high expression and blue indicates low expression. b) The cluster numbers represent the groups of genes contained.
5. Correlation hunting The task of organizing similar component planes in order to find correlating components is called correlation hunting.
6. Correlation hunting The term correlation here covers not just linear correlations, but also nonlinear and local or partial correlations between variables.
7. Correlation hunting However, when the number of components is large, it is difficult to determine which planes are similar to each other.
8. Correlation hunting A new SOM can be used to reorganize the component planes in order to perform the correlation hunting. The main idea is to place correlated components close to each other.
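As a rough sketch of the preprocessing this implies, each component plane can be flattened into a vector and standardized, so that the planes themselves become the input samples of the second SOM. The helper names below are hypothetical, and the absolute Pearson correlation is only an assumed stand-in for plane similarity: it captures linear relationships, whereas a SOM trained on the flattened planes may also group planes that agree only locally.

```python
import math


def flatten_plane(plane):
    """Turn a rows x cols component plane into a flat list of values."""
    return [v for row in plane for v in row]


def standardize(vec):
    """Center the vector and scale it to unit length (zero vector if constant)."""
    mean = sum(vec) / len(vec)
    centered = [v - mean for v in vec]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        return [0.0] * len(vec)
    return [c / norm for c in centered]


def plane_similarity(plane_a, plane_b):
    """Absolute Pearson correlation between two planes of the same shape."""
    a = standardize(flatten_plane(plane_a))
    b = standardize(flatten_plane(plane_b))
    return abs(sum(x * y for x, y in zip(a, b)))
```

Taking the absolute value treats strongly anti-correlated planes as similar, which matches the goal of placing related components close to each other regardless of sign.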
9. Correlation hunting An advantage of using a SOM for component plane projection is that the placements of the component planes can be shown on a regular grid. In addition, an ordered presentation of similar components is automatically generated. A disadvantage is that the choice of grouping variables is left to the user.
10. More component planes … Heart disease 279 component planes This database contains 13 attributes (which have been extracted from a larger set of 75)
11. Clustering of SOM component planes based on the SOM distance matrix The U-matrix has been used as an effective cluster distance function. The U-matrix visualizes the distances between each map unit and its neighbors, which makes the cluster structure of the SOM visible.
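A simplified U-matrix can be sketched as the mean distance from each unit to its grid neighbours. The version below is an assumption-laden reduction: it uses a rectangular grid with 4-connected neighbours and returns one value per unit, whereas full U-matrix renderings usually insert extra cells between units.

```python
import math


def u_matrix(codebook):
    """Mean Euclidean distance from each unit to its 4-connected neighbours."""
    rows, cols = len(codebook), len(codebook[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(codebook[r][c], codebook[nr][nc]))
            # A 1x1 map has no neighbours; leave its value at 0.0.
            out[r][c] = sum(dists) / len(dists) if dists else 0.0
    return out
```

High values mark borders between clusters, and low values mark regions where neighbouring units hold similar weight vectors.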
12. Use Vellido’s algorithm to partition the map Vellido’s algorithm is used to obtain different partitioning levels of the clustering of the SOM. It provides a partitioning of the map into a set of base clusters. The number of base clusters is equal to the number of local minima on the U-matrix, which allows different levels of clustering.
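The only step of this procedure that the slide specifies is that base clusters correspond to local minima of the U-matrix. The sketch below covers just that seed-finding step, not the full algorithm; the strict comparison and the 4-connected neighbourhood are assumptions, and `local_minima` is a hypothetical helper name.

```python
def local_minima(u):
    """Return (row, col) of every cell strictly lower than its 4-neighbours."""
    rows, cols = len(u), len(u[0])
    minima = []
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                u[r + dr][c + dc]
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
            if all(u[r][c] < v for v in neighbours):
                minima.append((r, c))
    return minima
```

Under these assumptions, `len(local_minima(u))` gives the number of base clusters for a U-matrix `u`. Plateaus of equal values produce no minimum with the strict comparison, so a real implementation would need a rule for ties.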
15. A new approach Each agroecological event is unique in time and space, but events can share characteristics that lead to similar behavior. Finding these similarities helps to discover why and how the agroecological variables affect crop development and therefore agricultural productivity. [Diagram labels: 1358 experiments; Management, Climate, Genotype, Soil; Sowing, Growing, Harvest]
20. Classification of agro-ecological variables related with productivity (initial analysis) BMUs of the component planes: productivity, radiation 1 month before harvest (Ra1BH) and radiation 1 month after seed (Ra1AS).