Cartogram data projection for self-organizing maps
The Self-Organizing Map (SOM) is most often visualized by applying Ultsch's Unified Distance Matrix (U-Matrix) shading and labeling each cell of the 2-D grid with the training data observations nearest to that node in feature space. Although this is powerful and the de facto standard visualization for SOMs, it omits two key pieces of information needed in real-world data mining applications: (a) While the U-Matrix indicates the location of possible clusters on the map, it typically does not accurately convey the size of the underlying data population within those clusters. (b) When training data observations are mapped onto the 2-D grid of the SOM, multiple observations are often mapped onto a single cell. Simply labeling the observations on a single cell gives no insight into the feature-space distribution of observations within that cell, and in practical data mining applications it is often desirable to understand the distribution or “goodness of fit” of the observations as they are mapped to the individual SOM cells. We address these problems with two complementary innovations. First, we increase or decrease the 2-D size of each cell according to the number of data elements it contains, an approach derived from the cartogram techniques of geography. Second, we determine the within-cell location of each datum according to its similarity in n-dimensional feature space to each of the neighboring nodes that surround it on the 2-D SOM grid. When multiple observations are mapped to a single cell, their plot locations convey a sense of the data distribution within that cell. Plotting the data distribution within a cell can also be viewed as a visualization of the quantization error of the map. Finally, we found that these techniques lend themselves to additional applications and uses within the context of SOMs, and we explore them briefly.
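The two ideas above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names, the inverse-distance similarity, and the 0.5 scaling are all assumptions made for illustration. The sketch counts how many observations fall in each cell (the quantity a cartogram would use to scale cell area; the cartogram deformation itself is not performed here) and places each datum inside its cell by pulling it toward the neighboring nodes it most resembles in feature space.

```python
import numpy as np

def bmu_indices(codebook, X):
    """Return the (row, col) of the best-matching unit for each row of X.

    codebook has shape (rows, cols, dim); X has shape (n, dim).
    """
    rows, cols, dim = codebook.shape
    flat = codebook.reshape(-1, dim)
    # Euclidean distance from every observation to every node.
    d = np.linalg.norm(X[:, None, :] - flat[None, :, :], axis=2)
    best = d.argmin(axis=1)
    return np.stack(np.unravel_index(best, (rows, cols)), axis=1)

def cell_counts(bmus, shape):
    """Count observations per cell; a cartogram would scale cell area by this."""
    counts = np.zeros(shape, dtype=int)
    np.add.at(counts, (bmus[:, 0], bmus[:, 1]), 1)
    return counts

def within_cell_offset(codebook, x, r, c, eps=1e-12):
    """Offset of datum x from the center of cell (r, c), in cell units.

    Assumption: similarity is inverse Euclidean distance, and the offset is
    the similarity-weighted mean of the grid directions to the cell and its
    (up to eight) neighbors, scaled so it stays within [-0.5, 0.5].
    """
    rows, cols, _ = codebook.shape
    dirs, sims = [], []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                dist = np.linalg.norm(x - codebook[rr, cc])
                dirs.append((dr, dc))
                sims.append(1.0 / (dist + eps))
    w = np.array(sims) / np.sum(sims)
    return 0.5 * (w[:, None] * np.array(dirs, dtype=float)).sum(axis=0)
```

Under these assumptions, a datum that coincides with its node is plotted at the cell center, while a poorly fitting datum drifts toward the cell edge, which is one way to read the within-cell scatter as a picture of quantization error.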
David H Brown,
"Cartogram data projection for self-organizing maps"
Dissertations and Master's Theses (Campus Access).