Unsupervised Classification algorithms

Today several different unsupervised classification algorithms are commonly
used in remote sensing. The two most frequently used algorithms are the K-mean
and the ISODATA clustering algorithm.

Both of these algorithms are iterative
procedures. In general, both of them assign first an arbitrary initial cluster
vector. The second step classifies each pixel to the closest cluster. In the
third step the new cluster mean vectors are calculated based on all the pixels
in one cluster. The second and third steps are repeated until the "change"
between the iteration is small. The "change" can be defined in several different
ways, either by measuring the distances the mean cluster vector have changed
from one iteration to another or by the percentage of pixels that have changed
between iterations.

The ISODATA algorithm has some further refinements by
splitting and merging of clusters (JENSEN, 1996). Clusters are merged if either
the number of members (pixel) in a cluster is less than a certain threshold or
if the centers of two clusters are closer than a certain threshold. Clusters are
split into two different clusters if the cluster standard deviation exceeds a
predefined value and the number of members (pixels) is twice the threshold for
the minimum number of members.

The ISODATA algorithm is similar to the k-means algorithm with the distinct difference that the ISODATA algorithm allows for different number of clusters while the k-means assumes that the number of clusters is known a priori.

The objective of the k-means algorithm is to minimize the *within*
cluster variability. The objective function (which is to be minimized) is the
*sums of squares distances *(errors) between each pixel and its assigned
cluster center.

where
*C(x)* is the mean of the cluster that pixel *x* is assigned to.

Minimizing the *SS _{distances}* is equivalent to minimizing the
Mean Squared Error (MSE). The MSE is a measure of the within cluster
variability.

where N is the
number of pixels, c indicates the number of clusters, and b is the number of
spectral bands.

Note that the MSE is not the objective function of the ISODATA algorithm. However, the ISODATA algorithm tends to also minimize the MSE.

K-means (just as the ISODATA algorithm) is very sensitive to initial starting values. For two classifications with different initial values and resulting different classification one could choose the classification with the smallest MSE (since this is the objective function to be minimized). However, as we show later, for two different initial values the differences in respects to the MSE are often very small while the classifications are very different. Visually it is often not clear that the classification with the smaller MSE is truly the better classification.

From a statistical viewpoint, the clusters obtained by k-mean can be
interpreted as the Maximum Likelihood Estimates (MLE) for the cluster means if
we assume that each cluster comes from a spherical Normal distribution with
different means but identical variance (and zero covariance).

This touches upon a general disadvantage of the k-means algorithm (and
similarly the ISODATA algorithm): k-means works best for images with clusters
that are spherical and that have the same variance.

This is often not true
for remote sensing images. For example, a cluster with "desert" pixels is
compact/circular. A "forest" cluster, however, is usually more or less
elongated/oval with a much larger variability compared to the "desert" cluster.
While the "desert" cluster is usually very well detected by the k-means
algorithm as one distinct cluster, the "forest" cluster is often split up into
several smaller cluster. The way the "forest" cluster is split up can vary quite
a bit for different starting values and is thus arbitrary.