- Hierarchical clustering is a classic clustering algorithm.
- It produces a representation of the data with the shape of a binary tree, or a dendrogram, in which the most similar patterns are clustered
in a hierarchy of nested subsets.
- Initially, the dendrogram is empty, and each data vector is in its own cluster. In each iterative step, the two clusters
with the maximum similarity are merged to form a subtree, so the current number of clusters is reduced by 1. The merging process is repeated until
the desired number of clusters K is obtained. When the iterative merging process stops, there are K subtrees, each corresponding to one of the K clusters.
- The inputs to the hierarchical clustering algorithm are the expression matrix (or a similarity matrix derived from it) and the desired number of clusters, K.
- Different cluster similarity criteria yield different hierarchical clustering outcomes. Common criteria include single-linkage, average-linkage, complete-linkage and centroid-linkage.
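
The merging procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: it assumes Euclidean distance as the dissimilarity measure (so "maximum similarity" becomes "minimum distance"), it covers only single-, complete- and average-linkage (centroid-linkage is left out), and the names `agglomerative`, `cluster_distance` and `euclidean` are hypothetical helpers introduced here. It recomputes all pairwise cluster distances at every step, which is simple but slow for large inputs.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_distance(points, c1, c2, linkage):
    """Distance between two clusters (lists of point indices)."""
    dists = [euclidean(points[i], points[j]) for i in c1 for j in c2]
    if linkage == "single":      # closest pair of members
        return min(dists)
    if linkage == "complete":    # farthest pair of members
        return max(dists)
    if linkage == "average":     # mean over all cross-cluster pairs
        return sum(dists) / len(dists)
    raise ValueError(f"unknown linkage: {linkage!r}")


def agglomerative(points, k, linkage="average"):
    """Merge the two closest clusters until only k clusters remain."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Start with every data vector in its own cluster.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        # Find the pair of clusters with the smallest distance
        # (i.e. the maximum similarity).
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(
                points, clusters[p[0]], clusters[p[1]], linkage
            ),
        )
        # Merge cluster b into cluster a; a < b, so deleting b is safe.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy data: two tight pairs and one isolated point.
data = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(agglomerative(data, k=3))  # → [[0, 1], [2, 3], [4]]
```

Each pass through the loop reduces the number of clusters by 1, so starting from N data vectors the loop runs N - K times. Changing the `linkage` argument changes only how the distance between two clusters is measured, which is why different criteria can produce different groupings on the same data.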