• Hierarchical clustering is a classic clustering algorithm.
  • It produces a representation of the data with the shape of a binary tree, or a dendrogram, in which the most similar patterns are clustered in a hierarchy of nested subsets.
  • Initially, the dendrogram is empty, and each data vector is in its own cluster. In each iterative step, the two clusters with the maximum similarity are merged to form a subtree, so the current number of clusters is reduced by 1. The merging process is repeated until the desired number of clusters K is obtained. When the iterative merging stops, there are K subtrees, which constitute the K clusters.
  • The inputs to the hierarchical clustering algorithm include the expression matrix (or similarity matrix) and the desired number of clusters, K.
  • Different cluster similarity criteria yield different hierarchical clustering outcomes. Common similarity criteria include single linkage, average linkage, complete linkage, and centroid linkage.
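
The merging procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. It assumes that similarity is measured as Euclidean distance (so "maximum similarity" becomes "minimum distance"), and the names `euclidean`, `cluster_distance`, and `agglomerative` are hypothetical helpers introduced only for this example. Centroid linkage is left out for brevity.

```python
import math
from itertools import combinations


def euclidean(a, b):
    # Assumption: dissimilarity between two data vectors is Euclidean distance.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_distance(c1, c2, points, linkage):
    # c1 and c2 are lists of point indices; compare every cross-cluster pair.
    dists = [euclidean(points[i], points[j]) for i in c1 for j in c2]
    if linkage == "single":      # closest pair of members
        return min(dists)
    if linkage == "complete":    # farthest pair of members
        return max(dists)
    if linkage == "average":     # mean over all cross-cluster pairs
        return sum(dists) / len(dists)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerative(points, k, linkage="average"):
    # Start with every data vector in its own cluster.
    clusters = [[i] for i in range(len(points))]
    # Each merge reduces the number of clusters by 1; stop at K clusters.
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(
                clusters[p[0]], clusters[p[1]], points, linkage
            ),
        )
        # a < b always holds, so deleting index b does not shift index a.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
print(agglomerative(points, k=3, linkage="single"))
```

With these five toy points and K = 3, the two tight pairs are merged first and the isolated point stays on its own. Recomputing every pairwise cluster distance in each step keeps the sketch short but is slow for large inputs; practical implementations update a distance matrix incrementally instead.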