• It is an orthogonal linear transformation that maps the data to a new coordinate system such that the direction of greatest variance in the data lies along the first coordinate (called the first principal component), the direction of second greatest variance along the second coordinate, and so on.
  • PCA can be used to reduce the dimensionality of a dataset while retaining the characteristics that contribute most to its variance, by keeping the lower-order (first few) principal components and discarding the higher-order ones. These low-order components often contain the "most important" aspects of the data.
  • PCA is optimal among linear transformations in the sense that, for any given number of dimensions, the subspace spanned by the leading principal components retains the largest variance.
  • Unlike linear transforms with a fixed basis, such as the discrete cosine transform, PCA does not have a fixed set of basis vectors. Its basis vectors depend on the data set.
  • PCA is also called the (discrete) Karhunen–Loève transform (or KLT, named after Kari Karhunen and Michel Loève) or the Hotelling transform (in honor of Harold Hotelling).
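
The properties listed above can be illustrated with a minimal NumPy sketch. It assumes one common way of computing PCA, namely an eigendecomposition of the sample covariance matrix; the function name `pca`, its parameters, and the synthetic data are illustrative choices, not something the text prescribes.

```python
import numpy as np


def pca(X, n_components):
    """Sketch of PCA via eigendecomposition of the sample covariance matrix.

    X is an (n_samples, n_features) array; all names here are illustrative.
    """
    # Center the data so that variance is measured about the mean.
    mean = X.mean(axis=0)
    Xc = X - mean

    # Sample covariance matrix of the features (n_features x n_features).
    cov = np.cov(Xc, rowvar=False)

    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Reorder so the first component carries the greatest variance.
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]

    # Keep the leading components; these basis vectors depend on the data.
    components = eigvecs[:, :n_components]
    scores = Xc @ components  # data expressed in the new coordinate system
    return components, eigvals, scores


rng = np.random.default_rng(0)
# Synthetic 3-D data with very different spread along each latent axis,
# then linearly mixed so the directions of variance are not axis-aligned.
latent = rng.normal(size=(500, 3)) * np.array([5.0, 1.0, 0.1])
mixing = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.5], [0.5, 0.0, 1.0]])
X = latent @ mixing

components, variances, scores = pca(X, n_components=2)
```

Here `variances` holds the variance along every principal direction in decreasing order, and `scores` is the data reduced from three dimensions to two. Running the same function on a different data set would yield different `components`, which is the sense in which the basis is data-dependent.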