Ordination with any dissimilarity measure: a weighted Euclidean solution

Author: Michael Greenacre

Ecology, Vol. 98, No. 9, pp. 2293-2300, September 2017

The classical approach to ordination is to use variants of the Euclidean distance to measure differences between samples (e.g., sites in a community study) based on their observation vectors (e.g., abundance counts for a set of species). Examples include Euclidean distance on standardized or log-transformed data, on which principal component analysis and redundancy analysis are based; chi-square distance, on which (canonical) correspondence analysis is based; and Hellinger distance, using square roots of relative values in each multivariate vector. Advantages of the Euclidean approach include the neat decomposition of variance and the ordination's optimal biplot display. To extend this approach to any non-Euclidean or nonmetric dissimilarity, a simple solution is proposed, consisting of the estimation of a weighted Euclidean distance that optimally approximates the dissimilarities. This preliminary step preserves the good properties of the classical approach while giving two additional benefits as by-products. Firstly, the estimated species weights, quantifying each species’ contribution to the dissimilarities, can be interpreted, and weights equal to or close to zero can assist in variable selection. Secondly, the dimensionality remains that of the number of species, not the dimensionality inherent in the dissimilarities, which depends on the number of samples and can be considerably higher.
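The idea of approximating a dissimilarity by a weighted Euclidean distance can be illustrated with a minimal sketch. It is not the estimation procedure of the paper, whose fitting criterion is not given in this abstract. As a simplifying assumption, the sketch fits squared dissimilarities by non-negative least squares, which makes the problem linear in the species weights. The function names `fit_species_weights`, `weighted_euclidean`, and `ordinate` are hypothetical, and the Bray-Curtis dissimilarity and the simulated counts are used only as an example.

```python
import numpy as np
from scipy.optimize import nnls
from scipy.spatial.distance import pdist


def fit_species_weights(X, delta):
    """Estimate non-negative species weights w such that
    sum_j w_j * (x_ij - x_kj)**2 approximates delta_ik**2.

    X     : (n_samples, n_species) data matrix
    delta : condensed vector of target dissimilarities, ordered as in pdist
    Assumption: squared dissimilarities are fitted (a simplification).
    """
    n = X.shape[0]
    i, k = np.triu_indices(n, k=1)      # same pair ordering as pdist
    sq_diffs = (X[i] - X[k]) ** 2       # one row per sample pair, one column per species
    w, _ = nnls(sq_diffs, delta ** 2)   # least squares subject to w >= 0
    return w


def weighted_euclidean(X, w):
    """Condensed weighted Euclidean distances between the rows of X."""
    return pdist(X * np.sqrt(w))


def ordinate(X, w):
    """Sample scores from a PCA of the weight-scaled, column-centred data."""
    Z = X * np.sqrt(w)
    Z = Z - Z.mean(axis=0)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U * s                        # principal coordinates of the samples


# Illustrative data: simulated counts for 12 samples and 5 species
rng = np.random.default_rng(0)
X = rng.poisson(5, size=(12, 5)).astype(float)
delta = pdist(X, metric="braycurtis")   # example non-Euclidean dissimilarity

w = fit_species_weights(X, delta)
scores = ordinate(X, w)
print("estimated species weights:", w)
print("correlation between fitted and target dissimilarities:",
      np.corrcoef(weighted_euclidean(X, w), delta)[0, 1])
```

In this sketch the ordination has at most as many dimensions as there are species, and a weight estimated at or near zero marks a species that contributes little to the fitted distances.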