An algorithm for fitting the trilinear model
that has received much attention in chemometrics goes by several acronyms, e.g.:
|GRAFA||Generalized rank annihilation factor analysis|
|GRAM||Generalized rank annihilation method|
|DTLD||Direct trilinear decomposition|
|NBRA||Nonbilinear rank annihilation|
Ho and coworkers
developed an algorithm called rank annihilation factor analysis (RAFA) for estimating the concentration of an analyte in an unknown matrix using only the unknown sample and a pure standard. This remarkable property was later termed the second-order advantage, because it is obtained by using the second-order (two-way) structure of the individual sample instead of merely unfolding the matrix into a long vector (first-order structure). The second-order advantage is identical to the uniqueness of the trilinear structure, and the structural model underlying the GRAM, RAFA, and DTLD algorithms is the same as that of the PARAFAC model.
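For reference, the trilinear (PARAFAC) structure referred to here is commonly written as below. The symbols are notational assumptions, not taken from the text above: x_ijk is the signal of sample k at row variable i and column variable j, F is the number of components, and a, b, c are the loadings in the three modes.

```latex
x_{ijk} = \sum_{f=1}^{F} a_{if}\, b_{jf}\, c_{kf} + e_{ijk},
\qquad i = 1,\dots,I,\quad j = 1,\dots,J,\quad k = 1,\dots,K

% Equivalent slab-wise form for sample k, with D_k = diag(c_{k1},\dots,c_{kF}):
\mathbf{X}_k = \mathbf{A}\,\mathbf{D}_k\,\mathbf{B}^{\mathsf{T}} + \mathbf{E}_k
```

In this notation each analyte contributes one rank-one term to every sample matrix, which is the property that rank annihilation exploits.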
The idea behind RAFA
was based on reducing the rank of the calibration sample by subtracting the contribution from the analyte of interest. If the signal from the analyte is subtracted from the sample data, the rank of the resulting matrix decreases by one, because the analyte contributes exactly one to the rank for ordinary bilinear rank-one data such as chromatographic or fluorescence data. This was intuitively appealing, but the method itself was somewhat unsatisfactory and slow.
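The rank-reduction idea can be illustrated with a small numerical sketch. It is not the original RAFA implementation: the data are synthetic and noise-free, all names (`analyte_profile`, `candidates`, and so on) are hypothetical, and the concentration is found by a simple grid scan, which also hints at why such a search is slow.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pure-component profiles (e.g. elution x spectrum); each is rank one.
n_rows, n_cols = 8, 6
analyte_profile = np.outer(rng.random(n_rows), rng.random(n_cols))
interferent_1 = np.outer(rng.random(n_rows), rng.random(n_cols))
interferent_2 = np.outer(rng.random(n_rows), rng.random(n_cols))

true_conc = 2.5
standard = analyte_profile  # pure standard at unit concentration (assumption)
mixture = true_conc * analyte_profile + 0.7 * interferent_1 + 1.3 * interferent_2


def nth_singular_value(matrix, n):
    """Return the n-th largest singular value of `matrix` (1-based)."""
    return np.linalg.svd(matrix, compute_uv=False)[n - 1]


# Scan candidate concentrations: the third singular value of the residual
# vanishes when the analyte contribution is subtracted exactly.
candidates = np.linspace(0.0, 5.0, 501)
third_sv = [nth_singular_value(mixture - c * standard, 3) for c in candidates]
estimated_conc = candidates[int(np.argmin(third_sv))]

rank_before = np.linalg.matrix_rank(mixture)
rank_after = np.linalg.matrix_rank(mixture - estimated_conc * standard)
print(estimated_conc, rank_before, rank_after)
```

With noisy data the singular value would only reach a minimum rather than zero, so the scan would locate a minimum instead of an exact rank drop.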
Avraham Lorber found that
the algorithm could be automated by realizing that the sought reduction in rank could be expressed as a generalized eigenvalue problem. Sanchez and Kowalski generalized the method into the generalized rank annihilation method (GRAM), in which several components can be present or absent in both the calibration and the standard sample.
The GRAM method
is based on the trilinear (PARAFAC) model and seeks to estimate the model by solving a (generalized) eigenvalue problem. The method is restricted in that one mode can have a dimension of at most two (i.e., two samples). However, estimating the parameters of the model from two samples gives the spectral and chromatographic loadings in the case of a chromatography-spectroscopy analysis. As these parameters are fixed for new samples as well, the concentrations of analytes in new samples can be found by simple regression on the fixed parameters. Used in this way, the method is called DTLD. This extension is normally based on defining a set of two synthetic samples as linear combinations of the original samples. In the N-way toolbox the Tucker3 model is used for that purpose.
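The eigenvalue formulation and the subsequent regression step can be sketched as follows. This is a minimal illustration on synthetic, noise-free data, not the N-way toolbox code: the function name `gram`, the use of the sum of the two samples for compression, and all variable names are assumptions made for the example. It assumes the two sample matrices share loadings, X1 = A diag(c1) B' and X2 = A diag(c2) B', so that after projecting onto the joint subspaces the eigenvalues equal c1 / (c1 + c2) and the eigenvectors recover the loadings up to scaling.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical noise-free data: 2 components, 10 elution times, 7 wavelengths.
n_comp = 2
A_true = rng.random((10, n_comp))   # e.g. chromatographic profiles
B_true = rng.random((7, n_comp))    # e.g. spectral profiles
c_standard = np.array([1.0, 0.0])   # standard: analyte only
c_mixture = np.array([2.5, 1.3])    # unknown: analyte plus one interferent

X1 = A_true @ np.diag(c_standard) @ B_true.T
X2 = A_true @ np.diag(c_mixture) @ B_true.T


def gram(X1, X2, n_comp):
    """Estimate loadings from two samples via an eigenvalue problem (sketch)."""
    X_sum = X1 + X2
    # Compress both samples onto the joint column and row spaces.
    U, _, Vt = np.linalg.svd(X_sum, full_matrices=False)
    U, V = U[:, :n_comp], Vt[:n_comp].T
    G1 = U.T @ X1 @ V
    G_sum = U.T @ X_sum @ V
    # Eigenvalues are c1 / (c1 + c2); eigenvectors map U onto the loadings.
    eigvals, T = np.linalg.eig(G1 @ np.linalg.inv(G_sum))
    eigvals, T = eigvals.real, T.real
    A_hat = U @ T
    B_hat = (np.linalg.pinv(A_hat) @ X_sum).T
    return A_hat, B_hat, eigvals


A_hat, B_hat, eigvals = gram(X1, X2, n_comp)

# The analyte is the only component present in the standard (largest eigenvalue).
analyte = int(np.argmax(eigvals))
ratio = eigvals[analyte]
analyte_conc = (1.0 - ratio) / ratio  # concentration relative to the standard


def cosine(u, v):
    """Absolute cosine similarity, insensitive to sign and scale."""
    return abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))


profile_match = cosine(A_hat[:, analyte], A_true[:, 0])

# DTLD-style use: regress a new sample on the fixed loadings.
X_new = A_true @ np.diag([0.4, 2.0]) @ B_true.T
Z = np.column_stack(
    [np.outer(A_hat[:, f], B_hat[:, f]).ravel() for f in range(n_comp)]
)
scores_new, *_ = np.linalg.lstsq(Z, X_new.ravel(), rcond=None)
X_new_fit = (Z @ scores_new).reshape(X_new.shape)
print(analyte_conc, profile_match)
```

Because the loadings are recovered only up to scaling, `scores_new` holds relative rather than absolute concentrations. In this sketch the interferent is absent from the standard, which shows up as an eigenvalue near zero.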
The DTLD and PARAFAC models are structurally identical, so what are the differences? The PARAFAC model is a least-squares model, whereas the DTLD model has no well-defined optimization criterion. For noise-free data DTLD gives the correct solution, but for noisy data there is no guarantee regarding the quality of the model. The advantage of DTLD is the speed of the algorithm, and for precise trilinear data it often gives good results.
The advantage of the PARAFAC model
is the possibility of modifying the algorithm according to external knowledge, such as using weighted regression when uncertainties are available, or imposing non-negativity when negative parameters are known not to conform to reality. A further advantage of the PARAFAC model is the ease with which it can be extended to higher-order data.
Many authors have compared GRAM and DTLD with PARAFAC for their use in curve resolution. They mostly find GRAM inferior to PARAFAC but suggest using GRAM for initialization of the PARAFAC algorithm, an approach that is also adopted in the N-way toolbox.
The basic principle of GRAM/DTLD has been invented several times.
As early as 1972, Schönemann developed a similar algorithm. Interestingly, it was based on essentially the same idea, originating with Cattell in 1944 and passed on through Meredith, that Harshman generalized to the PARAFAC model and algorithm.