Exercise “PARAFAC2 difficult data”

PARAFAC2 offers quite distinct possibilities
for handling data that cannot be modeled in a meaningful way by ordinary PARAFAC models. Often the trilinearity assumptions in PARAFAC are too strict and not adequate for describing the data. In those situations, PARAFAC2 can handle the data if the trilinearity is a problem in one mode

When does PARAFAC fail?
In order to see that ordinary PARAFAC can actually handle quite severe deviations from trilinearity, a small simulated example will be used. The amino acid fluorescence data will be used, only the emission mode will be shifted from excitation to excitation to make the data non-trilinear.

load claus
X = permute(X,[2 1 3]);

The data are rearranged with the permute command so that emission (the changing mode) is the first mode and excitation (the mode across which changes occur) is the last mode. Then the emission spectra are shifted:

for i=1:61 Xs(:,:,i) = X(1+round(i/4):170+round(i/4),:,i); end

To see the changes, first plot e.g. sample 1 in the unshifted situation. As this is a pure sample, it is close to rank one and all emission spectra have the same shape.

subplot(2,1,1) xx = squeeze(X(:,1,:)); % Sample 1 xx = xx*diag(sum(xx.^2).^(-.5)); % Normalize spectra plot(xx),axis tight title('Raw data - no shifts')

Then plot the shifted version of this sample to see the difference

subplot(2,1,2) xx = squeeze(Xs(:,1,:)); % Sample 1 xx = xx*diag(sum(xx.^2).^(-.5)); % Normalize spectra plot(xx),axis tight title('Raw data - with shifts')

As can be seen, the data are now far from being trilinear and hence PARAFAC can definitely not model the data to the extent that the unshifted data can be modeled.

Fit the PARAFAC model and compare the loadings with those of the unshifted model.

m = parafac(Xs,3);

It is impressive and maybe even surprising that the parameters turn out so well even with such severe shifts (non-trilinearities) in the data. In this case, there is a little ‘cheating’ involved because the regularity of the shifts makes life a little bit easier for PARAFAC. Still, the result is important because it clearly shows that meaningful results can be obtained also when PARAFAC is only approximately correct.

Using PARAFAC2
When the shifts are more irregularly spaced, then PARAFAC will have more difficulties and then PARAFAC2 may come in handy. Make a modified data array according to the below

for i=1:61 if i<30 % Don’t shift the first 30 excitations Xs(:,:,i) = X(1:170,:,i); else % but shift the last 31 excitation Xs(:,:,i) = X(1+round(i/3):170+round(i/3),:,i); end end

Plot the data to see the modifications

subplot(2,1,1) xx = squeeze(X(:,1,:)); % Sample 1 xx = xx*diag(sum(xx.^2).^(-.5)); % Normalize spectra plot(xx),axis tight title('Raw data - no shifts')

subplot(2,1,2) xx = squeeze(Xs(:,1,:)); % Sample 1 xx = xx*diag(sum(xx.^2).^(-.5)); % Normalize spectra plot(xx),axis tight title('Raw data - with shifts')

Now fit both a PARAFAC and a PARAFAC2 model to these data and plot the parameters to see the differences:

mpf1 = parafac(Xs,3); [A,H,C,P]=parafac2(permute(Xs,[2 1 3]),3);

subplot(3,2,1) for i = 1:10 plot(P{i}*H),hold on end hold off axis tight title('PARAFAC2 solution','fontweight','bold')

subplot(3,2,3) plot(A) axis tight

subplot(3,2,5) plot(C) axis tight

subplot(3,2,2) plot(mpf1{1}) axis tight

title('PARAFAC1 solution','fontweight','bold')

subplot(3,2,4) plot(mpf1{2}) axis tight

subplot(3,2,6) plot(mpf1{3}) axis tight

Apart from minor issues, the PARAFAC2 results are clearly preferred to the ordinary PARAFAC results. Why are there several similar emission loadings in PARAFC2? How many are there?

In fact, only the first ten versions of the PARAFAC2 emission loadings are plotted. Try to plot all the 61 versions. Now the emission plot is much more cluttered. Is that a problem?