8. TUCKER INTRODUCTION

8. TUCKER INTRODUCTION

Contents
By C. A. Andersson

The PARAFAC model
The Tucker3 model
The Tucker model family

Data used
howto1.mat and howto2.mat, two small simulated data sets to illustrate different basic properties of the Tucker models

Purpose
Getting an initial feeling for the difference between PARAFAC and Tucker.

Information
R. Henrion N-way principal component analysis theory, algorithms and applications.
Chemom.Intell.Lab.Syst. 25:1-23, 1994

Prerequisites
Knowledge of MATLAB, multi-way arrays and PARAFAC

Since interpretation of the Tucker3 model
requires some understanding of the model mechanics, we will elaborate on the subject of the factor-core relationship. The equations for the Tucker3 model will be given.
Three examples and two How-to cases are given to illustrate the theory.

The Tucker3 model has taken the name from
the psychometrician Ledyard R. Tucker who in 1966 proposed the model.
He also proposed a way to calculate the parameters of the model and since then, many improvements have been suggested with regards to the algorithmic solution.
The model itself has remained a strong tool for analysis of three-way (and higher-way) data arrays. The generality of the Tucker3 model, and the fact that it covers the PARAFAC model as a special case, has made it an often used model for decomposition, compression, and interpretation in many applications. In the sequel we will discuss pro et contra of the model and also include an application.
(Some mathematical notes on three-mode factor analysis. Psychometrika 31:279-311, 1966)

1. The PARAFAC model

The simplest three-way model is the PARAFAC model.
It was proposed simultaneously by Harshman (Foundations of the PARAFAC procedure: Models and conditions for an ‘explanatory’ multi-modal factor analysis. UCLA working papers in phonetics 16:1-84, 1970)

And Carrol & Chang (Analysis of individual differences in multidimensional scaling via an N-way generalization of ‘Eckart-Young’ decomposition. Psychometrika 35:283-319, 1970)

Who termed it CANDECOMP.
See Bro (PARAFAC. Tutorial and applications. Chemom.Intell.Lab.Syst. 38:149-171, 1997)
for a tutorial on PARAFAC . The w-component PARAFAC model can be considered as being related to the decomposition models well known from multivariate two-way analysis;
The three-way array X (r₁,r₂,r₃) is decomposed into a systematic part, described by w sets of outer product of vectors/profiles (each outer product is called a triad) and an unmodelled part denoted by E representing the deviation of the model from the real, measured, data. See Fig. 2.

The vectors a₁ to a_w are arranged columnwise in matrix A (r₁,w). Such a matrix is called a component matrix. Accordingly, the vectors/profiles in modes two and three are arranged in component matrices B (r₂,w) and C(r₃,w).

The mathematical formulation of the w-component PARAFAC model can take the form described in Eq. (1). Here, each element x_ijk (i=1,…,r₁; j=1,…,r₂; k=1,…,r₃) is approximated by the model as the sum of wouter products between the rows of the three component matrices.

The PARAFAC model has some very interesting features
Provided that no more than the correct number of components are extracted, the profiles are determined uniquely up to permutations and scaling. This allows for resolving pure chemical spectra from N-way data arrays as elaborated on in the earlier chapters.

It should be noted
that the PARAFAC model is simply a sum of w outer products between matching profiles.
As implied by Fig. 2, the PARAFAC model requires the same number of components to be extracted in all modes.

2. The Tucker3 model and 3-way PCA

Now we turn the discussion to the main issue
The Tucker3 model. We can picture the Tucker3 model as an extension of the PARAFAC-CANDECOMP model along the line of outer products.

A tutorial to chemical applications of the Tucker3 model
has been made by Henrion (N-way principal component analysis theory, algorithms and applications. Chemom.Intell.Lab.Syst. 25:1-23, 1994).

The textbook by Kroonenberg
(Three-mode Principal Component Analysis. Theory and Applications, Leiden:DSWO Press, 1983) gives a detailed mathematical description of the model and discusses advanced issues such as data preparation/scaling and core rotation. The Tucker3 model with orthogonal factors is also known as 3-way PCA (principal component analysis).

The Tucker3 model allows for extraction
of different numbers of factors in each of the modes. We will refer to these numbers as w₁, w₂ and w₃, where w₁ is the number of profiles in the first mode and w₂ and w₃ are the numbers of factors to estimate in modes 2 and 3. An easy way to explain the model is provided by means of Fig. 3

Note: in Fig 3., that B seems to be of size , w₂ x r₂ although it should be r₂ x w₂.
However, the figure is a three-dimensional representation and in 3D, there is no transpose!
Think about it a bit. How would it be possible to designate the difference between
A, B and C in the figure with transpose? Well, it isn’t and therefore the orientation of a matrix can not be defined from a 3D figure. It has to be defined elsewhere.

One major difference to the PARAFAC model is
the presence of the core array G (w₁,w₂,w₃). Fig. 3 illustrates that the model is a weighted sum of all possible outer products (i.e., triads) where the weight of the outer product between the i ‘th factor from A, the j ‘th factor from B and the k ‘th factor from C is determined by element g_ijk of the core.

The profiles derived by the PARAFAC model are often
left unconstrained for chemical applications. However, the factors in the Tucker3 model are often constrained to be orthogonal since the resulting core is easier to interpret and the model requires much less time for computation.

Example 1

Question
Given a three-way data array X with dimensions (10,18,5), we wish to estimate a Tucker3 model with two components in the first mode, three in the second and two in the third mode. What are the dimensions of the resulting data structures:
A (r₁,w₁), B (r₂,w₂), C (r₃,w₃) and G?

Answer

How to do in the N-way Toolbox #1
Here are the N-way Toolbox commands to type to calculate the Tucker3 model above.
The data arrays X, R and W are initialized with values that will solve the question posed in Example #1. The tucker-algorithm used in the N-way Toolbox gives orthogonal factors unless otherwise specified.

load howto1 %loads the data array X and R and W [Factors,G,SSE]=tucker(X,W); %Calculate the TUCKER3 model plotfac(Factors) %Automated way to plot the results
%in the'Factors' directly explcore(G,5); %Automated way to find the 5 most
%important factor combination'G'
%Lets see how this can be done manually   [A B C]=fac2let(Factors); %Extract the component matrices from
%'Factors' for manual investigation figure %Create a new figure to plot in
%this allows us to compare
plot(A),title('Factors in first mode')
%Plot the first component matrix    plot(B),title('Factors in second mode')
%Plot the second component matrix    plot(C),title('Factors in third mode')
%Plot the third component matrix    num2str(G,5) %Inspect the core manually remember
%the automated way above
%namely 'explcore'

In Eq. (2) the Tucker3 model
of X is written in a similar manner to that of PARAFAC, compare Eq. (1).

In Eq. (2) we see that the approximation of the ijkth element of X is a sum of the possible w₁w₂w₃ outer products each weighted by the corresponding elements of G. Usually the component matrices are constrained to be columnwise orthogonal, or independent, but this is not a necessary constraint. It suffices to constrain the length of the factors to a fixed value, i.e. one. The algorithm for estimating unconstrained factors is included in the N-way Toolbox. It is not trivial and will not be discussed here.

In order to facilitate the discussion of the model equations we introduce an operation called the Kronecker multiplication. It is designated by the symbol which applied as
A B yields the element-by-element multiplication of B with the elements from A,
see Eq. (3).

The Kronecker multiplication
is implemented in standard MATLAB, as kron(A,B).
Using this operator we can give a very simple mathematical expression of the relation between the matricized/unfolded matrix X⁽¹⁾ and the component matrices A (r₁,w₁), B (r₂,w₂), C (r₃,w₃) and the unfolded core G⁽¹⁾ (w₁,w₂w₃).
This is done in Eq. (4).

Note that instead of showing an error term in Eq. (4) the equality has been replaced by an approximation. The algorithms in the N-way Toolbox are based on minimizing the sum of the squared differences between the right-hand side and the left-hand side of Eq. (4).
We will not pursue the issue of algorithms further here (See Andersson and Bro: Improving the speed of multi-way algorithms: Part I. Tucker3. Chemom.Intell.Lab.Syst. 42:93-103, 1998).
Eq. (4) can be formulated in terms of the two remaining unfoldings X⁽²⁾ and X⁽³⁾
as shown in Eqs. (5) and (6).

If the orthonormal component matrices are known, the core can be calculated by Eq. (7).

3. The Tucker model family

To complete the introduction of the Tucker3 model
we need to describe two related models, namely Tucker2 and Tucker1.
For a thorough discussion of the many different three-way models, refer to Kiers (Hierarchical relations among three-way methods. Psychometrika 56:449-470, 1991).

The Tucker3 model
which we have discussed so far – is truly a three-way model since it explicitly establishes a relationship between factors in the three modes spanned by the data array.
However, if one of the modes contains very few observations – and the data needs no compression in that mode – one should consider applying the Tucker2 model which uses only two component matrices. We may also wish not to apply a model structure to that particular mode of the data for other reasons.

The Tucker2 model
leaves one mode in X uncompressed. Hence, the model uses only two component matrices, but still a full core. In the case of Tucker2, the core is often referred to as the extended core matrix. Mathematically we can formulate the Tucker2 model as done in Eq. (8).

In the formulation of Eq. (8), the third mode is left uncompressed as only matrices A, B and G are found. Note that G has full dimensionality in the uncompressed mode. The Tucker2 model does exploit fully the three-way structure of the data and there are applications that requires this feature.

The Tucker1 model
is simply PCA on one of the three unfolding matrices, not utilizing the three-way structure of data (some papers consider the Tucker1 model as all three solutions considered together). The Tucker1 model for mode 2 and 3 combined, can be computed directly in MATLAB by finding A as the w₁ first columns of U obtained by [U,S,V]=svd(X⁽¹⁾).

Example 2

Question
Having a three-way data array X (5,38,3) and using that three, three and two factors are required for the respective modes, you have calculated a Tucker3 model, a Tucker2 model (neglecting mode 3) and the three possible Tucker1 models.
What models include what kinds of component matrices and cores?
What are the dimensions of these arrays ?

Answer

How to do it in the N-Way Toolbox
Two examples are given on how to use the N-way Toolbox algorithm ‘tucker’.
We will calculate the solutions to Example #3 above, so the values in X and R are initialized to the proper values. You must specify the dimensions of the models to achieve the desired models. Plot the residuals of the Tucker3 model.

load howto2 %Loads the data array and R
%Examine an ordinary Tucker3 model W3=[3 3 2];

[Factors3,G3,SSE3]=tucker(X,W3);
%Calculate a (3,3,2) Tucker3 model

Xm3=nmodel(Factors3,G3); %Estimate the model from the calculated solutions Xres3=X-Xm3; %Calculate the corresponding three-way residuals for k=1:size(X,1),mesh(squeeze(Xres3(k,:,:))); title(['Sample num.', int2str(k),'/', int2str(size(X,1))]); pause; end, close

%Examine a Tucker2 model W2=[3 3 -1];[Factors2,G2,SSE2]=tucker(X,W2); %Calculate a (3,3,none) Tucker2 model Xm2=nmodel(Factors2,G2); %Estimate the model from the calculated solutions Xres2=X-Xm2; %Calculate the corresponding three-way residuals for k=1:size(X,1),mesh(squeeze(Xres2(k,:,:))); title(['Sample num.', int2str(k), '/', int2str(size(X,1))]); pause; end, close

Previously