Which model provides the best fit?

Fitting the full data set gives a well-fitting model (99.937%), as it should, whereas fitting the model to the data with zeros is difficult because the zeros do not follow the PARAFAC model (fit 28.415%).

However, the best-fitting model is the one with many missing elements (NaNs). It fits 99.942% of the variation. Why?

Well, if we want to do linear regression with, say, five data pairs, we’ll get a certain model. But if three of the pairs are missing, we only have to fit the model to two points. Naturally, a straight line passes exactly through two points, so this gives a perfectly fitting model.
Hence, fitting to fewer data is easier statistically (though not numerically; it usually takes longer).
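
To make the regression argument concrete, here is a minimal sketch in plain Python. The five data pairs are made up for illustration, missing values are marked with None (standing in for NaN), and the fit percentage is computed as 100 * (1 - SS_residual / SS_total) over the observed values only, with an uncentered total sum of squares. That definition is an assumption chosen for this sketch; it is not taken from the PARAFAC software used above.

```python
def fit_line(points):
    """Least-squares line y = a + b*x, using only pairs whose y is observed."""
    observed = [(x, y) for x, y in points if y is not None]
    n = len(observed)
    mean_x = sum(x for x, _ in observed) / n
    mean_y = sum(y for _, y in observed) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in observed)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def percent_fit(points, intercept, slope):
    """Percentage of variation explained, evaluated on observed values only."""
    observed = [(x, y) for x, y in points if y is not None]
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in observed)
    ss_tot = sum(y ** 2 for _, y in observed)
    return 100.0 * (1.0 - ss_res / ss_tot)


# Hypothetical data: five noisy points that lie roughly on a line.
full = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.1)]
# Same data with three of the five y-values missing.
with_missing = [(1, 1.1), (2, None), (3, None), (4, None), (5, 5.1)]

a_full, b_full = fit_line(full)
a_miss, b_miss = fit_line(with_missing)

fit_full = percent_fit(full, a_full, b_full)
fit_missing = percent_fit(with_missing, a_miss, b_miss)

print(f"fit, all five points:      {fit_full:.3f}%")
print(f"fit, two observed points:  {fit_missing:.3f}%")
```

With only two observed points the line goes through both of them, so the fit is 100% up to rounding, while the line fitted to all five points leaves a small residual. Note that the fit is measured only on the values that were actually available to the model.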

However, a better fit does not necessarily mean a better model.