Wine Samples
Forty-four red wine samples, all produced from the same grape (100% Cabernet Sauvignon) but harvested in different geographical areas, were collected from local supermarkets in the Copenhagen area, Denmark. The geographical origins and the number of wine samples analysed from each are given in Table 1.
Table 1. Geographical origin of the analysed red wines
Origin | Wine samples |
Argentina | 6 |
Chile | 15 |
Australia | 12 |
South Africa | 11 |
Total | 44 |
The wine samples were analysed using headspace GC-MS and FT-IR instruments. The FT-IR instrument was a commercial WineScan provided by FOSS Analytical A/S.
GC-MS data
For each sample, a mass spectrum (m/z 5-204, i.e. 200 mass channels) was recorded at each of 2700 elution time points, giving a data cube of size 44×2700×200 (samples × elution time points × mass channels). Figure 1 shows an example chromatogram for one red wine sample.
Figure 1. Typical chromatogram showing the TIC (Total Ion Count) of one red wine sample.
In the figure, the abundance at each scan is the sum of the intensities over all recorded mass channels (m/z 5-204).
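The summation described above can be sketched in a few lines. This is a minimal illustration in plain Python on a made-up toy sample (3 elution time points × 4 mass channels), not the code used to produce the figure; the function name `total_ion_count` and the toy intensities are assumptions introduced here.

```python
def total_ion_count(scans):
    """Return one TIC value per elution time point.

    `scans` is a list of mass spectra for one sample: scans[t][m] is the
    intensity at elution time point t and mass channel m.
    """
    return [sum(mass_spectrum) for mass_spectrum in scans]


# Hypothetical toy sample: 3 elution time points x 4 mass channels.
toy_sample = [
    [0, 2, 1, 0],
    [5, 3, 0, 1],
    [1, 0, 0, 0],
]

tic = total_ion_count(toy_sample)
print(tic)
```

For one real sample the input would be a 2700×200 slice of the data cube, and the result a vector of 2700 values, one per scan.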
FT-IR data
For all wine samples, 14 quality parameters (Table 2) were predicted from the IR spectra (Figure 2) using the built-in calibration models of the FOSS WineScan.
Figure 2. Typical IR spectrum of one red wine sample. The water band regions around 1545-1710 cm⁻¹ and 2968-3620 cm⁻¹ should be excluded from the data analysis.
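Excluding the water band regions amounts to masking the wavenumber axis. The sketch below shows one way to do this in plain Python on a made-up six-point spectrum. The function names, the toy values, and the choice of inclusive bounds are assumptions made for illustration; it has not been checked that this mask reproduces exactly the water-band-free spectra distributed with the data.

```python
# Water band regions named in the Figure 2 caption, in cm-1.
# Treating the bounds as inclusive is an assumption.
WATER_BANDS = [(1545.0, 1710.0), (2968.0, 3620.0)]


def outside_water_bands(wavenumber):
    """True if the wavenumber lies outside every water band region."""
    return not any(low <= wavenumber <= high for low, high in WATER_BANDS)


def drop_water_bands(wavenumbers, spectrum):
    """Return (axis, values) with all points inside the water bands removed."""
    kept = [(w, a) for w, a in zip(wavenumbers, spectrum) if outside_water_bands(w)]
    axis = [w for w, _ in kept]
    values = [a for _, a in kept]
    return axis, values


# Hypothetical toy spectrum: wavenumbers in cm-1 and absorbances.
toy_axis = [1000.0, 1500.0, 1600.0, 1800.0, 3000.0, 4000.0]
toy_spectrum = [0.10, 0.20, 0.90, 0.30, 0.95, 0.05]

axis, values = drop_water_bands(toy_axis, toy_spectrum)
print(axis)
print(values)
```

Here the points at 1600 and 3000 cm-1 fall inside the two water bands and are dropped, while the other four are kept.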
Table 2. Quality parameters measured on the WineScan instrument and used in MVP (Multiblock Variance Partitioning); units are shown in brackets
# | Quality parameter |
1 | Ethanol (vol. %) |
2 | Total acid (g/L) |
3 | Volatile acid (g/L) |
4 | Malic acid (g/L) |
5 | pH |
6 | Lactic acid (g/L) |
7 | Residual sugar (glucose + fructose) (g/L) |
8 | Citric acid (mg/L) |
9 | CO2 (g/L) |
10 | Density (g/mL) |
11 | Total polyphenol index |
12 | Glycerol (g/L) |
13 | Methanol (vol. %) |
14 | Tartaric acid (g/L) |
Get the data
The data are available as a zipped file in MATLAB 6.x format. Download and unzip the data, then type `load Wine_v6` at the MATLAB prompt.
If you use the data, we would appreciate it if you reported your results to us as a courtesy, in recognition of the work involved in producing and preparing the data. You may also want to cite the data as:
T. Skov, D. Ballabio, R. Bro (2008). Multiblock Variance Partitioning. A new approach for comparing variation in multiple data blocks. Analytica Chimica Acta, 615 (1): 18-29
Zip-file information
Variable | Description | Dimensions |
Aroma_compounds | Peak areas of aroma compounds | 44×57 |
Class | Classes of wines (see Table 1) | 44×1 |
Data_GC | Three-way data | 44×2700×200 |
Elution_profiles | Summed mass dimension – see Figure 1 | 44×2700 |
IR_spectra | IR spectra without waterband | 44×842 |
IR_spectra_with_waterband | IR spectra with waterband – see Figure 2 | 44×1056 |
Label_Aroma_comp | Label aroma compounds | 1×57 |
Label_Elution_time | Label elution time in minutes | 1×2700 |
Label_Mass_channels | Label m/z | 1×200 |
Label_Pred_values_IR | Label quality parameters | 1×14 |
Label_Wine_samples | Label wine samples (ARG: Argentina, AUS: Australia, CHI: Chile, SOU: South Africa) | 44×1 |
Mass_profiles | Summed elution time dimension | 44×200 |
Pred_values_IR | Quality parameters (see Table 2) | 44×14 |
axis_spectra_wavenumber | Axis for spectra in cm⁻¹ | 1×842 |
axis_spectra_with_waterband_wavenumber | Axis for spectra with waterband in cm⁻¹ | 1×1056 |
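For readers working outside MATLAB, the dimensions in the table above can be used as a quick sanity check after loading. The following is a minimal Python sketch under several assumptions: that the unzipped file is named `Wine_v6.mat`, that SciPy is installed (its `scipy.io.loadmat` reader supports older MAT-file versions), and that the numeric variables import with the shapes listed in the table. The label variables are left out because their imported shape depends on how they are stored in the file. The names `EXPECTED_SHAPES`, `check_shapes` and `load_wine` are introduced here for illustration only.

```python
# Expected dimensions of the numeric variables, copied from the table above.
EXPECTED_SHAPES = {
    "Aroma_compounds": (44, 57),
    "Class": (44, 1),
    "Data_GC": (44, 2700, 200),
    "Elution_profiles": (44, 2700),
    "IR_spectra": (44, 842),
    "IR_spectra_with_waterband": (44, 1056),
    "Mass_profiles": (44, 200),
    "Pred_values_IR": (44, 14),
    "axis_spectra_wavenumber": (1, 842),
    "axis_spectra_with_waterband_wavenumber": (1, 1056),
}


def check_shapes(variables):
    """Return {name: (expected, found)} for missing or mis-shaped variables.

    `variables` maps variable names to array-like objects with a `.shape`
    attribute, such as the dictionary returned by scipy.io.loadmat.
    """
    problems = {}
    for name, expected in EXPECTED_SHAPES.items():
        found = getattr(variables.get(name), "shape", None)
        if found is None or tuple(found) != expected:
            problems[name] = (expected, found)
    return problems


def load_wine(path="Wine_v6.mat"):
    """Load the MAT-file into a dict; the file name is an assumption."""
    from scipy.io import loadmat  # third-party dependency, imported lazily
    return loadmat(path)
```

A typical use would be `problems = check_shapes(load_wine())`, where an empty dictionary means that every listed variable was found with the expected dimensions.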