
This dataset explores how the chemical composition of white wine relates to consumer-perceived quality. It pairs instrumental measurements of 89 German white wines with crowd-sourced ratings from the Vivino platform. Wines were analyzed using:
- GC-MS (Gas Chromatography–Mass Spectrometry) for volatile organic compounds (VOCs)
- FT-IR (Fourier-Transform Infrared Spectroscopy) for physicochemical parameters

After data preprocessing, mid-level data fusion was applied to combine the two analytical platforms. A Partial Least Squares (PLS) regression model was then trained to predict Vivino ratings based on the fused chemical dataset.

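
The fusion step can be pictured as scaling each analytical block and joining the blocks column-wise, so that every wine is described by one combined row. The sketch below is only an illustration under stated assumptions: it uses synthetic random numbers in place of the real measurements, the helper names `autoscale` and `fuse` are hypothetical, and autoscaling is assumed as the block pretreatment. The preprocessing and feature-extraction steps actually applied to this dataset are not described here and may differ.

```python
import random
import statistics


def autoscale(block):
    """Mean-center each column and scale it to unit sample standard deviation."""
    scaled_columns = []
    for col in zip(*block):
        mean = statistics.fmean(col)
        sd = statistics.stdev(col)
        # Leave constant columns at zero rather than dividing by zero.
        scaled_columns.append([(v - mean) / sd if sd > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_columns)]


def fuse(block_a, block_b):
    """Concatenate two sample-aligned blocks column-wise after autoscaling."""
    if len(block_a) != len(block_b):
        raise ValueError("blocks must describe the same samples in the same order")
    return [a + b for a, b in zip(autoscale(block_a), autoscale(block_b))]


# Synthetic placeholder data (NOT the real measurements), using the
# dimensions listed in this README: 89 wines, 145 GC-MS and 18 FT-IR variables.
rng = random.Random(0)
n_samples, n_gcms, n_ftir = 89, 145, 18
gcms = [[rng.lognormvariate(0, 1) for _ in range(n_gcms)] for _ in range(n_samples)]
ftir = [[rng.gauss(10, 2) for _ in range(n_ftir)] for _ in range(n_samples)]

fused = fuse(gcms, ftir)
print(len(fused), len(fused[0]))  # 89 163
```

A fused matrix of this shape could then be passed, together with the Vivino ratings, to any PLS implementation, for example `PLSRegression` from scikit-learn's `sklearn.cross_decomposition` module.
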
| Component | Details |
|---|---|
| Number of wine samples | 89 (64 used in the prediction model) |
| GC-MS variables (VOC features) | 145 |
| FT-IR variables | 18 |
| Fused data matrix size | 89 samples × 163 variables |
| Vivino rating range | 3.1 – 4.2 (on a 1–5 scale) |
| PLS model performance | R² = 0.98 (calibration), Q² = 0.74 (cross-validation), RMSECV = 0.09 |

Key predictive compounds include esters, terpenoids, and reducing sugars. This is one of the first studies to link analytical wine chemistry to consumer quality perception via crowd-sourced data.
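
For readers unfamiliar with the performance figures in the table, the sketch below shows how R², Q², and RMSECV are commonly defined. R² compares ratings with predictions from a model fitted on the same samples, while Q² and RMSECV apply the same formulas to predictions made for samples held out during cross-validation. The ratings and predictions are hypothetical values chosen for illustration, not data from this study, and some chemometrics packages compute Q² with slightly different conventions.

```python
import math


def r_squared(y_true, y_pred):
    """Return 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Return the root mean squared prediction error."""
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(ss_res / len(y_true))


# Hypothetical ratings and predictions (illustration only).
ratings = [3.1, 3.5, 3.8, 4.2]
fitted = [3.15, 3.45, 3.85, 4.15]  # predictions for samples the model was fitted on
held_out = [3.3, 3.4, 3.9, 4.0]    # predictions made while each sample was left out

r2 = r_squared(ratings, fitted)    # calibration R²
q2 = r_squared(ratings, held_out)  # cross-validated Q²
rmsecv = rmse(ratings, held_out)   # root mean squared error of cross-validation

print(round(r2, 3), round(q2, 3), round(rmsecv, 3))  # 0.985 0.846 0.158
```

Q² is normally lower than R² because held-out samples do not contribute to the fit that predicts them, so a gap such as the one between 0.98 and 0.74 reported above reflects how optimistic the calibration fit is.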