Uncorking Wine Quality: Predicting Consumer Ratings from Analytical Chemistry

This dataset explores how the chemical composition of white wine relates to consumer-perceived quality. It pairs instrumental measurements of 89 German white wines with crowd-sourced ratings from the Vivino platform. The wines were analyzed using:

  • GC-MS (Gas Chromatography–Mass Spectrometry) for volatile organic compounds (VOCs)
  • FT-IR (Fourier-Transform Infrared Spectroscopy) for physicochemical parameters

After data preprocessing, mid-level data fusion was applied to combine the data from the two analytical platforms. A Partial Least Squares (PLS) regression model was then trained to predict Vivino ratings from the fused chemical dataset.
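
The fusion and regression steps can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline. It assumes that each block (GC-MS features, FT-IR parameters) is autoscaled on its own and that the blocks are then joined column-wise, which would be consistent with 145 + 18 = 163 fused variables. The PLS part is a textbook NIPALS PLS1 for a single response. The function names (autoscale, fuse, pls1_fit, pls1_predict) are hypothetical, and the preprocessing actually used may differ.

```python
from statistics import mean, stdev


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def autoscale(block):
    """Mean-center each column and scale it to unit standard deviation."""
    cols = list(zip(*block))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) or 1.0 for c in cols]  # constant columns are left unscaled
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in block]


def fuse(gcms, ftir):
    """Assumed fusion step: scale each block separately, then join per sample."""
    if len(gcms) != len(ftir):
        raise ValueError("both blocks must describe the same samples")
    return [a + b for a, b in zip(autoscale(gcms), autoscale(ftir))]


def pls1_fit(X, y, n_components):
    """NIPALS PLS1: extract weight/loading pairs by successive deflation."""
    x_mean = [mean(c) for c in zip(*X)]
    y_mean = mean(y)
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]  # centered X
    f = [v - y_mean for v in y]                              # centered y
    comps = []
    for _ in range(n_components):
        w = [dot(col, f) for col in zip(*E)]   # direction of max covariance
        norm = dot(w, w) ** 0.5
        if norm == 0:                          # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]         # sample scores
        tt = dot(t, t)
        p = [dot(col, t) / tt for col in zip(*E)]  # X loadings
        q = dot(f, t) / tt                         # y loading
        E = [[v - ti * pj for v, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps


def pls1_predict(model, X):
    """Predict by repeating the training deflation on each new sample."""
    x_mean, y_mean, comps = model
    preds = []
    for row in X:
        e = [v - m for v, m in zip(row, x_mean)]
        y_hat = y_mean
        for w, p, q in comps:
            t = dot(e, w)
            y_hat += q * t
            e = [v - t * pj for v, pj in zip(e, p)]
        preds.append(y_hat)
    return preds
```

With two lists of per-sample rows and a list of ratings, a call such as pls1_fit(fuse(gcms, ftir), ratings, n_components=3) would return a model that pls1_predict can apply to fused rows. The value 3 is only a placeholder, since the number of latent variables is not stated here. In practice a maintained implementation such as scikit-learn's PLSRegression would be the usual choice.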

Component                        Details
Number of wine samples           89 (64 used in prediction model)
GC-MS variables (VOC features)   145
FT-IR variables                  18
Fused data matrix size           89 samples × 163 variables
Vivino rating range              3.1 – 4.2 (on a 1–5 scale)
PLS model performance            R² = 0.98 (calibration), Q² = 0.74 (CV), RMSECV = 0.09
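
For readers less familiar with the performance figures, the sketch below shows how Q² and RMSECV are conventionally computed from cross-validated predictions. It assumes the standard chemometric definitions: RMSECV is the root mean squared error of the held-out predictions, and Q² is 1 minus PRESS divided by the total sum of squares. The cross-validation scheme used for this dataset is not stated here, and the function names are hypothetical. Calibration R² uses the same formula as Q², but with predictions from the model fitted on all calibration samples.

```python
from math import sqrt


def rmsecv(y_true, y_cv):
    """Root mean squared error of cross-validated predictions."""
    press = sum((a - b) ** 2 for a, b in zip(y_true, y_cv))
    return sqrt(press / len(y_true))


def q2(y_true, y_cv):
    """Cross-validated R²: 1 - PRESS / total sum of squares."""
    y_bar = sum(y_true) / len(y_true)
    press = sum((a - b) ** 2 for a, b in zip(y_true, y_cv))
    ss_tot = sum((a - y_bar) ** 2 for a in y_true)
    return 1.0 - press / ss_tot
```

Here y_true holds the observed ratings and y_cv holds the prediction made for each sample while it was left out of model fitting.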

Key predictive compounds include esters, terpenoids, and reducing sugar. This is one of the first studies to link analytical wine chemistry to consumer quality perception via crowd-sourced data.

