
Data Quality: Importance of the ‘Before Analysis’ Domain [Theory of Sampling (TOS)]

Data Quality: what is it, where does it originate, how does it influence data modelling, what can chemometricians do about it? The ‘before analysis’ domain is prone to sampling errors resulting in uncertainties influencing the quality of both analysis and data analysis/data modelling. Nonrepresentative sampling of heterogeneous materials, batches, lots and process streams ‘before analysis’ contributes significantly to the total measurement uncertainty, MUtotal = MUsampling + MUanalysis. The total sampling error (TSE) can dominate over the total analytical error (TAE) by factors of 5, 10 or higher, depending on the degree of material heterogeneity encountered and the specific sampling procedure employed to produce the final analytical aliquot, which is the only material actually analysed. The analytical aliquot is the physical manifestation of transgressing the boundary from the ‘before analysis’ (sampling) domain to the domain of analysis. It is only possible to guarantee representativity of the analytical aliquot, and thus of the analytical results with respect to the original target batch/lot/process stream, by invoking the necessary sampling domain competence stipulated by the theory of sampling (TOS). Primary sampling is the most important stage in the full lot‐to‐analysis pathway, quantitatively dominating MUtotal (but subsequent subsampling stages can also be significant). If the sources of adverse sampling error effects have not been eliminated, the sampling process is biased and MUtotal will be unnecessarily inflated. TOS offers ways and means to deal actively with a potential sampling bias (which is fundamentally different from the analytical bias). Overlooking sampling effects, or deliberately declining to deal with them appropriately, constitutes a lack of due diligence, which has a critical bearing on the QC/QA demands on both analysis and data analysis/modelling.
This article presents all uncertainty contributions in the lot‐to‐analysis‐to‐data modelling pathway, which must be identified and managed, eliminated or maximally reduced, to be able to document a fully minimised MUtotal. Data analysts/chemometricians are part of a scientific collegium covering all three domains: sampling—analysis—data modelling, which are collectively responsible for ‘data quality’. This comprehensive scope has serious implications for the current PAT paradigm, the foundation of which turns out to need significant reform regarding a key process sampling aspect regardless of whether physical samples, or PAT sensor technology spectra, are extracted/acquired. This article introduces the essential minimum TOS competence that must be mastered by stakeholders from all three domains.
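
To give a feel for what a TSE that is 5 or 10 times the TAE means for MUtotal, the following is a minimal numeric sketch, not the article's own calculation. It assumes that the two contributions are independent standard uncertainties whose variances add (a common convention in uncertainty budgets, which the abstract itself does not spell out). The function names and the 1 % analytical uncertainty are hypothetical values chosen for illustration only.

```python
import math


def combine_uncertainty(u_sampling: float, u_analysis: float) -> float:
    """Combine two standard uncertainties in quadrature.

    Assumption (not from the abstract): the sampling and analytical
    contributions are independent, so their variances add.
    """
    return math.sqrt(u_sampling ** 2 + u_analysis ** 2)


def sampling_share(u_sampling: float, u_analysis: float) -> float:
    """Fraction of the total variance that originates from sampling."""
    return u_sampling ** 2 / (u_sampling ** 2 + u_analysis ** 2)


# Hypothetical analytical uncertainty of 1 % (relative standard uncertainty).
u_analysis = 1.0

# The abstract mentions sampling errors 5 or 10 times the analytical error.
for factor in (5, 10):
    u_sampling = factor * u_analysis
    total = combine_uncertainty(u_sampling, u_analysis)
    share = sampling_share(u_sampling, u_analysis)
    print(f"factor {factor}: total = {total:.2f} %, "
          f"sampling share of variance = {share:.1%}")
```

Under these assumptions, sampling accounts for roughly 96 % of the total variance at a factor of 5 and about 99 % at a factor of 10, so improving the analytical method alone would barely change the total.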

Keywords: quality; data modelling; analysis data; analysis; data quality; domain

Journal Title: Journal of Chemometrics
Year Published: 2025
