Abstract

Correlation studies of chromatographic data are a common strategy for relating complex mixtures of chemical components. The congruence coefficient and the correlation coefficient are commonly used to indicate the similarity or correlation between different hyphenated chromatograms. However, these indices typically reduce each chromatogram to a single dimension, so the information in the other dimension is not fully used. In this work, a new technique is developed to identify possible relationships among related high-dimensional data sets using chemometric tools. First, principal component analysis is used to reduce experimental noise by reconstructing the original data sets. Then, canonical correlation analysis is used to obtain the canonical vectors of both data sets for comparison, which makes it easier to identify possible relationships between the data sets. An orthogonal projection operation is then applied to identify both the common and the different information between the matrix spaces spanned by the canonical vectors. Finally, correlation and uncorrelation indices are defined in both the chromatographic and the spectral directions on the basis of the Euclidean distance of all the elements of the final projection matrices. The new indices are more representative because they use all of the data embedded in hyphenated chromatograms. In contrast to the conventional coefficients, the proposed indices provide improved performance on a simulated HPLC-DAD data set and on 12 real GC-MS data sets of ginseng, a widely used herbal medicine. The effects of various potential factors on the results are also investigated.
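The four steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract gives no formulas, so the uncentered form of canonical correlation analysis, the normalisation of the indices by the squared Frobenius norm, the number of components `k`, and all function names and simulated data here are hypothetical choices made for the example.

```python
import numpy as np

def pca_reconstruct(X, k):
    """Rank-k reconstruction of X by truncated SVD (the PCA denoising step)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

def canonical_vectors(X, Y, k):
    """Canonical vectors of the column spaces of X and Y.

    Assumption: uncentered CCA computed from orthonormal bases, so the
    singular values rho are the cosines of the principal angles.
    """
    Qx = np.linalg.svd(X, full_matrices=False)[0][:, :k]
    Qy = np.linalg.svd(Y, full_matrices=False)[0][:, :k]
    U, rho, Vt = np.linalg.svd(Qx.T @ Qy)
    return Qx @ U, Qy @ Vt.T, rho

def projection_indices(A, B):
    """Project B onto span(A); return (common, different) fractions.

    Assumption: squared Frobenius norms divided by ||B||^2. The abstract
    does not state the normalisation actually used in the paper.
    """
    P = A @ np.linalg.pinv(A)      # orthogonal projector onto span(A)
    common = P @ B                 # part of B shared with A
    different = B - common         # part of B orthogonal to A
    total = np.linalg.norm(B) ** 2
    return (np.linalg.norm(common) ** 2 / total,
            np.linalg.norm(different) ** 2 / total)

def compare(X, Y, k):
    """Indices in the chromatographic direction (rows = retention times).

    Pass X.T and Y.T to obtain the spectral-direction indices instead.
    """
    A, B, _ = canonical_vectors(pca_reconstruct(X, k), pca_reconstruct(Y, k), k)
    return projection_indices(A, B)

def congruence(x, y):
    """Conventional congruence coefficient of two one-dimensional profiles."""
    return float(x @ y / np.sqrt((x @ x) * (y @ y)))

# Hypothetical simulated two-way data: 60 retention times x 40 wavelengths.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 60)[:, None]
peaks = np.exp(-((t - np.array([0.3, 0.5, 0.7])) ** 2) / 0.005)   # 60 x 3
spectra = rng.random((3, 40))
noise = lambda: 0.01 * rng.standard_normal((60, 40))
X = peaks[:, [0, 1]] @ spectra[[0, 1]] + noise()   # components 0 and 1
Y = peaks[:, [0, 2]] @ spectra[[0, 2]] + noise()   # components 0 and 2

corr_chrom, uncorr_chrom = compare(X, Y, k=2)
corr_spec, uncorr_spec = compare(X.T, Y.T, k=2)
corr_same, uncorr_same = compare(X, X, k=2)
# One-dimensional baseline: congruence of the summed elution profiles.
baseline = congruence(X.sum(axis=1), Y.sum(axis=1))
print(corr_chrom, uncorr_chrom, corr_spec, uncorr_spec, baseline)
```

Because the projector is orthogonal, the two fractions in this sketch always sum to one, and comparing a data set with itself gives a correlation index of one. The `baseline` value shows the kind of single-dimension summary that the abstract contrasts with the two-direction indices.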
               