Fig. 1. Correlation coefficients among 30 metabolites after adjustment for age in (A) BC patients and (B) healthy controls. Blue and red represent negative and positive correlations, respectively. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
significantly representative of 7 pathways. Associations among these 30 metabolites are shown in Fig. 1. Metabolites were generally more positively correlated with each other in BC patients than in controls, and the correlations of certain metabolites ran in opposite directions in the two groups (Fig. 1).
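The age adjustment behind Fig. 1 can be illustrated with a partial correlation. The source does not state which adjustment procedure was used, so the sketch below assumes the common approach of regressing each metabolite on age and correlating the residuals. The function names and the numeric values are hypothetical.

```python
from math import sqrt

def residuals(y, x):
    """Residuals of y after a simple least-squares regression on x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)

def age_adjusted_correlation(metab_1, metab_2, age):
    """Assumed adjustment: correlate what is left after removing the linear age effect."""
    return pearson(residuals(metab_1, age), residuals(metab_2, age))

# Hypothetical values: both metabolites rise with age, but their
# age-independent parts move in opposite directions.
age = [30, 40, 50, 60, 70]
noise = [1.0, -1.0, 0.5, -0.5, 0.0]
metab_1 = [a + e for a, e in zip(age, noise)]
metab_2 = [2 * a - e for a, e in zip(age, noise)]
raw_r = pearson(metab_1, metab_2)
adjusted_r = age_adjusted_correlation(metab_1, metab_2, age)
```

In this toy example the raw correlation is strongly positive while the age-adjusted correlation is negative, which shows why the sign of a correlation can change once a shared covariate is removed.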
As shown in Table 2, 18 of these 30 metabolites differed significantly between BC patients and healthy controls (q < 0.05). With the exception of hypoxanthine, acetylglycine, and three metabolites related to fatty acid metabolism (nonadecanoic acid, palmitic acid, and stearic acid), all other metabolites were decreased in BC patients compared to healthy controls. Considerable fold change values (FC > 1) and significant p-values, calculated from mean ratios and univariate testing, respectively, were also observed for 17 of the 18 significant between-group metabolites when comparing staged cancer patients with healthy controls (Table 3). Of these, 8 metabolites were significant between stage I patients and controls, whereas 9 were significant between stage II patients and controls. When comparing early-stage BC patients (stages I and II) with control subjects, 6 metabolites were significant; of these 6, 4 were also significant in both the stage I/control and stage II/control comparisons. Thirteen metabolites were significant between stage III patients and controls. Similarly, thirteen metabolites had p < 0.001 (with FC ranging from 0.63 to 1.60) when comparing all-stage BC patients with healthy controls (Table 2).
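The text states that fold change values were calculated from mean ratios. A minimal sketch of that calculation follows. The direction of the ratio (patients over controls) is an assumption, chosen because it matches decreased metabolites having FC < 1, and the function name and values are hypothetical.

```python
def fold_change(patient_levels, control_levels):
    """Ratio of group means (assumed direction: patients / controls)."""
    mean_patients = sum(patient_levels) / len(patient_levels)
    mean_controls = sum(control_levels) / len(control_levels)
    return mean_patients / mean_controls

# Hypothetical relative abundances of one metabolite.
fc = fold_change([2.0, 4.0], [4.0, 8.0])  # FC < 1: lower in patients
```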
We compared patients with different molecular subtypes (ER/PR+, HER2+ vs. ER/PR+, HER2−), and compared triple-negative (ER−, PR−, and HER2−) with non-triple-negative patients. These results are shown in Table 4. No significant difference in metabolites was observed between these groups (all FDR q > 0.05). Interestingly, 15 metabolites had p-values < 0.05 when comparing ER, PR, HER2 status, and cancer stage among BC patients, although none of the 30 metabolites differed significantly between the above-mentioned comparison groups after FDR correction for multiple comparisons (Table 4).
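The gap between nominal p-values below 0.05 and q-values above 0.05 follows from how FDR correction rescales p-values. The source does not name the FDR procedure, so the sketch below assumes the widely used Benjamini-Hochberg method. The function name and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values

# Hypothetical p-values: three are below 0.05 before correction.
p = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(p)
significant_after_fdr = [qi < 0.05 for qi in q]
```

With these values only the smallest p-value remains below 0.05 after correction, which mirrors how a metabolite can be nominally significant yet fail the q < 0.05 criterion.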
Table 2. Significant metabolites for comparison of BC patients and healthy controls.
Metabolite   AUC   p-Value(a)   FDR q-value   Fold change   VIP(b)
(a) p-Values calculated from univariate GLM testing.
(b) VIP values obtained from the age-enhanced PLS-DA model (see Fig. 2).
3.2. Biomarker selection and evaluation of classification performance
To further explore potential biomarkers for discrimination between BC patients and healthy controls, levels of the 30 comparative metabolites were used to establish an initial PLS-DA model. As can be seen in Supplemental Fig. S3, a separation trend was observed in the initial PLS-DA score plot [R2X (cum) = 0.291, R2Y (cum) = 0.398, Q2 (cum) = 0.312]. Clinical factors (e.g., gender, age, and medication) have often been incorporated in the building of predictive or diagnostic clinical models, and such variables have recently been used to enhance metabolite biomarker models. To enhance the VIP metabolite model, age was included as a clinical factor. The enhanced metabolite model (Fig. 2A) showed a distinct separation trend between the two groups [R2X (cum) = 0.709, R2Y (cum) = 0.481, Q2 (cum) = 0.417], which indicated better performance than the initial PLS-DA model (Supplemental Fig. S3). To validate the reliability of the enhanced prediction model, a permutation test (n = 300) was conducted (Fig. 2B). The Q2-intercept value (−0.158) of the predictive model was lower than 0.05, indicating the model to be statistically sound. The resulting VIP values are listed in Table 2.
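For readers less familiar with the reported model statistics, R2Y and Q2 are commonly defined as below. These are the usual textbook forms, not formulas taken from the source. The symbols are assumptions introduced here: y_i is the class label of subject i, \hat{y}_i the fitted value, \hat{y}_{(i)} the cross-validated prediction made when subject i is left out of model fitting, and \bar{y} the mean label.

```latex
\[
R^2Y = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^2}{\sum_{i} (y_i - \bar{y})^2},
\qquad
Q^2 = 1 - \frac{\sum_{i} (y_i - \hat{y}_{(i)})^2}{\sum_{i} (y_i - \bar{y})^2}
\]
```

R2Y therefore measures how well the model fits the class labels it was trained on, whereas Q2 measures how well it predicts labels for held-out subjects. R2X is the analogous fraction of variance explained in the predictor matrix.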
According to the VIP values from the enhanced PLS-DA model (VIP > 1) and the q-values from FDR-controlled comparisons (q < 0.05), 6 statistically significant, highly predictive features were retained for further analysis (Fig. 3). A third PLS-DA model was built using the 6 differential metabolites and subject age to distinguish BC patients from healthy controls. As can be seen in Fig. 4, the resulting PLS-DA model proved to be powerful in distinguishing BC patients from healthy controls, with an AUROC of 0.89 (95% CI: 0.85–0.93; sensitivity = 0.80, specificity = 0.75), which was higher than that of each individual metabolite (see Table 2).
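The classification metrics reported above can be computed from model scores as sketched below. This is a generic illustration, not the authors' software. It assumes that higher scores indicate BC, and the function names, scores, and threshold are hypothetical.

```python
def auroc(scores_cases, scores_controls):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for case_score in scores_cases:
        for control_score in scores_controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_controls))

def sensitivity_specificity(scores_cases, scores_controls, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    true_positives = sum(1 for s in scores_cases if s >= threshold)
    true_negatives = sum(1 for s in scores_controls if s < threshold)
    return (true_positives / len(scores_cases),
            true_negatives / len(scores_controls))

# Hypothetical predicted scores for four patients and four controls.
cases = [0.9, 0.8, 0.6, 0.4]
controls = [0.7, 0.5, 0.3, 0.2]
area = auroc(cases, controls)
sens, spec = sensitivity_specificity(cases, controls, threshold=0.5)
```

The AUROC summarizes performance over all thresholds, whereas sensitivity and specificity describe a single operating point, which is why the two are reported together.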