Fig. 6

Exploration of Support Vector Machines to Uncover Prospective Biomarkers Capable of Classifying Early-Stage HGSC vs Benign Gynecological Disease. Peptides with p < 0.05 were selected as features for support vector machine (SVM) model training and validation. HGSC and Benign donors were split into training and test data sets. SVM training with leave-one-out cross-validation (LOOCV) was used to determine the optimal cost (C) and the number of principal components or features that maximized prediction accuracy, as measured by the Matthews correlation coefficient or mean accuracy score. In these analyses, model accuracy increased with the number of features, whereas the cost weight had less influence. Using two-feature models, we identified several peptide combinations that provided high sensitivity and specificity on the test data set. For example, (B, C) an SVM using APOC4 and MUC1 accurately classified 8 out of 10 HGSC donors and 10 out of 10 Benign donors. The ROC-AUC for this model was 0.90.