Fig. 2

Hierarchical clustering of histotype-specific genes (HSGs) in the GSE44104 test cohort dataset and boxplots of HSG expression in the training cohort. A Heatmaps with hierarchical clustering (Euclidean distance, Ward.D2 clustering criterion) of expression data in GSE44104: (left) the 287 probes mapping to 145 HSGs, (right) the 287 most variable probes. Colors indicate z-scores of gene expression; genes and samples were clustered separately. B-D Boxplots illustrating aberrant gene expression patterns (log2 normalized counts) in the training cohort for the top four HSGs of (B) CCC, (C) HGSC, and (D) MC, showing between-group variance in expression. Values above the boxplots are Wilcoxon test p-values. CCC: clear cell carcinoma; EC: endometrioid carcinoma; HGSC: high-grade serous carcinoma; MC: mucinous carcinoma; Z-score: measure of a value relative to the mean of the group of values to which it belongs; Gene-coverage: the number of HSGs with a matching probe in the dataset relative to the total number of HSGs.
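
The two derived quantities defined in the caption, the z-score and the gene-coverage, can be illustrated with a minimal sketch. It assumes that z-scores are computed per gene across samples using the sample standard deviation, which the caption does not state; the function names, gene identifiers, and expression values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def row_z_scores(values):
    """Scale one gene's expression values across samples to z-scores.

    Assumption: z = (x - mean) / sample standard deviation (n - 1).
    """
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        # A gene with constant expression carries no relative signal.
        return [0.0 for _ in values]
    return [(x - mu) / sd for x in values]


def gene_coverage(hsgs, genes_with_probe):
    """Fraction of HSGs that have at least one matching probe in a dataset."""
    hsg_set = set(hsgs)
    return len(hsg_set & set(genes_with_probe)) / len(hsg_set)


# Hypothetical toy data: one gene measured in three samples.
z = row_z_scores([2.0, 4.0, 6.0])

# Hypothetical identifiers: two of four HSGs have a probe on the platform.
cov = gene_coverage(["G1", "G2", "G3", "G4"], ["G1", "G3", "G9"])
```

With these toy inputs the gene has mean 4 and sample standard deviation 2, and half of the hypothetical HSGs are covered by a probe.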