
Program Information

Classification of Prostate Cancer Gleason Scores Through Machine Learning From Multiparametric MRI

D Fehr*, A Wibmer, T Gondo, K Matsumoto, H Vargas, E Sala, H Hricak, J Deasy, H Veeraraghavan, Memorial Sloan Kettering Cancer Center, New York, NY


TU-AB-BRA-1 (Tuesday, July 14, 2015) 7:30 AM - 9:30 AM Room: Ballroom A

Purpose: To develop a machine learning-based method for automatically classifying Gleason Score (GS) 3+4=7 versus 4+3=7 prostate cancers (PCa) from multiparametric MRI (mpMRI) combined with first- and second-order texture features.

Methods: A total of 140 PCa lesions were delineated in the peripheral and transition zones from patients undergoing mpMRI 6 months pre-prostatectomy. Lesions were contoured on T2-weighted MRI and apparent diffusion coefficient (ADC) maps by correlating the images with step-section pathology maps of the excised prostates. Of these tumors, 114 had GS 3+4 and 26 had GS 4+3. First-order (mean, standard deviation, skewness, kurtosis) and second-order (energy, entropy, correlation, homogeneity, contrast) texture features were analyzed. Because of the large class imbalance between the two tumor groups, we employed two different sample augmentation techniques: Gibbs and SMOTE oversampling. Each class was oversampled to 200 samples. We evaluated the performance of three different classifiers using 10-fold cross-validation: t-test support vector machine (t-test SVM), recursive feature elimination SVM (RFE-SVM), and adaptive boosting (AdaBoost).
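The SMOTE-style oversampling named in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote_oversample`, the neighbor count `k=5`, the random seeds, and the two-feature toy data are all assumptions made for illustration. The sketch shows the core idea only: each synthetic sample is placed on the line segment between an existing sample and one of its nearest same-class neighbors, until the class reaches the target size (200 in the abstract).

```python
import math
import random


def smote_oversample(samples, target_size, k=5, seed=0):
    """Simplified SMOTE-style oversampling for a single class (illustrative only).

    Returns the original samples followed by synthetic ones, so that the
    result holds `target_size` samples in total.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(samples) - 1)
    result = [list(s) for s in samples]
    while len(result) < target_size:
        # Pick a random original sample as the base point.
        i = rng.randrange(len(samples))
        base = samples[i]
        # Find its k nearest neighbors among the other original samples.
        others = [s for j, s in enumerate(samples) if j != i]
        neighbors = sorted(others, key=lambda s: math.dist(base, s))[:k]
        mate = rng.choice(neighbors)
        # Interpolate at a random position between base and neighbor.
        u = rng.random()
        result.append([b + u * (m - b) for b, m in zip(base, mate)])
    return result


# Hypothetical toy data: 26 minority-class samples with two made-up features.
toy_rng = random.Random(1)
minority = [[toy_rng.gauss(0.0, 1.0), toy_rng.gauss(5.0, 2.0)] for _ in range(26)]

augmented = smote_oversample(minority, target_size=200)
print(len(augmented))  # 200
```

Because every synthetic point is an interpolation between two real samples, the augmented class stays inside the range of the observed feature values rather than extrapolating beyond it.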

Results: Values in parentheses are accuracy and Youden Index. RFE-SVM achieved the best classification performance (97%, 0.93) when trained using samples generated by the SMOTE method, followed by t-test SVM (78%, 0.56) and AdaBoost (73%, 0.46). When trained using samples generated by Gibbs oversampling, the performance of the classifiers was: RFE-SVM (90%, 0.80), t-test SVM (71%, 0.41), and AdaBoost (76%, 0.52). In comparison, the classification performance of the same methods without sample augmentation was: RFE-SVM (83%, 0.11), t-test SVM (81%, 0.00), and AdaBoost (79%, 0.41). Classification with only mean ADC and mean T2 achieved at best an accuracy of 67% and a Youden Index of 0.34, using Gibbs sampling.
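The Youden Index reported alongside accuracy is defined as sensitivity + specificity - 1, which makes it robust to class imbalance in a way that accuracy is not. The short sketch below shows why, using a hypothetical classifier that labels every tumor as GS 3+4. The function name `accuracy_and_youden` and the choice of GS 4+3 as the positive class are assumptions for illustration. With the 114/26 class split from the abstract, such a classifier yields the same pair of values as the t-test SVM result without augmentation (81%, 0.00), although the abstract does not state that this is what that classifier did.

```python
def accuracy_and_youden(tp, fn, tn, fp):
    """Return (accuracy, Youden Index) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity + specificity - 1.0


# Hypothetical majority-class predictor: positive = GS 4+3 (26 tumors),
# negative = GS 3+4 (114 tumors), and every tumor is predicted negative.
acc, youden = accuracy_and_youden(tp=0, fn=26, tn=114, fp=0)
print(f"accuracy={acc:.2f}, Youden Index={youden:.2f}")
# accuracy=0.81, Youden Index=0.00
```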

Conclusion: We developed a fully automated method for classifying GS of PCa from multiparametric images using texture features combined with sample augmentation. Our method boosts classifier performance despite highly unbalanced datasets. Furthermore, our results show that texture features enable the classifiers to achieve a much higher accuracy than classification using only mean ADC and mean T2 values.
