Assessment of Different Machine Learning Techniques for Multivariate Radiation Pneumonitis Modeling
S Lee1*, J Bradley2, N Ybarra1, K Jeyaseelan1, J Seuntjens1, I El Naqa1, (1) Medical Physics Unit, McGill University, Montreal, QC, (2) Washington University School of Medicine, St. Louis, MO
TU-G-108-5 Tuesday 4:30PM - 6:00PM Room: 108
Purpose: To perform an independent verification of different machine learning methods for inferring radiation pneumonitis (RP) risk from patient-specific biological and dosimetric information.
Methods: Twenty-nine NSCLC patients who received chemoradiation were recruited from two institutions (22 from WUSTL, 7 from McGill). Blood samples were collected from each patient before and during radiotherapy (RT). From each sample, the concentrations of the following five candidate biomarkers were measured: alpha-2-macroglobulin (α2M), angiotensin converting enzyme (ACE), transforming growth factor β (TGF-β), interleukin-6 (IL-6), and osteopontin (OPN). The dimensionality of the raw biomarker data was reduced by a semi-supervised variable selection scheme. The reduced biomarker variables, along with three known dosimetric RP predictors (mean lung dose, V20, and tumor position along the superior-inferior direction of the lung), were used as features for classifying patients at high risk of RP (CTCAE grade 2 or higher). Four machine learning methods (Bayesian Network, Logistic Regression, Naive Bayes, Support Vector Machine) were trained on the WUSTL subset and evaluated for classification performance on the McGill subset.
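The train-on-one-institution, test-on-the-other design described above can be sketched as follows. This is only an illustration under stated assumptions: the patient records are synthetic, the labels are arbitrary, and the small Gaussian Naive Bayes written here stands in for whichever implementations the authors actually used, which the abstract does not specify. The names `make_patient`, `fit_gaussian_nb`, `predict`, and `accuracy` are hypothetical helpers.

```python
import math
import random

random.seed(0)  # reproducible synthetic data; NOT the study's patient data

N_FEATURES = 7  # 4 reduced biomarker variables + 3 dosimetric predictors


def make_patient(site, index):
    """Build one synthetic patient record (hypothetical values only)."""
    label = 1 if index % 3 == 0 else 0  # arbitrary labels so both classes occur
    features = [random.gauss(float(label), 1.0) for _ in range(N_FEATURES)]
    return {"site": site, "x": features, "y": label}


def fit_gaussian_nb(X, y):
    """Estimate a log prior and per-feature mean/variance for each class."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, yi in zip(X, y) if yi == c]
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # A small variance floor avoids division by zero on tiny samples.
        variances = [
            max(sum((v - m) ** 2 for v in col) / len(col), 1e-6)
            for col, m in zip(columns, means)
        ]
        model[c] = (math.log(len(rows) / len(y)), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    def log_posterior(c):
        log_prior, means, variances = model[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
    return max(model, key=log_posterior)


def accuracy(model, patients):
    """Fraction of patients whose predicted label matches the true label."""
    hits = sum(predict(model, p["x"]) == p["y"] for p in patients)
    return hits / len(patients)


# Cohort sizes follow the abstract: 22 WUSTL patients, 7 McGill patients.
patients = [make_patient("WUSTL", i) for i in range(22)]
patients += [make_patient("McGill", i) for i in range(7)]

# Split by institution rather than at random: train on WUSTL, test on McGill.
train = [p for p in patients if p["site"] == "WUSTL"]
test = [p for p in patients if p["site"] == "McGill"]

model = fit_gaussian_nb([p["x"] for p in train], [p["y"] for p in train])
preds = [predict(model, p["x"]) for p in test]

print(f"train accuracy: {accuracy(model, train):.2f}")
print(f"test accuracy:  {accuracy(model, test):.2f}")
```

Comparing the training and test accuracies in this way is one simple check for overfitting: a large gap between the two suggests the model has fit noise in the training institution's data.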
Results: The following four biomarker variables were chosen by the variable filtering: (1) pre-RT concentration of α2M, (2) ratio of pre- to intra-RT levels of α2M, (3) intra-RT ACE level, and (4) intra-RT TGF-β level. Support Vector Machine achieved the best performance in terms of both classification accuracy (85.71%) and area under the ROC curve (AUC, 0.9167). Bayesian Network recorded the same accuracy and a slightly lower AUC (0.8750). Naive Bayes and Logistic Regression were less accurate; both showed pronounced overfitting, with the largest differences in classification performance between the training and testing datasets.
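To make the two reported metrics concrete, the sketch below computes accuracy and AUC for a seven-patient test set. The labels and scores are entirely hypothetical, as is the split into three positive and four negative cases; the abstract reports neither per-patient outputs nor the class balance. The example mainly shows how coarse these metrics are at this sample size: with seven patients, accuracy moves in steps of 1/7, so 85.71% corresponds to six of seven correct classifications if all seven were scored. AUC is computed here as the fraction of positive-negative pairs ranked correctly (the Mann-Whitney formulation), with ties counted as half.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases where thresholding the score recovers the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical test set: 3 high-risk (1) and 4 low-risk (0) patients.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.90, 0.80, 0.55, 0.60, 0.30, 0.20, 0.10]

acc = accuracy(labels, scores)
roc_auc = auc(labels, scores)

print(f"accuracy = {acc:.4f}")   # 6 of 7 correct -> 0.8571
print(f"AUC      = {roc_auc:.4f}")  # 11 of 12 pairs ordered correctly -> 0.9167
```

In this made-up case a single misclassified patient and a single misordered pair are enough to produce values of this size, which is why results on a test set this small are best read as preliminary.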
Conclusion: Preliminary results from this ongoing study suggest the need for a machine learning approach capable of modeling the inter-relations between physical and biological variables, a setting in which the commonly used multivariate logistic regression would be inappropriate.