Deep Reinforcement Learning for Automated Dose Adaptation in Lung Cancer

H Tseng (1)*, Y Luo (1), J Chien (2), R Ten Haken (1), I El Naqa (1); (1) University of Michigan, Ann Arbor, MI; (2) National Chiao Tung University, Hsinchu City, Taiwan


MO-RPM-GePD-JT-2 (Monday, July 31, 2017) 3:45 PM - 4:15 PM Room: Joint Imaging-Therapy ePoster Theater

Purpose: To investigate deep reinforcement learning (DRL) based on historical treatment plans as a means of developing automated dose adaptation protocols for non-small cell lung cancer (NSCLC) patients that maximize local control while reducing rates of radiation pneumonitis (RP).

Methods: In a retrospective population of 114 NSCLC patients who received radiotherapy, a three-component neural network framework was developed for reinforcement learning of dose fractionation adaptation from large-scale patient characteristics, including single-nucleotide polymorphisms (SNPs), microRNA (miRNA), PET radiomics, and tumor and lung dose schedules. First, a generative adversarial network (GAN) was employed to learn from the large number of characteristics available for the limited number of patients. Second, an artificial environment (AE) was reconstructed from the GAN data using a deep neural network (DNN) to estimate the transition probabilities between phases of the patient treatment courses. Third, deep Q-networks (DQNs) were applied to the AE to choose the optimal dose in a response-adapted treatment setting. This machine learning approach was benchmarked against the clinical decisions actually made in an adaptive dose escalation protocol previously used for 34 of the patients (escalation was based on PET-avid signal in the tumor and limited by a normal tissue complication probability (NTCP) of 17% for RP).
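
As a rough illustration of the third step only, the sketch below learns a dose choice by trial and error in a toy environment. It is not the authors' method: it swaps the deep Q-network for plain tabular Q-learning, replaces the GAN-derived artificial environment with a made-up reward function, and uses two hypothetical patient states. The names `DOSES`, `TOLERANCE_GY`, `toy_reward`, and `learn_policy`, and all numeric values other than the 1 to 5 Gy action range, are assumptions for illustration.

```python
import random

# Candidate doses per fraction in Gy, spanning the 1 to 5 Gy action range.
DOSES = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]

# Hypothetical patient states (assumption): 0 = lung tolerates more dose,
# 1 = lung tolerates less. The tolerance values are made up.
TOLERANCE_GY = {0: 3.5, 1: 2.5}


def toy_reward(state, dose):
    """Made-up reward: benefit grows with dose, penalty applies above tolerance."""
    benefit = dose
    penalty = 4.0 * max(0.0, dose - TOLERANCE_GY[state])
    return benefit - penalty


def learn_policy(episodes=4000, alpha=0.5, epsilon=0.3, seed=0):
    """Tabular Q-learning with a single dose decision per episode."""
    rng = random.Random(seed)
    states = list(TOLERANCE_GY)
    q = {s: [0.0] * len(DOSES) for s in states}
    for _ in range(episodes):
        s = rng.choice(states)
        if rng.random() < epsilon:
            a = rng.randrange(len(DOSES))  # explore a random dose
        else:
            a = max(range(len(DOSES)), key=lambda i: q[s][i])  # exploit
        r = toy_reward(s, DOSES[a])
        # The episode ends after one decision, so the target is just the reward.
        q[s][a] += alpha * (r - q[s][a])
    # Greedy policy: the dose with the highest learned value in each state.
    return {s: DOSES[max(range(len(DOSES)), key=lambda i: q[s][i])] for s in states}
```

Calling `learn_policy()` returns a mapping from each toy state to its preferred dose per fraction. In this toy setting the agent settles on the highest dose that avoids the toxicity penalty, which mirrors the trade-off between local control and RP only in spirit.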

Results: Taking our adaptive dose escalation protocol as a blueprint for the GAN+AE+DQN architecture, we obtained an automated adaptive dose estimate for use approximately two-thirds of the way through treatment. When the DQN was allowed to freely choose the adaptive dose per fraction within a range of 1 to 5 Gy, it automatically favored dose escalation or de-escalation between 1.5 and 3.8 Gy, a range similar to that used in the protocol. Moreover, the individual adaptive fraction doses suggested by the DQN were similar to (but generally lower in magnitude than) those used in the protocol, with a root-mean-square (RMS) error of 0.13 Gy.
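
For reference, an RMS error between suggested and delivered fraction doses can be computed as in the minimal sketch below. The function names `rms_error` and `mean_signed_difference` are hypothetical, and the abstract does not state exactly how its 0.13 Gy figure was calculated, so this shows only the standard definition. The signed mean is included because it shows whether suggestions run lower or higher than the delivered doses.

```python
import math


def rms_error(suggested, delivered):
    """Root-mean-square difference between paired doses, in the input unit (Gy)."""
    if len(suggested) != len(delivered) or not suggested:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(s - d) ** 2 for s, d in zip(suggested, delivered)]
    return math.sqrt(sum(squared) / len(squared))


def mean_signed_difference(suggested, delivered):
    """Average of (suggested - delivered); negative means suggestions run lower."""
    if len(suggested) != len(delivered) or not suggested:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(s - d for s, d in zip(suggested, delivered)) / len(suggested)
```

Each list holds one dose per patient, in Gy, with the two lists in the same patient order.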

Conclusion: We demonstrated that automated dose adaptation by DRL is promising and can yield dose decisions similar to those chosen by clinicians. However, further validation on larger datasets is still required.
