Radiation Physics

PV QA 3 - Poster Viewing Q&A 3

TU_15_3264 - Automatic Inverse Treatment Planning for Cervical Cancer High dose-rate Brachytherapy via Deep Reinforcement Learning

Tuesday, October 23
1:00 PM - 2:30 PM
Location: Innovation Hub, Exhibit Hall 3

C. Shen, Y. Gonzalez, H. Jung, L. Chen, N. Qin, and X. Jia; Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX

Purpose/Objective(s): Inverse treatment planning in radiotherapy is typically formulated as an optimization problem in which the objective function and constraints are designed to reflect practical and clinical considerations. The relative weights among these terms affect the final plan quality. While a treatment planning system can solve the optimization problem for a given set of weights, tuning the weights to achieve satisfactory plan quality is a crucial task that is currently performed manually by a planner. This not only requires intensive effort, but also makes plan quality dependent on factors such as the planner's experience and the time available for planning. In this study, we propose a deep reinforcement learning (DRL) based approach to accomplish the weight-tuning task in a human-like manner. We demonstrate this idea on an example problem of inverse planning in high-dose-rate brachytherapy (HDRBT) for cervical cancer. The optimization problem minimizes a weighted sum of doses to four critical organs (bladder, rectum, sigmoid, and small bowel), subject to the requirement that the CTV D90% is not lower than the prescription dose.
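
As a schematic illustration only, the weighted-sum problem described above could be written as follows. The abstract does not give the exact formulation, so every symbol here is an assumption: t denotes the optimization variables (for example, source dwell times), f_i(t) is some measure of dose to organ i, w_i is the corresponding organ weight being tuned, and D_Rx is the prescription dose.

```latex
\[
\min_{t \ge 0} \; \sum_{i \in \mathcal{O}} w_i \, f_i(t)
\quad \text{subject to} \quad
D_{90\%}^{\mathrm{CTV}}(t) \ge D_{\mathrm{Rx}},
\qquad
\mathcal{O} = \{\text{bladder},\ \text{rectum},\ \text{sigmoid},\ \text{small bowel}\}
\]
```

In this form, the treatment planning system solves for t with the weights w_i fixed, and the weight-tuning task is the outer problem of choosing the w_i.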

Materials/Methods: We develop a virtual planner network (VPN) to act in lieu of a human planner. The VPN observes the dose-volume histograms (DVHs) of a plan and outputs a decision on the direction and amplitude of an adjustment to an organ weight. To train the VPN, we define a reward function as the sum of D2cc values of the critical organs, because D2cc values are clinically relevant quantities in cervical cancer HDRBT. We train the VPN via DRL using five patient cases. During training, the VPN adjusts organ weights following an epsilon-greedy process and is updated to learn the actions that improve the reward function. Once trained, the VPN is evaluated on five additional test patient cases. In each case, the organ weights are randomly initialized. The VPN is then applied repeatedly to adjust the organ weights based on the observed plan DVHs until the plan quality cannot be further improved.
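
The weight-adjustment loop described above can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the action set (one organ plus a multiplicative factor), the `toy_score` stand-in for "re-optimize the plan and sum the D2cc values", the `oracle_q` one-step lookahead that takes the place of the trained network, and the reading of the reward as the decrease in the score are all hypothetical.

```python
import math
import random

ORGANS = ["bladder", "rectum", "sigmoid", "small_bowel"]
FACTORS = [0.5, 0.9, 1.1, 2.0]  # hypothetical adjustment amplitudes
# Each action chooses one organ and a factor that scales its weight.
ACTIONS = [(organ, factor) for organ in ORGANS for factor in FACTORS]

# Hypothetical "ideal" weights used only by the toy score below.
TOY_TARGET = {"bladder": 1.0, "rectum": 2.0, "sigmoid": 0.5, "small_bowel": 1.0}


def toy_score(weights):
    """Toy stand-in for 'optimize the plan, then sum D2cc' (lower is better)."""
    return sum(math.log(weights[o] / TOY_TARGET[o]) ** 2 for o in ORGANS)


def apply_action(weights, action):
    """Return a new weight dict with one organ weight scaled by the factor."""
    organ, factor = action
    updated = dict(weights)
    updated[organ] = weights[organ] * factor
    return updated


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def oracle_q(weights):
    """Placeholder for the network: score decrease from a one-step lookahead."""
    current = toy_score(weights)
    return [current - toy_score(apply_action(weights, a)) for a in ACTIONS]


def tune_weights(weights, q_fn, score_fn, epsilon, rng, max_steps=50):
    """Adjust weights step by step; with epsilon == 0, stop when nothing improves."""
    score = score_fn(weights)
    for _ in range(max_steps):
        index = epsilon_greedy(q_fn(weights), epsilon, rng)
        candidate = apply_action(weights, ACTIONS[index])
        new_score = score_fn(candidate)
        reward = score - new_score  # assumed reward: decrease in the score
        if epsilon == 0.0 and reward <= 0.0:
            break  # greedy choice no longer improves plan quality
        weights, score = candidate, new_score
    return weights, score


if __name__ == "__main__":
    rng = random.Random(0)
    initial = {organ: rng.uniform(0.1, 10.0) for organ in ORGANS}
    tuned, final_score = tune_weights(initial, oracle_q, toy_score, 0.0, rng)
    print("initial score:", toy_score(initial))
    print("final score:  ", final_score)
    print("tuned weights:", tuned)
```

In an actual DRL setup, `oracle_q` would be replaced by a network that maps the observed DVHs to action values, and a nonzero `epsilon` would be used during training so that the planner also explores actions it does not currently rate best.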

Results: The VPN makes effective decisions when automatically adjusting organ weights. Plans generated with the VPN reduce the sum of D2cc values by 6.1% compared with plans under the randomly initialized weights, and by 4.6% compared with the clinically delivered plans.

Conclusion: DRL-based VPN is a promising approach for automatic inverse planning in cervical cancer HDRBT. Given the similar structure of the optimization problems between HDRBT and external beam therapy, the VPN approach is potentially applicable to external beam therapy.

Author Disclosure: C. Shen: None. Y. Gonzalez: None. N. Qin: None.

Chenyang Shen, PhD

University of Texas Southwestern Medical Center

Disclosure:
No relationships to disclose.
