Quantum deep reinforcement learning for clinical decision support in oncology: application to adaptive radiotherapy | Scientific Reports – Nature.com
Quantum deep reinforcement learning
Quantum deep reinforcement learning (qDRL) is a novel action-value-based decision-making framework derived from QRL [23] and the deep Q-learning [10] framework. Like conventional RL [9,31], our qDRL-based CDSS framework comprises five main elements: the clinical AI agent, the ARTE, the radiation dose decision-making policy, the reward, and the q-value function. Here, the AI agent is a clinical decision-maker that learns to make dose decisions that achieve clinically desirable outcomes within the ARTE. Learning takes place through agent-environment interaction, which proceeds in sequence: the AI decides on a dose and executes it, and in response a patient (part of the ARTE) transitions from one state to the next. Each transition gives the AI feedback on its decision in the form of an RT outcome and an associated reward value. The goal of RL is for the AI to learn a decision-making policy that maximizes the reward in the long run, defined in terms of a specified q-value function that assigns a value to every state-dose-decision pair, obtained from the accumulation of rewards over time (returns).
Assuming the Markov property (i.e., the environment's response at time \(t + 1\) depends only on the state and dose decision at time \(t\)), the qDRL task can be described mathematically as a 5-tuple \((S, \left| D \right\rangle, TF, P, R)\), where \(S\) is a finite set of patient states, \(\left| D \right\rangle\) is a superimposed quantum state representing the finite set of eigen-dose decisions, \(TF: S \times D \to S^{\prime}\) is the transition function that maps the patient's state \(s_{t}\) and eigen-dose \(\left| d \right\rangle_{t}\) to the next state \(s_{t + 1}\), \(P_{LC|RP2}: S^{\prime} \to \left[ 0,1 \right]\) is the RT outcome estimator that assigns probability values \(p_{LC}\) and \(p_{RP2}\) to the state \(s_{t + 1}\), and \(R: \left[ 0,1 \right] \times \left[ 0,1 \right] \to \mathbb{R}\) is the reward function that assigns a reward \(r_{t + 1}\) to the state-decision pair \(\left( s_{t}, \left| d \right\rangle_{t} \right)\) based on the outcome probability estimates.
The eigen-dose \(\left| d \right\rangle\) is a physically performable decision that is selected via quantum methods from the superimposed quantum state \(\left| D \right\rangle\), which represents all possible eigen-doses at once. In simple words, \(\left| D \right\rangle\) is the collection of all possible dose options and \(\left| d \right\rangle\) is the one option that is selected when a decision is made. Selecting the dose decision \(\left| d \right\rangle\) is carried out in two steps: (1) amplifying the optimal eigen-dose \(\left| d \right\rangle^{*}\) from the superimposed state \(\left| D \right\rangle\) (i.e., \(\left| D \right\rangle^{\prime} = \widehat{Amp}_{\left| d \right\rangle^{*}} \left| D \right\rangle\)) and (2) measuring the amplified state (i.e., \(\left| d \right\rangle = \widehat{Measure}\left( \left| D^{\prime} \right\rangle \right)\)).
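The two-step selection can be illustrated with a small classical state-vector sketch. It is only a simulation under stated assumptions: a single Grover-style iteration stands in for the amplification operator, the four dose options and the index of the optimal dose are hypothetical, and the function names are not taken from the paper.

```python
import math
import random


def uniform_superposition(n_doses):
    """Amplitudes of |D>: every eigen-dose starts out equally likely."""
    amp = 1.0 / math.sqrt(n_doses)
    return [amp] * n_doses


def amplify(amplitudes, optimal_index):
    """One Grover-style iteration (illustrative choice of amplification operator):
    phase-flip the optimal eigen-dose, then invert every amplitude about the mean."""
    flipped = [-a if i == optimal_index else a for i, a in enumerate(amplitudes)]
    mean = sum(flipped) / len(flipped)
    return [2.0 * mean - a for a in flipped]


def measure(amplitudes, rng):
    """Collapse to a single eigen-dose index with probability |amplitude|^2."""
    probs = [a * a for a in amplitudes]
    return rng.choices(range(len(amplitudes)), weights=probs, k=1)[0]


state_D = uniform_superposition(4)                  # 4 hypothetical dose options (2 qubits)
state_D_prime = amplify(state_D, optimal_index=2)   # step 1: amplification
probabilities = [round(a * a, 6) for a in state_D_prime]
selected = measure(state_D_prime, random.Random(0))  # step 2: measurement
print(probabilities, selected)
```

With four options, one such iteration concentrates all of the probability on the marked option, so the measurement returns it every time in this noise-free simulation.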
The optimal eigen-dose \(\left| d \right\rangle^{*}\) is obtained from the deep Q-net, which is the AI's memory. The deep Q-net, \(DQN: S \to \mathbb{R}^{d}\), is a neural network that takes the patient's state as input and outputs a q-value for each eigen-dose (\(\left\{ q_{\left| d \right\rangle} \right\}\)). The optimal dose is then selected following a greedy policy, in which the dose with the maximum q-value is chosen (i.e., \(\left| d \right\rangle^{*} = \underset{\left| d^{\prime} \right\rangle}{\arg\max} \left\{ q_{\left| d^{\prime} \right\rangle} \right\}\)). We applied a double Q-learning [32] algorithm in training the deep Q-net. The schematic of a training cycle is presented in Fig. 2, and additional technical details are presented in the Supplementary Material.
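A minimal sketch of the greedy selection and of a double Q-learning target may help make this concrete. It assumes the common deep-network form in which an online network chooses the next dose and a separate target network scores it. The discount factor `gamma`, the q-value lists, and the function names are illustrative assumptions, and the authors' exact training procedure is described only in their Supplementary Material.

```python
def greedy_dose(q_values):
    """Index of the eigen-dose with the largest q-value (greedy policy)."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


def double_q_target(reward, q_online_next, q_target_next, gamma=0.9, terminal=False):
    """Double Q-learning target: the online net picks the next dose,
    the target net supplies the value of that dose."""
    if terminal:
        return reward
    best = greedy_dose(q_online_next)
    return reward + gamma * q_target_next[best]


# Hypothetical network outputs for the next state s_{t+1}
q_online_next = [1.0, 3.0, 2.0]
q_target_next = [0.5, 2.0, 4.0]

chosen = greedy_dose(q_online_next)
target = double_q_target(1.0, q_online_next, q_target_next, gamma=0.5)
print(chosen, target)
```

Decoupling the choice of dose from its evaluation is what distinguishes this target from the plain Q-learning target, which would use the maximum of `q_target_next` directly.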
We initially employed Grover's amplification procedure [33,34] for the decision selection mechanism. While Grover's procedure works on a quantum simulator, it fails to work correctly on a quantum computer. The quantum circuit depth of Grover's procedure (for 4 or more qubits) is much greater than the coherence length of the current quantum processor [35]. Whenever the quantum circuit length exceeds the coherence length, the quantum state becomes significantly affected by system noise and loses vital information. Therefore, we designed a quantum controller circuit that is shorter than the coherence length and is suitable for the task of decision selection. The merit of our design is its fixed length: because its length is the same for any number of qubits, it is suitable for higher-qubit systems, up to the limit permitted by the circuit width. Technical details regarding its implementation on a quantum processor are presented in the Supplementary Materials.
An example of a controller circuit is given in Fig. 5. A controller circuit uses twice the number of qubits (n), divided into a control set and a main set. The optimal eigen-state obtained from the deep Q-net is created in the control by selecting the appropriate pre-control gates. The control is then entangled with the qubits of the main via controlled-NOT (CNOT) gates, each connecting a control qubit from the control to a target qubit from the main. A CNOT gate flips its target qubit (here, from \(\left| 1 \right\rangle\) to \(\left| 0 \right\rangle\)) only when the control qubit is in the \(\left| 1 \right\rangle\) state and performs no operation otherwise. Because all the main qubits are prepared in the \(\left| 0 \right\rangle\) state, we introduced the reverse gates (n X-gates in parallel) to flip them to \(\left| 1 \right\rangle\) first; an X-gate flips \(\left| 0 \right\rangle\) to \(\left| 1 \right\rangle\) and vice versa. The CNOT gates then flip all the main qubits whose controls are in the \(\left| 1 \right\rangle\) state, creating a state that is element-wise opposite to the marked state. Finally, another set of reverse gates is applied to the main before the measurement is made.
Figure 5. Quantum controller circuit for a 5-qubit (32-state) system. (a) Quantum controller circuit for the selection of the state \(\left| 10101 \right\rangle\). The probability distributions corresponding to (b) a failed Grover's amplification procedure for one iteration, run on the 5-qubit IBMQ Santiago quantum processor, and (c) a successful quantum controller selection, run on the 15-qubit IBMQ Melbourne quantum processor.
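The gate sequence of the controller circuit can be traced with a short noise-free simulation. Because every gate involved (X and CNOT) maps computational basis states to basis states, plain bits are enough for this idealized sketch. The function name is hypothetical, and hardware noise, which the paper's measured distributions include, is not modeled.

```python
def controller_select(marked_state):
    """Trace the controller circuit on basis states.

    marked_state: bit string such as "10101" for the optimal eigen-state.
    Returns the bit string read out from the main qubits.
    """
    control = [int(b) for b in marked_state]   # pre-control gates prepare the marked state
    main = [0] * len(control)                  # main qubits start in |0>
    main = [bit ^ 1 for bit in main]           # first layer of reverse (X) gates: all |1>
    # CNOT layer: flip each main qubit whose control qubit is |1>
    main = [m ^ c for m, c in zip(main, control)]
    main = [bit ^ 1 for bit in main]           # second layer of reverse (X) gates
    return "".join(str(bit) for bit in main)


result = controller_select("10101")
print(result)
```

The number of gate layers is the same whatever the length of `marked_state`, which reflects the fixed-length property described in the text.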
Another advantage of the controller circuit is a controllable level of uncertainty. The controller circuit has additional degrees of freedom that can set the level of uncertainty that might be needed to model a highly uncertain clinical situation. By replacing the CNOT gate with a more general \(CU3\left( \theta, \phi, \lambda \right)\) gate, we can control the level of additional stochasticity through the rotation angles \(\theta\), \(\phi\), and \(\lambda\), which correspond to angles on the Bloch sphere. The angles can either be fixed or, for additional control, changed with the training episode.
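To see how the rotation angle sets the amount of stochasticity, the sketch below builds the single-qubit U3 matrix and reads off the probability that the target qubit is flipped when the control is on. It assumes the conventional U3 parameterization, in which \(U3(\pi, 0, \pi)\) equals the X gate, so that setting reproduces the CNOT behavior. The function names are illustrative.

```python
import cmath
import math


def u3_matrix(theta, phi, lam):
    """Single-qubit U3 rotation as a 2x2 matrix (assumed conventional form)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [
        [c, -cmath.exp(1j * lam) * s],
        [cmath.exp(1j * phi) * s, cmath.exp(1j * (phi + lam)) * c],
    ]


def flip_probability(theta, phi=0.0, lam=0.0):
    """Probability that a target prepared in |1> is measured in |0>
    after U3 is applied (i.e., when the control qubit is |1>)."""
    u = u3_matrix(theta, phi, lam)
    return abs(u[0][1]) ** 2   # amplitude <0|U3|1>


for theta in (0.0, math.pi / 2, math.pi):
    print(round(theta, 3), round(flip_probability(theta, 0.0, math.pi), 6))
```

In this parameterization the flip probability is \(\sin^{2}(\theta/2)\), so \(\theta = \pi\) gives a deterministic flip and smaller angles make the selection progressively noisier, while \(\phi\) and \(\lambda\) affect only phases.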
The patient's state in the ARTE is defined by 5 biological features: cytokine (IP10), PET imaging feature (GLSZM-ZSV), radiation doses (tumor gEUD and lung gEUD), and genetics (cxcr1-Rs2234671). Their descriptions are presented in Table 2. These 5 variables were selected from a multi-objective Bayesian Network study [13], which considered over 297 biological features and identified the best features for predicting the joint LC and RP2 RT outcomes.
The training data analyzed in this study were obtained from the University of Michigan study UMCC 2007.123 (NCI clinical trial NCT01190527), and the validation data were obtained from the RTOG-0617 study (NCI clinical trial NCT00533949). Both trials were conducted in accordance with relevant guidelines and regulations, and informed consent was obtained from all subjects and/or legal guardians. Details on the training and validation datasets, and on the model imputation carried out to accommodate the differences between the datasets, are presented in the Supplementary Materials.
Deep Neural Networks (DNN) were applied as transition functions for IP10 and GLSZM-ZSV features. They were trained with a longitudinal (time-series) dataset, with the pre-irradiation patient state and corresponding radiation dose as input features and post-irradiation state as output. For lung and tumor gEUD, we utilized prior knowledge and applied a monotonic relationship for the transition function since we know that gEUD should increase with increasing radiation dose. We assumed that the change in gEUD is proportional to the dose fractionation and tissue radiosensitivity,
$$\frac{g\left( t_{n} \right) - g\left( t_{n - 1} \right)}{t_{n} - t_{n - 1}} \propto d_{n} \left( 1 + \frac{d_{n}}{\alpha / \beta} \right).$$
(1)
Here \(g\left( t_{n} \right)\) is the gEUD at time point \(t_{n}\), \(d_{n}\) is the radiation dose fractionation given during the \(n\)th time period, and the \(\alpha / \beta\) ratio is the radiosensitivity parameter, which differs between tissue types. Note that we first applied constrained training [42] to maintain monotonicity with the DNN model; however, the trend of gEUD over time was flatter than anticipated, so we opted for a process-driven approach in the final implementation. The technical details on the NNs and their training are presented in the Supplementary Material.
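Eq. 1 fixes only the form of the gEUD update, so any concrete transition needs a proportionality constant. The sketch below treats that constant as a hypothetical parameter `k` and uses made-up dose and \(\alpha/\beta\) values. It is meant to show the monotone behavior, not to reproduce the authors' fitted model.

```python
def geud_step(g_prev, dose_per_fraction, alpha_beta, dt=1.0, k=1.0):
    """Advance gEUD over one time period following the form of Eq. 1.

    k is a hypothetical proportionality constant; Eq. 1 does not specify it.
    """
    rate = k * dose_per_fraction * (1.0 + dose_per_fraction / alpha_beta)
    return g_prev + rate * dt


def geud_trajectory(g0, doses, alpha_beta, dt=1.0, k=1.0):
    """gEUD at each time point for a sequence of per-period dose fractionations."""
    trajectory = [g0]
    for d in doses:
        trajectory.append(geud_step(trajectory[-1], d, alpha_beta, dt, k))
    return trajectory


# Illustrative values only: three periods, alpha/beta of 10 (arbitrary units)
tumor = geud_trajectory(g0=0.0, doses=[2.0, 2.0, 3.0], alpha_beta=10.0)
print(tumor)
```

Because every increment is positive for a positive dose and positive `k`, the trajectory can never decrease, which is the monotonic property the process-driven approach was chosen to guarantee.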
DNN classifiers were applied as the RT outcome estimators for the LC and RP2 treatment outcomes. They were trained with post-irradiation patient states as input and binary LC and RP2 outcomes as labels.
The RT outcome estimator must also satisfy a monotonicity condition: increasing radiation dose must increase both the probability of local control and the probability of radiation-induced pneumonitis. To maintain this monotonic relationship, we used a generic logistic function,
$$p_{LC|RP2} = \frac{1}{1 + \exp \left( \frac{g\left( t_{6} \right) - \mu}{T} \right)},$$
(2)
where \(g\left( t_{6} \right)\) is the gEUD at week 6, and \(\mu\) and \(T\) are two patient-specific parameters that are learned from training the DNN. Here, \(\mu\) and \(T\) are the outputs of two neural networks that are fed into the logistic function and tuned one after the other, each while the other is held fixed. The training details are presented in the Supplementary Materials.
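As a small worked example, Eq. 2 can be evaluated directly. The parameter values below are hypothetical stand-ins for the network outputs \(\mu\) and \(T\). With the sign convention as printed in Eq. 2, the probability rises with gEUD only when \(T\) is negative, so the example assumes a negative \(T\); how the learned parameter actually behaves is not stated in this section.

```python
import math


def outcome_probability(geud_week6, mu, temperature):
    """Eq. 2 as printed: logistic outcome probability from the week-6 gEUD
    and the patient-specific parameters mu and T."""
    return 1.0 / (1.0 + math.exp((geud_week6 - mu) / temperature))


# Hypothetical parameters; T < 0 is an assumption so that p increases with gEUD
mu, temperature = 60.0, -5.0
curve = [outcome_probability(g, mu, temperature) for g in (50.0, 60.0, 70.0)]
print([round(p, 4) for p in curve])
```

At a gEUD equal to \(\mu\) the probability is exactly one half, and the magnitude of \(T\) controls how sharply the probability changes around that midpoint.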
The task of the agent is to determine the optimal dose that maximizes \(p_{LC}\) while minimizing \(p_{RP2}\). Accordingly, we built a reward function on the base function \(P^{+} = P_{LC} \left( 1 - P_{RP2} \right)\), as shown in Fig. 6. The algebraic form is as follows,
$$R = \left\{ \begin{array}{ll} P^{+} + 10 & \text{if } 70\% < p_{LC} < 100\% \;\text{and}\; 0\% < p_{RP2} < 17.2\% \\ P^{+} + 5 & \text{if } 50\% < p_{LC} < 70\% \;\text{and}\; 17.2\% < p_{RP2} < 50\% \\ P^{+} - 1 & \text{if } 0\% < p_{LC} < 50\% \;\text{and}\; 50\% < p_{RP2} < 100\% \end{array} \right.$$
(3)
Figure 6. Reward function for reinforcement learning. Contour plot of the reward function as a function of the probability of local control (\(P_{LC}\)) and of radiation-induced pneumonitis of grade 2 or higher (\(P_{RP2}\)). The area enclosed by the blue line corresponds to the clinically desirable outcome, i.e., \(P_{LC} > 70\%\) and \(P_{RP2} < 17.2\%\). Similarly, the area enclosed by the green lines corresponds to the computationally desirable outcome, i.e., \(P_{LC} > 50\%\) and \(P_{RP2} < 50\%\). Along with \(P_{LC} \times (1 - P_{RP2})\), the AI agent receives a +10 reward for achieving the clinically desirable outcome, +5 for achieving the computationally desirable outcome, and -1 when unable to achieve a desirable outcome.
Here the AI agent receives an additional 10 points for achieving the clinically desirable outcome (i.e., \(p_{LC} > 70\%\) and \(p_{RP2} < 17.2\%\)), 5 points for achieving the computationally desirable outcome (i.e., \(p_{LC} > 50\%\) and \(p_{RP2} < 50\%\)), and -1 point for failing to achieve a desirable outcome altogether. The negative point motivates the AI agent to search for the optimal dose as soon as possible.
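The reward can be written as a short function. This sketch follows the prose description above, checking the clinically desirable region first and the computationally desirable region second, so that every pair of probabilities receives a reward. That ordering is an interpretive assumption, and the function name is hypothetical.

```python
def reward(p_lc, p_rp2):
    """Reward built on the base function P+ = p_LC * (1 - p_RP2).

    Probabilities are given as fractions in [0, 1].
    """
    base = p_lc * (1.0 - p_rp2)
    if p_lc > 0.70 and p_rp2 < 0.172:   # clinically desirable outcome
        return base + 10.0
    if p_lc > 0.50 and p_rp2 < 0.50:    # computationally desirable outcome
        return base + 5.0
    return base - 1.0                   # no desirable outcome reached


examples = [reward(0.8, 0.1), reward(0.6, 0.3), reward(0.4, 0.6)]
print([round(r, 2) for r in examples])
```

The large bonus steps dominate the smooth base term, so the agent is pushed first toward the desirable regions and only then toward higher \(P^{+}\) within them.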
To compensate for the low number of data points, we employed WGAN-GP [43], which learns the underlying data distribution and generates additional data points. We generated 4000 additional data points for training the qDRL models. Having a larger training dataset helps the reinforcement learning algorithm represent the state space accurately. The training details are presented in the Supplementary Material.