Quantum deep reinforcement learning for clinical decision support in oncology: application to adaptive radiotherapy | Scientific Reports – Nature.com
Quantum deep reinforcement learning
Quantum deep reinforcement learning (qDRL) is a novel action-value-based decision-making framework derived from QRL [23] and the deep Q-learning [10] framework. Like conventional RL [9,31], our qDRL-based CDSS framework comprises five main elements: a clinical AI agent, the ARTE, a radiation dose decision-making policy, a reward, and a q-value function. Here, the AI agent is a clinical decision-maker that learns to make dose decisions that achieve clinically desirable outcomes within the ARTE. Learning takes place through agent-environment interaction, which proceeds sequentially: the AI decides on a dose and executes it, and in response a patient (part of the ARTE) transitions from one state to the next. Each transition gives the AI feedback on its decision in terms of the RT outcome and the associated reward value. The goal of RL is for the AI to learn a decision-making policy that maximizes the reward in the long run, defined in terms of a specified q-value function that assigns a value to every state-dose-decision pair, obtained by accumulating rewards over time (returns).
Assuming the Markov property (i.e., the environment's response at time \(t + 1\) depends only on the state and dose decision at time \(t\)), the qDRL task can be described mathematically as a 5-tuple \((S, \left| D \right\rangle, TF, P, R)\), where \(S\) is a finite set of patient states, \(\left| D \right\rangle\) is a superimposed quantum state representing the finite set of eigen-dose decisions, \(TF: S \times D \to S^{\prime}\) is the transition function that maps the patient's state \(s_{t}\) and eigen-dose \(\left| d \right\rangle_{t}\) to the next state \(s_{t + 1}\), \(P_{LC|RP2}: S^{\prime} \to \left[0, 1\right]\) is the RT outcome estimator that assigns probability values \(p_{LC}\) and \(p_{RP2}\) to the state \(s_{t + 1}\), and \(R: \left[0, 1\right] \times \left[0, 1\right] \to \mathbb{R}\) is the reward function that assigns a reward \(r_{t + 1}\) to the state-decision pair \(\left(s_{t}, \left| d \right\rangle_{t}\right)\) based on the outcome probability estimates.
An eigen-dose \(\left| d \right\rangle\) is a physically performable decision that is selected via quantum methods from the superimposed quantum state \(\left| D \right\rangle\), which represents all possible eigen-doses at once. In simple words, \(\left| D \right\rangle\) is the collection of all possible dose options and \(\left| d \right\rangle\) is the one option that is selected once a decision is made. Selecting the dose decision \(\left| d \right\rangle\) is carried out in two steps: (1) amplifying the optimal eigen-dose \(\left| d \right\rangle^{*}\) from the superimposed state \(\left| D \right\rangle\) (i.e., \(\left| D \right\rangle^{\prime} = \widehat{Amp}_{\left| d \right\rangle^{*}} \left| D \right\rangle\)) and (2) measuring the amplified state (i.e., \(\left| d \right\rangle = \widehat{Measure}\left(\left| D^{\prime} \right\rangle\right)\)).
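The two-step selection (amplify, then measure) can be sketched with an ideal, noise-free state-vector simulation in NumPy. This sketch uses a Grover-style iteration as one possible amplification operator; the number of eigen-doses, the index of the optimal dose, the iteration count, and the function names `amplify` and `measure` are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def amplify(state, target, iterations):
    """Apply Grover-style iterations that boost the amplitude of `target`."""
    state = state.copy()
    for _ in range(iterations):
        state[target] *= -1.0               # oracle: flip the sign of the marked amplitude
        state = 2.0 * state.mean() - state  # diffusion: reflect every amplitude about the mean
    return state

def measure(state, rng):
    """Sample one basis-state index with probability |amplitude|^2."""
    probs = np.abs(state) ** 2
    return int(rng.choice(len(state), p=probs / probs.sum()))

n_doses = 32                                    # hypothetical: 5 qubits, 32 eigen-doses
optimal = 0b10101                               # hypothetical index of the optimal eigen-dose
D = np.full(n_doses, 1.0 / np.sqrt(n_doses))    # uniform superposition |D>
D_prime = amplify(D, optimal, iterations=4)     # |D'> = Amp |D>
p_optimal = float(D_prime[optimal] ** 2)        # probability of measuring the optimal dose
d = measure(D_prime, np.random.default_rng(0))  # |d> = Measure(|D'>)
```

In this idealised setting, four iterations concentrate almost all of the probability on the marked index, so the measurement returns the optimal eigen-dose with high probability but not with certainty.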
The optimal eigen-dose \(\left| d \right\rangle^{*}\) is obtained from the deep Q-net, which serves as the AI's memory. The deep Q-net, \(DQN: S \to \mathbb{R}^{d}\), is a neural network that takes the patient's state as input and outputs a q-value for each eigen-dose (\(\left\{ q_{\left| d \right\rangle} \right\}\)). The optimal dose is then selected following a greedy policy, in which the dose with the maximum q-value is chosen (i.e., \(\left| d \right\rangle^{*} = \underset{\left| d^{\prime} \right\rangle}{\operatorname{argmax}} \left\{ q_{\left| d^{\prime} \right\rangle} \right\}\)). We applied a double Q-learning [32] algorithm to train the deep Q-net. The schematic of a training cycle is presented in Fig. 2 and additional technical details are presented in the Supplementary Material.
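The greedy choice and a double Q-learning update target can be illustrated with a small NumPy sketch. It assumes the common deep-network variant in which an online network picks the next dose and a separate target network scores it; the discount factor `gamma`, the example q-values, and the function names are hypothetical, and the authors' exact training procedure is in their Supplementary Material.

```python
import numpy as np

def greedy_dose(q_values):
    """Index of the eigen-dose with the largest q-value (greedy policy)."""
    return int(np.argmax(q_values))

def double_q_target(reward, q_online_next, q_target_next, gamma=0.9, terminal=False):
    """Double Q-learning target: the online net picks the dose, the target net scores it."""
    if terminal:
        return float(reward)
    best = greedy_dose(q_online_next)
    return float(reward + gamma * q_target_next[best])

# Hypothetical q-values for three eigen-doses in the next state.
q_online = np.array([0.2, 1.5, 0.7])
q_target = np.array([0.3, 1.0, 2.0])

d_star = greedy_dose(q_online)                        # dose chosen by the online net
y = double_q_target(1.0, q_online, q_target)          # reward + gamma * q_target[d_star]
```

Decoupling selection from evaluation means the target uses `q_target[1]` here rather than the larger `q_target[2]`, which is how double Q-learning reduces over-estimation of q-values.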
We initially employed Grover's amplification procedure [33,34] for the decision selection mechanism. While Grover's procedure works on a quantum simulator, it fails to work correctly on a quantum computer. The quantum circuit depth of Grover's procedure (for 4 or more qubits) is much greater than the coherence length of the current quantum processor [35]. Whenever the quantum circuit length exceeds the coherence length, the quantum state is significantly affected by system noise and loses vital information. Therefore, we designed a quantum controller circuit that is shorter than the coherence length and is suitable for the task of decision selection. The merit of our design is its fixed length: because the length is the same for any number of qubits, it is suitable for higher-qubit systems, up to the limit permitted by the circuit width. Technical details regarding its implementation on a quantum processor are presented in the Supplementary Materials.
An example of a controller circuit is given in Fig. 5. Controller circuits use twice the number of qubits, \(2n\) in total, which are divided into control and main registers of \(n\) qubits each. The optimal eigen-state obtained from the deep Q-net is created in the control register by selecting the appropriate pre-control gates. The control register is then entangled with the qubits of the main register via controlled NOT (CNOT) gates, each connecting a control qubit from the control register to a target qubit from the main register. A CNOT gate flips its target qubit (here from \(\left| 1 \right\rangle\) to \(\left| 0 \right\rangle\)) only when its control qubit is in the \(\left| 1 \right\rangle\) state and performs no operation otherwise. Because all the main qubits are prepared in the \(\left| 0 \right\rangle\) state, we introduced reverse gates (\(n\) X-gates in parallel) to flip them to \(\left| 1 \right\rangle\) beforehand. X-gates flip \(\left| 0 \right\rangle\) to \(\left| 1 \right\rangle\), and vice versa. The CNOT gates flip all the main qubits whose controls are in the \(\left| 1 \right\rangle\) state, creating a state that is element-wise opposite to the marked state. Finally, another set of reverse gates is applied to the main register before making a measurement.
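The gate sequence described above can be traced with a short, noise-free sketch. Because the circuit uses only X and CNOT gates acting on computational-basis states, plain bits are enough to follow it; this is an illustration of the logic under that assumption, not the authors' hardware implementation, and the function name `controller_select` is hypothetical.

```python
def controller_select(marked):
    """Trace the controller circuit on computational-basis states.

    `marked` is a bit string such as "10101" (the optimal eigen-state).
    Returns the bit string that would be measured on the main register
    in an ideal, noise-free run.
    """
    n = len(marked)
    control = [int(b) for b in marked]             # pre-control gates: X where the marked bit is 1
    main = [0] * n                                 # main register starts in |0...0>
    main = [b ^ 1 for b in main]                   # first set of reverse gates (X on every main qubit)
    main = [m ^ c for m, c in zip(main, control)]  # CNOT from control[i] to main[i]
    main = [b ^ 1 for b in main]                   # second set of reverse gates before measurement
    return "".join(str(b) for b in main)

result = controller_select("10101")
```

After the CNOT layer the main register holds the element-wise opposite of the marked state (`01010` for this input), and the final reverse gates restore `10101`. The depth is three gate layers on the main register regardless of `n`, which matches the fixed-length property claimed for the design.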
Figure 5. Quantum controller circuit for a 5-qubit (32-state) system. (a) Quantum controller circuit for the selection of the state \(\left| 10101 \right\rangle\). The probability distributions corresponding to (b) a failed Grover's amplification procedure for one iteration, run on the 5-qubit IBMQ Santiago quantum processor, and (c) a successful quantum controller selection, run on the 15-qubit IBMQ Melbourne quantum processor.
Another advantage of the controller circuit is a controlled uncertainty level. The controller circuit has additional degrees of freedom that can control the level of uncertainty that might be needed to model a highly dubious clinical situation. By replacing the CNOT gate with a more general \(CU3\left(\theta, \phi, \lambda\right)\) gate, we can control the level of additional stochasticity through the rotation angles \(\theta\), \(\phi\), and \(\lambda\), which correspond to angles on the Bloch sphere. The angles can either be fixed or, for additional control, changed with the training episode.
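To make the effect of the rotation angles concrete, the sketch below builds the single-qubit U3 matrix in its commonly used parameterisation (an assumption about the convention intended here) and computes how often a target qubit in the one state would be flipped when the control is in the one state. The helper names `u3` and `flip_probability` are hypothetical.

```python
import numpy as np

def u3(theta, phi, lam):
    """Single-qubit U3 rotation matrix in the commonly used parameterisation."""
    return np.array([
        [np.cos(theta / 2), -np.exp(1j * lam) * np.sin(theta / 2)],
        [np.exp(1j * phi) * np.sin(theta / 2), np.exp(1j * (phi + lam)) * np.cos(theta / 2)],
    ])

def flip_probability(theta, phi=0.0, lam=np.pi):
    """Probability that a target in |1> is measured as |0> when the control is |1>."""
    one = np.array([0.0, 1.0])
    out = u3(theta, phi, lam) @ one
    return float(np.abs(out[0]) ** 2)

p_full = flip_probability(np.pi)       # theta = pi reproduces the CNOT behaviour
p_half = flip_probability(np.pi / 2)   # theta = pi/2 flips the target half of the time
```

In this parameterisation the flip probability equals \(\sin^{2}(\theta/2)\), so \(\theta\) alone sets the amount of added stochasticity, while \(\phi\) and \(\lambda\) only change phases that do not affect a single measurement in the computational basis.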
The patient's state in the ARTE is defined by 5 biological features: a cytokine (IP10), a PET imaging feature (GLSZM-ZSV), radiation doses (tumor gEUD and lung gEUD), and genetics (cxcr1-Rs2234671). Their descriptions are presented in Table 2. These 5 variables were selected from a multi-objective Bayesian Network study [13], which considered over 297 biological features and found the best features for predicting the joint LC and RP2 RT outcomes.
The training data analyzed in this study are obtained from the University of Michigan study UMCC 2007.123 (NCI clinical trial NCT01190527) and the validation data analyzed in this study are obtained from the RTOG-0617 study (NCI clinical trial NCT00533949). Both trials were conducted in accordance with relevant guidelines and regulations and informed consent was obtained from all subjects and/or legal guardians. Details on training and validation datasets, and necessary model imputation carried out to accommodate the differences in the datasets are presented in the Supplementary Materials.
Deep Neural Networks (DNN) were applied as transition functions for IP10 and GLSZM-ZSV features. They were trained with a longitudinal (time-series) dataset, with the pre-irradiation patient state and corresponding radiation dose as input features and post-irradiation state as output. For lung and tumor gEUD, we utilized prior knowledge and applied a monotonic relationship for the transition function since we know that gEUD should increase with increasing radiation dose. We assumed that the change in gEUD is proportional to the dose fractionation and tissue radiosensitivity,
$$\frac{g\left(t_{n}\right) - g\left(t_{n - 1}\right)}{t_{n} - t_{n - 1}} \propto d_{n}\left(1 + \frac{d_{n}}{\alpha/\beta}\right).$$
(1)
Here \(g\left(t_{n}\right)\) is the gEUD at time point \(t_{n}\), \(d_{n}\) is the radiation dose fractionation given during the \(n\)th time period, and the \(\alpha/\beta\) ratio is the radiosensitivity parameter, which differs between tissue types. Note that we first applied constrained training [42] to maintain monotonicity with a DNN model; however, the gEUD-over-time trend was flatter than anticipated, so we opted for a process-driven approach in the final implementation. The technical details on the NNs and their training are presented in the Supplementary Material.
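Equation (1) can be turned into a discrete update once a proportionality constant is chosen. The sketch below is a minimal illustration under that assumption: the constant `k`, the time step `dt`, the \(\alpha/\beta\) value, and the dose schedule are all hypothetical placeholders, since the text does not state them.

```python
def geud_step(g_prev, dose_per_fraction, alpha_beta, dt=1.0, k=1.0):
    """One update of Eq. (1): the gEUD rate is proportional to d * (1 + d / (alpha/beta))."""
    rate = k * dose_per_fraction * (1.0 + dose_per_fraction / alpha_beta)
    return g_prev + rate * dt

g = 0.0
trajectory = []
for d_n in [2.0, 2.0, 3.0]:                  # hypothetical dose fractionation per time period
    g = geud_step(g, d_n, alpha_beta=10.0)   # alpha/beta = 10 is an illustrative value only
    trajectory.append(g)
```

Because every increment is positive for a positive dose and positive `k`, the resulting gEUD trajectory is monotonically increasing by construction, which is the property the process-driven transition function is meant to guarantee.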
DNN classifiers were applied as the RT outcome estimators for the LC and RP2 treatment outcomes. They were trained with post-irradiation patient states as input and binary LC and RP2 outcomes as labels.
The RT outcome estimator must also satisfy a monotone condition: the probability of local control and the probability of radiation-induced pneumonitis must both increase with increasing radiation dose. To maintain this monotonic relationship, we used a generic logistic function,
$$p_{LC|RP2} = \frac{1}{1 + \exp\left(\frac{g\left(t_{6}\right) - \mu}{T}\right)},$$
(2)
where \(g\left(t_{6}\right)\) is the gEUD at week 6, and \(\mu\) and \(T\) are two patient-specific parameters that are learned from training the DNN. Here, \(\mu\) and \(T\) are the outputs of two neural networks that are fed into the logistic function and tuned one after the other, leaving the other fixed. The training details are presented in the Supplementary Materials.
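A direct evaluation of Eq. (2) is sketched below. The values of \(\mu\) and \(T\) are illustrative stand-ins for the outputs of the two neural networks, and the function name is hypothetical. With the sign convention exactly as written in Eq. (2), the probability rises with gEUD only when \(T\) is negative, so a negative \(T\) is used here.

```python
import math

def outcome_probability(g_week6, mu, T):
    """Eq. (2) as written in the text: 1 / (1 + exp((g - mu) / T))."""
    return 1.0 / (1.0 + math.exp((g_week6 - mu) / T))

# Hypothetical patient-specific parameters; gEUD values are illustrative.
p_low = outcome_probability(50.0, mu=60.0, T=-5.0)
p_mid = outcome_probability(60.0, mu=60.0, T=-5.0)   # g == mu gives exactly 0.5
p_high = outcome_probability(70.0, mu=60.0, T=-5.0)
```

The parameter \(\mu\) sets the gEUD at which the probability crosses 50%, and the magnitude of \(T\) sets how sharply the probability changes around that point.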
The task of the agent is to determine the optimal dose that maximizes \(p_{LC}\) while minimizing \(p_{RP2}\). Accordingly, we built a reward function on the base function \(P^{+} = P_{LC}\left(1 - P_{RP2}\right)\), as shown in Fig. 6. The algebraic form is as follows,
$$R = \left\{ \begin{array}{ll} P^{+} + 10 & \text{if } 70\% < p_{LC} < 100\% \text{ and } 0\% < p_{RP2} < 17.2\% \\ P^{+} + 5 & \text{if } 50\% < p_{LC} < 70\% \text{ and } 17.2\% < p_{RP2} < 50\% \\ P^{+} - 1 & \text{if } 0\% < p_{LC} < 50\% \text{ and } 50\% < p_{RP2} < 100\% \end{array} \right.$$
(3)
Figure 6. Reward function for reinforcement learning. Contour plot of the reward function as a function of the probability of local control (\(P_{LC}\)) and of radiation-induced pneumonitis of grade 2 or higher (\(P_{RP2}\)). The area enclosed by the blue line corresponds to the clinically desirable outcome, i.e., \(P_{LC} > 70\%\) and \(P_{RP2} < 17.2\%\). Similarly, the area enclosed by the green lines corresponds to the computationally desirable outcome, i.e., \(P_{LC} > 50\%\) and \(P_{RP2} < 50\%\). Along with \(P_{LC} \times (1 - P_{RP2})\), the AI agent receives a +10 reward for achieving a clinically desirable outcome, +5 for achieving a computationally desirable outcome, and -1 when unable to achieve a desirable outcome.
Here the AI agent receives an additional 10 points for achieving a clinically desirable outcome (i.e., \(p_{LC} > 70\%\) and \(p_{RP2} < 17.2\%\)), 5 points for achieving a computationally desirable outcome (i.e., \(p_{LC} > 50\%\) and \(p_{RP2} < 50\%\)), and -1 point for failing to achieve a desirable outcome altogether. The negative point motivates the AI agent to search for the optimal dose as soon as possible.
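The reward can be written as a short function. This sketch follows the prose description and the figure caption, which treat the clinically desirable region as nested inside the computationally desirable one; that reading is an assumption, since the piecewise ranges printed in Eq. (3) are stated slightly differently. The function name and the example probabilities are hypothetical.

```python
def reward(p_lc, p_rp2):
    """Base reward P+ = p_lc * (1 - p_rp2) plus the bonus or penalty described in the text."""
    base = p_lc * (1.0 - p_rp2)
    if p_lc > 0.70 and p_rp2 < 0.172:
        return base + 10.0      # clinically desirable outcome
    if p_lc > 0.50 and p_rp2 < 0.50:
        return base + 5.0       # computationally desirable outcome
    return base - 1.0           # no desirable outcome reached

r_clinical = reward(0.80, 0.10)        # 0.72 + 10
r_computational = reward(0.60, 0.30)   # 0.42 + 5
r_fail = reward(0.40, 0.60)            # 0.16 - 1
```

Checking the stricter clinical region first ensures that an outcome satisfying both conditions earns the larger bonus.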
To compensate for the low number of data points, we employed WGAN-GP [43], which learns the underlying data distribution and generates more data points. We generated 4000 additional data points for training the qDRL models. A larger training dataset helps the reinforcement learning algorithm represent the state space accurately. The training details are presented in the Supplementary Material.