International Business Machines Corporation
Learning Interpretable Strategies in the Presence of Existing Domain Knowledge

Abstract:

A mechanism computes a discounted health variable that includes a penalty for deviating from clinical guidelines, where the penalty is based on a distance function representing an allowed deviation from the guidelines. The mechanism applies reinforcement learning (RL) techniques to the discounted health variable to generate an RL model for generating dynamic treatment regimes. For a patient, at each of a plurality of times, the mechanism determines three candidate actions: a next action in a treatment regime using the RL model with no distance function, an optimal next action in the treatment regime with allowed deviation from the guidelines, and a next action in the treatment regime that adheres to the guidelines. The mechanism then generates an outcome output display based on these three determined actions.
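
The idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the application: the five health states, the three dose levels, the `BEST` and `GUIDELINE` tables, the toy dynamics in `step`, the absolute-difference `distance` function, the penalty weight `lam`, and the discount factor. Tabular Q-value iteration on a known toy model stands in for the unspecified "reinforcement learning techniques".

```python
# Toy illustration only: all tables, dynamics, and constants are hypothetical.
N_STATES, N_ACTIONS = 5, 3        # health levels 0..4, dose levels 0..2
GAMMA = 0.9                       # assumed discount factor
BEST = [1, 1, 2, 1, 1]            # hypothetical truly best dose per state
GUIDELINE = [1, 1, 0, 1, 1]       # hypothetical guideline dose per state


def step(s, a):
    """Invented dynamics: the best dose improves health, others worsen it."""
    gap = abs(a - BEST[s])
    delta = 1 if gap == 0 else -gap
    s_next = min(max(s + delta, 0), N_STATES - 1)
    return s_next, float(s_next)  # health signal = next health level


def distance(s, a, allowed):
    """Deviation from the guideline beyond the allowed margin (assumed form)."""
    return max(0, abs(a - GUIDELINE[s]) - allowed)


def learn_policy(lam, allowed, sweeps=500):
    """Q-value iteration on the discounted, guideline-penalized health signal."""
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(sweeps):
        v = [max(row) for row in q]  # state values from the previous sweep
        for s in range(N_STATES):
            for a in range(N_ACTIONS):
                s_next, health = step(s, a)
                penalty = lam * distance(s, a, allowed)
                q[s][a] = health - penalty + GAMMA * v[s_next]
    # Greedy action per state under the learned Q-values.
    return [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(N_STATES)]


# The three recommendations compared in the abstract:
unconstrained = learn_policy(lam=0.0, allowed=0)     # no distance function
with_deviation = learn_policy(lam=100.0, allowed=1)  # allowed deviation of 1
adherent = list(GUIDELINE)                           # strict adherence

for s in range(N_STATES):
    print(f"state {s}: no-distance={unconstrained[s]}, "
          f"allowed-deviation={with_deviation[s]}, guideline={adherent[s]}")
```

In this toy setup the three recommendations differ only in state 2, where the unpenalized policy picks dose 2, the policy with an allowed deviation of 1 picks dose 1, and the guideline prescribes dose 0. A side-by-side comparison of this kind is what an outcome display would present.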

Status:

Application

Type:

Utility

Filing date:

30 Dec 2019

Issue date:

1 Jul 2021