International Business Machines Corporation
METHOD FOR INTERMEDIATE MODEL GENERATION USING HISTORICAL DATA AND DOMAIN KNOWLEDGE FOR RL TRAINING

Abstract:

Embodiments may include novel techniques for intermediate model generation using historical data and domain knowledge for Reinforcement Learning (RL) training. Embodiments may start with gathering client data. For example, in an embodiment, a method, implemented in a computer system comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, may comprise: identifying historical data and domain knowledge of a client, including mathematical properties of features; generating, based on the identified historical data and domain knowledge, an intermediate model comprising a probabilistic description of the environment, such as a Markov Decision Process (MDP) graph or a transition probability matrix; training a Reinforcement Learning (RL)/Deep Reinforcement Learning (DRL) model using the generated intermediate model; and deploying the trained RL/DRL model and continuing to train it using a real environment.
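
The steps named in the abstract can be illustrated with a small tabular sketch. It is not the claimed method: it assumes discrete states and actions, logged `(state, action, reward, next_state)` tuples as the historical data, and an optional boolean `allowed` mask as one hypothetical way to encode domain knowledge. The function names `estimate_model` and `train_q_on_model` and all parameters are illustrative, and plain tabular Q-learning stands in for the RL/DRL model.

```python
import numpy as np


def estimate_model(log, n_states, n_actions, allowed=None, smoothing=1.0):
    """Estimate a transition probability matrix P[s, a, s'] and mean rewards R[s, a].

    log       -- iterable of (state, action, reward, next_state) tuples (historical data)
    allowed   -- optional boolean array [s, a, s'] marking transitions that domain
                 knowledge permits (hypothetical encoding; must leave at least one
                 successor per state-action pair)
    smoothing -- pseudo-count added to every transition before normalizing
    """
    counts = np.full((n_states, n_actions, n_states), smoothing, dtype=float)
    reward_sum = np.zeros((n_states, n_actions))
    visits = np.zeros((n_states, n_actions))
    for s, a, r, s_next in log:
        counts[s, a, s_next] += 1.0
        reward_sum[s, a] += r
        visits[s, a] += 1.0
    if allowed is not None:
        counts = counts * allowed  # zero out transitions ruled out by domain knowledge
    P = counts / counts.sum(axis=2, keepdims=True)
    R = reward_sum / np.maximum(visits, 1.0)  # unvisited pairs default to reward 0
    return P, R


def train_q_on_model(P, R, episodes=500, horizon=20, gamma=0.9,
                     alpha=0.1, eps=0.1, seed=0):
    """Tabular Q-learning on transitions sampled from the intermediate model."""
    rng = np.random.default_rng(seed)
    n_states, n_actions, _ = P.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = rng.integers(n_states)
        for _ in range(horizon):
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = rng.integers(n_actions)
            else:
                a = int(np.argmax(Q[s]))
            # Sample the next state from the model instead of the real environment.
            s_next = rng.choice(n_states, p=P[s, a])
            target = R[s, a] + gamma * Q[s_next].max()
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
    return Q
```

Under these assumptions, the returned `Q` table could serve as the starting point for continued training once the model is deployed against the real environment, with real transitions replacing the samples drawn from `P`.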

Status:

Application

Type:

Utility

Filing date:

12 Oct 2020

Issue date:

14 Apr 2022