International Business Machines Corporation
Lexicographic deep reinforcement learning using state constraints and conditional policies

Abstract:

A computer-implemented method is provided for modified Lexicographic Reinforcement Learning. The computer-implemented method includes obtaining, by a hardware processor, a sequence of tasks. Each of the tasks corresponds to, and has a one-to-one correspondence with, a respective reward from among a set of rewards. The method further includes performing, by the hardware processor for each of the tasks, reinforcement learning and deep learning for both (i) one or more policies and (ii) one or more value functions, with a plurality of sets of samples. A plurality of solutions in the form of the one or more policies and the one or more value functions are parametrized by a single neural network with a selector that selects an input of the single neural network from among the plurality of sets of samples.
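
As a rough illustration only, the final sentence of the abstract could be sketched as follows: one set of shared weights produces both a policy and a value estimate, and a selector chooses which set of samples is fed in. Everything here is an assumption made for illustration and is not taken from the patent: the names `select_samples` and `forward`, the layer sizes, the choice of one sample set per task, and the use of random arrays in place of real environment samples. No training step is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this sketch.
STATE_DIM, HIDDEN, N_ACTIONS = 4, 8, 3

# One set of parameters shared by the policy head and the value head.
params = {
    "W1": rng.normal(scale=0.1, size=(STATE_DIM, HIDDEN)),
    "b1": np.zeros(HIDDEN),
    "Wp": rng.normal(scale=0.1, size=(HIDDEN, N_ACTIONS)),
    "bp": np.zeros(N_ACTIONS),
    "Wv": rng.normal(scale=0.1, size=(HIDDEN, 1)),
    "bv": np.zeros(1),
}


def select_samples(sample_sets, task_index):
    """Selector (assumed form): pick the sample set used as network input."""
    return sample_sets[task_index]


def forward(params, states):
    """Single network returning action probabilities and state values."""
    hidden = np.tanh(states @ params["W1"] + params["b1"])
    logits = hidden @ params["Wp"] + params["bp"]
    # Subtract the row maximum for a numerically stable softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    values = (hidden @ params["Wv"] + params["bv"]).squeeze(-1)
    return probs, values


# Placeholder sample sets, one per task (assumption: random states).
sample_sets = [
    rng.normal(size=(5, STATE_DIM)),
    rng.normal(size=(7, STATE_DIM)),
]

# The same network is evaluated on whichever set the selector picks.
outputs = [
    forward(params, select_samples(sample_sets, task_index))
    for task_index in range(len(sample_sets))
]
```

Each entry of `outputs` holds the action probabilities and value estimates for one task's sample set, all computed with the same parameters.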

Status:

Grant

Type:

Utility

Filing date:

1 Mar 2019

Issue date:

9 Aug 2022