International Business Machines Corporation
LEVERAGING DYNAMICAL PRIORS FOR SYMBOLIC MAPPINGS IN SAFE REINFORCEMENT LEARNING

Last updated: 7 Sep 2022

Abstract:

Embodiments of the disclosure provide a reinforcement learning model configured to receive state data (e.g., image state data) and determine candidate actions (e.g., environment navigation actions, environment modification actions, etc.) based on the received state data. Embodiments of the disclosure further provide an object detector configured to generate symbolic state data (e.g., safety-relevant data) from the state data. Accordingly, as described herein, a safety system can update a dynamical safety constraint based on the symbolic state data, filter the actions determined by the reinforcement learning model, and select an action to be executed based on the dynamical safety constraint. For instance, the safety system classifies each candidate action determined by the reinforcement learning model, in each symbolic state, as either "safe" or "not safe" based on the dynamical safety constraint, and a safe action may then be selected and executed.
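
The flow summarized in the abstract (state data, symbolic state data, dynamical safety constraint, filtered action) can be illustrated with a minimal Python sketch. It is not the claimed implementation. Everything in it is a hypothetical stand-in chosen for illustration: the one-dimensional character "frame", the `detect_symbols` function standing in for the object detector, the `ACTIONS` table, the constant-velocity obstacle model used as the dynamical prior, and the `min_gap` threshold. The ranked candidate list stands in for the output of the reinforcement learning model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SymbolicState:
    """Safety-relevant symbols extracted from raw state data (hypothetical)."""
    agent_x: int
    obstacle_x: int


# Hypothetical action set: action name -> agent displacement per step.
ACTIONS: Dict[str, int] = {"left": -1, "stay": 0, "right": 1}


def detect_symbols(frame: List[str]) -> SymbolicState:
    """Toy stand-in for the object detector: 'A' is the agent, 'O' the obstacle."""
    return SymbolicState(agent_x=frame.index("A"), obstacle_x=frame.index("O"))


class DynamicalConstraint:
    """Assumed constant-velocity prior on the obstacle, updated from symbolic states."""

    def __init__(self, min_gap: int = 1) -> None:
        self.min_gap = min_gap
        self.obstacle_velocity = 0
        self._last: Optional[SymbolicState] = None

    def update(self, state: SymbolicState) -> None:
        # Re-estimate the obstacle velocity from the two most recent observations.
        if self._last is not None:
            self.obstacle_velocity = state.obstacle_x - self._last.obstacle_x
        self._last = state

    def is_safe(self, state: SymbolicState, action: str) -> bool:
        # Classify an action as safe if the predicted gap stays at or above min_gap.
        next_agent = state.agent_x + ACTIONS[action]
        next_obstacle = state.obstacle_x + self.obstacle_velocity
        return abs(next_agent - next_obstacle) >= self.min_gap


def select_action(
    ranked_candidates: List[str],
    state: SymbolicState,
    constraint: DynamicalConstraint,
) -> Optional[str]:
    """Return the highest-ranked candidate classified as safe, or None if none is."""
    for action in ranked_candidates:
        if constraint.is_safe(state, action):
            return action
    return None
```

In use, each new frame would be passed through `detect_symbols`, the result fed to `DynamicalConstraint.update`, and the model's ranked candidates passed to `select_action`. Returning `None` when no candidate is classified as safe is a design choice of this sketch; the abstract does not say how that case is handled.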

Status:

Application

Type:

Utility

Filing date:

18 Feb 2021

Issue date:

18 Aug 2022

Full patent description

Patent application document
