Honda Motor Co., Ltd.
REINFORCEMENT LEARNING WITH ITERATIVE REASONING FOR MERGING IN DENSE TRAFFIC

Abstract:

According to one aspect, a system for reinforcement learning with iterative reasoning may include a memory for storing computer-readable code and a processor operatively coupled to the memory, the processor configured to receive a level-0 policy and a desired reasoning level n. For each k = 1, ..., n, the processor may populate a training environment with a level-(k-1) first agent, populate the training environment with a level-(k-1) second agent, and train a level-k agent based on the level-(k-1) first agent and the level-(k-1) second agent to derive a level-k policy.
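
The loop described in the abstract can be sketched as follows. This is a minimal illustration, not the claimed implementation: the names `Policy`, `toy_train`, and `iterative_reasoning` are hypothetical, the trainer is a stand-in that only records which agent levels it was trained against, and the sketch assumes that the first and second agents both follow the same level-(k-1) policy.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy record: its reasoning level and the levels it trained against."""
    level: int
    trained_against: Tuple[int, ...] = ()


def toy_train(environment: List[Policy], k: int) -> Policy:
    """Stand-in for a reinforcement learning trainer (assumption, not from the source).

    A real trainer would optimize the level-k agent's behavior in the populated
    environment; this one only records the levels of the agents it faced.
    """
    return Policy(level=k, trained_against=tuple(p.level for p in environment))


def iterative_reasoning(
    level0_policy: Policy,
    n: int,
    train_fn: Callable[[List[Policy], int], Policy],
) -> List[Policy]:
    """Return the policies for levels 0..n, derived one level at a time."""
    if n < 0:
        raise ValueError("desired reasoning level n must be non-negative")
    policies = [level0_policy]
    for k in range(1, n + 1):
        previous = policies[k - 1]
        # Populate the training environment with a level-(k-1) first agent
        # and a level-(k-1) second agent (assumed here to share one policy).
        environment = [previous, previous]
        # Train the level-k agent against them to derive the level-k policy.
        policies.append(train_fn(environment, k))
    return policies


policies = iterative_reasoning(Policy(level=0), 3, toy_train)
```

With a desired reasoning level of n = 3, `policies` holds four entries (levels 0 through 3), and each level-k entry records that it was trained against two level-(k-1) agents.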

Status:

Application

Type:

Utility

Filing date:

28 Jul 2020

Issue date:

2 Sep 2021