Apple Inc.
Incorporating user feedback into text prediction models via joint reward planning

Abstract:

An example process includes: obtaining input token(s); determining, using a joint prediction model, based on the input token(s): a first predicted token following the input token(s) and a second predicted token following the first predicted token; and a first user action to be performed on the first predicted token, where determining the first user action includes: determining a first reward value for performing the first user action based on a first current reward value for performing the first user action and a second reward value for performing a second user action on the second predicted token; outputting the first predicted token; detecting a user action performed on the first predicted token; and in accordance with a determination that the detected user action does not match the first user action: causing parameters of the joint prediction model to be updated, the parameters being configured to determine the first user action.
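
The process in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class `ToyJointModel`, the bigram lookup table, the two user actions `accept` and `reject`, the discount factor, and the additive update rule are all hypothetical choices made for illustration. The sketch only shows the general shape of the idea: a reward value for an action on the first predicted token is computed from its current reward value plus a look-ahead reward for the second predicted token, and parameters are updated when the detected user action differs from the predicted one.

```python
from collections import defaultdict

# Hypothetical user actions; the abstract does not enumerate them.
ACTIONS = ("accept", "reject")


class ToyJointModel:
    """Toy stand-in for a joint prediction model (illustrative only)."""

    def __init__(self, bigrams, discount=0.5, step=1.0):
        self.bigrams = bigrams            # token -> assumed most likely next token
        self.discount = discount          # assumed weight of the look-ahead reward
        self.step = step                  # assumed size of each parameter update
        self.reward = defaultdict(float)  # (token, action) -> current reward value

    def predict(self, input_tokens):
        # First predicted token follows the input; second follows the first.
        first = self.bigrams[input_tokens[-1]]
        second = self.bigrams[first]
        # Best reward obtainable from an action on the second predicted token.
        second_reward = max(self.reward[(second, a)] for a in ACTIONS)
        # Reward value per action on the first token: current reward plus look-ahead.
        planned = {
            a: self.reward[(first, a)] + self.discount * second_reward
            for a in ACTIONS
        }
        action = max(ACTIONS, key=planned.get)
        return first, second, action, planned[action]

    def feedback(self, first_token, predicted_action, detected_action):
        # Update parameters only when the detected action does not match.
        if detected_action == predicted_action:
            return False
        self.reward[(first_token, detected_action)] += self.step
        self.reward[(first_token, predicted_action)] -= self.step
        return True


model = ToyJointModel({"good": "morning", "morning": "everyone", "everyone": "good"})
first, second, action, value = model.predict(["good"])
updated = model.feedback(first, action, detected_action="reject")
```

In this toy version the look-ahead term is the same for every action on the first token, so it shifts the reward value without changing which action ranks highest. A fuller model would presumably let the choice of action influence what is predicted next.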

Status:

Grant

Type:

Utility

Filing date:

31 Aug 2020

Issue date:

23 Nov 2021