Microsoft Corporation
TOKENIZING ALPHANUMERIC TEXT THROUGH USE OF FINITE STATE MACHINES
Last updated:
Abstract:
Described herein are technologies related to tokenizing alphanumeric text through use of a tokenization algorithm that is at least partially implemented as a finite state machine. The tokenization algorithm is configured to output numeric identifiers that represent tokens or sub-tokens in the alphanumeric text.
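The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of the general idea, not the method claimed in the application. It assumes a greedy longest-match strategy over a trie-shaped deterministic finite state machine. The vocabulary, the identifier values, and the names `build_fsm`, `tokenize`, and `UNKNOWN_ID` are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Optional, Tuple

UNKNOWN_ID = 0  # hypothetical identifier for characters no vocabulary entry covers

Transitions = List[Dict[str, int]]


def build_fsm(vocab: Dict[str, int]) -> Tuple[Transitions, Dict[int, int]]:
    """Compile a vocabulary into a trie-shaped deterministic FSM.

    transitions[state][char] gives the next state; accepting[state] gives
    the numeric identifier emitted when a match ends in that state.
    """
    transitions: Transitions = [{}]  # state 0 is the start state
    accepting: Dict[int, int] = {}
    for piece, token_id in vocab.items():
        state = 0
        for ch in piece:
            if ch not in transitions[state]:
                transitions.append({})
                transitions[state][ch] = len(transitions) - 1
            state = transitions[state][ch]
        accepting[state] = token_id
    return transitions, accepting


def tokenize(text: str, transitions: Transitions, accepting: Dict[int, int]) -> List[int]:
    """Emit numeric identifiers using greedy longest-match over the FSM."""
    ids: List[int] = []
    pos = 0
    while pos < len(text):
        state = 0
        last_id: Optional[int] = None
        last_end = pos
        i = pos
        # Follow transitions as far as possible, remembering the most
        # recent accepting state seen along the way.
        while i < len(text) and text[i] in transitions[state]:
            state = transitions[state][text[i]]
            i += 1
            if state in accepting:
                last_id = accepting[state]
                last_end = i
        if last_id is None:
            ids.append(UNKNOWN_ID)  # no entry matches here; skip one character
            pos += 1
        else:
            ids.append(last_id)
            pos = last_end
    return ids


# Hypothetical vocabulary mixing whole tokens and sub-tokens.
VOCAB = {"token": 1, "iz": 2, "ing": 3, "text": 4, " ": 5, "2021": 6}
TRANSITIONS, ACCEPTING = build_fsm(VOCAB)

print(tokenize("tokenizing text 2021", TRANSITIONS, ACCEPTING))
# prints [1, 2, 3, 5, 4, 5, 6]
```

In this sketch the word "tokenizing" is not in the vocabulary as a whole, so it is emitted as three sub-token identifiers (1, 2, 3), while "text" and "2021" each map to a single identifier.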
Status:
Application
Type:
Utility
Filing date:
2 Mar 2021
Issue date:
8 Sep 2022