Honda Motor Co., Ltd.
DOCUMENT ANALYSIS SYSTEM

Abstract:

Provided is a system configured to appropriately determine the topic count used with latent Dirichlet allocation (LDA) to estimate the latent meanings of documents. For each of a plurality of documents d, a perplexity PPL is evaluated from the document generation probability, that is, the probability that the document d is generated when the topic count N defining an LDA-based topic model (the document generation model) is hypothetically set to different values and the word groups are specified using different random numbers. The topic model is then defined by a reference topic count N_0, determined by combining a first topic count N_1 (the topic count with the highest cumulative frequency of being the value at which the perplexity PPL first reaches a minimum) and a second topic count N_2 (the topic count with the highest cumulative frequency of being the value at which the perplexity PPL reaches its smallest value).
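The selection step described in the abstract can be illustrated with a minimal sketch. It assumes the perplexity values have already been computed elsewhere, one curve per (document, random seed) trial, and it is not the applicant's implementation. All function names and the sample numbers are hypothetical. The abstract does not say how the first and second topic counts are combined, so the sketch assumes a rounded arithmetic mean; ties in frequency are broken toward the smaller topic count, which is also an assumption.

```python
from collections import Counter


def first_local_minimum(topic_counts, ppl):
    """Topic count at which the perplexity curve first stops decreasing."""
    for i in range(len(ppl) - 1):
        if ppl[i] < ppl[i + 1]:
            return topic_counts[i]
    # The curve never rises, so its first minimum is the last point.
    return topic_counts[-1]


def global_minimum(topic_counts, ppl):
    """Topic count at which the perplexity curve takes its smallest value."""
    best = min(range(len(ppl)), key=ppl.__getitem__)
    return topic_counts[best]


def most_frequent(values):
    """Value with the highest frequency (assumption: ties go to the smaller value)."""
    counts = Counter(values)
    top = max(counts.values())
    return min(v for v, c in counts.items() if c == top)


def reference_topic_count(topic_counts, curves):
    """Return (first, second, reference) topic counts from a set of perplexity curves."""
    n1 = most_frequent(first_local_minimum(topic_counts, c) for c in curves)
    n2 = most_frequent(global_minimum(topic_counts, c) for c in curves)
    # Assumption: combine by arithmetic mean, rounded half up.
    n0 = (n1 + n2 + 1) // 2
    return n1, n2, n0


# Hypothetical perplexity curves, one per (document, random seed) trial.
topic_counts = [2, 4, 6, 8, 10]
curves = [
    [900, 700, 750, 650, 800],
    [880, 720, 760, 640, 700],
    [910, 800, 690, 700, 720],
]
n1, n2, n0 = reference_topic_count(topic_counts, curves)
print(n1, n2, n0)
```

Counting over many trials rather than trusting a single perplexity curve is what makes the result less sensitive to the random initialization of the word groups.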

Status:

Application

Type:

Utility

Filing date:

22 Feb 2021

Issue date:

26 Aug 2021