Open Text Corporation
METHOD AND SYSTEM FOR DOCUMENT SIMILARITY ANALYSIS
Abstract:
A method for document similarity analysis. The method includes generating a reference document content identifier for a reference document, including identifying frequently occurring terms in reference document content, encoding each frequently occurring term in a term identifier and combining the term identifiers to form the reference document content identifier associated with the reference document. The method also includes obtaining at least one document similarity value by comparing the reference document content identifier to a set of archived document content identifiers stored in a document repository.
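The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. The abstract does not say how terms are encoded, how term identifiers are combined, or how identifiers are compared, so the choices here are assumptions: a truncated SHA-1 hash as the term identifier, an ordered tuple as the combined content identifier, and Jaccard overlap as the similarity value. All function names and the `top_n` and `hex_chars` parameters are hypothetical.

```python
import hashlib
import re
from collections import Counter


def frequent_terms(text, top_n=5):
    """Return the top_n most frequent terms (ties broken alphabetically)."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


def term_identifier(term, hex_chars=8):
    """Encode one term as a short hex string (assumed encoding: truncated SHA-1)."""
    return hashlib.sha1(term.encode("utf-8")).hexdigest()[:hex_chars]


def content_identifier(text, top_n=5):
    """Combine the term identifiers into one document content identifier."""
    return tuple(term_identifier(t) for t in frequent_terms(text, top_n))


def similarity(id_a, id_b):
    """Assumed similarity measure: Jaccard overlap of the term identifiers."""
    a, b = set(id_a), set(id_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def compare_to_repository(reference_id, repository):
    """Compare a reference identifier to archived identifiers, keyed by name."""
    return {name: similarity(reference_id, archived_id)
            for name, archived_id in repository.items()}


# Usage: a toy repository holding two archived document identifiers.
repository = {
    "archived_1": content_identifier("cat cat dog dog bird"),
    "archived_2": content_identifier("fish fish eel eel crab"),
}
reference = content_identifier("cat cat dog dog bird")
scores = compare_to_repository(reference, repository)
```

Only the compact identifiers are compared, not the full document text, so the archived side needs to store nothing more than one identifier per document. The sketch does no stop-word filtering or stemming, which a practical system would likely need.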
Status:
Application
Type:
Utility
Filing date:
14 Feb 2020
Issue date:
11 Jun 2020