International Business Machines Corporation
Document anonymization including selective token modification

Last updated:

Abstract:

Embodiments relate to an intelligent computer platform to selectively amend one or more tokens in a document. A first document set is subjected to natural language processing (NLP) and a vector score is identified for two or more documents of the first document set. Upon receipt of a new document, the new document is subjected to NLP and a new document vector score is identified. The new document is analyzed against the first document set, and the identified vector score of the first document set is compared to the vector score of the new document. One or more tokens of the new document are amended responsive to the comparison, and a new document version is created from the selective amendment.
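The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how the vector score is computed, how the comparison is made, or which tokens are amended. The sketch assumes bag-of-words term counts as the vectors, a single aggregate vector for the first document set, cosine similarity as the comparison, and masking of tokens absent from the first document set as the amendment. The names tokenize, cosine, anonymize, threshold, and mask are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed NLP step)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors held as Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def anonymize(new_doc, corpus, threshold=0.9, mask="[REDACTED]"):
    """Return (new document version, similarity score).

    Assumption: if the new document is less similar to the first document
    set than `threshold`, every token not seen in that set is masked.
    """
    # Aggregate term counts over the first document set.
    corpus_vec = Counter()
    for doc in corpus:
        corpus_vec.update(tokenize(doc))

    # Vector for the new document and its score against the set.
    new_vec = Counter(tokenize(new_doc))
    score = cosine(new_vec, corpus_vec)
    if score >= threshold:
        return new_doc, score  # close enough to the set, nothing amended

    # Selectively amend only the tokens unknown to the first document set.
    def replace(match):
        token = match.group(0)
        return token if token.lower() in corpus_vec else mask

    return re.sub(r"[A-Za-z0-9]+", replace, new_doc), score


corpus = ["the patient was admitted", "the patient was discharged"]
amended, score = anonymize("the patient Alice was admitted", corpus)
print(amended)
```

In this toy run the name "Alice" is the only token missing from the first document set, so it is the only one masked. A real system would likely use learned embeddings and a more careful choice of which tokens to amend.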

Status:

Grant

Type:

Utility

Filing date:

9 Sep 2019

Issue date:

17 May 2022