International Business Machines Corporation
COMPUTERIZED ASSESSMENT OF ARTICLES WITH SIMILAR CONTENT AND HIGHLIGHTING OF DISTINCTIONS THEREBETWEEN

Last updated:

Abstract:

A computer receives a list of reference topics from a topic database and a set of articles related to the reference topics. The computer generates article n-grams and compares them to the reference topics using natural language processing (NLP) to determine, for each article, a primary theme that corresponds to one of the reference topics. The computer collects articles with common primary themes into at least one article group and determines an article comparison value between articles in the article group. Responsive to determining that an article comparison value is below a predetermined similarity threshold, the computer determines a distinguishing feature, associated with one of the compared articles, that contributed to the article comparison value. The computer assigns articles having the distinguishing feature to a secondary group based, at least in part, on the distinguishing feature.
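
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not specify the NLP technique, the comparison metric, or how a distinguishing feature is selected, so the sketch assumes word bigrams, a simple count of topic-word occurrences for theme assignment, Jaccard similarity as the article comparison value, and "the most frequent n-gram found in one article but not the other" as the distinguishing feature. All function names and the default threshold are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
import re


def ngrams(text, n=2):
    """Count lowercase word n-grams, each stored as a space-joined string."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))


def primary_theme(text, topics):
    """Assumed rule: the topic whose name occurs most often among the unigrams.

    Ties (including no occurrences at all) fall back to the earliest topic.
    """
    unigrams = ngrams(text, 1)
    return max(topics, key=lambda t: unigrams[t.lower()])


def comparison_value(a, b):
    """Assumed metric: Jaccard similarity of two n-gram collections, in [0, 1]."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def distinguishing_feature(a, b):
    """Assumed rule: most frequent n-gram of `a` that is absent from `b`."""
    unique = {g: c for g, c in a.items() if g not in b}
    return max(unique, key=unique.get) if unique else None


def assess(articles, topics, threshold=0.5):
    """Group articles by primary theme, then split off secondary groups."""
    grams = {name: ngrams(text) for name, text in articles.items()}

    # Collect articles with a common primary theme into article groups.
    groups = defaultdict(list)
    for name, text in articles.items():
        groups[primary_theme(text, topics)].append(name)

    # For each pair below the similarity threshold, find a distinguishing
    # feature on each side and gather the group members that contain it.
    secondary = defaultdict(set)
    for theme, members in groups.items():
        for x, y in combinations(members, 2):
            if comparison_value(grams[x], grams[y]) < threshold:
                for src, other in ((x, y), (y, x)):
                    feature = distinguishing_feature(grams[src], grams[other])
                    if feature is not None:
                        secondary[(theme, feature)].update(
                            m for m in members if feature in grams[m]
                        )
    return dict(groups), dict(secondary)


if __name__ == "__main__":
    # Illustrative inputs only; not taken from the application.
    articles = {
        "a": "election vote count in ohio",
        "b": "election vote count in texas",
        "c": "football match score",
    }
    groups, secondary = assess(articles, ["election", "football"], threshold=0.7)
    print(groups)
    print(secondary)
```

In this sketch the secondary groups are keyed by the pair (primary theme, distinguishing feature), so articles that share a theme but differ on a particular phrase end up in separate secondary groups. A production system would likely replace the topic-word count and Jaccard metric with stronger NLP models.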

Status:

Application

Type:

Utility

Filing date:

6 Nov 2020

Issue date:

12 May 2022