International Business Machines Corporation
IDENTIFICATION OF CHANGES BETWEEN DOCUMENT VERSIONS
Abstract:
One embodiment provides a method, including: obtaining at least two documents, wherein one of the at least two documents comprises a revision different than another of the at least two documents; identifying, within each of the at least two documents, portions corresponding to groups of text containing a conceptual unit; assigning at least a subset of the identified portions to a category type corresponding to a topic of a given portion, wherein the assigning comprises (i) generating a semantic tag for the identified portions in the subset and (ii) tagging the identified portions in the subset with the semantic tag; and determining changes between the at least two documents, wherein the determining comprises (iii) aligning given portions across the at least two documents based upon a relationship between the given portions across the at least two documents, (iv) identifying semantic differences between the aligned portions, and (v) identifying any remaining unaligned portions.
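The steps recited in the abstract (identify portions, tag them by topic, align them across versions, then report differences and unaligned remainders) can be illustrated with a minimal Python sketch. This is not the claimed implementation. Everything here is an assumption made for illustration: blank lines stand in for conceptual-unit boundaries, the hypothetical `TOPIC_KEYWORDS` table stands in for semantic tag generation, character-level similarity from the standard `difflib` module stands in for the relationship used for alignment, and a word-level diff stands in for semantic difference detection. The names `split_portions`, `tag_portion`, and `compare_documents` are likewise hypothetical.

```python
import difflib
import re

# Hypothetical keyword lists standing in for a real semantic tagger.
TOPIC_KEYWORDS = {
    "payment": ("pay", "fee", "invoice"),
    "termination": ("terminate", "termination"),
}


def split_portions(text):
    """Split a document into portions; blank lines delimit a unit (assumption)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def tag_portion(portion):
    """Return the first topic whose keyword occurs in the portion, else 'other'."""
    lowered = portion.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(k in lowered for k in keywords):
            return topic
    return "other"


def compare_documents(old_text, new_text, threshold=0.5):
    """Align same-tag portions by text similarity and report differences."""
    old = [(tag_portion(p), p) for p in split_portions(old_text)]
    new = [(tag_portion(p), p) for p in split_portions(new_text)]
    aligned, removed, used = [], [], set()
    for tag, old_p in old:
        # Greedily pick the most similar unused portion that shares the tag.
        best_j, best_score = None, threshold
        for j, (new_tag, new_p) in enumerate(new):
            if j in used or new_tag != tag:
                continue
            score = difflib.SequenceMatcher(None, old_p, new_p).ratio()
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is None:
            removed.append(old_p)  # no counterpart in the newer version
            continue
        used.add(best_j)
        new_p = new[best_j][1]
        # Word-level diff as a stand-in for semantic difference detection.
        changes = [d for d in difflib.ndiff(old_p.split(), new_p.split())
                   if d.startswith(("+ ", "- "))]
        aligned.append({"tag": tag, "old": old_p, "new": new_p,
                        "changes": changes})
    # Portions of the newer version that were never matched.
    added = [p for j, (_, p) in enumerate(new) if j not in used]
    return {"aligned": aligned, "removed": removed, "added": added}
```

Because alignment is restricted to portions that share a tag, a paragraph that has moved within the document can still be paired with its earlier version, while portions with no sufficiently similar counterpart are returned under `removed` or `added`. The greedy matching and the `threshold` value of 0.5 are simplifications chosen for brevity.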
Utility
2 Mar 2020
2 Sep 2021