Adobe Inc.
Method to identify and extract fragments among large collections of digital documents using repeatability and semantic information
Abstract:
A corpus of documents is processed, for example using algorithms including deep learning and deep neural networks ("DNNs"), to extract fragments from across the corpus. The extracted fragments can then be edited individually and referenced by a plurality of documents, so that a change to a fragment is reflected efficiently in every document that references it. In one example case, a computer-implemented method is provided for extracting fragments in a digital document. The method includes indexing said document to generate a document element ID sequence; processing said document element ID sequence to generate at least one fragment candidate; processing said at least one fragment candidate to generate at least one respective fragment; and utilizing said at least one fragment to perform a reconstruction of said document.
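The four steps named in the abstract (indexing, candidate generation, fragment selection, reconstruction) can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not say how candidates are scored, so the sketch assumes that a document is a list of paragraph strings, that a candidate is any run of element IDs appearing in at least two documents, and that a fragment is a candidate not contained in a longer one. It uses no semantic information or neural network, and all function names (`index_documents`, `find_candidates`, `select_fragments`, `reconstruct`, `render`) are hypothetical.

```python
from collections import defaultdict


def index_documents(docs):
    """Assign an integer ID to each distinct element; return the ID table and one ID sequence per document."""
    element_ids = {}
    sequences = []
    for doc in docs:
        seq = []
        for element in doc:
            if element not in element_ids:
                element_ids[element] = len(element_ids)
            seq.append(element_ids[element])
        sequences.append(seq)
    return element_ids, sequences


def find_candidates(sequences, min_len=2, min_docs=2):
    """Assumed repeatability test: keep ID runs of length >= min_len seen in >= min_docs documents."""
    seen = defaultdict(set)
    for doc_idx, seq in enumerate(sequences):
        for n in range(min_len, len(seq) + 1):
            for i in range(len(seq) - n + 1):
                seen[tuple(seq[i:i + n])].add(doc_idx)
    return [run for run, doc_set in seen.items() if len(doc_set) >= min_docs]


def select_fragments(candidates):
    """Assumed selection rule: drop any candidate that is a contiguous sub-run of a longer kept one."""
    def contains(longer, shorter):
        n = len(shorter)
        return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))

    kept = []
    for cand in sorted(candidates, key=len, reverse=True):
        if not any(contains(k, cand) for k in kept):
            kept.append(cand)
    return kept


def reconstruct(seq, fragments):
    """Rewrite an ID sequence as ('frag', index) references and ('elem', id) leftovers."""
    out, i = [], 0
    while i < len(seq):
        for f_idx, frag in enumerate(fragments):
            if tuple(seq[i:i + len(frag)]) == frag:
                out.append(("frag", f_idx))
                i += len(frag)
                break
        else:
            out.append(("elem", seq[i]))
            i += 1
    return out


def render(rebuilt, fragments, elements_by_id):
    """Expand a reconstructed document back into its element strings."""
    out = []
    for kind, value in rebuilt:
        ids = fragments[value] if kind == "frag" else (value,)
        out.extend(elements_by_id[i] for i in ids)
    return out
```

Because each reconstructed document holds references to shared fragments instead of copies, editing an element inside a fragment once changes the rendered output of every document that references it, which is the effect the abstract describes.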
Utility
11 Oct 2017
22 Dec 2020