International Business Machines Corporation
Automated document filtration with machine learning of annotations for document searching and access
Abstract:
Computer-based methods, systems, and computer readable media are provided for managing documents within a content repository or within document subsets. Documents within the content repository may be classified into either a functional category or a clinical category. Documents are passed to a machine learning annotation and analysis module, which automatically annotates them to indicate relationships between entities. A request for documents that includes one or more search terms is processed, wherein the search terms pertain to one or more entities from the group of gene, gene variant, drug, cancer, and biomedical/clinical term. Documents satisfying the request are identified by comparing the one or more search terms to the annotations and to specific sections of the documents, and by determining the relevance of each document based on that comparison and on the frequency of the one or more search terms in each of the specific sections. The identified documents are ranked according to custom techniques.
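The matching and relevance step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method. The section names, the per-section weights, the annotation bonus, and the names `Document`, `relevance`, and `search` are all hypothetical, chosen only to show how a comparison against annotations and a per-section term frequency might combine into one score. The final ordering by descending score is a stand-in for the unspecified "custom techniques".

```python
from dataclasses import dataclass, field

# Hypothetical per-section weights; the abstract does not specify any.
SECTION_WEIGHTS = {"title": 3.0, "abstract": 2.0, "body": 1.0}
# Hypothetical bonus applied when a search term matches an annotated entity.
ANNOTATION_BONUS = 5.0


@dataclass
class Document:
    doc_id: str
    sections: dict  # section name -> section text
    annotations: set = field(default_factory=set)  # lowercase entity names


def relevance(doc, terms):
    """Combine annotation matches with weighted per-section term frequency.

    Only single-token terms are counted, because sections are split on
    whitespace; this is a simplification for illustration.
    """
    score = 0.0
    for term in (t.lower() for t in terms):
        if term in doc.annotations:
            score += ANNOTATION_BONUS
        for name, text in doc.sections.items():
            count = text.lower().split().count(term)
            score += SECTION_WEIGHTS.get(name, 0.0) * count
    return score


def search(docs, terms):
    """Return ids of documents with a nonzero score, highest score first."""
    scored = [(relevance(d, terms), d.doc_id) for d in docs]
    hits = [(s, i) for s, i in scored if s > 0]
    # Sort by descending score, breaking ties by document id.
    return [i for s, i in sorted(hits, key=lambda p: (-p[0], p[1]))]


if __name__ == "__main__":
    docs = [
        Document(
            "a",
            {"title": "BRAF inhibitors", "body": "vemurafenib targets braf mutations"},
            {"braf", "vemurafenib"},
        ),
        Document("b", {"title": "Unrelated", "body": "braf is mentioned once"}),
        Document("c", {"title": "Nothing", "body": "no match here"}),
    ]
    print(search(docs, ["BRAF"]))
```

In this sketch a document whose annotations contain the search term outranks one that only mentions the term in its body, and a document with no match is excluded from the results.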
Utility
4 Jan 2019
20 Jul 2021