Pacific Biosciences of California, Inc.
SYSTEMS AND METHODS FOR GRAPH BASED MAPPING OF NUCLEIC ACID FRAGMENTS

Last updated:

Abstract:

Technical solutions for mapping long nucleic acid sequence reads to a target sequence are provided. A directed graph, representing all or some of a genome and comprising one or more nonlinear topological components, is obtained for an organism having a heterozygous genome. Each nonlinear topological component has an initiating node and a terminal node connected by at least a first branch and a second branch. One of these branches corresponds to the target sequence. The directed graph uses a plurality of sequence reads from a biological sample of the organism. The sequence reads are overlapped by an unrestricted overhang amount, provided there is a minimum consensus region between each two sequence reads. A query sequence, encompassing at least the initiating node or the terminal node of a first nonlinear topological component, is obtained. The directed graph is used to form a mapping of the query sequence to the directed graph.
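The structure described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the class `SequenceGraph`, the functions `map_query` and `build_toy_graph`, the node names, and the toy sequences are all hypothetical. Exact substring matching over enumerated paths stands in for whatever alignment procedure the application actually claims. The sketch builds one nonlinear topological component (an initiating node and a terminal node joined by two branches) and reports which branch a query that spans both end nodes maps to.

```python
from collections import defaultdict


class SequenceGraph:
    """Toy directed graph whose nodes carry sequence fragments (hypothetical)."""

    def __init__(self):
        self.seq = {}                    # node name -> sequence fragment
        self.edges = defaultdict(list)   # node name -> successor node names

    def add_node(self, name, sequence):
        self.seq[name] = sequence

    def add_edge(self, src, dst):
        self.edges[src].append(dst)

    def paths(self, start, end):
        """Yield every path from start to end (graph assumed acyclic)."""
        stack = [(start, [start])]
        while stack:
            node, path = stack.pop()
            if node == end:
                yield path
                continue
            for nxt in self.edges[node]:
                stack.append((nxt, path + [nxt]))


def map_query(graph, start, end, query):
    """Return (path, offset) for each path whose spelled sequence contains query.

    Exact matching is an assumption made for brevity; real long reads
    would need error-tolerant alignment.
    """
    hits = []
    for path in graph.paths(start, end):
        spelled = "".join(graph.seq[n] for n in path)
        offset = spelled.find(query)
        if offset != -1:
            hits.append((path, offset))
    return hits


def build_toy_graph():
    """One bubble: 'init' and 'term' connected by two alternative branches."""
    g = SequenceGraph()
    g.add_node("init", "ACGT")       # initiating node
    g.add_node("branch_a", "GGA")    # first branch (one haplotype)
    g.add_node("branch_b", "GTA")    # second branch (other haplotype)
    g.add_node("term", "TTCA")       # terminal node
    for src, dst in [("init", "branch_a"), ("init", "branch_b"),
                     ("branch_a", "term"), ("branch_b", "term")]:
        g.add_edge(src, dst)
    return g


# The query overlaps the initiating node, one branch, and the terminal node.
hits = map_query(build_toy_graph(), "init", "term", "GTGTAT")
print(hits)
```

In this toy case the query is consistent with only one of the two branches, so the mapping resolves which branch the query belongs to. A query lying entirely inside the initiating node would match both paths and leave the branch ambiguous.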

Status:

Application

Type:

Utility

Filing date:

24 Jan 2020

Issue date:

30 Jul 2020