Microsoft Corporation
TRACE RECONSTRUCTION OF POLYMER SEQUENCES USING QUALITY SCORES

Last updated:

Abstract:

Polymeric molecules such as deoxyribose nucleic acid (DNA) provide a storage medium for digital data that has advantages over conventional storage media. Accessing digital data stored in polymers includes decoding the output of sequencers which detect the physical order of monomer subunits in the polymers. This output includes errors which are corrected through the process of trace reconstruction. Trace reconstruction identifies a consensus output sequence from a set of noisy output reads provided by a sequencer. The accuracy of trace reconstruction is improved by using weighted majority voting to determine the consensus output sequence. Weights are based on quality labels assigned by the sequencer to its output. Quality labels may be derived from empirical error data. A quality label for a single position in an output read may be determined independently or it may be influenced by quality labels of other nearby positions in the read.

Status:
Application
Type:

Utility

Filling date:

31 Oct 2019

Issue date:

6 May 2021