Illumina, Inc.
SOFTWARE ACCELERATED GENOMIC READ MAPPING
Last updated:
Abstract:
Methods, systems, apparatus, and computer programs are disclosed for software-accelerated genomic data read mapping. In one aspect, the methods can include actions of obtaining a k-mer seed from a genomic data read, generating a genomic signature based on the obtained k-mer seed, determining a reference sequence location that matches at least a portion of the k-mer seed using a hash data structure, wherein the hash data structure comprises N data cells, each comprising a first portion storing a predetermined genomic signature and a second portion storing a value that corresponds to a first occurrence of a reference sequence location that matches at least a portion of the k-mer seed from which the predetermined genomic signature was derived, and selecting the determined reference sequence location as an actual alignment for the obtained k-mer seed based on one or more alignment scores.
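The flow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation. The seed length K, the CRC32-based signature, the linear-probing collision strategy, the match-count score, and all function names are assumptions chosen for illustration; the abstract specifies only that each of the N cells holds a signature and a value corresponding to a first-occurrence reference location, and that selection is based on one or more alignment scores.

```python
import zlib

K = 4  # seed length; an illustrative value, not taken from the source


def signature(kmer):
    """Hypothetical genomic signature: CRC32 of the k-mer text."""
    return zlib.crc32(kmer.encode("ascii"))


def _find_slot(cells, sig):
    """Linear probing (an assumed strategy): return the index of the cell
    holding `sig`, or of the first empty cell; None if the table is full."""
    n = len(cells)
    slot = sig % n
    for _ in range(n):
        if cells[slot] is None or cells[slot][0] == sig:
            return slot
        slot = (slot + 1) % n
    return None


def build_table(reference, n_cells):
    """Build N cells, each a (signature, first reference position) pair."""
    cells = [None] * n_cells
    for pos in range(len(reference) - K + 1):
        sig = signature(reference[pos:pos + K])
        slot = _find_slot(cells, sig)
        if slot is None:
            raise ValueError("n_cells is too small for this reference")
        if cells[slot] is None:  # keep only the first occurrence
            cells[slot] = (sig, pos)
    return cells


def lookup(cells, seed):
    """Return the first reference position recorded for the seed's
    signature, or None if no cell holds that signature."""
    slot = _find_slot(cells, signature(seed))
    if slot is None or cells[slot] is None:
        return None
    return cells[slot][1]


def score(reference, read, start):
    """Toy alignment score: number of matching bases, or -1 if the read
    would not fit inside the reference at `start`."""
    if start < 0 or start + len(read) > len(reference):
        return -1
    window = reference[start:start + len(read)]
    return sum(a == b for a, b in zip(read, window))


def map_read(reference, cells, read):
    """Seed every k-mer of the read, score each candidate placement,
    and return the best (start, score) pair, or None if no seed hits."""
    best = None
    for offset in range(len(read) - K + 1):
        pos = lookup(cells, read[offset:offset + K])
        if pos is None:
            continue
        start = pos - offset  # implied read start on the reference
        s = score(reference, read, start)
        if s >= 0 and (best is None or s > best[1]):
            best = (start, s)
    return best
```

With these assumptions, mapping the read GCAAGGC against the reference ACGTTGCAAGGCTTAC with a 32-cell table returns start position 5 with all 7 bases matching. Because each cell keeps only the first occurrence, a seed that repeats in the reference resolves to its earliest position; a fuller mapper would need some way to reach the later occurrences, which the abstract does not describe.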
Utility
15 Sep 2021
17 Mar 2022