Illumina, Inc.
Variant Classifier Based on Deep Neural Networks

Last updated:

Abstract:

We introduce a variant classifier that uses trained deep neural networks to predict whether a given variant is somatic or germline. Our model has two deep neural networks, a convolutional neural network (CNN) and a fully-connected neural network (FCNN), and takes two inputs: a DNA sequence containing a variant and a set of metadata features correlated with the variant. The metadata features represent the variant's mutation characteristics, read mapping statistics, and occurrence frequency. The CNN processes the DNA sequence and produces an intermediate convolved feature. A feature sequence is derived by concatenating the metadata features with the intermediate convolved feature. The FCNN processes the feature sequence and produces probabilities for the variant being somatic, germline, or noise. A transfer learning strategy is used to train the model on two mutation datasets. Results show that our model outperforms traditional classifiers.
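
The data flow described in the abstract (CNN over the sequence, concatenation with metadata, FCNN producing three class probabilities) can be illustrated with a minimal forward-pass sketch. This is not the patented implementation: the abstract does not specify layer sizes, kernel width, pooling, or activations, so the one-hot encoding, the global max pooling, the single hidden layer, and every name here (`one_hot`, `conv1d_relu_maxpool`, `classify`, `random_params`) are assumptions chosen for illustration. The weights are random and untrained, the metadata values are made up, and the transfer learning step is not shown.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 channels; unknown bases become all zeros."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]

def conv1d_relu_maxpool(x, filters):
    """Valid 1D convolution along the sequence, ReLU, then global max pooling.

    x: list of rows with 4 channels; filters: list of kernels, each k x 4.
    Returns one value per filter (standing in for the intermediate convolved feature).
    """
    out = []
    for kern in filters:
        k = len(kern)
        best = 0.0  # ReLU floor: pooled activation is never below zero
        for start in range(len(x) - k + 1):
            act = sum(kern[i][c] * x[start + i][c] for i in range(k) for c in range(4))
            best = max(best, act)
        out.append(best)
    return out

def dense(vec, weights, bias):
    """Fully-connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(seq, metadata, params):
    """CNN on the sequence, concatenate with metadata, FCNN to 3 probabilities."""
    conv_feature = conv1d_relu_maxpool(one_hot(seq), params["filters"])
    feature_sequence = list(metadata) + conv_feature  # concatenation step
    hidden = [max(0.0, h) for h in dense(feature_sequence, params["w1"], params["b1"])]
    probs = softmax(dense(hidden, params["w2"], params["b2"]))
    return dict(zip(("somatic", "germline", "noise"), probs))

def random_params(n_meta, n_filters=8, kernel=5, hidden=16, seed=0):
    """Untrained random weights; all sizes are illustrative assumptions."""
    rng = random.Random(seed)
    r = lambda: rng.uniform(-0.5, 0.5)
    return {
        "filters": [[[r() for _ in range(4)] for _ in range(kernel)]
                    for _ in range(n_filters)],
        "w1": [[r() for _ in range(n_meta + n_filters)] for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "w2": [[r() for _ in range(hidden)] for _ in range(3)],
        "b2": [0.0] * 3,
    }

# Hypothetical metadata vector (e.g. an allele fraction, a mapping quality,
# a population frequency); the values are invented for this sketch.
metadata = [0.12, 37.0, 0.004]
params = random_params(n_meta=len(metadata))
probs = classify("ACGTTGCAAGGCTTACGATCG", metadata, params)
print(probs)
```

The point of the sketch is the concatenation: the FCNN sees the metadata features and the sequence-derived feature as one vector, so its first layer must accept `n_meta + n_filters` inputs. In practice the metadata would likely need normalization before concatenation, since raw values on different scales (as in the made-up vector above) would dominate the learned sequence feature.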

Status:

Application

Type:

Utility

Filing date:

12 Apr 2019

Issue date:

17 Oct 2019