International Business Machines Corporation
Feature selection for efficient epistasis modeling for phenotype prediction
Abstract:
Various embodiments select markers for modeling epistasis effects. In one embodiment, a processor receives a set of genetic markers and a phenotype. A relevance score with respect to the phenotype is determined for each genetic marker in the set. A threshold is set based on the relevance score of the genetic marker with the highest relevance score. For at least one genetic marker in the set, a relevance score is determined for at least one interaction between that marker and at least one other genetic marker in the set. The at least one interaction is added to a top-k feature set when its relevance score satisfies the threshold.
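The selection flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the claimed method: the abstract does not say how relevance is scored, how the threshold is derived from the top single-marker score, or how an interaction is encoded. Here relevance is assumed to be the absolute Pearson correlation with the phenotype, the threshold is assumed to be a fraction (`threshold_fraction`, a hypothetical parameter) of the highest single-marker score, and an interaction is assumed to be the element-wise product of two marker columns. The names `abs_pearson` and `select_top_k` are hypothetical.

```python
import heapq
import math
from itertools import combinations


def abs_pearson(x, y):
    """Absolute Pearson correlation; 0.0 if either input is constant.

    Assumed stand-in for the unspecified relevance score.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / math.sqrt(var_x * var_y)


def select_top_k(markers, phenotype, k, threshold_fraction=1.0):
    """Return (threshold, top-k interactions) for the given markers.

    markers: dict mapping marker name -> list of numeric genotype codes,
             one value per sample (assumed encoding).
    phenotype: list of numeric phenotype values, one per sample.
    """
    # Step 1: relevance score of every single marker w.r.t. the phenotype.
    single = {name: abs_pearson(col, phenotype) for name, col in markers.items()}

    # Step 2: threshold based on the highest single-marker score
    # (the exact rule is an assumption).
    threshold = threshold_fraction * max(single.values())

    # Step 3: score pairwise interactions and keep those satisfying the
    # threshold, retaining at most k of them in a min-heap.
    heap = []
    for a, b in combinations(sorted(markers), 2):
        product = [u * v for u, v in zip(markers[a], markers[b])]
        score = abs_pearson(product, phenotype)
        if score >= threshold:
            heapq.heappush(heap, (score, (a, b)))
            if len(heap) > k:
                heapq.heappop(heap)  # drop the weakest retained interaction

    # Strongest interaction first.
    return threshold, sorted(heap, reverse=True)
```

Calling `select_top_k(markers, phenotype, k=10)` on a small dictionary of marker columns returns the threshold together with a list of `(score, (marker_a, marker_b))` tuples. The sketch scores every pair exhaustively, which is quadratic in the number of markers; it makes no attempt at the efficiency the title refers to.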
Utility
14 Sep 2018
17 May 2022