International Business Machines Corporation
DATASET CREATION FOR DEEP-LEARNING MODEL
Abstract:
One embodiment provides a method, including: receiving a training dataset to be utilized for training a deep-learning model; identifying a plurality of aspects of the training dataset, wherein each of the plurality of aspects corresponds to one of a plurality of categories of operations that can be performed on the training dataset; measuring, for each of the plurality of aspects, an amount of variance of the aspect within the training dataset; creating additional data to be incorporated into the training dataset, wherein the additional data comprise data generated for each of the aspects having a variance less than a predetermined amount, wherein the data generated for an aspect results in the corresponding aspect having an amount of variance at least equal to the predetermined amount; and incorporating the additional data into the training dataset.
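The steps in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes each sample is a NumPy array with values in [0, 1], treats mean pixel brightness as the single "aspect", and uses random brightness scaling as the corresponding operation. The names `brightness`, `scale_brightness`, `ASPECTS`, `aspect_variance`, `augment_dataset`, and the `max_new` cap are all hypothetical and chosen only for illustration.

```python
import numpy as np

def brightness(sample):
    # Hypothetical aspect measure: mean intensity of one sample.
    return float(sample.mean())

def scale_brightness(sample, rng):
    # Hypothetical operation for the brightness aspect:
    # scale by a random factor and keep values in [0, 1].
    return np.clip(sample * rng.uniform(0.5, 1.5), 0.0, 1.0)

# Each aspect maps to (measure, operation that generates new data for it).
ASPECTS = {"brightness": (brightness, scale_brightness)}

def aspect_variance(dataset, measure):
    # Variance of one aspect's measure across the whole dataset.
    return float(np.var([measure(s) for s in dataset]))

def augment_dataset(dataset, aspects, threshold, rng, max_new=1000):
    """Return the dataset plus generated samples, adding data for each
    aspect until its variance reaches `threshold` (or `max_new` is hit)."""
    originals = list(dataset)
    augmented = list(dataset)
    for name, (measure, transform) in aspects.items():
        added = 0
        # Generate data only for aspects whose variance is below the threshold.
        while aspect_variance(augmented, measure) < threshold and added < max_new:
            base = originals[rng.integers(len(originals))]
            augmented.append(transform(base, rng))
            added += 1
    return augmented
```

For example, calling `augment_dataset(data, ASPECTS, 0.005, np.random.default_rng(0))` on a list of identical arrays would append brightness-scaled copies until the brightness variance is at least 0.005. The `max_new` cap is a safeguard added in this sketch so the loop ends even if an operation cannot raise the variance; the abstract does not describe such a limit.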
Utility
24 Feb 2020
26 Aug 2021