Intuit Inc.
METHOD AND SYSTEM FOR GENERATING SYNTHETIC DATA USING A REGRESSION MODEL WHILE PRESERVING STATISTICAL PROPERTIES OF UNDERLYING DATA
Abstract:
A method for generating a synthetic dataset involves generating discretized synthetic data based on driving a model of a cumulative distribution function (CDF) with random numbers. The CDF is based on a source dataset. The method further includes generating the synthetic dataset from the discretized synthetic data by selecting, for inclusion into the synthetic dataset, values from a multitude of entries of the source dataset, based on the discretized synthetic data, and providing the synthetic dataset to a downstream application that is configured to operate on the source dataset.
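The steps in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the title refers to a regression model of the CDF, whereas this sketch substitutes a simple empirical histogram CDF. The function name `synthesize`, the equal-width binning, and the parameters `n_bins` and `seed` are assumptions made for illustration only.

```python
import numpy as np


def synthesize(source, n_samples, n_bins=10, seed=0):
    """Hypothetical sketch: draw synthetic values from a source dataset.

    Assumption: an empirical histogram CDF stands in for the modeled CDF.
    """
    rng = np.random.default_rng(seed)
    source = np.asarray(source, dtype=float)

    # Discretize the source values into equal-width bins (an assumed scheme).
    edges = np.linspace(source.min(), source.max(), n_bins + 1)
    # Bin index of every source entry, in the range 0 .. n_bins - 1.
    bin_of_entry = np.digitize(source, edges[1:-1])

    # Empirical CDF over the bins.
    counts = np.bincount(bin_of_entry, minlength=n_bins)
    cdf = np.cumsum(counts) / counts.sum()

    # Drive the CDF with uniform random numbers (inverse-CDF lookup).
    # The result is the discretized synthetic data: one bin index per sample.
    u = rng.random(n_samples)
    synthetic_bins = np.searchsorted(cdf, u, side="right")

    # For each sampled bin, select actual values from the source entries
    # that fall into that bin.
    synthetic = np.empty(n_samples)
    for b in np.unique(synthetic_bins):
        mask = synthetic_bins == b
        candidates = source[bin_of_entry == b]
        synthetic[mask] = rng.choice(candidates, size=int(mask.sum()))
    return synthetic


# Example usage with made-up source values.
example_source = [1.0, 2.0, 2.5, 3.0, 10.0, 11.0, 12.0, 50.0]
example_synthetic = synthesize(example_source, n_samples=1000, n_bins=5, seed=1)
```

Because every synthetic value is copied from an entry of the source dataset, the output keeps the same value domain and format as the source, which is consistent with the abstract's point that a downstream application configured for the source dataset can consume the synthetic one.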
Status:
Application
Type:
Utility
Filing date:
27 Nov 2019
Issue date:
27 May 2021