Intuit Inc.
METHOD AND SYSTEM FOR GENERATING SYNTHETIC DATA USING A REGRESSION MODEL WHILE PRESERVING STATISTICAL PROPERTIES OF UNDERLYING DATA

Abstract:

A method for generating a synthetic dataset includes generating discretized synthetic data by driving a model of a cumulative distribution function (CDF) with random numbers, where the CDF is based on a source dataset. The method further includes generating the synthetic dataset from the discretized synthetic data by selecting, based on the discretized synthetic data, values from a multitude of entries of the source dataset for inclusion in the synthetic dataset. The synthetic dataset is then provided to a downstream application that is configured to operate on the source dataset.
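
The two steps named in the abstract can be illustrated with a minimal sketch. This is not the claimed implementation. It assumes a single numeric column, equal-width bins, and a plain empirical histogram CDF in place of the regression model mentioned in the title. The function name `synthesize` and the parameters `n_bins` and `seed` are hypothetical and do not come from the application.

```python
import numpy as np

def synthesize(source, n_samples, n_bins=10, seed=0):
    """Illustrative only: inverse-transform sampling over a binned CDF,
    followed by selection of real source values from the sampled bins."""
    rng = np.random.default_rng(seed)
    source = np.asarray(source, dtype=float)

    # Discretize the source values into equal-width bins (an assumption).
    edges = np.linspace(source.min(), source.max(), n_bins + 1)
    bin_ids = np.digitize(source, edges[1:-1])  # bin index in 0..n_bins-1

    # Empirical CDF over the bins, built from the source dataset.
    counts = np.bincount(bin_ids, minlength=n_bins)
    cdf = np.cumsum(counts) / counts.sum()

    # Drive the CDF with uniform random numbers; the resulting bin
    # indices play the role of the discretized synthetic data.
    u = rng.random(n_samples)
    synth_bins = np.searchsorted(cdf, u, side="right")

    # For each sampled bin, select actual values from the source entries
    # that fall into that bin. Empty bins are never sampled because they
    # add no mass to the CDF.
    out = np.empty(n_samples)
    for b in np.unique(synth_bins):
        pool = source[bin_ids == b]
        mask = synth_bins == b
        out[mask] = rng.choice(pool, size=int(mask.sum()))
    return out
```

Because every output value is drawn from the source entries, a downstream application that expects values in the source's format and range can consume the result unchanged. A real system would presumably handle multiple correlated columns, which this sketch does not attempt.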

Status:

Application

Type:

Utility

Filing date:

27 Nov 2019

Issue date:

27 May 2021