International Business Machines Corporation
INTEGRATING SIMULATED AND REAL-WORLD DATA TO IMPROVE MACHINE LEARNING MODELS

Last updated:

Abstract:

Techniques for data integration and labeling are provided. Training real-world signal data is collected for a physical environment, where the training real-world signal data comprises at least one of (i) coordinate information or (ii) a direction to move. Simulated signal data is generated for a first portion of the physical environment, and an aggregate data set is generated comprising the training real-world signal data and the simulated signal data. A machine learning (ML) model is trained using the aggregate data set. A first real-world data point is received, where the first real-world data point does not include coordinate information, and the first real-world data point is labeled based at least in part on coordinate information of the aggregate data set.
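
The flow described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method claimed in the application: the access-point layout, the log-distance path-loss simulator, the nearest-neighbour estimator standing in for the ML model, and all names and sample values below are hypothetical.

```python
import math

# Hypothetical access-point positions in metres (not taken from the source).
ACCESS_POINTS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]


def simulate_signal(x, y, tx_power=-30.0, exponent=2.0):
    """Simulated signal strengths at (x, y), assuming a log-distance path-loss model."""
    readings = []
    for ax, ay in ACCESS_POINTS:
        dist = max(math.hypot(x - ax, y - ay), 1.0)  # clamp to avoid log10(0)
        readings.append(tx_power - 10.0 * exponent * math.log10(dist))
    return readings


def build_aggregate(real_points, simulated_coords):
    """Combine labeled real-world samples with simulated samples for other coordinates."""
    simulated = [(simulate_signal(x, y), (x, y)) for x, y in simulated_coords]
    return list(real_points) + simulated


def label_point(signal, aggregate, k=3):
    """Estimate coordinates for an unlabeled signal as the mean of its k nearest samples."""
    nearest = sorted(aggregate, key=lambda sample: math.dist(signal, sample[0]))[:k]
    xs = [coord[0] for _, coord in nearest]
    ys = [coord[1] for _, coord in nearest]
    return (sum(xs) / len(nearest), sum(ys) / len(nearest))


# Real-world training samples as (signal vector, (x, y)); values invented for illustration.
real = [
    ([-33.0, -49.1, -49.1], (1.0, 1.0)),
    ([-48.3, -51.1, -39.0], (2.0, 8.0)),
]

# Simulate a portion of the environment that the real-world samples do not cover.
sim_coords = [(8.0, 2.0), (8.0, 8.0), (5.0, 5.0)]
aggregate = build_aggregate(real, sim_coords)

# A new data point that arrives without coordinate information.
unlabeled = simulate_signal(8.0, 8.0)
estimate = label_point(unlabeled, aggregate, k=1)
```

The nearest-neighbour lookup is used here only because it makes the labeling step easy to follow; any regressor trained on the aggregate data set could take its place.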

Status:

Application

Type:

Utility

Filing date:

15 Jan 2020

Issue date:

15 Jul 2021