International Business Machines Corporation
STREAMLINING DATA PROCESSING OPTIMIZATIONS FOR MACHINE LEARNING WORKLOADS

Abstract:

Techniques for refinement of data pipelines are provided. An original file of serialized objects is received, and an original pipeline comprising a plurality of transformations is identified based on the original file. A first computing cost is determined for a first transformation of the plurality of transformations. The first transformation is modified using a predefined optimization, and a second cost of the modified first transformation is determined. Upon determining that the second cost is lower than the first cost, the first transformation is replaced, in the original pipeline, with the optimized first transformation.
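The replace-if-cheaper step described in the abstract can be illustrated with a minimal sketch. This is not the claimed implementation: the names `measure_cost`, `refine_pipeline`, `slow_double`, `fast_double`, and `add_one` are hypothetical, wall-clock time stands in for "computing cost", a dictionary lookup stands in for the "predefined optimization", and reading the pipeline from the file of serialized objects is omitted.

```python
import time
from typing import Any, Callable, Dict, List

# Assumption: a transformation is any callable that maps data to data.
Transformation = Callable[[Any], Any]


def measure_cost(transform: Transformation, sample: Any, repeats: int = 3) -> float:
    """Hypothetical cost model: best wall-clock time over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        transform(sample)
        best = min(best, time.perf_counter() - start)
    return best


def refine_pipeline(
    pipeline: List[Transformation],
    optimizations: Dict[Transformation, Transformation],
    sample: Any,
    cost_fn: Callable[[Transformation, Any], float] = measure_cost,
) -> List[Transformation]:
    """Return a new pipeline in which a step is replaced by its predefined
    alternative only when the alternative has a lower cost on the sample."""
    refined: List[Transformation] = []
    data = sample
    for step in pipeline:
        chosen = step
        candidate = optimizations.get(step)
        if candidate is not None:
            first_cost = cost_fn(step, data)        # cost of the original step
            second_cost = cost_fn(candidate, data)  # cost of the modified step
            if second_cost < first_cost:
                chosen = candidate
        refined.append(chosen)
        data = chosen(data)  # feed the output to the next step
    return refined


# Illustrative transformations (hypothetical, not from the source).
def slow_double(xs: List[int]) -> List[int]:
    return [x + x for x in xs]


def fast_double(xs: List[int]) -> List[int]:
    return [2 * x for x in xs]


def add_one(xs: List[int]) -> List[int]:
    return [x + 1 for x in xs]


if __name__ == "__main__":
    original = [slow_double, add_one]
    refined = refine_pipeline(original, {slow_double: fast_double}, list(range(1000)))
    print([step.__name__ for step in refined])
```

Passing the cost function as a parameter keeps the comparison logic separate from how cost is measured, so a different cost model (memory, I/O, or a fixed estimate) could be substituted without changing the replacement rule.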

Status:

Application

Type:

Utility

Filing date:

2 Jun 2020

Issue date:

2 Dec 2021