BCE Inc.
SYSTEM AND METHOD FOR DATA INGESTION AND WORKFLOW GENERATION
Last updated:
Abstract:
A system and method are provided for coordinating data ingestion and workflow. In an implementation, the method includes: obtaining, at a processor, a plurality of data ingestion jobs; identifying, based on a stored batching factor, a subset of the plurality of data ingestion jobs to be grouped together; performing batch processing of the subset of data ingestion jobs together in a single shell action; and creating a workflow schedule based on the single shell action comprising the batched data ingestion jobs. The present disclosure advantageously provides batch processing of the data ingestion jobs themselves, in contrast to existing approaches, which may use data ingestion jobs to perform batch processing on underlying data. The data ingestion jobs can be Sqoop jobs, or can use other formats or approaches, such as Kafka, Flume or Spark.
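The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes the stored batching factor is the maximum number of jobs per batch, that each job is represented by a single command string, and that a "shell action" is one shell command line chaining the batched jobs with `&&`. The names `IngestionJob`, `batch_jobs`, `build_shell_action` and `build_workflow_schedule` are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IngestionJob:
    """Hypothetical representation of one data ingestion job."""
    name: str
    command: str  # shell command that runs the job, e.g. a saved Sqoop job


def batch_jobs(jobs: List[IngestionJob], batching_factor: int) -> List[List[IngestionJob]]:
    """Split jobs into consecutive groups of at most `batching_factor` jobs.

    Assumption: the batching factor is a maximum batch size.
    """
    if batching_factor < 1:
        raise ValueError("batching_factor must be at least 1")
    return [
        jobs[start:start + batching_factor]
        for start in range(0, len(jobs), batching_factor)
    ]


def build_shell_action(batch: List[IngestionJob]) -> str:
    """Combine one batch into a single shell command line.

    Jobs are chained with '&&', so a failing job stops the rest of its batch.
    """
    return " && ".join(job.command for job in batch)


def build_workflow_schedule(jobs: List[IngestionJob], batching_factor: int) -> List[str]:
    """Return one shell action per batch, in the order the jobs were given."""
    return [build_shell_action(batch) for batch in batch_jobs(jobs, batching_factor)]


if __name__ == "__main__":
    example_jobs = [
        IngestionJob(f"job{i}", f"sqoop job --exec job{i}") for i in range(5)
    ]
    for step, action in enumerate(build_workflow_schedule(example_jobs, 2), start=1):
        print(f"step {step}: {action}")
```

With five jobs and a batching factor of 2, the sketch produces three shell actions, the last containing a single job. Chaining with `&&` is one possible design choice; a scheduler that must continue past failed jobs would need a different way of combining the commands.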
Utility
15 Dec 2020
17 Jun 2021