SAP SE
Data grouping for efficient parallel processing
Abstract:
Provided herein are an improved process for distributing data objects and a process for reducing skew in groups of data objects to be processed in parallel. A request for parallel processing of a plurality of data objects is received. One or more groups for distributing the data objects are generated. Hash value intervals for the one or more groups are determined. Hash values for the plurality of data objects are determined. The data objects are distributed into the one or more groups based on their respective hash values and the hash value intervals. The groups comprising the distributed data objects are processed in parallel. The processing results of the plurality of data objects are provided in response to the request.
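The distribution steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a 32-bit hash space, equal-width hash value intervals, SHA-256 as the hash function, and a thread pool as the parallel executor, none of which the abstract specifies. The names `hash_value`, `make_intervals`, `distribute`, and `process_in_parallel` are hypothetical, and the skew-reduction process is not shown because the abstract gives no details of it.

```python
import bisect
import hashlib
from concurrent.futures import ThreadPoolExecutor

HASH_SPACE = 2 ** 32  # assumption: hash values fall in [0, 2**32)


def hash_value(obj) -> int:
    """Map an object to a deterministic integer in [0, HASH_SPACE)."""
    digest = hashlib.sha256(repr(obj).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def make_intervals(num_groups: int):
    """Split the hash space into contiguous half-open intervals [lo, hi)."""
    bounds = [HASH_SPACE * i // num_groups for i in range(num_groups + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(num_groups)]


def distribute(objects, intervals):
    """Place each object in the group whose interval contains its hash value."""
    groups = [[] for _ in intervals]
    upper_bounds = [hi for _, hi in intervals]
    for obj in objects:
        # bisect_right counts the upper bounds <= hash, which is the index
        # of the first interval whose exclusive upper bound exceeds the hash.
        idx = bisect.bisect_right(upper_bounds, hash_value(obj))
        groups[idx].append(obj)
    return groups


def process_in_parallel(groups, work):
    """Apply `work` to every object, one worker task per group."""
    with ThreadPoolExecutor(max_workers=max(1, len(groups))) as pool:
        per_group = pool.map(lambda group: [work(o) for o in group], groups)
        return [result for group in per_group for result in group]


if __name__ == "__main__":
    intervals = make_intervals(4)
    groups = distribute(range(1000), intervals)
    print([len(g) for g in groups])  # group sizes show how even the split is
    results = process_in_parallel(groups, lambda x: x * x)
    print(len(results))
```

Equal-width intervals give balanced groups only when the hash values are spread roughly uniformly. Moving the interval boundaries is one place where a skew correction could act, though whether the patent does so is not stated in the abstract.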
Utility
6 Jun 2018
15 Jun 2021