Systems for parallel processing of datasets with dynamic skew compensation
Abstract:
Systems and methods are provided for parallel processing of datasets with dynamic skew compensation. The disclosed systems and methods may increase the efficiency of dataset processing by imposing maximum size limits on the tasks run in a parallel processing environment. The disclosed systems may include a component that generates a target partition of a variable, a database that stores data elements, a cluster that generates one or more output files based on the target partition and the data elements, and a display device that displays analysis results for the target partition using the one or more output files. Generating the output files may comprise creating a calculation partition, mapping the data elements according to the calculation partition, and generating the one or more output files based on the mapped data elements. The calculation partition may depend on the target partition and on a uniform partition that partitions values based on one or more of statistical measures and pseudorandom functions.
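The idea of a calculation partition derived from a target partition and a uniform partition can be illustrated with a minimal Python sketch. It is not the patented implementation. All names here (`build_split_counts`, `assign_task`, `process`, `max_task_size`) are hypothetical, CRC32 stands in for the pseudorandom function, and a simple sum stands in for the analysis. Because assignment is hash-based, task sizes are only approximately capped.

```python
import math
import zlib
from collections import Counter, defaultdict


def build_split_counts(keys, max_task_size):
    """For each target-partition key, decide how many sub-buckets it needs
    so that the expected bucket size stays at or below max_task_size."""
    counts = Counter(keys)
    return {k: max(1, math.ceil(c / max_task_size)) for k, c in counts.items()}


def assign_task(record_id, key, split_counts):
    """Map a record to a calculation-partition cell: (target key, sub-bucket).
    CRC32 is an assumed stand-in for a deterministic pseudorandom function."""
    sub_bucket = zlib.crc32(str(record_id).encode("utf-8")) % split_counts[key]
    return (key, sub_bucket)


def process(records, max_task_size):
    """records: iterable of (record_id, key, value) tuples.
    Returns per-key totals and the size of each task."""
    records = list(records)
    split_counts = build_split_counts([key for _, key, _ in records], max_task_size)

    # Map step: group values by calculation-partition cell.
    tasks = defaultdict(list)
    for record_id, key, value in records:
        tasks[assign_task(record_id, key, split_counts)].append(value)

    # Each task could run on a separate worker; here they run sequentially.
    partials = {cell: sum(values) for cell, values in tasks.items()}

    # Merge step: fold sub-bucket results back into the target partition.
    output = defaultdict(int)
    for (key, _sub_bucket), partial in partials.items():
        output[key] += partial

    task_sizes = {cell: len(values) for cell, values in tasks.items()}
    return dict(output), task_sizes


if __name__ == "__main__":
    # Key "a" is skewed (10 records) compared with key "b" (2 records).
    data = [(i, "a", 1) for i in range(10)] + [(i + 10, "b", 1) for i in range(2)]
    totals, sizes = process(data, max_task_size=4)
    print(totals)
    print(sizes)
```

In this sketch the skewed key is spread over several smaller tasks while the final per-key totals are unchanged. This only works directly for analyses whose partial results can be merged, such as sums and counts.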
Utility
21 Sep 2016
26 Jan 2021