Roblox Corporation
Addressing data skew using map-reduce
Abstract:
A system and method include using a queue with map-reduce. The system includes a computer cluster that is to execute, by a first node, a first reduce operation on a first location of data to generate a first plurality of markers indicative of data at the first location of data, and to execute, by a second node, a second reduce operation on a second location of data to generate a second plurality of markers indicative of data at the second location of data. Responsive to generation of one or more markers, the computer cluster is to submit the one or more markers to a queue. Responsive to the first node completing the first reduce operation, the computer cluster is to direct the first node to perform a first copy operation that copies first data identified by a first marker of the one or more markers in the queue.
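The flow in the abstract can be illustrated with a minimal, single-process sketch. This is not the patented implementation: threads stand in for cluster nodes, an in-memory dictionary stands in for the copy destination, and the names `run_cluster`, `chunk_size`, and the `(node_id, start, stop)` marker format are all assumptions made for illustration. It also assumes each key appears in only one partition, as it would after a shuffle step.

```python
import queue
import threading


def run_cluster(partitions, chunk_size=2):
    """Reduce each partition on its own thread, then share copy work via a queue.

    partitions: list of lists of (key, value) records, one list per node.
    Returns (destination, copies_by_node).
    """
    markers = queue.Queue()                 # shared marker queue
    reduced = [[] for _ in partitions]      # reduce output, one list per location
    destination = {}                        # hypothetical stand-in for the copy target
    copies_by_node = [0] * len(partitions)
    lock = threading.Lock()
    pending_reducers = [len(partitions)]
    all_reduced = threading.Event()

    def node(node_id):
        # Reduce phase: sum the values for each key at this node's location.
        totals = {}
        for key, value in partitions[node_id]:
            totals[key] = totals.get(key, 0) + value
        items = sorted(totals.items())
        # Submit one marker per chunk of reduced output (assumed marker format).
        for start in range(0, len(items), chunk_size):
            stop = min(start + chunk_size, len(items))
            reduced[node_id].extend(items[start:stop])
            markers.put((node_id, start, stop))
        with lock:
            pending_reducers[0] -= 1
            if pending_reducers[0] == 0:
                all_reduced.set()

        # Copy phase: once this node has finished reducing, it takes any marker
        # from the queue, including markers produced by a more heavily loaded node.
        while True:
            try:
                src, start, stop = markers.get(timeout=0.01)
            except queue.Empty:
                if all_reduced.is_set() and markers.empty():
                    break
                continue
            chunk = reduced[src][start:stop]
            with lock:
                destination.update(chunk)
                copies_by_node[node_id] += 1

    threads = [threading.Thread(target=node, args=(i,))
               for i in range(len(partitions))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return destination, copies_by_node


if __name__ == "__main__":
    # Node 0 holds most of the data (skew); node 1 holds a single record.
    skewed = [
        [("a", 1), ("a", 2), ("b", 3), ("c", 4), ("d", 5), ("e", 6)],
        [("z", 7)],
    ]
    dest, copies = run_cluster(skewed)
    print(dest)
    print(copies)
```

Because markers go into one shared queue rather than staying with the node that produced them, a node with a small partition can pick up copy work for a node with a large one. Which node performs which copy depends on thread timing, so only the final contents of `destination` are deterministic in this sketch.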
Utility
26 Jul 2018
11 May 2021