Bank of America Corporation
JOINER FOR DISTRIBUTED DATABASES
Abstract:
A joiner accesses a first sorted dataset and a second sorted dataset. Each dataset includes a corresponding plurality of data blocks, each including a set of records. Each record is associated with a corresponding record key. A set of first records for each first data block of the first dataset is arranged based on values of the first record keys. A set of second records for each second data block of the second dataset is arranged based on values of the second record keys. A first root element is extracted from the first sorted dataset. A second root element is extracted from the second sorted dataset. In response to determining that the first and second root elements match, an output is generated by joining the first record associated with the first root element with the second record associated with the second root element.
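The matching step described in the abstract resembles a sort-merge join, and a minimal Python sketch may make it concrete. This is an illustration under stated assumptions, not the patented implementation: the names `sorted_stream` and `join_sorted_datasets` are hypothetical, each record is modeled as a `(key, payload)` tuple, keys are assumed unique within a dataset, and the "root element" is taken to be the smallest remaining key across a dataset's blocks (obtained here with a heap-based merge). The abstract only describes what happens when the roots match; advancing the side with the smaller key on a mismatch is an assumption.

```python
import heapq
from typing import Any, Iterable, Iterator, List, Tuple

Record = Tuple[int, Any]  # (record key, payload); hypothetical record layout
Block = List[Record]      # records within a block are sorted by key


def sorted_stream(blocks: Iterable[Block]) -> Iterator[Record]:
    """Merge already-sorted blocks into one key-ordered stream.

    Each item yielded stands in for the 'root element' of the dataset:
    the record with the smallest remaining key.
    """
    return heapq.merge(*blocks, key=lambda rec: rec[0])


def join_sorted_datasets(
    first_blocks: Iterable[Block], second_blocks: Iterable[Block]
) -> List[Tuple[int, Any, Any]]:
    """Join two sorted datasets on record key (keys assumed unique per dataset)."""
    first = sorted_stream(first_blocks)
    second = sorted_stream(second_blocks)
    root_a = next(first, None)   # first root element
    root_b = next(second, None)  # second root element
    output = []
    while root_a is not None and root_b is not None:
        if root_a[0] == root_b[0]:
            # Roots match: join the two records into one output row.
            output.append((root_a[0], root_a[1], root_b[1]))
            root_a = next(first, None)
            root_b = next(second, None)
        elif root_a[0] < root_b[0]:
            # Assumption: discard the smaller root and extract the next one.
            root_a = next(first, None)
        else:
            root_b = next(second, None)
    return output


first_dataset = [[(1, "a"), (4, "d")], [(2, "b"), (5, "e")]]
second_dataset = [[(2, "x"), (3, "y")], [(5, "z")]]
print(join_sorted_datasets(first_dataset, second_dataset))
# prints [(2, 'b', 'x'), (5, 'e', 'z')]
```

Because both streams are consumed in key order, each record is examined once, so the join does not need to hold either dataset fully in memory.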
Utility
18 Sep 2019
18 Mar 2021