Snowflake Inc.
Multidimensional and multi-relation sampling for implementing multidimensional two-sided interval joins
Last updated:
Abstract:
Disclosed herein are systems and methods for multidimensional and multi-relation sampling for implementing multidimensional two-sided interval joins. In an embodiment, a data platform receives query instructions for a two-sided N dimensional interval join, where N is an integer greater than 1. The two-sided N dimensional interval join has an interval-join predicate that compares intervals determined from the input relations in each of N dimensions. The data platform samples interval sizes in one or more input relations, and demarcates an N dimensional input domain based on the sampling. The data platform implements the two-sided N dimensional interval join using an N dimensional band join followed by a filter that applies the interval-join predicate. The N dimensional band join includes a hash join keyed to N dimensional domain cells overlapped at least in part by intervals in the input relations in each of the N dimensions.
Utility
23 Apr 2021
7 Dec 2021