Amazon.com, Inc.
Dynamic distributed data clustering using multi-level hash trees

Last updated:

Abstract:

Techniques are described for clustering data at the point of ingestion for storage using scalable storage resources. To cluster data at the point of ingestion, a data ingestion and query service uses a multilevel hash tree (MLHT)-based index to map a hierarchy of attribute values associated with each data element onto a point of a MLHT (which itself conceptually maps onto a continuous range of values). The total range of the MLHT is divided into one or more data partitions, each of which is mapped to one or more physical storage resources. A mapping algorithm uses the hierarchy of attribute fields to calculate the position of each data element ingested and, consequently, a physical storage resource at which to store the data element.

Status:
Grant
Type:

Utility

Filling date:

28 Jun 2018

Issue date:

14 Sep 2021