Microsoft Corporation
CONSISTENCY CHECKING FOR DISTRIBUTED ANALYTICAL DATABASE SYSTEMS
Abstract:
Embodiments described herein are directed to detecting data inconsistencies within a distributed database and identifying their cause. For example, lineage events are emitted from different components of the distributed system that operate on various data files. A consistency checking engine analyzes these events and detects inconsistencies with respect to the data files. The embodiments described herein check the integrity of the database and assist in understanding the root cause in case of corruption. Moreover, they provide a timeline for the corruption and indicate whether it is repairable. These properties enable determining the right time to restore the customer's database or the right set of actions to repair the corruption. In the case of repairable corruption, the correct compensating repair actions may be applied.
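As an illustration only, the following Python sketch shows one way a consistency checking engine might replay lineage events and flag inconsistencies with respect to data files. It is not the claimed implementation. The names `LineageEvent`, `Inconsistency`, and `check_consistency`, the three actions ("create", "reference", "delete"), and the integer timestamps are all assumptions introduced for this example. Repairability is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LineageEvent:
    """Hypothetical lineage event emitted by a component for one data file."""
    timestamp: int
    component: str
    file_id: str
    action: str  # assumed vocabulary: "create", "reference", or "delete"


@dataclass(frozen=True)
class Inconsistency:
    """Hypothetical finding: when it occurred, on which file, and who hit it."""
    timestamp: int
    file_id: str
    component: str
    reason: str


def check_consistency(events: List[LineageEvent]) -> List[Inconsistency]:
    """Replay events in timestamp order and report inconsistent file usage."""
    # file_id -> the last create/delete event seen for that file
    state: Dict[str, LineageEvent] = {}
    findings: List[Inconsistency] = []

    for ev in sorted(events, key=lambda e: e.timestamp):
        last = state.get(ev.file_id)
        if ev.action == "create":
            state[ev.file_id] = ev
        elif ev.action == "delete":
            if last is None or last.action == "delete":
                findings.append(Inconsistency(
                    ev.timestamp, ev.file_id, ev.component,
                    "delete of a file that does not exist"))
            state[ev.file_id] = ev
        elif ev.action == "reference":
            if last is None:
                findings.append(Inconsistency(
                    ev.timestamp, ev.file_id, ev.component,
                    "reference to a file that was never created"))
            elif last.action == "delete":
                # The deleting component is a candidate root cause.
                findings.append(Inconsistency(
                    ev.timestamp, ev.file_id, ev.component,
                    f"reference to a file deleted by {last.component} "
                    f"at t={last.timestamp}"))
    return findings
```

In this sketch, the timestamp of the first finding gives a rough timeline for the corruption, and the component named in the reason string points at a possible cause. A real engine would need to handle clock skew between components and a richer set of file operations.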
Utility
25 Feb 2021
25 Aug 2022