VMware, Inc.
Global deduplication on distributed storage using segment usage tables

Abstract:

Solutions are disclosed for deduplicating blocks in a multi-writer log-structured file system. Solutions include selecting candidate segments in a storage medium; reading blocks of the candidate segments; determining whether any blocks are duplicates; updating a reference count for the duplicate blocks; identifying unique blocks; writing at least a portion of the unique blocks to a log; determining whether the log has accumulated a full segment of data; based at least on determining that the log has accumulated a full segment of data, writing the full segment to the storage medium; updating a segment usage table (SUT) to mark the candidate segments as free; and updating the SUT to mark a segment of the storage medium as no longer free. Some examples identify a window start time and stop time, because older segments have been deduped and younger segments may be volatile. Some examples adjust the window to improve performance.
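
The steps in the abstract can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the `Store` class, the `SEGMENT_BLOCKS` size, the function names, and the use of SHA-256 content hashes to detect duplicates are all assumptions made for illustration, since the abstract does not say how duplicates are determined or how the SUT is laid out.

```python
import hashlib
from dataclasses import dataclass, field

SEGMENT_BLOCKS = 4  # hypothetical: number of blocks in one full segment


@dataclass
class Store:
    """Hypothetical in-memory stand-in for the storage medium and its metadata."""
    segments: dict = field(default_factory=dict)    # segment id -> list of blocks (bytes)
    sut: dict = field(default_factory=dict)         # segment id -> True if the segment is free
    written_at: dict = field(default_factory=dict)  # segment id -> time the segment was written
    refcount: dict = field(default_factory=dict)    # block hash -> reference count
    log: list = field(default_factory=list)         # unique blocks waiting to fill a segment


def select_candidates(store, start, stop):
    """Pick in-use segments written inside the [start, stop] window."""
    return [s for s, t in sorted(store.written_at.items())
            if not store.sut[s] and start <= t <= stop]


def dedup_pass(store, candidates, now):
    """Deduplicate the candidate segments, then rewrite unique blocks as full segments."""
    for seg_id in candidates:
        for block in store.segments.pop(seg_id):
            digest = hashlib.sha256(block).hexdigest()  # assumption: hash-based matching
            if digest in store.refcount:
                store.refcount[digest] += 1   # duplicate: only bump the reference count
            else:
                store.refcount[digest] = 1    # unique: keep one copy in the log
                store.log.append(block)

    # Mark every candidate segment as free in the SUT.
    for seg_id in candidates:
        store.sut[seg_id] = True
        store.written_at.pop(seg_id, None)

    # Write out each full segment the log has accumulated.
    while len(store.log) >= SEGMENT_BLOCKS:
        target = next(s for s, free in sorted(store.sut.items()) if free)
        store.segments[target] = store.log[:SEGMENT_BLOCKS]
        del store.log[:SEGMENT_BLOCKS]
        store.sut[target] = False             # the target segment is no longer free
        store.written_at[target] = now
```

In this sketch, blocks left in the log after a pass simply wait there until later passes supply enough unique blocks to fill a segment. Narrowing or widening the `start` and `stop` arguments of `select_candidates` is one way to picture the window adjustment the abstract mentions.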

Status:

Grant

Type:

Utility

Filing date:

24 Apr 2020

Issue date:

17 Aug 2021