International Business Machines Corporation
Packing deduplicated data into finite-sized containers
Abstract:
Deduplicated data is packed into finite-sized containers. A similarity score is calculated between similarly compared files of the deduplicated data. The similarity score is used for grouping the similarly compared files of the deduplicated data into subsets for destaging each of the subsets from a deduplication system to one of the finite-sized containers. An indication is received from a user of which of the similarly compared files are to be grouped into the subsets for destaging. Transitive closures are used for assisting with using the similarity score for grouping the similarly compared files of the deduplicated data into the subsets.
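The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how the similarity score is computed, so the sketch assumes a Jaccard score over each file's set of chunk identifiers, a hypothetical threshold of 0.5, and a union-find structure to form the transitive closure. The names `jaccard`, `group_similar_files`, and the sample file names are illustrative only.

```python
from itertools import combinations


def jaccard(a, b):
    """Assumed similarity score: shared chunks divided by all distinct chunks."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_similar_files(files, threshold=0.5):
    """Group files whose pairwise score meets the threshold.

    Groups are closed transitively: if A is similar to B and B is similar
    to C, all three land in one subset even when A and C score below the
    threshold directly. The threshold value is an assumption.
    """
    parent = {name: name for name in files}

    def find(x):
        # Walk to the root of x's group, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(files, 2):
        if jaccard(files[a], files[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in files:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical files, each described by the chunk identifiers it references.
files = {
    "a.img": {"c1", "c2", "c3", "c4"},
    "b.img": {"c1", "c2", "c3", "c5"},
    "c.img": {"c2", "c3", "c5", "c6"},
    "d.log": {"c7", "c8"},
}
subsets = group_similar_files(files)
print(subsets)  # [['a.img', 'b.img', 'c.img'], ['d.log']]
```

In this example `a.img` and `c.img` score only 2/6 against each other, yet both score 3/5 against `b.img`, so the transitive closure places all three in one subset. Each resulting subset would then be a candidate for destaging to a single container, subject to a capacity check that the sketch omits.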
Utility
15 Dec 2017
3 Aug 2021