International Business Machines Corporation
CHANGE-POINT DRIVEN FEATURE SELECTION FOR MULTI-VARIATE TIME SERIES CLUSTERING
Abstract:
One embodiment provides a method, including: receiving a multi-variate time-series dataset comprising a plurality of time-dependent datasets; segmenting each of the plurality of time-dependent datasets at a transition point; clustering segments of the plurality of time-dependent datasets into clusters having similar lengths of segments; for each cluster, (i) selecting a representative segment length and (ii) identifying a feature subset in that cluster; identifying, across the feature subsets, subset transition points, wherein each of the subset transition points corresponds to a change in value that meets a predetermined threshold within its corresponding feature subset; and determining, by applying a threshold test to the subset transition points, a segment length to be used in segmenting the entire multi-variate time-series dataset.
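The steps recited in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. The abstract does not say how transition points are detected, how segment lengths are clustered, how the representative length is chosen, or what the final threshold test is, so every one of those choices below is an assumption made for illustration: a transition point is taken to be a jump between consecutive samples of at least `change_threshold`, lengths are grouped by a simple gap rule (`tolerance`), the representative length is the median, and the final test keeps the cluster whose feature subset has the most transition points, provided there are at least `min_points`. All function and parameter names are hypothetical.

```python
from statistics import median


def transition_points(series, change_threshold):
    """Indices i where |series[i] - series[i-1]| meets the threshold (assumed detector)."""
    return [i for i in range(1, len(series))
            if abs(series[i] - series[i - 1]) >= change_threshold]


def segment_lengths(series, change_threshold):
    """Lengths of the segments obtained by cutting a series at its transition points."""
    cuts = [0] + transition_points(series, change_threshold) + [len(series)]
    return [end - start for start, end in zip(cuts, cuts[1:])]


def cluster_by_length(segments, tolerance):
    """Group (feature, length) pairs by length (assumed gap rule).

    Lengths are sorted; a new cluster starts whenever the next length
    exceeds the previous one by more than `tolerance`.
    """
    clusters = []
    for feature, length in sorted(segments, key=lambda s: s[1]):
        if clusters and length - clusters[-1][-1][1] <= tolerance:
            clusters[-1].append((feature, length))
        else:
            clusters.append([(feature, length)])
    return clusters


def choose_segment_length(dataset, change_threshold, tolerance, min_points):
    """Return a segment length for the whole dataset, or None if no cluster qualifies."""
    # Segment every time-dependent dataset and record which feature each segment came from.
    segments = [(name, length)
                for name, series in dataset.items()
                for length in segment_lengths(series, change_threshold)]

    best_length, best_count = None, 0
    for cluster in cluster_by_length(segments, tolerance):
        # (i) representative segment length: the median (assumption).
        representative = median(length for _, length in cluster)
        # (ii) feature subset: the features that contributed segments to this cluster.
        subset = {feature for feature, _ in cluster}
        # Subset transition points: distinct threshold-meeting changes within the subset.
        points = {p for feature in subset
                  for p in transition_points(dataset[feature], change_threshold)}
        # Assumed threshold test: enough transition points, and more than any earlier cluster.
        if len(points) >= min_points and len(points) > best_count:
            best_length, best_count = representative, len(points)
    return best_length


# Hypothetical toy data: two features change together, one never changes.
toy = {
    "temp": [0, 0, 0, 0, 5, 5, 5, 5, 0, 0, 0, 0],
    "pressure": [1, 1, 1, 1, 9, 9, 9, 9, 1, 1, 1, 1],
    "noise": [0] * 12,
}
chosen = choose_segment_length(toy, change_threshold=3, tolerance=1, min_points=2)
```

In this toy input the flat `noise` feature produces a single long segment that lands in its own cluster with no transition points, so it fails the assumed test and does not influence the chosen length. A production system would likely replace the jump detector and the gap rule with a proper change-point detection and clustering method.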
Utility
19 May 2020
25 Nov 2021