Microsoft Corporation
IT MONITORING RECOMMENDATION SERVICE

Last updated:

Abstract:

Operational metrics of a distributed collection of servers in a cloud environment are analyzed by a service to intelligently machine learn which operational metric is highly correlated to incidents or failures in the cloud environment. To do so, metric values of the operational metrics are analyzed over time by the service to check whether the operation metrics exceed a particular metric threshold. If so, the service also checks whether such spikes in the operation metric above the metric thresholds occurred during known cloud incidents. Statistics are calculated reflecting the number of times the operational metrics spiked during times of cloud incidents and spiked during times without cloud incidents. Correlation scores based on these statistics are calculated and used to select the correlated operational metrics that are most correlated to cloud failures.

Status:
Application
Type:

Utility

Filling date:

29 Nov 2021

Issue date:

17 Mar 2022