International Business Machines Corporation
Methods and systems for proactive management of node failure in distributed computing systems

Abstract:

Embodiments for managing distributed computing systems are provided. Information associated with operation of a computing node within a distributed computing system is collected. A reliability score for the computing node is calculated based on the collected information. The calculating of the reliability score is performed utilizing the computing node. A remedial action associated with the operation of the computing node is caused to be performed based on the calculated reliability score.
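The flow in the abstract (collect operating information, calculate a reliability score on the node, trigger a remedial action) could be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the metric names, weights, thresholds, and action labels are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class NodeMetrics:
    """Hypothetical operating information collected on a computing node."""
    cpu_temp_c: float
    disk_error_count: int


def reliability_score(metrics: NodeMetrics) -> float:
    """Return an illustrative score in [0, 1]; 1.0 means fully healthy.

    The weights and limits here are invented for this sketch.
    """
    # Penalty grows linearly from 70 C to 100 C, clamped to [0, 1].
    temp_penalty = min(max((metrics.cpu_temp_c - 70.0) / 30.0, 0.0), 1.0)
    # Penalty grows linearly up to 10 disk errors, clamped to 1.
    disk_penalty = min(metrics.disk_error_count / 10.0, 1.0)
    return 1.0 - (0.5 * temp_penalty + 0.5 * disk_penalty)


def choose_remedial_action(score: float) -> str:
    """Map a score to a hypothetical remedial action label."""
    if score < 0.3:
        return "migrate_workloads"
    if score < 0.7:
        return "schedule_maintenance"
    return "none"


if __name__ == "__main__":
    sample = NodeMetrics(cpu_temp_c=85.0, disk_error_count=5)
    score = reliability_score(sample)
    print(score, choose_remedial_action(score))
```

In this sketch the scoring function is meant to run locally on the node being assessed, mirroring the abstract's statement that the calculation is performed utilizing the computing node.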

Status:

Grant

Type:

Utility

Filing date:

14 Oct 2019

Issue date:

17 Aug 2021