International Business Machines Corporation
Methods and systems for proactive management of node failure in distributed computing systems
Abstract:
Embodiments for managing distributed computing systems are provided. Information associated with operation of a computing node within a distributed computing system is collected. A reliability score for the computing node is calculated based on the collected information. The calculating of the reliability score is performed utilizing the computing node. A remedial action associated with the operation of the computing node is caused to be performed based on the calculated reliability score.
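The flow the abstract describes (collect operating information, compute a reliability score on the node itself, then trigger a remedial action based on that score) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and does not come from the patent: the metric names in `NodeMetrics`, the weights, the 40 to 90 °C temperature band, the score thresholds, and the action names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NodeMetrics:
    """Hypothetical operating data a node might collect about itself."""
    cpu_temp_c: float          # CPU temperature in degrees Celsius
    disk_error_rate: float     # fraction of failed disk operations, 0.0 to 1.0
    memory_utilization: float  # fraction of memory in use, 0.0 to 1.0


def _clamp01(x: float) -> float:
    """Clamp a value into the range [0, 1]."""
    return min(max(x, 0.0), 1.0)


def reliability_score(m: NodeMetrics) -> float:
    """Return a score in [0, 1]; higher means more reliable.

    The weights and the temperature band are illustrative assumptions,
    not values taken from the patent.
    """
    temp_penalty = _clamp01((m.cpu_temp_c - 40.0) / 50.0)
    disk_penalty = _clamp01(m.disk_error_rate)
    mem_penalty = _clamp01(m.memory_utilization)
    penalty = 0.4 * temp_penalty + 0.4 * disk_penalty + 0.2 * mem_penalty
    return 1.0 - penalty


def choose_remedial_action(score: float) -> str:
    """Map a score to a hypothetical remedial action name."""
    if score < 0.3:
        return "drain_and_replace"
    if score < 0.6:
        return "migrate_workloads"
    return "none"


if __name__ == "__main__":
    # The node scores itself from locally collected data, then picks an action.
    local = NodeMetrics(cpu_temp_c=90.0, disk_error_rate=0.25, memory_utilization=0.5)
    score = reliability_score(local)
    print(f"score={score:.2f} action={choose_remedial_action(score)}")
```

In this sketch the scoring function runs on the node being evaluated, mirroring the abstract's statement that the calculation is performed utilizing the computing node. A real system would choose its own signals and thresholds.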
Status:
Grant
Type:
Utility
Filing date:
14 Oct 2019
Issue date:
17 Aug 2021