Atlassian Corporation
SYSTEMS AND METHODS FOR TESTING RESILIENCE OF A DISTRIBUTED NETWORK
Abstract:
A method for testing the resilience of a distributed network. The distributed network comprises one or more services, each service being associated with one or more compute groups, and each compute group comprising active compute instances. The method comprises, for a service of the one or more services: retrieving test parameters, the test parameters indicating at least a schedule for performing a resilience test on the service, unique identifiers of the compute groups registered for resilience testing, and a probability of terminating a compute instance; determining whether to terminate a compute instance based on the probability of terminating a compute instance; and, in response to determining to terminate a compute instance: randomly selecting a compute group from the compute groups registered for resilience testing; receiving a list of active compute instances for the selected compute group; randomly selecting an active compute instance from the list for termination; and causing the selected compute instance to terminate.
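The steps recited in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names `ResilienceTestParameters`, `run_resilience_test`, `list_active_instances`, and `terminate_instance` are hypothetical, and the two callbacks stand in for whatever compute-platform interface would report and terminate instances. The schedule is carried as an opaque string because the abstract does not specify its format, and the handling of empty groups is an added assumption.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ResilienceTestParameters:
    """Hypothetical container for the test parameters named in the abstract."""
    schedule: str                    # when to run the test (format assumed, not specified)
    compute_group_ids: List[str]     # unique identifiers of the registered compute groups
    termination_probability: float   # probability of terminating an instance, in [0, 1]


def run_resilience_test(
    params: ResilienceTestParameters,
    list_active_instances: Callable[[str], List[str]],  # assumed platform lookup
    terminate_instance: Callable[[str], None],          # assumed platform action
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Run one scheduled test for one service; return the terminated instance, if any."""
    rng = rng or random.Random()

    # Assumption: with no registered compute groups there is nothing to test.
    if not params.compute_group_ids:
        return None

    # Decide whether to terminate at all, based on the configured probability.
    # rng.random() is in [0.0, 1.0), so 1.0 always terminates and 0.0 never does.
    if rng.random() >= params.termination_probability:
        return None

    # Randomly select one of the compute groups registered for resilience testing.
    group_id = rng.choice(params.compute_group_ids)

    # Receive the list of active compute instances for the selected group.
    instances = list_active_instances(group_id)
    if not instances:
        return None  # assumption: skip an empty group rather than reselect

    # Randomly select an active instance and cause it to terminate.
    victim = rng.choice(instances)
    terminate_instance(victim)
    return victim
```

A scheduler would invoke `run_resilience_test` once per service at the times given by `schedule`. Passing a seeded `random.Random` makes a run reproducible, which helps when exercising the logic against in-memory stubs instead of a live platform.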
Utility
28 Sep 2018
2 Apr 2020