When Machine Learning Meets Raft: How to Elect a Leader over a Network
Description
Numerous well-known applications use the Raft consensus algorithm to maintain consistent replicas of their data across distributed nodes. Raft relies on a leader that is dynamically elected from among these nodes, and its operations are unfortunately suspended while an election is in progress. An election can be triggered by the failure of the current leader, in which case it is unavoidable, or by a network disconnect between the leader and another node, in which case a new, inefficient leader will likely replace the previous one at the cost of additional system downtime. In this paper, Raft messages are monitored at every node, and machine learning is used to classify which of these two causes triggered each election. This classification is used to increase the system’s availability by reducing the number of elections conducted per unit of time. Three supervised classifiers were trained on messages generated by a real Raft-based distributed system deployed on a testbed, in which multiple election-triggering events were induced. All classifiers are nearly 97% accurate at classifying the causes of these elections, and approach 100% in some cases.
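To make the classification idea concrete, here is a minimal sketch of labeling an election by its likely cause. It is not the paper's method: the description does not name the three classifiers or the features used, so a simple nearest-centroid classifier stands in for them. The two features (seconds since the last heartbeat from the old leader, and the fraction of peers granting the vote), the label names, and all sample values are hypothetical and synthetic.

```python
from math import dist
from statistics import mean

# Synthetic, illustrative training data (not from the paper).
# Each sample: ((heartbeat_gap_seconds, vote_grant_ratio), cause_label).
# Both features and both label names are assumptions made for this sketch.
TRAINING = [
    ((0.90, 0.9), "leader_failure"),
    ((1.00, 1.0), "leader_failure"),
    ((0.95, 0.8), "leader_failure"),
    ((0.40, 0.2), "network_disconnect"),
    ((0.35, 0.3), "network_disconnect"),
    ((0.45, 0.1), "network_disconnect"),
]


def fit_centroids(samples):
    """Compute the mean feature vector (centroid) for each label."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(TRAINING)
print(classify(centroids, (0.92, 0.85)))
print(classify(centroids, (0.38, 0.25)))
```

In a real deployment, a node could use such a prediction to decide whether starting an election is warranted, which is how fewer elections would translate into higher availability.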
Files
Name | Size |
---|---|
1570911645 paper.pdf (md5:df4038501fe213ef1236af3dbd3361d2) | 334.5 kB |