Automatic failover of distributed databases has been more dream than reality for most IT organizations. But thanks to the Raft Consensus Algorithm, developed at Stanford University, automatic database failover may soon become routine.
Raft was developed to solve the challenges of maintaining high availability across multiple classes of software running in distributed computing environments. While Raft has been available to vendors for some time, RethinkDB CEO Slava Akhmechet says it’s a challenging technology to actually implement, which explains why vendor adoption has been slow.
Akhmechet says failover in distributed database environments can be especially challenging because database administrators generally have no control over the underlying IT infrastructure. As a result, a critical network link can suddenly disappear, rendering the database on the other side of that link all but useless.
Raft is designed as an easier-to-understand alternative to the Paxos protocol, long the standard approach to reaching consensus in distributed systems, and it makes it simpler to keep a replicated state machine consistent across a cluster of systems. In the event of a failure, Raft makes it possible for one database to seamlessly pick up where another left off. That means the hard, manual effort IT administrators put into keeping systems available may soon become a thing of the past.
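To make the idea of one server picking up where another left off more concrete, here is a minimal Python sketch of the majority-vote step at the heart of a Raft-style leader election. It is an illustration under stated assumptions, not RethinkDB's implementation or a complete version of Raft: the names `Node`, `request_vote`, `run_election`, and the `reachable` flag are hypothetical, and the sketch compares only the last log index (Raft proper also compares log terms) while leaving out election timeouts and log replication.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical cluster member; field names are illustrative only."""
    node_id: str
    current_term: int = 0
    voted_for: Optional[str] = None
    last_log_index: int = 0
    reachable: bool = True  # assumption: stands in for a healthy network link

    def request_vote(self, term: int, candidate_id: str,
                     candidate_last_index: int) -> bool:
        """Grant at most one vote per term, and only to an up-to-date candidate."""
        if not self.reachable or term < self.current_term:
            return False
        if term > self.current_term:
            # A newer term resets any vote cast in an older term.
            self.current_term = term
            self.voted_for = None
        not_yet_voted = self.voted_for in (None, candidate_id)
        # Simplification: real Raft compares the last log term before the index.
        candidate_up_to_date = candidate_last_index >= self.last_log_index
        if not_yet_voted and candidate_up_to_date:
            self.voted_for = candidate_id
            return True
        return False


def run_election(candidate: Node, peers: List[Node]) -> bool:
    """Return True if the candidate wins a strict majority of the whole cluster."""
    candidate.current_term += 1
    candidate.voted_for = candidate.node_id
    votes = 1  # the candidate votes for itself
    for peer in peers:
        if peer.request_vote(candidate.current_term, candidate.node_id,
                             candidate.last_log_index):
            votes += 1
    cluster_size = len(peers) + 1
    return votes > cluster_size // 2


if __name__ == "__main__":
    # Three-node cluster in which the old leader "a" has dropped off the network.
    a = Node("a", last_log_index=5, reachable=False)
    b = Node("b", last_log_index=5)
    c = Node("c", last_log_index=5)
    print("b becomes leader:", run_election(b, [a, c]))
```

In this sketch a node cut off from its peers can never collect a majority, which is why the database on the wrong side of a failed network link cannot declare itself leader, while the side that still holds a majority can elect a replacement and carry on.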