Consensus: Raft / Paxos — Agreement Under Failure

Consensus algorithms let a cluster of nodes agree on a single value (or a sequence of log entries) even when some nodes crash or become unreachable.

When to use

  • Leader election in distributed systems (etcd, ZooKeeper)
  • Replicated log for consistent state across nodes (Kafka KRaft, CockroachDB)

Tradeoffs

  • Requires a majority quorum (N/2+1 nodes), so a minority partition cannot accept writes
  • Latency penalty for every write (must wait for quorum acknowledgment)
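The quorum tradeoff above is simple arithmetic: a cluster of N nodes needs N/2+1 acknowledgments, which means it tolerates the failure of the remaining minority. A minimal sketch:

```go
package main

import "fmt"

// quorum returns the minimum number of nodes that must acknowledge
// a write for it to commit: floor(N/2) + 1, i.e. a strict majority.
func quorum(n int) int {
	return n/2 + 1
}

func main() {
	for _, n := range []int{3, 5, 7} {
		// a cluster stays available as long as a quorum of nodes is up,
		// so it tolerates n - quorum(n) simultaneous failures
		fmt.Printf("cluster=%d quorum=%d tolerates=%d failures\n",
			n, quorum(n), n-quorum(n))
	}
}
```

This is why clusters are sized with odd numbers: a 4-node cluster needs a quorum of 3 and tolerates only 1 failure, the same as a 3-node cluster.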
```go
type NodeState int

const (
	Follower  NodeState = iota // default; receives log entries from the leader
	Candidate                  // seeking votes after an election timeout
	Leader                     // sends heartbeats, accepts writes
)

type LogEntry struct {
	Term    int
	Command []byte
}

type RaftNode struct {
	id          string
	state       NodeState
	currentTerm int
	votedFor    *string
	log         []LogEntry
}

// Transition: Follower → Candidate on election timeout.
func (n *RaftNode) startElection() {
	n.state = Candidate
	n.currentTerm++
	n.votedFor = &n.id // vote for self
	// broadcast RequestVote RPCs to all peers
}
```
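The RequestVote RPCs broadcast in startElection have a receiver side too. A minimal sketch of the vote-granting rules (stale terms rejected, one vote per term, candidate's log must be at least as up-to-date); the type and field names here are illustrative, with the log comparison simplified to a last index/term pair:

```go
package main

import "fmt"

type VoteArgs struct {
	Term         int
	CandidateID  string
	LastLogIndex int
	LastLogTerm  int
}

type Voter struct {
	currentTerm  int
	votedFor     *string
	lastLogIndex int
	lastLogTerm  int
}

// handleRequestVote reports whether this node grants its vote.
func (n *Voter) handleRequestVote(a VoteArgs) bool {
	if a.Term < n.currentTerm {
		return false // stale candidate
	}
	if a.Term > n.currentTerm {
		// newer term: adopt it and forget any vote from an older term
		n.currentTerm = a.Term
		n.votedFor = nil
	}
	// at most one vote per term
	if n.votedFor != nil && *n.votedFor != a.CandidateID {
		return false
	}
	// election restriction: candidate's log must be at least as up-to-date,
	// comparing last log term first, then last log index
	upToDate := a.LastLogTerm > n.lastLogTerm ||
		(a.LastLogTerm == n.lastLogTerm && a.LastLogIndex >= n.lastLogIndex)
	if !upToDate {
		return false
	}
	n.votedFor = &a.CandidateID
	return true
}

func main() {
	n := &Voter{currentTerm: 2, lastLogIndex: 5, lastLogTerm: 2}
	// candidate with an equally up-to-date log and a newer term: granted
	fmt.Println(n.handleRequestVote(VoteArgs{Term: 3, CandidateID: "n2", LastLogIndex: 5, LastLogTerm: 2}))
	// second candidate in the same term: rejected, vote already cast
	fmt.Println(n.handleRequestVote(VoteArgs{Term: 3, CandidateID: "n3", LastLogIndex: 9, LastLogTerm: 3}))
}
```

The log up-to-date check is what prevents a node with a stale log from becoming leader and overwriting committed entries.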

Gotcha: Raft is Paxos made understandable. etcd uses Raft. Kafka replaced ZooKeeper with KRaft (also Raft). If you're building on these, you're already using consensus.