What I Wish I Knew Before My First MariaDB Galera Cluster
Galera Cluster's pitch is appealing: multiple MariaDB nodes, all writable, all kept in sync synchronously, no replication lag, no worrying about "which node has the latest data". All of that is true.
What the pitch does not emphasize is that "synchronous" and "multi-master" both come with consequences that show up in production, not in a demo with two nodes and no real traffic. None of these are reasons to avoid Galera, but they are things I would have planned for from day one if I had known.
Quorum means three nodes is the real minimum
Galera uses quorum to decide whether a node, or a group of nodes, is allowed to keep operating after a network split. A group keeps operating only if it can still reach a majority of the cluster's members.
With two nodes, if either one crashes or becomes unreachable, the remaining node cannot form a majority and stops accepting writes. Two nodes give you replication, not high availability. Three is the practical minimum. The third node does not need to be as powerful as the other two, though. A small arbitrator node running garbd (Galera's lightweight arbitrator daemon) can take part in quorum votes without holding a copy of the data. It still receives the replication traffic, so it needs a reasonable network link to the other nodes.
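The majority rule is easy to check by hand, but a few lines of Python make the two-node problem obvious. This is a simplified sketch: it assumes every member has equal voting weight and that nodes are lost ungracefully (crash or network split), and `has_quorum` is a name made up for illustration, not part of Galera.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """Return True if a group of `reachable_nodes` members (counting the
    node doing the check) is a strict majority of `cluster_size`.

    Simplified model: every member carries the same voting weight.
    """
    if not 0 <= reachable_nodes <= cluster_size:
        raise ValueError("reachable_nodes must be between 0 and cluster_size")
    # Strict majority: exactly half is NOT enough.
    return 2 * reachable_nodes > cluster_size


if __name__ == "__main__":
    for size in (2, 3, 5):
        survivors = size - 1  # one member crashed or was cut off
        verdict = "keeps quorum" if has_quorum(survivors, size) else "loses quorum"
        print(f"{size} members, 1 lost: {survivors} remaining -> {verdict}")
```

An arbitrator counts as a member in this arithmetic, which is why two data nodes plus garbd behaves like a three-member cluster for quorum purposes.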
Every write waits for the slowest node
Synchronous replication means a commit is not acknowledged until its write set has been delivered to every node in the cluster and has passed certification. Your cluster's write performance is therefore bounded by the slowest node and by the worst network latency between your nodes.
This is fine when all nodes are in the same datacenter with sub-millisecond latency between them. It becomes noticeable if someone suggests putting a Galera node in a different region "for redundancy." Every write across the cluster now waits for a round trip to that distant node, and the cluster's write throughput drops to match. Geographic distribution with Galera is possible, but it is a deliberate architectural decision, not something to back into.
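To put rough numbers on that, here is a back-of-the-envelope model. It assumes commit latency is dominated by the slowest round trip from the writing node to any peer, and it ignores everything else (disk, certification cost, parallel connections). The function names, node names, and latency figures are all made up for illustration.

```python
from typing import Mapping


def commit_latency_floor_ms(rtt_ms_to_peers: Mapping[str, float]) -> float:
    """Rough lower bound on commit latency: the slowest round trip
    from the writing node to any other node in the cluster."""
    if not rtt_ms_to_peers:
        return 0.0
    return max(rtt_ms_to_peers.values())


def serial_commits_per_second(latency_floor_ms: float) -> float:
    """Upper bound on back-to-back commits from ONE connection.
    More connections raise total throughput, not per-connection speed."""
    if latency_floor_ms <= 0:
        return float("inf")
    return 1000.0 / latency_floor_ms


if __name__ == "__main__":
    same_dc = {"node2": 0.4, "node3": 0.5}        # hypothetical RTTs in ms
    with_remote = {**same_dc, "node4-remote": 80.0}

    for label, peers in (("same datacenter", same_dc), ("plus remote node", with_remote)):
        floor = commit_latency_floor_ms(peers)
        print(f"{label}: floor {floor} ms, "
              f"at most {serial_commits_per_second(floor):.1f} serial commits/s")
```

In this toy model, adding a single peer 80 ms away moves the per-connection ceiling from thousands of commits per second to about a dozen, regardless of how fast the local nodes are.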
Schema changes need a plan, not just an ALTER TABLE
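A plain ALTER TABLE on a large table in Galera can stall the entire cluster for the duration of the change. By default, schema changes run under Total Order Isolation (TOI): every node executes the statement at the same point in the replication stream, and writes are blocked cluster-wide until it is done.
For anything beyond a small table, Rolling Schema Upgrade (RSU) is usually the better approach. You apply the change to one node at a time. The statement runs only on that node and is not replicated, and the node is temporarily desynced from the cluster while it processes the change. Because nodes briefly run different schemas, the change has to be backward compatible with writes arriving from nodes that have not been altered yet: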
SET SESSION wsrep_OSU_method='RSU';
ALTER TABLE large_table ADD COLUMN new_field VARCHAR(255);
Repeat this on each node in turn. The node running the alter is briefly out of sync (and should ideally be taken out of the load balancer's rotation during that window), but the rest of the cluster keeps serving traffic normally throughout. The first time I ran a routine ALTER TABLE on a multi-million-row table with the default method, the whole cluster paused for the entire duration, which was a memorable way to learn this.
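Since the per-node routine is easy to get out of order under pressure, it can help to write it down as an explicit checklist first. The sketch below only builds that checklist; it does not connect to anything. The node names, the `Step` type, and the wording of the "ops" steps are assumptions for illustration, and switching back to TOI at the end is a precaution I added in case the session is reused.

```python
from typing import List, NamedTuple, Sequence


class Step(NamedTuple):
    node: str  # node the step applies to
    kind: str  # "ops" for an operator action, "sql" for a statement
    text: str


def rsu_plan(nodes: Sequence[str], ddl: str) -> List[Step]:
    """Build an ordered checklist for a rolling schema change.

    Each node is finished completely before the next one starts, so at
    most one node is out of rotation at any time. The three SQL steps
    must run in the same session on that node.
    """
    plan: List[Step] = []
    for node in nodes:
        plan += [
            Step(node, "ops", "remove node from load balancer rotation"),
            Step(node, "sql", "SET SESSION wsrep_OSU_method='RSU';"),
            Step(node, "sql", ddl),
            Step(node, "sql", "SET SESSION wsrep_OSU_method='TOI';"),
            Step(node, "ops", "wait until wsrep_local_state_comment reads Synced"),
            Step(node, "ops", "return node to load balancer rotation"),
        ]
    return plan


if __name__ == "__main__":
    ddl = "ALTER TABLE large_table ADD COLUMN new_field VARCHAR(255);"
    for step in rsu_plan(["db1", "db2", "db3"], ddl):  # hypothetical node names
        print(f"{step.node:>4}  [{step.kind}]  {step.text}")
```

Printing the plan before touching production makes it harder to skip the "wait for Synced" step between nodes.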
wsrep_local_state_comment is the status line that actually matters
Galera nodes can be "up" from a process standpoint while not actually being part of a functioning cluster. They could be joining, acting as a donor, desynced, or in an error state. The most useful thing to check, especially right after a node restarts, is this:
SHOW STATUS LIKE 'wsrep_local_state_comment';
SHOW STATUS LIKE 'wsrep_cluster_size';
wsrep_local_state_comment should read Synced, and wsrep_cluster_size should match the number of nodes you expect. A node that restarted after a crash and is showing Joining or Donor/Desynced is mid-recovery. It may be doing a large state transfer from another node (which is its own load event worth knowing about) and is not ready for traffic yet, even though MariaDB itself might respond to a basic connection.
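Those two checks translate directly into a health-check function. The sketch below assumes you have already fetched the status rows with whatever MariaDB driver you use and turned them into a plain dict of variable name to value; `node_is_ready` is a hypothetical helper, not something Galera or MariaDB ships.

```python
from typing import List, Mapping, Tuple


def node_is_ready(status: Mapping[str, str],
                  expected_cluster_size: int) -> Tuple[bool, List[str]]:
    """Decide whether a Galera node should receive traffic.

    `status` maps status variable names to values, e.g. built from the
    rows returned by SHOW STATUS LIKE 'wsrep_%'.
    Returns (ready, list of human-readable problems).
    """
    problems: List[str] = []

    state = status.get("wsrep_local_state_comment", "<missing>")
    if state != "Synced":
        problems.append(
            f"wsrep_local_state_comment is {state!r}, expected 'Synced'")

    raw_size = status.get("wsrep_cluster_size", "")
    try:
        size = int(raw_size)
    except (TypeError, ValueError):
        size = -1  # missing or unparsable value counts as a mismatch
    if size != expected_cluster_size:
        problems.append(
            f"wsrep_cluster_size is {raw_size!r}, expected {expected_cluster_size}")

    return (not problems, problems)


if __name__ == "__main__":
    healthy = {"wsrep_local_state_comment": "Synced", "wsrep_cluster_size": "3"}
    recovering = {"wsrep_local_state_comment": "Joining", "wsrep_cluster_size": "3"}
    for name, status in (("healthy", healthy), ("recovering", recovering)):
        ready, problems = node_is_ready(status, expected_cluster_size=3)
        print(name, "->", "ready" if ready else f"not ready: {problems}")
```

A load balancer check built on this would keep a joining node out of rotation even though it accepts connections. Other status variables such as wsrep_cluster_status and wsrep_ready are also commonly checked, but the function above sticks to the two discussed here.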
None of this is a reason not to use Galera
Every one of these things is manageable once you know it is there. Use three nodes (or two plus an arbitrator) for real quorum. Keep nodes close together on the network. Use RSU for schema changes on anything non-trivial. Check wsrep_local_state_comment as part of any health check and after any restart.
Galera does deliver on synchronous multi-master replication. The catch is that "synchronous" and "multi-master" are exactly the properties that make those four points matter. They are the cost of the thing that makes Galera worth using in the first place.