When using MySQL Group Replication, some members may lag behind the rest of the group because of load, hardware limitations, and so on. This lag can hurt certification performance and increase the rate of certification failures. The bigger the applier queue, the bigger the risk of conflicts with transactions that have not been applied yet (this is mainly a problem on Multi-Primary Groups).
Galera users are already familiar with this concept. MySQL Group Replication’s implementation differs in two main aspects:
- the Group is never totally stalled
- the node having issues doesn’t send flow control messages asking the rest of the group to slow down
In fact, every member of the Group sends statistics about its queues (applier queue and certification queue) to the other members. Every node then decides whether to slow down or not if they …
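To make the decentralized idea concrete, here is a minimal Python sketch of a member deciding on its own whether to throttle, based on the queue statistics it received from its peers. The excerpt is cut off before it states the actual rule, so the simple threshold comparison below is an assumption, and the names `MemberStats`, `is_lagging`, `should_throttle` and both threshold values are hypothetical placeholders rather than anything taken from the MySQL source.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen for illustration only.
APPLIER_THRESHOLD = 25000
CERTIFIER_THRESHOLD = 25000


@dataclass
class MemberStats:
    """Queue sizes that one member reports to the rest of the group."""
    member_id: str
    applier_queue: int
    certifier_queue: int


def is_lagging(stats: MemberStats) -> bool:
    # Assumed rule: a member lags when either queue exceeds its threshold.
    return (stats.applier_queue > APPLIER_THRESHOLD
            or stats.certifier_queue > CERTIFIER_THRESHOLD)


def should_throttle(received: list[MemberStats]) -> bool:
    # Each member runs this locally on the statistics it received.
    # Nobody is told to slow down: the decision is taken by the writer itself.
    return any(is_lagging(s) for s in received)


healthy = [MemberStats("node1", 120, 10), MemberStats("node2", 300, 0)]
one_slow = healthy + [MemberStats("node3", 40000, 15)]

print(should_throttle(healthy))   # no queue is above its threshold
print(should_throttle(one_slow))  # node3's applier queue is above the threshold
```

In this sketch the lagging node never asks anyone for anything: it only publishes its queue sizes, and every other member draws its own conclusion from them.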