Recently, I worked on a very unfortunate case that revolved
around diverging clusters, data loss, overlooked error-log
messages, and commands forced on Percona XtraDB Cluster (PXC). Even though
PXC tries its best to explain what happens in the error log, I
can vouch that these messages are easy to miss or overlook when you do not
know what to expect.
This blog post is a cautionary tale and an invitation to try it yourself
and break stuff (not in production, right?).
TLDR:
Do you know right away what happened when
you see this log entry?
2023-06-22T08:23:29.003334Z 0 [ERROR] [MY-000000] [Galera] gcs/src/gcs_group.cpp:group_post_state_exchange():433: Reversing history: 171 -> 44, this member has applied 127 more events than the primary component.Data loss is possible. Must abort.
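Since a message like this is easy to overlook, here is a minimal Python sketch that picks it out of an error log and checks the arithmetic. It assumes the message wording is exactly as in the line quoted above (other Galera versions may phrase it differently), and it reads the first number as this member's position and the second as the primary component's, which is an interpretation based on the message text. The function name and pattern are hypothetical helpers, not part of PXC.

```python
import re

# Pattern for the "Reversing history" message quoted above.
# Assumption: the wording matches this post's log line; it may differ elsewhere.
REVERSING_HISTORY = re.compile(
    r"Reversing history: (?P<local>\d+) -> (?P<primary>\d+), "
    r"this member has applied (?P<extra>\d+) more events"
)


def find_reversing_history(line):
    """Return (local, primary, extra) as integers, or None if the line does not match."""
    match = REVERSING_HISTORY.search(line)
    if match is None:
        return None
    return (int(match["local"]), int(match["primary"]), int(match["extra"]))


# The log line from this post, copied verbatim.
sample = (
    "2023-06-22T08:23:29.003334Z 0 [ERROR] [MY-000000] [Galera] "
    "gcs/src/gcs_group.cpp:group_post_state_exchange():433: "
    "Reversing history: 171 -> 44, this member has applied 127 more events "
    "than the primary component.Data loss is possible. Must abort."
)

result = find_reversing_history(sample)
if result is not None:
    local, primary, extra = result
    # 171 - 44 = 127: the gap between the two positions is the number of
    # events this member applied beyond the primary component.
    print(f"member at {local}, primary component at {primary}, {extra} extra events")
```

In practice the same function could be applied to each line of the error log, for example in a loop over the open file.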
Demonstration
Using the …