I recently had a case where replication lag on a slave was caused
by a backup script. My first reaction was to blame the additional
pressure on the disks, but the cause turned out to be more subtle:
Percona XtraBackup was not able to execute FLUSH TABLES WITH READ
LOCK because of a long-running query. While the lock request was
waiting, all writes queued up behind it, so the server ended up
being read-only. Let's see how we can deal with that kind of
situation.
In short
Starting with Percona XtraBackup 2.1.4, you can:
- Configure a timeout after which the backup will be aborted
(and the global lock released) with the lock-wait-threshold,
lock-wait-query-type and lock-wait-timeout options
- Or automatically kill all queries that prevent the lock from
being granted with the kill-long-queries-timeout and
kill-long-query-type settings
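To make the two approaches concrete, here is a minimal sketch of how a backup script might assemble the command line for each of them. The option names are the ones listed above. Everything else is an assumption for illustration: the helper name build_backup_command, the target directory, the numeric values, the choice to pass the options to innobackupex with a leading double dash, and the use of "all" as the query type. Check the accepted values against the documentation for your XtraBackup version before using anything like this.

```python
import shlex


def build_backup_command(target_dir, strategy, timeout=180, threshold=40,
                         query_type="all"):
    """Build an innobackupex argument list (hypothetical helper).

    strategy="wait": give up on the backup if blocking queries are still
                     running after `timeout` seconds.
    strategy="kill": kill the blocking queries after `timeout` seconds.
    """
    cmd = ["innobackupex"]
    if strategy == "wait":
        cmd += [
            # How long to wait for blocking queries before aborting.
            f"--lock-wait-timeout={timeout}",
            # Queries running longer than this are considered blocking.
            f"--lock-wait-threshold={threshold}",
            # Which kinds of queries to take into account.
            f"--lock-wait-query-type={query_type}",
        ]
    elif strategy == "kill":
        cmd += [
            # How long to wait before killing the blocking queries.
            f"--kill-long-queries-timeout={timeout}",
            # Which kinds of queries may be killed.
            f"--kill-long-query-type={query_type}",
        ]
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    cmd.append(target_dir)
    return cmd


if __name__ == "__main__":
    # Print both variants as shell-ready command lines.
    for strategy in ("wait", "kill"):
        print(shlex.join(build_backup_command("/data/backups", strategy)))
```

Which variant to pick depends on what matters more: with "wait" you may lose a backup but never a query, while with "kill" the backup goes through at the expense of whatever was blocking it.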
Full documentation is …