You probably missed the news, but… PT-1751: Adds --where param to pt-online-schema-change. This makes it possible to perform what I would call an “inverted purge”: instead of actually purging rows from your multi-terabyte table, you copy the small percentage of rows you want to keep to a new table and then […]
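For illustration, an “inverted purge” along these lines might look like the sketch below; the table, DSN, and retention predicate are made-up placeholders, and the exact --where semantics depend on the Toolkit release that ships PT-1751:

  # Hypothetical sketch: rebuild big_table keeping only the rows that match
  # --where, and keep the original table around untouched as the "archive".
  pt-online-schema-change \
    --alter "ENGINE=InnoDB" \
    --where "created_at >= NOW() - INTERVAL 90 DAY" \
    --no-drop-old-table \
    --execute \
    D=mydb,t=big_table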
We all know that data is important, and some businesses need historical data to be available at all times. The problem is that queries on large tables perform poorly if they are not properly optimized. We get many customer requests in Managed Services to purge/archive large tables, and we use pt-archiver to achieve this.
Recently, we received a request to archive a large table, and the customer was worried about the downtime and performance issues during the archival.
We proposed a solution to the customer to archive the table using pt-archiver. The idea is to archive old data to other tables and keep only the latest data in the current table, with minimal performance impact. All of the data will remain available and can be queried anytime.
In this blog, I will … [Read more]
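As a rough illustration of that approach, a throttled pt-archiver job that moves old rows into an archive table could look like the sketch below; the hosts, schema, table, and one-year cut-off are placeholders, not the customer's actual setup:

  # Hypothetical sketch: copy rows older than one year to an archive table,
  # deleting them from the source in small, throttled batches.
  pt-archiver \
    --source h=prod-db,D=appdb,t=orders \
    --dest h=prod-db,D=appdb_archive,t=orders_archive \
    --where "created_at < NOW() - INTERVAL 1 YEAR" \
    --limit 1000 --commit-each \
    --sleep 1 \
    --statistics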
Not long ago, we found a long-running query on one of our Managed Services clients. While reviewing this long-running query, we found it was generated by a pt-archiver operation. That doesn’t sound good, right?
I was able to reproduce the issue on my test environment. In this article, I will show you how to use pt-archiver to purge data from a table using a secondary index. First, we need to have the full picture before going any deeper. Here is a representation of the table structure:
Create Table: CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=4000001 DEFAULT … [Read more]
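To give an idea of what the article builds toward, a purge driven through the secondary index could be sketched as below; the host, schema, and the cut-off value for `k` are placeholders (pt-archiver's source DSN accepts an i= option to name the index to walk):

  # Hypothetical sketch: purge old rows from sbtest1, walking the table via
  # the secondary index k_1 instead of the primary key.
  pt-archiver \
    --source h=localhost,D=sbtest,t=sbtest1,i=k_1 \
    --purge \
    --where "k < 400000" \
    --limit 1000 --commit-each \
    --statistics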
This article discusses how to salvage a crashed pt-online-schema-change by leveraging pt-archiver and executing queries to ensure that the data gets accurately migrated. I will show you how to continue the data copy process, and how to safely close out the pt-online-schema-change via manual operations such as RENAME TABLE and DROP TRIGGER commands. The normal process to recover from a crashed pt-online-schema-change is to drop the triggers on your original table and drop the new table created by the script. Then you would restart pt-online-schema-change. In this case, this wasn’t possible.
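For reference, that normal cleanup usually amounts to something like the sketch below; the database and table names are placeholders, and the trigger and copy-table names assume pt-online-schema-change's default naming, so verify them with SHOW TRIGGERS first:

  # Hypothetical sketch of the usual recovery: drop the pt-osc triggers and
  # the leftover copy table, then re-run pt-online-schema-change.
  mysql -e "SHOW TRIGGERS FROM mydb LIKE 'orders'"
  mysql -e "DROP TRIGGER IF EXISTS mydb.pt_osc_mydb_orders_ins"
  mysql -e "DROP TRIGGER IF EXISTS mydb.pt_osc_mydb_orders_upd"
  mysql -e "DROP TRIGGER IF EXISTS mydb.pt_osc_mydb_orders_del"
  mysql -e "DROP TABLE IF EXISTS mydb._orders_new"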
A customer recently needed to add a primary key column to a very busy table (with around 200 million rows). The table only had a … [Read more]
Percona Toolkit’s pt-archiver is one of the best utilities for archiving records from large tables to other tables or files. One interesting thing is that pt-archiver is a read-write tool: it deletes data from the source by default, so after archiving you don’t need to delete it separately.
Since this is the default behavior, you should take care before actually running it on the production server. You can test your archiving jobs with --dry-run, or you can use the --no-delete option if you’re not sure about removing the source data. The purpose of this script is mainly to archive old data from the table without impacting OLTP queries and insert the data into … [Read more]
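As an illustration of those two safety nets, the same job can first be previewed and then run without deleting anything; the hosts, schema, table, and date cut-off below are placeholders:

  # Hypothetical sketch: preview the statements pt-archiver would run,
  # without touching any data.
  pt-archiver --source h=prod-db,D=appdb,t=logs --dest h=archive-db,D=appdb,t=logs \
    --where "created_at < '2014-01-01'" --limit 1000 --dry-run

  # Copy rows to the archive but leave the source rows in place.
  pt-archiver --source h=prod-db,D=appdb,t=logs --dest h=archive-db,D=appdb,t=logs \
    --where "created_at < '2014-01-01'" --limit 1000 --commit-each --no-delete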
On April 3 at 10 a.m. PST, I’ll be giving a webinar titled “5 Percona Toolkit Tools for MySQL That Could Save Your Day.” In this presentation you’ll learn how to perform typical but challenging MySQL database administration tasks.
My focus will be on the following tools:
- pt-query-digest, to select the queries you should try to improve to get optimal response times
- pt-archiver, to efficiently purge data from huge tables
- pt-table-checksum/pt-table-sync, to check if data on replicas is in sync with data on the master
- pt-stalk, to gather data when performance problems happen randomly or are very short-lived
- pt-online-schema-change, to run …
I really like the Percona Toolkit; we all love the Percona Toolkit. I know how difficult it is to write operational and efficient scripts (I try to do that myself every day), and it is even more difficult to share a script, to take responsibility for your own code. With that in mind, understand that this article is simply a review of my own thoughts about pt-archiver (with the invaluable assistance of @maximefouilleul); I don’t want to question the quality or usefulness of this tool.
I tried pt-archiver for the first time this week, and the first thing I do before using a tool is read the documentation (yes, I really like reading documentation).
I was intrigued by some of this tool’s options. First, I read: “It deletes data from the source by default.” Personally, I hate that you want to remove my data by default; I would prefer to have a --delete option …