Oracle’s Mats Kindahl to weave MySQL Fabric into Percona Live session

Mats Kindahl of Oracle is lead developer of MySQL Fabric

MySQL Fabric is an integrated framework for managing farms of MySQL servers with support for both high-availability and sharding. Its development has been spearheaded by Mats Kindahl, senior principal software developer in MySQL at Oracle.

Mats is leading the MySQL Scaling and High-Availability effort covering the newly released MySQL Fabric and the MySQL Applier for Hadoop. He is also the architect and implementer of several features (mostly replication features), including the row-based replication available in MySQL 5.1 and the binary log group commit available in MySQL 5.6. Before starting at MySQL he earned a doctoral degree in the area of automated verification of distributed systems and worked on the implementation of C and C++ compilers.

He'll be presenting at next month's

Percona Replication Manager (PRM) now supporting 5.6 GTID

Over the last few days, I integrated the work of Frédéric Descamps, a colleague at Percona, on a MySQL 5.6 GTID version of the Percona Replication Manager (PRM). The agent supports the GTID replication mode of MySQL 5.6, and if the master suffers a hard crash, it picks the slave that has applied the highest transaction ID from the dead master. Given the nature of GTID-based replication, all the other slaves then resync appropriately to their new master, which is pretty cool and has yet to be matched by the regular PRM agent.
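The election rule described above (pick the slave that has applied the highest transaction ID from the dead master) can be illustrated with a minimal Python sketch. This is not the code of the mysql_prm56 agent; the function names, host names and UUID below are hypothetical, and the sketch assumes each slave's executed GTID set has already been collected as a string in the usual `uuid:1-5:7-9,uuid2:1-3` form.

```python
from typing import Dict


def highest_applied(gtid_set: str, source_uuid: str) -> int:
    """Return the highest transaction number applied from source_uuid (0 if none)."""
    highest = 0
    # Long GTID sets may be wrapped across lines, so drop newlines first.
    for entry in gtid_set.replace("\n", "").split(","):
        uuid, _, intervals = entry.strip().partition(":")
        if uuid.lower() != source_uuid.lower():
            continue
        for interval in intervals.split(":"):
            if not interval:
                continue
            # An interval is "N" or "N-M"; the last number is its upper bound.
            highest = max(highest, int(interval.split("-")[-1]))
    return highest


def pick_new_master(slaves: Dict[str, str], dead_master_uuid: str) -> str:
    """Pick the slave (hypothetical host name) with the highest applied transaction."""
    return max(slaves, key=lambda name: highest_applied(slaves[name], dead_master_uuid))


# Hypothetical example data: three slaves of a crashed master.
MASTER_UUID = "3e11fa47-71ca-11e1-9e33-c80aa9429562"
slaves = {
    "db2": MASTER_UUID + ":1-120",
    "db3": MASTER_UUID + ":1-118:120-125",
    "db4": MASTER_UUID + ":1-90,9f1c2a3b-0000-11e1-8888-c80aa9429562:1-500",
}
print(pick_new_master(slaves, MASTER_UUID))  # db3 has applied up to transaction 125
```

The sketch only compares the highest transaction number per slave, as the text describes; it does not look at gaps in the applied set or break ties in any particular way.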

For now, it is part of a separate agent, mysql_prm56, which may be integrated with the regular agent in the future. To use it, download the agent with the link above; the Pacemaker configuration is similar to that of the regular PRM agent. If you start from scratch, have a look

keepalived with reader and writer VIPs for Percona XtraDB Cluster

This is a followup to Jay Janssen’s October post, “Using keepalived for HA on top of Percona XtraDB Cluster.” We recently got a request where the customer has 2 VIPs (virtual IP addresses), one for readers and one for a writer, for a cluster of 3 nodes. They wanted to keep it simple, with low latency and without requiring an external node resource as HAProxy would.

keepalived is a simple load balancer with HA capabilities, which means it can proxy TCP services behind it and, at the same time, keep itself highly available using VRRP as its failover mechanism. This post is about taking advantage of the

The use of Iptables ClusterIP target as a load balancer for PXC, PRM, MHA and NDB

Most technologies achieving high-availability for MySQL need a load balancer to spread the client connections to a valid database host; even the Tungsten special connector can be seen as a sophisticated load balancer. People often use a hardware load balancer or a software solution like HAProxy. In both cases, in order to avoid having a single point of failure, multiple load balancers must be used. Load balancers have two drawbacks: they increase network latency and/or they add a validation check load on the database servers. The increased network latency is obvious in the case of standalone load balancers, where you must first connect to the load balancer, which then completes the request by connecting to one of the database servers. Some workloads, like reporting and ad hoc queries, are not affected by a small increase in latency, but other workloads, like OLTP processing and real-time logging, are. Each load balancer must also check regularly if the database

The most common cause of unavailability

Hi, Happy new year.

I've done a lot of work on high-availability systems. There is a lot of writing on high-availability systems - how to implement failover, hot-spare systems, load-balancers etc.

However, most of these seem to make an assumption: humans are infallible.

In practice, this is not always the case.

In fact, I'd say that probably about 75% of downtime is caused by human errors, cock-ups, mistakes. I'm not an expert, but I suspect that it's about the same proportion as air crashes caused by pilot (or someone else's) error.

So, it's the human, stupid. PBKAC (problem between keyboard and chair).

Here are some possible fixes:

Give humans less work to do
We can avoid SOME human errors by having systems automatically configure themselves, set themselves up, or perform sanity checks before accepting settings.
"Blindly accepting" instructions

Q&A: Geographical disaster recovery with Percona Replication Manager

My December 4 webinar, “Geographical disaster recovery with Percona Replication Manager (PRM),” gave rise to a few questions. The recording of the webinar and the slides are available here, and I’ve answered the questions I didn’t have time to address below.

Q1: Hi, I was wondering if corosync will work in cloud environment. As far as I know it is hard to implement because of no support of unicast or multicast.

A1: Corosync has supported the udpu transport since somewhere in the 1.3.0 branch. udpu stands for UDP unicast, and it works in AWS, for instance. Most recent distributions ship 1.4.x, so it is easy to find.

Q2: For token wouldn't it

How to add VIPs to Percona XtraDB Cluster or MHA with Pacemaker

It is a rather frequent problem to have to manage Virtual IP addresses (VIPs) with a Percona XtraDB Cluster (PXC) or with MySQL master HA (MHA). To help solve these problems, I wrote a Pacemaker agent, mysql_monitor, that is a simplified version of the mysql_prm agent. The mysql_monitor agent only monitors MySQL and sets attributes according to the state of MySQL, the read_only variable, the slave status and/or the output of the clustercheck script for PXC. The agent can operate in 3 modes or cluster types: replication (default), pxc and read-only.

The simplest mode is read-only: only the state of the read_only variable is looked at. If the node has the read_only variable set to OFF, then the writer attribute is set to 1 and the reader attribute is set to 1, while if the node has read_only set to ON, the writer attributes will

High-availability options for MySQL, October 2013 update

The technologies used to build highly-available (HA) MySQL solutions are in constant evolution, and they cover very different needs and use cases. To help people choose the best HA solution for their needs, Jay Janssen and I decided to publish, on a regular basis (hopefully, this is the first), an update on the most common technologies and their state, with a focus on what type of workloads suit them best. We restricted ourselves to the open source solutions that provide automatic failover. Of course, don’t simply look at the number of Positives/Negatives items; they don’t all carry the same weight. Should you pick any of these technologies, heavy testing is mandatory: HA never extends beyond the scenarios that have been tested.

Percona XtraDB Cluster (PXC)

MySQL Fabric: High Availability Groups

As you might have noticed, we have released a framework for managing farms (or grids, as Justin suggested) of MySQL servers called MySQL Fabric. MySQL Fabric is focused on being easy to use and extensible, and two extensions are currently part of the framework: one to manage high-availability and one to implement sharding.

High-Availability Groups

One of the central concepts used to construct a farm is the high-availability group (or just group when there is no risk of confusion) and is introduced by the high-availability extension. As mentioned in the previous post, the group concept does not really represent anything new but is rather a formalization of how we think and work with the structure of the

Going to MySQL Connect 2013

MySQL Connect 2013 is coming up with several interesting new sessions. Some sessions that I am participating in got accepted for the conference, so if you are going there, you might find the following sessions interesting. For your convenience, the sessions have hCalendar markup, so it should be easier to add them to your calendar.

MySQL Sharding, Replication, and HA (September 21, 5:30-6:30pm in Imperial Ballroom B)

This session is an opportunity for you to meet the MySQL engineering team and discuss the latest tools and best practices for sharding MySQL across distributed server farms while maintaining high availability.

Come

