Planet MySQL

Displaying posts with tag: sharding (reset)

May

2010

Intra-query parallelism for MySQL queries without an appliance or closed source database

Posted by Justin Swanhart on Tue 25 May 2010 05:50 UTC
Tags:

benchmark, sharding, parallelism, shard-query, MySQL, Performance

*edit* I want to point out that this test was done on a single database server which used MySQL partitioning. This is a demonstration of how Shard-Query can improve performance in non-sharded databases too.*edit*.

Over the weekend I spent a lot of time improving my new Shard-Query tool (code.google.com/p/shard-query) and the improvements can equate to big performance gains on partitioned data sets versus executing the query directly on MySQL.

I'll explain this graph below, but lower is better (response time) and Shard-Query is the red line.

MySQL understands that queries which access data in only certain partitions don't have to read the rest of the table. This partition elimination works well, but MySQL left a big optimization out of partitioning: …

[Read more]

May

2010

I wrote a new tool that runs aggregation queries over MySQL sharded databases using Gearman.

Posted by Justin Swanhart on Sun 23 May 2010 05:18 UTC
Tags:

PHP, sql, scaling, partitioning, sharding, horizontal partioning, aggregation, MySQL, Performance

I created a new tool this week:
http://code.google.com/p/shard-query

As the name Shard-Query suggests, the goal of the tool is to run a query over multiple shards, and to return the combined results together as a unified query. It uses Gearman to ask each server for a set of rows and then runs the query over the combined set. This isn't a new idea, however, Shard-Query is different than other Gearman examples I've seen, because it supports aggregation.

It does this by doing some basic query rewriting based on the input query.

Take this query for example:

select c2, 
       sum(s0.c1), 
       max(c1) 
 from t1 as s0 
 join t1 using (c1,c2) 
 where c2 = 98818 
 group by c2;

The tool will split this up into two queries.

This first query will be sent to each shard. Notice that …

[Read more]

Apr

2010

DbCharmer – Rails Can Scale!

Posted by Alexey Kovyrin on Fri 16 Apr 2010 21:33 UTC
Tags:

Databases, Development, ruby, rails, scalability, My Projects, ActiveRecord, sharding, MySQL

Back in November 2009 I was working on a project to port Scribd.com code base to Rails 2.2 and noticed that some old plugins we were using in 2.1 were abandoned by their authors. Some of them were just removed from the code base, but one needed a replacement – that was an old plugin called acts_as_readonlyable that helped us to distribute our queries among a cluster of MySQL slaves. There were some alternatives but we didn’t like them for one or another reasons so we’ve decided to go with creating our own ActiveRecord plugin, that would help us scale our databases out. That’s the story behind the first release of DbCharmer.

Today, six months after the first release of …

[Read more]

Mar

2010

Not Only NoSQL!! Uber Scaling-Out with SPIDER storage engine

Posted by Mikiya Okuno on Tue 30 Mar 2010 09:58 UTC
Tags:

sharding, twitter, spider, MySQL

The history tells that a single RDBMS node cannot handle tons of traffics on web system which come from all over the world, no matter how the database is tuned. MySQL has implemented a master/slave style replication built-in for long time, and it has enabled web applications to handle traffics using a scale-out strategy. Having many slaves has been suitable for web sites where most of traffics are reads. Thus, MySQL's master/slave replication has been used on many web sites, and is being used still.

However, when a site grow large, amount of traffic may exceed the replication's capacity. In such a case, people may use memcached. It's an in-memory, very fast and well-known KVS, key value store, and its read throughput is far better than MySQL. It's been used as a cache for web applications to store 'hot' data with MySQL as a back-end storage, as it can reduce read requests to MySQL dramatically.

While 1:N replication …

[Read more]

Nov

2009

MySQL University: The Spider Storage Engine

Posted by Stefan Hinz on Mon 23 Nov 2009 16:56 UTC
Tags:

sun, plugins, university, engine, storage, sharding, spider, MySQL

This Thursday (November 26th, 14:00 UTC), Giuseppe Maxia will present the Spider Storage Engine. This session was originally scheduled for October 15th but had to be postponed for technical reasons.

Here's from the abstract: Everybody needs sharding. Which is not easy to maintain. Being tied to the application layer, sharding is hard to export and to interact with. The Spider storage engine, a plugin for MySQL 5.1 and later, solves the problem in a transparent way. It is an extension of partitioning. Using this engine, the user can deal transparently with multiple backends in the server layer. This means that the data is accessible from any application without code changes. This lecture …

[Read more]

Nov

2009

MySQL University: The Spider Storage Engine

Posted by Stefan Hinz on Mon 23 Nov 2009 08:56 UTC
Tags:

sun, plugins, university, engine, storage, sharding, spider, MySQL

This Thursday (November 26th, 14:00 UTC), Giuseppe Maxia will present the Spider Storage Engine. This session was originally scheduled for October 15th but had to be postponed for technical reasons.

[Read more]

Nov

2009

MySQL University: The Spider Storage Engine

Posted by Stefan Hinz on Mon 23 Nov 2009 08:56 UTC
Tags:

sun, plugins, university, engine, storage, sharding, spider, MySQL

This Thursday (November 26th, 14:00 UTC), Giuseppe Maxia will present the Spider Storage Engine. This session was originally scheduled for October 15th but had to be postponed for technical reasons.

[Read more]

Nov

2009

“Shard early, shard often”

Posted by Morgan Tocker of MySQL Performance Blog on Mon 16 Nov 2009 19:59 UTC
Tags:

tips, sharding, shard, MySQL

I wrote a post a while back that said why you don't want to shard. In that post that I tried to explain that hardware advances such as 128G of RAM being so cheap is changing the point at which you need to shard, and that the (often omitted) operational issues created by sharding can be painful.

What I didn't mention was that if you've established that you will need to eventually shard, is it better to just get it out of the way early? My answer is almost always no. That is to say I disagree with a statement I've been hearing recently; "shard early, shard often". Here's why:

There's an order of magnitude better performance that can be gained by focusing on query/index/schema optimization. The gains from sharding are usually much lower.
If you shard …

[Read more]

Oct

2009

Spider and vertical partition engines with new goodies

Posted by Giuseppe Maxia on Thu 15 Oct 2009 09:00 UTC
Tags:

scaling, ha, partitioning, sharding, partitions, spider, vertical, MySQL

The Spider storage engine should be already known to the community. Its version 2.5 has recently been released, with new features, the most important of which is that you can execute remote SQL statements in the backend servers. The method is quite simple. Together with Spider, you also get an UDF that executes SQL code in a remote server. You send a query with parameters saying how to connect to the server, and check the result (1 for success, 0 for failure). If the SQL involves a SELECT, the result can be sent to a temporary table. Simple and effective.

In addition to the Spider engine, Kentoku SHIBA has also created the …

[Read more]

Sep

2009

Video: The ScaleDB shared-disk clustering Storage Engine for MySQL

Posted by Sheeri K. Cabral on Tue 22 Sep 2009 21:02 UTC
Tags:

cluster, storage engine, Pythian, Clustering, sharding, RAC, scaledb, shared-disk, MySQL

Mike Hogan, CEO of ScaleDB spoke at the Boston MySQL User Group in September 2009:

ScaleDB is a storage engine for MySQL that delivers shared-disk clustering. It has been described as the Oracle RAC of MySQL. Using ScaleDB, you can scale your cluster by simply adding nodes, without partitioning your data. Each node has full read/write capability, eliminating the need for slaves, while delivering cluster-level load balancing. ScaleDB is looking for additional beta testers, there is a sign up at http://www.scaledb.com.

Slides are online (and downloadable) at http://www.slideshare.net/Sheeri/scale-db-preso-for-boston-my-sql-meetup-92009

Watch the video online at …

[Read more]

Top Authors

Oracle MySQL Blogs

Vendor Blogs

MySQL Links