I created a new tool this week:
http://code.google.com/p/shard-query
As the name Shard-Query suggests, the goal of the tool is to run
a query over multiple shards and to return the combined results
as a single result set. It uses Gearman to ask each server
for a set of rows and then runs the query over the combined set.
This isn't a new idea. However, Shard-Query differs from
other Gearman examples I've seen because it supports
aggregation.
It does this by applying some basic rewriting to the input
query.
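
To make the general idea concrete, here is a minimal sketch of
aggregation over shards. It is not Shard-Query's actual code or
its actual rewrite rules. It assumes two in-memory SQLite
databases standing in for shards, a simplified query with no
join, and a plain sequential loop where the real tool would use
Gearman. The names make_shard, sharded_aggregate, SHARD_SQL,
COMBINE_SQL and the partial table are all hypothetical.

```python
import sqlite3


def make_shard(rows):
    """Create an in-memory SQLite database standing in for one shard."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (c1 INTEGER, c2 INTEGER)")
    conn.executemany("INSERT INTO t1 (c1, c2) VALUES (?, ?)", rows)
    return conn


# Hypothetical per-shard query: each shard returns partial aggregates.
SHARD_SQL = (
    "SELECT c2, SUM(c1) AS sum_c1, MAX(c1) AS max_c1 "
    "FROM t1 WHERE c2 = ? GROUP BY c2"
)

# Hypothetical combining query: a SUM of partial sums is the full sum,
# and a MAX of partial maxima is the full maximum.
COMBINE_SQL = "SELECT c2, SUM(sum_c1), MAX(max_c1) FROM partial GROUP BY c2"


def sharded_aggregate(shards, c2_value):
    """Run SHARD_SQL on every shard, then combine the partial rows."""
    coordinator = sqlite3.connect(":memory:")
    coordinator.execute(
        "CREATE TABLE partial (c2 INTEGER, sum_c1 INTEGER, max_c1 INTEGER)"
    )
    for shard in shards:  # sequential here; the real tool fans out via Gearman
        rows = shard.execute(SHARD_SQL, (c2_value,)).fetchall()
        coordinator.executemany("INSERT INTO partial VALUES (?, ?, ?)", rows)
    return coordinator.execute(COMBINE_SQL).fetchall()


shards = [
    make_shard([(1, 98818), (5, 98818), (7, 10)]),
    make_shard([(3, 98818), (9, 98818)]),
]
result = sharded_aggregate(shards, 98818)
print(result)
```

SUM and MAX combine cleanly because they can be re-applied to
their own partial results. An aggregate like AVG cannot be
combined this way and would need each shard to return both a sum
and a count.
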
Take this query for example:
select c2, sum(s0.c1), max(c1) from t1 as s0 join t1 using (c1,c2) where c2 = 98818 group by c2;
The tool will split this up into two queries.
This first query will be sent to each shard. Notice that …