Showing entries 101 to 105
Displaying posts with tag: sharding
Hard Loading – something to avoid.

Last week I got a question about sharding with our Spockproxy. The question was: how can I write a query for the proxy so that it effectively runs:

/*in shard 1*/
SELECT * FROM table_a WHERE f_key IN (a, b, c);

UNION

/*in shard 2*/
SELECT * FROM table_a WHERE f_key IN (d, e, f);

By design, our proxy will not do this. The whole point is to hide the sharding from the application. Given a query, it will either send the same query to all the shards and combine the results, or send the query to only one shard when it can figure out that the result(s) can only come from that shard (because you specified the shard key in the WHERE clause).
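To make the two routing cases concrete, here is a minimal sketch in Python. It is not Spockproxy's code: the in-memory lists standing in for shards, the `route_query` and `shard_for_key` names, and the modulo mapping from shard key to shard are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]
Shard = List[Row]  # hypothetical stand-in for one MySQL shard


def shard_for_key(shard_key: int, num_shards: int) -> int:
    """Assumed mapping from shard key to shard index (plain modulo)."""
    return shard_key % num_shards


def route_query(
    shards: List[Shard],
    predicate: Callable[[Row], bool],
    shard_key: Optional[int] = None,
) -> List[Row]:
    """Run `predicate` (standing in for a WHERE clause) against the shards."""
    if shard_key is not None:
        # Shard key known: only one shard can hold matching rows.
        targets = [shards[shard_for_key(shard_key, len(shards))]]
    else:
        # Shard key unknown: send the same query to every shard.
        targets = shards
    results: List[Row] = []
    for shard in targets:
        # Combine the per-shard results into one result set.
        results.extend(row for row in shard if predicate(row))
    return results


# Two shards; rows are placed by id % 2.
shards = [
    [{"id": 0, "f_key": "a"}, {"id": 2, "f_key": "c"}],
    [{"id": 1, "f_key": "b"}, {"id": 3, "f_key": "d"}],
]
fan_out = route_query(shards, lambda r: r["f_key"] in {"a", "d"})
single = route_query(shards, lambda r: r["id"] == 3, shard_key=3)
```

In both calls the application passes the same kind of query and never names a shard. Only the presence of the shard key decides whether one shard or all of them are consulted.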

I did figure out a way it could be done using views but would this ever be desirable?  

Like “Hard Coding” where values are built into the code of your application I’ll call this technique “Hard …

[Read more]
Is MySQL-partitioning useful for very big real-life-problems?

Some months ago I helped out on another project that had some performance problems. They had a very big table, and the index of the table was bigger than the table itself. Because every change in the table causes MySQL to recalculate/reload the index, this can take several seconds for such big tables.
So I thought it would be a good idea to split that big table into very small ones. This should reduce the overhead of reloading big indices, since only very small parts would have to be reloaded. And the next thought was: is it possible to use the "new" MySQL partitioning for that?
Continue reading "Is MySQL-partitioning useful for very big real-life-problems?"

Hot cache data, sharding

In the last several months at Grazr, we've been wrestling with a large database (running on MySQL) of feeds and feed items. The schema is essentially a feeds table with child tables items, items_text (text), and enclosures. The database lets users merge feeds (into a Stream), so that you get an aggregate feed with the items of whatever feeds you put in the list for your merge. It works great; the only problem is the volume of data, because the more data there is, the slower the query that produces the merge becomes. We want this merge to run on the fly, and if it's too slow, the user experience is unacceptable.

So now I'm in the process of implementing a "Hot Cache" of feeds with an LRU (Least Recently Used) policy. The idea is that this cache provides a smaller data set to run the merge query against. We need to be able to handle storing much more data than we currently do …
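For readers unfamiliar with the policy, here is a minimal in-memory sketch of LRU eviction in Python. It only illustrates the policy itself. The `HotFeedCache` class, its `get`/`put` methods, and the fixed capacity are hypothetical and say nothing about how the hot cache described here is actually stored or sized.

```python
from collections import OrderedDict
from typing import List, Optional


class HotFeedCache:
    """Hypothetical LRU cache mapping a feed id to its list of items."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Keys are ordered from least to most recently used.
        self._feeds: "OrderedDict[int, List[str]]" = OrderedDict()

    def get(self, feed_id: int) -> Optional[List[str]]:
        """Return the cached items, or None on a miss."""
        if feed_id not in self._feeds:
            return None
        # A hit makes this feed the most recently used one.
        self._feeds.move_to_end(feed_id)
        return self._feeds[feed_id]

    def put(self, feed_id: int, items: List[str]) -> None:
        """Insert or refresh a feed, evicting the coldest one if full."""
        self._feeds[feed_id] = items
        self._feeds.move_to_end(feed_id)
        if len(self._feeds) > self.capacity:
            # Drop the least recently used feed (front of the ordering).
            self._feeds.popitem(last=False)


cache = HotFeedCache(capacity=2)
cache.put(1, ["item 1a"])
cache.put(2, ["item 2a"])
cache.get(1)               # feed 1 is now hotter than feed 2
cache.put(3, ["item 3a"])  # cache is full, so feed 2 is evicted
```

The point of the policy is that feeds people keep merging stay in the small data set, while feeds nobody has touched recently fall out of it.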

[Read more]
Horizontal Scaling with HiveDB

At the MySQL Conference & Expo 2008, Britt Crawford and Justin McCarthy, both from Cafepress.com, gave a very interesting talk on scaling with HiveDB. I took a few notes (pasted below); their slides are online (warning: 6.1MB PDF), and if you’re after their abstract, it’s available as well.

I also took a video of them (refer to Slide 12 for the IRC conversation).

The quick notes:

  • OLTP optimised (as it serves cafepress.com)
  • Cannot lock tables, or take it offline
  • Constant response time is more important than low latency (a slightly slower query is OK, just not exponentially …
[Read more]
More progress on High Performance MySQL, Second Edition

Whew! I just finished a marathon of revisions. It's been a while since I posted about our progress, so here's an update for the curious readers.
