Planet MySQL

Displaying posts with tag: TokuView

Jun

2010

Making “Replace Into” Fast, by Avoiding Disk Seeks

Posted by Tokuview Blog on Wed 30 Jun 2010 04:10 UTC
Tags:

insert, update, replace, TokuDB, TokuView, Fractal Trees, B-Tree, disk seek, replace into, MySQL

In a post two weeks ago, I explained why the semantics of normal ad-hoc insertions with a primary key are expensive: they require disk seeks on large data sets. Towards the end of that post, I claimed that it would be better to use “replace into” or “insert ignore” over normal inserts, because the semantics of these statements do NOT require disk seeks. In this post, I explain how the command “replace into” can be fast with fractal trees.

The semantics of “replace into” are as follows:

if the primary (or unique) key does not exist, insert the new row
if the primary (or unique) key does exist, overwrite the existing row with the new row
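
As a rough illustration of these two cases, here is a minimal sketch in Python. It assumes an in-memory SQLite database as a stand-in for MySQL (SQLite also accepts the statement “replace into”); the table and column names are made up for the example, and it says nothing about how TokuDB or a B-tree implements the statement internally.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; it also accepts REPLACE INTO.
conn = sqlite3.connect(":memory:")
# Hypothetical table, used only for this illustration.
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)")

# Case 1: primary key 1 does not exist yet, so the new row is inserted.
conn.execute("REPLACE INTO foo (id, a, b) VALUES (1, 10, 20)")

# Case 2: primary key 1 now exists, so the existing row is overwritten.
conn.execute("REPLACE INTO foo (id, a, b) VALUES (1, 30, 40)")

rows = conn.execute("SELECT id, a, b FROM foo ORDER BY id").fetchall()
print(rows)  # [(1, 30, 40)]
```

Note that neither statement needs to report whether the key was already present, which is the property the post relies on.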

The slow, expensive way B-trees use to implement these semantics is:

…[Read more]

Jun

2010

Making Updates Fast, by Avoiding Disk Seeks

Posted by Tokuview Blog on Tue 22 Jun 2010 14:19 UTC
Tags:

update, TokuDB, TokuView, Fractal Trees, B-Tree, disk seek, MySQL

The analysis that shows how to make deletions really fast by using clustering keys and TokuDB’s fractal tree based engine also applies to making updates really fast. (I left it out of the last post to keep the story simple.) As a quick example, let’s look at the following statement:

update foo set price=price+1 where product='toy';

Executing this statement has two steps:

a query to find where product='toy'
a combination of insertions and deletions to change old rows to new rows
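
To make the two steps concrete, here is a small sketch that carries them out by hand. It is only a user-level model: it assumes an in-memory SQLite database in place of MySQL, a made-up schema for foo (id, product, price), and sample rows invented for the example. A real storage engine performs these steps internally rather than through SQL.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, product TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO foo (id, product, price) VALUES (?, ?, ?)",
    [(1, "toy", 5), (2, "book", 12), (3, "toy", 7)],
)

# Step 1: a query to find the rows where product = 'toy'.
old_rows = conn.execute(
    "SELECT id, product, price FROM foo WHERE product = ?", ("toy",)
).fetchall()

# Step 2: change old rows to new rows with a deletion plus an insertion each.
for row_id, product, price in old_rows:
    conn.execute("DELETE FROM foo WHERE id = ?", (row_id,))
    conn.execute(
        "INSERT INTO foo (id, product, price) VALUES (?, ?, ?)",
        (row_id, product, price + 1),
    )

result = conn.execute("SELECT id, product, price FROM foo ORDER BY id").fetchall()
print(result)  # [(1, 'toy', 6), (2, 'book', 12), (3, 'toy', 8)]
```

The end state matches what the single update statement would produce on the same rows, which is why the cost analysis can treat an update as a query followed by deletions and insertions.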

The analysis is identical to that for deletions. Just like for …

[Read more]

Jun

2010

Disk seeks are evil, so let’s avoid them, pt. 4

Posted by Tokuview Blog on Fri 18 Jun 2010 21:12 UTC
Tags:

TokuDB, TokuView, Fractal Trees, B-Tree, disk seek, MySQL

Continuing the theme from previous posts, I’d like to examine another case where we can eliminate all disk seeks from a MySQL operation and therefore get a two-orders-of-magnitude speedup. The general outline of these posts is:

B-trees do insertion disk seeks. While they’re at it, they piggyback some other work on the disk seeks. This piggyback work requires disk seeks regardless.
TokuDB’s Fractal Tree indexes don’t do insertion disk seeks. If we also get rid of the piggyback work, we end up with no disk seeks, and a two-orders-of-magnitude improvement.

So it’s all about finding out which piggyback work is important (important enough to pay a huge performance penalty for), and which isn’t.

This blog post is about one of the most …

[Read more]

Jun

2010

Making Deletions Fast, by Avoiding Disk Seeks

Posted by Tokuview Blog on Tue 08 Jun 2010 16:19 UTC
Tags:

delete, TokuDB, TokuView, Fractal Trees, B-Tree, disk seek, MySQL

In my last post, I discussed how fractal tree data structures can be up to two orders of magnitude faster than B-trees on deletions. I focused on the deletions where the row entry is known (the storage engine API handler::delete_row), but I did not fully analyze how MySQL delete statements can be fast. In this post, I do. Here I show how one can use TokuDB, a storage engine that uses fractal tree data structures, to make MySQL deletions run fast.

Let’s take a step back and analyze the work needed to be done to execute a MySQL delete statement. Suppose we have the table:

create table foo (
        id int auto_increment,
        a int,
        b int,
        primary key (id)
);

Say we wish to perform the following operation that deletes 100,000 rows:

delete from foo where a=1;

In MySQL, …

[Read more]

Jun

2010

Disk seeks are evil, so let’s avoid them, pt. 3 (Deletions)

Posted by Tokuview Blog on Wed 02 Jun 2010 17:40 UTC
Tags:

TokuDB, TokuView, Fractal Trees, B-Tree, disk seek, MySQL

As mentioned in parts 1 and 2, having many disk seeks is bad (they slow down performance). Fractal tree data structures minimize disk seeks on ad-hoc insertions, whereas B-trees practically guarantee that disk seeks are performed on ad-hoc insertions. As a result, fractal tree data structures can insert data up to two orders of magnitude faster than B-trees can.

In this post, let’s examine deletions, and get an intuitive understanding of why fractal tree data structures can also perform deletions up to two orders of magnitude faster than B-trees. In MySQL 5.1, this advantage is really eye-popping for TokuDB v. InnoDB, because InnoDB does not use its insert buffer for deletions. I understand there is a delete buffer in 5.5, which I …

[Read more]

May

2010

Disk seeks are evil, so let’s avoid them, pt. 2

Posted by Tokuview Blog on Tue 25 May 2010 20:23 UTC
Tags:

TokuDB, TokuView, Fractal Trees, MySQL

In part 1, I discussed why having many disk seeks is bad (they slow down performance), and how fractal tree data structures minimize disk seeks on ad-hoc insertions, whereas B-trees practically guarantee that disk seeks are performed on ad-hoc insertions. As a result, fractal tree data structures can insert data up to two orders of magnitude faster than B-trees can.

Now that insertion disk seeks are out of the way (and I don’t want to shortchange the importance of getting rid of these seeks!), let’s look at other places where databases perform seeks, and see if we can get rid of them. Over my next couple of posts, I will look at several use cases and analyze whether disk seeks are required. If disk seeks are required, then performance will suffer on large amounts of data, for TokuDB and any other disk-based storage engine.

…[Read more]

May

2010

Disk Seeks are Evil, so Let’s Avoid Them, Part 1

Posted by Tokuview Blog on Thu 20 May 2010 19:42 UTC
Tags:

TokuView

Disk seeks are expensive. Typically, a disk can perform no more than a few hundred seeks per second. So, any database operation that induces a disk seek is going to be slow, perhaps unacceptably slow. Adding disks can sometimes help performance, but that approach is expensive, adds complexity, and anyhow minimizing the disk seeks helps more.
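
A quick back-of-the-envelope calculation shows how this limit plays out. The figures below are assumptions chosen for illustration, not measurements: 200 seeks per second (one reading of “a few hundred”), a workload touching 100,000 rows, and one disk seek per row.

```python
# All three figures are illustrative assumptions, not measurements.
SEEKS_PER_SECOND = 200   # one reading of "a few hundred seeks per second"
ROWS = 100_000           # assumed number of rows touched by the operation
SEEKS_PER_ROW = 1        # assumed: every row costs one disk seek

seconds = ROWS * SEEKS_PER_ROW / SEEKS_PER_SECOND
print(f"{seconds:.0f} seconds (~{seconds / 60:.1f} minutes)")
# 500 seconds (~8.3 minutes)
```

Under these assumptions the seeks alone account for several minutes, however fast the CPU is, which is why removing seeks matters more than tuning anything else.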

TokuDB fractal tree data structures deliver insertion performance benefits over traditional B-trees by performing fewer disk seeks on random insertions (in effect, turning random I/O into sequential I/O). This is why TokuDB typically outperforms InnoDB on insertion workloads: TokuDB’s random insertions into secondary indexes are much faster than InnoDB’s, up to two orders of magnitude faster.

So let’s consider the first place where TokuDB avoids a disk seek as opposed to a B-tree. On an …

[Read more]

May

2010

OpenSQL Camp Boston 2010

Posted by Tokuview Blog on Wed 19 May 2010 00:21 UTC
Tags:

opensql camp, announcement, TokuView, MySQL

OpenSQL Camp Boston 2010 will be held at the Stata Center in Cambridge, Massachusetts, October 15-17, 2010.

The Stata Center was designed by Frank Gehry and was completed in 2005. The Stata Center houses CSAIL (The MIT Computer Science and Artificial Intelligence Laboratory) and LIDS (The MIT Laboratory for Information and Decision Systems). Some of my favorite pictures of the Stata Center were taken during construction. (I’m a member of CSAIL.)

The OpenSQL Camp will be held on the first floor …

[Read more]

May

2010

OpenSQL (2009 Portland) talk on an Open Storage Engine API

Posted by Tokuview Blog on Tue 11 May 2010 18:32 UTC
Tags:

storage engine, opensql camp, TokuView, MySQL

I just spotted the YouTube video of my OpenSQL Camp (Portland 2009) talk on An Open Storage Engine API. I talked about some of the technical issues in implementing storage engines across many SQL front ends, not just MySQL.

You can find this talk and other mostly technical material at http://tokutek.com/technology/.

May

2010

What is a Performance Model for SSDs?

Posted by Tokuview Blog on Tue 04 May 2010 19:14 UTC
Tags:

MySQL User Conference, ssd, TokuDB, TokuView, MySQL

Here are the slides and video for my MySQL UC ignite talk on measuring the performance of SSDs.

You can find this talk and other mostly technical material at http://tokutek.com/technology/.

This research was funded in part by the National Science Foundation.
