In our last post, Bradley described how auto increment works
in TokuDB. In this post, I explain one of the big benefits of our
implementation: the ability to combine better primary keys with
clustered primary keys.
In working with customers, the following scenario has come up
frequently. The user has data that is streamed into the table in
order of time. The table has a primary key that is an auto
increment field, ‘id’, and an index on the field ‘time’. The
user's queries all cover some range of time
(e.g. select sum(clicks) from foo where time > date
'2008-12-19' and time < date '2008-12-20';).
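
As a rough sketch of the scenario above, the table might look something like the following. The table name foo and the columns id, time, and clicks come from the description; the column types and the explicit ENGINE clause are assumptions made for illustration, not details taken from any customer schema.

```sql
-- Hypothetical schema for the scenario described above.
-- Column types are assumed; only the names come from the text.
CREATE TABLE foo (
    id     BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    time   DATETIME NOT NULL,
    clicks INT NOT NULL,
    PRIMARY KEY (id),   -- clustered: rows are stored in id order
    KEY (time)          -- secondary index used by the range queries
) ENGINE=TokuDB;

-- A typical query: aggregate over a one-day range of time.
SELECT SUM(clicks)
FROM foo
WHERE time > DATE '2008-12-19'
  AND time < DATE '2008-12-20';
```

Note that the query filters only on time, while the rows themselves are laid out by id.
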
For storage engines with clustered primary keys (such as TokuDB
and InnoDB), having such a schema hurts query performance.
Queries do a range query on a secondary index (time), and then
perform point …