The next release of MongoDB includes the
ability to select a storage engine, the goal being that different
storage engines will have different capabilities/advantages, and
users can select the one most beneficial to their particular
use-case. Storage engines are cool. MySQL has
offered them for quite a while. One very big difference between
the MySQL and MongoDB implementations is that in MySQL the user
gets to select a particular storage engine for each table,
whereas in MongoDB it's a choice made at server startup. You get
a single storage engine for everything on the particular mongod
instance. I see pros and cons to each decision, but that's a blog
for another day.
In MongoDB 3.0 …
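To make the contrast concrete, here is a minimal sketch (my illustration, not from the post): in MySQL the engine is named per table in the DDL, while in MongoDB 3.0 the choice is a single mongod startup option that applies to the whole instance.

```java
public class EngineChoice {
    public static void main(String[] args) {
        // MySQL: the storage engine is chosen per table, in the DDL.
        String mysqlDdl =
            "CREATE TABLE t (id INT PRIMARY KEY, v VARCHAR(64)) ENGINE=InnoDB;";

        // MongoDB 3.0: the engine is chosen once for the whole mongod instance,
        // at startup (here via the --storageEngine option; paths are examples).
        ProcessBuilder mongod = new ProcessBuilder(
            "mongod", "--storageEngine", "wiredTiger", "--dbpath", "/data/db");

        System.out.println(mysqlDdl);
        System.out.println(String.join(" ", mongod.command()));
    }
}
```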
I just pushed the new Java-based iiBench for MySQL (and Percona
Server and MariaDB); the code and documentation are available now
in the iibench-mysql GitHub repo. Pull requests are
welcome!
The history of iiBench goes back to the early days of
Tokutek.
Since "indexed insertion" is a strength of Fractal Tree indexes, the first iiBench was
created by Tokutek in C++ back in 2008. Mark
Callaghan rewrote iiBench in Python, adding several features
along the way. His version of iiBench is available in …
If you did not read my first blog post about Mark Callaghan’s (@markcallaghan) benchmarks as documented in his blog, Small Datum, you may want to skim through it now for a little context.
——————-
On March 11th, Mark, a former Google and now Facebook database guru, published an insertion rate benchmark comparing MySQL outfitted with the InnoDB storage engine with two NoSQL alternatives — basic MongoDB and …
[Read more]
Since Fractal Tree indexes turn random writes into sequential writes, it’s easy to see why they offer a big advantage for maintaining indexes on rotating disks. It turns out that Fractal Tree indexing also offers significant advantages on SSD. Here are three ways that Fractal Trees improve your life if you use SSDs.
Advantage 1: Index maintenance performance.
The results below show the insertion of 1 billion rows into a table while maintaining three multicolumn secondary indexes. At the end of the test, TokuDB’s insertion rate remained at 14,532 inserts/second whereas InnoDB had dropped to 1,607 inserts/second. That’s a difference of over 9x.
Platform: CentOS 5.6; 2x Xeon L5520; 72GB RAM; LSI MegaRAID 9285; 2x 256GB Samsung 830 in RAID0.
Even on flash, I/O performance costs something. Since TokuDB employs Fractal Tree write-optimized …
[Read more]
We are excited to announce TokuDB® v6.5, the latest version of Tokutek’s flagship storage engine for MySQL and MariaDB.
This version offers optimization for Flash as well as more hot schema change operations for improved agility.
We’ll be posting more details about the new features and performance, so here’s an overview of what’s in store.
- Flash: TokuDB v6.5 continues the great Toku-tradition of fast insertions. On flash drives, we show an order-of-magnitude (9x) faster insertion rate than InnoDB. TokuDB’s standard compression works just as well on flash and helps you get the most out of your storage system. And TokuDB reduces wear …
I have been working for a customer benchmarking insert performance on Amazon EC2, and I have some interesting results that I wanted to share. I used a nice and effective tool, iiBench, which was developed by Tokutek. Though the “1 billion row insert challenge” for which this tool was originally built is long over, the tool still serves well for benchmarking purposes.
OK, let’s start off with the configuration details.
Configuration
First of all let me describe the EC2 instance type that I used.
EC2 Configuration
I chose the m2.4xlarge instance as that’s the instance type with the highest memory available, and memory is what really, really matters.
- High-Memory Quadruple Extra Large Instance
- 68.4 GB of memory
- 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute …
[Read more]
iiBench measures the rate at which a database can insert new rows while maintaining several secondary indexes. We ran this for 1 billion rows with TokuDB and InnoDB starting last week, right after we launched TokuDB v5.2. While TokuDB completed it in 15 hours, InnoDB took 7 days.
The results are shown below. At the end of the test, TokuDB’s insertion rate remained at 17,028 inserts/second whereas InnoDB had dropped to 1,050 inserts/second. That is a difference of over 16x. Our complete set of benchmarks for TokuDB v5.2 can be found here.
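For readers who haven’t seen the tool, here is a rough sketch of an iiBench-style loop (my approximation, not the actual benchmark code; the table layout, column names, and connection settings are assumptions, and MySQL Connector/J must be on the classpath): insert batches of rows into a table carrying several multicolumn secondary indexes and report the sustained insert rate.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Statement;
import java.util.Random;

public class IibenchSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost/test", "user", "password")) {
            // A purchases-style table with three multicolumn secondary indexes.
            try (Statement st = conn.createStatement()) {
                st.execute("CREATE TABLE IF NOT EXISTS purchases ("
                        + " transactionid BIGINT AUTO_INCREMENT PRIMARY KEY,"
                        + " dateandtime DATETIME, cashregisterid INT, customerid INT,"
                        + " productid INT, price FLOAT,"
                        + " KEY idx_price (price, customerid),"
                        + " KEY idx_register (cashregisterid, price, customerid),"
                        + " KEY idx_product (productid, customerid))");
            }
            Random rnd = new Random();
            long rows = 0, batch = 1000, target = 1_000_000; // far short of 1 billion
            long start = System.nanoTime();
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO purchases (dateandtime, cashregisterid, customerid,"
                    + " productid, price) VALUES (NOW(), ?, ?, ?, ?)")) {
                while (rows < target) {
                    for (int i = 0; i < batch; i++) {
                        ps.setInt(1, rnd.nextInt(1000));       // cash register
                        ps.setInt(2, rnd.nextInt(100_000));    // customer
                        ps.setInt(3, rnd.nextInt(10_000));     // product
                        ps.setFloat(4, rnd.nextFloat() * 500f); // price
                        ps.addBatch();
                    }
                    ps.executeBatch();
                    rows += batch;
                }
            }
            double secs = (System.nanoTime() - start) / 1e9;
            System.out.printf("%d rows, %.0f inserts/second%n", rows, rows / secs);
        }
    }
}
```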
…
[Read more]
I’m creating a library of benchmarks and test suites that will run as part of a Continuous Integration (CI) process here at Tokutek. My goal is to regularly measure several aspects of our storage engine over time: performance, correctness, memory/CPU/disk utilization, etc. I’ll also be running tests against InnoDB and other databases for comparative analysis. I plan on posting a series of blog entries as my CI framework evolves, for now I have the results of my first benchmark.
Compression is an always-on feature of TokuDB. There are no server/session variables to enable compression or change the compression level (one goal of TokuDB is to have as few tuning parameters as possible). My compression benchmark uses iiBench to measure the insert performance and compression achieved by …
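As a rough illustration of the kind of measurement involved (my sketch, not the CI harness; the schema and table names are assumptions), one way to approximate the compression achieved after a load is to compare the logical bytes inserted with what information_schema reports for the table, keeping in mind that data_length and index_length are only estimates and different engines report them differently.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class CompressionCheck {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost/test", "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT data_length + index_length FROM information_schema.tables"
                 + " WHERE table_schema = ? AND table_name = ?")) {
            ps.setString(1, "test");
            ps.setString(2, "purchases");
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    long onDisk = rs.getLong(1);
                    // Assumed logical size: ~64 bytes per row over a 1M-row load.
                    long logical = 64L * 1_000_000L;
                    System.out.printf("on disk: %d bytes, approx ratio: %.1fx%n",
                            onDisk, (double) logical / onDisk);
                }
            }
        }
    }
}
```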
[Read more]
We recently made transactions in TokuDB 3.0 durable. We write table changes into a log file so that in the event of a crash, the table changes up to the last checkpoint can be replayed. Durability requires the log file to be fsync’ed when a transaction is committed. Unfortunately, fsyncs are not free, and may cost tens of milliseconds. This may seriously affect the insertion rate into a TokuDB table. How can one achieve high insertion rates in TokuDB with durable transactions?
Decrease the fsync cost
The fsync of the TokuDB log file writes all of the dirty log file data that is cached in memory by the operating system to the underlying storage system. The fsync time can be modeled with a simple linear equation: fsync time = N/R + K, where N is the amount of dirty data that needs to be written to disk, R is the disk write rate, and K is a constant time defined by the …
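The model lends itself to quick back-of-the-envelope numbers. Here is a small sketch (my example figures, not measurements from the post) showing why the constant K dominates when every commit gets its own fsync, and how covering many commits with a single fsync (group commit, whether or not that is what the post goes on to recommend) amortizes that cost.

```java
public class FsyncModel {
    // N = dirty log bytes, R = disk write rate (bytes/s), K = fixed fsync cost (s)
    static double fsyncSeconds(double n, double r, double k) {
        return n / r + k;
    }

    public static void main(String[] args) {
        double r = 100e6;        // assume 100 MB/s sequential write rate
        double k = 0.010;        // assume 10 ms fixed cost per fsync
        double bytesPerTxn = 200; // assume ~200 log bytes per transaction

        // One fsync per committed transaction: K dominates the per-commit cost.
        double perTxn = fsyncSeconds(bytesPerTxn, r, k);
        System.out.printf("1 txn/fsync: %.4f s/txn -> ~%.0f txns/s%n",
                perTxn, 1 / perTxn);

        // Group commit: one fsync covers many transactions, amortizing K.
        int group = 1000;
        double perGroup = fsyncSeconds(group * bytesPerTxn, r, k);
        System.out.printf("%d txns/fsync: %.4f s -> ~%.0f txns/s%n",
                group, perGroup, group / perGroup);
    }
}
```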
[Read more]
OpenSQLCamp was a huge success! I took videos of most of the sessions (we only had 3 video cameras and 4 rooms, so 2 sessions were not recorded). Unfortunately, I was busy doing administrative work for OpenSQLCamp during the opening keynote and the first 15 minutes of session organizing, and by the time I got to the planning board it was already full… so I was not able to give a session.
- Comparing Non-Relational Databases: MongoDB, Tokyo Tyrant, CouchDB by Igal Koshevoy of Pragmaticraft