Showing entries 191 to 200 of 292
Displaying posts with tag: TokuView
Write Optimization: Myths, Comparison, Clarifications

Some indexing structures are write optimized in that they are better than B-trees at ingesting data. Other indexing structures are read optimized in that they are better than B-trees at query time. Even within B-trees, there is a tradeoff between write performance and read performance. For example, non-clustering B-trees (such as MyISAM) are typically faster at indexing than clustering B-trees (such as InnoDB), but are then slower at queries.

This post is the first of two about how to understand write optimization, what it means for overall performance, and how the various write-optimized indexing schemes differ. We’ll be talking about how to deal with workloads that don’t fit in memory. In particular, if we had our data in B-trees, only the internal nodes (perhaps not even all of them) would fit in memory.

As I’ve already said, there is a tradeoff between write and read …

Compression Benchmarking: Size vs. Speed (I want both)

I’m creating a library of benchmarks and test suites that will run as part of a Continuous Integration (CI) process here at Tokutek. My goal is to regularly measure several aspects of our storage engine over time: performance, correctness, memory/CPU/disk utilization, etc. I’ll also be running tests against InnoDB and other databases for comparative analysis. I plan on posting a series of blog entries as my CI framework evolves; for now, I have the results of my first benchmark.

Compression is an always-on feature of TokuDB. There are no server/session variables to enable compression or change the compression level (one goal of TokuDB is to have as few tuning parameters as possible). My compression benchmark uses iiBench to measure the insert performance and compression achieved by …

Online Advertiser Intent Media Selects TokuDB over InnoDB and NoSQL for Big Data Ad-Hoc Analysis

Intent Media

Issue addressed: Ad hoc analytics on clickstream data arriving too fast for InnoDB or NoSQL to handle.

TokuDB powers an online advertising application

The Company: Headquartered in New York, Intent Media is a fast-growing online advertising startup. The company helps some of the largest online retailers monetize their traffic more efficiently at scale by showing highly relevant and targeted advertising to the 97+% of e-commerce visitors who do not transact.

The Challenge: The Intent Media platform processes hundreds of millions of events a day generated by media placements across leading e-commerce sites — a textbook “Big Data” challenge. Intent Media’s data is used to optimize media placements, drive segmentation models, and …

Ask What Your Database Can Do for Your Country

How many in your household again?

One of President John Kennedy’s most memorable phrases is “ask not what your country can do for you – ask what you can do for your country”. I got to thinking about this over lunch with a colleague in the big data space. After comparing named customers for a while, we realized we had forgotten one of the biggest “big data” customers whom we both have in common – the government.

Whether you believe in small or big government, one thing is for certain – it has some very big data on its hands. Some of this is freely available, such as the …

Database Insights from Archimedes to the Houston Rockets

Archimedes, the first DBA

According to a recent MIT Sloan Management Review study, top performing organizations use analytics 5 times more than lower performers. That’s pretty astounding. And while we all know about the ocean/lake/waves/(your favorite water analogy) of Big Data we struggle with every day, information is not knowledge. So how can we get insight from data? Recent articles from O’Reilly and HBR offered some …

May the Index be with you!

The summer’s end is rapidly approaching — in the next two weeks or so, most people will be settling back into work. Time to change your mindset, re-evaluate your skills and see if you are ready to go back from the picnic table to the database table.

With this in mind, let’s see how much folks can remember from the recent indexing talks my colleague Zardosht Kasheff gave (O’Reilly Conference, Boston, and SF MySQL Meetups). Markus Winand’s site “Use the Index, Luke!” (not to be confused with …

It Actually is Easy Being Green

(Fractal) Tree Frog

Fractal Tree™ indexes are green. They have the potential to be greener still. Here’s why:

Remarkably, data centers consume 1-3 percent of all US electricity. A majority of this power is used to drive servers and storage systems. Significant energy savings remain on the table.

Here’s why Fractal Tree indexing enables more energy-efficient storage: Data centers typically use many small-capacity disks rather than a few large-capacity disks. Why? One reason is to harness more spindles to obtain more I/Os per second. In some high-performance applications, users go so far as to employ techniques such as “ …

Cage Match: OldSQL, NoSQL and NewSQL

When I interviewed at Tokutek, I met a team of distinguished academics and engineers who could calmly and thoughtfully wax eloquent about the finer points of B-tree and Fractal Tree™ indexing, drive I/Os, and database engines. Soon after, I discovered that several of my colleagues have a second passion: they practice Mixed Martial Arts (MMA). As Wikipedia explains, MMA showcases “fighters of different disciplines, including boxing, Brazilian Jiu-Jitsu, wrestling, Muay Thai, karate and others.” I’ve since learned about many different fighting styles.

This was useful to understand when an MMA-style fight broke out in the MySQL world earlier this month between the different variants or …

This Weekend in Japan

We were happy to see a lot of folks from Japan on Twitter this weekend having a discussion about MySQL and Tokutek. While we always endeavor to explain ourselves as simply as possible, hearing what users and peers have to say and ask in their native language is very helpful. Here is a sampling of several of the 30+ tweets and re-tweets (translations courtesy of a colleague I know from frequent past visits to Tokyo and Yokohama):

First, @frsyuki provided a general overview:

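“TokuDB” is a new kind of MySQL storage engine. INSERTs are roughly 20 to 80 times faster, you can load several TB of data without partitioning, it supports MVCC, and so on. It apparently implements an algorithm called Fractal Tree. http://www.tokutek.com/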

Dude, Where’s my Fractal Tree?

Unless you are Ashton Kutcher (@aplusk), or one of his Hollywood buddies, you don’t need to read any further. Allow me to explain…

Over the weekend, we launched our new website. This type of announcement used to be interesting in the high-tech world. I heard Kara Swisher of the WSJ’s All Things D speak at a MassTLC event in May. She admitted that back in the 1990s, when the web was just getting into high gear, a new website from an interesting company might actually get some coverage. Not anymore.

I’ve also been told at all the SEO classes I’ve …
