Showing entries 231 to 240 of 292
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: TokuView (reset)
Tokutek CEO talks to Wikibon about Big Data and MySQL

John Partridge, President and CEO of Tokutek, sits down to talk about MySQL performance and scalability with the Wikibon Project’s Dave Vellante at last week’s MassTLC Big Data Summit.

http://siliconangle.tv/video/tokutek-ceo-discusses-big-data-and-scaling-mysql

The Data is Coming! The Data is Coming!

This was an interesting week for data discussions in the Boston area. There were two back-to-back events this week: Big Data on Wednesday night, hosted by TiE Boston, and Channeling the Big Data Tsunami by MassTLC on Thursday morning. With desperate-sounding headlines, I was beginning to fear that big data was going to storm the Boston ‘burbs to give us a British-style trouncing.

What a relief instead to hear luminaries from startups, established companies, research houses and VCs share their wisdom at the TiE and MassTLC events on what we …

[Read more]
My-Shhhh!-QL

I gave a talk entitled “How to Index Massive Data Sets Quickly” at the Morrelly Homeland Security Center. The event was hosted at the Long Island Forum for Technology (LIFT) and was jointly supported by NYSTAR and the Stony Brook Sensor CAT. The audience was composed primarily of technologists in the area. While many of the people in the audience were not necessarily MySQL users, the problem of working with massive data sets is one that many MySQL users face.

One topic we discussed in detail was the data-ingest problem common to three-letter agencies: sensors generate thousands or millions of data items per …

[Read more]
Tokutek Founder Named to IOUG MySQL Council

I am delighted to announce that Dr. Bradley Kuszmaul, Co-Founder and Chief Architect at Tokutek, has been appointed to the IOUG’s first MySQL Council. In the announcement on the IOUG website, president Andy Flower spoke highly of the MySQL Council’s new members Sarah Novotny, Sheeri Cabral, Giuseppe Maxia, Rob Wultsch, Matt Yonkovit and Bradley Kuszmaul: “their passion, independent perspectives, experience and collective knowledge of MySQL will provide a solid framework for us to support our members and those interested in the MySQL evolution.”

When not building new companies, Dr. Kuszmaul is engaged in advanced database research. His entry won 5 out of 6 categories in Jim Gray’s 2007 benchmark contest, sorting a terabyte in 197 seconds. He formerly was the designer of Akamai’s distributed data collection system, was a Yale …

[Read more]
Partitioning, Free Lunches, & Indexing, Part 2

Review

In part one, I presented a very brief and particular view of partitioning. I covered what partitioning is, with hardly a mention of why one would use partitioning. In this post, I’ll talk about a few use cases often cited as justification for using partitions.

Lots of disks → Lots of partitioning of tables

One use case for justifying partitions is that each partition can be placed on a separate disk to avoid spindle contention. I have to say that on this one, I agree with Kevin Burton, who makes the point that if you want to distribute I/O load across several disks, you can use a RAID configuration on the disks. In this case, he says that partitioning is not worth the trouble. [NB. He makes the point that this …

[Read more]
Partitioning, Free Lunches, and Indexing

Why partition?

Partitioning is a commonly touted method for achieving performance in MySQL and other databases. (See here, here, here and many other examples.) I started wondering where the performance from partitions comes from, and I’ve summarized some of my thoughts here.

But first, what is partitioning? (I’ve taken the examples from Giuseppe Maxia’s Partitions in Practice intro.)

CREATE TABLE by_year (
   d DATE
)
PARTITION BY RANGE (YEAR(d))
(
   PARTITION P1 VALUES LESS THAN (2001),
   PARTITION P2 VALUES LESS THAN …
[Read more]
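To make the RANGE rule in the excerpt above concrete, here is a minimal Python sketch of how a row's date maps to a partition. It is an illustration of the "VALUES LESS THAN" semantics, not code from the post. Only the bound 2001 for P1 comes from the excerpt; the other bounds and partition names are hypothetical, since the original definition is cut off.

```python
import bisect
from datetime import date

# Upper bounds for VALUES LESS THAN, in ascending order.
# Only 2001 (partition P1) appears in the excerpt; the rest are made up.
BOUNDS = [2001, 2002, 2003]
NAMES = ["P1", "P2_hypothetical", "P3_hypothetical"]

def partition_for(d: date) -> str:
    """Return the first partition whose upper bound is greater than YEAR(d)."""
    # bisect_right skips every bound <= d.year, because LESS THAN is strict.
    idx = bisect.bisect_right(BOUNDS, d.year)
    if idx == len(BOUNDS):
        # No partition covers this year, so the row cannot be placed.
        raise ValueError(f"no partition accepts year {d.year}")
    return NAMES[idx]
```

A date in 2000 lands in P1, while 1 January 2001 already falls into the next partition, because the bound is exclusive.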
Announcing TokuDB for MariaDB

Tokutek is pleased to announce support for MariaDB for the first time with TokuDB v4.1.1 for MariaDB v5.1.47.

Our customers are choosing MariaDB more and more frequently for their most demanding database applications. We are delighted to help raise MariaDB performance to the next level by making TokuDB available on this new platform. One of our customers, who wishes to remain unnamed for the present time, chose MariaDB + TokuDB for a 3 TB database after having evaluated other MySQL alternatives and finding them unacceptably slow.

TokuDB continues to be the ideal choice for complex / high-volume applications that must have fast response times and that must simultaneously store and query large volumes of rapidly arriving data:

  • Social Networking
  • Real-time clickstream analysis
  • Logfile Analysis
  • eCommerce …
[Read more]
Avoiding Fragmentation with Fractal Trees

Summary

B-trees suffer from fragmentation. Fragmentation causes headaches, both in query performance and in space used. Solutions like dump and reload or OPTIMIZE TABLE are a pain and not always effective. Fractal trees don’t fragment. So if fragmentation is a problem, check out Tokutek.

What is fragmentation?

What do I mean when I say “fragmentation”? People complain about two things when they talk about index fragmentation. They either find that the disk space used is much larger than the data, or they complain about query performance. In particular, they complain of range query performance, since point queries aren’t really affected by fragmentation. I’m going to focus on this second symptom, which is due to lack of locality of reference amongst the rows.
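The locality point can be illustrated with a toy model in Python. This is a sketch under strong assumptions, not a model of any real storage engine: each leaf page has a physical position on disk, a range query reads pages in key order, and every jump to a non-adjacent position counts as one seek. The function names and the random-shuffle layout are hypothetical choices made for illustration.

```python
import random

def seeks_for_range_scan(physical_positions):
    """Count non-sequential jumps when pages are read in key order.

    physical_positions[i] is the disk position of the i-th page in key order.
    """
    seeks = 0
    for prev, cur in zip(physical_positions, physical_positions[1:]):
        if cur != prev + 1:  # next page is not physically adjacent
            seeks += 1
    return seeks

def fragmented_layout(num_pages, seed=0):
    """Assumed worst case: pages scattered at random over the same positions."""
    rng = random.Random(seed)
    layout = list(range(num_pages))
    rng.shuffle(layout)
    return layout

# Pages laid out in key order need no extra seeks for a full range scan.
contiguous = list(range(1000))
scattered = fragmented_layout(1000)
print(seeks_for_range_scan(contiguous), seeks_for_range_scan(scattered))
```

A point query touches one leaf page in either layout, which is why this model only penalizes range scans.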

B-trees fragment

In the following, I’ll be talking about the limits of B-trees. Not InnoDB …

[Read more]
Scenarios where TokuDB’s Loader is Used

TokuDB’s loader uses the available multicore computing resources of the machine to presort and insert the data. In the last couple of posts (here and here), Rich and Dave presented performance results of TokuDB’s loader. Comparing load times with TokuDB 2.1.0, Rich found a 2.1x speedup on a 2 core machine, and a 4.2x speedup on an 8 core machine. Comparing load times with TokuDB 3.1, Dave found an 8.2x speedup on an Amazon Web Services c1.large node with 8 cores while loading a table with 256 byte rows.

This leads to these natural questions: how does one use the TokuDB loader? Under what scenarios is it used?

The loader has two purposes:


  • to ease migration of data from other sources (e.g. …
[Read more]
Loading Tables with TokuDB 4.0

Often, the first step in evaluating and deploying a database is to load an existing dataset into the database. In the latest version, TokuDB makes use of multi-core parallelism to speed up loading (and new index creation). Using the loader, MySQL tables using TokuDB load 5x-8x faster than with previous versions of TokuDB.

Measuring Load Performance

We generated several different datasets to measure the performance of TokuDB when doing a LOAD DATA INFILE … command. To characterize performance, we vary

  • rows to load
  • keys per row
  • row length (including keys)

All generated keys, including the primary, are random, 8-byte values. The remaining data, needed to pad out the row length to the specified length, is text.

Two files are produced as part of data generation.

  1. data file, containing ‘|’ separated fields
  2. sql file, containing the …
[Read more]
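A data file of the kind described above could be generated along the following lines. This is a minimal Python sketch, not the generator used in the post: the function names, the decimal encoding of the keys, the lowercase-letter padding, and the choice to count each key as 8 bytes of the row length are all assumptions. The sql file is not sketched, because its contents are cut off in the excerpt.

```python
import io
import random
import string

KEY_BYTES = 8  # the post states that every key is a random 8-byte value

def make_row(rng, keys_per_row, row_length):
    """Build one '|'-separated row: random 8-byte keys, then text padding."""
    pad_len = row_length - KEY_BYTES * keys_per_row
    if pad_len < 0:
        raise ValueError("row_length is too small to hold the keys")
    # Each key is a random 64-bit integer, written in decimal (an assumption).
    keys = [str(rng.getrandbits(8 * KEY_BYTES)) for _ in range(keys_per_row)]
    # Pad the remainder of the logical row length with random text.
    padding = "".join(rng.choices(string.ascii_lowercase, k=pad_len))
    return "|".join(keys + [padding])

def write_data(out, rows, keys_per_row, row_length, seed=0):
    """Write `rows` generated rows to a text stream, one row per line."""
    rng = random.Random(seed)
    for _ in range(rows):
        out.write(make_row(rng, keys_per_row, row_length) + "\n")

# Usage: three rows with two keys each and a 256-byte logical row length.
buf = io.StringIO()
write_data(buf, rows=3, keys_per_row=2, row_length=256)
print(buf.getvalue().splitlines()[0][:60])
```

The three parameters mirror the dimensions varied in the measurements: rows to load, keys per row, and row length including keys.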