Since I announced SlackDB a few weeks ago, I’ve had a number of questions and interesting conversations in response. I thought I would summarize the initial feedback and answer some questions to help clarify things. One of the biggest questions was “Isn’t this what Drizzle is doing?”, and the answer is no. They are both being designed for “the cloud” and speak the MySQL protocol, but they provide very different guarantees around consistency and high availability. The simple answer is that SlackDB will provide true multi-master configurations through a deterministic and idempotent replication model (conflicts will be resolved via timestamps), whereas Drizzle still maintains transactions and ACID properties, which imply a single master. Drizzle could add support for clustered configurations and distributed transactions (like the NDB storage engine), but writes would still happen on the …
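As a rough illustration of what timestamp-based conflict resolution can look like, here is a minimal last-write-wins sketch in Python. It is not SlackDB code: the `Write` record, the `apply_write` function, and the use of a node id as a tie-breaker for equal timestamps are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    """One replicated write (hypothetical record layout for this sketch)."""
    key: str
    value: str
    timestamp: int  # e.g. microseconds since the epoch
    node_id: str    # assumed tie-breaker when timestamps are equal


def apply_write(store, write):
    """Apply a write so that the newest (timestamp, node_id) pair always wins.

    Re-applying the same write changes nothing (idempotent), and the final
    state does not depend on arrival order (deterministic).
    """
    current = store.get(write.key)
    if current is None or (write.timestamp, write.node_id) > (current.timestamp, current.node_id):
        store[write.key] = write


# Two masters receive the same writes in different orders, with duplicates.
a = Write("user:1", "alice", 100, "node-a")
b = Write("user:1", "alicia", 200, "node-b")

master_1, master_2 = {}, {}
for w in (a, b, a):
    apply_write(master_1, w)
for w in (b, a, b):
    apply_write(master_2, w)
```

Both stores end up holding the write from `node-b`, since it carries the later timestamp. The trade-off of this kind of scheme is that the older conflicting write is silently discarded, which is exactly the sort of guarantee that differs from a transactional, single-master design.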
As you might have guessed from my last couple of blog posts, I’ve been experimenting with a few languages and libraries for a new project. I’ve finally gotten things far enough along that I’d like to start getting other developers and potential users involved. I’m introducing SlackDB, an open source project that combines the functionality of relational databases with the ideas behind eventually consistent, shared-nothing data stores to provide a new database to support new and existing web applications in the cloud (enough buzzwords in there?). This is an idea I wrote about a while ago, and I recently started putting a lot of night and weekend time into it.
It is still very early on in the development process, but the ideas behind it are starting to solidify and a fair amount of code is already written. It’s not all …
Over the past year or so I’ve found myself evaluating my overall programming experience with the languages I’m working with. I might just be getting impatient in my old age (turning the big three-oh in a couple months), but I like to think I’m trying to find the most efficient way to solve the problem at hand. This has led me to learn and experiment with a number of languages, taking a look at each one’s strengths and weaknesses. I realize programming language selection is very subjective and folks can get quite passionate in the debate, but I’m still going to present my personal opinions on the matter. Flame away.
The main question that I’m trying to answer is: What language will enable me to solve the problem at hand correctly and in the fastest way possible? By correctly …
The past two weeks have been both exciting and extremely busy, first traveling to Austin, TX for the first OpenStack Design Summit, and then back home to Portland, OR for the O’Reilly Open Source Convention (OSCON) and Community Leadership Summit. The events were great in different ways, and there was some overlap with OpenStack since we announced it on the first day of OSCON and created quite a bit of buzz around the conference. I want to comment on a few things that came up during these two weeks.
New Role
I’m now focusing on OpenStack-related projects at Rackspace. I’m no longer working …
A few months ago I wrote a tool that verified MySQL and Drizzle protocol compatibility, along with testing for all sorts of edge cases. In analyzing protocol command interactions in mysqld, I found that the MySQL server will happily read an unbounded amount of data if you exceed the maximum packet size while using a special sequence of protocol packets. The reasoning behind this behavior is that the server can be polite and flush your data before sending a “max packet exceeded” error message, but perhaps there should be a limit to one’s politeness. What’s more interesting is that you can do this during the client handshake packet, before authenticating, so anyone could do this to any open MySQL server. The appropriate thing to do here would be to set some maximum limit of data to read and force a connection close when it is reached; otherwise your bandwidth and CPU could be consumed (essentially a DoS attack).
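To make the suggested fix concrete, here is a minimal Python sketch of a reader that stops as soon as a peer declares more data than a fixed cap. It is not the mysqld code and not the tool mentioned above. It relies on the MySQL wire framing (a 3-byte little-endian payload length plus a 1-byte sequence id, with a length of 0xFFFFFF meaning more frames follow); the names `read_packet`, `PacketTooLarge`, `frame`, and the 1 MiB default limit are assumptions made for illustration.

```python
import io


class PacketTooLarge(Exception):
    """Raised when a peer declares more data than we are willing to read."""


def read_exact(stream, n):
    """Read exactly n bytes from a buffered binary stream or raise EOFError."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("stream ended in the middle of a packet")
    return data


def read_packet(stream, max_total=1024 * 1024):
    """Read one logical packet, giving up before buffering past max_total.

    max_total is a hypothetical limit chosen for this sketch.
    """
    chunks = []
    total = 0
    while True:
        header = read_exact(stream, 4)
        length = int.from_bytes(header[:3], "little")  # header[3] is the sequence id
        total += length
        if total > max_total:
            # Refuse based on the declared length alone, without draining
            # the payload; the caller is expected to close the connection.
            raise PacketTooLarge(f"peer declared {total} bytes, limit is {max_total}")
        chunks.append(read_exact(stream, length))
        if length < 0xFFFFFF:  # a full-size frame means another frame follows
            return b"".join(chunks)


def frame(payload, seq=0):
    """Build a single wire frame for the demo (payload must be under 0xFFFFFF bytes)."""
    return len(payload).to_bytes(3, "little") + bytes([seq]) + payload


small = read_packet(io.BytesIO(frame(b"hello")))
```

With a real connection, `stream` could be something like `sock.makefile("rb")`, and the caller would catch `PacketTooLarge` and close the socket instead of politely reading the rest of the data.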
This portion of code …
Open Source Bridge, the “conference for open source citizens,” is right around the corner! The sessions were just announced and it’s going to be packed with quite a variety of really interesting talks. From open cloud computing topics to hardware hacking to language hacks (like HipHop from Facebook), I’m really looking forward to being there (I’m helping organize the event, but hopefully I’ll have time to attend sessions as well).
I wanted to point out a few of the great database talks:
Last week I was surprised to see this paper bubble back up on Planet MySQL. It describes the pros and cons of thread and event based programming for high concurrency applications (like a web server), arguing that thread-based programming is superior if you use an appropriate lightweight threading implementation. I don’t entirely disagree with this, but the problem is that no such library exists that is standard, portable, and useful for all types of applications. We have POSIX threads in the portable Linux/Unix/BSD world, so we need to work with this. Other experimental libraries based on lightweight threads or “fibers” are really interesting as they can maintain your stack without all the normal overhead, but it is hard to get the scheduling correct for all …
Last Friday we held the Drizzle Developer Day at the Santa Clara convention center, taking advantage of the fact that many developers and interested contributors were already there for the MySQL Conference & Expo. Minus a few small glitches like wifi and pizza consumption location, I would say it was an overall success. There were a lot of new folks interested in learning about Drizzle and getting the server up and running. The day was organized by splitting folks up into small groups with matching interests, and then switching up groups every hour or so. We had groups focused on replication, documentation, writing plugins, the optimizer, Boots (the new client tool), and a “getting started” group.
The first group I participated in was about Boots, the new command line tool …
Back in October I wrote about a student group I was sponsoring to create a new command line tool for Drizzle. The group wrapped up their part of the project (the term ended), and we now have a new tool called Boots! A few of the developers are still active in the project, and I’m planning to get involved more as well. We also have a couple students interested in hacking on it for Drizzle’s Google Summer of Code.
Boots is written in Python and aims to replace the previous ‘drizzle’ tool (which was modified from the ‘mysql’ command line tool). It doesn’t support everything that the old tool has yet (like tab completion), but it adds some new features. For example, there are multiple ‘lingos’, or modular languages, that can be used to communicate with the …
Back in January when I was between jobs I had a free weekend to do some fun hacking. I decided to start a new open source project that had been brewing in the back of my head and since then have been poking at it on the weekends and an occasional late night. I decided to call it Scale Stack because it aims to provide a scalable network service stack. This may sound a bit generic and boring, but let me show a graph of a database proxy module I slapped together in the past couple days:
I set up MySQL 5.5.2-m2 and ran the sysbench read-only tests against it with 1-8192 threads. I then started up the database proxy module built on Scale Stack so sysbench would route through that, and you can see the concurrency improved quite a bit at higher thread counts. The database module doesn’t do much; it simply does …