Over the last few years there has been an increasing interest in
immutable data management. This is a big change from the
traditional update-in-place approach many database systems
use today, where new values overwrite old values, which are then
lost. With immutable data you record everything, generally using
methods that append data from successive transactions rather than
replacing earlier values. In some DBMS types you can access the older
values, while in others the system transparently uses the old
values to solve useful problems like implementing eventual
consistency.
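To make the contrast concrete, here is a minimal Python sketch of the two approaches. It is only an illustration under stated assumptions: the class names `UpdateInPlaceStore` and `AppendOnlyStore`, the integer transaction numbers, and the `as_of` parameter are all hypothetical and do not describe how any particular DBMS works.

```python
from dataclasses import dataclass, field


@dataclass
class UpdateInPlaceStore:
    """Hypothetical store where each write overwrites the old value."""
    rows: dict = field(default_factory=dict)

    def put(self, key, value):
        # The previous value for this key is replaced and cannot be recovered.
        self.rows[key] = value

    def get(self, key):
        return self.rows[key]


@dataclass
class AppendOnlyStore:
    """Hypothetical store that appends every write to a log."""
    log: list = field(default_factory=list)

    def put(self, key, value):
        # Assumption: transaction numbers are simply 1, 2, 3, ...
        txn = len(self.log) + 1
        self.log.append((txn, key, value))
        return txn

    def get(self, key, as_of=None):
        # Scan from newest to oldest; optionally ignore later transactions.
        for txn, k, v in reversed(self.log):
            if k == key and (as_of is None or txn <= as_of):
                return v
        raise KeyError(key)

    def history(self, key):
        # Every value ever written for the key, oldest first.
        return [(txn, v) for txn, k, v in self.log if k == key]
```

With the append-only version, `get("balance", as_of=t1)` still returns the value written by the first transaction even after later writes, which is the "access the older values" case described above. The linear scan in `get` keeps the sketch short; it is not meant to suggest how a real system would index the log.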
Baron Schwartz recently pointed out that it can be hard to get
decent transaction processing performance based on append-only
methods like append-only B-trees. This is not a very
strong argument against immutable data per se. …
Like a lot of developers I started using a MacBook Pro around the
time of Tiger. I instantly loved it: simple, fast,
and virtually no system administration overhead. The genius of OS
X was that it never got in the way. You opened the box, pulled
out the machine, and got to work. It had a great user interface,
excellent development tools (Eclipse in my case), and
command-line utilities like ssh, rsync, and bash that worked seamlessly
with Linux systems.
Well, that was then and this is now. Starting with Lion I began
to spend an increasing amount of time fighting OS X instead of
getting work done. I'm now using Mavericks and have not seen much
improvement; in fact, quite the contrary. Here are just a few of
the problems after the Lion to Mavericks upgrade:
- Spotlight indexes destroyed; need 2 days to regenerate
- AppleMail access to Gmail IMAP broken
- Time Machine stuck in …
Anders Karlsson wrote "Some myths on Open Source, the way I see it" a
few days ago. Anders' article is mostly focused on
exploding the idea that open source magically creates high
quality code. It is sad to say you do not have to look very
far to see how true this is.
While I largely agree with Anders' points, there is far more that
could be said on this subject, especially on the benefits of open
source. I love working on open source software. Here are three
reasons that are especially important to me.
1.) Open source is a great way to disseminate technology to
users. In the best cases, it is this easy to get open
source products up and running:
$ sudo apt-get install software-i-want-to-use
A lot of software companies ( …
Baron's post on the GitHub database outage has triggered a number of
excellent articles about the pros and cons of automatic database failover. In the spirit of
Peter Zaitsev's article "The Math of Automated Failover," it seems like
a good time to point out that database failure is usually not the
biggest source of downtime for websites or indeed applications in
general. The real culprit is maintenance.
Here is a simple table showing availability numbers out to 5
nines and what they mean in terms of monthly down-time.
Uptime | … |
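The arithmetic behind such a table is simple enough to sketch in Python. This is an illustration only: the function name `monthly_downtime_minutes` is hypothetical, and the 30-day month is an assumption that may not match the figures used in the original table.

```python
# Assumption: a month is 30 days long.
MINUTES_PER_MONTH = 30 * 24 * 60


def monthly_downtime_minutes(uptime_percent):
    """Minutes of downtime per month allowed at a given uptime percentage."""
    return (1 - uptime_percent / 100) * MINUTES_PER_MONTH


if __name__ == "__main__":
    # One nine through five nines.
    for uptime in (90.0, 99.0, 99.9, 99.99, 99.999):
        minutes = monthly_downtime_minutes(uptime)
        print(f"{uptime}% uptime allows {minutes:.2f} minutes down per month")
```

Under the 30-day assumption, three nines (99.9%) leaves roughly 43 minutes of downtime per month, which a single maintenance window can easily use up.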
GitHub had a recent outage due to malfunctioning automatic
MySQL failover. Having worked on this problem for several
years I felt sympathy but not much need to comment. Then
Baron Schwartz wrote a short post entitled "Is automated failover the root of all evil?"
OK, that seems worth a comment: it's not. Great
title, though.
Selecting automated database failover involves a trade-off
between keeping your site up 24x7 and making things worse by
having software do the thinking when humans are not around.
When comparing outcomes of wetware vs. software it is worth
remembering that humans are not at their best when woken up at
3:30am. Humans go on vacations, or their cell phones run
out of power. …
In late 2011 I attended a lecture by John
Wilkes on Google compute clusters, which link thousands of
commodity computers into huge task processing systems. At
this scale hardware faults are common. Google puts a lot of
effort into making failures harmless by managing hardware
efficiently and using fault-tolerant application programming
models. This is not just good for application up-time.
It also allows Google to operate on cheaper hardware with
higher failure rates, which offers a competitive advantage in
data center operation.
It's becoming apparent we all have to think like Google to run
applications successfully in the cloud. At Continuent we run
our IT and an increasing amount of QA and development on Amazon Web Services …
The MySQL UC this past week was the best in years. Percona
did an outstanding job of organizing the main Percona Live event that ran Tuesday through
Thursday. About 1000 people attended, which is up from the
800 or so at the O'Reilly-run conference in 2011. There
were also excellent follow-on events on Friday for MariaDB/SkySQL, Drizzle, and Sphinx.
What made this conference different was the renewed energy around
MySQL and the number of companies using it.
- Big web properties like Facebook, Twitter, Google, and Craigslist continue to anchor the …
Since the famous conjecture by Eric Brewer and proof by Nancy
Lynch et al., CAP has given the world countless learned
discussions about distributed systems and many a well-funded
start-up. Yet who truly understands what CAP means?
Even a cursory survey of the blogosphere shows
profound disagreement about the meaning of terms
like CP, AP, and CA in real systems. Those who
disagree on CAP include some of the most illustrious
personages of the database community.
We can therefore state with some confidence that CAP is
confusing. Yet this observation itself raises deeper
questions. Is CAP merely confusing? Or is it the
case that as with other initially accepted but now doubtful ideas
like the Copernican …
MySQL community conferences are alive and well in 2012.
Percona has taken the initiative to host the yearly MySQL event
at the Santa Clara Hyatt; it's now called Percona Live MySQL Conference and Expo.
It runs from 10 through 12 April. But don't plan on
going home Thursday night. On Friday 13 April you can also
attend the SkySQL and MariaDB MySQL Solutions Day in
the same location. And wait, that's not all! Drizzle Day is also on 13 April and
also at the Hyatt, so you can catch up on what the Drizzle folks
have been up to for the last 12 months.
Now for some specifics on the conferences where Continuent will
be appearing. …
If you are interested in NoSQL databases (or maybe not) perhaps
you have seen the anonymous
"warning" about using MongoDB. It concludes with the
following pious request:
"Please take this warning seriously."
Now there are a lot of great resources about data management on
the web but the aforementioned rant is not one of them. If
you plan to write technical articles and have people take them
seriously, here are a few tips.
- Sign your name. Readers are more impressed when they see you are not afraid to stand behind your words.
- Explain what problem you were trying to solve. Otherwise uncharitable readers might think you just started pumping information into a new database without thinking about possible consequences and now want to blame somebody …