In Maximum MySQL Database Size? Nick Duncan wants
to find out what the maximum size of his MySQL database can
possibly be. He answers that with a list of maximum file sizes
per file system type. That is not a useful answer.
While every file system does have a maximum file size, this
limitation is usually not relevant when it comes to MySQL maximum
database size. But let's start with file systems, anyway.
First: You never want to run a database system on a FAT
filesystem, ever. In FAT, a file is a linked list of blocks in
the FAT. That is, certain seek operations (backward seeks in
particular) become slower the larger a file is, because the file
system has to position the file pointer by traversing the linked
list of blocks in the FAT. Since seek operations are basically
what a large database does all day, FAT is …
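To illustrate why seeks get slower as a file grows, here is a minimal sketch of a toy FAT in Python. It is not real file system code: the dictionary layout, the `END` marker, the 4096-byte cluster size and the function names are all assumptions made for illustration. The point is only that finding the cluster for a given offset means walking the chain from the start, so the cost grows with the offset.

```python
# Toy model of a FAT: fat[cluster] is the next cluster of the file,
# or END for the last cluster. All names here are hypothetical.
END = -1


def build_chain(num_clusters, start=2):
    """Return (fat, first_cluster) for a file occupying num_clusters clusters."""
    fat = {}
    for i in range(num_clusters):
        cluster = start + i
        fat[cluster] = cluster + 1 if i < num_clusters - 1 else END
    return fat, start


def seek(fat, first_cluster, offset, cluster_size=4096):
    """Find the cluster holding byte `offset`; return (cluster, hops).

    There is no index into the chain, so we follow one link per
    cluster that lies before the target offset.
    """
    cluster = first_cluster
    hops = 0
    for _ in range(offset // cluster_size):
        cluster = fat[cluster]
        if cluster == END:
            raise ValueError("offset beyond end of file")
        hops += 1
    return cluster, hops


if __name__ == "__main__":
    fat, first = build_chain(1000)
    # Seeking near the end of the file costs one hop per preceding cluster.
    print(seek(fat, first, 0))
    print(seek(fat, first, 999 * 4096))
```

A backward seek is the bad case in this model because the links only point forward: without a cached position, the walk has to restart at the first cluster.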
Where I work, Merlin is an important tool for us and provides a
lot of insight that other, more generic monitoring tools do not
provide. We love it, and in fact love it so much that we have
about 140 database agents reporting into Merlin 2.0 from about
120 different machines. That results in a data influx of about
1.2G a day without using QUAN, and in a data influx of about 6G a
day using QUAN on a set of selected machines.
This influx completely overwhelms the Merlin data purge process, so the
merlin database grows out of bounds, which is quite unfortunate
because our disk space is in fact very bounded.
The immediate answer to our purge problem was to disable the
merlin internal purge and, with the kind help of MySQL support, to
create a script that generates a list of record ids to delete.
These ids end up in a number of delete statements with very large
WHERE ... IN (...) clauses that do the actual delete.
This …
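The approach described above, turning a list of record ids into a series of DELETE statements with WHERE ... IN (...) clauses, could look roughly like the following sketch. This is not the script MySQL support provided. The table name `sample_data`, the column name `id` and the chunk size are hypothetical placeholders, since the excerpt does not name them.

```python
def chunked(ids, size):
    """Yield consecutive slices of `ids` with at most `size` elements."""
    for i in range(0, len(ids), size):
        yield ids[i:i + size]


def delete_statements(table, id_column, ids, chunk_size=1000):
    """Yield one DELETE statement per chunk of ids.

    `table` and `id_column` are assumed to be trusted identifiers;
    every id is forced through int() so only numbers reach the SQL text.
    """
    for chunk in chunked(list(ids), chunk_size):
        id_list = ", ".join(str(int(i)) for i in chunk)
        yield f"DELETE FROM {table} WHERE {id_column} IN ({id_list});"


if __name__ == "__main__":
    # Hypothetical table and column names, tiny chunk size for readability.
    for stmt in delete_statements("sample_data", "id", range(1, 6), chunk_size=2):
        print(stmt)
```

Splitting the id list into chunks keeps each statement bounded in size; how large a chunk can sensibly be depends on the server configuration and is left as a parameter here.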
According to my findings in Bug #31876, MySQL does not commit data to disk
in Windows using the same method MS SQL Server and DB/2 are
using. The method MySQL uses appears to be seven times slower in
pathological scenarios.
The bug report contains a patch - thanks to the MySQL WTF (The
Windows Task Force) and the lab provided by the customer for
helping me to find that.
Does this work for you? I want to hear about your test
results.
Lately, I have had the opportunity to evaluate a very large Ruby
installation that also was growing very quickly. A lot of the
work performed on site has been specific to the site, but other
observations are true for the platform no matter what is being
done on it. This article is about Ruby on Rails and its
interaction with MySQL in general.
This article does not even contain the words database or MySQL. I
still believe it is somewhat interesting.
Mail has, for some reason, always played a big role in my
life. I ran mail for two, my girlfriend and me, in
1988. I ran mail for 20 and 200 people in 1992,
setting up a citizens' network. Later I designed and built mail
systems for 2 000 and 20 000 person corporations, and planned
mail server clusters for 200 000 and 2 million users. And just
before I became a consultant at MySQL I was working for a shop
that did mail for a living for 20 million users.
Mail is a very simple and well defined collection of services.
You accept incoming messages to local users, you implement
relaying for your local users with POP-before-SMTP and SMTP AUTH,
you build POP, IMAP and webmail accesses, and you deploy spam
filter systems and virus scanners for incoming and outgoing
messages. These services …
In Semi-Dynamic Data, Sheeri writes about
Semi-Dynamic Data and content pregeneration. In her article, she
suggests that for rarely changing data it is often advisable to
precompute the result pages and store them as static content.
Sheeri is right: Nothing beats static content, neither for speed
nor for reliability. But pregenerated pages can be a
waste of system resources when the number of possible pages is
very large, or if most of the pregenerated pages are never
hit.
An intermediate scenario may be a statification system and some
clever caching logic.
Statification is the process of putting your content generation
code into a 404 page handler and having that handler generate
requested content. The idea is that on a …
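As a rough illustration of a 404 handler that generates the requested content, consider the sketch below. It is not taken from the article: the function name `handle_missing`, the `render` callback and the decision to write the generated page into the document root (so that a later request could be served as a plain static file) are all assumptions made for this example.

```python
import os


def handle_missing(docroot, url_path, render):
    """Generate content for a URL that has no static file yet.

    Hypothetical entry point: a web server would call this from its
    404 handler. `render` is an assumed callback that maps the URL
    path to page content as a string.
    """
    root = os.path.normpath(docroot)
    target = os.path.normpath(os.path.join(root, url_path.lstrip("/")))
    # Refuse paths that would escape the document root.
    if not target.startswith(root + os.sep):
        raise ValueError("path escapes document root")

    content = render(url_path)

    # Write to a temporary name first, then rename, so a concurrent
    # reader never sees a half-written page.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    tmp = target + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        f.write(content)
    os.replace(tmp, target)
    return content
```

In this sketch, only pages that are actually requested are ever generated, and invalidating a page amounts to deleting its file so that the handler regenerates it on the next miss.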
Scaling Patterns
This is a translation of a German-language article I wrote two
weeks ago for my German-language blog.
In 2004, when I was still working for web.de, I gave a little
talk on Scaleout at Linuxtag. Even back then the major messages
of the talk were "Every read problem is a cache problem" and
"Every write problem is a problem of distribution and batching":
To scale, you have to partition your application into smaller
subsystems and …