In the past few months, I have tested many NoSQL solutions:
Redis, MongoDB and HBase. Yet Cassandra is the column store DB I
picked because of its write speed, its reliability, and a built-in
feature set that makes it multi-datacenter aware. One other
personal plus for Cassandra is that it is written in Java. I like
reading and writing Java more than C++, although in the end it
does not really matter to me.
Let us talk about the reason why I am introducing Cassandra into
my infrastructure and some of its drawbacks I have noticed so
far.
Why it is being introduced:
We have a feature where we record every single click for 50
million Monthly Active Users in real time, and storing this in
MySQL is just a waste of semi-good hardware for data that is only
looked at for the past 24 hours. Over the course of a couple of
months, more than 3 billion rows accumulated, which
translated into a 3.5 TB distributed …
Updating the MPL. Funding for Lucid and eXo. StatusNet. And more.
Follow 451 CAOS Links live @caostheory on Twitter and
Identi.ca
“Tracking the open source news wires, so you don’t have
to.”
Updating the MPL
# ZDNet reported that the 10-year-old Mozilla Public
License will be updated by the end of 2010, while Mitchell Baker
explained the
process.
Funding for Lucid and eXo
# Lucid Imagination raised $10m in series B funding from Shasta
Ventures, Granite Ventures and Walden International.
# eXo Platform raised $6m from Auriga …
This is the 182nd edition of Log Buffer, the weekly review of database blogs. Make sure to read the whole edition so you do not miss where to submit your SQL limerick!
This week started out with me posting about International Women’s Day, and has me personally attending Confoo (Montreal), which is an excellent conference I hope to return to next year. I learned a lot from Confoo, especially the Blending NoSQL and SQL session I attended.
This week was also the Hotsos Symposium. …
Persistence Smoothie: Blending NoSQL and SQL – see user feedback and comments at http://joind.in/talk/view/1332.
Michael Bleigh is from Intridea, a high-end Ruby and Ruby on Rails consultancy that builds apps from start to finish and makes them scalable. He’s written a lot of stuff, available at http://github.com/intridea. He is @mbleigh on Twitter.
NoSQL is a new way to think about persistence. Most NoSQL systems are not ACID compliant (Atomicity, Consistency, Isolation, Durability).
Generally, most NoSQL systems have:
- Denormalization
- Eventual Consistency
- Schema-Free
- Horizontal Scale
NoSQL tries to scale (more) simply, and it is starting to go mainstream – NY …
Novell’s Q1. The future of OpenSolaris. And more.
# Novell reported Linux platform revenue of $37.5m in Q1, up 6.4%.
# Internet.com reported that Novell’s Linux business broke even as Microsoft deal revenues fade.
# As The H reported, Oracle exec Dan Roberts confirmed that OpenSolaris has a future at Oracle.
# Citrix acquired Paglo, launched GoToManage service.
# StatusNet …
Motivated by today’s public news on Twitter’s move from MySQL to Cassandra, by my own desire to build skills and consider Cassandra following in-depth discussions at last November’s Open SQL Camp, and by yesterday’s discussion with a new client on persistent key-value store products, today I downloaded, installed and configured Cassandra for the first time. Not that today’s news was unexpected: if you follow the Twitter Engineering open source projects, you would have seen Cassandra as well as other products being used or evaluated by Twitter.
So I went from nothing to a …
The No-SQL tag really lumps together a lot of concepts that are in fact as distinct from each other as they are from SQL/RDBMS.
An object store is not at all similar to Cassandra and Hypertable, which are not at all like a column store. And when looking at BigTable derivatives, it’s quite important to realise that Google actually does joins in middle layers or apps, so while BigTable does not have joins, the apps essentially do use them. I’ve heard it professed that denormalising everything might be a fab idea, but I don’t quite believe in that for all cases, just as I don’t believe that ditching the structured form of RDBMS is the solution.
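To make the idea of doing joins in the application layer concrete, here is a minimal Python sketch. It is only an illustration: plain dictionaries and lists stand in for a store without join support, and the data and the function name `join_clicks_with_users` are hypothetical, not taken from BigTable or any real product.

```python
# Hypothetical data: two "tables" held in a store that offers no joins.
users = {"u1": {"name": "Ada"}, "u2": {"name": "Linus"}}
clicks = [
    {"user_id": "u1", "url": "/home"},
    {"user_id": "u2", "url": "/docs"},
    {"user_id": "u1", "url": "/pricing"},
    {"user_id": "u9", "url": "/orphan"},  # no matching user
]


def join_clicks_with_users(clicks, users):
    """Application-side join: look up each click's user by key."""
    joined = []
    for click in clicks:
        user = users.get(click["user_id"])
        if user is not None:  # inner-join semantics: drop unmatched clicks
            joined.append({"name": user["name"], "url": click["url"]})
    return joined


for row in join_clicks_with_users(clicks, users):
    print(row["name"], row["url"])
```

The store only ever answers key lookups; the work of matching rows, which an RDBMS would do for you, moves into the application.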
SQL/RDBMS has had a few decades of dominance now, and has thus become the great “general purpose” tool. With the ascent of all the other tools, it’s definitely worthwhile to look at them, but also realise that each (including SQL based ones) has its place. Moving all your …
OpenSQLCamp was a huge success! I took videos of most of the sessions (we only had 3 video cameras and 4 rooms, so 2 sessions were not recorded). Unfortunately, I was busy doing administrative stuff for OpenSQLCamp during the opening keynote and the first 15 minutes of the session organizing, and when I got to the planning board, it was already full... so I was not able to give a session.
- Comparing Non-Relational Databases: MongoDB, Tokyo Tyrant, CouchDB by Igal Koshevoy of Pragmaticraft
The Cassandra database has been getting quite a lot of
publicity recently. I think this is a good thing in general, but
it seems that some people are considering using it for unsuitable
purposes.
Cassandra is a cluster database which uses multiple nodes to
provide
- Read-scaling
- Write-scaling
- High availability
Unless you need at least TWO of those things, you should probably
not bother.
Good reasons to use Cassandra:
High availability
Cassandra tolerates the failure of some nodes and will continue
to read data and take writes despite some nodes being offline or
unreachable - the exact behaviour depends on its settings and
what consistency level of read/write is requested.
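As a rough illustration of how the requested consistency level interacts with node failures, here is a minimal Python sketch. It is a simplified model, not Cassandra's actual code: the function names are made up for this example, it only covers the ONE, QUORUM and ALL levels, and it assumes a quorum is a strict majority of the replicas (replication factor // 2 + 1).

```python
def quorum(replication_factor: int) -> int:
    """Strict majority of the replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_replicas(level: str, replication_factor: int) -> int:
    """Replicas that must respond for a request at the given level."""
    levels = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return levels[level]


def can_serve(level: str, replication_factor: int, replicas_down: int) -> bool:
    """True if enough replicas are still reachable for this level."""
    alive = replication_factor - replicas_down
    return alive >= required_replicas(level, replication_factor)


def read_overlaps_write(read_level: str, write_level: str,
                        replication_factor: int) -> bool:
    """True if every read set must share a replica with every write set."""
    needed = (required_replicas(read_level, replication_factor)
              + required_replicas(write_level, replication_factor))
    return needed > replication_factor


if __name__ == "__main__":
    for level in ("ONE", "QUORUM", "ALL"):
        print(level, can_serve(level, replication_factor=3, replicas_down=1))
```

With a replication factor of 3 and one replica down, requests at ONE and QUORUM can still be served in this model, while ALL cannot. Reading and writing at QUORUM also guarantees that a read touches at least one replica that received the latest write, which ONE/ONE does not.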
Write scaling
Cassandra allows you to scale writes by just adding …
Key-value databases are catching fire these days. Memcached, Redis, Cassandra, Keyspace, Tokyo Tyrant, and a handful of others are surging in popularity, judging by the contents of my feed reader.
I find a number of things interesting about these tools.
- There are many more of them than open-source traditional relational databases. (edit: I mean that there are many options that all seem similar to each other, instead of 3 or 4 standing out as the giants.)
- It seems that a lot of people are simultaneously inventing solutions to their problems in private without being aware of each other, then open-sourcing the results. That points to a sudden sea change in architectures. Tipping points tend to be abrupt, which would explain isolated redundant development.
- Many of the products are feature-rich with things programmers need: diverse language bindings, APIs, embeddability, and the ability to speak familiar …