Showing entries 71 to 80 of 105
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: sharding (reset)
Being successful like Pinterest without its DB adventures...

I just came across this: "Scaling Pinterest and adventures in database sharding"  (http://gigaom.com/data/scaling-pinterest-and-adventures-in-database-sharding/)
"Pinterest has learned about scaling the way most popular sites do — the architecture works until one day it doesn’t"Pinterest found out that "the architecture" is not scalable and they turned to development of a Scale Out mechanism also called Sharding.

I find it amazing that sharding, or in other words, the idea of "scale out by splitting and parallelizing data across shared-nothing commodity-hardware" is not supplied "out of the box" by "the architecture" (such as database, load-balancer, any other IT stuff). I'm wondering who was the one that decided that an IT issue like scale-out should be outsourced from the database to the …

[Read more]
Facebook makes big data look... big!

Oh I love these things: http://techcrunch.com/2012/08/22/how-big-is-facebooks-data-2-5-billion-pieces-of-content-and-500-terabytes-ingested-every-day/

Every day there are 2.5B content items shares, and 2.7B "Like"s. I care less about GiGo content itself, but metadata, connections, relations are kept transactionally in a relational database. The above 2 use-cases generate 5.2B transactions on the database, and since there are only 86400 seconds a day, we get over 60000 write transactions per second on the database, from these 2 use-cases alone, not to mention all other use-cases, such as new profiles, emails, queries...

And what's the size of new data, on top of all the existing …

[Read more]
Scale Up, Partitioning, Scale Out

On the 8/16 I conducted a webinar titled: "Scale Up vs. Scale Out" (http://www.slideshare.net/ScaleBase/scalebase-webinar-816-scaleup-vs-scaleout):


ScaleBase Webinar 8.16: ScaleUp vs. ScaleOut from ScaleBase
The webinar was successful, we had many attendees and great participation in questions and answers throughout the session and in the end. Only after the webinar it only occurred to me that one specific graphic was missing from the webinar deck. It was occurred to me after answering several audience questions about "the difference between …

[Read more]
ARM based data center. Inspiring.

In a previous post I wrote ARM based servers. Since then, and thanks to all the comments and responses I got, I looked more into this ARM thing and it's absolutely fascinating...

Look at this beauty (taken from the site of Calxeda, the manufacturer):

What is it? A chip? A server? No, it's a cluster of 4 servers...

And this:

is HP Redstone Server, 288 chips, 1,152 cores (Calxeda quad-core SoC) in a 4U server “Dramatically reducing the cost and complexity of cabling and …

[Read more]
Why shared-storage DB clusters don't scale

Yesterday I was asked by a customer for the reason why he had failed to achieve scale with a state-of-the-art "shared-storage" cluster. "It's a scale-out to 4 servers, but with a shared disk. And I got, after tons of work and efforts, 130% throughput, not even close to the expected 400%" he said.

Well, scale-out cannot be achieved with a shared storage and the word "shared" is the key. Scale-out is done with absolutely nothing shared or a "shared-nothing" architecture. This what makes it linear and unlimited. Any shared resource, creates a tremendous burden on each and every database server in the cluster.

In a previous post, I identified database engine activities such as buffer management, locking, thread locks/semaphores, and recovery tasks - as the main bottleneck in the OLTP …

[Read more]
Impressions from Amazon's AWS Summit in NYC

Yesterday (4/19) I attended the AWS Summit in NYC (http://aws.amazon.com/aws-summit-2012/nyc).

I'm a big fan and also a heavy user of AWS especially S3, EC2, and naturally, RDS. In every point in time I have several dozens of AWS machines running for me out there in the East region, and in some cases when we do some special benchmarks and tests, number of EC2 and RDS machines can easily reach 3-digit. As I said, I'm a fan...

A few quotes I was able to catch and document on my laptop, on my laps...:
"When you develop an app for facebook, you must be prepared (and be afraid) that to your party, not noone will show up, but everybody will show up!" So true! Simple and true. We all want to succeed, to have success with our app. We have to think about scaling from day 1.
"Database was bottleneck for building of sophisticated apps. This is …

[Read more]
So how can we scale databases?

There are ways to scale databases, unfortunately some are limited, some introduce complexities, some are do not fit the cloud...

By scaling solution I mean a solutions that help me scale my existing environment, my existing RDBMS. Some magic or technology that will take my existing Oracle or MySQL for example, to the next level, without porting to a new DB engine/vendor and without completely recoding my app.

Let's try to organize things a bit in this very summarized table, just to get the hunch of it. I can't imagine to cover it all in 1 table or even 100 pages, but that should be a start of a meaningful discussion to continue in next posts:

Solution Scales reads? Scales writes? Scales data? Scales sessions?
[Read more]
Applications come and go. Databases are here to scale.

In my heart, I'm a DBA, always was and always will be. People say I'm a database guy by the way I think, keep my car, and file my music and also bank statements... However I did great deal of development, design, architecture on the apps side. I (hope to) have some perspective.

Applications come and go. The second programming language I've ever learned and worked on was COBOL, some still say most of the world's lines of code are written in this language, maybe so, but anyway I since then have known and written in dozens of programming languages, from Assembly to Force.com, from Pascal to Delphi, from functional C to Object Oriented SmallTalk, C++, Java and , from compiled C/CGI to interpreted Perl, ASP and Ruby back to compiled node.js... My first applications ran on Main-Frame with green screen, later I created beautiful graphic client-server applications, later I had to create hideous white web applications …

[Read more]
DbCharmer 1.7.0 Release: Rails 3.0 Support and Forced Slave Reads

This week, after 3 months in the works, we’ve finally released version 1.7.0 of DbCharmer ruby gem – Rails plugin that significantly extends ActiveRecord’s ability to work with multiple databases and/or database servers by adding features like multiple databases support, master/slave topologies support, sharding, etc.

New features in this release:

  • Rails 3.0 support. We’ve worked really hard to bring all the features we supported in Rails 2.X to the new version of Rails and now I’m proud that we’ve implemented them all and the implementation looks much cleaner and more universal (all kinds of relations in rails 3 work in exactly the same way and we do not need to implement connection switching for all kinds of weird corner-cases in ActiveRecord).
  • Forced Slave Reads functionality. Now …
[Read more]
Shard-Query turbo charges Infobright community edition (ICE)

Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to significantly improve the performance of queries by spreading the work over all available compute resources. In this test, Shard-Query averages a nearly 6x (max over 10x) improvement over the baseline, as shown in the following graph:

One significant advantage of Shard-Query over Hive is that it works with existing MySQL data sets and queries. Another advantage is that it works with all MySQL …

[Read more]
Showing entries 71 to 80 of 105
« 10 Newer Entries | 10 Older Entries »