Showing entries 291 to 300 of 383
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: NoSQL (reset)
MySQL Cluster - Performance (UPDATE on PK) - >120K updates/sec

This post follows on from the previous post on SELECT performance. In this post I want to show three things:

  1. How many single row UPDATEs per second you can do on a Cluster with two data nodes (updating 64B data by the PRIMARY KEY, no batching)
  2. Show how MySQL Cluster scales with threads and mysql servers
  3. How ndb_cluster_connection_pool affects performance

The next post will cover what happens to INSERTs.

Setup

  • two data nodes
  • one to four mysql servers
  • interconnected with Gig-E (single NIC)

deployed on six computers (of varying quality, but not really modern, see below). www.severalnines.com/bencher was co-located with each mysql …

[Read more]
The problem with a full box of big data tools

“NoSQL”, for lack of a better name, is a generic term that describes any data management system that does not use SQL as a query interface. Generally this means any data management system that is non-relational, but the term has also been stretched as far as the boundaries of what constitutes a data management system at all (such as Hadoop).

Early on (a couple of years back in NoSQL time), when the term was coined, I think the positioning was much more aggressive, but more recently this has been softened, so NoSQL is now commonly quoted as meaning “Not only SQL” or “next generation databases” (whatever that means). The common message you get now is something along the lines of: NoSQL systems are more “specialized”, each being designed to solve a smaller number of problems than the …

[Read more]
MongoDB: The Definitive Guide by Kristina Chodorow and Michael Dirolf


The kind folks at O'Reilly sent me a fantastic book about MongoDB. This was a great read since it’s suited for people who do Operations, Development and Performance tuning (me). I've been using Cassandra for quite some time now (months lol) and the thing that has irritated me about Cassandra is its documentation. Cassandra documentation sucks; it's hard to get up to speed on the internals. This MongoDB book is written by the most active participants developing MongoDB, and the knowledge shows. What I like is that it starts out with how to quickly get it up and add/get/update data in the DB, then progresses to more advanced topics such as GridFS and the MongoDB drivers. Personally I would like to see more elaboration on this facet in terms of the motivation: why do this, what the win is and how it fits into the "Fast by Default" mantra. Each step is organized perfectly, …

[Read more]
MySQL Cluster - Performance (SELECT on PK)

In this post I want to show three things:

  1. How many single row SELECTs per second (on the PRIMARY KEY, no batching) you can do on a Cluster with two data nodes
  2. Show how MySQL Cluster scales with threads and mysql servers
  3. How ndb_cluster_connection_pool affects performance

The next posts will cover what happens to INSERTs, and then UPDATEs.

Setup

  • two data nodes
  • one to four mysql servers
  • interconnected with Gig-E (single NIC)

deployed on six computers (of varying quality, see below). www.severalnines.com/bencher was co-located with each mysql server to drive the load. The reads were a PK SELECT like:

SELECT data1,data2 FROM t1 WHERE id=[random];

data1 and data2 are each 256B, so in total 512B was read. …
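As a rough illustration of the read pattern above, the following Python sketch builds the same PK SELECT with a random id and works out how much row payload a given query rate implies. The names `pk_select`, `payload_mb_per_sec` and the `max_id` parameter are hypothetical and not taken from the post or from bencher, and the query rate used in the usage note is made up for the arithmetic, not a measured result.

```python
import random

# data1 + data2, 256B each, as stated in the post
ROW_PAYLOAD_BYTES = 2 * 256


def pk_select(max_id, rng=random):
    """Return a parameterized PK SELECT and the random id to bind to it.

    max_id is a hypothetical table size; the post does not state it.
    """
    row_id = rng.randint(1, max_id)
    return "SELECT data1,data2 FROM t1 WHERE id=%s", (row_id,)


def payload_mb_per_sec(selects_per_sec):
    """Row payload read per second in MB (10^6 bytes), ignoring protocol overhead."""
    return selects_per_sec * ROW_PAYLOAD_BYTES / 1_000_000
```

For example, a hypothetical 100,000 SELECTs per second would move 51.2 MB/s of row payload, which can be compared against the roughly 125 MB/s theoretical ceiling of a single Gig-E link.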

[Read more]
A comprehensive database know-how collection

Sqlexamples.org is a community project that is focused on collecting real-world solutions for specific problems. Additionally, we want to collect database know-how that is related to SQL or NoSQL databases of all kinds. Content is indexed and freely available to everybody. We would like to invite every single database developer and administrator out there to take part! It does not matter which database you prefer: Oracle, MS SQL, MySQL, PostgreSQL, SQLite, CouchDB or MongoDB...

[Read more]

The SMAQ stack for big data

SMAQ report sections: MapReduce → Storage → Query → Conclusion

"Big data" is data that becomes large enough that it cannot be processed using conventional methods. Creators of web search engines were among the first to confront this problem. Today, social networks, mobile phones, sensors and science contribute to petabytes of data created daily.

To meet the challenge of processing such large data sets, Google created MapReduce. Google's work and Yahoo's creation of the Hadoop MapReduce implementation have spawned an ecosystem of big data processing tools.
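For readers new to the model, here is a minimal single-process sketch of the MapReduce idea in Python, using the usual word-count illustration. It is not Hadoop's or Google's API: the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are made up for this example, and a real system would spread the map and reduce work across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Calling `word_count(["big data", "Big tools"])` counts "big" twice and "data" and "tools" once each. Because every map call and every reduce call is independent of the others, the same structure can in principle be run in parallel over very large inputs.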

As MapReduce has grown in popularity, a stack for big data systems …

[Read more]
Big Data innovation marches on

With IBM intending to acquire Netezza, the predicted consolidation in the distributed analytics market is well underway. Recent deals include EMC/Greenplum, Teradata/Kickfire and now IBM/Netezza. A good breakdown of this deal is on Curt’s blog. There is still more to go, of course, with one of the crown jewels, Vertica, still ripe for the picking.

What this indicates is that MPP analytics has moved from the innovative edge into the mainstream market, and the more risk-averse large caps are now willing to invest substantially in growing this market. Interestingly Microsoft made this move early with the …

[Read more]
sqlexamples.org - archive of free SQL / NoSQL examples

We're proud to introduce the sqlexamples.org community, a resource for database developers and administrators. Our aim is to improve the availability of free (as in free speech) SQL and NoSQL related database examples of all kinds. We're not just focused on MySQL. Related content includes, for example:

  • syntax examples
  • database schemata
  • database related source code
  • <your idea here>

A lot of valuable database related content gets published day by day in countless blogs all over the web. Our aim is to archive and index this knowledge in a central database, open and accessible for everyone. If you want to help us build such a useful archive, all you have to do is submit your RSS feed to sqlexamples.org. Additionally, content can be published directly on our platform if you like.

[Read more]
Was Stonebraker right?

Back in 2008 Stonebraker & DeWitt published a paper and associated blog post titled “MapReduce: A major step backwards”. Their key points were that MapReduce is:

  1. A giant step backward in the programming paradigm for large-scale data intensive applications
  2. A sub-optimal implementation, in that it uses brute force instead of indexing
  3. Not novel at all — it represents a specific implementation of well known techniques developed nearly 25 years ago
  4. Missing most of the features that are routinely included in current DBMS
  5. Incompatible with all of the tools DBMS users have come to depend …
[Read more]
Open source in the clouds and in the debates

We continue to see more evidence of the themes we discuss in our latest CAOS special report, Seeding the Clouds, which examines the open source software used in cloud computing, the vendors backing open source, the cloud providers using it and the impact on the industry.

First, as usual, we are seeing consistencies between our own research — which indicates open source is a huge part of today’s cloud computing offerings from major providers like Amazon, Google, Rackspace, Terremark and VMware — and that of code analysis and management vendor Black Duck. In its analysis of code that runs the cloud, Black Duck also found a preponderance of open source pieces, in many cases the same projects we profile in our report.

Indeed, open source software is an important part of the infrastructure, …

[Read more]