Showing entries 61 to 70 of 74
« 10 Newer Entries | 4 Older Entries »
Displaying posts with tag: Architecture (reset)
Standard Query Language (SQL) for NoSQL databases?

I recently came across an interesting blog post on RedMonk (not surprising, as I read most of their posts). It’s called It’s Beginning to Look a Lot Like SQL and basically it talks about query language for NoSQL databases. It seems that as NoSQL becomes more popular, users want to do more with it – a good level of querying, for example, is needed.

Now of course, since NoSQL is a family of products that work in radically different ways, it’s not certain that this is possible (or even desirable – read Alex Popescu’s post on the subject).

But my question is – why do you even need a query language for NoSQL data stores? After all, running queries on distributed data might be complex to implement, and time consuming. The better …

[Read more]
OpenDBCamp: Information Lifecycle Architecture

The Open DB Camp in Sardinia 2011 has had a number of sessions on varying topics. Topics range from MySQL over MongoDB to replication and High Availability.

I decided to tap into the database expert resources present here at Sardegna Ricerche by discussing a non-database issue, where one can expert database experts to have insights beyond those of end users. And they did.

The topic was the particular case of information overload many of us suffer from on our hard disks: Too many files, too hard to find.

  • How do we find the bank statement from April 2007 from the more-seldom-used account?
  • What are the ten best work-related pictures from last year?
  • Is this the most current …
[Read more]
MySQL 5.6: InnoDB scalability fix – Kernel mutex removed

For those interested in InnoDB internals, this post tries to explain why the global kernel mutex was required and the new mutexes and rw-locks that now replace it. Along with the long term benefit from this change.

InnoDB’s core sub-systems up to v5.5 are protected by a global mutex called the Kernel mutex. This makes it difficult to do even some common sense optimisations. In the past we tried optimising the code but it would invariably upset the delicate balance that was achieved by tuning of the code that used the global Kernel mutex, leading to unexpected performance regression. The kernel mutex is also abused in several places to cover operations unrelated to the core e.g., some counters in the server thread main loop.

The InnoDB core sub-systems are:

  1. The Locking sub-system
  2. The Transaction sub-system
  3. MVCC  views
[Read more]
Outliers and coexistence are the new normal for big data

Letting data speak for itself through analysis of entire data sets is eclipsing modeling from subsets. In the past, all too often what were once disregarded as "outliers" on the far edges of a data model turned out to be the telltale signs of a micro-trend that became a major event. To enable this advanced analytics and integrate in real-time with operational processes, companies and public sector organizations are evolving their enterprise architectures to incorporate new tools and approaches.

Whether you prefer "big," "very large," "extremely large," "extreme," "total," or another adjective for the "X" in the "X Data" umbrella term, what's important is accelerated growth in three dimensions: volume, complexity and speed.

Big data is not without its limitations. Many organizations need to revisit business processes, solve data silo challenges, and invest in visualization and collaboration tools to make big data understandable and …

[Read more]
Calculating your database size

I generally use the following MySQL INFORMATION_SCHEMA (I_S) query to Calculate Your MySQL Database Size. This query and most others that access the MySQL INFORMATION_SCHEMA can be very slow to execute because they are not real tables and are not governed by physical data, memory buffers and indexes for example but rather internal MySQL data structures.

Mark Leith indicates in his post on innodb_stats_on_metadata that Innodb performs 8 random(ish) dives in to the index, when anybody accesses any of SHOW TABLE STATUS, SHOW INDEX, INFORMATION_SCHEMA.TABLES,INFORMATION_SCHEMA.STATISTICS for InnoDB tables. This can have an effect on performance, especially with a large number of Innodb tables, and a poor ratio of innodb_buffer_pool_size to disk data+index footprint.

What is even more …

[Read more]
My favorite MySQL data type – DECIMAL(31,0)

It may seem hard to believe, but I have seen DECIMAL(31,0) in action on a production server. Not just in one column, but in 15 columns just in the largest 4 tables of one schema. The column was being used to represent a integer primary or foreign key column.

In a representative production instance (one of a dozen plus distributed production database servers) the overall database footprint was decreased from ~10 GB to ~2 GB, a 78% saving. In total, 15 columns across just 4 tables were changed from DECIMAL(31,0) to INT UNSIGNED.

One single table > 5GB was reduced to under 1GB (a 81% saving). This being my record for any GB+ tables in my time working with the MySQL database.

Had this server for example had 4GB of RAM, and say 2.5GB allocated to the innodb_buffer_pool_size, this one change moved the system from requiring more consistent disk access (4x data to memory) to being able to store all data in memory. Tests showed …

[Read more]
Kickfire Basics – The KFDB columnar storage engine

This is the first post in a new series of “Kickfire Basics” blog posts by myself and others here at Kickfire.  This series will review the basics of the Kickfire appliance starting from this post describing how data is stored on disk, to future posts on topics such as loading data into the appliance and writing queries which best leverage the capabilities of the SQL chip.

The Kickfire Equation
Column store + Compression + SQL Chip = performance

The Kickfire Analytic Appliance features the new KFDB storage engine which was built from scratch to handle queries over vast amounts of data.  KFDB is a column store in contrast to most MySQL storage engines which are row stores.  What follows is a description of our column oriented storage engine and how it improves performance over typical row stores.

This post concerns itself with the first part of the equation, the KFDB …

[Read more]
Scalable Internet Architectures

My old friend and collaborator Theo Schlossnagle at OmniTI posted his slides from his Scalable Internet Architectures talk at VelocityConf 2009.

The slides are brilliant even without seeing Theo talk and I highly recommend the time it takes to flip through them, for anyone who is interested in systems performance. If anyone took an mp3 of this talk I’m dying to hear it, please let me know.

For those of you unfamiliar with OmniTI, Theo is the CEO of this rather remarkable company specializing in Internet-scale architecture consulting. They generalize on Internet-scale architecture, not on one specific dimension the way Pythian specializes on the database tier. This allows them to see Internet-scale workloads from a unique systemic, multidisciplinary point of view; from the user experience all the way up the stack, through …

[Read more]
Do I have a 32-bit or 64-bit MySQL?

I was asked the question last week of how to tell whether a MySQL installation is 64-bit or 32-bit. Often, MySQL is pre-installed or installed via a package and it is forgotten when and how it was installed. I scratched my head for a little while and just run

/usr/sbin/mysqld --verbose --help
/usr/local/mysql/libexec/mysqld Ver 5.1.25-rc-debug for suse-linux-gnu on i686
(Source distribution)
...

This is somewhat telling and I can tell from having used MySQL that this is 32-bit. However, there's nothing that sticks out like a sore thumb that says one way or the other that is is 64 or 32 bit (ok, maybe "i686" (?))

I was working outside, which seems to be where I become inspired and had a "DUH" moment. The command "file" is the key for this:

32-bit linux:

patg@ishvara:~> file /usr/local/mysql/libexec/mysqld
/usr/local/mysql/libexec/mysqld: ELF …

[Read more]
Scalability Best Practices: eBay

Following a link from the High Scalability blog, I found this really great article about scalability practices, as told by Randy Shoup at eBay. Randy is very good at explaining some of the more technical aspects in more or less plain English, and it even helped me find some wording I was looking for to help me explain the notion (and benefits) of functional partitioning. He also covers ideas that apply directly to your application code, your database architecture (including a little insight into their sharding strategy), and more. Even more about eBay’s architecture can be found here.

Showing entries 61 to 70 of 74
« 10 Newer Entries | 4 Older Entries »