Analyzing transactional data is becoming increasingly common, especially as the data sizes and complexity increase and transactional stores are no longer to keep pace with the ever-increasing storage. Although there are many techniques available for loading data, getting effective data in real-time into your data warehouse store is a more difficult problem. VMware Continuent provides
The Big Data explosion in recent years has created a vast number of new technologies in the area of data processing, storage, and management. One of the biggest names to appear on the scene is Hadoop. In case you need a quick review, Hadoop is a Big Data storage system that takes in large amounts of data from servers and breaks it into smaller, manageable chunks. The technology is complex but at a high level the Hadoop ecosystem essentially takes a “divide and conquer” approach to processing Big Data instead of processing data in tables, as in a relational database like Oracle or MySQL.
[Read more]
In Mo’ Data, Mo’ Problems, we explored the paradox that “Big Data” projects pose to organizations and how Tokutek is taking an innovative approach to solving those problems. In this post, we’re going to talk about another hot topic in IT, “The Cloud,” and how enterprises undertaking Cloud efforts often struggle with idea of “problem trading.” Also, for some reason, databases are just given a pass as traditionally “noisy neighbors” and that there is nothing that can be done about it. Lets take a look at why we disagree.
With the birth of the information age came a coupling of business and IT. Increasingly strategic business projects and objectives were reliant on information infrastructure to provide information storage and retrieval instead of paper and filing cabinets. This was the dawn of the database and what gave rise to companies like Oracle, Sybase and MySQL. With the appearance of true Enterprise Grade …
[Read more]First things first: I could use this title for every year, it is an evergreen. In order for this title to make sense, there must be a specific context and in this case the context is Big Data. We have seen new ideas and many announcements in 2014, and in 2015 those ideas will shape up and early versions of innovative products will start flourishing. Like many other people, I prepared some comments and opinions to post back in early January then, soon after the season’s break, I started flying around the world and the daily routine kept me away from the blog for some time. So, as a good last blogger, it may be time for me to post my own predictions, for the joy of my usual 25 readers. Small Data, Big Data, Any Data The term Big Data is often misused. Many different architectures, objectives, projects and issues deviate from its initial meaning. Everything today seems to be “Big Data” – whether you collect structured or …
[Read more]Welcome to blog #2 in a series about the benefits of the Fractal Tree. In this post, I’ll be explaining Big Data, why it poses such a problem and how Tokutek can help. Given the fact that I am a lifelong fan of both Hip-hop and Big Data, the title was a no-brainer and, given the artist, a bit of a pun.
I am as tired as you of hearing the term “Big Data.” It’s so overused, that it ceases to have specific meaning anymore. You see, data hardly ever starts as “big” or a “problem.” Rather, it starts small and easily manageable, but gradually grows to some unimaginable size and becomes a beast in need of slaying, like the irradiated ant from a sci-fi film, growing to the size of a cruise ship. The nature of tackling such a tough problem means that the initial understanding of the factors involved is, oftentimes, incomplete at best; Catch-22 exemplified. During the course of problem it is …
[Read more]In my recent travels, I’ve been speaking with database users at various meetups and trade shows worldwide. Very often, I got questions centering around the best use cases for our products, be it TokuDB, our MySQL storage engine, or, TokuMX, our distribution of MongoDB. Over 90% of the time, I responded Cloud, Big Data or both. You see, in the software industry we’re like kindergartners, we like things to fit into neat categories. If you know any software sales people, you’ll recognize this as a fitting analogy (at least in terms of energy and attention span), but I digress. This strategy helps allocate resources where they are most likely to make an impact, and, thus, optimize our return on investment. In this blog series, I’m going to go slightly against the grain and explain why the Fractal Tree makes databases work in these environments.
Before I get into more detail, let me …
[Read more]“HBase and Hadoop are the only technologies proven to scale to dozens of petabytes on commodity servers, currently being used by companies such as Facebook, Twitter, Adobe and Salesforce.com.”–Monte Zweben.
Is it possible to turn Hadoop into a RDBMS? On this topic, I have interviewed Monte Zweben, Co-Founder and Chief Executive Officer of Splice Machine.
RVZ
Q1. What are the main challenges of applications and operational analytics that support real-time, interactive queries on data updated in real-time for Big Data?
Monte Zweben: Let’s break down “real-time,
interactive queries on data updated in real-time for Big Data”.
“Real-time, interactive queries” means that results need to be
returned in milliseconds to a few seconds.
For “Data updated in real-time” to happen, …
As of today, Continuent is part of VMware. We are absolutely over the moon about it.
You can read more about the news on the VMware vCloud blog by Ajay Patel, our new
boss. There’s also an official post on our Continuent company blog. In a nutshell
the Continuent team is joining the VMware Cloud Services
Division. We will continue to improve, sell, and support our
Tungsten products and work on innovative integration into
VMware’s product line.
So why do I feel exhilarated about joining VMware? There are
three reasons.
1. Continuent is joining a world-class
company that is the leader in virtualization and cloud
infrastructure solutions. Even …
October 27, 2014 By Severalnines
The term data warehousing often brings to mind things like large complex projects, big businesses, proprietary hardware and expensive software licenses. With Hadoop came open source data analysis software that ran on commodity hardware, this helped address at least some of the cost aspects. We had previously blogged about MongoDB and MySQL to Hadoop. But setting up and maintaining a Hadoop infrastructure might still be out of reach for small businesses or small projects with limited budgets. Well, perhaps then you might want to have a look at Redshift.
…
[Read more]
Computer science is like an enormous tool box you can rummage
through whenever you have a problem to solve. Most of the tools
are sturdy and practical, like algorithms for B-trees. Some are
also elegant, like consistent hashing in Dynamo. Finally there
are some tools that you never quite figure out even after years
of reflection. That piece of steel you are looking at could be
Excalibur. Or it could be a rusty knife.
The CAP theorem falls into the last category, at least
for me. It was a major topic in the blogosphere a few years
ago and Google Trends shows steadily increasing interest in the term since
2010. It's not my goal to explain CAP fully--a good
informal description is …