Planet MySQL

Displaying posts with tag: hadoop

Dec

2012

Posted by Venu Anuganti on Mon 10 Dec 2012 19:33 UTC
Tags:

database, analytics, hadoop, data warehouse, bigdata, MySQL, Data Analytics, Data Science, DataAnalytics, DataScience, Difference between data science and data analytics, How to hire data scientist, ROle of data analytics, Role of Data Scientist, What is Data science

As this topic came up a few times this week for discussion at various places, I thought of composing a post on “Data Scientist vs. Data Analytics Engineer”; even though[...]

Dec

2012

On Big Data, Analytics and Hadoop. Interview with Daniel Abadi.

Posted by Roberto V. Zicari on Wed 05 Dec 2012 16:49 UTC
Tags:

Uncategorized, sql, Google, analytics, hadoop, MapReduce, big data, daniel abadi, NoSQL, hadapt, MySQL, Google BigTable, nosql databases, relational databases, Yale University

“Some people even think that “Hadoop” and “Big Data” are synonymous (though this is an over-characterization). Unfortunately, Hadoop was designed based on a paper by Google in 2004 which was focused on use cases involving unstructured data (e.g. extracting words and phrases from Webpages in order to create Google’s Web index). Since it was not [...]

Dec

2012

Distributed Clustering Services

Posted by Venu Anuganti on Mon 03 Dec 2012 06:38 UTC
Tags:

postgresql, database, hadoop, NoSQL, bigdata, MySQL, Co-orinating systems, Distributed Clustering, Distributed co-ordination, Distributed systems, HA using Zookeeper, MySQL HA using Zookeeer, Spread, Zookeeper

Apart from my consulting as part of ScaleIn, I also invest to bootstrap companies with really disruptive ideas; and in the process met a few database-specific companies who are already[...]

Nov

2012

Typical “Big” Data Architecture

Posted by Venu Anuganti on Fri 30 Nov 2012 22:15 UTC
Tags:

postgresql, sql, database, scalability, ETL, hadoop, data warehouse, MapReduce, hbase, reporting, cloudera, NoSQL, vertica, Hive, bigdata, MySQL, SAS, Big Data Architecture, Big Data Warehouse, Data Architecture, Impala, NoSQL and BigData, Data Analytics, Data Science, kognitio, druid

Here is the typical “Big” data architecture, which covers most components involved in the data pipeline. More or less, we have the same architecture in production in a number of places[...]

Nov

2012

MySQL and Hadoop Integration - Unlocking New Insight

Posted by Oracle MySQL Group on Thu 29 Nov 2012 18:58 UTC
Tags:

Apache, cluster, data, hadoop, BI, sqoop, NoSQL, MySQL, big

“Big Data” offers the potential for organizations to revolutionize their operations. With the volume of business data doubling every 1.2 years, analysts and business users are discovering very real benefits when integrating and analyzing data from multiple sources, enabling deeper insight into their customers, partners, and business processes.

As the world’s most popular open source database, and the most deployed database in the web and cloud, MySQL is a key component of many big data platforms, with Hadoop vendors estimating 80% of deployments are integrated with MySQL.

The new Guide to MySQL and Hadoop presents the tools enabling integration between the two data platforms, supporting the data lifecycle from acquisition and organisation to …


Oct

2012

Two Cons against NoSQL. Part I.

Posted by Roberto V. Zicari on Tue 30 Oct 2012 16:57 UTC
Tags:

postgres, Oracle, Open Source, Uncategorized, sql, ibm, analytics, hadoop, Excel, csv, big data, NoSQL, mongodb, json, riak, voltdb, basho, MySQL, nosql databases, Cindy Saracco, Clustrix, document stores, Dwight Merriman, InfoSphere BigInsights, John Hugg, New and old Data stores, relational databases, Steve Vinoski

Two cons against NoSQL data stores read like this: 1. It’s very hard to move data out from one NoSQL to some other system, even other NoSQL. There is a very hard lock in when it comes to NoSQL. If you ever have to move to another database, you have basically to re-implement a lot [...]

Jul

2012

Log Buffer #279, A Carnival of the Vanities for DBAs

Posted by The Pythian Group on Fri 27 Jul 2012 07:00 UTC
Tags:

Oracle, Log Buffer, SQL Server, hadoop, rhel, exadata, MySQL

In a typical organization, all work together to bring out a common good for the outside world. It’s interesting to see how all of these entities blog about technology, and there is more and more interest shown by managerial technologists about the database. This Log Buffer Edition appeases their appetites along with the others in [...]

Jul

2012

MySQL and Hadoop

Posted by Oracle MySQL Group on Thu 26 Jul 2012 11:50 UTC
Tags:

hadoop, MapReduce, MySQL, hdfs

Introduction

"Improving MySQL performance using Hadoop" was the talk that Manish Kumar and I gave at Java One & Oracle Develop 2012, India. Based on the response and interest of the audience, we decided to summarize the talk in a blog post. The slides of this talk can be found here. They also include a screencast of a live Hadoop system pulling data from MySQL and working on the popular 'word count' problem.

MySQL and Hadoop have been popularly considered as 'Friends with benefits' and our talk was aimed at showing how!
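
For readers unfamiliar with the 'word count' problem mentioned above, here is a minimal sketch of its map and reduce steps in plain Python. It is not the code from the talk and uses neither Hadoop nor MySQL: the `rows` list is an assumed stand-in for text values that a real job would pull from a MySQL table, and the function names `map_phase`, `reduce_phase`, and `word_count` are hypothetical.

```python
from collections import defaultdict


def map_phase(rows):
    """Map step: emit a (word, 1) pair for every word in every row."""
    for row in rows:
        for word in row.lower().split():
            yield word, 1


def reduce_phase(pairs):
    """Reduce step: sum the emitted counts for each distinct word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


def word_count(rows):
    """Run the map step and feed its output into the reduce step."""
    return reduce_phase(map_phase(rows))


if __name__ == "__main__":
    # Assumed sample data standing in for rows read from MySQL.
    rows = ["MySQL and Hadoop", "Hadoop and MapReduce"]
    print(word_count(rows))
```

In a real Hadoop job the map and reduce steps run in parallel across many machines, with the framework grouping pairs by key between the two steps; the single-process version here only illustrates the data flow.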

The benefits of MySQL to developers are the speed, reliability, data integrity and …


Jul

2012

So now Hadoop's days are numbered?

Posted by Doron Levari on Tue 10 Jul 2012 20:28 UTC
Tags:

database, scalability, analytics, hadoop, data warehouse, big data, parallelism, scale out, map reduce, database scalability

Earlier this week we all read GigaOM's article with this title:
"Why the days are numbered for Hadoop as we know it"

I know GigaOM likes to provoke scandals sometimes, we all remember some other unforgettable piece, but there is something behind it...

Hadoop today (after SOA not so long ago) is one of the worst cases of an abused buzzword ever known to man. It's everything, everywhere, can cure illnesses and do "big-data" at the same time! Wow! Actually Hadoop is a software framework that supports data-intensive distributed applications, derived from Google's MapReduce and Google File System (GFS) papers.

My take from the article is this: Hadoop is a foundation, low-level platform. I used the word …


Feb

2012

A super-set of MySQL for Big Data. Interview with John Busch, Schooner.

Posted by Roberto V. Zicari on Mon 20 Feb 2012 09:28 UTC
Tags:

Oracle, Uncategorized, sql, innodb, memcached, hadoop, MapReduce, big data, mariadb, schooner, NoSQL, voltdb, MySQL, nosql databases, Apache Hadoop, SchoonerSQL, John Busch, Schooner Information Technology

“Legacy MySQL does not scale well on a single node, which forces granular sharding and explicit application code changes to make them sharding-aware and results in low utilization of servers” – Dr. John Busch, Schooner Information Technology

A super-set of MySQL suitable for Big Data? On this subject, I have interviewed Dr. John Busch, Founder, Chairman, [...]
