Planet MySQL

Displaying posts with tag: hadoop

451 CAOS Links 2010.11.02

Posted by The 451 Group on Tue 02 Nov 2010 17:47 UTC
Tags:

Oracle, Java, links, Linux, eclipse, opensource, Apple, symbian, Red Hat, Zend, Canonical, 451 group, 451caostheory, 451group, caostheory, matt aslett, mattaslett, matthew aslett, matthewaslett, open-source, The 451 Group, the451group, Giuseppe Maxia, JasperSoft, hadoop, continuent, acquia, android, Robert Hodges, Eucalyptus, cloudera, Informatica, Rightscale, coverity, Motorola, virgo, carlo daffar, cloud.com, Java community process, artemis, openoffice.rog, outcurve, stormy peters, unity, MySQL

JCP election results. Funding for Acquia and Continuent. Fedora 14. And more.

Follow 451 CAOS Links live @caostheory on Twitter and Identi.ca, and daily at Paper.li/caostheory
“Tracking the open source news wires, so you don’t have to.”

# The Java Community Process election results are in.

# Acquia closed an $8.5m series C funding round and announced that it has tripled its customer base in 2010.

# Continuent appointed Robert Hodges CEO and confirmed details of $5m funding from Aura Capital.

# Red Hat …

Webinar: navigating the changing landscape of open source databases

Posted by The 451 Group on Mon 01 Nov 2010 15:04 UTC
Tags:

software, Databases, enterprisedb, Linux, database, opensource, webinar, 451 group, 451caostheory, 451group, caostheory, matt aslett, mattaslett, matthew aslett, matthewaslett, open-source, The 451 Group, the451group, hadoop, NoSQL

When we published our 2008 report on the impact of open source on the database market, the overall conclusion was that adoption had been widespread but shallow.

Since then we’ve seen increased adoption of open source software, as well as the acquisition of MySQL by Oracle. Perhaps the most significant shift in the market since early 2008 has been the explosion in the number of open source database and data management projects, including the various NoSQL data stores, and of course Hadoop and its associated projects.

On Tuesday, November 9, 2010 at 11:00 am EST I’ll be joining Robin Schumacher, Director of Product Strategy from EnterpriseDB to present a …

How Real is the Data Deluge?

Posted by Zack Urlocker on Mon 18 Oct 2010 12:03 UTC
Tags:

Technology, couchdb, hadoop, big data, r, mongodb, cassandra

It seems obvious that given the decreasing cost of storage and computation, there's going to be a significant increase in the volume of data that organizations accumulate over the next 10 years. But the type of data being accumulated may be different from the areas where traditional DBMSs dominated. It's not just about transactions; it's search patterns, on-line behavior, click-thru data, events fired off by smartphones, messages over Twitter & Facebook, log data of various kinds.

If an organization can figure out a better way to identify prospects, deliver more targeted ads, or optimize pricing decisions by analyzing terabytes of data, they'd be crazy not to. Over the long term, companies that don't develop these capabilities will be at a competitive disadvantage.

As to what the implications are from a …

The SMAQ stack for big data

Posted by O'Reilly Radar on Wed 22 Sep 2010 13:00 UTC
Tags:

Apache, Google, solr, data, hadoop, MapReduce, big data, NoSQL, smaq, strataconf

SMAQ report sections

"Big data" is data that becomes large enough that it cannot be processed using conventional methods. Creators of web search engines were among the first to confront this problem. Today, social networks, mobile phones, sensors and science contribute to petabytes of data created daily.

To meet the challenge of processing such large data sets, Google created MapReduce. Google's work and Yahoo's creation of the Hadoop MapReduce implementation have spawned an ecosystem of big data processing tools.

As MapReduce has grown in popularity, a stack for big data systems …

Do We Need a New Programming Language for Big Data?

Posted by Zack Urlocker on Mon 13 Sep 2010 13:02 UTC
Tags:

Technology, Java, Delphi, c#, couchdb, hadoop, erlang, 10gen, kirk wylie, cloudera, james gosling, northscale, mongodb, cassandra, go, Hot Companies, riptano, anders hejlsberg, Scala

I'm on the boards of two companies (Pentaho, Revolution Analytics) that are starting to see a lot of customer traction around Big Data. More and more companies in media, pharma, retail and finance are doing advanced analysis, reporting, graphing, etc. with massive data sets. It made me wonder what other areas of the technology stack might evolve with the trend towards Big Data. Obviously, there are new middleware layers like Hadoop and MapReduce, and we're also seeing the emergence of NoSQL data management layers with Cassandra, MongoDB, MemBase and others. But what …

Open source in the clouds and in the debates

Posted by The 451 Group on Wed 08 Sep 2010 19:40 UTC
Tags:

postgresql, PHP, Java, software, Linux, ruby, Google, Python, opensource, amazon, 451 group, 451caostheory, 451group, caostheory, open-source, The 451 Group, the451group, Citrix, XenSource, cloud computing, VMWare, hadoop, black duck, jay lyman, jaylyman, rackspace, NoSQL, simon crosby, private cloud, public cloud, Open Cloud Initiative, sam johnston, Seeding the Clouds, Terremark, MySQL

We continue to see more evidence of the themes we discuss in our latest CAOS special report, Seeding the Clouds, which examines the open source software used in cloud computing, the vendors backing open source, the cloud providers using it and the impact on the industry.

First, as usual, we are seeing consistencies between our own research — which indicates open source is a huge part of today’s cloud computing offerings from major providers like Amazon, Google, Rackspace, Terremark and VMware — and that of code analysis and management vendor Black Duck. In its analysis of code that runs the cloud, Black Duck also found a preponderance of open source pieces, in many cases the same projects we profile in our report.

Indeed, open source software is an important part of the infrastructure, …

Digg’s main competitor (Reddit) runs Cassandra but their VP of Engineering was fired for the decision to switch.

Posted by Kevin Burton on Wed 08 Sep 2010 04:16 UTC
Tags:

Uncategorized, hadoop, facebook, bigtable, cassandra, voldemort, digg, MySQL

Apparently, Digg performed a big migration from MySQL to Cassandra and another to its new Digg v4 architecture, and now its VP of Engineering has been shown the door:

Ever since Digg launched its new site design, it’s been plagued with all kinds of trouble, not least of which is that it keeps going down. The problems with the new architecture are so bad that VP of Engineering John Quinn is now gone, we’ve confirmed with sources close to Digg.

In a Diggnation video today, CEO Kevin Rose explained some of the technical issues the site is dealing with and why it can’t simply roll back to the previous architecture. The new version of Digg, v4, is based on a distributed database called Cassandra, which replaced the MySQL database the site ran on before. Cassandra is very advanced—it is supposed to be faster and scale …

Integrating MySQL and Hadoop - or - A different approach on using CSV files in MySQL

Posted by Peter Romianowski on Sun 05 Sep 2010 21:46 UTC
Tags:

hadoop, MySQL

We use both MySQL and Hadoop a lot. If you use each system to its strengths, this is a powerful combination. One problem we constantly face is making data extracted from our Hadoop cluster available in MySQL.

The problem

Look at this simple example: Let’s say we have a table customer:

CREATE TABLE customer (
    id INT UNSIGNED NOT NULL,
    firstname VARCHAR(100) NOT NULL,
    lastname VARCHAR(100) NOT NULL,
    city VARCHAR(100) NOT NULL,

    PRIMARY KEY (id)
);

In addition, we store the orders that customers made in Hadoop. An order includes: customerId, date, itemId, price. Note that these structures serve as a very simplified example.

Let’s say we want to find the first 50 customers that placed at least one order, sorted by firstname ascending. If both tables …
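To make the query described above concrete, here is a minimal sketch of what it could look like if the order data were available as an ordinary table next to customer. It is only an illustration under stated assumptions: SQLite (via Python's sqlite3 module) stands in for MySQL so the example is runnable, the orders table and the sample rows are hypothetical, and the helper name first_customers_with_orders is invented for this sketch. It does not show the approach the post goes on to describe.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch runs anywhere.
# The orders table is a hypothetical copy of the data held in Hadoop.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    id INTEGER NOT NULL PRIMARY KEY,
    firstname VARCHAR(100) NOT NULL,
    lastname VARCHAR(100) NOT NULL,
    city VARCHAR(100) NOT NULL
);
CREATE TABLE orders (
    customerId INTEGER NOT NULL,
    date TEXT NOT NULL,
    itemId INTEGER NOT NULL,
    price REAL NOT NULL
);
""")

# Made-up sample data: customer 3 has no orders.
conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?)", [
    (1, "Carol", "Smith", "Berlin"),
    (2, "Alice", "Jones", "Hamburg"),
    (3, "Bob", "Miller", "Munich"),
])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, "2010-09-01", 10, 9.99),
    (2, "2010-09-02", 11, 19.99),
    (2, "2010-09-03", 12, 4.50),
])


def first_customers_with_orders(conn, limit=50):
    """Return up to `limit` customers with at least one order, by firstname."""
    query = """
        SELECT c.id, c.firstname, c.lastname, c.city
        FROM customer AS c
        WHERE EXISTS (SELECT 1 FROM orders AS o WHERE o.customerId = c.id)
        ORDER BY c.firstname ASC
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()


rows = first_customers_with_orders(conn)
for row in rows:
    print(row)
```

The EXISTS subquery keeps each customer to a single result row even when they placed several orders, which a plain join would not do without DISTINCT or GROUP BY.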

The number of Hadoop jobs continue to rise

Posted by O'Reilly Radar on Sun 08 Aug 2010 21:16 UTC
Tags:

jobs, hadoop, big data

While still a small fraction (1) of data management job postings, the number of job posts that mention "hadoop" continues to grow steadily. Year over year, there were 300% more such job posts (2) in the first seven months of 2010 than in the same period in 2009:

The fraction of "hadoop" jobs posted by California companies remains high, but is definitely lower than it was last year:

(1) Over the last three months, job posts that mention "hadoop" were inching towards 8-10% of the number of job posts that mention "mysql".

(2) Data for this post is for U.S. online job postings through 7/31/2010 and is maintained in partnership with SimplyHired.com. We …

451 CAOS Links 2010.06.29

Posted by The 451 Group on Wed 30 Jun 2010 00:14 UTC
Tags:

Oracle, links, Linux, Apache, microsoft, Yahoo, sun, opensource, Mozilla, Red Hat, 451 group, 451caostheory, 451group, caostheory, matt aslett, mattaslett, matthew aslett, matthewaslett, open-source, The 451 Group, the451group, SugarCRM, VMWare, hadoop, compiere, sco, talend, concurrent, nuxeo, tomcat, Mark Radcliffe, cloudera, bilski, DotNetNuke, simon phipps, Software Freedom Law Center, groklaw, karmasphere, datameer, appistry, david wiley, goto metrics, jorg janke, kitenga, mike masnick, Nick Halsey, oozie, twitter. microstreatgy, zementis

Elephants on parade: Hadoop goes mainstream. And more.

Follow 451 CAOS Links live @caostheory on Twitter and Identi.ca
“Tracking the open source news wires, so you don’t have to.”

Elephants on parade
# Cloudera launched v3 of its Distribution for Hadoop and released v1 of Cloudera Enterprise.

# Karmasphere released new Professional and Analyst Editions of its Hadoop development and deployment studio.

# Talend announced that its Integration Suite now offers native support for Hadoop.

# Yahoo …
