Showing entries 151 to 160 of 164
« 10 Newer Entries | 4 Older Entries »
Displaying posts with tag: hadoop (reset)
451 CAOS Links 2009.08.07

Monty Widenius dissects MySQL’s dual license. Intuit moves to the EPL. And more.

Follow 451 CAOS Links live @caostheory on Twitter and Identi.ca
“Tracking the open source news wires, so you don’t have to.”

# Monty Widenius blogged about the apparent changes to the dual licensing of MySQL.

# Intuit announced that its code.intuit.com will be moving from CPL to EPL.

# Matt Asay asked whether Google’s open source advocacy might be a scheme to lower the value of patents.

# Vision Mobile’s Andreas Constantinou explained the differences between open source …

[Read more]
Is ScaleDB Using MapReduce? Competing with Hadoop?

I’ve had a few VCs ask how we compare to Hadoop and companies using MapReduce. With Google blessing MapReduce, it seems to be the cool new thing. I figure I’m going to have to explain this to VCs, so I might as well blog about it.

MapReduce is a process of dividing a problem into small pieces and distributing (mapping) those pieces to a large number of computers. Then it collects the processed data and merges (reduces) it into a result set. Hadoop provides the plumbing, so users focus on writing the query and Hadoop handles the dirty work of mapping and reducing. Such a query, using a procedural language like Java, is more complex than a comparable SQL query, but more on that below.
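To make the map and reduce steps concrete, here is a minimal single-process sketch in Python. It is not Hadoop’s API and not anything ScaleDB ships; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the word-count task are assumptions chosen purely for illustration. On a real cluster the map and reduce calls would run on many machines, and the framework would handle the grouping step in between.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key (the framework does this on a cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: merge all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence over an iterable of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    groups = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(run_job(["hadoop mysql", "Hadoop"]))
```

Note that every record is read in full; nothing here consults an index, which is the tradeoff that matters when comparing this style of processing with a database query.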

So what is MapReduce good for? It really shines when you want to summarize, analyze or transform a very large data set. This is why it is well suited to web data. MapReduce doesn’t utilize an index, so the tradeoff you need to consider is whether …

[Read more]
451 CAOS Links 2009.06.12

Yahoo opens up Hadoop distribution. Microsoft and Novell claim customer wins. And more.

Follow 451 CAOS Links live @caostheory

The elephant in the room
Plenty of news emerged from the Hadoop Summit this week: Cloudera announced support for Amazon Elastic Block Store (EBS) and introduced Sqoop, an open source tool for importing databases into Hadoop, while Yahoo! Released! The! Yahoo! Distribution! Of! Hadoop!, opening up its Hadoop developments to the wider community. As Savio Rodrigues noted, there has been a surge in the number of contributors to the Hadoop project in the last year.

Best of the rest

[Read more]
Simulating indexes in Hadoop

You should not try to use Hadoop as a “drop-in” replacement for your current (R)DBMS. That said, it is still possible to utilize the power of cluster computing while circumventing its weaknesses when it comes to ad-hoc or real-time queries. We use Hadoop as an on-line system tightly integrated with our application and use it for both long-running analytical queries and ad-hoc style queries.

In the mindset of a “traditional” database engineer, one of the biggest concerns about Hadoop, or MapReduce in conjunction with a distributed file system in general, is the lack of indexes. Setting aside that the “(R)DBMS vs MapReduce” debate is superfluous most of the time and sometimes turns almost religious, the absence of anything like an index is one of the biggest hurdles you face when migrating data from a traditional DBMS.
Even …

[Read more]
451 CAOS Links 2009.06.02

Cloudera lands funding. SourceForge acquires Ohloh. Novell reports Linux growth. And more.

Follow 451 CAOS Links live @caostheory

Cloudera shows signs of progress

GigaOM reported that Cloudera raised $6m Series B funding from Accel and Greylock and is now looking beyond web applications to wider enterprise adoption of Hadoop. Cloudera also announced its first certification program for Hadoop.

Open source goes mainstream in the UK
There have been signs of change recently with regards to open source adoption in the UK, which has traditionally lagged behind the rest of Europe and the US. CBR Magazine provided an analysis of …

[Read more]
PDI cloud : massive performance roundup

Dear Kettle fans,

As expected, there was a lot of interest in cloud computing at the MySQL conference last week. It felt really good to be able to pass the Bayon Technologies white paper around to friends, contacts and analysts. It’s one thing to demonstrate a certain scalability on your blog; it’s another entirely to have a smart man like Nicholas Goodman do the math.

Sorting massive amounts of rows is a hard problem to take on. Making it scale on low-cost EC2 instances is interesting as it …

[Read more]
Frank Mashraqi on Hadoop, memcached, and why the MySQL Conference is cool

Today I spoke with Farhan “Frank” Mashraqi, former Fotolog DBA, now at NetEdge, a startup working on social analytics. He talks about the two sessions he’s giving next week at the MySQL Conference & Expo 2009, as well as the benefits of being at the conference.



He’s giving two talks:

  1. Hadoop and MySQL: Friends with Benefits, in which he will tell you how you can combine data sets and queries, some of which run on Hadoop and others on MySQL, but which eventually probably end up in MySQL (he works on this cool stuff at NetEdge, the startup he’s currently attached …
[Read more]
451 CAOS Links 2009.03.17

Cloudera debuts Hadoop support with $5m in funding. The financial value of open source. More patent problems for Red Hat. Government open source projects on both sides of the pond. Symbian’s release plan. And more.

Follow 451 CAOS Links live @caostheory

Cloudera makes it official
We previously reported the launch of Cloudera, a new vendor set up to provide support for Apache Hadoop and related projects, back in October. The company made its official debut in not-so-polite open source society with the launch of its distribution for Hadoop and …

[Read more]
451 CAOS Links 2009.01.02

A bumper CAOS Links rounding up the news and views from the festive period, including: Red Hat revenue up 22% in 3Q. Alan Cox departs for Intel. Evolving open source business strategies. The commercialization opportunity around OpenOffice.org. And more.

Official announcements
Red Hat Reports Third Quarter Results (Red Hat)

OpenLogic Survey Highlights Enterprise Perspectives on Open Source Application Servers (OpenLogic)

Asianux Concludes Triumphant Year, Welcomes Fifth Member (Asianux)

News articles
The future of open source

[Read more]