Showing entries 171 to 180 of 211
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: big data (reset)
Write Optimization: Myths, Comparison, Clarifications, Part 2

In my last post, we talked about the read/write tradeoff of indexing data structures, and some ways that people augment B-trees in order to get better write performance. We also talked about the significant drawbacks of each method, and I promised to show some more fundamental approaches.

We had two “workload-based” techniques (inserting in sequential order and using fewer indexes) and two “data structure-based” techniques (a write buffer and OLAP). Remember, the most common thing people do when faced with an insertion bottleneck is to use fewer indexes, and this kills query performance. So keep in mind that all our work on write-optimization is really work for read-optimization, in that write-optimized indexes are cheap enough that you can keep all the ones you need to get good read performance.

[Read more]
From Under the Desk to the Cloud

 

Review of the O’Reilly Strata Making Data Work Conference

(reprinted from my guest blog for the Cloud Council of 7)

Monica Rogati of LinkedIn told a story of the early days at the firm, when the reporting system consisted of a single server under someone’s desk. One day, someone needed an Ethernet cable and unplugged the machine from the data outlet in the wall. LinkedIn’s data reporting, its lifeblood, instantly came to a screeching halt.

The Push to the …

[Read more]
What is the biggest challenge for Big Data?

Often I think about challenges that organizations face with “Big Data”. While Big Data is a generic and overused term, what I am really referring to is an organization’s ability to disseminate, understand and ultimately benefit from increasing volumes of data. It is almost without question that in the future customers will be won/lost, competitive advantage will be gained/forfeited and businesses will succeed/fail based on their ability to leverage their data assets.

What I think the near-term challenges are may be surprising. Largely, I don’t think these are purely technical. There are enough wheels in motion now to almost guarantee that data accessibility will continue to improve at a pace in line with the increase in data volume. Sure, there will continue to be lots of interesting innovation with technology, but when organizations like …

[Read more]
Online Advertiser Intent Media Selects TokuDB over InnoDB and NoSQL for Big Data Ad-Hoc Analysis

Intent Media

Issue addressed: Ad hoc analytics on clickstream data arriving too fast for InnoDB or NoSQL to handle.

TokuDB powers an online advertising application

The Company: Headquartered in New York, Intent Media is a fast-growing online advertising startup. The company helps some of the largest online retailers monetize their traffic more efficiently at scale by showing highly relevant and targeted advertising to the 97+% of e-commerce visitors who do not transact.

The Challenge: The Intent Media platform processes hundreds of millions of events a day generated by media placements across leading e-commerce sites — a textbook “Big Data” challenge. Intent Media’s data is used to optimize media placements, drive segmentation models, and …

[Read more]
NSA, Accumulo & Hadoop

I read yesterday that the NSA has submitted a proposal to Apache to incubate its Accumulo platform. This, according to the description, is a key/value store built over Hadoop that appears to provide functionality similar to HBase, except that it adds “cell level access labels” to allow fine-grained access control. This is something you would expect as a requirement for many applications built at government agencies like the NSA. But it is also very important for organizations in fields such as health care and law enforcement, where strict control over access to large volumes of privacy-sensitive data is required.

An interesting part of this is how it highlights the acceptance of Hadoop.  Hadoop is no longer just a new technology scratching at the …

[Read more]
Ask What Your Database Can Do for Your Country

How many in your household again?

One of President John Kennedy’s most memorable phrases is “ask not what your country can do for you – ask what you can do for your country”. I got to thinking about this over lunch with a colleague in the big data space. After comparing named customers for a while, we realized we had forgotten one of the biggest “big data” customers whom we both have in common – the government.

Whether you believe in small or big government, one thing is for certain – it has some very big data on its hands. Some of this is freely available, such as the …

[Read more]
NoSQL Now 2011: Review of AdHoc Analytic Architectures

For those who weren’t able to attend the fantastic NoSQL Now Conference in San Jose last week, but are still interested in the slides about how people are doing Ad Hoc analytics on top of NoSQL data systems, here are the slides from my presentation:

We obviously continue to hear from our community that LucidDB is a great solution sitting in front of a Big Data/NoSQL system. Allowing easy SQL access (including super fast, analytic database cached views) is a big win for reducing …

[Read more]
Database Insights from Archimedes to the Houston Rockets

Archimedes, the first DBA

According to a recent MIT Sloan Management Review study, top performing organizations use analytics 5 times more than lower performers. That’s pretty astounding. And while we all know about the ocean/lake/waves/(your favorite water analogy) of Big Data we struggle with every day, information is not knowledge. So how can we get insight from data? Recent articles from O’Reilly and HBR offered some …

[Read more]
Reply to The Future of the NoSQL, SQL, and RDBMS Markets

Conor O'Mahony over at IBM wrote a good post on a favorite topic of mine “The Future of the NoSQL, SQL, and RDBMS Markets”.  If this is of interest to you then I suggest you read his original post.  I replied in the comments but thought I would also repost my reply here.

-----------------------------------------------------------------------------------------------

Hi Conor, I wish it were as simple as SQL & RDBMS is good for this and NoSQL is good for that. For me at least, the waters are much muddier than that.

The benefit of SQL & RDBMS is that its general-purpose nature has meant it can be applied to a lot of problems, and because of its …

[Read more]
IA Ventures - Jobs shout out

My friends over at IA Ventures are looking to add both an Analyst and an Associate to their team. If Big Data, New York and start-ups are in your blood, then I can’t think of a better VC to be involved with.

From the IA blog:

"IA Ventures funds early-stage Big Data companies creating competitive advantage through data and we’re looking for two start-up junkies to join our team – one full-time associate / community manager and one full time analyst. Because there are only four of us (we’re a start-up ourselves, in fact), we’ll need you to help us investigate companies, learn about industries, develop investment theses, perform internal operations, organize community events, and work with portfolio companies—basically, you can take on as much …

[Read more]