Machine Learning Times

This excerpt is from Information-Management. To view the whole article click here.  

5 years ago
The Big Data Odyssey


In the world of IT, we seem to fall victim to a new technology “siren’s song” approximately every five years. The most recent of these new irresistible forces is Big Data.

While our industry has been talking about Big Data for more than a decade, it is during the past five years that this business concept has taken flight, and it now pervades nearly all discussions held between IT organizations and the businesses they support.

I too have been swept up in the Big Data frenzy, both as a technologist and an author. I have interacted with thousands of practitioners in hundreds of organizations, and I’m lucky to have a feel for the pulse of the industry.

One thing is clear: Our industry has entered Gartner’s “Trough of Disillusionment” with Big Data, and organizations are struggling to determine why they are not gaining the breakthrough results that so many promise from Big Data.

There are remarkable similarities in the stories of Big Data failure, and one theme stands as a clear common source: Organizations aren’t actually doing Big Data at all.


Most organizations I have encountered buy into the notion that Big Data is a technology change, rather than a cultural, social and behavioral change. In this way, they are following the siren’s song that suggests achieving Big Data results is as simple as switching technology platforms.

This is most emphatically not the case, yet many organizations don’t realize the error until their Big Data efforts fall short of expectations.

The Siren’s Song of “Big BI”

While attempting to do Big Data, most organizations I work with are instead doing “Big Business Intelligence,” or “Big BI.” Big BI has three basic characteristics:

1. You’re Asking the Same Questions of the Same Data, You Just Have More of It

Most organizations that I work with believe that Big Data is all about Bigness: that the size of their datasets matters most, and that a petabyte of data is inherently more valuable than a terabyte. Hence, when these organizations try to embrace Big Data, they look at the same old data they always have, only more of it. Even worse, they often ask the same old questions of that same old data, thinking that using more data will somehow change the results.

You can build the biggest cluster the world has ever seen and fill it with petabytes or even zettabytes of data, but if you’re asking the same question of the same data you will almost certainly get the same answer. To get Big Data “right” you need to ask new questions of new data. The best approach consists of taking familiar structured data and combining it with other data sources (preferably unstructured data) that you’ve never analyzed before. Put them together, and then start asking new questions. That is how you get new insights.

2. You’re Running Batch Processes, and Reviewing the Results Periodically

In traditional BI we spend a great deal of time processing, assessing and cleansing data, long before we actually analyze it. This is so common for us that the process has a name: Extract, Transform and Load (ETL).

After these clean-up steps occur, we set up a process that actually analyzes the data as a batch, which generates an output after the batch completes. ETL and batch processing are fundamental concepts in Business Intelligence, hence we generally feel that this is the right way to perform analytics.

The problem with these approaches is that they’re not right for a Big Data world. We still need to do these analyses to support our business; quarterly reports aren’t going away any time soon. But if you consider the experiences that you and I now demand from our smartphones, apps and context-sensitive services, the need for real-time or even predictive insights becomes critical.

By the time you cleanse your data and run your batch process, I’ve already hailed another taxi, rented another apartment or signed up for another brand of car insurance. Big Data is about generating insights in real time. If you’re waiting for a batch to run in order to review a report, you’re doing Big BI.

3. You’re Wondering What to Do Next

The world of BI is one where reports are often viewed as an end-product. Many people in BI believe their work is completed once a report is generated. Reports are viewed as business outputs, rather than as business inputs. Many in business can identify with the experience of receiving periodic business reports.

We diligently review each report that may provide an interesting metric or two, but how often do we leave those review meetings with the urge to take immediate action? I would argue that the actions resulting from such reports and meetings are rare and often consist of telling our organizations, “Do what you do, just faster, cheaper or more emphatically.”

This is not a path to breakthrough results. If we don’t learn to act upon new insights from our analytics efforts, we are treating reports as outputs rather than as inputs. The point of Big Data should be to generate new, different, value-added actions, not more reports. If your efforts lead to numerous reports and minimal action, then it’s likely that you’re stuck in a Big BI world.

Moving from Big BI to Big Data

How do you free yourself from the endless loop of Big BI? How do you start to learn new things about your business, your customers and yourself, so that you can generate new results? How do you get to Ithaca, like Odysseus in The Odyssey, and avoid the rocky shores in between?


By: Christopher Surdak
Originally published at
