Big Data Bubble?

This is beginning to feel like 2001 all over again. The hucksters and shiners are out there trying to make a big buck off the latest tech buzz!

FOLLOW THE MONEY and remember the old adage: garbage in, garbage out!
Big data, or for that matter any data, passes through four key stages before it returns value.

  1. Data Acquisition: These are the agents that gather data, whether it is the GPS location on your smartphone, the last search you performed, photons collected by the Hubble telescope, or CPU cycles consumed by a single server in a bank of 10,000. An agent is there collecting, gathering and becoming aware of its data.
  2. Data Encapsulation: The data must then be moved, transformed into an accessible data set and saved for later retrieval. The data itself must carry data that describes it; otherwise it will be impossible to derive any sort of correlation. This ‘metadata’ association with data creates either planned or ad hoc structured data.
  3. Data Information: Organizing structured data by filtering anomalies (cleaning), bringing together different data sets (aligning), and creating data views is the process of converting raw, nearly unusable data into high-value information.
  4. Data Knowledge: Finally, this is where all of today’s attention seems to be. The shiny object! Presenting Data Information (the data views) in an attractive package that provides business leaders with the high-confidence knowledge to perform a profitable business action.
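
To make the four stages concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the function names (`acquire`, `encapsulate`, `to_information`, `to_knowledge`), the `Record` type, the sample CPU readings and the thresholds are all my own assumptions, not anyone's real pipeline.

```python
from dataclasses import dataclass
from statistics import mean


# Stage 1, Data Acquisition: a hypothetical agent returns raw readings.
# The (server name, CPU percent) pairs are invented sample data.
def acquire():
    return [
        ("server-01", 41.0),
        ("server-02", 39.5),
        ("server-01", 9999.0),  # an obviously bad reading
        ("server-02", 40.5),
    ]


# Stage 2, Data Encapsulation: attach metadata so the values can be
# interpreted and correlated later.
@dataclass
class Record:
    source: str
    metric: str
    unit: str
    value: float


def encapsulate(raw, metric, unit):
    return [Record(source, metric, unit, value) for source, value in raw]


# Stage 3, Data Information: filter anomalies (cleaning) and build a
# per-source view (average value per server).
def to_information(records, max_valid):
    cleaned = [r for r in records if 0.0 <= r.value <= max_valid]
    by_source = {}
    for r in cleaned:
        by_source.setdefault(r.source, []).append(r.value)
    return {source: mean(values) for source, values in by_source.items()}


# Stage 4, Data Knowledge: turn the view into something a person can act on.
def to_knowledge(view, threshold):
    lines = []
    for source, avg in sorted(view.items()):
        action = "investigate" if avg > threshold else "ok"
        lines.append(f"{source}: average {avg:.1f}% CPU -> {action}")
    return lines


if __name__ == "__main__":
    records = encapsulate(acquire(), metric="cpu_utilization", unit="percent")
    view = to_information(records, max_valid=100.0)
    for line in to_knowledge(view, threshold=40.5):
        print(line)
```

Notice that the last function is the only part anyone would put in front of a CEO, and it is worthless if the bad reading is not dropped in stage 3 or if the metadata from stage 2 is missing. Garbage in, garbage out.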

Nearly every start-up I have seen mentioned, or acquisition done by a major company, has focused on item #4, the shiny object shown to CEOs and CIOs. Even more disconcerting is item #1: data is being collected regardless of its ‘value’ (thus the ubiquitous ‘exabytes of data’ being bandied about as big data). Handling data after it is collected is not free; just look at what AWS charges for data storage. And the more data you have, the more movement and processing you pay for (steps #2 and #3). For a look at those costs, look at AWS charges for bandwidth and EC2 processing.
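
A back-of-the-envelope way to follow the money is to add up storage, movement and processing for whatever you hoard. The sketch below is only that, a sketch: the `Rates` type and `monthly_cost` function are hypothetical, and the rates are placeholder numbers I made up, NOT actual AWS prices. Plug in the current figures from the provider's price list before drawing any conclusions.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    # Placeholder unit prices, invented for illustration only.
    storage_per_gb_month: float
    transfer_per_gb: float
    compute_per_hour: float


def monthly_cost(stored_gb, moved_gb, compute_hours, rates):
    """Sum the three cost buckets: keeping, moving and processing data."""
    storage = stored_gb * rates.storage_per_gb_month
    transfer = moved_gb * rates.transfer_per_gb
    compute = compute_hours * rates.compute_per_hour
    return storage + transfer + compute


if __name__ == "__main__":
    # Made-up rates; replace with real ones from a current price list.
    rates = Rates(storage_per_gb_month=0.02, transfer_per_gb=0.10, compute_per_hour=0.50)
    print(monthly_cost(stored_gb=1000, moved_gb=200, compute_hours=100, rates=rates))
```

All three terms grow with the amount of data collected, so data with no ‘value’ still shows up on the bill every month.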

The bubble is coming!
