Big Data? Why Not Focus on Small Data First?

A recent report from The Data Warehousing Institute[1] reveals that only a small percentage of respondents are organized to execute Big Data initiatives.  In addition, although many are capturing Big Data from a variety of sources, in-depth analytics on that data is happening only in pockets.  In general, many of the respondents are in the pre-adoption or early adoption phases of their Big Data initiatives.

So, how does this reconcile with the constant hype around Big Data?  It doesn’t.  In fact, Gartner’s 2014 Hype Cycle for Emerging Technologies[2] places Big Data on a downward trajectory towards the “Trough of Disillusionment”, with the expectation that real productivity in this space will not be achieved for another 5 to 10 years.  What is the real message here?  From our perspective, it is that most organizations have yet to get their “Small Data” in good enough order to provide a foundation for a Big Data initiative.  Without the essentials of enterprise data management and reporting in place, organizations should step carefully into the Big Data space; otherwise, they will be ignoring the risk reduction and value that could be achieved with Small Data.

What is Small Data?

Small Data is the information that you use every day to manage the business of your organization.  It drives process execution, reporting, and, ultimately, decisions.  It’s information about customers, materials, vendors, assets, financial accounts, employees, etc.  We use the term “Small” here to distinguish it from “Big Data”, which is “Big” only in the sense that the data collected is significantly larger in volume, might change more quickly, and might have more variability than what you see in the “Small” data space.  Without Small Data, most organizations would be flying blind, and managers would be unable to manage.

Given that Small Data is so important, think about the time that is spent ensuring that it works for your organization.  Do you trust the information that sits behind the daily report you use to manage how resources are allocated within your organization?  Or do you build complex spreadsheets to parse out what you need and get to the real answers?  In many cases, organizations are being run on these spreadsheet “shadow” systems to compensate for challenges in the Small Data space.

Getting control of Small Data, how it flows through your organization, and where it adds value to the bottom line should be a key goal for executives.  The main avenue of exploration should be data governance: who owns the data, what does it mean, is it master data or transactional data, how do we use it, and who can change it?  These seem like simple concepts, but they can be massive hurdles for organizations where Small Data is pervasive across processes, systems, and organizational divides.

Priorities, Priorities, Priorities

What about organizations that have figured out Small Data already?  First of all, congratulations; your organization has delivered a capability set that can offer competitive advantage in the marketplace.  Now, your next challenge will be to replicate what you have done with your Small Data on your Big Data, which presents another level of complexity to be tackled.  With current technologies in the Big Data space, answering questions of data lineage and data quality is a difficulty that your organization will have to face[3].

Given all of the above, should we abandon all Big Data efforts and the promise of the underlying technology?  No, but an organization should take a long, hard look at itself before choosing to prioritize Big Data initiatives when it hasn’t solved its Small Data challenges.



