Debating the definition and merits of Big Data is like listening to a group of people who just watched the same movie. Some liked it, some didn't and everyone has a different opinion on what it all means.
When we've talked to government IT managers over the past year, we've quickly discovered that, to some, the term Big Data simply means rapidly growing collections of data – both structured and unstructured. To others, the term refers to data sets that have already reached a certain size. To still others, a collection doesn't qualify as Big Data unless it includes streaming data and collected social media interactions.
Measuring Big Data - Our Stake in the Ground
In about a week, IDC Government Insights will issue a forecast for Big Data in government. Since it can be challenging to reach consensus on what qualifies as Big Data, we had to put our own stake in the ground to say exactly what we are counting, what we believe the U.S. government is spending on Big Data solutions and what level of growth we think the market will see in the coming years.
Basically, measuring Big Data comes down to volume, velocity, and processing capability.
Under IDC's definition, a collection has to pass a series of checkpoints to qualify as Big Data. First, it needs to meet at least one of three criteria: more than 100 terabytes of data have been collected, or the data generated is growing by more than 60% per year, or the data is fed into your system via ultra-high-speed streaming. Then, no matter which of the three criteria has been met, the data also needs to be deployed on a dynamically adaptable infrastructure. If it meets this standard as well, there is a final check: the data must come from two or more data formats and/or data sources, or it must arrive via a high-speed streaming connection. Only if the data (and its associated IT system) meets these checkpoints can the collection qualify as Big Data under our designation.
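For readers who think in code, the checkpoints above can be sketched as a simple decision rule. This is only an illustration of how the definition reads, not an IDC tool: the class name, field names and the `qualifies_as_big_data` function are all hypothetical, and we assume the "ultra-high-speed streaming" criterion and the later "high-speed streaming connection" check are separate flags, since the definition words them differently.

```python
from dataclasses import dataclass


@dataclass
class DataCollection:
    """Hypothetical description of a data collection and the system hosting it."""
    size_terabytes: float
    annual_growth_rate: float          # 0.60 means 60% growth per year
    ultra_high_speed_streaming: bool   # data is fed in via ultra-high-speed streaming
    adaptable_infrastructure: bool     # deployed on dynamically adaptable infrastructure
    num_formats: int                   # distinct data formats in the collection
    num_sources: int                   # distinct data sources feeding the collection
    high_speed_streaming: bool         # data arrives via a high-speed streaming connection


def qualifies_as_big_data(c: DataCollection) -> bool:
    # Checkpoint 1: at least one of the three scale criteria.
    meets_scale = (
        c.size_terabytes > 100
        or c.annual_growth_rate > 0.60
        or c.ultra_high_speed_streaming
    )
    # Checkpoint 2: dynamically adaptable infrastructure is always required.
    meets_infrastructure = c.adaptable_infrastructure
    # Checkpoint 3: multiple formats and/or sources, or high-speed streaming.
    meets_variety = (
        c.num_formats >= 2
        or c.num_sources >= 2
        or c.high_speed_streaming
    )
    return meets_scale and meets_infrastructure and meets_variety


# Example: a 250 TB archive with two formats on adaptable infrastructure.
archive = DataCollection(250, 0.10, False, True, 2, 1, False)
print(qualifies_as_big_data(archive))
```

Note that a very large collection sitting on fixed infrastructure fails the second checkpoint in this sketch, no matter how much data it holds.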
Under the definition outlined above, we believe the U.S. government will spend over $701 million on Big Data collection and processing activities in 2014.
We also believe this spending will double over the next three years, due to rapid growth in the types of data being collected (sensors associated with smart city efforts, stepped-up defense-related data collection, and vast quantities of information related to everything from agricultural patterns to public health data to commercial transactions).
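To put "double over the next three years" in annual terms, here is a back-of-the-envelope sketch. It assumes smooth year-over-year compounding from a 2014 base of exactly $701 million, which is a simplification of ours and not the forecast model itself; the variable and function names are illustrative only. Under those assumptions, doubling in three years works out to roughly 26% growth per year.

```python
# Assumptions (ours, for illustration): exactly $701M in 2014, spending
# doubles by 2017, and growth compounds evenly each year.
BASE_SPEND_MUSD = 701.0   # 2014 spending, in millions of U.S. dollars
YEARS = 3
GROWTH_MULTIPLE = 2.0     # "double"

# Compound annual growth rate implied by doubling in three years.
implied_cagr = GROWTH_MULTIPLE ** (1 / YEARS) - 1


def projected_spend(year_offset: int) -> float:
    """Spending (in $M) year_offset years after 2014, under even compounding."""
    return BASE_SPEND_MUSD * (1 + implied_cagr) ** year_offset


print(f"Implied annual growth: {implied_cagr:.1%}")
for offset in range(YEARS + 1):
    print(2014 + offset, f"${projected_spend(offset):,.0f}M")
```

Actual spending is unlikely to grow this evenly, so the intermediate years should be read as an interpolation rather than a forecast.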
We further believe that different functional areas of Big Data will increasingly fall under the domain of significantly different types of contractors. For example, our upcoming report ranks vendors on a scale of 1 to 10 on how they are gaining revenue from different aspects of Big Data. Not surprisingly, big systems integrators are handling the professional services aspect of Big Data collection and integration, while other vendors are highly focused on things like storage (think companies like EMC), software (think companies like SAP and Oracle) and networking (think Cisco and HP).
We also understand that government is a key leader when it comes to data collection and treating Big Data as a key resource that can be mined - in a way that will help both citizens and government understand more about what is happening in our country from multiple standpoints, including economic, social, health and national security.
So What Does this Mean for Privacy?
The White House has been active in addressing the privacy concerns that arise when so much personal data is collected. In January of this year, the president issued a report called Big Data and the Future of Privacy. It talked about balancing the need to keep the country safe with the need for individual privacy. Since then, Presidential Counselor John Podesta has set up a website to collect public feedback on Big Data matters.
Our long-term take on government Big Data is that data collections will continue their rapid expansion. The government's demand for storage, hosting and processing power will continue to expand, and select key players will develop expertise in specific Big Data niches. It's fascinating to watch this whole process unfold, and even more interesting to measure it.