In this article I want to explore three connected ideas. The first is big data, the phenomenon that now makes staggering volumes of information available almost instantaneously. The second is a condition: the information already known to us can limit how big data is used, because there are other ways of knowing that expand our thinking. And the third idea is the sum of the first two: the intersection of big data and a different way of seeing information means a model must be designed to use large amounts of validated information in a reasonable way.
So the era of big data is here. Imagine Niagara Falls and the millions of gallons of water that pour over the precipice every minute. That's the scale of information we envision when we think of the data we can reach out and grab, or that in some cases is pushed to us, every day. It would be impossible to know it all. But one benefit of massive volume is that, looked at through a certain lens, we have an opportunity to connect seemingly unrelated bits of data and discover trends, make predictions, even pre-position products and services long before we click, point or touch. It's the compilation of colossal amounts of data that presents a challenge. How do we pluck just the information we need from this torrent of bits? This is the difficulty of information management in the era of big data; it's like trying to take a sip from a fire hose. Our need is not simply to get information; it's to get just the right content, so we can work with more accurate and insightful facts, smarter and faster.
Clearly, then, working out a process to employ big data and make quality business decisions is difficult. Now consider this context, our second condition:
“There are known knowns” began Donald Rumsfeld's answer to a question at a US Department of Defense news briefing in February 2002, while he was serving as United States Secretary of Defense. Here's the whole tortured phrase: “there are known knowns; there are things we know that we know. There are known unknowns; that is to say, there are things that we now know we don’t know. But there are also unknown unknowns – there are things we do not know we don’t know.” Though it may seem convoluted, it is a “brilliant distillation of a quite complex matter,” said Mark Steyn, a Canadian columnist, a view echoed by many others, even legions of his detractors.
Good information is valuable on its own, but its utility when combined with other data, to discover new data and newer meanings still, is truly profound. Sometimes the information is known and we need to fasten it in context. Other times we don’t know that trustworthy information exists, or what it is, and we have to discover it. More abstract yet, the unknowns, useful but opaque, demand that we peer into the future, ask ‘what if’, and proceed to manufacture information on (hopefully intelligent and intuitively perceptive) speculation.
If you’re in the business of solving problems, and who isn’t really, you’ll need an information life cycle model that accounts for big data and the ‘knowns issue’ and manages a collection of information for maximum use. And beware: too much data without vetting and affirmation means you might miss the really important stuff, an effect that keeps security services awake at night. Therein lies the third concern of massive information management.
In summary, then, we face three elements in our quest to make big data work for organizations:
- Gathering information while factoring in the effect gained when combinations of content reveal newer, more meaningful data
- Respecting knowns and unknowns as facts and as potential ‘black swans’ (unpredictable or unforeseen events, typically with extreme consequences) that can and will skew results if not discovered before or during information capture or application
- Culling the really useful information or data—those bits directly related to the problem at hand—from the gargantuan amount of information flying about, and making it accessible, contextual and changeable.
Here’s a model that might help us slow down a bit, turn down the faucet, and cull out known information and potentially new content when big data delivers additional tonnage of content.
The flow chart illustrates how information would be categorically organized: a model for standardizing the information life cycle in a big-data world.
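To make the categorization step concrete, here is a minimal sketch in Python of how incoming items might be sorted into the three ‘knowns’ buckets. The class names, the `vetted` and `recognized` flags, and the example items are all illustrative assumptions of mine, not part of the model itself:

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    KNOWN_KNOWN = "known known"          # validated information we can place in context
    KNOWN_UNKNOWN = "known unknown"      # a gap we are aware of and can go research
    UNKNOWN_UNKNOWN = "unknown unknown"  # content we did not know to look for


@dataclass
class InfoItem:
    topic: str
    vetted: bool       # has the content been validated and affirmed?
    recognized: bool   # did we know this topic existed before capture?


def categorize(item: InfoItem) -> Category:
    """Sort an incoming item into one of the three 'knowns' buckets."""
    if item.recognized and item.vetted:
        return Category.KNOWN_KNOWN
    if item.recognized:
        return Category.KNOWN_UNKNOWN
    return Category.UNKNOWN_UNKNOWN


# A tiny slice of the "torrent of bits": each item gets a bucket,
# so vetted content can be used now and gaps can be flagged for discovery.
stream = [
    InfoItem("course objectives", vetted=True, recognized=True),
    InfoItem("competitor pricing", vetted=False, recognized=True),
    InfoItem("emerging regulation", vetted=False, recognized=False),
]

for item in stream:
    print(f"{item.topic} -> {categorize(item).value}")
```

The point of the sketch is only that the life cycle forces an explicit decision at capture time: use it, research it, or flag it as a surprise.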
Ultimately, culling useful information from an almost limitless stream comes down to energy, resourcefulness and commitment. When building a learning course, for example, your subject matter experts deliver very specific information, as they must. But is there other data, in text, as visuals, in video, that might provide a different way to see the information? Clarifying content by shifting the context just a little can often shine a light into corners formerly unseen. Whether one has the inclination to go shopping for more information depends on time and budget, yes; but when looking to make learning better and richer, drinking from the stream is often a task worth enduring. Creating metaphors mined from a combination of newly discovered information can improve the user experience, and enjoyment, like spinning a kaleidoscope and seeing new patterns. A model such as the one proposed might make such an effort more reasonable.