“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.
Today’s column is written by John Shomaker, CEO at AdJuggler.
The New York Times recently ran a column by Gary Marcus and Ernest Davis, two NYU professors, questioning the merits and hype of so-called “big data.”
The term big data reflects the exponential increase in data that many industries, governments and individuals now collect, as well as the supposed insights yielded by these heaps of data. Pervasive, real-time data collected from online interactions, social media, ecommerce and mobile devices, combined with the ever-shrinking cost of database software and storage, drives frequent assertions that 90% of all known human knowledge has been captured in the past decade. Is this data, as the professors imply, more noise than signal?
The noise observation is understandable. Arguably, we’re still in the first phase of big data: collection. Google, the epicenter of all data, which already has photographs of the outside of your front door, recently acquired Nest as an Internet of Everything play so it can collect more data inside your house. And, just weeks ago, Facebook received permission to acquire Oculus VR, ostensibly to track your virtual life now that your real life is fully on display.
But I’m still a believer. In most industries, the leaders differentiate themselves by building, pricing and servicing products that reflect the attitudes and behaviors of their customers, who are increasingly described as a portfolio of segments or individuals. Digital advertising is at the forefront of big data, operating a rapid-fire and increasingly intelligent digital dialogue with consumers as they learn, shop, buy, share and recommend.
In the quest to swim past the data itself and find real insight, I see five scenarios where big data offers tangible improvements over smaller data sets:
- Replacing less statistically rigorous analyses
Prior to the growth in data, many analyses, segmentation profiles and operating metrics were based on small sample sizes, surveys or focus groups. Today, larger data sets allow for richer, more statistically significant results, marginalizing the more qualitative findings (a quick sample-size sketch follows this list).
- Finding a needle in the haystack
The breadth of the datasets themselves – what’s captured – is also much larger. Yes, this creates a huge risk of noise, but in certain applications, such as detecting cybercrime or terrorism, considering obscure attributes can yield important, yet unforeseen, results.
- Getting to the big picture
One of the biggest computing and operating challenges of the last 30 years has been system fragmentation and data “islands.” For companies with multiple divisions, country organizations, products and customer touch points, big data finally enables a comprehensive view of the customer through their entire life cycle.
- The fourth dimension: time
Historically, most data sets and analyses were limited to one-time snapshots and didn’t accurately reflect longitudinal change. By capturing data – lots of it – by day, hour or second, we gain a more complete understanding and make more accurate trend-based predictions.
- Real time
Just as more data is collected across an expanded set of online user connections, we and our machines can learn and respond in an equally real-time manner. In the world of securities trading or online advertising, it is commonplace to respond to a live trade or consumer in mere milliseconds, adjusting pricing and offers based on both the individual and the entire pool of users (a simplified bidding sketch also follows this list).
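To make the first scenario concrete, here is a minimal Python sketch, using hypothetical numbers and nothing more than the textbook standard-error formula, of how the uncertainty around a measured 2% click-through rate shrinks as the sample grows from focus-group scale to ad-server-log scale:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p measured over n observations."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical: the same 2% click-through rate measured at three very different scales.
ctr = 0.02
for n in (200, 20_000, 2_000_000):  # focus group, survey panel, ad-server log
    print(f"n = {n:>9,}: CTR = {ctr:.1%} +/- {margin_of_error(ctr, n):.2%}")
```

At 200 respondents the estimate is 2% plus or minus roughly 1.9%, so the true rate could plausibly sit anywhere from near zero to almost 4%; at 2 million logged impressions the band tightens to a few hundredths of a percent, which is what makes the result actionable.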
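And for the real-time scenario, here is a simplified, hypothetical sketch of the kind of per-impression pricing decision an ad server makes. The names (BidRequest, BASE_CPM, pool_win_rate) are illustrative rather than any particular platform’s API; the point is simply that the price is adjusted per individual and per the pool of users, inside a single request:

```python
from dataclasses import dataclass

@dataclass
class BidRequest:
    user_segment: str      # e.g. "in-market-auto", learned from prior behavior
    recent_visits: int     # this user's visits to the advertiser in the last week
    pool_win_rate: float   # share of recent auctions won at the base price (0.0 to 1.0)

# Hypothetical per-segment base prices (CPM, in dollars).
BASE_CPM = {"in-market-auto": 4.00, "sports-fan": 1.50, "unknown": 0.50}

def price_bid(req: BidRequest) -> float:
    """Adjust the bid for one impression using both the individual and the pool."""
    bid = BASE_CPM.get(req.user_segment, BASE_CPM["unknown"])
    # Individual signal: a recently engaged user is worth more.
    if req.recent_visits >= 3:
        bid *= 1.25
    # Pool signal: winning too few auctions, pay up; winning too many, pull back.
    if req.pool_win_rate < 0.20:
        bid *= 1.10
    elif req.pool_win_rate > 0.60:
        bid *= 0.90
    return round(bid, 2)

print(price_bid(BidRequest("in-market-auto", recent_visits=4, pool_win_rate=0.15)))  # 5.5
```

In production this decision runs thousands of times per second and the multipliers are learned rather than hard-coded, but the shape of the logic is the same.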
Big data is relative; there’s no agreed-upon size that defines it. But as the cost of data collection, storage and analysis approaches zero, organizations are motivated to explore how data can lead to truly unique insights, organizational differentiation and entirely new business models.
Follow John Shomaker (@jshomaker), AdJuggler (@AdJuggler) and AdExchanger (@adexchanger) on Twitter.