We’ve talked before about the host of business analytics challenges that Big Data is poised to solve. For some companies, solutions to these challenges are beginning to see the light of day as they bring the increasingly sophisticated and powerful tools available in the marketplace to bear on novel data scenarios. Other companies are struggling simply to get on the bandwagon and wrap their arms around the complex issues involved in developing an effective Big Data solution and deploying it in a manner that generates true and quantifiable business value for their organization.
That said, it behooves us to take a look at one of the problems that Big Data will not solve: report proliferation.
Any BI or Data Warehouse architect will be able to recount some situation from their career in which two directors (or worse yet, C-level executives) sat on opposite sides of the table, each clutching a report that had come out of their systems and been carefully prepared by their analytics staff. Both reports claimed to describe the exact same data, yet they did not match. What had happened? Over time, teams had taken reports and adapted them for users’ requests, adding a column here, changing the sort order there. The result was an explosion of report variants that, in the vast majority of cases, were used once, if at all, and then left to rot in the data warehouse. Which one is the right one? Sadly, they all are. Or, conversely, none of them is.
Now, as we struggle with Big Data paradigm challenges, we find ourselves revisiting the “single version of the truth” rallying cry of the mid-2000s. What was happening then is happening now: we saw the problem of diverging interpretations of the data, but had no way of reconciling them all. Ten years later, the problem is exacerbated by the sheer volume of data involved, as well as by the variety of data sources we’ve never integrated into our analytics processes before. With no precedent to follow, we are all making up the data interpretation “rules” as we go. The cause is different, but the effect is the same: report volumes explode as we struggle to identify the information inside the mountain of data.
One growing edge for Big Data will be its capacity to deliver on the “single version of the truth” as data volumes and velocities increase at unprecedented rates. Organizations and reporting teams that find effective ways to guide their user communities into responsible “curatorship” of Big Data analytics will be much better prepared to withstand scrutiny of their reports when they (inevitably) appear to diverge from others’ reports, or from legacy versions of their own!
DataHub Writer: Douglas R. Briggs
Mr. Briggs has been active in the fields of Data Warehousing and Business Intelligence for the entirety of his 17-year career. He was responsible for the early adoption and promulgation of BI at one of the world’s largest consumer product companies and developed their initial BI competency centre. He has consulted with numerous other companies about effective BI practices. He holds a Master of Science degree in Computer Science from the University of Illinois at Urbana-Champaign and a Bachelor of Arts degree from Williams College (Mass.).