As a publisher of data, I’ve always been fascinated by the growing use of the term ‘Big Data’ as it always seemed an ambiguous phrase with no apparent definition.
Standard Media Index (SMI) publishes huge volumes of data – at last count the Australian data set alone contained more than 18 billion rows – and the row count multiplies each time we add another dimension (filter), and grows again with every new product category or month of data.
So we certainly fit the definition of being ‘Big’. In the last year alone our data set has swollen as we’ve grown the number of categories on which we report by 25%, while also adding the digital ad type and state source dimensions in Australia.
Our digital ad type view alone contains five new data options: the display, mobile, video, search and other digital ad formats (all of which can be seen against each of the 3,400 digital vendors and 37 product categories). The state source (also known as agency market) dimension provides the same level of detail (i.e. all vendor and category detail) for each of the Melbourne, Sydney, Brisbane and Adelaide/Perth agency markets.
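To give a rough sense of how quickly these dimensions compound, here is a minimal back-of-the-envelope sketch in Python. It uses only the counts quoted above (five ad types, 3,400 digital vendors, 37 product categories, four agency markets). The dimension names are illustrative labels rather than SMI's actual field names, and the sketch assumes every combination could occur and that agency market can be crossed with ad type, so the results are upper bounds on the number of distinct views, not a row count.

```python
from math import prod

# Dimension sizes quoted in the article; the keys are illustrative labels only.
dimensions = {
    "digital_ad_type": 5,      # display, mobile, video, search, other
    "digital_vendor": 3400,
    "product_category": 37,
}

def max_combinations(dims):
    """Upper bound on distinct views: the product of all dimension sizes."""
    return prod(dims.values())

# Views available from ad type x vendor x category.
base = max_combinations(dimensions)

# Assumption: the four agency markets can be crossed with every view above.
with_markets = max_combinations({**dimensions, "agency_market": 4})

print(f"{base:,}")          # 629,000
print(f"{with_markets:,}")  # 2,516,000
```

Each extra dimension multiplies the total rather than adding to it, which is why a single new filter can swell the data set so quickly.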
So, given the obvious size of SMI’s data set, does that mean we qualify as Big Data? Or is Big Data something bigger than simply a very large data set?
The issue was fortunately clarified recently by a great article in Fairfax Media’s Boss magazine titled ‘The Raw Facts on Big Data’.
The article was based on a survey of members of The Australian Financial Review’s Business Leaders’ Panel, conducted by Boss and the University of Sydney’s Business School.
Among the questions posed was how to define Big Data. Of the 100-plus respondents, 45% defined Big Data as multiple data sets being linked together and 43% said it was using data to make evidence-based decisions.
Each response provides interesting insights for SMI and its place in the advertising research market.
Historically, the ability to understand customer behaviour absolutely did require – and for most industries still does require – the linking together of multiple data sets.
We often read stories in today’s media about marketers overlaying credit card or other non-identifiable data with data from reward programs and the like to try to unlock the buying habits of their potential customers so they can better target their products and marketing.
However, some single data sets are sophisticated enough that they can stand alone and be as meaningful and actionable to individuals or businesses as a combination of data sets.
SMI, for example, spares media companies the need to link together data sets or survey their main customers – the media agencies – as it has pulled together the agencies’ actual payment detail to provide a view of exactly where these customers are spending their clients’ funds, and all the data is updated each month.
Similarly, advertisers can see for the first time the actual spend trends within their product category, across all media sectors, in real time each month.
And that leads us to the second definition of Big Data – using data to make evidence-based decisions – and SMI certainly fits within that definition.
Evidence-based decisions can only be made if the data user is confident the evidence is completely reliable. SMI’s data is primary source data – it’s literally the collation of the actual payments by the agencies to each individual media vendor each month – so it is arguably the only data which can be relied upon to help inform big decisions.
And that obviously breeds a greater level of sophistication across the media sector.
So the answer to my initial question seems to be that SMI is a fundamental part of the Big Data equation for the media sector.
The actual ad spend data we produce is informing the data scientists within the media world, ensuring our media industry also benefits from this new world of ‘Big Data’.