

Big Data: The TV Marketers’ Guide

August 8th 2017

A version of this article originally appeared on FierceCable

Definition Source: Merriam-Webster Dictionary

As Big Data continues to gain volume, validity and velocity, the imperative for marketers, particularly those in the traditionally linear Television space, is to acquire the knowledge and skills to take the initiative in articulating and responding to clients’ needs in our ever-evolving sphere.

The TV ecosystem will incrementally slip into the Data space, catalyzed by new distribution platforms, new content consumption behaviors, new targeting capabilities, new analytics-driven intelligence, and maturing automated buying. Despite the foreignness of the language, and the near-overwhelming plethora of acronyms, we do not all have to instantly metamorphose into data scientists.

Rather, this is an opportunity for us to better grasp how data can have real-world applications (and implications), understand the vernacular and varied methodologies of data-science-based research, and find new, creative ways to deploy data adroitly to accomplish marketing goals.

Data is merely a new asset within our playbook, albeit a powerful one in our data-rich era.

A data-literate individual will be cognizant of how the activation of certain media stimuli creates desired outcomes, and perhaps even of the causality involved and how those outcomes may be attributed (be it via “linear correlation,” “time-series regression,” or “Single Source”). This implies a more scientific observation than the vague (albeit true) mantra, “TV (or substitute preferred media type here) campaigns produce brand awareness, which results in sales lifts.”

More importantly, data literacy will empower us to ask the right questions, understand the “whys” and “wherefores,” and help marketers ferret out additional resources to achieve specific marketing goals. How can we immerse ourselves in the gargantuan realm of data? Understanding the basic nomenclature is a good start. We can then build on this knowledge by asking the right questions and, ultimately, turning information into action. As has been much noted, actionable data (and concise insights) are the gateway to strategic decision-making, and accelerate our competitive advantage. Once we understand what specific data exists (and there is an over-abundance of data available, both good and bad), and we know what we would like to achieve, all else is just noise.

To get started, familiarize yourself with the following commonly used terms (some may be more familiar than others) and start adding them to your conversations:

Big data

A broad, catch-all term for amounts of data so vast that they cannot be processed using traditional data-processing tools. This data might be structured or unstructured. Challenges include data capture, storage, analysis, curation, search, sharing, transfer, visualization, querying, updating, and information privacy. As the saying goes, it is not the size of data that matters, but what you do with it.

1st party data

First-party data is ambrosia; the richest source of insight into your ideal customer. This is data that is collected, owned, and managed by your organization. It is the proprietary data collected by you, from your audience and customers, and stored in your systems. For most brands, this constitutes their CRM (Customer Relationship Management) dataset.

2nd party data

Second-party data is data that consumers provide to another company that—with the consumer’s notification and consent—you may access through a partnership agreement. More plainly phrased, this is someone else’s First-party data. For example, you could broker a deal with an airline to share loyalty card data to improve your targeting.

3rd party data

Third-party data is data that is collected on other sites, platforms, or offline, by a third party. Third-party data is used extensively by companies to better understand their audiences and to better target prospective customers. This is useful for demographic, contextual, and behavioral targeting.


Personally Identifiable Information (PII)

Before any Data Onboarding, or Matching, can occur, the handling of Personally Identifiable Information (PII) is critical. PII is a highly sensitive issue in information security, as well as in privacy law. PII is information that can be used to identify, distinguish, or locate a single person, or to pinpoint an individual in context (behavioral or otherwise) and so differentiate one person from another. PII can take the form of a first/last name and postal address, an email address, or an (unhashed) Customer ID.

Offline Data

Offline data is data that is collected, and stored, in ‘offline’ systems, like CRM datasets, Point of Sale (POS) systems, Frequent Shopper Card (FSC) databases, or contact center applications. Typically, this data is tied to some form of Personally Identifiable Information (PII). A lot of marketers have struggled to integrate this offline goldmine into their Digital marketing and Cross-Media attribution efforts. Data onboarding resolves this disconnect.

Data Integration

Rather inconveniently, customer data is often warehoused in disconnected systems. Data integration is the process of combining data sets that live in different applications so the marketer has a unified view of its customer. In addition to reconciling naming conventions (think “F.Name” in Data Set 1 vs “First Name” in Data Set 2) and alias values (CA vs California), the marketer also needs some way to recognize that Christine Hayes (maiden name) and Chris Newberger (married name) are actually the same person. This picture gets even more complicated when matching offline records to online data, such as the digital ads that Christine was exposed to. To put these data sets together, marketers need a privacy-safe approach to recognition and data connectivity that anonymizes the individual’s records before matching and linking their data. Companies like LiveRamp, Neustar, and Experian are working on a “Universal ID,” which links offline characteristics to online identifiers in an anonymized, privacy-compliant (aka PII-free) manner.
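To make the naming-convention and alias-value problem concrete, here is a minimal Python sketch. The field map, the state map, and the two sample records are made up for illustration; a real integration job would build these lookups by auditing each source system, and name changes (the Christine example) need far more than a lookup table.

```python
# Hypothetical lookups; real ones come from auditing each source system.
FIELD_ALIASES = {"F.Name": "first_name", "First Name": "first_name",
                 "ST": "state", "State": "state"}
STATE_ALIASES = {"ca": "CA", "california": "CA"}

def normalize_record(record):
    """Rename fields to a shared schema and standardize state values."""
    clean = {}
    for key, value in record.items():
        field = FIELD_ALIASES.get(key, key)   # unknown fields pass through
        value = value.strip()
        if field == "state":
            value = STATE_ALIASES.get(value.lower(), value.upper())
        clean[field] = value
    return clean

set_1 = {"F.Name": "Christine", "ST": "CA"}
set_2 = {"First Name": "Christine ", "State": "California"}
print(normalize_record(set_1) == normalize_record(set_2))  # True
```

Once both records share one schema and one set of values, they can be compared or joined directly.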

Data Matching

This is the act of connecting entities (eg, a marketer’s customers) in one data set, to their corresponding ID codes in another data set. For instance, you could match cookie data from your website to CRM data about your customers, and make sure Joe Customer gets treated the way Joe Customer deserves (courtesy of his purchase history), whenever he’s on your site. To comply with privacy regulations and best practices, offline data needs to be anonymized before it can be matched to Mobile Devices, or Digital IDs.
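As a rough illustration of matching on an anonymized key, the sketch below hashes a normalized email address with SHA-256 and joins two data sets on the hash. This is a simplified stand-in, not how any particular onboarding provider works; the sample CRM row and the cookie ID are invented, and hashing on its own may not satisfy every privacy regulation.

```python
import hashlib

def anonymize_email(email):
    """Return the SHA-256 hex digest of a trimmed, lower-cased email address."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Hypothetical data: a CRM keyed by raw email, and site IDs keyed by hashed email.
crm = {"Joe@Example.com ": {"last_purchase": "running shoes"}}
site_ids = {anonymize_email("joe@example.com"): "cookie-123"}

# Join on the hashed key so the raw email never has to be shared.
matched = {}
for email, profile in crm.items():
    key = anonymize_email(email)
    if key in site_ids:
        matched[site_ids[key]] = profile

print(matched)  # {'cookie-123': {'last_purchase': 'running shoes'}}
```

Normalizing before hashing matters: without the trim and lower-casing, "Joe@Example.com " and "joe@example.com" would hash to different values and never match.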

Deterministic/ Probabilistic Matching

Data matching can be either deterministic or probabilistic. Deterministic Matching compares specified identifiers (a hashed email address or login ID, for example) on two data records and declares a match only when they agree exactly, so the outcome is a precise yes or no. In contrast, Probabilistic Matching takes into account the relative closeness of the data and the context of the data records, assigns each of the identifiers a weighted score, and combines those scores into a likelihood that the data records match (this is considered a “fuzzier” methodology).
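The difference between the two approaches can be sketched in a few lines of Python. The field names, the weights, and the 0.6 threshold below are assumptions chosen for illustration; production systems estimate weights from data and use much richer comparisons than simple equality.

```python
def deterministic_match(a, b, key="email_hash"):
    """Match only when the chosen identifier is present and identical."""
    return a.get(key) is not None and a.get(key) == b.get(key)

# Hypothetical weights: how much agreement on each field counts toward a match.
WEIGHTS = {"last_name": 0.4, "zip": 0.3, "birth_year": 0.3}

def probabilistic_score(a, b, weights=WEIGHTS):
    """Sum the weights of the fields on which both records agree (0.0 to 1.0)."""
    return sum(w for field, w in weights.items()
               if a.get(field) is not None and a.get(field) == b.get(field))

rec_a = {"email_hash": "ab12", "last_name": "hayes", "zip": "10001", "birth_year": 1980}
rec_b = {"email_hash": None, "last_name": "hayes", "zip": "10001", "birth_year": 1981}

print(deterministic_match(rec_a, rec_b))   # False: no shared exact identifier
score = probabilistic_score(rec_a, rec_b)
print(round(score, 2), score >= 0.6)       # 0.7 True: likely the same person
```

The deterministic check fails because one record has no email hash, while the probabilistic score still clears the (assumed) threshold on the strength of last name and zip code.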

Match Rate

The Match Rate is the percentage of corresponding/overlapping IDs when two disparate data files (offline IDs to cookies, pixels, or tags, for example) are combined. This is a key factor in validating how well a marketer can execute Cross-Device identity mapping. The higher the match rate, the greater the viability of any insights gained from the matched dataset. For example, a 70% match rate means that 70% of the IDs submitted for matching found a counterpart in the other file, whilst the remaining 30% did not have a common identifier/characteristic, which precluded them from the match. Unmatched (depersonalized) IDs are typically discarded in a match: the percentage of unmatched IDs is called the Attrition Rate. As data quality and Identity Unification (across media) become increasingly rigorous, the Attrition Rate will (we hope) become correspondingly lower.
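The arithmetic behind the 70% example is simple enough to sketch. One assumption is made here: the rate is calculated against the offline file being onboarded, which is a common convention but not the only possible denominator. The ID values are invented.

```python
def match_and_attrition(offline_ids, online_ids):
    """Return (match_rate, attrition_rate) for the offline file, as percentages."""
    offline, online = set(offline_ids), set(online_ids)
    if not offline:
        raise ValueError("offline file is empty")
    matched = offline & online                 # IDs present in both files
    match_rate = 100 * len(matched) / len(offline)
    return match_rate, 100 - match_rate

offline = [f"id{i}" for i in range(10)]        # 10 anonymized offline IDs
online = [f"id{i}" for i in range(3, 20)]      # online IDs; 7 of them overlap
print(match_and_attrition(offline, online))    # (70.0, 30.0)
```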

Data Onboarding

Data onboarding is the process of matching data collected offline to data collected online. A marketer’s offline customer data can include first-party information, for example, purchase data (collected in-store) or loyalty-card center data, as well as Third-party information like demographics and buyer propensities (sourced from Experian, for example). This offline data is matched to online Devices and Digital IDs in a privacy-compliant way by companies like LiveRamp and Neustar, modeled to scale, then delivered (anonymized) to the ad platforms, DMPs, and Social channels, where advertisers can run their campaigns. This process enables more strategic Cross-Channel, and Omnichannel, campaigns.

Data Management Platform (DMP)

A ‘Data Management Platform’ sounds simple enough: it is a platform for managing data. DMPs evolved from the need to analyze data collected in anonymous cookies from websites and DSPs. DMPs allow marketers to build segments (using behavioral data from a marketer’s own campaign and/or a 3rd party) to target specific audiences with relevant ads. DMPs centrally manage and present a marketer’s campaign activity, as well as audience data, to help them optimize their media buys and creative executions. When combined with data onboarding, DMPs’ capabilities are amplified. Data onboarding connects a marketer’s offline customer data with the online data stored in their DMP, so they can target more effectively and build Look-alike models that help them reach more prospects who resemble their best customers.

Demand-Side Platform (DSP)

A ‘Demand Side Platform’ is software that automates the buying of display, video, mobile and search ads for marketers. It automates the more complicated aspects like targeting the right audiences, buying the right impressions in real time, delivering the right creative, and finding the right publishers.


ETL

We are fast approaching some serious data geekery, but this is worth knowing. ETL stands for ‘Extract, Transform, and Load’. Those are basically the three things a marketer needs to do when moving data between systems. Specifically, a marketer has to extract the data from the source, transform it into a standardized format, and then load it into a destination system. This is the most common approach to linking the data stored in one structure to the data stored in another structure, or to merging multiple disparate files into a single coherent format.
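For the curious, the three steps can be sketched end to end using only Python's standard library. The source data, column names, and table layout are all invented for illustration; a real pipeline would read from actual exports or APIs and write to a proper data warehouse rather than an in-memory SQLite database.

```python
import csv
import io
import sqlite3

# Hypothetical source export; a real job would read from a file or an API.
SOURCE_CSV = "F.Name,State,Spend\nChristine,California,120.50\nJoe,ca,80\n"
STATE_ALIASES = {"california": "CA", "ca": "CA"}

def extract(text):
    """Extract: parse the raw CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: standardize names, state codes, and numeric types."""
    return [(r["F.Name"].strip(),
             STATE_ALIASES.get(r["State"].strip().lower(), r["State"].strip()),
             float(r["Spend"]))
            for r in rows]

def load(rows, conn):
    """Load: write the standardized rows into the destination table."""
    conn.execute("CREATE TABLE IF NOT EXISTS customers "
                 "(first_name TEXT, state TEXT, spend REAL)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)
print(conn.execute(
    "SELECT state, SUM(spend) FROM customers GROUP BY state").fetchall())
# [('CA', 200.5)]
```

Because the transform step maps both “California” and “ca” to “CA”, the two source rows land in the same group once loaded.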

Agency Trading Desk

Agency Trading Desks are centralized management platforms used by ad agencies that specialize in programmatic media and audience buying. They are typically layered on top of a DSP, or other audience buying technologies. Trading desks attempt to help clients improve their advertising performance and receive increased value from their display advertising. Trading desk staff don't just plan and buy media; they also measure results, and report audience insights to their clients. All major advertising holding companies have agency trading desks, including Publicis (VivaKi), IPG MediaBrands (Cadreon), WPP (Xaxis) and OMG (Accuen/ Annalect). Trading desks were created in order to provide both client and agency more control over their ad placement (aka “Premium” buys). When working with an ad network, the client often has limited say over where the ad is placed. Working with a trading desk allows the client to direct where ad dollars are spent, and more closely examine the results to optimize, if necessary.


In-tab

In a research sample, In-tab is the number of Households, or Persons, supplying consistent, usable information for reports or special tabulations. It is usually expressed as a percentage of the sample supplying usable information (“good data”) over an average period of time (e.g., 30 days or a year).
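As a quick worked example, the in-tab percentage might be computed as below. The function name, the panel size of 1,000, and the three daily counts are hypothetical, and measurement companies may define the averaging period differently.

```python
def in_tab_rate(installed_sample, usable_by_day):
    """Average daily in-tab count as a percentage of the installed sample."""
    if installed_sample <= 0 or not usable_by_day:
        raise ValueError("need a positive sample size and at least one day")
    average_in_tab = sum(usable_by_day) / len(usable_by_day)
    return 100 * average_in_tab / installed_sample

# 1,000 installed Households; 820, 800 and 780 supplied usable data on three days.
print(in_tab_rate(1000, [820, 800, 780]))  # 80.0
```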


Data Modeling

Data modeling is the analysis (and determination) of specific data objects and their relationships to other data objects. It is often the first step in database design.

Look-alike Modeling

Look-alike modeling is a smart, and commonly deployed, way to expand a marketer’s audience and extend reach. This occurs when the marketer (and its 3rd-party data provider) analyzes its current customer data to come up with a target audience “segment” of people with similar traits, be they behaviors, demographics, or preferences. If single mothers who love Starbucks and watching Bravo’s Real Housewives franchise are sweet-spot customers, you could ensure your next campaign reaches more of the people most likely to fit that bill via Look-alike modeling. By aligning your online and offline data, data onboarding makes look-alike modeling a lot easier.
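One simple way to picture look-alike modeling is as a similarity ranking: describe each person as a list of numbers, average the best customers into a single profile, and rank prospects by how closely they resemble it. The sketch below uses cosine similarity, which is just one of many possible techniques; the three features and all of the values are invented.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def lookalikes(seed, prospects, top_n=2):
    """Rank prospects by similarity to the average (centroid) of the seed customers."""
    centroid = [sum(col) / len(seed) for col in zip(*seed)]
    ranked = sorted(prospects, key=lambda p: cosine(prospects[p], centroid),
                    reverse=True)
    return ranked[:top_n]

# Hypothetical features, scaled 0-1: [watches reality TV, buys coffee weekly, has children]
best_customers = [[1.0, 0.9, 1.0], [0.8, 1.0, 1.0]]
prospects = {"p1": [0.9, 0.8, 1.0], "p2": [1.0, 0.0, 0.0], "p3": [0.0, 1.0, 0.0]}
print(lookalikes(best_customers, prospects))  # ['p1', 'p3']
```

Prospect p1 resembles the seed customers on all three features, so it ranks first; p2 and p3 each share only one trait and trail well behind.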

1:1 Marketing

As the scale of marketing efforts grows, and the granular targeting capabilities of both offline (TV, in particular) and online channels/audiences evolve, marketing has to become more custom. 1:1 marketing is about delivering experiences that have been customized to cater to the likely preferences of a given customer, rather than broad segments.

Targeting (aka Advanced Targeting/ Data-Driven Targeting/ Advanced Audiences)

Targeting is the process of using customer data to shape distinct segments of your audience (based on behavioral data, consumption, ownership, or propensities), and subsequently pursuing them with distinct/ relevant experiences. Targeting can be complicated if a marketer’s customer data lives in silos, separated from the platforms used to engage with these customers.


Retargeting

Retargeting is the Tiffany Trump of Digital advertising. This technology/strategy lets a marketer reengage with people who have previously interacted with them, deploying creative that reflects their previous interaction/interests. For example, if an individual visited your site, and browsed for blue shoes, a cookie is dropped in their browser to facilitate recognition and tracking. Subsequently, your banner placements on the individual’s digital journey could reinforce a message encouraging them to come back for those specific shoes. With the 1:1 matching of TV Exposure data to Digital datasets, as well as the creation of a Universal ID, this could become an annoying, repetitive, retargeting loop.


Segmentation

Rather than addressing a marketer’s entire audience as one big, homogeneous blob, the creation of segments allows the division of their target market into meaningful chunks, based on preferences, behaviors, past activity, visitation, purchase, affinities, and demographics. These Segments may then be modeled to scale via Look-alike modeling, made available to other marketers, and targeted. Segmentation is the cornerstone of Digital marketing and, taken to its natural extreme, leads to a segment of one, or 1:1 marketing (verboten, given PII issues).
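At its simplest, segmentation is a set of rules applied to each customer record. The thresholds, field names, and segment labels below are hypothetical; real segments are usually defined inside a DMP and often rely on statistical clustering rather than hand-written rules.

```python
from collections import defaultdict

def assign_segment(customer):
    """Return a segment label from simple, hypothetical business rules."""
    if customer["purchases_90d"] >= 3:
        return "loyal buyers"
    if customer["site_visits_30d"] >= 5:
        return "active browsers"
    return "lapsed or new"

def build_segments(customers):
    """Group anonymized customer IDs by their assigned segment."""
    segments = defaultdict(list)
    for customer in customers:
        segments[assign_segment(customer)].append(customer["id"])
    return dict(segments)

audience = [
    {"id": "a1", "purchases_90d": 4, "site_visits_30d": 2},
    {"id": "a2", "purchases_90d": 0, "site_visits_30d": 9},
    {"id": "a3", "purchases_90d": 1, "site_visits_30d": 1},
    {"id": "a4", "purchases_90d": 5, "site_visits_30d": 7},
]
print(build_segments(audience))
# {'loyal buyers': ['a1', 'a4'], 'active browsers': ['a2'], 'lapsed or new': ['a3']}
```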

Cross-Media/ Multichannel/ Omnichannel Marketing

This is where a marketer may reach the same customer on different marketing channels via a single campaign. E.g.: James sees a commercial on TV, followed by a display ad on Facebook, advertising a new pair of shoes from a brand he likes. The next day, he sees a video ad for free shipping on that same pair of shoes. The week after that, he receives an email with a 10% off coupon for those shoes, and chooses to buy them at the mall the following weekend.


Attribution Modeling

As in the James example above, if somebody saw a banner, a Tweet, a TV ad, and a magazine ad, how can you tell what led to their decision to buy? The basic aim of attribution modeling is to figure out which marketing actions, or channels, contributed most to a certain customer action. More specifically, it is about using analytics to give credit where credit is due, and knowing how much credit is due. This gives you the data you need to optimize everything, from budget allocations to messaging to campaign strategies.

Traditionally, in digital attribution, the Last Touch (or, alternatively, First Touch) model assigns 100% of the credit to the final (or first) touchpoint/click preceding a sale or conversion. This is an overly simplistic approach, which does not measure the impact of Brand-building and reinforcement. In a Linear Attribution model (the simplest of the Multi-Touch Attribution models), each touchpoint in the path to conversion is accorded equal credit for the sale. This may also be seen as faulty, as it assumes all media types to be equally impactful.

As a result, weighted Multi-Touch Attribution is the preferred solution: it tracks a series of touchpoints through the purchase funnel, and assigns weighted revenue credit to each touchpoint.
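The attribution models described above differ only in how they split credit along the path to purchase, which a short Python sketch can show using James's journey. The channel names are shorthand for his touchpoints, and the weights in the final model are made up; in practice they would be estimated statistically rather than assigned by hand.

```python
def first_touch(path):
    """All credit to the first touchpoint."""
    return {path[0]: 1.0}

def last_touch(path):
    """All credit to the final touchpoint before the sale."""
    return {path[-1]: 1.0}

def linear(path):
    """Equal credit to every touchpoint in the path."""
    credit = {}
    for channel in path:
        credit[channel] = credit.get(channel, 0.0) + 1.0 / len(path)
    return credit

def weighted(path, weights):
    """Multi-touch: split credit in proportion to per-channel weights."""
    total = sum(weights[channel] for channel in path)
    credit = {}
    for channel in path:
        credit[channel] = credit.get(channel, 0.0) + weights[channel] / total
    return credit

path = ["tv", "facebook_display", "video", "email"]
hypothetical_weights = {"tv": 4, "facebook_display": 2, "video": 2, "email": 2}

print(last_touch(path))   # {'email': 1.0}
print(linear(path))       # 0.25 for each of the four channels
print(weighted(path, hypothetical_weights))
# {'tv': 0.4, 'facebook_display': 0.2, 'video': 0.2, 'email': 0.2}
```

Under Last Touch the email coupon gets all the glory; under the weighted model, the TV commercial that started the journey earns the largest share.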

Regression Analysis

Regression analysis is a predictive modeling technique that investigates the relationship between a dependent (target) variable and one or more independent (predictor) variables. This technique is used for forecasting, time-series modeling, and investigating cause-and-effect relationships between variables. Regression analysis is an important tool for modeling and analyzing data.
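The simplest case, one predictor and a straight line, can be fitted with ordinary least squares in a few lines of Python. The weekly spend and sales figures are invented (and suspiciously tidy); real data would be noisier, and a good fit on its own does not prove that the spend caused the sales.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical weekly data: TV spend ($000s) and unit sales.
spend = [10, 20, 30, 40, 50]
sales = [120, 140, 160, 180, 200]

intercept, slope = fit_line(spend, sales)
print(intercept, slope)          # 100.0 2.0
print(intercept + slope * 60)    # forecast at $60k spend: 220.0
```

Here the fitted line says that baseline sales are 100 units and that each extra $1,000 of TV spend is associated with 2 more units sold.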

Single Source Methodology

Single Source is the gold standard of attribution measurement: it tracks TV and other media/marketing exposure alongside purchase behavior, over time, for the same individual or Household. This measurement is gauged through the collection of data components supplied by one or more parties and overlapped through a single, integrated system of data collection. In TV advertising measurement, Single Source data are used to explore an individual’s loyalty and buying behavior in relation to advertising exposure within varying windows of time (e.g., a Year, Quarter, Month, or Week). In this sense, single-source data is a compilation of (1) Home-scanned sales records, and/or loyalty card purchases from retail or grocery stores, and other commerce operations, (2) TV tune-in data from cable set-top boxes, people meters (push-button or, preferably, passive), or household tuning meters, and (3) Household demographic information. The value of Single Source data lies in the fact that it is highly disaggregated across individuals, and within time. Single Source data reveals differences among Households’ exposure to a brand’s ads, and their purchases of those brands as advertising levels fluctuate. (Credit for this description goes to Bill Harvey, the progenitor of Single Source.)

Closed Loop/ Lift Analysis

Old-school digital campaign optimization was based on clicks and impressions. Now it’s possible to measure the impact of digital marketing on what really matters—sales, or ROI. The key to this solution is anonymization: when purchase data is anonymized, it can be linked with campaign exposure data (info on which Households, Individuals, Devices were exposed to an ad or website). With this, all marketers can have end-to-end pictures of their campaigns, which can be used to measure sales lift.
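Once exposure and purchase data are linked, one common way to express sales lift is to compare sales per Household among those exposed to the campaign against a similar group that was not. The control-group comparison is an assumption made for this sketch (it is one of several lift methodologies), and the figures are invented.

```python
def sales_lift(exposed_sales, exposed_hh, control_sales, control_hh):
    """Percent lift in sales per Household, exposed group vs. unexposed control."""
    exposed_rate = exposed_sales / exposed_hh
    control_rate = control_sales / control_hh
    return 100 * (exposed_rate - control_rate) / control_rate

# Hypothetical results: $6.00 per exposed Household vs. $5.00 per control Household.
print(sales_lift(exposed_sales=6000, exposed_hh=1000,
                 control_sales=10000, control_hh=2000))  # 20.0
```

Dividing by the number of Households in each group keeps the comparison fair when the exposed and control groups are different sizes.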


The perplexing feeling expressed by most marketers when they encounter the fragmented nightmare that is their data layer. Or after reading this article.

Boon Yap
