2024 Predictions are in… What Now? (Part Two – A New Era Spawns New Data Needs)

Feb 11

Is your data prepared for this unprecedented marketing industry reset sparked by consumer privacy, digital id signal loss, retail media, clean room proliferation, AI potential, and more? That is the theme of this two-part blog. Part one offered suggestions on how to pragmatically assess your data readiness. Part two below shares the types of data you should consider infusing into your marketing stack to mitigate this new era’s upcoming risks and capture new opportunities.

Before jumping into these data source recommendations, it is important to first understand the challenges and resulting implications that will emerge or intensify in the coming few years. I purposefully tried to keep these generic as I believe these themes will be widespread across your entire marketing enterprise and therefore justify you being strategic, not just tactical, about how you infuse this data into your solutions.

Scale – The expected loss of digital id level signals like 3rd party cookies, mobile ad ids, and maybe IP addresses will crush the effective scale of open web advertising and the derivative data we rely on for targeting, modeling, and measurement. Sure, walled gardens will still have authenticated person/device level data, but it will be isolated, obfuscated, or restricted for many marketing use cases. Advertisers will need to piece together more targeting methodologies and media publisher networks than ever to maintain their current scale and performance.

Quality – Data quality has always been a challenge and is already responsible for a huge percentage of marketing waste and data scientist time. It’s not glamorous, but it will need even more attention. Corporate AI dreams and related cost-cutting mandates will quickly surface data quality prerequisites. Making matters worse, I expect bad actors to inject more data that looks good but is not, using sophisticated fraud, made-for-advertising schemes, bogus media content categorizations, knowingly stale identity ids, etc.

Completeness – As industry power further shifts to first-party data owners, your marketing use cases will need to go to where their data is (i.e. clean rooms). These commercial and privacy-motivated changes will create more data silos than ever before. Data for only one retailer. Data for only one media publisher. And so on. If you are not careful, you will find yourself running your business on half-truths or even no truth because some internal silos do not have access to some external partner silos. AI cannot fix this. Training your models on skewed or limited data will just result in skewed and limited models and their resulting actions.

Relevancy – The ad industry has been spoiled with the ability to deliver highly relevant ads to individuals who they know did buy the product, visit the store, or see the ad. Even if an ad was served with less than relevant media content, they still knew it was delivered to the target audience. As deterministic addressable targeting and measurement lose scale, the data and methods that fill that void need to be relevant. More decisions will be based on probabilities and proxies. Assessing and testing the relevancy of that decision-making and model training data will be critical.

Consistency – Our marketing use cases will be even more dependent on data that is fragmented, aggregated, and modeled. That means your data aggregates will need to be joined with others’ data aggregates. Unfortunately, few segment, category, cohort, and classifier definitions and taxonomies neatly align for humans, let alone data technologies. Industry groups and leading data providers try to establish standards, but competitive dynamics usually get in the way. More attention will need to be paid to data and processes that can harmonize data as much as possible to enable more scale, automation, and efficiencies.

Sorry to be a Danny Downer, but ready or not, the industry is changing. Just using AI is not enough. AI will quickly become the new baseline and in some respects be commoditized. The differentiator will be the data. Your AI is what your AI eats.

Ok, so what types of data can help us solve or mitigate the scale, quality, completeness, relevancy, and consistency challenges ahead of us? I listed several types of data below that you should consider adding to your current marketing data stack and why. I did not spend time talking about the core datasets that you should already have handled like customer data, purchase data, ad impression data, etc. My objective is to stretch your thinking.

Panel Data

Yes, much-maligned panels are on my list. Before some of you tune me out, let me be clear. I am not suggesting that we back off the industry push to use larger passively collected data from retailer POS, set-top boxes, and click streams. The scale and quality of those datasets is a no-brainer. Unfortunately, despite their size, those datasets are often limited in scope and have representativity skews. That means your AI-based insights, recommendations, and automated decisions will be similarly skewed unless you mitigate their blind spots. I am suggesting that representative panel data can help you maximize the value of those bigger datasets by injecting norms, defaults, calibrations, and other ‘outside’ perspectives.

By ‘panel data’ I do not mean only those relatively tiny panel solutions where consumers are recruited to share their television viewing, purchasing, opinions, etc. I also include other broad-based data sources that have adequate universe coverage and have been thoughtfully augmented with other data and analytics to address quality issues, geo-demographic imbalance, trend breaks, etc. Your goal is to have a reasonable ‘truth set’ of consumer behavior at market, competitor, channel, segment, household, person, campaign, and pantry level.

Everyone’s situation is different, but some of the challenges that panel data has helped me mitigate before include… My promotion-heavy retailer data lacks insight to inform my every-day-low-price strategies. My DSP impression data can only be trusted to optimize buys like the ones I have previously made. My consumer interest profiles are limited to the media content I previously bought and could afford. My set-top box data knows household viewing but struggles to know who specifically was watching. My retailer purchase data knows they did not buy that product there but cannot be certain they did not buy it elsewhere. My campaign measurement can only quantify incremental sales for retailers sharing data.

Not addressing these blind spots leads to missed opportunities, media waste, underreporting, subsidizing purchases, etc.
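To make the ‘calibrations’ idea a bit more concrete, here is a minimal sketch of one common approach, post-stratification weighting, where a skewed dataset is reweighted so its segment mix matches a representative panel. The segment names, the 80/20 split, and the panel shares are all made up for illustration, and a real calibration would involve far more dimensions and care.

```python
from collections import Counter


def calibration_weights(sample_segments, panel_share):
    """Return one weight per segment so the weighted sample mix matches the panel mix.

    sample_segments: list of segment labels, one per record in the big (skewed) dataset.
    panel_share: dict of segment -> share of the population according to the panel.
    """
    counts = Counter(sample_segments)
    total = sum(counts.values())
    weights = {}
    for segment, count in counts.items():
        sample_share = count / total
        # Over-represented segments get weights below 1, under-represented above 1.
        weights[segment] = panel_share[segment] / sample_share
    return weights


# Hypothetical data: a promotion-heavy retailer feed where 80% of records
# are promo shoppers, while the panel says the market is split 50/50.
sample = ["promo_shopper"] * 80 + ["edlp_shopper"] * 20
panel = {"promo_shopper": 0.5, "edlp_shopper": 0.5}

weights = calibration_weights(sample, panel)
print(weights)
```

In this toy case the every-day-low-price shoppers are weighted up and the promo shoppers are weighted down, which is the kind of ‘outside’ perspective a panel can inject into a larger but skewed dataset.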

Contextual Data

Despite all the efforts to maintain the addressable advertising status quo, the scale and cost of that will suffer in this new marketing era. This will shift more advertising dollars to the ‘next best thing’ and very often that will be contextual targeting. Everyone will need to get good at factoring context into their marketing stack, whether to mitigate the loss of addressable targeting, personalize creatives to the moment, optimize bid prices, or augment measurement results used for future planning.

Broadly speaking, contextual targeting is about predicting when and where your audience will be and how receptive they are to your brand message. Our lives are complicated and what life moments influence us to listen, buy, or advocate varies wildly. Here are some examples that I have used before… At home, at work, at store, commute time, dinner time, preparing a recipe, shopping for a recipe, searching for a recipe, school starts next week, spring break next week, researching fantasy football stats, watching the news, watching sports news instead, local music festival this weekend, dicey weekend weather forecast, gas prices up, mortgage rates up, etc. Then there is personal context that is especially powerful but requires careful sensitivities like… new home, new job, no job, approaching retirement, newly married, newly divorced, new baby, baby going to college, gluten intolerance, heart-healthy diet, chocolate lover, etc.

In a perfect world, understanding the power of context is like assembling a diary of each consumer’s daily activities. A neatly categorized time-series of everything consumers do that is overlaid with a time-series of everything they buy. Understanding what people do informs your customer profiles and understanding when they do it helps identify causalities and opportune moments for your brand. Together they help advertisers know the time, place, and message that is most likely to find and influence your audience, at the right price.

Finding and affording contextual data that captures everything consumers do and why is never-ending, but likely your brand team already has a good sense of the ‘context that matters’. Start there, but then stretch yourself. Getting better requires pushing on your assumptions and trying new things. For some brands, getting weather, pollen, climate, and other environmental data is required. Others might benefit from licensing data on school schedules, team sports game times, or local event calendars.

The contextual data that matters for everyone comes from media content consumption. Picking the best media content and classification vendors can be tricky. This is much more than keywords and word clouds. Let me give you a few timely examples… Less sophisticated keyword-based capabilities will probably have Taylor Swift high on the NFL topic keyword list. Blindly targeting those keywords without first confirming that the article is truly about football would result in lots of beer ads on pop music sites. Likewise, luxury car manufacturers might saturate the northern Florida market with ads next to Jacksonville ‘Jaguar’ football articles. Your vendor must understand sentence structures too because… You can be right in a debate on who will win the big game this weekend. You can also be on the Right for an upcoming presidential debate. Or you can take a right into the local tavern to forget it all.
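As a toy illustration of why a single keyword hit is not enough, the sketch below only tags an article as football when several supporting terms appear together. The term list and the threshold of two are assumptions I made up for the example. Real classification vendors use far more sophisticated language models, so treat this as a way to frame the problem, not a solution.

```python
import re

# Hypothetical supporting vocabulary; a real taxonomy would be much larger.
FOOTBALL_TERMS = {"nfl", "touchdown", "quarterback", "playoff", "kickoff", "roster"}


def tokens(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def looks_like_football(text, min_support=2):
    """Tag as football only if at least `min_support` distinct football terms appear."""
    support = set(tokens(text)) & FOOTBALL_TERMS
    return len(support) >= min_support


articles = [
    "Taylor Swift announces new tour dates after attending the NFL game",
    "Jaguars quarterback throws late touchdown in playoff win",
    "New Jaguar sedan review: luxury and handling",
]
for article in articles:
    print(looks_like_football(article), "-", article)
```

Only the second headline clears the bar here. The pop music story mentions the NFL once and the car review never touches football vocabulary, so neither would attract the beer or luxury car ads by mistake.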

Net, contextual data properly classified will inform the best time, space, and mood to catch your audience with a relevant compelling message and at a price that makes sense. Media channels, platforms, and now browsers will each implement unique contextual targeting methods. You can buy contextually relevant publisher content directly, you can programmatically buy ads in real-time next to desirable media content, you can buy ads physically next to desirable out-of-home context, you can give cohort-based black boxes your desired topics, etc. In any scenario, you need to correlate these contextual dynamics to your desired brand behaviors. This needs to be core to your enterprise and not a collection of media channel one-offs. You should no longer just trust your gut and annual surveys.

Taxonomy Data

Aligning data across marketing channels, retailers and media platforms has never been easy. Competitors don’t play nice and they love to inject their own secret sauce at the expense of industry-standard data definitions and categorizations. We have been somewhat spoiled that device platforms and 3rd parties have provided us with pseudo-anonymous digital ids like 3rd party cookies, mobile ad ids, and IP addresses to stitch data together. That luxury is fading, however. Don’t ask me when, but a good third is already gone, so that should not be your first question.

While much of the industry focus is on what to do if we lose the ability to link individual consumers, shoppers, and households across our data, let’s not lose focus that other data dimensions have similar challenges that will be amplified by the proliferation of clean rooms. Linking together things like stores, store clusters, fulfillment centers, products, categories, campaigns, etc. requires sharing ids and definitions. Complicating matters will be that consumer privacy legislation and strategic first-party data investments will restrict data sharing, especially at the most elemental building block level.

Some data will be easier to map across parties and platforms, but few will be easy. Even CPG product mapping that benefits from a Universal Product Code can be a challenge when retailer data is based on proprietary retailer SKU or article numbers. All bets are off when trying to map products at the category or attribute level. Ditto for asymmetrical media content definitions (e.g. IAB, TOPICs, genres), calendars (e.g. fiscal years, primetime), and geographies (e.g. DMAs, MSAs, SMMs). Even linking data at the most elemental level can cause data quality issues due to sloppy naming (e.g. 7-11, 7-Eleven, Seven-Eleven, etc.).
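The sloppy naming problem is the easiest of these to picture in code. Below is a minimal sketch, assuming you maintain a small alias table by hand, that strips punctuation and case before looking up a canonical name. The alias table and function names are hypothetical, and real store matching usually also needs addresses, store numbers, and fuzzy matching.

```python
import re

# Hypothetical, hand-curated alias table keyed on a stripped-down form of the name.
ALIASES = {
    "711": "7-Eleven",
    "7eleven": "7-Eleven",
    "seveneleven": "7-Eleven",
}


def normalize_key(name):
    """Lowercase the name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def canonical_name(name):
    """Return the canonical name if the alias is known, else the trimmed input."""
    return ALIASES.get(normalize_key(name), name.strip())


for raw in ["7-11", "7-Eleven", "Seven-Eleven", " seven eleven ", "Kroger"]:
    print(repr(raw), "->", canonical_name(raw))
```

Names that are not in the table pass through unchanged, which makes it easy to review the leftovers and decide which ones deserve a new alias.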

Industry groups like the IAB, MRC, NAICS, Uniform Code Council, and data providers attempt to create common definitions and taxonomies, but their competitive and political biases hamper universal adoption. It is best to just accept that you will need to fit square pegs into round holes and systemically mitigate less-than-perfect category, segment, and cohort matching. Do not let perfect get in the way of better, faster, and cheaper.

Identity Data

Not sure this one qualifies as a ‘new’ marketing data need, but it needs more attention regardless. By now, most advertisers have some sort of an identity graph that they use to drive their digital advertising activities. If they are paying attention, they will have noticed the scale and quality of their graph is degrading and will soon be at unacceptable levels without overt action. This will severely degrade their ability to effectively target on the open web reducing reach and driving up costs. Just as importantly, that same historical cause-and-effect behavioral data is used by your AI superhero that you are counting on to save your marketing future.

Regardless of the ultimate fate of 3rd party cookies, mobile ad ids, and IP addresses, there will surely be some level of opted-in and registration-based ids available for open web targeting and modeling (e.g. UID 2.0, RampID, ID5, etc.). I suggest that folks assess these so-called alternative ids, if only to buy time to get their act together on this new impending reality. Worst case, these ids can provide access to cheaper inventory for browsers that lack 3rd party cookies, aid in their testing of Google’s privacy sandbox, and ultimately ensure that you will have adequate sample for the AI models that you will be increasingly dependent on.

Feedback Data

Your marketing teams learn from their mistakes and repeat what works. Will your AI? It should, but you should eliminate all doubt and overtly tell it what you did and how well it worked. Inject the actions taken, vetted results, key metrics, extenuating circumstances, etc. It is even okay to provide expert qualitative feedback. If an ad creative was not well received by consumers, retailers, or executives, inject that too. Your marketing mix optimization logic should be told that your CTV ad creative itself was bad and that it should not drag down the relative performance of the entire TV media channel.

The gold standard of feedback is what is often referred to as ‘closed loop’ marketing, where campaigns are planned, executed, and measured in a single flow and where historical campaign measurements are leveraged for future campaigns. The next big goal should be to break down internal marketing silos that are ‘out of the loop’ and share more data feedback across marketing channels, teams, and use cases. This may get harder as media and retailer partners restrict data access to clean rooms and commercial and privacy restrictions limit what data can be taken out of those clean rooms. That should not stop your goal to share learning and best practices across your marketing enterprise.

Beyond media campaign feedback, there may be an even bigger opportunity to share data across currently disparate internal marketing functions. The idea here is that ultimately all data used across all marketing decisions boils down to people buying products at a place and price and potentially promotion (i.e. 4P’s, remember them?). Sadly, many great insights that inform one marketing function are invisible to or reinvented by another marketing function. I suggest that you piggyback on all the AI attention and efficiency pursuits to create more cross-marketing data synergies. Some examples I have seen work… Enable the assortment team to share consumer-decision-tree outputs to drive audience development and personalization. Help the audience team share audience composition details with the measurement team to optimize future audience buying strategies. Encourage the individual retailer merchandising, category, shopper, and media teams to share feedback on their tactics, assumptions, and investments to optimize the overall shopper experience and retailer relationship.

Human Data

No, not more data about humans, but data from humans. Your human experts may be slow and expensive compared to AI, but they know their stuff. The key to exceeding the very high expectations of AI is to mentor it. Put your unwritten rules, battle wounds, domain knowledge, and other cheat codes into the data. Your coaching will help your AI find more good stuff and suppress the bad stuff.

Human data can come in many forms. Sometimes it’s just tagging data with a qualitative score (e.g. great, good, okay, bad). Other times it’s a domain-specific thesaurus (e.g. chocolate = cocoa = fudge = mocha = choccy = truffle). These little data tips can be particularly powerful for AI because sometimes the patterns it finds are too small to surface individually, but if you expertly group the signals, AI can more confidently surface them. Every brand will have dozens of unique examples that will need expert curation and maintenance, but do not overlook the simple ones. Even basic things like classifying time into themes that humans intuitively know can sometimes help surface more signals. For example, most people know that 7am Monday, 8:30am Tuesday and 6:30am Wednesday are all morning commute times, and Saturday 8am is not. Food manufacturers know that consumers reading a dinner recipe at 9am on a Saturday morning are likely building a shopping list, which is a prime ad targeting opportunity, but that same person looking at the same dinner recipe at 6pm Tuesday night is more likely to be preparing dinner, which is less awesome for an ad sales conversion that day.
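Here is a small sketch of what those two kinds of human data might look like once written down. The thesaurus entries come from the chocolate example above, while the weekday 6am to 9am commute window is my own assumption for illustration and would need tuning for any real market.

```python
from datetime import datetime

# Domain thesaurus from the example above: every synonym maps to one concept.
THESAURUS = {
    "cocoa": "chocolate",
    "fudge": "chocolate",
    "mocha": "chocolate",
    "choccy": "chocolate",
    "truffle": "chocolate",
}


def concept(term):
    """Collapse a term to its curated concept, or return it lowercased if unknown."""
    key = term.lower()
    return THESAURUS.get(key, key)


def is_morning_commute(ts):
    """Assumed rule: Monday to Friday, from 6:00 up to (not including) 9:00."""
    return ts.weekday() < 5 and 6 <= ts.hour < 9


print(concept("Mocha"))
print(is_morning_commute(datetime(2024, 2, 12, 7, 0)))   # a Monday, 7am
print(is_morning_commute(datetime(2024, 2, 17, 8, 0)))   # a Saturday, 8am
```

Features like these let many small, scattered signals roll up into one theme that a model can actually see.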

Net, your goal is to ensure that the patterns and knowledge that your teams have learned from years of experience are reflected in your AI-based insights, decisions, and automated actions. This requires that you capture, summarize, and integrate that knowledge in a way that AI can easily discover. If your data scientists say this is just feature engineering, it is. Lean in with them.

The End

I hope this was helpful and there was something you can take away and leverage. Sorry if you were looking for some juicy tell-all commentary on individual data companies. This is not the right forum for that, and besides, I could not credibly critique individual providers without first knowing your unique situation and budget. Contact me if you would like my help.

Daniel Cropsey
