The Media Trade Upgrade: Probabilistic vs. Deterministic, Simplified
By Reagan McNameeKing
Welcome to the second edition of The Media Trade Upgrade, the series giving programmatic traders essential info about the latest and greatest in ad tech – all in quick, bite-sized pieces. In today’s blog post, we’ll explain the differences between deterministic and probabilistic data, what they mean for audiences, and how they affect campaign planning for programmatic traders. Ad tech has a reputation for jargon, but (for once) we didn’t make up these two: probabilistic and deterministic data have applications in many industries, academia, and other institutions. Don’t worry – we’re just talking about advertising today.
Estimated reading time: 5 minutes
Trending: Probabilistic and Deterministic
Using probabilistic and deterministic data to target audiences is nothing new. But, with third-party cookies becoming an endangered species, marketers need to make choices about alternative identifiers and identity solutions. By the end of 2022, 100+ identity solutions were available, according to Prohaska Consulting’s most recent Identity Partner List. What does this mean for advertisers?
- 63% of advertisers are experimenting with one or more solutions to reach cookieless audiences (or are actively planning to do so).
- 57% of advertisers who haven’t adopted a cookieless solution say it’s because they don’t know how the solutions work.
(Source: ID5 – The State of Digital Identity 2022)
Programmatic traders increasingly need to participate in identity-related conversations. Understanding the nuances of probabilistic and deterministic audiences is key to these discussions. And now that the industry is tasked with finding alternatives to third-party cookies, the words probabilistic and deterministic are even more prevalent in team meetings and client conversations. Read on to stay in the know on this essential vocab.
Definition: Deterministic audiences are groups created by data supplied directly from users. Deterministic data is also known as validated or qualified data.
What it really means: These data points are usually accurate at the time of collection because they’re personally identifiable. Deterministic audiences are created from a snapshot of data, so accuracy may fade over time.
Examples of deterministic data:
- Name and address
- Phone number
Deterministic data may include gender, location, age, and other personal details. For example, users tracking exercise activity in a fitness app might give the app publisher deterministic data like how often they run, bike, or do yoga. Depending on the user’s data privacy preferences and app permissions, the app publisher could use this information to create a data-confirmed cohort. Email addresses could also be used for a deterministic ID match.
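For the technically curious, here is a minimal sketch of what a deterministic ID match on email addresses could look like. It assumes both parties normalize and hash addresses with SHA-256 before comparing them, which is a common practice but not something every identity solution does. The function names and sample addresses are hypothetical.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so the same address always hashes the same way."""
    return email.strip().lower()


def hash_email(email: str) -> str:
    """Return a SHA-256 hex digest of the normalized email address."""
    return hashlib.sha256(normalize_email(email).encode("utf-8")).hexdigest()


def deterministic_match(advertiser_emails, publisher_emails):
    """Return the hashed IDs present in both lists (exact matches only, no inference)."""
    advertiser_ids = {hash_email(e) for e in advertiser_emails}
    publisher_ids = {hash_email(e) for e in publisher_emails}
    return advertiser_ids & publisher_ids


# Hypothetical sample data for illustration only.
advertiser_crm = ["Runner@Example.com ", "yogi@example.com"]
fitness_app_users = ["runner@example.com", "cyclist@example.com"]

matched = deterministic_match(advertiser_crm, fitness_app_users)
print(len(matched))  # 1
```

The match is all-or-nothing: an address either appears on both sides or it doesn’t. That is where the accuracy comes from, and also why reach is limited to users both parties already know.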
Is deterministic data the same as first-party data? It depends. Deterministic data like an email address supplied by a user falls under the umbrella of first-party data. Some data that isn’t user-supplied, such as transaction history, is still first-party (even though it’s not technically deterministic).
Implications for campaign planning:
Deterministic audiences make targeting (and retargeting) more accurate and effective, which can make ad spend more efficient. However, because first-party data is limited, deterministic audiences aren’t as scalable and limit campaign reach.
Definition: Probabilistic audiences are groups with a high probability of being accurately profiled. These audiences rely on models built on a subset of deterministic data to identify a larger target audience. They may be categorized at the individual or cohort level.
What it really means: These audiences are based on probabilities, but they can also be validated against deterministic data. Probabilistic audiences may be an alternative to purely deterministic ones, but they can’t exist without deterministic data.
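To make the “model built on a subset of deterministic data” idea concrete, here is a deliberately tiny sketch. It assumes a single signal (device type) and a hypothetical seed of users who confirmed whether they run; real solutions use far richer signals and models. All names, data, and the 0.6 threshold are made up for illustration.

```python
from collections import defaultdict

# Hypothetical deterministic seed: (device_type, user confirmed they are a runner)
seed = [
    ("smartwatch", True), ("smartwatch", True), ("smartwatch", False),
    ("desktop", False), ("desktop", False), ("desktop", True),
    ("phone", True), ("phone", False),
]


def fit_rates(labeled):
    """Estimate P(runner | device_type) from the deterministic seed."""
    counts = defaultdict(lambda: [0, 0])  # device -> [runners, total]
    for device, is_runner in labeled:
        counts[device][0] += int(is_runner)
        counts[device][1] += 1
    return {device: runners / total for device, (runners, total) in counts.items()}


def build_audience(unlabeled, rates, threshold=0.6):
    """Keep users whose estimated probability meets the (assumed) threshold."""
    return [uid for uid, device in unlabeled if rates.get(device, 0.0) >= threshold]


rates = fit_rates(seed)

# Hypothetical users with no deterministic data, only a device signal.
unknown_users = [("u1", "smartwatch"), ("u2", "desktop"), ("u3", "phone"), ("u4", "tablet")]
audience = build_audience(unknown_users, rates)
print(audience)  # ['u1']
```

Note that the tablet user is scored at zero because the seed never saw a tablet: the model is only as good as the deterministic data behind it.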
Examples of probabilistic data:
- Non-unique device characteristics
- IP address
- Browser or device type
Some probabilistic solutions also recreate identity using implicit signals such as IP address or a device user agent. While probabilistic audiences don’t have the surefire accuracy of audiences built with deterministic data, they are far more scalable.
Let’s say a user is streaming video content on an Xbox. We don’t have a personal identifier and have no way of knowing who owns the console. Still, we can use the data points we do have, such as the device type and the time of day, to categorize the user into a probabilistic audience and match them to relevant advertising content. In this example, we can infer that the user is probably a gamer. The fact that they’re streaming at 2 a.m. means there’s a good chance they’ll fit in our college-aged audience interested in late-night food delivery services.
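Here is one way the Xbox example could be sketched in code. It assumes the two signals are independent and combines them with a simple odds update (naive Bayes style). Every number below is hypothetical, as are the function and signal names; a production model would learn these values from data rather than hard-code them.

```python
def update_probability(prior, likelihood_ratios):
    """Combine a prior with independent signals by multiplying odds."""
    odds = prior / (1 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1 + odds)


# Hypothetical numbers, not measured values.
PRIOR = 0.10  # assumed share of all viewers who belong to the target audience
SIGNAL_RATIOS = {
    # Assumed likelihood ratios: how much more common the signal is
    # among audience members than among everyone else.
    "game_console": 2.0,
    "late_night": 3.0,
}


def score_impression(device_type, hour):
    """Estimate the probability that this impression belongs to the audience."""
    ratios = []
    if device_type == "game_console":
        ratios.append(SIGNAL_RATIOS["game_console"])
    if 0 <= hour < 4:  # midnight up to (not including) 4 a.m.
        ratios.append(SIGNAL_RATIOS["late_night"])
    return update_probability(PRIOR, ratios)


print(round(score_impression("game_console", hour=2), 2))  # 0.4
print(round(score_impression("desktop", hour=14), 2))      # 0.1
```

With these made-up inputs, the 2 a.m. console viewer is four times as likely as the average viewer to fit the audience, yet still far from certain. That gap between “likely” and “known” is the trade-off probabilistic audiences make in exchange for scale.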
Implications for campaign planning:
Audiences built with probabilistic data are at the heart of contextual targeting. Call us biased, but here’s a great example: Contextual solutions like Visual Intent use machine learning to target audiences based on how likely they are to consume content about certain topics. For advertisers, this creates an opportunity to place their ads alongside premium visual content, thanks to Getty Images’ exclusive partnership with Verve Group.
The scalability of probabilistic audiences makes them ideal for awareness campaigns, but technologies like Moments.AI have already proven applications for more targeted conversion-oriented campaigns as well.
According to ID5’s 2022 State of Digital Identity Report, 81% of DSPs and 79% of SSPs have already implemented one or more identity solutions. As a marketer, make sure you understand what your preferred DSP and other technologies use or integrate with.
Pros and Cons of Deterministic vs. Probabilistic Data
Which type of audience should marketers use? Realistically, you’ll use both. Here are some of the benefits and drawbacks of probabilistic vs. deterministic audiences:
| |Deterministic Audiences|Probabilistic Audiences|
|Pros|High accuracy = improved targeting|Predictive targeting can mean reaching audiences earlier in the buyer journey|
|Cons|Risk of using outdated data|Only as good as the deterministic data used to build the model|
TL;DR: Talking Points About Identity in Ad Tech
- It’s only a matter of time before cookies are a thing of the past. Many advertisers are ready to dip their toes into alternative cookieless identifiers.
- Identity through informed consent will be essential to test and validate solutions.
- Deterministic data is increasingly scarce. Unaddressable audiences will only keep growing, making it critical for the industry to test and redefine privacy-first strategies and tools.
More to the Identity Story
Terms like probabilistic and deterministic are the tip of the iceberg when it comes to understanding the identity landscape. Verve Group is committed to providing clear, helpful content to support our friends and colleagues across ad tech, even if you’re not our customer. We have a recent in-depth eBook, Identity: Decoded, and plenty of blog and social content to help demystify identity for marketers, consumers, and the supply side of advertising. Follow along on LinkedIn, or subscribe to our newsletter below!