A Marketer’s Guide to Evaluating Bidder Performance of Multiple DSPs

Posted on March 05, 2019
By Praveen Rajaretnam, Senior Product Marketing Manager

Dozens of mobile performance-centric demand-side platforms (DSPs) are available in the market. How do you choose between them? Each mobile DSP has different strengths and capabilities, so it is difficult to name one as the best for everyone. The real question is which DSP best fits your requirements.

How do you begin the selection process? A few of the key criteria to consider include:

  • Type of inventory access (banner, native or video; in-app or mobile web; gaming or non-gaming).
  • Self-serve capability (access to campaign setup/management dashboard).
  • Level of transparency (traffic source, traffic type and pricing).
  • Bidding options (CPM, CPC, CPI, CPA, etc.).
  • Reporting capabilities (dashboard, API).
  • Data Management Platform (DMP) integrations (none, built-in, third-party vendor support).
  • Support for anti-fraud vendors (none, built-in, external vendor support).
  • Pricing model/take rate (spend-based or performance-based).
  • Targeting capabilities (app, OS, handset, etc.).

Even after applying these criteria, the most discerning marketer will still be left with a handful of DSPs to choose from.

This brings us to the crucial question: how do you evaluate the shortlisted DSPs on performance? Essentially, you are evaluating the efficiency of the bidder - is it placing the optimum bid on the right users with the right message (creative) at the right time?

Most DSPs share the same traffic sources (for example, they connect to the same ad exchanges and other sources of supply), so campaigns run on several DSPs at once will target the same audience with the same set of creatives. This is likely to cannibalize your ad spend, because you end up bidding against yourself for the same user with the same creative, which drives up CPMs. Worse, you evaluate the DSPs incorrectly and end up making a less-than-ideal choice.

The best way to test or evaluate DSPs is to ensure non-overlapping audiences so you can avoid cannibalization.

Here are three ways to set up the right evaluation process and avoid potential pitfalls.

1. Assign Different Segments to Each DSP

This is the most effective way to evaluate the performance of a DSP’s bidder. If the DSP does remarketing, set up remarketing campaigns in the following way:

  1. Divide the user list (say, 30-day dormant users, i.e., users who installed the app but have not purchased in the last 30 days) randomly among DSPs.* This ensures a non-overlapping audience.
  2. Run each campaign for a minimum duration of 45 days. This is required as it takes at least two to three cycles to optimize towards key performance indicators (with each cycle taking anywhere between seven and 10 days).
  3. Consider only the last seven days’ performance (say, cost per transaction and scale) for comparison.

* This requires the app to have a large user base and monthly active user (MAU) count. For example, if you are working with three DSPs, the 30-day dormant user count needs to be fairly large. Otherwise, it becomes a needle-in-a-haystack problem for DSPs and the test results might not be statistically significant.
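Steps 1 and 3 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the user ID format, the DSP labels and the (day, spend, transactions) row layout are all hypothetical, and in practice the split would usually happen in whatever tool exports your audience lists.

```python
import random
from datetime import date, timedelta


def split_users(user_ids, dsps, seed=2019):
    """Randomly assign every user to exactly one DSP (no overlap)."""
    shuffled = list(user_ids)
    # A fixed seed (an arbitrary choice here) keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    # Deal users out round-robin so group sizes differ by at most one.
    return {dsp: shuffled[i::len(dsps)] for i, dsp in enumerate(dsps)}


def cost_per_transaction(daily_rows, end_day, window_days=7):
    """Cost per transaction over the last `window_days` days ending on `end_day`.

    `daily_rows` is a list of (day, spend, transactions) tuples for one DSP.
    Returns None when the window contains no transactions.
    """
    start_day = end_day - timedelta(days=window_days - 1)
    in_window = [(s, t) for d, s, t in daily_rows if start_day <= d <= end_day]
    spend = sum(s for s, _ in in_window)
    transactions = sum(t for _, t in in_window)
    return spend / transactions if transactions else None


# Hypothetical dormant-user list and DSP labels, for illustration only.
dormant_users = [f"user-{n}" for n in range(10)]
segments = split_users(dormant_users, ["dsp_a", "dsp_b", "dsp_c"])
```

Each DSP then receives only its own segment as a targeting list, and only the final seven days of each 45-day run are passed to the comparison.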

2. Geographic A/B Test

If the above option is not feasible (either because the DSPs do not offer remarketing or because they use different bidders for remarketing and user acquisition), we recommend a geographic A/B test.

  1. In this case, each DSP is given a different region, usually at the city level. This is because the location data derived from ad requests is typically not reliable enough for finer granularity.
  2. The cities should be split among DSPs so as to minimize or account for behavioral variations. For example, allocate New York and LA to DSP 1 and Philadelphia and San Francisco to DSP 2, so that each DSP gets one city from each coast.
  3. After 30 days, swap the regions assigned to each DSP.
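The region swap in the last step can be sketched as follows. The DSP labels and function names are hypothetical, and pooling each DSP's spend and transactions across both periods is one possible way to compare the results, not something prescribed above.

```python
def rotate_regions(assignment):
    """Give each DSP the city group previously held by the next DSP.

    With two DSPs this is a plain swap of their regions.
    """
    dsps = list(assignment)
    city_groups = [assignment[dsp] for dsp in dsps]
    rotated = city_groups[1:] + city_groups[:1]
    return dict(zip(dsps, rotated))


def pooled_cost_per_transaction(period_results):
    """Combine (spend, transactions) pairs from several periods for one DSP.

    Assumption: pooling both periods evens out regional differences,
    since every DSP has then run in every region.
    """
    spend = sum(s for s, _ in period_results)
    transactions = sum(t for _, t in period_results)
    return spend / transactions if transactions else None


# City split taken from the example above; DSP labels are illustrative.
period_1 = {
    "dsp_1": ["New York", "LA"],
    "dsp_2": ["Philadelphia", "San Francisco"],
}
period_2 = rotate_regions(period_1)
```

Because every DSP eventually runs in every city group, a DSP that only looks good in one region is easier to spot.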

3. Time-Based A/B Test

If you have only a few cities to target or cannot split the cities without introducing high variance, then a time-based A/B test is the only other option. It's worth noting that this is the most time-consuming and least reliable test of all. Here, each DSP runs alone during a different time period to ensure little or no overlap.

In order for this to work, you need to ensure that there are no major product updates over the duration of the evaluation.

The campaigns can be set up as follows:

  1. Run exclusively with one DSP for at least 30 days.
  2. Then stop and switch to the next.
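As a rough sketch, the back-to-back schedule could be laid out like this. The function name, DSP labels and start date are hypothetical; the 30-day default is the minimum run mentioned above.

```python
from datetime import date, timedelta


def exclusive_windows(dsps, start, days_each=30):
    """Return one non-overlapping (dsp, first_day, last_day) window per DSP."""
    windows = []
    for i, dsp in enumerate(dsps):
        first_day = start + timedelta(days=i * days_each)
        last_day = first_day + timedelta(days=days_each - 1)
        windows.append((dsp, first_day, last_day))
    return windows


# Illustrative start date; each DSP runs alone for 30 days.
schedule = exclusive_windows(["dsp_a", "dsp_b"], date(2019, 4, 1))
for dsp, first_day, last_day in schedule:
    print(f"{dsp}: {first_day} to {last_day}")
```

Planning the windows up front also makes it easier to check that no major product update falls inside any of them.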

This test is generally not recommended unless it’s for a mature app doing a head-to-head comparison between two DSPs, with no major product updates being pushed during that period.

Note: All the above tests assume that the advertiser adopts a Last Click/Last View attribution model.

Besides the bidder, other aspects of a DSP can also be evaluated during these tests, most critically the quality of its audience data and the level of support and service it offers.

Look for concierge-like support and for vendors who are more focused on customer service than on their own profitability. Unexpected or opaque fees, outages, deliverability issues, slow response times and unacceptable answers are all red flags.

About the Author

Praveen Rajaretnam has over a decade of experience in mobile marketing and growth marketing. He started his career as an engineer at a cyber-security firm, working on automation and performance testing. Praveen also started a social-commerce firm, running marketing and growth strategies there. He spends considerable time researching anti-fraud methodologies, attribution mechanisms and real-time bidding mechanisms.
