
When Data Lies: The Problem with Univariate Analysis

by Charles Moehnke   |   Oct 26, 2021

As digital marketers, we turn to data to tell us when our efforts are working and when it’s time to make a change.

We face decisions like:

Do we up our ad spend this month? Or should we change the CTA on our homepage?

The answers are in the data… right?

Whether or not it has been intentionally manipulated, data can be a compass pointing in the wrong direction, leading to mistaken claims. When this happens, our best-intentioned, data-driven strategic steps become a skewed course of action that may not be in the best interest of the business.

Enter univariate analysis, a frequent culprit behind data misinterpretation.

What is univariate analysis?

As we use the term here, univariate analysis is the technique of examining how a response variable depends on a single predictor. This method is one-dimensional: it does not account for how the response variable may be affected by variables other than the predictor.

These “other” variables are called confounding variables. A confounding variable influences both the predictor and the response, and it is likely confusing the cause-and-effect relationships within your data set.

Without considering confounding variables, the surface-level results of a univariate analysis may not accurately represent your digital marketing efforts.

How can univariate analysis skew digital marketing data?

Let’s look at a hypothetical example we might run across in our digital marketing data.

We conduct a univariate analysis to look at our website’s conversion rate by device type. In this analysis, device type (desktop or mobile) is the single predictor and total conversions/conversion rate is the response variable:

We can see that our mobile website converts at 77% the rate of the desktop site.

From this report, we may decide to invest in mobile conversion rate optimization (CRO) opportunities.

In the univariate analysis, you’ll notice that device type, as the single predictor, is the only variable the analysis considers as an influence on conversion rate.

Look at what happens when we conduct a multivariate analysis by considering age as a confounding variable to determine conversion rate by device type:

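For each age group, mobile conversion rate exceeds desktop conversion rate. Prior to accounting for age, the data implied the opposite. The reversal happens because older people are far more likely to convert, but also far less likely to use mobile devices, so desktop traffic is weighted toward the visitors who convert most. Statisticians call this pattern Simpson’s paradox.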
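To make the reversal concrete, here is a minimal Python sketch. The session and conversion counts are invented for illustration and are not the data behind this example; the `conversion_rates` helper is also hypothetical. It computes conversion rate by device alone, then by age group and device together.

```python
from collections import defaultdict

# Hypothetical (age_group, device, sessions, conversions) rows.
# All numbers are invented for illustration only.
ROWS = [
    ("18-34", "desktop", 1000, 10),
    ("18-34", "mobile", 4000, 60),
    ("35-54", "desktop", 2000, 60),
    ("35-54", "mobile", 2000, 70),
    ("55+", "desktop", 4000, 200),
    ("55+", "mobile", 1000, 60),
]


def conversion_rates(rows, key):
    """Sum sessions and conversions per group, then return rate per group."""
    sessions = defaultdict(int)
    conversions = defaultdict(int)
    for age, device, n_sessions, n_conversions in rows:
        group = key(age, device)
        sessions[group] += n_sessions
        conversions[group] += n_conversions
    return {group: conversions[group] / sessions[group] for group in sessions}


# Univariate view: device type is the only predictor.
by_device = conversion_rates(ROWS, key=lambda age, device: device)

# Multivariate view: device type within each age group.
by_age_and_device = conversion_rates(ROWS, key=lambda age, device: (age, device))

for device, rate in sorted(by_device.items()):
    print(f"all ages  {device:8s} {rate:.2%}")
for (age, device), rate in sorted(by_age_and_device.items()):
    print(f"{age:9s} {device:8s} {rate:.2%}")
```

With these made-up numbers, desktop wins when ages are pooled, yet mobile wins inside every age group, because most desktop sessions come from the age group that converts best.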

While improving mobile conversion rates will help the business overall, the notion that something is “wrong” with the mobile website is incorrect. Instead, accounting for age reveals opportunities to find ways to improve conversion rates with younger consumers, or improve the performance of the desktop website, which receives more overall traffic.

And of course, there may be other impactful confounding variables requiring further analysis before the data can be used as a launching point for your next SEO update or PPC campaign. In this example, considering the influence of social media on the younger demographic or the audience’s geographic location could create a more accurate portrayal of the website’s conversion rate.

What can we do to prevent data misinterpretations?

Misleading data can be easy to come by, but hard to spot.

Avoid data misinterpretations with these helpful tips:

1. Slice and dice your data

Always look deeper: segment the data by other dimensions and check whether any of them explains performance better than the one you started with.

Performance is often strongly influenced by:

  • Age

  • Customer segment

  • Geography

  • Marketing channel, source/medium, keywords

  • Income levels

  • Gender

  • Device type

  • Landing page
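As a rough sketch of what slicing and dicing can look like in practice, the Python below computes conversion rate along several dimensions and ranks the dimensions by how far apart their segments are. The rows, the field names, and the `rate_by` and `spread` helpers are all hypothetical, and the numbers are invented. A wide spread does not prove a dimension is a confounder; it only flags it as worth crossing with your original predictor.

```python
from collections import defaultdict

# Hypothetical aggregated rows; every value is invented for illustration.
ROWS = [
    {"device": "desktop", "age": "18-34", "channel": "paid", "sessions": 500, "conversions": 5},
    {"device": "desktop", "age": "18-34", "channel": "organic", "sessions": 500, "conversions": 5},
    {"device": "mobile", "age": "18-34", "channel": "paid", "sessions": 2000, "conversions": 30},
    {"device": "mobile", "age": "18-34", "channel": "organic", "sessions": 2000, "conversions": 30},
    {"device": "desktop", "age": "55+", "channel": "paid", "sessions": 2000, "conversions": 104},
    {"device": "desktop", "age": "55+", "channel": "organic", "sessions": 2000, "conversions": 96},
    {"device": "mobile", "age": "55+", "channel": "paid", "sessions": 500, "conversions": 30},
    {"device": "mobile", "age": "55+", "channel": "organic", "sessions": 500, "conversions": 30},
]


def rate_by(rows, dimension):
    """Conversion rate for each value of one dimension."""
    totals = defaultdict(lambda: [0, 0])  # value -> [sessions, conversions]
    for row in rows:
        bucket = totals[row[dimension]]
        bucket[0] += row["sessions"]
        bucket[1] += row["conversions"]
    return {value: conv / sess for value, (sess, conv) in totals.items()}


def spread(rates):
    """Gap between the best and worst converting segment."""
    return max(rates.values()) - min(rates.values())


DIMENSIONS = ["device", "age", "channel"]
report = {dim: rate_by(ROWS, dim) for dim in DIMENSIONS}

# Dimensions with the widest gap come first.
ranked = sorted(DIMENSIONS, key=lambda dim: spread(report[dim]), reverse=True)

for dim in ranked:
    print(dim, {value: f"{rate:.2%}" for value, rate in report[dim].items()})
```

In this invented data set, age separates conversion rates more than device does, which is the cue to look at device within each age group rather than on its own.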

2. Use your marketing know-how

Does the data trend make sense as you understand the business?

If not, dig deeper.

If so, create your hypothesis and determine if there are other ways the data could be used to confirm or disprove that hypothesis, or devise a test that would more accurately isolate the effect.
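One way to isolate an effect without running a new experiment is direct standardization: compare devices as if each had the same age mix. This is our illustrative assumption about what such a check could look like, not a prescribed method. The data and the `raw_rates` and `age_adjusted_rates` helpers below are hypothetical.

```python
from collections import defaultdict

# Hypothetical (age_group, device, sessions, conversions) rows.
# All numbers are invented; each (age_group, device) pair appears exactly once.
ROWS = [
    ("18-34", "desktop", 1000, 10),
    ("18-34", "mobile", 4000, 60),
    ("35-54", "desktop", 2000, 60),
    ("35-54", "mobile", 2000, 70),
    ("55+", "desktop", 4000, 200),
    ("55+", "mobile", 1000, 60),
]


def raw_rates(rows):
    """Unadjusted conversion rate per device, pooling all ages."""
    sessions = defaultdict(int)
    conversions = defaultdict(int)
    for _age, device, n_sessions, n_conversions in rows:
        sessions[device] += n_sessions
        conversions[device] += n_conversions
    return {device: conversions[device] / sessions[device] for device in sessions}


def age_adjusted_rates(rows):
    """Weight each device's age-specific rate by the overall age mix.

    Assumes every device has data for every age group.
    """
    age_sessions = defaultdict(int)
    for age, _device, n_sessions, _n_conversions in rows:
        age_sessions[age] += n_sessions
    total_sessions = sum(age_sessions.values())

    adjusted = defaultdict(float)
    for age, device, n_sessions, n_conversions in rows:
        weight = age_sessions[age] / total_sessions  # share of all traffic
        adjusted[device] += weight * (n_conversions / n_sessions)
    return dict(adjusted)


raw = raw_rates(ROWS)
adjusted = age_adjusted_rates(ROWS)

for device in sorted(raw):
    print(f"{device:8s} raw {raw[device]:.2%}  age-adjusted {adjusted[device]:.2%}")
```

If the ranking of devices flips between the raw and adjusted columns, as it does with these made-up numbers, the raw gap is being driven by who uses each device rather than by the device experience itself.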

3. Leverage machine learning and automation where available

While not immune to statistical issues, especially around sample bias, artificial intelligence and machine learning are well suited to multivariate analysis. Algorithms are great at identifying data clusters and determining where correlation is strongest.

At Workshop Digital, we use automated tools like Google Ads Smart Bidding and Data-Driven Attribution, as well as Python.

Always dig deeper

Data is a great tool for understanding digital marketing wins (and losses), but it can be misleading. Be skeptical.

Dig deeper when you’re presented with data, and ask yourself: are all the variables that may affect the outcome accounted for in the analysis? And remember, never accept an analysis that hasn’t accounted for the influence of confounding variables.
