Calculating Click Through Rates for SEO, based on Google Search Console Data (in R)

Updated June 4, 2019: Added a YouTube video which will guide you through the setup process in RStudio and how to run the script yourself.


Averages lie, and average click-through rates aren’t very helpful. There, I said it. What I do believe is that you can calculate meaningful click-through rates (CTR) for your own site using data from Google Search Console, especially with R. So let’s dive into how this works and what you can do with it!

Why ‘Averages’ Lie

Look at the graph below: it shows a great CTR for position 1. Yet this research, for example, puts the average CTR for position 1 at 24% (AWR data, Feb 2019). Which one is correct? Neither. It really depends on the industry and on which features show up in the search results and might decrease CTR (think SERP features like local packs and news results). All of this makes it hard to predict what you could expect if you ranked higher for a set of keywords in your industry.

So while I was working at Postmates on ranking certain category pages better, we decided to calculate our own CTRs and were intrigued by how far they were off from the research (the research isn’t wrong, it’s just generalized across industries). With that data in hand, we were able to make better forecasts of how much traffic we could expect if rankings increased in that segment. In the rest of this post, I’ll go more in-depth on how we calculated this.
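To make the forecasting idea concrete, here is a minimal sketch in R. The CTR values, keywords, and impression counts are invented for illustration, `estimate_clicks` is a hypothetical helper, and the sketch assumes impressions stay roughly the same when a keyword moves up a few positions.

```r
# Hypothetical CTR curve for one segment: average CTR per rounded position.
# These values are invented for illustration only.
ctr_curve <- c(`1` = 0.32, `2` = 0.18, `3` = 0.11, `4` = 0.07, `5` = 0.05)

# Estimated clicks = impressions x CTR at the given position.
# Assumption: impressions do not change when the position changes.
estimate_clicks <- function(impressions, position, curve = ctr_curve) {
  unname(impressions * curve[as.character(position)])
}

# Made-up keywords with impressions, current position and target position.
keywords <- data.frame(
  query       = c("keyword a", "keyword b"),
  impressions = c(10000, 4000),
  current_pos = c(4, 5),
  target_pos  = c(2, 3)
)

keywords$clicks_now    <- estimate_clicks(keywords$impressions, keywords$current_pos)
keywords$clicks_target <- estimate_clicks(keywords$impressions, keywords$target_pos)
keywords$extra_clicks  <- keywords$clicks_target - keywords$clicks_now

print(keywords)
```

Swapping the invented curve for one calculated from your own Search Console data is what makes the estimate specific to your segment instead of an industry-wide average.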

Using Google Search Console Data

Visual of Google Search Console Report (this is not RVshare data)

You’ve probably seen this report in Google Search Console. It gives you a detailed view of the performance of your keywords and their average position. In this graph, we see something positive: a CTR and position that improve slightly over time. But what if you want to know the average CTR per position for a certain segment of keywords? That’s much harder to do in the interface. Because of that, I used the R script from Mark Edmondson that he wrote about here almost three years ago.

It helps you extract the raw data from Google Search Console so you can process it and create your own visualizations (like the one we’ll talk about next).
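As a rough idea of what that extraction can look like, here is a sketch built on the searchConsoleR package. It is not the script itself: `brand_filters` and `fetch_queries` are hypothetical helpers, the property URL and brand term are placeholders, and the call needs an interactive Google login before it returns anything.

```r
# Turn a vector of terms into "does not contain" filters on the query dimension.
# brand_filters() is a hypothetical helper, not part of searchConsoleR.
brand_filters <- function(terms) {
  paste0("query!~", terms)
}

# Hypothetical wrapper around searchConsoleR::search_analytics().
# Requires the searchConsoleR package and an interactive Google login.
fetch_queries <- function(site_url, brand_terms, row_limit = 50000) {
  searchConsoleR::scr_auth()
  searchConsoleR::search_analytics(
    siteURL            = site_url,
    startDate          = Sys.Date() - 93,
    endDate            = Sys.Date() - 3,
    dimensions         = "query",
    searchType         = "web",
    dimensionFilterExp = brand_filters(brand_terms),
    rowLimit           = row_limit
  )
}

# Example call (placeholder property and brand term, not run here):
# gsc <- fetch_queries("https://www.example.com", c("examplebrand"))
```

The result is a data frame with one row per query, including clicks, impressions, CTR and average position, which is the raw material for a CTR curve.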

Visualizing CTR Curves in R

CTR curve: CTR per position (note: this is randomized data for an unknown site)

So let’s dive right into how you can do this yourself. I’ll provide you with the full R script; you’ll need to download RStudio yourself in step 1.

  1. Download and Install RStudio
  2. Download the following .r script from Gist
  3. Run these commands to install the right packages for RStudio:
    1. install.packages("googleAuthR")
    2. install.packages("searchConsoleR")
    3. install.packages("dplyr")
    4. install.packages("ggplot2") if necessary
  4. Line #21 – Change this to the property name from Google Search Console
  5. Line #25 – Not necessary: If you want the CTR curve for positions over 20, change the number.
  6. Line #40 – Recommended: Exclude the word(s) that are part of your brand name, so you get the right CTR curve for non-branded keywords only.
  7. Line #41 – Not necessary: The script takes a ‘sample’ of 50,000 keywords to calculate your CTR curve from. You can increase this limit if needed; if you have fewer than 50,000 keywords, it’s not an issue.
  8. Run the script! The output should be a visual as shown earlier in this post
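If you’re curious what the core calculation boils down to, here is a minimal sketch in base R on made-up data. It is not the Gist itself: the sample rows are invented, the column names `positionRound` and `CTRmean` are borrowed from the plotting call, and using an unweighted mean of per-keyword CTR is an assumption.

```r
# Made-up sample shaped like Search Console query data.
gsc <- data.frame(
  query       = c("a", "b", "c", "d", "e", "f"),
  clicks      = c(30, 20, 12, 8, 5, 1),
  impressions = c(100, 100, 100, 100, 100, 100),
  position    = c(1.0, 1.3, 2.4, 1.8, 3.1, 25.0)
)

max_position <- 20  # ignore keywords ranking beyond this position

# CTR per keyword, and the average position rounded to a whole number.
gsc$ctr           <- gsc$clicks / gsc$impressions
gsc$positionRound <- round(gsc$position)

# Keep positions up to the cutoff, then average CTR per rounded position.
# Assumption: a simple unweighted mean across keywords.
in_range    <- gsc[gsc$positionRound <= max_position, ]
click_curve <- aggregate(ctr ~ positionRound, data = in_range, FUN = mean)
names(click_curve)[names(click_curve) == "ctr"] <- "CTRmean"

print(click_curve)

# With ggplot2 installed, the curve could be plotted along these lines:
# library(ggplot2)
# ggplot(click_curve, aes(positionRound, CTRmean)) + geom_point() + geom_line()
```

Weighting the mean by impressions would be a reasonable alternative when a handful of high-volume keywords dominate a segment.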

Want to take a deep breath and let me help you go over this again? I’ve made a quick screen share video of what to do in RStudio and how to use the R script.

Hopefully, you now have a better understanding of what the actual CTR is for your own site. You can use this to visualize CTR curves for specific parts of your site or for pages that have a similar meta description. Over time you could use this, for example, to measure the impact of changes on CTR.

Credit where credit is due! There are many use cases for CTR data visualized with R, and I’m grateful that Mark Edmondson opened my eyes to this a while ago. Credits also go to Tim Wilson’s documentation on using R and improving visualizations.

Want to read this article in Spanish? Read it here.


Comments

  1. Step 3 “Run these commands to install the right packages for RStudio:”
    Is that in Terminal or RStudio?

    • Martijn Scheijbeler
      April 24, 2019 - 8:42 pm

      You will run this in RStudio, it’s an R package :). You can easily do that through their interface.

  2. Hi! Thanks for sharing this!

    I’m a bit of a newb with R programming. Wondering if you can help point me in the right direction.

    I went through all your steps, but I am getting:

    Error: object ‘hh’ not found

    Error in ggplot(click_curve, aes(positionRound, CTRmean)) :
    object ‘click_curve’ not found

    Any idea how i can address this? Thanks so much!

  3. Nicely written article and I have used the script and it works well. I have created a curve for Non branded and now I want to create a curve for a set of generic keywords. Do you know how to do this as it looks like I can’t put an OR in the dimensionFilterExp.

    • Martijn Scheijbeler
      May 14, 2019 - 10:49 pm

      Good question Iwan, what you could do is gather all the data and then run a filter on the keywords that you want to include. It’s a bit more complicated though as you’re basically changing the dataset that you want to include. If it’s something that you can do with a regular expression (I assume not), then I would do it within the dimensionFilterExp.
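A minimal sketch of that "gather everything, then filter" approach, in base R. The sample queries and the list of generic terms are made up, and the data frame only stands in for what the Search Console export would return.

```r
# Made-up sample standing in for the full, unfiltered query export.
gsc <- data.frame(
  query       = c("rent a camper", "camper rental prices", "brand login", "cheap rv hire"),
  clicks      = c(40, 10, 90, 5),
  impressions = c(400, 200, 300, 100),
  stringsAsFactors = FALSE
)

# Hypothetical set of generic terms; any of them may match (a logical OR).
generic_terms <- c("camper", "rv")

# Build one regular expression with word boundaries, e.g. \b(camper|rv)\b
pattern <- paste0("\\b(", paste(generic_terms, collapse = "|"), ")\\b")

# Keep only the rows whose query contains at least one generic term.
generic <- gsc[grepl(pattern, gsc$query, ignore.case = TRUE, perl = TRUE), ]

print(generic)
```

The filtered data frame can then go through the same CTR-per-position calculation as the full dataset.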

  4. Great! Hope to see more post about SEO, R and analysis. Thanks

  5. Really cool article, I am new to RStudio. Once I ran the script I get:

    [1] “Domain Name” Just redacted my client domain here

    Any thoughts on what I fudged up?

  6. Hi Martin –

    This is a very useful projection!

    When I run the CTR plot, I get the error:

    Warning message:
    Removed 2 rows containing missing values (geom_point).

    Any thoughts on what is causing this?

  7. Nice!

    How do I exclude keywords in this line?
    dimensionFilterExp = c("query!~share")

    • Martijn Scheijbeler
      August 20, 2019 - 1:30 am

      It’s easy Jeroen, just do: c("query!~share", "query!~yourquery"). It’s not unlimited but you can add quite a few.

  8. very nice. I began to understand a little about the data exposed on search engines
