Announcing my Technical SEO Course on CXL Institute

If there was one thing I could teach people in SEO, it was always the technical side that came up first. Mostly because I think it's a skill that not many SEOs have mastered, and there is already enough (good or bad, you'll be the judge of that) content about the international, link building, or content side of SEO out there. As technical SEO keeps getting more technical and in-depth, I'm excited to announce that I'm launching a new technical SEO course with the folks at CXL Institute.

The course will cover everything from structured data to XML sitemaps and back to some more basic on-page optimization. Along the way, I'll show you my process for auditing a site and coming up with improvements, and I'll try to teach you about as many different issues and solutions as I can think of.

It's not going to be 'the most complete' course ever on this topic; technical SEO evolves quickly, and even though we worked on it for months, some things will likely already be outdated by the time it's published. But I'll do my best to inform you here and on CXL Institute about any changes or improvements that we might make in a future version. If you have any questions about the course, or want to cheer me on, reach out via Twitter: @MartijnSch.


Keyword Gap Analysis: Identifying your competitor’s keywords with opportunity (with SEMrush)

Keyword research can provide you with a lot of insight: no matter what tool you're using, they all give you a great deal of information about your own performance, but also about that of others. While doing some keyword research I thought about writing a bit more about one specific part: gap analysis. In itself it's an easy concept to understand, but it can provide a skewed view of your competition (or not). To demonstrate this, we'll take a look at an actual example of some sites using SEMrush's data.

What does your competition look like?

You know who your competitors are, right? At least your direct ones. But often the people who work in SEO, or at the level above (Marketing/Growth), don't always know who the actual players in search are. I worked in the food delivery industry, and more often than not I was facing more competition from totally random sites or a few big ones than from our direct competitors (for good reason). So it's important to know what your overlap in the search rankings is with other sites (it's one of the reasons you should actually be tracking rankings, but that's a topic for another day). This way you know which competitors are rising or declining in your space and what you can learn from their strategies to apply to your own site. But this is exactly where the caveat lies: is that actually the case!?

So let's look at an example. As you can see in this screenshot from SEMrush, the playing field for Site A is quite large. They're 'ranking' for tens of thousands of keywords and playing in a decently sized industry. While they're ahead of their competition, it's also clear that there are some 'competitors' in the space that are behind them in search visibility.

So let's take the next competitor, we'll call them 'Competitor A'. What we see here is that they rank for 250.000 keywords. Still a significant number compared to what we're ranking for, but it doesn't mean that all of their keywords overlap with ours. So let's dive into gap analysis.

Keyword Gap Analysis

In short, there are three ways to look at a keyword gap analysis:

  • What keywords am I ranking for, that my competitor is also ranking for (overlap)?
  • What keywords am I ranking for, that my competitor is not ranking for (competitive advantage)?
  • What keywords am I not ranking for, that my competitor is ranking for (opportunity)?
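
If it helps to think of these three views in code, here's a minimal R sketch using two made-up keyword vectors (purely illustrative data, not SEMrush output):

# Two hypothetical keyword sets
my_keywords         <- c("pizza delivery", "sushi near me", "late night food")
competitor_keywords <- c("pizza delivery", "burger delivery", "late night food")

intersect(my_keywords, competitor_keywords)  # overlap: both of us rank
setdiff(my_keywords, competitor_keywords)    # competitive advantage: only I rank
setdiff(competitor_keywords, my_keywords)    # opportunity: only my competitor ranks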

Today we'll only talk about the last one: what keywords could I be ranking for, given that my competitor is already ranking for them, to drive more growth? When using SEMrush you can do this by creating a report like this (within the Keyword Gap Analysis feature):

Screenshot of: Creating a keyword gap analysis report in SEMrush

You always have the three options available to select; in this case we'll go with the Common Keywords option. The result that we should see looks something like this:

Screenshot of the result, with a list of keywords.

What are the keywords with actual opportunity?

So there are apparently xx.xxx keywords that I'm not ranking for (and likely should be). That's significant and almost leads me to believe that we're not doing a good job. What's often going on, though, and it's not a bad thing, is that the majority of these keywords are driven by the long tail (specific queries with very low volume). So what ends up happening is that I'm looking at tons of keywords that I don't want to focus on individually (and hopefully will benefit from anyway by creating slightly more generic, good content). When I did this for a competitor and filtered down to keywords where they ranked in the top 20 (position < 20) and that had a monthly volume > 10, I had only 2500 keywords left. That's just a few percent of the keywords we started with. To be clear, I'm not saying you should ignore the other keywords, but now you have the keywords where you have a real opportunity to drive actual results. In the end, you should be able to rank well: your competitor is already ranking (position < 20), there is actual volume (> 10), and you're not in there at all.
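
As a rough illustration of that filtering step, here's a minimal R sketch. It assumes you've exported the gap analysis report to a CSV and uses hypothetical column names (keyword, competitor_position, volume, my_position); the real columns in a SEMrush export will be named differently:

library(dplyr)

# Load the exported gap analysis report (hypothetical file and column names)
gap <- read.csv("keyword-gap-export.csv", stringsAsFactors = FALSE)

# Keep keywords where the competitor ranks in the top 20, there is at least
# some monthly volume, and we don't rank at all
opportunities <- gap %>%
  filter(competitor_position < 20,
         volume > 10,
         is.na(my_position))

nrow(opportunities)  # in my case this left ~2500 keywords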

This is just something I was playing with while exploring some industries, and it's a topic I haven't seen a lot of content about over the last few years. The data is often readily available, and it will both help you come up with new content ideas and help you identify the actual value on the revenue side (keyword volume should turn into business results: impressions x CTR x conversion rate == $$$).
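
As a quick back-of-the-envelope example of that formula, here's a tiny R calculation with made-up numbers (and an assumed average order value to turn conversions into dollars):

# All numbers below are hypothetical, for illustration only
monthly_impressions <- 2000   # searches/impressions for a keyword group
expected_ctr        <- 0.08   # CTR you expect at the position you're targeting
conversion_rate     <- 0.02   # visit-to-order conversion rate
average_order_value <- 45     # revenue per order in $

monthly_impressions * expected_ctr * conversion_rate * average_order_value
# ~144: rough monthly revenue potential for this keyword group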


Calculating Click Through Rates for SEO, based on Google Search Console Data (in R)

Updated June 4, 2019: Added a YouTube video which guides you through the setup process in RStudio and shows how to run the script yourself.


Averages lie, and average click-through rates aren't very helpful! There, I said it. What I do believe is that you can calculate click-through rates (CTR) for your own site in a great way, with data from Google Search Console, especially when using R. So let's dive into how this works and what you can do with it!

Why Do 'Averages' Lie?

Look at the graph below: a great CTR for position 1. But this research, for example, shows that the average CTR for position 1 is 24% (AWR data, Feb 2019). Which one is correct? Neither of them, as it really depends on your industry and on what features show up in the search results that might decrease CTR (think rich snippets, local packs, news results). All of this makes it really hard to analyze what you could expect if you ranked higher for a bunch of keywords in your industry. While I was working at Postmates on ranking certain category pages better, we decided to calculate our own CTR and were intrigued by how far our CTRs were off from the research (the research isn't wrong! It's just generalized across industries). Eventually, with that data in hand, we were able to make better estimates/forecasts of how much traffic we could expect when rankings increased in that segment. In the rest of this post, I'll go more in-depth on how we calculated this.

Using Google Search Console Data

Visual of Google Search Console Report (this is not RVshare data)

You've seen this report in Google Search Console; it provides a detailed view of the performance of your keywords and their average position. In this graph we see something positive: a CTR and position that both go up slightly over time. But what if you want to know the average CTR per position for a certain segment of keywords? That's much harder to do in the interface. Because of that, I used the R script from Mark Edmondson that he wrote about here almost three years ago.

It helps you extract the raw data from Google Search Console so you can digest it yourself and create your own visualizations (like the one we'll talk about next).
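
To give you an idea of what pulling that raw data looks like, here's a minimal sketch using the searchConsoleR package (the property URL and date range are placeholders, and the exact arguments may differ slightly depending on the package version):

library(searchConsoleR)

# Authenticate with the Google account that has access to the property
scr_auth()

# Pull raw query-level data for roughly the last three months
# (placeholder property URL; GSC data lags by a few days)
gsc_data <- search_analytics(
  siteURL    = "https://www.example.com",
  startDate  = Sys.Date() - 93,
  endDate    = Sys.Date() - 3,
  dimensions = c("query"),
  searchType = "web",
  rowLimit   = 25000
)

head(gsc_data)  # columns include query, clicks, impressions, ctr, position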

Visualizing CTR Curves in R

CTR curve: visualized CTR per position (note: this is randomized data for an unknown site)

So let's dive right into how you can do this yourself. I'll provide you with the full R script; you'll need to download RStudio yourself in step 1.

  1. Download and Install RStudio
  2. Download the following .r script from Gist
  3. Run these commands to install the right packages for RStudio:
    1. install.packages("googleAuthR")
    2. install.packages("searchConsoleR")
    3. install.packages("dplyr")
    4. install.packages("ggplot2"), if necessary
  4. Line #21 – Change this to the property name from Google Search Console
  5. Line #25 – Not necessary: If you want the CTR curve for positions beyond 20, change the number.
  6. Line #40 – Recommended: Exclude the word(s) that are part of your brand name, so you get the CTR curve for non-branded keywords only.
  7. Line #41 – Not necessary: The script takes a 'sample' of 50.000 keywords to calculate your CTR curve from. You can increase this limit if needed; if you have fewer than 50.000 keywords it's not an issue.
  8. Run the script! The output should be a visual as shown earlier in this post
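
If you're curious what the core of such a script does, here's a rough sketch of the aggregation and plot, assuming a data frame like the gsc_data pulled earlier (this is an illustration of the idea, not Mark's actual script):

library(dplyr)
library(ggplot2)

ctr_curve <- gsc_data %>%
  mutate(position = round(position)) %>%   # bucket keywords by rounded position
  filter(position <= 20) %>%               # mirror the position cut-off from the script
  group_by(position) %>%
  summarise(clicks = sum(clicks), impressions = sum(impressions)) %>%
  mutate(ctr = clicks / impressions)

ggplot(ctr_curve, aes(x = position, y = ctr)) +
  geom_line() +
  geom_point() +
  labs(title = "CTR per position", x = "Average position", y = "CTR")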

Want to take a deep breath and let me help you go over this again? I’ve made a quick screen share video of what to do in RStudio and how to use the R script.

Hopefully you now have a better understanding of what the actual CTR is for your own site, and you can use this to visualize CTR curves for specific parts of your site or for pages that share a similar META description. Over time you could use this, for example, to measure the impact of changes on CTR.

Credit where credit is due! There are many use cases for visualizing CTR data with R, and I'm grateful that Mark Edmondson opened my eyes to this a while ago. Credit also goes to Tim Wilson's documentation on using R and improving visualizations.

Want to read this article in Spanish? Read it here.


Adding additional site speed metrics to Google Analytics: measuring First Input Delay (FID)

Web analytics is still one of my favorite topics, and while I don't get to spend a ton of time on it these days, I still enjoy digging through blog posts and coming up with new ideas on what to track and how it can help with (speed) optimization. While I was looking through a big travel site's source code a couple of weeks ago, trying to figure out what we could improve, I noticed a Google Analytics event being fired that was 'new' to me. It was used to send information about 'interaction delays' as an event. After some digging I figured out what it was, and as I couldn't find much information about the topic in relation to Google Analytics, I think it's worth a blog post.

Disclaimer: There is not a lot of new original material in here; most of the information can be found in older posts in the Google Developers Web Updates section, and most of the credit goes to Philip Walton. But I think it's worth giving these topics some more attention and providing some additional solutions for integrating this with Google Analytics.

Site Speed Reports in Google Analytics

The site speed reports in Google Analytics have proven useful for optimizing average load times and identifying how long it takes for a certain page to load. You can use the data to figure out if your server is slow (Server Connection Time and Server Response Time) or look at Domain Lookup Time to see how long a DNS lookup takes. But have you ever noticed that on some sites it takes a little while before you can actually start interacting with the page while it's still being painted (JS/CSS still executing) on your screen? Especially on slow connections, like your phone's mobile network, this can be obvious from time to time. That's why the following metrics can come in handy: they measure the time to first paint and the first input delay that can occur.

Why is this metric not already part of the reports? Google Analytics can only start measuring once its script has loaded. The speed data it reports on is gathered through the Navigation Timing API in your browser, and the data for a metric like this isn't part of that. It's also a fairly technical metric, as you'll realize after this, so for most basic users it would cause a lot of confusion, I'd imagine.

First Input Delay – FID

The important definitions:

  • FCP – First Contentful Paint: The time it takes (in milliseconds) for the first content (pixels) to be painted on the screen.
  • TTI – Time to Interactive: The time from when the page starts loading until it is fully interactive.
  • FID – First Input Delay: The time between the user's first interaction (a click, tap, or key press) and the moment the browser is actually able to act on that input.

The FCP metric is already part of the reports that you might have seen in Lighthouse. The obvious problem with that is that it's a one-off measurement: it could be different for your actual users, and you likely want a larger sample size to measure this.

Measuring First Input Delays (FID)

So let's talk about how useful this is actually going to be for your site, starting with the implementation for Google Analytics and Google Tag Manager. The ChromeLabs team has done an amazing job providing a small 'library' for tracking these performance metrics via JavaScript. These are the steps to follow for tracking this in Google Analytics (gtag.js/analytics.js) and GTM:

Measuring First Input Delays in Google Analytics

The script that you have just included provides a listener that can be used to fire an event or simply to push the value to the dataLayer.

If you're using analytics.js, add this to your HEAD (below the minified script and after GA has been initialized):

perfMetrics.onFirstInputDelay(function(delay, evt) {
  ga('send', 'event', {
    eventCategory: 'SiteSpeed Metrics',
    eventAction: 'First Input Delay',
    eventLabel: evt.type,
    eventValue: Math.round(delay),
    nonInteraction: true
  });
});

If you're using gtag.js (GA's latest version), add this to your HEAD (below the minified script and after GA has been initialized):

perfMetrics.onFirstInputDelay(function(delay, evt) {
  gtag('event', 'First Input Delay', {
    'event_category': 'SiteSpeed Metrics',
    'event_label': evt.type,
    'value': Math.round(delay),
    'non_interaction': true
  });
});

Measuring First Input Delays in Google Tag Manager

The integration for Google Tag Manager is obviously a little more complex, as you need to add a new Trigger and Variable.

window.dataLayer = window.dataLayer || [];
perfMetrics.onFirstInputDelay(function(delay, evt) {
  dataLayer.push({
    'event': 'first_input_delay', // Not necessarily needed if this loads before GTM is initialized.
    'first_input_delay_value': Math.round(delay)
  });
});

Create the Variable:

Add it to the Google Analytics configuration so it will be sent along with either your Events or your Pageviews (decide which works best for your use case). In this case, I've added it to a Custom Dimension on a page level, but you can also easily send it to a Custom Metric to calculate averages.

Custom Reporting on Speed Metrics

When you're using a Custom Metric to report on FID, you can easily create a page-level report to show the average first input delay for a page type or template. In this case, I created an example of a report that shows this only for new visitors (who likely don't have assets like JS/CSS/images cached yet).

Adding other speed metrics

This is just the First Input Delay that you could be adding as a speed-related metric to GA. If you do some digging and are interested in this topic, I recommend going through the rest of the events here. That will give you enough information, and a similar integration, to measure First Paint (FP) and Time to Interactive (TTI).

All the resources on this topic:

Like I said in the disclaimer, I mostly wanted to make this easier to implement, but all the documentation around how to set this up and what it entails can be found in these resources.


Diversifying Channels for Actual Growth

I've worked with many companies that have shown exceptional growth, triple digits year over year, bringing them to the next level in their industries (music, education, marketplaces, etc.). But... in some cases it wasn't as good as it should have been, because the share of the main channel they relied on was far bigger than it should have been. So let's dive a bit deeper into what that means.

When 80-90% becomes a problem

These top performers in their space all had one fundamental problem: they were relying far too much on one specific channel to drive their growth. If one channel (which is, or was, often Social Media or Organic Search) is driving over 80% of your traffic and/or revenue, you might be growing, but you're also in immediate danger. As you can see in the following graph, this company had been growing greatly for years. What they didn't realize, however, is that they weren't particularly setting themselves up for SEO success. They weren't doing anything wrong, but they also weren't doing anything in a way that I would consider a world-class SEO program. Then Google decided to change how its algorithm treated certain sites, and this happened in the span of a few months: they lost over 40% of their traffic and with that approximately 20-30% of their business.

So ask yourself: does your site/business have a healthy split in traffic? Have you looked at the difference in channels for new versus returning visitors? Looking at it the wrong way (combined) will likely skew your approach.

What channels this applies to

  • SEO: If the majority of your traffic is coming in from SEO, you have a serious problem. Remember the company I was talking about at the beginning of this article? They saw their traffic drop by over 90% overnight! I repeat: overnight! This meant that their revenue crashed by over 80% (they weren't solely relying on SEO as their revenue driver at the time).
  • Paid Search & Social: When you think about this channel you're likely thinking about Google AdWords and Bing Ads, and for the social channels: Facebook, Twitter, LinkedIn, etc. That makes sense, I do the same, so I can't blame you for it. But most companies aren't even using Bing Ads, or only look at one of these channels. Even with Bing being just a few percent of your spend, it will help you diversify your strategy a lot and protect you from Google changes that might hurt you in the long run. In addition, there are tons of other networks out there that you could combine, and that could easily drive a few percent of your PPC spend. We looked into this recently and decided to move part of our display budget over to another platform/vendor just to ensure this; they were able to drive the same ROAS, and it felt safer to move a few percent of traffic to a smaller player.
  • 'Email Marketing': In most cases I don't think this is really an acquisition strategy at all, because from what source are these people actually signing up for your newsletter/mailing list? Likely one of the other channels mentioned in this article.
  • Social Media: Think about all the publishers who doubled down on social media a few years ago (Vice, Buzzfeed, etc.). Social media was a great driver of engagement, branding, and traffic for them. But have you noticed the trend over the last two or three years? Most of them have certainly not seen a traffic increase over that period. The reason why: the social networks decided to change the way they rank content and would rather keep users on their own platform than send more clicks to publishers.

In all these cases, you should still invest in these channels; they're great drivers of growth for your business. Just don't rely on only one of them!

Why not other channels?

In my opinion, the problem doesn't apply in most cases to referral traffic, affiliate marketing, and multi-level marketing, as the majority of traffic from these is (or should be) spread across many different partners and sites.

Valid Traffic Model

To marketers and founders I would say: think more about the split of your traffic and what is bringing in the actual revenue. Explore other channels. It's not bad to kickstart growth on one channel (social/community, SEO, paid acquisition); in most cases I would actually encourage founders & startups to focus on one. It's better to take a leap of faith and double down on a channel than to suck at a few channels.