Category: SEO

Sitemaps; Setup, Monitoring & Metrics for Analysis

Last updated on March 20th, 2018 at 05:14 pm

In my effort to write longer posts on a specific topic, I thought it was time to shed some light on something we’ve been working on over the last few months at Postmates, and something I never thought could become an interesting topic: sitemaps. They’re pretty boring in themselves: it’s a technology where you basically give search engines all the URLs for a site that you want them to know about (and index), and you take it from there. Even more so as most sites these days run on a CMS like WordPress, where tons of plugins can take care of this for you. Don’t get me wrong, do use them if you are on one! But as I mainly work for companies that don’t have a ‘standard’ CMS, I have worked multiple times on creating sitemaps and making their integrations work flawlessly. Over time that taught me a ton of things, and recently we discovered that certain additional features can help speed up the process. That’s why I think it was time to write a detailed essay on sitemaps ;). (*barf: definitive guide).

TL;DR: How can sitemaps help you get better insights, and how do you set them up?

  1. Sitemaps will provide you with insights on what pages are submitted and which ones are indexed.
  2. You create sitemap files by uploading XML or TXT files with dumps of URLs.
  3. All your different content on pages can be added to sitemaps: images, video, news.
  4. Different fields for priority, last modified and frequency can give search engines insights into which URLs to prioritize for crawling.
  5. Create multiple sitemaps with segments of pages, for example by product category.
  6. Add your sitemap index file to your robots.txt so it’s easy to find for a search engine.
  7. Submit your sitemap and ping sitemap files to search engines for quick discovery.
  8. Make sure all URLs in your sitemaps are working and returning a 200 status code, and think twice: do you want all of them to be discovered?
  9. Monitor your data and crawls through log files and Google Search Console.

Goals

When you start working on sitemaps there are a few things to keep in mind: the ideas that you have around them and the goal. What problem of yours are they solving? For small sites (100 pages) I’m honestly not sure if I would bother with sitemaps. There are probably a lot of other projects that would have more impact on SEO/the business.

If you’re thinking about setting up sitemaps there are a few goals that they will help you accomplish:

  • Get better insights into what pages are valuable to your site.
  • Provide search engines with the URLs that you want them to index, the fastest way to submit pages at scale.

Overall this means that you want to support the best sitemap infrastructure you can as that will help you get the best insights ever, the quickest way to get these insights and most of all get your pages indexed + submitted as fast as possible.

Setup

Sitemap

Format: XML or text? For most companies the format probably doesn’t matter, as they’re using a plugin to support their sitemaps. If you want to go more advanced and get better insights, I would go with the XML format myself. From time to time we use text file sitemaps where we just dump all the URLs. They’ll get you a quick and dirty sitemap if you don’t have the time or resources for more.

Types: There are multiple formats for sitemaps to support different content types.

  • Pages: In there you’ll dump all the actual URLs that you have on the site and that you want a search engine to know about. You can add the images for these specific pages to the same sitemap as well, to ensure that the search engine understands which images are an important part of the page.
  • Images: You can add sitemaps for images, both for image search and for strengthening the pages they belong to.
  • Videos: Video sitemaps used to have a bigger impact back in the day, as video listings were a more prominent part of the search results page. These days you mostly want to let search engines know about them as they’re usually part of an individual page.
  • News: News is not really its own format as they’re just individual pages. But Google News sitemaps do have their own format. Creating a News Sitemap – Google.
  • hreflang: This is not really a type of content but it’s still important to think about. If your pages have a translated version, you want to make sure it’s being listed as the alternate (‘duplicate’) version of the original. Read more information about that here in Google’s support.

Fields

  • Frequency: Does the page change on a regular basis? Some pages are dynamic and will always change, while others change only daily, weekly or monthly. It’s likely worth including this as a signal in combination with the Last Modified field and the header.
  • Last Modified: We do want to let a search engine know which pages have been updated/modified and which ones haven’t. That’s why I’d always recommend that organizations include this in their sitemap. In combination with the Last-Modified header (we’ll talk about that in the next step), it will be a good enough signal to assess whether the page has been modified or not.
  • Priority: This is a field that I wouldn’t spend too much time thinking about. On multiple occasions, Google has mentioned that they don’t put any value or effort into understanding this field. Some plugins use it and it won’t hurt. But for custom setups it’s not something that I would recommend adding.
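
As a rough sketch of the page sitemap and the fields above, this is one way to generate the XML with Python’s standard library. The function name `build_sitemap` and the example URL are made up for illustration; the element names (`urlset`, `url`, `loc`, `lastmod`, `changefreq`) and the namespace come from the sitemaps.org protocol. Priority is left out on purpose, in line with the advice above.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Serialize the sitemap namespace as the default namespace (no prefix).
ET.register_namespace("", SITEMAP_NS)


def build_sitemap(entries):
    """Build a <urlset> document from (url, last_modified, changefreq) tuples.

    last_modified is a datetime.date or None, changefreq a string or None.
    """
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, last_modified, changefreq in entries:
        node = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(node, f"{{{SITEMAP_NS}}}loc").text = url
        if last_modified is not None:
            ET.SubElement(node, f"{{{SITEMAP_NS}}}lastmod").text = last_modified.isoformat()
        if changefreq is not None:
            ET.SubElement(node, f"{{{SITEMAP_NS}}}changefreq").text = changefreq
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([("https://www.example.com/", date(2018, 3, 20), "daily")]))
```

For a real site you would feed this from your database and split the output into multiple files once you get close to the per-file limits.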

Last Modified

Has the actual sitemap changed since the last time it was generated? Yes or no? In some cases your sitemap won’t change, because you didn’t add any new products/articles. Have you ever run this in your terminal:

curl -I https://www.example.com/sitemap/sitemap_index.xml

Look at the headers: if you see a Last-Modified header, it signals when the file was last modified. We use it to tell search engines the last time the sitemap was updated. We combine this with serving a Last-Modified header on the URLs that are in the sitemaps. This won’t always work, as pages can change from moment to moment (based on availability of products, for example).
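
To make the header check concrete, here is a small sketch of how a script could compare a Last-Modified value with the time of a previous crawl. The function name `changed_since` and the timestamps are made up for illustration; it only assumes the header uses the standard HTTP date format.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def changed_since(last_modified_header, last_crawl):
    """Return True when the Last-Modified header is newer than last_crawl.

    last_modified_header: raw header value, e.g. "Tue, 20 Mar 2018 17:14:00 GMT".
    last_crawl: timezone-aware datetime of the previous crawl.
    """
    modified_at = parsedate_to_datetime(last_modified_header)
    return modified_at > last_crawl


if __name__ == "__main__":
    previous_crawl = datetime(2018, 3, 19, 12, 0, tzinfo=timezone.utc)
    print(changed_since("Tue, 20 Mar 2018 17:14:00 GMT", previous_crawl))
```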

Segmenting Pages

For better insights it’s really useful to segment your sitemaps. The limit per sitemap is 50,000 URLs, but there is no required minimum. You’ll see sitemaps being segmented in multiple ways. Based on these you can get more segmented insights: is one category of pages better indexed than another?

Categories: Most companies that I work with are segmenting their pages by the categories they’ve defined themselves. This could be based on region or, for example, on product categories for an ecommerce site.

Static Pages: Something that most people with custom-built sites don’t realize is that there are usually still a ton of pages that aren’t backed by a database and that you want insights on too. Think about: contact, homepage, about us, services, etc. List all these pages in a different sitemap (static_sitemap.xml) and include this file in your sitemap index too.

Sitemap Index

If you have multiple sitemaps (10-25+) you want to look into creating a sitemap index file. With this you can submit just 1 file, and through it the search engine will be able to find all the underlying sitemap files. This saves you from adding multiple sitemap URLs to Google Search Console/Bing Webmaster Tools and will also let you add only 1 line to your robots.txt file. In the end it’s technically just another sitemap, which lists the URLs of all the other sitemaps.
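
A sitemap index can be generated the same way as a regular sitemap. The sketch below uses the `sitemapindex`, `sitemap` and `loc` elements from the sitemaps.org protocol; the function name and the segment file names are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Serialize the sitemap namespace as the default namespace (no prefix).
ET.register_namespace("", SITEMAP_NS)


def build_sitemap_index(sitemap_urls):
    """Build a <sitemapindex> document that lists the given sitemap files."""
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for url in sitemap_urls:
        node = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(node, f"{{{SITEMAP_NS}}}loc").text = url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical segment files: one for static pages, one per category.
    print(build_sitemap_index([
        "https://www.example.com/sitemap/static_sitemap.xml",
        "https://www.example.com/sitemap/category_a.xml",
    ]))
```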

Robots.txt

You want to make sure that on first entry a search engine will know about your sitemaps. Usually one of the first files a search engine’s crawler will look at is the robots.txt file, as it needs to know what it can/can’t look at on a site. As we just talked about the sitemap index, we’re going to list that one in the robots.txt file for your site, which should live at https://www.domain.com/robots.txt. It’s as simple as adding this one line to it:

Sitemap: https://www.domain.com/sitemap/sitemap_index.xml

Obviously the URL can be different based on where you have hosted your sitemap index file.
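
If you want to verify that the line actually made it into production, a tiny check like the following can read the Sitemap lines back out of a robots.txt body. This is only a sketch: the function name is made up, and fetching the file is left out so the parsing stays easy to test.

```python
def sitemap_urls_from_robots(robots_txt):
    """Return the URLs listed in 'Sitemap:' lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Ignore comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        key, separator, value = line.partition(":")
        # The directive name is matched case-insensitively.
        if separator and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


if __name__ == "__main__":
    example = "User-agent: *\nDisallow: /checkout/\nSitemap: https://www.domain.com/sitemap/sitemap_index.xml\n"
    print(sitemap_urls_from_robots(example))
```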

GZIP

If you’re a big site you likely have servers that won’t go down and can take quite a hit, but extensive sitemap files can easily get up to 50MB, and that is not a file transfer that can be done in a matter of two seconds. It can also slow things down on both your end and the search engine’s end. That’s why we’ve started GZipping our sitemap files to make for a faster download and speed up that process. At the same time you make it 1 step more complicated for people to copy paste your data.
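
Compressing a generated sitemap takes only a few lines. A minimal sketch, assuming the sitemap is already available as a string (the function name and the sample content are made up):

```python
import gzip


def gzip_sitemap(xml_text):
    """Return the gzip-compressed bytes of a sitemap document."""
    return gzip.compress(xml_text.encode("utf-8"))


if __name__ == "__main__":
    # A repetitive sample document; real sitemaps compress well for the same reason.
    sample = "<urlset>" + "<url><loc>https://www.example.com/store/item</loc></url>" * 1000 + "</urlset>"
    compressed = gzip_sitemap(sample)
    print(len(sample.encode("utf-8")), "bytes before,", len(compressed), "bytes after")
```

You would then write the result to a file such as sitemap.xml.gz and reference that file name in your sitemap index.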

PING Search Engines

Guess what, it has an effect. I thought it was crazy too, but we found a tiny bit of proof that actually pinging a search engine will result in something. As you will most likely only care about Google and Bing, we still have a way of letting them know about a page:
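
A ping is nothing more than a GET request to a search engine URL that carries your sitemap URL as a parameter. The sketch below only builds that URL. The endpoint and the `sitemap` parameter name are assumptions based on how the ping endpoints have been documented in the past; search engines have changed or retired these endpoints over time, so check their current documentation before relying on it.

```python
from urllib.parse import urlencode


def build_ping_url(ping_endpoint, sitemap_url):
    """Return the ping URL for a sitemap, with the sitemap URL properly encoded."""
    return f"{ping_endpoint}?{urlencode({'sitemap': sitemap_url})}"


if __name__ == "__main__":
    # Assumed endpoint for illustration; verify that it still exists.
    print(build_ping_url(
        "https://www.google.com/ping",
        "https://www.example.com/sitemap/sitemap_index.xml",
    ))
```

In an automated setup you would fire this request right after the sitemap files have been regenerated.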

Submit your sitemap

Probably not worth explaining, you need to make sure that you can get insights into your XML sitemaps and the URLs that are listed in there. So make sure to submit your sitemaps to Google Search Console and Bing Webmaster Tools.

Pubsubhubbub

One of the lesser-known projects is PubSubHubbub. It lets publishers (mostly) instantly notify subscribers, through a specific push protocol, when new URLs are published in a feed. This protocol works through an Atom feed (do you still know about that format?) that you provide. Once you have registered the feed with the right services, you make it easier for them to be notified of new pages.

XSLT

XML sitemaps aren’t easy to read for a regular person. If you’re not familiar with the XML format it might be uncomfortable. Luckily, a while back people invented XSLT. This will let you ‘style’ the output of XML files into something that is more readable, which makes it easier to see certain elements in the sitemaps that you’ve listed. If you want to make them more readable I would advise looking into: https://www.w3schools.com/xml/xsl_intro.asp.

Quality Signals

Search engines like sites that are of high quality. The pages are the best, the URLs are always working and your site never goes down. Chances are high that all of this doesn’t always apply to your sitemaps as some pages might not be great. Some things to consider when you’re working on this:

  • 301/302/404: Are all URLs in your sitemap responding like they should, with a 200 response? In the best case scenario none of your URLs should be responding with any other response code than that. In reality most sitemaps contain some errors.
  • NoIndex: Have you included URLs in your sitemap that are actually excluded by a noindex meta tag or header? Make sure that it’s not the case.
  • Robots.txt: An even bigger problem, are you telling the search engine about URLs that you actually don’t want them to look at?
  • Canonical Pages: Is the actual URL that you’re listing the canonical URL/original URL or are you listing the pages that are still ‘stealing’ the content from another page, like a filter page. Do you really want to list these URLs in your sitemap?

Of all these signals, some might have a big or small impact and others won’t matter at all. But at least think about the implications that they might have when you’re building out your sitemaps.
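
The robots.txt check in particular is easy to automate. Here is a sketch using Python’s built-in `urllib.robotparser`; the function name, the rules and the URLs are made up for illustration. Keep in mind that this parser only does simple prefix matching, so it won’t evaluate wildcard rules the way Google does.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the sitemap URLs that robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /checkout/\n"
    candidates = [
        "https://www.example.com/store/a",
        "https://www.example.com/checkout/cart",
    ]
    # Any URL printed here should not be in your sitemap.
    print(blocked_urls(rules, candidates))
```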

Airflow

Lately I’ve been working a ton with Apache Airflow. It’s the framework that we use at Postmates, invented by the great folks at Airbnb and mostly used for dealing with data pipelines: you want to do X, and if X succeeds you want it to go on to task Y. We’re using that for the generation of sitemaps. If we can generate all sitemaps we want to have them pinged to the search engines, if that succeeds we want to run some quality scripts, and if that is done we want to be notified on both email and Slack to tell us at what time the script succeeded.

For some sitemaps we want it to run every day, for a specific segment we want to have it run on an hourly basis. The insights from Airflow will give us the details to see if it’s failing or not and will notify us when it succeeds/fails. With this setup, we have constant monitoring in place to ensure that sitemaps are being generated daily/hourly.

Monitoring

In the end you only want to know if your pages are of good enough quality that they’re being indexed by the search engine. So let’s see how you can check this in Google Search Console.

Index coverage

A useful report in Google Search Console is the Index Status report (Google Index > Index Status). It will show for the property that you’ve added how many pages have been indexed and what pages have been crawled. As the main goal for a sitemap is driving up the number of pages being submitted for the Google index the following step is making sure that they’re being indexed. This report will give you that first high level overview.

Sitemap Validation: Errors & Amount of URLs

But what about the specifics of the sitemap: are the URLs being crawled properly and are the URLs being submitted to the index? The sitemap reports give you this level of detail (in this case 98% is indexed, which makes sense; the missing 2% are some test products that Google seems to have ignored, luckily!). Remember what we talked about before regarding segmenting your pages? If you had done that, you would have seen in this particular example what percentage of pages in each sitemap was submitted / indexed. Very useful if you work on big sites where, for example, the internal link structure is lacking and you want to push that. These reports can (though not always) give you insights into what the balance between them could be.

Quality Assurance

  • Are the URLs working (200 status code)? A little-known fact, but Google doesn’t like following redirects or finding broken URLs in your sitemaps. Spend some time on making sure that these pages aren’t in there, or add the right monitoring to prevent it from happening. Since we’ve started Gzipping our sitemaps that’s become a tiny bit harder, as you first need to unpack them. But for quality testing we still have scripts in place that can run a crawl of the sitemap on demand to see if all URLs in there are valid.
  • Page Quality: Honestly, is this page really worth it to be indexed in Google? Some pages are just not of the quality that they should be and so sometimes you should take that into account when building up sitemaps. Did you filter out the right pages?
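
To give an idea of what such a quality script could look like (this is a sketch, not our actual script): unpack the gzipped sitemap, pull out the URLs and report everything that doesn’t answer with a 200. All function names are made up. The status lookup is passed in as a function so you can swap in your own HTTP client; `head_status` is one possible implementation that doesn’t follow redirects, so a 301 shows up as a 301.

```python
import gzip
import http.client
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_in_gzipped_sitemap(gz_bytes):
    """Unpack a gzipped sitemap and return the URLs in its <loc> elements."""
    root = ET.fromstring(gzip.decompress(gz_bytes))
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]


def failing_urls(urls, fetch_status):
    """Return (url, status) pairs for every URL that does not answer 200."""
    results = ((url, fetch_status(url)) for url in urls)
    return [(url, status) for url, status in results if status != 200]


def head_status(url, timeout=10):
    """Send a HEAD request and return the raw status code (redirects are not followed)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        connection = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        connection.request("HEAD", path)
        return connection.getresponse().status
    finally:
        connection.close()
```

With a sitemap file on disk you would call `failing_urls(urls_in_gzipped_sitemap(open(path, "rb").read()), head_status)` and alert on anything it returns.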

Metrics & Analysis

So far we’ve talked about the whole setup and how to monitor results. Let’s go a little step further before we close this subject and look at the information in log files. It’s a topic that I became more familiar with and have worked closely with over the last months too:

Log Files

As log files can be stored on the web server that you’re also using for your regular pages you can get additional insights into how often your sitemaps are being viewed and if there are any issues with sitemaps. As we work on them on a regular basis it could be that they break. That’s why we make sure that for example we monitor the status codes for the URLs so that we can see when a certain sitemap doesn’t hit a successful 200 status code.
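
As an illustration of that kind of monitoring, the sketch below counts status codes for sitemap requests in access log lines. It assumes the common/combined log format and that your sitemaps live under /sitemap/ (as in the example URLs in this post); both are assumptions you’d adjust to your own setup, and the function name is made up.

```python
import re
from collections import Counter

# Matches the request and status part of a common/combined log line:
# "GET /path HTTP/1.1" 200
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def sitemap_status_counts(log_lines, path_prefix="/sitemap/"):
    """Count (path, status) combinations for requests to sitemap files."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group("path").startswith(path_prefix):
            counts[(match.group("path"), int(match.group("status")))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '66.249.66.1 - - [20/Mar/2018:17:14:00 +0000] "GET /sitemap/sitemap_index.xml HTTP/1.1" 200 5120 "-" "Googlebot/2.1"',
        '66.249.66.1 - - [20/Mar/2018:18:14:00 +0000] "GET /sitemap/sitemap_index.xml HTTP/1.1" 500 0 "-" "Googlebot/2.1"',
    ]
    for (path, status), hits in sitemap_status_counts(sample).items():
        print(path, status, hits)
```

Anything other than a 200 in that output is a reason to look at the sitemap generation.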

Proving that pinging works

A while back we started to ping our sitemaps to Google and Bing. Both make it clear (Google) that if you have an existing sitemap and want to resubmit it, this is a good way to do it. This sounds weird, as Google got rid of their ‘submit a URL’ feature for the index years ago, so we were skeptical that this would have any impact. It was really easy to implement, though: you just fire a GET request to a Google URL with the sitemap URL in there. What we noticed is that Google almost immediately tried to look at these URLs. As we refresh this specific sitemap every hour, we also ping it to Google every hour. Guess what happens: every hour for the last few weeks they have looked at the sitemap. Who says you can’t influence crawlers? The result: if you want to ensure that Google is actually looking at a page and actively crawling it, pinging seems to make that happen.

Screenshot of this from a Kibana dashboard where we log server requests

What if you can’t ping? Usually I would only recommend pinging a search engine if your whole sitemap generation process is fully automated, it doesn’t make sense to open your browser or have a tiny script for this. If you still want to basically experience the same, use the Resubmit button in Google Search Console > Sitemaps to achieve the same.

Future

This is not all of it and I’ve gone over some topics only briefly. I didn’t want to document everything, as there’s already a ton of information from Google and other sites about how exactly you can set up sitemaps. In my case, we’re on a route to figure out how we can make our sitemap setup near perfect. What I still want to investigate or analyze:

  • Adding a Last Modified Header to pages in the sitemap, what is the effect of pinging a sitemap and Google looking at all pages or just the ones that are modified?
  • Segmenting them even further, let’s say I only add 100/1000 pages to a sitemap and start creating just more of them, does that influence crawling, do we get better insights?

Resources

You want to learn more about sitemaps, look into the following resources to learn more about the concept, the idea behind it and the technical specification:

Next steps?

When I started writing I didn’t plan to have this become everything I know about sitemaps. But what did I miss? What optimizations can we apply to sitemaps in order to get better insights, speed up the crawling of pages. This is just one of the areas of technical SEO but probably an important one if you’re looking for deeper insights into what Google or Bing think about your site. If you have questions or comments, feel free to give a shout on Twitter: @MartijnSch

Dealing with SEO within your company / internally

Last updated on March 16th, 2018 at 07:30 pm

“My company/manager/CEO doesn’t understand SEO, my engineers have no idea how to implement X, I don’t get the buy-in that I need.” These are just some of the comments that I hear in real life and see pass by on Twitter. That’s why in late 2017 I asked this question on Twitter, and why I thought it was time for a write-up on the scenarios that I’ve seen over the last few years in SEO and the ways that I’ve seen help with getting input. I’ll try to share some insights into how I’m/we’re explaining and dealing with SEO internally.

Is there any belief in SEO?

For a successful SEO strategy and great results, the first thing you’ll need is somebody in the organization who supports SEO. If that person or role isn’t there, it’s going to be really hard to move things forward. But how do you know that support is there? In my case I’ve been lucky: at my first job at Springest I worked with a guy who had started an SEO agency himself, TNW was such a big online tech player that I didn’t even need to explain what SEO was, and these days at Postmates I was specifically asked if I wanted to focus on SEO when I joined. All with the belief that SEO would help move the business forward. But in the end you don’t always need people who know what SEO is and know everything inside out. It will help if they’re able to help you out when you have questions about SEO and at least know the good and bad parts of it. If the person that’s asking you to help out is also talking about link buying, I would probably reconsider my decision to work for them a few times (and probably decide not to).

Explaining the value

In most organizations you’re going to need to explain the value of SEO, in the end it remains a black/dark grey box in which it’s hard to explain what kind of impact you can deliver on. But what you can explain in my opinion is the following:

  • What is the opportunity, how big is your industry/niche in terms of search volume?
  • What is our current position? How good/bad are we performing against our competition?
  • Who’s our competition really? It’s most likely not the companies that offer the same product but probably the ones that are beating your *ss in the search results.
  • Increase conversion rate while they’re at it, increase awareness while they’re at it, increase referral traffic while they’re at it.

These are just a few of the additional values that you can bring to the table as an SEO (team) in a company. Most of all, if you do a great job on keyword research you can tell your internal organization a lot about the keywords and intent of the users within your industry/niche.

But how do you proceed to excel even more? In the end you want your whole organization to be supportive and help the cause of SEO. The more people that work on it, the more you can hopefully grow your visibility and with that your relevant business metrics (clicks, leads, sign ups, etc.).

Creating Buy In

What kind of support do you need, why do you need it and what can you do to get it? Let’s talk a little bit about that:

Team Support

Does your team know what SEO is, how it can help the company and what their contribution to it could be/mean? Very often I don’t see SEOs talk with the IT team/engineers/developers or whatever job title they’ll have in your organization. The often talked about phrase: “Sometimes you should just have a beer with them to build up a good relation” most often is incorrect. You’ll build up a better relationship, that I agree with. But that doesn’t always cover an actual understanding of the problem which is still going to be essential.

Building up understanding on what you’re/they’re working on

Do your boss, their boss and the CEO know what you work on for SEO? Pretty sure that they don’t. So it’s not surprising that you don’t have all the support or resources that you would need. Start educating them on what you’re working on and what the results are. If you’re in doubt about something there are multiple paths to go: run an experiment, launch an MVP as soon as possible or create the business case/technical documentation so you’re sure that you know what you’re building.

My {fill in random job title} doesn’t understand SEO

“Why didn’t my engineer think of adding redirects?”, “Why didn’t our content team use the right keywords?”, “Why doesn’t my boss understand what SEO is really about?”. Questions that you must have asked yourself and I can’t blame you. But the answer to all of them is easy: “Because you haven’t told them”. In the end all these things matter to SEO and your success, so why don’t you explain it once more. Repetition makes it easier to have these answers printed on their minds.

What can help me build up an understanding of SEO?

Besides working with your team it’s even more important to work well with the other teams in a company. It’s very likely that there are more people outside of SEO in your company than there are working with it on a daily basis. All of these folks can help you out with SEO too. I remember the time when I was in desperate need of a copywriter after hiring dramas that continued for months, and our receptionist turned out to be an English major who was able to help out immediately.

Internal Decks/Meeting

The previous two companies that I’ve worked for were relatively small (<75) and as I was an ‘early’ employee at both I saw a lot of people come in. In the end that made it easier to explain SEO to them, as I either hired them or they had a manager that I already worked with within the organization. At Postmates that is quite different: I came in years after the company was founded, at the point where we had over 400 people and a growing organization.

That’s why early on, when I was formulating the SEO strategy, I started creating a slide deck explaining SEO for the rest of the organization and also telling them more about the projects that we already worked on or would be working on in the next months. Whenever a new team was formed or somebody joined the Growth team, I tried to keep up with setting up a meeting with them to see if there would be any overlap or room to work together. In the end your Comms and Support teams probably have some interest in SEO, or you can help them with their work through the tools, resources and/or products that you have available.

Weekly Status Meeting/Email/Update

When you work with multiple people on a team it’s hard to keep them all up to date with everything that is going on. Regular status updates, either in person or via VC or email, can help with that. As we have multiple engineers working on SEO at the same time, they can easily get behind on what’s going on. That’s why on a weekly basis we send out updates on the work that they’ve done to the organization but also to the bigger team. This makes it easier to show progress, list what we’re planning to work on and provide early results. So things that we list in the email are:

  • Experiments started/finished: Do we have any results, or what did we launch this week?
  • What did we do last week? What are the tickets that we worked on, what kind of early results do we have, is this already being picked up by a search engine?
  • What are we going to work on this week? What issues will we work on, what kind of results might that provide, why do we work on this.
  • What did we learn last week? What kind of results did we see, what kind of growth did we achieve?

Monthly Updates

On top of that we send some headlines once a month for the bigger projects that we launched, so we know what kind of progress we have made towards our quarterly targets. This gives a more bird’s-eye view of what we’re achieving and whether that’s on track with what we planned upfront. It’s a similar update to the weekly one, but a bit more high level and readable for the whole team and people that don’t work with SEO on a day to day basis.

What’s next?

This is still not good enough. Even when you have internal support, new questions will always arise, and even when everybody supports you it’s probably going to happen that they start asking deeper questions that you need to keep explaining. This is something I now experience: the questions basically become more relevant to your own work. Which is awesome, as it makes for great debates that in the end will only improve products and SEO strategy. Even better, this will all help streamline the process of SEO and usually speed up the output.

What is SEO Experimentation?

Last updated on November 4th, 2018 at 10:45 pm

If you’ve been reading some of my blog posts in the past you’ll have noticed that I worked a lot on analytics, experimentation, and SEO. Being able to combine these areas has led to the point where, for both Postmates and previously The Next Web, we worked on setting up a framework for SEO Experimentation. In my opinion, over time you’d like to learn more about what Google appreciates and how that can help you in the future, so you can think about what you should be focusing your attention on == prioritization. Lately, I read a blog post on the SEMrush blog with the title: SEO Experiments that will blow your mind (clickbait alert! so I won’t link to it). Hoping there would be a lot of great ideas in that blog post, I started reading, only to realize that over 50% of the examples weren’t even close to being an experiment. They were just research, from over a year ago (which was alarming too!).

Which pushed me to write this essay to tell you more about what SEO experimentation really is and how you can learn more about it, as it’s still a relatively new area in SEO that has only been explored and exposed by a few big companies.

Update: since the original publication date I’ve become even more excited about SEO Experimentation and its possibilities. That’s why I’ve updated this post with more information and some frequently asked questions that I get about the subject.
Last update: November 4, 2018 – Added an additional resource from the Airbnb Team.

What really is, SEO Experimentation?

You’re testing 2-X variations of a template on your site (all product pages, all category pages) and measuring over a period of time what the impact is on organic search traffic of the pages (usually a template) that have seen changes in your experiment. You want to isolate your changes as much as possible, you set a certain level of significance and you calculate in a mathematical way how your results have changed between the period before and the period after the change.

It’s not:

  • Research (I  compare data between periods too, but it doesn’t make it an experiment).
  • Guesswork: I’ve seen my pageviews go up with X% after optimizing my title on this page.

I would encourage you to also read this post on Moz by Will Critchlow in which he shared how they built Distilled ODN (we worked with their platform when I was at TNW) and how you can be testing with SEO yourself.

How is SEO Experimentation different from user testing?

Measurement: Instead of measuring conversion rates or other business metrics you’re mostly focused on tracking the number of sessions that are coming in from organic search. If the variants that you’ve worked on increase you’ll start calculating the impact on this. On the side, you’re still making sure that the conversion rate of these pages doesn’t decline as, in the end, it will still be the business metrics that will count.

Bucketing: Instead of bucketing users and sending user A into bucket A, you’re doing this with pages. In the end, you want to make sure that the same pages (just like you do with users) end up in the same bucket. You usually do that by sorting by alphabet or category to have some kind of logic. What’s important, though, is to make sure that these buckets are in balance, which is harder than it is with user testing.

The difference between bucketing for users experimentation and SEO experiments
Bucketing works differently with SEO Experimentation
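
To illustrate the ‘same page always ends up in the same bucket’ idea: the sketch below assigns pages with a hash of their path. Note that this is a different technique from the alphabet/category sorting described above, picked here because it’s short and deterministic. The function and bucket names are made up. It balances the number of pages per bucket, not the traffic per bucket, so you’d still need to check that separately.

```python
import hashlib


def bucket_for(page_path, buckets=("control", "variant")):
    """Deterministically assign a page to a bucket based on a hash of its path."""
    digest = hashlib.sha256(page_path.encode("utf-8")).hexdigest()
    return buckets[int(digest, 16) % len(buckets)]


if __name__ == "__main__":
    for path in ("/store/a", "/store/b", "/store/c"):
        # The same path always returns the same bucket.
        print(path, bucket_for(path))
```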

What are some examples of SEO Experiments?

Examples of things that you could be testing? Let’s list a few examples that I can think of:

  • Updating Title tags: does a certain keyword help, do I need to make them longer/shorter?
  • Adding/Removing Schema.org, what is the impact of adding structured data?
  • Adding content: does adding more content to a page help with its ranking? What if I add an additional heading to my content.
  • Internal link structures: do you want to add more relevant links to other pages?
  • Testing layouts: which layout works better for the SEO of this page? Ever noticed why Amazon has so many different product pages 😉

How do I start doing SEO Experimentation?

Let me give you a very much shortened and a simplified idea of how an SEO experiment works:

  1. Think about what you want to be testing (adding descriptions, setting a canonical) and write down a hypothesis. We’ll talk about this shortly.
  2. Figure out if you have enough traffic: 500+ sessions per variation per day would be a minimum, I’d say.
  3. Figure out how you can serve different content/HTML per page. Are you able to serve different areas on your site based on a randomized factor, while still making sure that your buckets are equally big (50/50)?
  4. Set up the experiment and let it run for as long as needed to see valid results (usually at least 3-4+ weeks). You want to make sure that Google has picked up on all the changes on your pages and has re-indexed the impacted pages.
  5. Analyze the results after this period and look at how these buckets of pages performed before and after the start of your experiment. Are these results valid, meaning you didn’t make any other changes to these pages in the meantime? Are the results per variant significant?
  6. You have found the winner and the loser. Time to start iterating and launch another experiment.

How to document an SEO Experiment?

You want to make sure that what you do around SEO experimentation is well documented. This will help you in the future with figuring out what kind of experiments are working and what you learned. When you run over 25 experiments a year, you probably won’t remember after a year how many of these were successful and how to interpret the results. For documenting SEO experiments I’ve created a template that we filled in with the data on the actual experiment. You can find it here and copy it to your own Google Drive for your own use:

How to analyze an SEO Experiment?

Before an SEO experiment starts, you want to know how the pages in it have been performing. That’s basically the ‘control’: you want to make sure your bucket is providing stable results so you can clearly analyze the difference when the new variants are launched.

Creating buckets for SEO Experimentation
Bucketing needs to ensure that there are additional buckets so you can measure the baseline and take care of anomalies in your data sets.

Bucketing: Your bucketing needs to make sure that it’s sending the right variant to your analytics account (most often used: Google Analytics). When the experiment starts, make an annotation so you know from when you start analyzing the results.

Logs: Logs can come into play when you start analyzing the results. In most cases, your experiment won’t generate results in the first week as the changes in your variant haven’t been picked up by the search engine yet. That’s why you want to look at your log files to ensure that the new variants have been crawled.

Measuring & Analyzing impact: For measuring the impact you segment the data by bucket and measure what happened before and after the start of the experiment. To see whether the changes are significant or not, you need to rely on the CausalImpact library. You want to send the data for different buckets in a way that can be visualized like this:

Sending dataLayer events for measuring SEO experiments
Send data about the buckets and elements (de)activated to web analytics
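
As a rough first look before running a proper CausalImpact analysis, you can compare how the variant bucket changed relative to the control bucket. The sketch below is a plain difference-in-differences calculation, not CausalImpact, and it says nothing about significance. The function names and the daily session numbers are made up for illustration.

```python
from statistics import mean


def relative_change(before, after):
    """Relative change of the mean daily sessions between two periods."""
    return (mean(after) - mean(before)) / mean(before)


def difference_in_differences(control_before, control_after,
                              variant_before, variant_after):
    """Change in the variant bucket minus the change in the control bucket."""
    return (relative_change(variant_before, variant_after)
            - relative_change(control_before, control_after))


# Made-up daily organic sessions per bucket, before and after the launch.
control_before = [1000, 1020, 980, 1000]
control_after = [1050, 1070, 1030, 1050]
variant_before = [990, 1010, 1000, 1000]
variant_after = [1100, 1120, 1080, 1100]

uplift = difference_in_differences(
    control_before, control_after, variant_before, variant_after
)
print(f"Variant changed {uplift:.1%} more than control")
```

Subtracting the control bucket's change is what filters out things like seasonality that hit all pages at the same time.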

Anomalies: Analyze your buckets individually! Do you see any spikes in traffic that might be hurting the data quality? You need to be sure that is not the case. For example, what if one of the buckets contains pages about political topics that all of a sudden see a spike in search volume? This doesn’t mean that your pages have been performing better, it means there was just more demand, so the data for that variant might be invalid.


Examples of SEO Experiments

As I mentioned, both at The Next Web and Postmates I was responsible for running SEO experiments. Most of them were around optimizing <title> tags, as changes to these have, in most cases, a direct connection to the CTR within the SERPs. The title is, in the end, used as the headline for a specific search result. So let me walk you through an example of an SEO experiment as we ran it at Postmates.

Updating Titles with additional keywords

The problem: We noticed a lot of search traffic for terms around food delivery in which a zip code, like 91615, was mentioned. As we could ‘easily’ create pages for zip codes we wanted to know if that was worth it, so: “What can we do to drive additional searches around zip codes without building new landing pages and wasting valuable engineering resources doing so?”

The solution: As we knew in which specific zip codes each restaurant was active, we had the ability to mention the zip code in the title. As we were doing this across tens of thousands of restaurants we knew that we had enough of a sample size.

  • Old:
    • {Restaurant Name} {Street Address} in {City} – Postmates On-Demand Delivery
    • Paxti’s Pizza 176 Fillmore Street in San Francisco – Postmates On-Demand Delivery
  • New:
    • {Restaurant Name} {Street Address} in {City} ({Zip Code}) – Postmates On-Demand Delivery
    • Paxti’s Pizza 176 Fillmore Street in San Francisco (97521) – Postmates On-Demand Delivery
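
To illustrate how the old and new title formats above could be served per bucket, here is a minimal sketch. The templates follow the formats listed above, but the function `build_title`, the field names and the bucket labels are assumptions for illustration, not the actual Postmates implementation.

```python
# Templates based on the old and new title formats from the experiment.
OLD_TITLE = "{name} {street} in {city} – Postmates On-Demand Delivery"
NEW_TITLE = "{name} {street} in {city} ({zip_code}) – Postmates On-Demand Delivery"


def build_title(restaurant, bucket):
    """Return the <title> text for a restaurant page, depending on its bucket.

    `bucket` is assumed to be either "control" or "variant".
    """
    template = NEW_TITLE if bucket == "variant" else OLD_TITLE
    # str.format ignores keys that a template does not use (zip_code here).
    return template.format(**restaurant)


# Example data taken from the titles listed above.
restaurant = {
    "name": "Paxti’s Pizza",
    "street": "176 Fillmore Street",
    "city": "San Francisco",
    "zip_code": "97521",
}

print(build_title(restaurant, "control"))
print(build_title(restaurant, "variant"))
```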

The result: It was inconclusive. That likely isn’t the outcome that you were hoping for, but I want to paint a realistic picture of what can happen when you run experiments. In a lot of cases, you don’t see enough changes in the data to be certain that it’s an improvement. In this case, we expected that a title change wasn’t good enough to actually compete for zip code related queries. The food delivery industry is one of the most competitive in the world for SEO, so we knew it was always possible to have an outcome like this.


Frequently Asked Questions + Answers

There are a lot of questions that come up when I talk about this subject with people. So I’ll try to keep up this blog post with any new questions that might arise:

Isn’t this cloaking? Doesn’t this hurt for my Google rankings?

Not really, you’re not changing anything based on who’s looking at the page. You’re changing this only on certain pages that are being served, and the search engine + user will see the same thing. Ever looked at Amazon’s product pages and wondered why they all have a different layout? Because they’re testing for both user experience and SEO.

Do you want to learn more about this subject?

Great, when we were setting up our own SEO experimentation framework at Postmates about 6 months ago I tried to find all the articles related to it and talked to as many people as possible on this. These were mostly the articles I would refer you to if you want to learn more.

Resources

When I wanted to learn more about SEO experimentation I started to figure out what was already written on the web, most of these resources are from teams & companies that I worked with before. So if you’re enthusiastic about this subject, read more here:


Let’s really start innovating in the SEO industry and let’s get rid of terrible clickbait headlines. SEO Experimentation is, like I mentioned, something that we should be embracing as it’s a more scientific approach that is going to lead to finding new insights. If you want to talk more about this, feel free to reach out. Like I said, I have been updating this post for over 6 months now, and will keep on doing so in the future.

Finding & Dealing with Related Keywords

Last updated on September 26th, 2018 at 04:47 am

How do you go from 1 keyword to another 10.000 that might also be relevant to your business/site? It’s one of the things that I’ve been thinking about and worked on for some sites recently. It’s fun, as with smaller sites it’s easy to get more insights into the estimated size of the industry/niche that a company operates in. This ain’t rocket science and hopefully, after this blog post, you’ll have some new ideas on how to deal with this.

How to get started?

Pick 1 keyword, preferably short-head: coffee mug, black rug, Tesla Roadster. They’re keywords that can create a good start for your keyword research as they’re more generic. In the research itself, we’ll talk about ways to get more insights into the long tail based on this 1 keyword.

From 1 to 10.000

Start finding related keywords for the keyword(s) you picked that you consider relevant. Use the tools that we’re going to talk about after this and repeat the process for all the keywords that you get back after the first run: 1 = 100 results = 10.000 results. Depending on the industry/niche that you operate in you might be able to find even more keywords using this method. When I started doing research for a coffee brand within 30 mins I ended up with data for 3 big niches within that space and over 25k keywords.
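
The repeat-the-process idea above (1 keyword, then 100, then 10.000) can be sketched as a small breadth-first expansion. This is a sketch under assumptions: `fetch_related` stands in for whichever tool or API you use to get related keywords, and the `FAKE_SUGGESTIONS` data is made up so the example runs on its own.

```python
from collections import deque


def expand_keywords(seed, fetch_related, max_rounds=2, limit=10_000):
    """Collect related keywords by repeatedly expanding what was found.

    `fetch_related` is a hypothetical callable: keyword -> list of keywords.
    `max_rounds` is how many times the process is repeated from the seed.
    """
    seed = seed.strip().lower()
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue and len(seen) < limit:
        keyword, depth = queue.popleft()
        if depth >= max_rounds:
            continue  # found in the last round, so not expanded any further
        for related in fetch_related(keyword):
            related = related.strip().lower()
            if related and related not in seen:
                seen.add(related)
                queue.append((related, depth + 1))
                if len(seen) >= limit:
                    break
    return seen


# Made-up suggestions, standing in for a real keyword tool.
FAKE_SUGGESTIONS = {
    "coffee mug": ["black coffee mug", "travel coffee mug"],
    "black coffee mug": ["matte black coffee mug", "coffee mug"],
    "travel coffee mug": ["insulated travel coffee mug"],
}


def fake_fetch_related(keyword):
    return FAKE_SUGGESTIONS.get(keyword, [])


keywords = expand_keywords("coffee mug", fake_fetch_related)
print(sorted(keywords))
```

The `seen` set is what keeps the list from exploding with duplicates, as related keyword tools tend to send you back to keywords you already have.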

What tools are out there?

Obviously, you can’t do this without any tools. For my own research, I use the tools that are listed beneath. They’re a mix of different tools, but eventually they have the same output: getting to know more keywords while at the same time getting different input on intent, both search-focused (I’m looking for.. {topic_name}) and other search intent (I have a question around {topic_name}).

Besides the tools that I’ve listed, there are many more that you could be using and benefiting from:

    • Google Adwords Keyword Tool: The best source for related keywords by a keyword.
    • SEMRush: The second best source likely as they’re using all sorts of ways to figure out what keywords are related to each other. Also a big database of keywords.
    • AnswerThePublic: Depending on why/what/where/who you’re looking for, AnswerThePublic can help you find keywords that are related to a user question.

Suggested Searches:

    • Google, Bing, Yahoo: The biggest search engines in the world are all using different ways to calculate related searches through their suggestions. So they’re all worth looking into.
    • Google Trends: Is a keyword trending or not and what keywords are related to a trending topic. Mostly useful when you’re going after topics that might have (had) some popularity.
    • YouTube: Everything video related, need I say more.
    • Wikipedia: If you’re really looking for some in-depth information on the topic, Wikipedia can likely tell you more about the topic and the related topics that are out there.
    • Instagram: Everything related to pictures and keywords, their hashtags might mislead you from time to time.
    • Reddit: The weirdest place to find keywords and topics.
    • Quora: Users have questions, you can answer them. The most popular questions on Quora on a topic are usually the biggest questions on your customer’s minds too.
    • Yahoo Answers: Depending on the keyword the data can be a bit old, who still uses Yahoo? But it can be useful to get the real hardcore keywords with a question intent.
    • Synonyms: The easiest relevance, find the keywords that have the same intention.
    • Amazon: Find keywords that people are using in a more transactional intent and that you might search for when you’re looking for a product. Great for e-commerce.

Grouping Keywords

When you’ve found your related keyword data set it’s time for the second phase: grouping them together. In the end, 1 keyword never comes alone and there is a ton you can do with them if you group them together in a way that makes sense for you.

By name/relevance/topical: Doing this at scale is hard, but I’m pretty sure that you see the similarity between the keywords ‘coffee mug’ and ‘black coffee mug’. In both, ‘coffee mug’ is the overlapping part (a bigram). If you start splitting up keywords into their individual words, you’re able to find the top words and word combinations that your audience is using most relatively fast. If you want to find out more on how to group them, check out KeywordClarity.io, where you can group keywords together based on word groupings.
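
A minimal sketch of the word-combination idea: split every keyword into n-grams (bigrams by default) and count in how many keywords each one shows up. The function names and the sample keywords are made up for illustration.

```python
from collections import Counter


def ngrams(keyword, n):
    """Split a keyword into word n-grams, e.g. bigrams for n=2."""
    words = keyword.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def top_ngrams(keywords, n=2, top=10):
    """Count in how many keywords each n-gram appears, most common first."""
    counts = Counter()
    for keyword in keywords:
        # set() so an n-gram counts only once per keyword
        counts.update(set(ngrams(keyword, n)))
    return counts.most_common(top)


keywords = ["coffee mug", "black coffee mug", "coffee mug with lid", "travel mug"]
print(top_ngrams(keywords, n=2, top=3))
```

Here ‘coffee mug’ comes out on top as it appears in three of the four keywords, which makes it a natural group name.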

By keyword volume: If you have the right setup you can retrieve the keyword volumes for all of these keywords and start bucketing the keywords together based on short-head and the long tail. This will enable you to get better insights into the total size of the volume in your industry/niche.
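
As a sketch of bucketing by volume: once you have a volume per keyword, you can split the set into short-head and long tail and sum the volume per bucket. The threshold of 1.000 searches per month and the volumes in the example are made-up assumptions; pick a cut-off that makes sense for your niche.

```python
def volume_bucket(monthly_volume, head_threshold=1_000):
    """Label a keyword as short-head or long-tail based on its volume.

    The threshold is an arbitrary assumption, not an industry standard.
    """
    return "short-head" if monthly_volume >= head_threshold else "long-tail"


def summarize_by_bucket(volumes, head_threshold=1_000):
    """Sum the number of keywords and the total volume per bucket."""
    summary = {}
    for keyword, volume in volumes.items():
        bucket = volume_bucket(volume, head_threshold)
        entry = summary.setdefault(bucket, {"keywords": 0, "volume": 0})
        entry["keywords"] += 1
        entry["volume"] += volume
    return summary


# Made-up monthly search volumes.
volumes = {
    "coffee mug": 40_000,
    "travel mug": 12_000,
    "black coffee mug": 700,
    "matte black coffee mug": 90,
}
print(summarize_by_bucket(volumes))
```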

By ranking, aka opportunity: It would be great if you can combine your keywords with data from rankings, so you know what the opportunity is and for which words you’re still missing out on some additional search volume.

What’s next?

Did you read the last part? What if you would start combining all three ways of grouping them? In that case, you’ll get more insights into the opportunity, your current position in the group and what kind of topical content you should be serving your audience. Food for thought for future blog posts around this topic.

Using Keyword Rankings In SEO

A few weeks ago I gave a talk at an SEO Meetup in San Francisco. It was a great opportunity to get some more feedback on a product/tool that I’m working on (and that we are already using at Postmates). You’ll hear more on this in the upcoming months (hopefully). In a previous blog post at TNW I talked about using dozens of GBs of data to get better insights in search performance. Over the last years I kept working on the actual code around this to also provide myself with more insights into the world around a set of keywords.

Because billions of searches are done on a daily basis and ~20% of queries haven’t been searched for in the past 30-90 days, there is always something new to find out. I’m on the hunt to explore these new keyword areas/segments & opportunities as fast as possible to get an idea of how important they can be.

That means two things:

  1. The keyword might be absolutely new and has never been searched for.
  2. The keyword has never come up on the radar of the company, it was never a related keyword or never got an impression simply because content didn’t rank for it.

Usually the next thing you want to know is what their ranking is so you can start improving on it; obviously that can be done in thousands of ways. Hopefully the process works something like this: moving up from an insane ranking (read: nowhere to be found) to the first position within a dozen weeks (don’t we all wish that could happen in that amount of time?).

Obviously what you’re looking for is hopefully a graph for a keyword that will look something like this:

What am I talking about?

Back at TNW my team was tracking 30.000 keywords on a weekly basis to get better insights into what was happening with our search volume & our rankings. It has multiple benefits:

  1. Get insights into your own performance for specific keywords.
  2. Get insights in your actual performance in search engines (are 100 keywords increasing/stable/decreasing?).
  3. Get insights into your competitors’ performance.

Besides that, there is a great opportunity to learn more about the flux/delta of changes in the search results. You’re likely familiar with MozCast, SERPMetrics Flux and other ‘weather’ radars that monitor the flux in rankings for tons of keywords to see what is changing and whether there has been an update. With your own toolset you’ll be able to get insights into that immediately. I started thinking about this whole concept years ago after this MozCon talk from Martin McDonald in 2013. One of the things that are particularly interesting:

Share of Voice

You’ve also likely heard of the concept of Share of Voice in search. In this case we’re talking about it in the context of rankings. If you rank #100 in the search results, you get 1 point. If you rank #1, you get 100 points. That basically means that you get more points the higher you rank. If you bundle all the keywords together, let’s say 100 of them, you can get a maximum of 100 x 100 = 10.000 points in total. Over time this will help you to see how a lot of rankings are being influenced and where you’re growing, instead of being focused on just the ranking of 1 keyword (always a bad idea in my opinion).
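
The point system described above boils down to 101 minus the rank. A minimal sketch, where the function names and the example rankings are made up, and where treating a keyword without a top 100 ranking as 0 points is an assumption:

```python
def rank_points(rank, max_rank=100):
    """Points for one keyword: #1 gets 100 points, #100 gets 1 point.

    Not ranking within the top `max_rank` (rank is None) is assumed to be
    worth 0 points.
    """
    if rank is None or rank < 1 or rank > max_rank:
        return 0
    return max_rank + 1 - rank


def share_of_voice(rankings, max_rank=100):
    """Return (points, share of the maximum) for a dict of keyword -> rank."""
    points = sum(rank_points(rank, max_rank) for rank in rankings.values())
    maximum = max_rank * len(rankings)
    return points, (points / maximum if maximum else 0.0)


# Made-up rankings for a small keyword group.
rankings = {"coffee mug": 1, "black coffee mug": 100, "travel mug": None}
points, share = share_of_voice(rankings)
print(points, f"{share:.1%}")
```

Tracking the share instead of the raw points makes groups of different sizes comparable over time.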

In addition to measuring this for yourself, there will also be other useful ways you can use Share of Voice:

  • Who are my competitors: Obviously you know your direct competitors, but most of the time that doesn’t mean that they’re the same ones you’re up against in the search results. Get the top 10-20-50-100 (whatever works for you), count the URLs for the same domain across all of the keywords in a group and multiply that by their Share of Voice. The ones that rise to the top will be the competitors that are annoying you most.
  • Competitors: You’re familiar now with the concept, so if you apply the same thing to your competitors you’re able to figure out how they’re growing compared to you and what their coverage is in search for a set of keywords. Basically providing you with the data you otherwise would have to dig up somewhere else.
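
One way to read the ‘who are my competitors’ idea is to sum the Share of Voice points (rank #1 = 100 points, rank #100 = 1 point) per domain across all keywords in a group. The sketch below does just that. The function name, the input shape (keyword mapped to an ordered list of result URLs) and the example domains are assumptions for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse


def competitor_share_of_voice(serps, max_rank=100):
    """Sum ranking points per domain over a group of keywords.

    `serps` maps each keyword to its result URLs, ordered by position.
    """
    scores = defaultdict(int)
    for urls in serps.values():
        for position, url in enumerate(urls[:max_rank], start=1):
            domain = urlparse(url).netloc.lower()
            if domain.startswith("www."):
                domain = domain[4:]
            scores[domain] += max_rank + 1 - position
    # Highest score first: these are the domains you run into the most.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up search results for two keywords.
serps = {
    "food delivery": ["https://www.a.example/x", "https://b.example/y"],
    "pizza delivery": [
        "https://b.example/z",
        "https://www.a.example/w",
        "https://b.example/q",
    ],
}
print(competitor_share_of_voice(serps))
```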

How can you combine it with other data sets?

In a future blog post I’m hoping to tell you more about how to do the actual work to connect your data to other sets in order for it to make sense. But the direction I’m heading in right now is to also look more at competitors, or at least other people in the same space. There is probably a big overlap with them, but there also will be a lot of keywords missing.

What’s next?

I’m nearing the end of the first usable alpha version. It will enable users to track their rankings wherever they want. Don’t dozens of tools already do that? Yes! I’m just trying to make the process more useful for bigger companies and provide users with more opportunities to expand their keyword arsenal. All with the goal to increase innovation in this space and to lower costs. It doesn’t have to be expensive to track thousands of keywords whenever you want.

Measuring SEO Progress: From Start to Finish – Part 2: From Creation to Getting Links

How to measure (and over time forecast) the impact of features that you’re building for SEO and how to measure this from start to finish. In this series I already provided some more information on how to measure progress: from creation to traffic (part 1). This blog post (part 2) will go deeper into another aspect of SEO: getting more links and how you can measure the impact of that. We’ll go a bit more into depth on how you can easily (through 4 steps, 1 bonus step) get insights into the links that you’ve acquired and how to measure their impact.

1. Launch

You’ve spent a lot of time writing a new article or working on a new feature/product with your team, so the last thing you want is not to receive search traffic for it and not start ranking. For most keywords you’ll need to do some additional authority building to make sure you’ll get the love that you might be needing. But it’s going to be important to keep track of what’s happening around that to measure the impact of your links on your organic search traffic.

2. Monitor

So the first thing you’d like to know is whether your new page is getting any links; there are multiple ways to track this. You can use the regular link research tools, which we’ll talk about more in depth later in this piece. But one of the easiest ways to see if a link has real impact is to figure out if you’re receiving traffic from it and since when. That’s simple and easy to figure out in Google Analytics: head to the traffic sources report and see if that specific page is getting any referral traffic. Is that the case? Then try to figure out when the first visit was, so you know roughly since when you’ve had this link, or look at the obvious thing: the published date, if you can find it.

How to measure success?

Google Alerts, Mention, Just-Discovered Links (Moz) and, as described, Google Analytics. They’re all tools that can be used to identify incoming links that might be relatively new, as they’re mentions in the news media or just the newest links being picked up by a crawler. It’s important to know about these as you don’t want to be dependent on a link index that is updated on an irregular basis.

3. Analyze

Over a longer period of time you want to know how your authority through links is increasing. I’m not a huge fan of the ‘core metrics’ like Domain Authority, Page Authority, etc., as they can change without providing any context, so I’d rather look at the graphs of new and incoming root domains to see how fast that is growing. In the end it is a numbers game (usually more quality + quantity), so that’s the best way to see it. One of my favorite reports in Majestic is the cumulative links + domains report, so I can get an easy grasp of what’s happening. Are you rapidly growing up and to the right or is progress slow?

How to measure success?

One suggestion that I would have is to look at the cached pages for your links. By now you’ve figured out what kind of links are sending traffic, so that’s a good first sign. But are they also providing any value for your SEO? Put the actual link into Google and see if the page is being indexed + cached. It is? Good for you, that means the page is of good enough quality to be indexed and cached by Google. It’s not? Hmm, then there is work to do for now, and your actual page might need some authority boosting of its own.

4. Impact

Are your links really impacting the authority and ranking of the page? You would probably want to know. It’s one of the harder things to figure out, as a lot of variables can be playing a role in this. It’s basically a combination of the value of these links, for which you could use the metrics of one of the link research tools, and the actual changes in search traffic for your landing page. Do you see any changes there?

5. Collect all the Links

In addition to getting insights into what kind of links might be impacting your rankings for a page, you’ll likely want to know where all of your links can be found. That’s relatively simple, it’s just a matter of connecting all the tools together and using them in the most efficient way.

So sign up for at least the first three tools, as Google Search Console and Bing Webmaster Tools are free, you can use them to download your link profiles. When you sign up for Majestic you’re able to verify your account with your GSC account and get access to your own data when you connect your properties. So you just unlocked three ways of getting more data.

That’s still not enough? Think about getting a (paid) account at three other services so you can download their data and combine it with the previous data sets. You’re not going to be able to retrieve much more data than that, and you get a better overview as you’re now leveraging 6 different indexes.

(P.S. Take notice that all of them grow their indexes over time, a growing link profile might not always mean that you’re getting more links, it might be that they’re just getting better at finding them.)

How to measure success?

Download all the data on a regular basis (weekly, monthly, quarterly) and combine the data sets, as they’re all providing links and root domains you can easily add the sheets together and remove the duplicate values. You won’t have all the metrics per domain + link that way but still can get a pretty good insight into what your most popular linking root domains + links are.
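
Combining the exports and removing the duplicates can be sketched like this. The column names (‘Linking page’, ‘SourceURL’) and the sample rows are made up, so check what your own exports look like. Also note that the sketch reduces URLs to hostnames without ‘www.’, which is only an approximation of root domains; getting true root domains would need a public suffix list.

```python
import csv
import io
from urllib.parse import urlparse


def linking_domains(csv_text, url_column):
    """Return the set of linking hostnames found in one CSV export."""
    domains = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        url = (row.get(url_column) or "").strip()
        if not url:
            continue
        if "://" not in url:
            url = "http://" + url  # urlparse needs a scheme to find the host
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if host:
            domains.add(host)
    return domains


# Two made-up exports with different (hypothetical) column names.
gsc_export = (
    "Linking page\n"
    "https://www.example.org/post\n"
    "https://blog.example.net/a\n"
)
majestic_export = (
    "SourceURL,Score\n"
    "http://example.org/other,10\n"
    "https://news.example.com/b,20\n"
)

all_domains = (linking_domains(gsc_export, "Linking page")
               | linking_domains(majestic_export, "SourceURL"))
print(sorted(all_domains))
```

Using sets takes care of the deduplication: example.org shows up in both exports but only once in the combined result.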
In the previous part I talked more about measuring the impact from creation to getting traffic. Hopefully the next part will provide more information on how to measure business impact & potentially use the data for forecasting. In the end when you merge all these different areas you should be able to measure impact in any stage independently. What steps did I miss in this analysis and could use some more clarification?

Measuring SEO Progress: From Start to Finish – Part 1: Receiving Traffic

How to measure (and over time forecast) the impact of features that you’re building for SEO and how to measure this from start to finish. It’s a topic that I’ve been thinking about a lot for the last few months. It’s hard, as most of the actual work that we do can’t be measured easily or directly correlated to results. It requires a lot of resources and mostly a lot of investment (time + money). After having a discussion about this on Twitter with Dawn Anderson, Dr. Pete and Pedro Dias I thought it would be time to write up some more ideas on how to get better at measuring SEO progress and see the impact of what you’re doing. What can you do to safely assume that the right things are impacted?

1. Create

You’ve spent a lot of time writing a new article or working on a new feature/product with your team, so the last thing you want is not to receive search traffic for it. Let’s walk through the steps to get your new pages in the search engines and look at the ways you can ‘measure’ success at every step.

2. Submit: to the Index and/or Sitemaps

The first thing that you can impact is making sure that your pages are being crawled, in the hope that they’ll be indexed right after. There are different ways to do this: you can either submit them through Google Search Console to have them fetched (and hope that this form still works), or list your pages in a sitemap and submit that through Google Search Console.

Want to go ‘advanced’ (#sarcasm)? You can even ping search engines for new updates to your sitemaps or use something like PubSubHubbub to notify other sources as well that there is new content or that there are new pages.

How to measure success? Have you successfully submitted your URL via the various steps? Then you’ve basically completed this step. For now there’s not much more you can do.

3. Crawled?

This is your first real test, as submitting your page doesn’t even mean these days that it will be crawled. So you want to make sure that after you submit it, the page is actually being seen by Google. After they’ve done this they can evaluate if they find it ‘good enough’ to index it. Before this step you mostly want to make sure that you, indeed, made the best page ever for users.
How to measure success? This is one of the hardest steps as most of the time (at least for bigger sites) you’ll need access to the server logs to figure out what kind of URLs have been visited by a search engine (User Agent). What do you see for example in the following snippet:

30.56.91.72 - - [06/Sep/2017:22:23:56 +0100] "GET" - "/example-folder/index.php" - "200" "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" - www.example.com

It’s a visit to the hostname www.example.com, on the specific path /example-folder/index.php, which returned a 200 status code (successful) on September 6th, 2017, and the User Agent contained Googlebot. If you’re able to filter down on all of this data in your server logs, you can identify which pages are being crawled and which aren’t over a period of time.
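
Filtering log lines like the one above can be sketched with a regular expression. The pattern below is written for the exact format of this example line, which is an assumption: your own server’s log format will likely differ, so adjust it accordingly. Also keep in mind that a User Agent can be faked, so for anything serious you’d want to verify Googlebot visits as well (Google documents a reverse DNS lookup for this).

```python
import re

# Pattern matching the layout of the example log line above (an assumption).
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+)" - "(?P<path>[^"]*)" - "(?P<status>\d{3})" '
    r'"[^"]*" "(?P<user_agent>[^"]*)" - (?P<host>\S+)$'
)


def parse_log_line(line):
    """Return a dict with the fields of one log line, or None if no match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    fields["is_googlebot"] = "Googlebot" in fields["user_agent"]
    return fields


line = (
    '30.56.91.72 - - [06/Sep/2017:22:23:56 +0100] "GET" - '
    '"/example-folder/index.php" - "200" "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
    ' - www.example.com'
)

parsed = parse_log_line(line)
if parsed and parsed["is_googlebot"]:
    print(parsed["host"], parsed["path"], parsed["status"])
```

Run this over all lines in a log file and collect the paths where `is_googlebot` is true to get the list of crawled URLs per day.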

4. Indexed: Can the URL be found in the Index?

Like I mentioned before, a search engine crawling your page is no guarantee at all that it will also be indexed. Having worked with a lot of sites with pages that are close to duplicates, I’ve seen the risk that they might not be indexed. But how do you know, and what can you do to evaluate what’s happening?

How to measure success? There are two very easy ways. Manual: just put the URL in a Google Search and see if the actual page comes up. If you want to do this at a higher scale, look at the sitemaps indexed data in Google Search Console to see what percentage of pages (if you’re dealing with template pages) is being indexed. The success factor: your page shows up. It means that it’s getting ready to start ranking (higher).

5. First Traffic & Start Ranking

It’s time to start achieving results: the next step after making sure that your site is indexed is to start achieving rankings, as a better ranking will help you get more visits (even on niche keywords). In this blog post I won’t go into what you can do to get better rankings, as too many blog posts have been written about that topic already.

How to measure success? Read this blog post from Peter O’Neill (Mr. MeasureCamp) on what kind of tracking he added to measure the first visits coming from Organic Search. This is one of the best ways I know for now, as it will also allow you to retrieve this data via the Google Analytics Reporting API, making it easier to automate reporting on this.

As an alternative you can use Google Search Console and filter down on the Page. So you’re only looking at the data for a specific landing page. Based on that you can see over time how search impressions + clicks have been growing and when (only requirement is that you should have clicks in the first 90 days of launch of this page, but you’re a good SEO so capable of achieving that).

6. Increase Ranking

In the last step we looked at when you received your first impression. But Google Search Console can also tell you more about the position for a keyword. This is important to know to make sure that you can still increase your efforts or not to get more traffic in certain areas.

In some cases it means that you can still improve your CTR% by optimizing the snippet in Google. For some keywords it might mean that you hit your limit, for others it might mean that you can still increase your position by a lot.

How to measure success? Look at the same report, Search Analytics, that we just looked at for the first visit of a keyword. By enabling the data for the Impressions you can monitor what your rankings are doing. In this example you see that the rankings are fluctuating on a daily basis between 1-3. When you’re able to save the data on this over time you can start tracking rankings in a more efficient way.

Note: To do this efficiently you want to filter down on the right country, dates, search type and devices as well. Otherwise you might be looking into data from other countries, devices, etc. that you’re not interested in. For example, I don’t care right now about search outside of the US, I probably rank lower and so they could drop my averages (significantly).
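
To show why the filtering matters for your averages, here is a small sketch that computes an average position over exported rows, with and without a country filter. The row structure, field names and numbers are made up, and weighting the positions by impressions is an assumption about how you’d want to combine rows.

```python
def average_position(rows, country=None, device=None):
    """Impression-weighted average position over the rows that match.

    `rows` is a list of dicts with hypothetical keys: country, device,
    impressions and position. Returns None when nothing matches.
    """
    total_impressions = 0
    weighted_sum = 0.0
    for row in rows:
        if country and row["country"] != country:
            continue
        if device and row["device"] != device:
            continue
        total_impressions += row["impressions"]
        weighted_sum += row["position"] * row["impressions"]
    if total_impressions == 0:
        return None
    return weighted_sum / total_impressions


# Made-up rows for one keyword.
rows = [
    {"country": "usa", "device": "mobile", "impressions": 100, "position": 2.0},
    {"country": "usa", "device": "desktop", "impressions": 300, "position": 4.0},
    {"country": "nld", "device": "mobile", "impressions": 100, "position": 20.0},
]

print(average_position(rows, country="usa"))  # US only
print(average_position(rows))                 # all countries mixed together
```

In this made-up data the one non-US row is enough to pull the overall average down considerably compared to the US-only number.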

As Google Search Console only shows data for the last 90 days, I would recommend saving the data (export CSV). In a previous blog post I wrote during my time at TNW I explained how to do this at scale via the API. As you’re monitoring more keywords over time this is usually the best way to go.

7. First Positions

In the last step I briefly mentioned that there is still work to be done when you’re ranking in position 1 for a specific keyword: you can usually still optimize your snippet for a higher CTR%. These are the easier tasks in optimization, I’ve noticed over time, although at scale they can be time consuming. But how do you find all these keywords?

Keyword Rankings

I still believe in keyword rankings. Definitely when you know what locations you’re focusing on (on a city, zip code or state level), you’re still able these days to measure the actual SERPs via the many tools out there (I’m still working on something cool, bear with me for a while until I can release it). The results in these reports can tell you a lot about where you’re improving and if you’re already hitting the first positions in the results.

How to measure success? You stay in the same report as before. Make sure that you’ve segmented your results for the right date range and on the right device, page, country or search type that you want to cover. Export your data and filter or sort the position column to get the rows where position == 1. These are the keywords that you might want to stop optimizing for.
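
Filtering the export on position can be sketched like this. The column names (‘Query’, ‘Position’) and the sample rows are assumptions, so check the header of your own export. As the position in the report is an average, the sketch uses ‘at most 1.0’ as the check, which you could loosen to something like 1.5 if you also want the keywords that are almost always in first position.

```python
import csv
import io


def first_position_queries(csv_text, max_position=1.0):
    """Return the queries whose average position is at most `max_position`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["Query"]
        for row in reader
        if float(row["Position"]) <= max_position
    ]


# Made-up export; the column names are assumptions.
export = (
    "Query,Clicks,Impressions,CTR,Position\n"
    "food delivery,10,100,10%,1.0\n"
    "pizza delivery,5,200,2.5%,3.4\n"
    "sushi delivery,7,50,14%,1.0\n"
)

print(first_position_queries(export))
```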
What steps did I miss in this analysis and could use some more clarification?
In the next part of this series I would like to take a next step and see how we can measure the impact from start to finish for links, followed by part three on how to measure conversions and measure business metrics (the metrics that should really matter). In the end when you merge all these different areas you should be able to measure impact in any stage independently.

What tools am I using for SEO?

Last updated on December 15th, 2017 at 09:33 pm

A while back somebody posted the SEO platforms/vendors/tools that he was using at his agency job (as an SEO). As I was missing some great tools in there I decided to respond, but it also got me thinking about my own toolset. So I decided to dedicate a blog post to it: to get better recommendations and learn from others what they’re using, but hopefully also to shine some light on what I am looking for in tools. This is not all of it and I didn’t really have time to explain in detail what I’m using specific tools for (I might dedicate some posts to this over time), but I at least wanted to give you a first look. So here we go..

In general I have three requirements for tools:

  • It should be easy to use & user friendly: no weird interfaces, just stuff that works (true for 90% of my tools).
  • The most data/features available, or the opposite: have a very specific focus on 1 element of what I’m looking for.
  • They must have an API, so I can build things on top of it, preferably this is included in the pricing of the tool (normal for most tools these days).

Google Search Console, Bing Webmaster Tools, Yandex Webmaster Tools

Obviously Google Search Console is the tool that really matters out of the three, as most of my time is spent managing our visibility in Google. My favorite reports are Search Analytics, for getting a quick overview of our performance (we mostly use that data outside of the interface, through their API/R library), and Structured Data (don’t forget about the Structured Data Testing Tool), to track what we’re doing with Schema.org on our pages. From time to time I might look into the Index Status report when I’m dealing with multiple domains at the same time.

One of the reasons why I like Bing Webmaster Tools is that their Index Explorer enables you to find directories & subdomains that exist on the site. That’s a great benefit if you’re just getting started with a new site. Still, after years at The Next Web and these days at Postmates, I’m finding out about folders or subdomains that you never hear about on a day-to-day basis but that might cause issues for SEO.

Google Analytics & Google Tag Manager

You get the point on this one, right? You’re tracking your traffic and the combination of the two can help you track all the contextual data through custom dimensions or other metrics/dimensions that will help you understand your data better. I’ve blogged about them many times on The Next Web while I was there and will continue to do so in the future.

Screaming Frog & Deepcrawl

Getting more insights into your technical structure is super valuable when you’re working on a technical audit. Screaming Frog for day-to-day use on subsets of data and Deepcrawl for weekly all-pages crawls are very powerful and help me get more insights into what kind of pages or segments are creating issues. I like to use them both as they have different reports, and the differences between the tools help me better understand issues.

In my current toolset, Botify, which I’ll mention later in this post, is a third option.

SEMrush & Google Adwords Keyword Tool

You always want more insights in keywords and you want to know more about them, that’s what both tools are great at. They give you a great basis for a keyword research which you can use as the start of your site’s architecture, keyword structures and internal links structures. In my previous blog post on Google Search Console I kicked off the basis for a keyword research based on that, if you want to take it easy: go with these tools (as a start).

Majestic

Majestic might not be the most user friendly (hint & sorry!), but as they have one of the largest indexes it’s great for link research. In this case I definitely value data + quality over the friendliness of the tool.

AuthorityLabs / SERPmetrics

I still deeply believe in using ranking data. As I have the opportunity to do this at large scale and use the data at both a national & local level, it helps me get a better understanding of what’s happening in the rankings and mostly of what’s moving. It doesn’t necessarily have to be that I’m interested in our own rankings or our competitors’. But if certain features in the SERP suddenly move up, it will help me understand why certain metrics are moving (or not). It’s a great provider of intelligence data that you can leverage for prioritization and for measuring your impact.

AuthorityLabs used to be my favourite tool, but since they changed their pricing model I’ve switched over to SERPmetrics.

Botify / Servers

Getting more insights from the log file data that you have on your servers can be extremely useful (I must add that this mostly applies to big-site SEO). Right now I’m using Botify for this. I’ll try to write a follow-up blog post explaining how this data can help you get more insights into the performance of the features/products that you build.

Monitoring

For bigger sites it’s really hard to keep up to date on the latest changes as so many people are working on it. That’s why we want to make sure that you get alerted when important SEO features are changing. We’re using some custom scripts in Google Drive but also like to use SEORadar.
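To give an idea of what such an alert can be based on, here is a minimal sketch in Python that compares two snapshots of important SEO elements per URL and lists what changed. It is not the script we run and not how SEORadar works. The snapshot structure (a dict per URL with fields like title, canonical and meta robots) is a hypothetical format chosen for the example.

```python
def diff_seo_snapshots(previous, current):
    """Compare two {url: {field: value}} snapshots and list every change.

    Returns tuples of (url, field, old_value, new_value). A value of None
    means the URL or field was missing in that snapshot.
    """
    changes = []
    for url in sorted(set(previous) | set(current)):
        before = previous.get(url, {})
        after = current.get(url, {})
        for field in sorted(set(before) | set(after)):
            if before.get(field) != after.get(field):
                changes.append((url, field, before.get(field), after.get(field)))
    return changes


# Made-up snapshots: the second one accidentally adds a noindex.
yesterday = {"/menu": {"title": "Menu", "robots": "index,follow"}}
today = {"/menu": {"title": "Menu", "robots": "noindex,follow"}}

for url, field, old, new in diff_seo_snapshots(yesterday, today):
    print(f"{url}: {field} changed from {old!r} to {new!r}")
```

How the snapshots get collected (a crawler, a scheduled script) and where the alert goes is left out on purpose, since that depends on your own setup.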

Google Cloud Platform

My former coworker Julian wrote a great blog post on how to scale up Screaming Frog and run it on a Google Cloud server. It’s one of many use cases for the Google Cloud Platform. Besides their servers, analyzing large data sets with BigQuery (with or without using their Google Analytics connection) gives you a better ability to handle large amounts of data (log files, internal databases, etc.).

APIs

  • Data: In addition to the tools I just listed, there are a few APIs that I’m using on a regular basis that make my life easier as they provide a lot of data. They’re APIs to retrieve keyword volumes and related keywords. To handle things on a bigger scale you’re going to want to work with APIs instead of dealing with Excel files.
  • Reporting: Most of the reports that you’re delivering can be automated, which is one of the biggest timesavers there is. I do this by using the Google Analytics reporting in Google Sheets, googleAnalyticsR, SearchConsoleR and the Google Analytics Reporting API V3 and V4.

What am I still looking for?

  • Quality Control & Assurance: Weekly crawls aren’t enough if things are messed up. You want to know this on an hourly basis. Mostly when things are moving so fast that you can’t keep track of changes anymore.
  • More link data: Next to Majestic it would be great to be able to combine the datasets of others as well when doing this research. Doing this manually is doable but not on a regular basis.
  • More keyword data: When you start your keyword research you can just start with a certain set of keywords. But it could be that you’re forgetting about a huge set of keywords in a nice related industry. I’m exploring how to have more keywords to start your keyword research with (we’re not talking about 19 extra keywords here, more like 190.000 keywords).

I’m sure the set of tools will keep evolving over the next months when new things happen. I’d love to learn more about the tools that you’re using. Shoot in the comments or on Twitter what I should be using and I’ll take a look!

From 99% ‘duplicate content’ to 15 editors and back to ‘duplicate content’

Duplicate content is (according to questions from new SEOs and people in online marketing) still one of the biggest issues in Search Engine Optimization. I’ve got news for you: it for sure isn’t, as there are plenty of other issues. But somehow it still always comes to the surface when talking about SEO. As I’ve been on both sides of the equation, having worked for comparison sites and a publisher, I want to reflect on both angles, and on why I think it’s really important to see both sides of the picture when looking into why sites could have duplicate content and whether they do it on purpose or not.

When I started in SEO about 11 or 12 years ago I worked for a company that listed courses from all around the globe on their website (Springest.com, let’s give them some credit), making it possible for people to compare them. By doing this we were able to create a really useful overview of, for example, training courses on the subject of SEO. One downside of this was that basically none of the content we had on our site was unique. Training courses often follow a very strict program and in certain cases are regulated by the government or institutions to provide the right qualification to attendees. That made it impossible to change any of the descriptions of contents, books or requirements, as they were provided by the institutions (read: copy pasted).

I have also worked on the complete other side, at The Next Web, where I had the privilege of working with 10-15 full-time editors all around the globe who write unique, fresh (news) content on a daily basis. Backed up by dozens of people willing to write for TNW, we were presented with the opportunity to choose what kind of posts we published. It made some things easier, but even at TNW we ran into content issues. The tone of voice changes (or devalues) over time as editors come and go, and when you publish more content from guest authors it’s hard to maintain the right balance.

These days I’m ‘back’ with duplicate content, working at Postmates where we work on on-demand delivery. That history makes it easier to deal with the duplicate content that we technically have from all of the restaurants (it’s published on their own sites and on some competitors’). With the previous experience it’s also way easier to come up with many more ideas based on the (duplicate) content that you already have. It also made me realize that most of the time you’re working with something that is duplicate, whether it’s the product info you have in ecommerce or the industry that you operate in. It’s all about the way you slice and dice it to make it more unique.

In the end, search engine optimization is all about content. Either duplicated or not. We all want to make the best of it and there is always a way to provide a unique angle. Although the angle of the businesses and the way of doing SEO for them is completely different there are certain skills required that I think could provide you with a benefit over a lot of people when you’ve worked with both.

Retrieving Search Analytics Data from the Google Search Console API for Bulk Keyword Research

Last updated on December 15th, 2017 at 09:27 pm

Last year I blogged about using 855 properties to retrieve all your Search Analytics data. Luckily, just after that Google announced that the API limit of retrieving only the top 5000 results had been lifted. Since then it’s been possible to potentially pull all your keywords from Google Search Console via their API (hint: you’re not able to get all the data).

Since I started at Postmates, now well over two months ago, one of the biggest projects I’ve taken on is getting insights into which markets + product categories we’re already performing OK in from an SEO perspective. With over 150.000 unique keywords weekly (and working on increasing that) it is quite hard to get a good grasp on what’s working or not, as we’re active in 50+ markets that influence the queries that people are searching for (for example: show me all the queries over a longer period of time with only Mexican in the title across all markets; impossible from the interface). That’s why clicking through the Search Analytics feature in Google Search Console was nice for checking specific keywords quickly, but overall it wouldn’t help in getting detailed insights into what’s working and what’s not.

Some of the issues I was hoping to solve with this approach:

  • Pull all your data on a daily basis so you can get an accurate picture of the number of clicks and how that changes over time for a query.
  • Hopefully get some insights into the actual number of impressions. Google Adwords Keyword Tool data is still very valuable, but as it’s grouped it can be off on occasion. Google Search Console should be able to provide more accurate data on a specific keyword level.
  • Use the data as a basis for further keyword research and categorization.

Having used the Google Search Console API a bit before I was curious to see what I could accomplish pulling in the data on a daily basis and making sense of it (and combining it with other data sets, maybe more on that in later blog posts).

The process:

  • Daily pull in all the keywords, grouped by landing page so you know for sure you get all the different keyword combinations and your data isn’t filtered by the API.
  • Save the specific keyword if we haven’t saved it before, so we know if the keyword was a ‘first-hit’ for the first time.
  • For every keyword that you return do another call to the API to get the country, landing pages and metrics for that specific query.
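The three steps above can be sketched roughly as follows in Python. This is not our production code. The request bodies use the field names of the Search Analytics query endpoint (startDate, endDate, dimensions, rowLimit, startRow, dimensionFilterGroups), while run_query is a hypothetical callable that you supply yourself: it takes a request body and returns the list of rows, for example by wrapping service.searchanalytics().query(siteUrl=..., body=body).execute().get("rows", []) from the Google API Python client.

```python
from datetime import date


def build_page_request(day, start_row=0, row_limit=5000):
    """Request body for one day of queries, grouped by landing page."""
    return {
        "startDate": day.isoformat(),
        "endDate": day.isoformat(),
        "dimensions": ["page", "query"],
        "rowLimit": row_limit,
        "startRow": start_row,
    }


def build_query_detail_request(day, query, row_limit=5000):
    """Request body for the country + landing page breakdown of one query."""
    return {
        "startDate": day.isoformat(),
        "endDate": day.isoformat(),
        "dimensions": ["country", "page"],
        "dimensionFilterGroups": [{
            "filters": [
                {"dimension": "query", "operator": "equals", "expression": query}
            ]
        }],
        "rowLimit": row_limit,
    }


def daily_pull(day, run_query, seen_keywords, row_limit=5000):
    """Run the three steps for one day.

    run_query: hypothetical callable(body) -> list of row dicts.
    seen_keywords: set of keywords saved on earlier days, updated in place.
    """
    # Step 1: page through all (landing page, query) rows for the day.
    keywords = {}  # a dict keeps insertion order and removes duplicates
    start_row = 0
    while True:
        rows = run_query(build_page_request(day, start_row, row_limit))
        for row in rows:
            _page, query = row["keys"]
            keywords.setdefault(query, None)
        if len(rows) < row_limit:
            break
        start_row += row_limit

    # Step 2: remember which keywords we see for the first time.
    first_hits = [kw for kw in keywords if kw not in seen_keywords]
    seen_keywords.update(first_hits)

    # Step 3: one extra call per keyword for country, page and metrics.
    details = {
        kw: run_query(build_query_detail_request(day, kw, row_limit))
        for kw in keywords
    }
    return first_hits, details
```

Passing the API call in as run_query keeps the sketch independent of how you authenticate, and makes it easy to try the flow with canned rows before hitting the real API. Step 3 is also where most of the API calls happen, so that is the part that will run into rate limits first.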

In our case we categorize the keywords right after we pull them in to see if it’s matching a certain market or product category. So far this has been really useful for us as it’s providing way better ways for dashboarding.
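A simple way to do this kind of categorization is matching every keyword against a list of terms per market and per product category. The sketch below, in Python, shows the idea. The markets, categories and their terms are made-up examples, not our actual lists, and the matching (whole words, first match wins) is an assumption you may want to change.

```python
import re

# Hypothetical term lists, for illustration only.
EXAMPLE_MARKETS = {
    "new york": ["new york", "nyc", "manhattan"],
    "san francisco": ["san francisco", "sf"],
}
EXAMPLE_CATEGORIES = {
    "mexican": ["mexican", "burrito", "taco"],
    "pizza": ["pizza"],
}


def first_matching_label(text, groups):
    """Return the first label whose terms appear as whole words in text."""
    for label, terms in groups.items():
        for term in terms:
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                return label
    return None


def categorize(keyword, markets=EXAMPLE_MARKETS, categories=EXAMPLE_CATEGORIES):
    """Attach a market and product category (or None) to a keyword."""
    text = keyword.lower()
    return {
        "keyword": keyword,
        "market": first_matching_label(text, markets),
        "category": first_matching_label(text, categories),
    }


print(categorize("Mexican food delivery NYC"))
```

With labels like these stored next to every keyword, a question like ‘all queries with Mexican in them across all markets’ becomes a simple filter in your dashboard.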

Some of the things that I ran into while building this out:

What to look out for?

  • The API heavily limits the keywords you get to see that only have impressions. I was able to retrieve some of the data, but on a daily basis the impression statistics are off by 50% from what I’m seeing in Google Search Console. However, clicks seem to show only a small difference, win!
  • Apparently they’re hiding some of the keywords as they qualify them as highly personal. So you’ll miss a certain percentage because of that.
  • The rate limits of the Google Search Console API aren’t very nice: for over 5k keywords it takes quite a while to pull in all the data.
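For the rate limits, a common way to cope is retrying a failed call with an increasing pause in between (exponential backoff). Below is a minimal sketch of that in Python. It is a general pattern, not something specific to the Search Console API, and the is_retryable check is a hypothetical hook: you decide there which exceptions from your client actually mean ‘rate limited’.

```python
import time


def call_with_backoff(fn, max_attempts=5, base_delay=1.0,
                      is_retryable=lambda exc: True, sleep=time.sleep):
    """Call fn() and retry with exponentially growing pauses on failure.

    is_retryable: hypothetical hook deciding whether an exception is worth
    retrying (for example only rate limit errors).
    sleep: injectable so the waiting can be skipped in tests.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(exc):
                raise
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

You would wrap every API call for a keyword in call_with_backoff, so a temporary rate limit slows the daily pull down instead of breaking it.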

Most of these items aren’t really an issue for us, as we have better sources for volume data anyway. In the future we’re hoping to gather data from more sources to extend this. I’m hoping to blog about that sometime in the future.