The top metrics & KPIs other teams care about for SEO

SEOs care about many metrics, often the wrong ones (rankings, DA, PA, you name it). What is often forgotten are the metrics that matter to the other teams and departments within their organization. In the end, building a company isn’t done by just one team. Over the years, it’s become clear that you work with many departments simultaneously on the same effort, and they often care about your channel’s metrics, just not always the same ones. So in this blog post, I want to shine some additional light on what metrics you should think about for other departments. It’s the follow-up to this tweet that got quite some attention, and this post gives me the ability to go a bit more in depth on the whys.

This list is likely incomplete and uses some generic names. Your organization might use different names or have additional departments that aren’t covered here. Hopefully, this gives you better insight into how to think about the various departments as they relate to SEO.

🏢 C-Level

Depending on what type of organization you work in and how broad your C-suite is, you likely report at some level into a COO/CMO who cares about SEO metrics. But in 100+ person companies, they often don’t have the bandwidth anymore to really dig into the SEO cases that you’re facing within an SEO team on a day-to-day basis.

Metrics:

  • Revenue, average order value (in most cases, this number shouldn’t differ too much from the performance of other channels), and the number of transactions.
  • Sessions from organic search, both as an absolute number and as a percentage of total traffic. Primarily the latter, as you want to keep a healthy, diverse balance in your marketing mix, something I’ve blogged about before.

🛠 Product

What is Product building that you can benefit from, and how are you working with Product to prioritize the changes that will drive additional growth from organic search? No product is ever finished, so there is always something you can help prioritize from an SEO point of view.

Metrics:

  • Load time: There has been enough buzz about the importance of site speed for good reason.
  • Number of Pages per Template
  • Growth in Sessions
  • Best Performing Page Segments
  • Conversion Rate from Organic Search, etc.

Not necessarily in that order, but these are usually the metrics that are impacted by, or together with, the Product organization.

💻 Engineering

Site speed, code velocity, site speed, and load times. Well, you get the point. It’s all about how fast the site is and how quickly you can work with an engineering team to get the changes you want implemented.

Metrics:

  • Load times/site speed, traffic to specific sections of the site.
  • Velocity of tickets/items that you want Engineering to implement.

💲 Finance

Metrics that show the potential for growth and the return on investment. In the end, in many companies, Finance is the gatekeeper of the money flowing in and out. They want better insight into what you’re spending and how that eventually contributes to the bottom line. Providing a simple version of a P&L for SEO will likely earn you a happy smile if you’re able to produce one.

Metrics:

  • ROI %: how much have you spent on SEO resourcing (team, tools, other expenses for content) versus what it will return.
  • Budget Spend, Returned Revenue, and future growth.

🖼  Marketing

Likely your closest allies in the ‘battle for SEO’, together with Product. Depending on the organizational structure, you’ll probably find the SEO team itself here or in Product. So having enough impact on the metrics that your marketing team cares about is important.

📝 Content: Do you have a separate content team? They’ll likely care about the organic traffic coming to their pages, and they should care about the impact on those business metrics too. Besides that, any insight into specific keywords (volume, CTR) is always useful for a team like this to help optimize existing content.

Metrics:

  • Impact on branded search terms: sessions.
  • Their increase/decline, so you can measure the uplift from other brand awareness campaigns.

🧳   Sales & Business Development

At what scale are you still able to set up partnerships yourself, and does that actually fit into the scope of SEO at scale? Likely the answer is no. That’s why you want to partner with a sales/biz dev team that can help you solidify partnerships with companies in your space. They have better skills for this, and in return you can likely provide them with more useful input on who to go after.

Metrics:

  • The number of big partnerships.
  • A shortlist of partners that you want them to go after, not just for dumb link building (preferably not, in my opinion). Instead, create lasting relationships that impact the industry, TAM (Total Addressable Market), and market presence.

📞  Customer Service

The better and faster you can answer your customers’ questions, the better your business will likely thrive in today’s environment. Often this means providing the answer directly in search (think featured snippets). You can’t do this alone as an SEO team; you need input from the people in Customer Service, as they’re the ones talking to your customers about the (mainly) negative and positive situations. The more you can support them with the metrics they care about, the more comfortable both your lives become.

Metrics:

  • Organic traffic to support-related pages, and the number of calls/chats that you avoid through better-optimized pages.
  • The top questions that you can answer directly via featured snippets.
  • The top 100 pages on your support portal, based on organic search segmentation.

📈 Growth

When I was on the Growth team at Postmates, the insane velocity that was produced there to grow faster was great to see. As SEO often isn’t the fastest-growing channel (especially not in the short term: I could throw $1M at PPC tomorrow and create near-instant results), it’s important to show how it contributes to a growth team’s mix of long- and short-term initiatives.

Metrics:

  • Growth % of the SEO channel, measured MoM, WoW, or YoY (see the short sketch after this list).
  • Long-term contributions to growth, as a lot of SEO growth is evergreen and comes at relatively low cost.
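If it helps to make that reporting concrete, here is a trivial sketch of the growth calculation (the session numbers and period labels are made up):

# Made-up monthly organic sessions
seo_sessions = {"2020-03": 120000, "2020-04": 150000, "2021-03": 180000, "2021-04": 210000}

def growth_pct(current, previous):
    """Percentage growth of the current period versus a comparison period."""
    return (current - previous) / previous * 100

mom = growth_pct(seo_sessions["2021-04"], seo_sessions["2021-03"])  # month over month
yoy = growth_pct(seo_sessions["2021-04"], seo_sessions["2020-04"])  # year over year
print("MoM: {:.1f}%, YoY: {:.1f}%".format(mom, yoy))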

👩‍💻 Human Resources

Admittedly, this is one of the departments most distanced from SEO, but if you’re a big organization recruiting for dozens or even hundreds of roles, it can be important to help drive traffic to the Careers/Jobs section of your site. If that’s the case, showing that you’re driving job applicants can be incredibly helpful in making clear what SEO can do for them.

Metrics:

  • The number of job applicants that applied because they found the jobs via Search.
  • Traffic to a specific segment of the site from Organic Search: Careers/Jobs.
  • The number of pages marked up properly with structured data for Jobs.

What metrics are missing? What do you measure for your organization? There are so many different business models out there that this list is likely far from complete; B2B cases, for example, are probably missing here.


Saving Bing Search Query Data from the Bing Webmaster Tools’ API

Over the last year, we spent a lot of time getting data from several marketing channels into our marketing data warehouse. The series that we did on this with the team has received lots of love from the community (thanks for that!). Retrieving Search Query data from Bing has proven to be one of the ‘harder’ data points: there is a lack of documentation, there are no real connectors directly to a data warehouse, and as it turns out, the quality of the returned data is… ‘interesting’, to say the least. That’s why I wanted to write this blog post: to provide the code to easily pull your search query data out of Bing Webmaster Tools and give more people the ability to evaluate their data. Hopefully, this provides the overall community with better insight into the data quality coming out of the API.

Getting Started

  1. Create an account on Bing Webmaster Tools.
  2. Add & Verify a site.
  3. Create an API Key within the interface (help guide).
  4. Save the API Key and the formatted site URL.

The code

These days I spend most of my time (whenever I get to write code) coding in Python, which is why this script is written in Python.

import datetime
import requests
import csv
import json
import re

# Your verified site URL and the API key created in Bing Webmaster Tools
URL = "https://example.com"
API_KEY = ''

request_url = "https://ssl.bing.com/webmaster/api.svc/json/GetQueryStats?apikey={}&siteUrl={}".format(API_KEY, URL)

request = requests.get(request_url)
if request.status_code == 200:
    query_data = json.loads(request.text)

    with open("bing_query_stats_{}.csv".format(datetime.date.today()), mode='w', newline='') as new_file:
        write_row = csv.writer(new_file, delimiter=',', quotechar='"')
        write_row.writerow(['AvgClickPosition', 'AvgImpressionPosition', 'Clicks', 'Impressions', 'Query', 'Created', 'Date'])

        for key in query_data["d"]:
            # The API returns dates as '/Date(1588291200000)/'; extract the epoch milliseconds
            match = re.search(r'/Date\((.*)\)/', key["Date"])

            # Positions come back multiplied by 10, so divide them to get the real values
            write_row.writerow([key["AvgClickPosition"] / 10,
                                key["AvgImpressionPosition"] / 10,
                                key["Clicks"],
                                key["Impressions"],
                                key["Query"],
                                datetime.datetime.now(),
                                datetime.datetime.fromtimestamp(int(match.group(1)) // 1000)])

Or find the same code here in a Gist file on Github.

Steps to take

  • Make sure you have the one external dependency installed: requests (csv, json, re, and datetime are part of the Python standard library).
    • pip install requests
  • Enter the API Key and Site URL in the constants at the top of the script, then run it: python bing_query_stats.py
  • If everything is successful the information is saved in this file: bing_query_stats_YYYY-MM-DD.csv

Data Quality

As I mentioned in the intro, the data quality is questionable and leaves much up to the imagination. That’s one of the reasons I wanted to share this script: so others can get their data out and we can hopefully learn together what the data represents. The big caveat seems to be that the data is exported at the time of extraction for a rolling window of XX days, and it’s not possible to select a date range. This means you can only make this data useful if you save it over a longer period of time and calculate daily performance from those snapshots. That’s all doable in our setup, where we use Airflow to save the data into our Google BigQuery data lake, but because it isn’t as straightforward, it might be harder for others.
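To illustrate that approach, here is a minimal sketch that compares two consecutive daily snapshots produced by the script above and writes the difference in clicks and impressions per query. The file names are hypothetical, and this is a starting point, not our production Airflow code:

import csv

def load_snapshot(path):
    """Read one snapshot CSV into {query: (clicks, impressions)}."""
    with open(path, newline='') as f:
        return {row["Query"]: (int(row["Clicks"]), int(row["Impressions"]))
                for row in csv.DictReader(f)}

# Hypothetical file names, following the naming convention of the script above
yesterday = load_snapshot("bing_query_stats_2020-06-01.csv")
today = load_snapshot("bing_query_stats_2020-06-02.csv")

with open("bing_query_stats_daily_delta.csv", mode='w', newline='') as out:
    writer = csv.writer(out)
    writer.writerow(["Query", "ClicksDelta", "ImpressionsDelta"])
    for query, (clicks, impressions) in today.items():
        prev_clicks, prev_impressions = yesterday.get(query, (0, 0))
        # Both snapshots cover a rolling window, so this is only an
        # approximation of the last day's performance per query.
        writer.writerow([query, clicks - prev_clicks, impressions - prev_impressions])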

So please share your ideas on the data and what you ran into with me via @MartijnSch


Case Study: How Restructuring 6800 Content Pieces For SEO Worked

I presented the content of this blog post about a week ago to the Traffic Think Tank community (highly recommend it), but after a Twitter thread on this topic as well, it’s time to turn it into a blog post.

Sometimes you have to take a stand and make something better when it’s already performing well. Over the last few months, the RVshare marketing team worked on some great projects; one of them that I was involved in was restructuring 6800 pieces of content that we created a while ago. The content and the pages it lived on were performing outstandingly (growing +100% YoY without any real effort), but we wanted to do more to help users and boost SEO traffic. So we got started…

Why restructure content?

A couple of years ago, we published the last WordPress page/post in a series of 600+. The intent: go after a category near and dear to the core of the RVshare business and help more people rent an RV. We did that by creating tons of articles specifically for cities/areas. Now, over two and a half years later, the content is driving millions of people yearly, mainly from SEO, but we knew there was more to get, as it’s not our core business. We also weren’t leveraging all the SEO features that have become available in the last two years, think additional structured data like FAQs, but also the monetization that we thought was important. All improvements we would have to go back into every post for if we wanted to take advantage of them.

What we did: leverage Mechanical Turk.

One of the biggest obstacles wasn’t necessarily rebuilding the pages or coming up with a better design; we have a great team that nails that on a daily basis. But having to deal with 650 posts that each contained ten sub-elements was a struggle. The content was structured in a similar way, but some quick proofs of concept showed that scraping wasn’t the solution, as the error ratio was way too high. As with most projects, we wanted to ensure that the content could be restructured at low cost, to avoid this project not having a valid business case (does the actual opportunity outweigh the potential costs of restructuring the content?).

Scraping versus Mechanical Turk

Since we had initially structured the content in the same way (headline, description, etc.), we at least had a way to get the data out. But when we did some testing to see if we could scrape it, the results looked unfortunate: there were too many edge cases, as the HTML around the content was barely structured enough to reliably extract it.

We looked into Mechanical Turk as the second option, as it gave us the ability to quickly get thousands of people on a task to look at the content and take out what we needed. We wrote the briefing, divided the project into a few chunks, and within 10-12 hours, we had the content individualized per piece. We did our best to handle most of the data cleaning directly in the briefing and the form the workers filled out, but we also had some cleaning scripts ready. After it was cleaned, we imported the data into our headless CMS, Prismic.
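As a rough illustration of what such a cleaning script can look like (the file and column names below are hypothetical, not the exact fields from our briefing), this sketch normalizes whitespace and drops submissions where workers left required fields empty:

import csv

REQUIRED_FIELDS = ["url", "list_position", "headline", "description"]  # hypothetical columns

def clean_value(value):
    # Collapse whitespace and strip stray quotes that workers sometimes paste in
    return " ".join(value.split()).strip('"').strip()

with open("mturk_results.csv", newline='') as raw, \
     open("mturk_results_clean.csv", mode='w', newline='') as cleaned:
    reader = csv.DictReader(raw)
    writer = csv.DictWriter(cleaned, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        row = {key: clean_value(value) for key, value in row.items()}
        # Skip rows where a required field is missing after cleaning
        if all(row.get(field) for field in REQUIRED_FIELDS):
            writer.writerow(row)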

How to do this yourself?

  1. Create an account on Mechanical Turk.
  2. Create a project focused around content extraction.
  3. Identify what kind of content you want individualized; it works best if there is an existing structure (list format, table) that the workers can follow. This way, you can tell them to pick up content pieces X, Y, and Z for a specific URL.
  4. Identify the fields that you want to be copied.
  5. Upload a list of URLs that you want them to cover, along with the # that each piece has on the list (see the sketch after this list).
  6. Start the project and verify the results.
  7. Upload the data automatically back into your CMS (we used a script that could push the content as a batch directly into our headless CMS, Prismic.io).
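For step 5, a simple way to generate that input file is to expand every post URL into one row per list position. A minimal sketch, assuming 650 source URLs with ten numbered items each (the file and column names are made up for illustration):

import csv

ITEMS_PER_POST = 10  # each original post contained ten numbered sub-elements

# source_urls.txt: one post URL per line (hypothetical file name)
with open("source_urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

with open("mturk_input.csv", mode='w', newline='') as out:
    writer = csv.writer(out)
    writer.writerow(["url", "list_position"])
    for url in urls:
        for position in range(1, ITEMS_PER_POST + 1):
            writer.writerow([url, position])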

Rebuilding

We decided to build the content from the ground up, which meant:

  • Build out category pages with the top content pieces by state.
  • Build out the main index page with the top content from all states.
  • Build the ability to showcase this content on all of our other templated pages across RVshare.

By building out the specific templates, we gained additional power to streamline internal linking, create better internal relevance, and build out structured data, but mainly to figure out the right way to leverage a headless CMS with all its capabilities, instead of just having raw (read: ‘dumb’) content that can’t be properly structured. We already use the headless CMS Prismic.io for this, in which you can create custom post types, as you can see in this screenshot. You define the custom post type and pick the kinds of fields you want, and after that it works like just another CMS. The content can then be consumed through their API.

How to do this yourself?

We were previously using WordPress ourselves, but each entity was saved as one post. If you’re able to do this differently and save the pieces individually, it becomes many times easier to create overview pages by using categories (and/or tags). This is not always something you can do right away without development support.

Results

Because of the design changes, engagement increased by over 25% with the new format. Monetization is making it more interesting to keep iterating on the results. Sessions were unfortunately really hard to measure: we launched the integrations a few weeks prior to the kick-off of COVID-19, which resulted in a downward spiral and then a surge in demand right after. Hopefully, in the long term, we’ll be able to tell more about this. We are sure, though, that SEO results didn’t suffer.


Want to see the new structure of the pages? You can find it here in our effort on the top 10 campgrounds across the United States.


Part 5: Airflow on Google Cloud Composer – Building a Marketing Data Lake and Data Warehouse on Google Cloud Platform

In the previous blog posts (part 1, part 2, part 3, and part 4) in this series, we talked about why we decided to build a marketing data warehouse. This endeavor started by figuring out how to deal with the first part: building the data lake. In this fifth blog post, a more technical one, I’ll give some insight into how we’re leveraging Apache Airflow to build the more complicated data pipelines, and I’ll give you some tips on how to get started.

This blog post is part of a series of five? (maybe more, you never know), in which we dive into the details of why we wanted to create a data warehouse, how we created the data lake, and how we used the data lake to create a data warehouse. It is written with the help of @RickDronkers and @Hussain / MarketLytics, who we’ve worked alongside during this (ongoing) project.

Getting Started with Cloud Composer

Cloud Composer is part of Google Cloud Platform and brings you most of the upside of using Apache Airflow (open source) with barely any of the downsides (setup, maintenance, etc.). Or, to quote their main USP: “A fully managed workflow orchestration service built on Apache Airflow.” While we had worked with Airflow before, we weren’t looking forward to spending time worrying about managing it, as we planned to spend most of our time setting up and maintaining the data pipelines. With Composer, you can stick to creating pipelines (DAGs).

What is it suitable for?

Say you want to load data from the Google Analytics API, store it locally, translate some values into something new, and make it available in Google BigQuery. However you would build it, it’s multiple tasks and functions that depend on each other. You wouldn’t want to load the data into BigQuery if it hadn’t been cleaned yet (trash in, trash out, sound familiar?). With Airflow, a task is only processed if the previous step was successful.
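As a rough sketch of that example (not our actual pipeline: the function bodies, task ids, and schedule are placeholders, and the imports follow the Airflow 1.10 style that Cloud Composer used at the time), the Google Analytics flow could look something like this as a DAG:

import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

def extract_ga_data():
    # Placeholder: call the Google Analytics API and store the raw response
    pass

def clean_ga_data():
    # Placeholder: translate/clean values before they are allowed near the warehouse
    pass

def load_to_bigquery():
    # Placeholder: load the cleaned file into a BigQuery table
    pass

dag = DAG(
    dag_id="google_analytics_to_bigquery",
    start_date=datetime.datetime(2020, 1, 1),
    schedule_interval="@daily",
    catchup=False,
)

extract = PythonOperator(task_id="extract_ga_data", python_callable=extract_ga_data, dag=dag)
clean = PythonOperator(task_id="clean_ga_data", python_callable=clean_ga_data, dag=dag)
load = PythonOperator(task_id="load_to_bigquery", python_callable=load_to_bigquery, dag=dag)

# Cleaning only runs if extraction succeeded, and loading only runs if cleaning succeeded
extract >> clean >> load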

Tasks

Tasks are in almost every case just one thing: get data from BigQuery, upload a file from Google Cloud Storage into BigQuery, download a file from Cloud Storage to local, process data. What makes Airflow very efficient to work with is that pre-built operators already exist for the majority of these data processing tasks. Some of the tasks I just listed map directly onto such operators (GoogleCloudStorageDownloadOperator, GoogleCloudStorageToBigQueryOperator) that you can use like functions.
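For example, the ‘upload a file from GCS into BigQuery’ task doesn’t need any custom Python at all. A minimal sketch using the pre-built operator (the bucket, object, and table names are made up, the import path assumes Airflow 1.10-era contrib operators, and it attaches to a DAG object like the one sketched above):

from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator

load_sessions = GoogleCloudStorageToBigQueryOperator(
    task_id="load_sessions_to_bigquery",
    bucket="my-marketing-data-lake",              # hypothetical GCS bucket
    source_objects=["ga/sessions_{{ ds }}.csv"],  # templated with the execution date
    destination_project_dataset_table="my_project.marketing.ga_sessions",
    source_format="CSV",
    skip_leading_rows=1,
    autodetect=True,                              # let BigQuery infer the schema
    write_disposition="WRITE_APPEND",
    dag=dag,
)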

Versus Google Cloud Functions

If you mainly run very simple ‘pipelines’ that consist of just one function that needs to be executed, or you only have a handful of use cases, Cloud Composer is likely overkill: the costs might be too high, and you still have the overhead of DAGs. In that case, you might be better off with Google Cloud Functions, as you can write similar scripts there and trigger them with Google Cloud Scheduler to run at a specific time.
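To make the comparison concrete, such a single-step ‘pipeline’ can be a small HTTP-triggered Cloud Function that Cloud Scheduler calls on a cron schedule. A minimal sketch (the function body is a placeholder):

# main.py for an HTTP-triggered Cloud Function on the Python runtime

def run_pipeline(request):
    """Entry point: Cloud Scheduler hits this endpoint on a schedule."""
    # Placeholder: fetch data from a marketing API, clean it, and load it into BigQuery in one go
    return ("ok", 200)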

Costs

The costs for Google Cloud Composer are doable: for a basic setup, it’s around 450 dollars (if you run the instances 24 hours a day, 7 days a week), as you leverage multiple (a minimum of 3) small instances. For more information on the costs, I’d point you to this pricing example.

Building Pipelines

Take the example data pipeline shown above: in typical Airflow fashion, every task depends on the previous task. In other words, notify_slack_channel would not run if any of the previous tasks failed. All tasks happen in a particular order, from left to right. In most cases, data pipelines become more complicated than this, as you can have multiple flows going on at the same time that are combined at the end.

Tips & Tricks

Google Cloud Build, Repositories

The files for Google Cloud Composer are saved in Google Cloud Storage. That is smart in itself, but at the same time, you want them to live in a Git repository so you can efficiently work on them together. By following this blog post, you can connect the Cloud Storage bucket to a repository and set up a sync between the two. This essentially gives you a deployment pipeline and makes sure that only production-ready code from your master branch ends up in GCS.

Managing Dependencies

After working with it for a few months now, I’m still not sure if managing dependencies through Google Cloud Composer is a good or a bad thing, as it creates some obstacles if you want to run a deployment and add some Python libraries (your servers could be down for 10-30 minutes at a time). In other setups, this is usually a bit smoother and creates less downtime.

Sendgrid for Email Alerts

One of the upsides of Apache Airflow is that it sends alerts when tasks fail. Make sure to set up the SendGrid notifications while you’re setting up Google Cloud Composer; it’s the most straightforward way of receiving email alerts (for free, as in most cases you shouldn’t get too many failure emails).

README

Document the crap out of your setup and DAGs. When I took over some of the pipelines that were used at Postmates for XML sitemap generation, it was a nightmare: it was hard to read, the code didn’t make a lot of sense, and we had to refactor certain things just because of that. As pipelines (just like regular code) can sometimes be left untouched/unviewed for months (they literally sometimes only have one job), you want to make sure that when you come back, you understand what happens inside the tasks.


Again… This blog post is written with the help of @RickDronkers and @Hussain / MarketLytics, who we’ve worked alongside during this (ongoing) project.


Part 4: Visualization with Google DataStudio – Building a Marketing Data Lake and Data Warehouse on Google Cloud Platform

In the previous blog posts (part 1, part 2, and part 3) in this series, we talked about why we decided to build a marketing data warehouse. This endeavor started by figuring out how to deal with the first part: building the data lake. In this fourth blog post, we’ll chat about how we visualize all the data we saved in the previous steps using Google DataStudio.

This blog post is part of a series of four? (maybe more, you never know), in which we dive into the details of why we wanted to create a data warehouse, how we created the data lake, and how we used the data lake to create a data warehouse. It is written with the help of @RickDronkers and @Hussain / MarketLytics, who we’ve worked alongside during this (ongoing) project.

How we build dashboards

Try to think ahead about what you need: date ranges, data/date comparisons, filters, and what type of visualization. This helps you build a better first version right away. What that mainly looked like for us:

  • Date ranges: The business is so seasonal that Year over Year growth is the most important view for RVshare, and since we often don’t get to see all the context on metrics on a weekly basis, we default to the last 30 days.
  • Filters: For some channels (PPC, Social), it’s more relevant to be able to filter the data down to a campaign or social network level, because in most cases the aggregate level doesn’t tell the whole story right away.
  • Visualization: We need the top metrics (sessions and revenue) in view right away, with the YoY comparison, so we know within seconds what is going on and where things can improve.

Talking to Stakeholders (Part Deux)

In the first blog post, we talked about connecting with our stakeholders (mainly our channel owners) and gathering their feedback to build the first versions of their dashboards (beginning with the end in mind). We used this approach to put the first charts, tables, and graphs on the dashboards, after which we connected back with the owners to see what data points were missing, and in some cases to validate the data they were seeing on their dashboards. This gave us additional feedback for fast follow-ups and made for quick iterations on the data we had and could show. For social media, as an example, it turned out that we wanted to show additional metrics that we hadn’t thought of initially but that were in our data lake anyway. These sessions were a good way for us to build additional pieces into our data warehouse while we were at it. These days, some of these dashboards are used weekly to report to other teams in the organization or within the team itself.

Best Practices

Blended Data

Do you want to blend data in Google DataStudio, or do you want to create synced/aggregate tables in BigQuery? For most of our use cases, we have opted to use DataStudio’s blended data sources (essentially a JOIN). It’s easier: we can quickly pull some new data together without having to deal with the data structures and complicated queries. In some cases, we noticed while building dashboards that we were missing data in our warehouse tables (not the lake) and were able to make adjustments/improvements to them.
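For the cases where blending isn’t enough, the alternative is an aggregate table in BigQuery that DataStudio reads directly. A minimal sketch with the google-cloud-bigquery client (the project, dataset, table, and column names are made up for illustration):

from google.cloud import bigquery

client = bigquery.Client()

# Build a small aggregate table that DataStudio can use as a direct source
query = """
CREATE OR REPLACE TABLE `my_project.marketing_warehouse.sessions_by_channel_daily` AS
SELECT
  date,
  channel,
  SUM(sessions) AS sessions,
  SUM(revenue) AS revenue
FROM `my_project.marketing_lake.sessions`
GROUP BY date, channel
"""

client.query(query).result()  # wait for the job to finish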

Single Account Owner

Because we work with Rick and Hussain as ‘third parties’, we opted to use one shared owner account. Transferring owner access is incredibly hard when it’s a Google Apps account, so we made sure the dashboards are owned through an @rvshare.com account. It’s not a big topic, but it could cause tons of headaches in the long term.

Keep It Simple St*pid

Your stakeholders probably have the desire, and the time, to look at less than you think. Instead of having them jump through too many charts, start simple and then add more based on their feedback if they want to see more; less is more in this case.

This has the added benefit of making them feel engaged and more interested in using it. In our own case, we leverage our reporting on a weekly basis for a team meeting, which already makes it a more frequently leveraged use case.

Calculated Fields – Yay or Nay?

As we made most of the tables that we leverage in DataStudio from scratch during our ETL process, we had the opportunity to decide whether we wanted to use calculated fields in DataStudio or do the work in the BigQuery queries themselves. Honestly, the answer wasn’t easy, and as we made modifications in the dashboards, it became clear that setting them up in DataStudio wasn’t always scalable or easy, as they get removed when the underlying data or tables change.

Google BigQuery

Tables or queries? In our case, we often use the table information from BigQuery, and the specific columns in there, to drive the visualization in DataStudio. The alternative for some of them is to query the data in BigQuery directly; with the BI Engine reservation that we have there, we can speed up intense queries rather easily.


Again… This blog post is written with the help of @RickDronkers and @Hussain / MarketLytics, who we’ve worked alongside during this (ongoing) project.