The Concept of Input Metrics for SEO

Sessions, transactions, and revenue are not metrics that an SEO can shift overnight; often it’s a matter of waiting for Google. They’re considered output metrics: the result of the work you’ve put in.

I just put down the book Working Backwards by Bill Carr and Colin Bryar. It’s about their many learnings from years of working at Amazon, and it documents the processes and policies that were put in place to make it the true giant it is today (including the theory behind two-pizza teams, the bar raiser process, etc.). One of the chapters talks about the concept of input metrics, something we often forget about in the context of SEO (or driving traffic via other channels).

Input metrics are considered metrics that can help define progress towards an output metric.

  • How many products do you have (today versus a week/month ago)? Metric: # of SKUs.
  • How many pages have a custom/unique title or meta description? Metric: # of optimized pages.

Analysis Decision Tree

They can be influenced by the work that you do on a daily basis. Let’s say that your (output metric) revenue is down. You would likely follow a decision tree like this:

  • Did the number of transactions decrease or did the average order value decrease last week?
  • If transactions decreased, did our conversion rate change?
  • If the conversion rate didn’t change, did sessions change?
  • If sessions were down, what channels caused this? Let’s say for a minute that all channels were flat except for Social Media.
  • If Social Media sessions were down, what caused it?

A very simple explanation often follows: we just posted less on social media channels on Tuesday because of X. In other words, you’ve identified the input metric that is in your hands to change: the # of posts. My take is that any metric you report on should be able to either trigger an action, report an outcome, or show industry-level trends/benchmarking. Knowing certain metrics is useless without context via either benchmarking or historical data.
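For illustration, the decision tree above can be sketched as a small function. The metric names and numbers here are hypothetical, not an actual reporting schema:

```python
# A sketch of the analysis decision tree: walk from the output metric
# (revenue) down to the channel whose input metrics you control.
# All dictionary keys are made-up names for illustration.

def diagnose(current: dict, previous: dict) -> str:
    """Return a likely next step given this week's vs last week's metrics."""
    if current["revenue"] >= previous["revenue"]:
        return "Revenue is flat or up; no action needed."
    if current["avg_order_value"] < previous["avg_order_value"]:
        return "Average order value dropped; investigate pricing/product mix."
    if current["conversion_rate"] < previous["conversion_rate"]:
        return "Conversion rate dropped; investigate the funnel."
    if current["sessions"] < previous["sessions"]:
        # Find which channel drove the session decline.
        drops = {
            ch: previous["sessions_by_channel"][ch] - n
            for ch, n in current["sessions_by_channel"].items()
            if n < previous["sessions_by_channel"][ch]
        }
        worst = max(drops, key=drops.get)
        return f"Sessions down, driven by {worst}; check its input metrics (e.g. # of posts)."
    return "Transactions down without a clear driver; dig deeper."
```

A toy run with Social Media sessions dropping would point you straight at that channel, which is where the input metric (# of posts) lives.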

Example: # of Posts > More Social Posts > More Traffic > Larger Community

While I led Marketing at The Next Web, we looked at the variability in our traffic to figure out how to grow our audience more sustainably over time (instead of relying on content going viral). Besides the obvious focus on SEO, we realized that the # of published posts was a big input. Not too surprising in itself, but it was an input metric that could significantly impact sessions. For example, it led us to republish or repost old content more on social media and to add more writer support on weekends to create a steady stream of content. Without knowing what impacts traffic on a channel, you’ll have difficulty figuring out how to change your approach.

Reporting Input Metrics versus Output Metrics

What metrics should you be reporting on to your boss or upper management? The ones that show impact, which in my opinion are most often output metrics. You want to show business results, as that shows your contribution to the bottom line of the business (at the end of the day, we’re all getting paid based on that).

Tom Critchlow has written a bit more about this subject in his blog post: Some Notes on Executive Dashboards

However, it doesn’t mean that you shouldn’t have reporting for input metrics. If you know the % of pages that aren’t indexed, you have an input metric that needs to change to grow the output metric: sessions over time. If you run an e-commerce store, the number of products matters, but so does their availability. In RVshare’s case, we can have thousands of RVs, but if they’re not available at the highest peak of the year, that’s still not going to help us (or the owners) grow.

Example Input Metrics for SEO

Input to Output can often be visualized as a funnel, as in SEO there are many steps that eventually lead to an outcome:

Crawl to Indexation: each step in the funnel feeds the next. Increasing the number of pages, for example, should increase how many are crawled.

  • Number of pages (input)
  • Number of pages that are crawled
  • Number of pages that are submitted (via XML sitemaps)
  • Number of pages that are indexed
  • Number of pages receiving traffic
  • Number of pages driving revenue (output)
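The funnel above can be quantified with step-to-step rates, which tell you where pages fall out between input and output. A minimal sketch, with hypothetical counts:

```python
# Compute step-to-step conversion rates for the crawl-to-revenue funnel.
# The page counts below are made up for illustration.

funnel = [
    ("pages", 10000),           # input
    ("crawled", 9000),
    ("submitted", 8500),
    ("indexed", 7000),
    ("receiving_traffic", 3500),
    ("driving_revenue", 700),   # output
]

def funnel_rates(steps):
    """Return the conversion rate from each step to the next."""
    rates = {}
    for (name_a, n_a), (name_b, n_b) in zip(steps, steps[1:]):
        rates[f"{name_a} -> {name_b}"] = round(n_b / n_a, 3)
    return rates
```

The biggest drop-off (here, indexed pages to pages receiving traffic) is where the next input metric to work on usually hides.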

What do I need to get started?

Getting this insight goes back to having access to the right data. Most of what I just talked about is available through free tools: Google Search Console and Google Analytics are your best friends. The next best source is your internal data or CMS, which can provide insights into the quality of content/products/etc.


Assigning Purchases to Other Users in Google Analytics

Sometimes, you want to assign specific user interactions to a different user.

There are many cases where you want to send an event for user X while user Y performs the action, and it’s important that you can attribute this information to the right user.

In our case, we ran into the use case where an RV owner needs to accept a booking. Only after the approval takes place do we consider it an actual purchase (a transaction in GA). By default, the RV owner would get the purchase attributed to their session in Google Analytics; after all, that user is the one who goes through the steps in the interface and confirms the transaction.

Marketing <> Google Analytics Logic

This doesn’t work for Marketing, though. We did our best to acquire the renter, and they’re the ones purchasing, yet according to Google Analytics logic we’d fire an e-commerce purchase on behalf of the owner. What this messes up is channel attribution & performance measurement of our campaigns: the owner’s path likely doesn’t touch any paid or true marketing channels, but rather direct or email.

In summary, the wrong user would get credit for the conversion, which could cause issues with our ROAS measurement of marketing channels.

Switching Client IDs & Leveraging User IDs

When Google Analytics is loaded, it sets the client ID value in the _ga cookie in your browser. This value (the client ID) is used to tie events → pageviews → sessions together so they’re seen as one user (in combination with the userId value, obviously).

What we do to change this behavior is pretty simple:

  1. Whenever a user goes through the checkout funnel and creates a user account, we save their Google Analytics Client IDs (for GA4 and UA) to their profile.
  2. When user X confirms the purchase, we’re sending an event to Tag Manager with the purchase context, including the impacted data from user Y.
  3. Instead of directly firing the hit to Google Analytics, we swap out the client ID and userID from user X to user Y so that the actual purchase gets attributed to the right user. You also need to mimic a session ID.
  4. Google Analytics will now stitch this purchase to user Y instead of user X. You can choose for yourself what you want to fire for user Y.
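The swap in step 3 can be sketched with the GA4 Measurement Protocol. The endpoint is Google’s documented `/mp/collect`, but the measurement ID, API secret, helper names, and transaction fields below are placeholders for illustration, not our production code:

```python
# Fire a purchase server-side, attributed to the renter (user Y) by
# using their stored _ga client ID instead of the owner's (user X).
import json
import urllib.request

MEASUREMENT_ID = "G-XXXXXXX"    # placeholder: your GA4 measurement ID
API_SECRET = "your-api-secret"  # placeholder: a Measurement Protocol API secret

def build_purchase_payload(client_id, user_id, session_id, transaction):
    """Build a GA4 Measurement Protocol payload for the renter's identity."""
    return {
        "client_id": client_id,   # user Y's client ID, saved at checkout
        "user_id": user_id,       # user Y's userID
        "events": [{
            "name": "purchase",
            "params": {
                "session_id": session_id,  # mimic the renter's session
                "transaction_id": transaction["id"],
                "value": transaction["value"],
                "currency": transaction["currency"],
            },
        }],
    }

def send_purchase(client_id, user_id, session_id, transaction):
    """POST the payload to the GA4 Measurement Protocol endpoint."""
    url = ("https://www.google-analytics.com/mp/collect"
           f"?measurement_id={MEASUREMENT_ID}&api_secret={API_SECRET}")
    body = json.dumps(build_purchase_payload(
        client_id, user_id, session_id, transaction)).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)
```

In practice we route this through Tag Manager rather than calling the endpoint directly, but the identity swap is the same idea: every ID in the hit belongs to the renter.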

Resources

→ Setting the Client ID in Google Analytics 4

→ Setting the Client ID in Google Analytics with gtag.js

User ID

Where possible, make sure that you’re already sending a userID to Google Analytics to ensure that the interactions can be truly tied back to the same user.


Learnings from Managing Marketing Budgets

In my first job in Marketing, I ‘managed’ a marketing budget of less than €5,000 a month (mainly paid acquisition spend), eventually growing it to about €20,000 a month. A few jobs later, my budget responsibility has grown significantly: I’m now responsible for a yearly eight-figure budget ($) 💸. This creates a need for more accuracy, accountability, and diligence about what you’re spending resources on and the return on those investments.

Operating Expenses versus Compensation versus Other

Often you’ll roll into a role or organization that already has a set strategy and an aligned budget plan. The same applied to my role at RVshare, which I joined almost four years ago. It’s rarely the case that you get to build a budget from $0/scratch. Instead, I joined on a budget plan that already had some essential buckets (that we continue to invest in). This often also predetermines the organization’s path and how a budget will be divided in the short term.

  • Operating Expenses/OpEx: All expenses tied to regular marketing activities, campaigns, and the operations of the business. I can’t describe it better than Investopedia does here.
  • Capital Expenses/CapEx: For most businesses or functions, there is a stronger capital element to the investments they’re making. As we’re a very asset-light business, we don’t own any inventory and don’t have any actual physical marketing assets (at least not enough to make a dent in a budget plan), so this is not a thing for us, and probably not for many other marketing organizations either.
  • Compensation: Salary, Bonuses, etc. Basically what you get paid every two weeks.

Direct Return versus Non-Direct versus Supporting

There is no one right way to set up a budget plan, as it can be divided in many ways. Our Marketing one contains three big buckets:

  • Direct Return: Channels that can directly drive a return on investment and can be measured on this basis. Examples are Paid Acquisition, SEO, and Marketing Partnerships. It doesn’t have to be a scientific method, but you can often predict the spend-to-revenue relationship pretty well.
  • Non-Direct Return: Other channels, for example, more focused on driving awareness or reach like PR, Social Media (Organic) and Brand Marketing.
  • Supporting: Marketing Analytics, Technology, Education, Consultants, etc.

Categories like travel, meals/entertainment, and education don’t come directly out of our marketing budget, and realistically they’re also a tiny allocation of a total multi-million dollar budget (considering we have a relatively small team; we do invest in them, don’t worry). That doesn’t mean the amount is low in itself, as you always want to make business travel possible, as well as training & education.

Internal versus External

On ‘whom’ do you spend your resources, and how is that decision made? In most cases, this is not a finance/budget-driven decision, in my opinion, as it’s mainly tied to what functional/strategic expertise you need to make an impact in a specific area. Before the start of a year (financial years exist for a reason), we plan out our expected spend on internal versus external resources. At that point, we often already have an idea of what we’d like to add headcount for and what might be better to hire external support for.

→ Also read, Deciding between who to hire: an Agency versus a Contractor versus Hiring?

Misconceptions

Unlimited Resources

The narrative around this is often that you always need to scale. It applies to most startups, and overall I’m a fan of it, but it’s not realistic later on. A business spending 90% of its expenses on marketing is not healthy at scale. Scaling an unlimited budget might work well for you, but at some point Finance will start knocking on your door, as you’re ruining their cash flow position.

Unplanned Costs

“A good leader can plan for any type of cost well ahead” – said nobody ever. New ideas and costs always come up; I’m a firm believer in keeping a flexible mind and being able to adjust plans. For example, when the pandemic hit in 2020, all our preplanned budget plans were useless, as continuing at that spending level would have ruined RVshare. Two months later, it turned out that we needed to spend far more aggressively than we ever expected, and we ended up ‘overspending’ according to budget plans by many millions.

Brand versus Performance

“You should always invest in the long term by building a brand” versus “Performance marketing is key; advertising can’t be measured.” If you’ve heard both, welcome to the club! If you lean one way, you might be able to learn a bit more about the other side. It’s a misconception that you can’t measure a brand. The comparison I keep bringing up is Booking.com versus Airbnb: one very well known for its performance marketing, the other for its amazing brand building. However, that doesn’t mean Airbnb doesn’t spend over $400 million on efforts that I would mainly categorize as performance marketing. Initiatives don’t always belong in just one bucket, although I just described them as if they do. Both sides are just as important, and not your budget but your overall marketing strategy should guide what you spend money on. Ours has guided us in multiple ways, and we’ll continue to invest heavily in both buckets in the future.

Working with Finance

Understanding Finance & Their Concerns

As part of the misconceptions, I touched on unlimited resources; money isn’t unlimited for a company. If you spent your entire marketing budget in the first month of the year, the company could well run into cash flow trouble, or even bankruptcy, if your Finance team wasn’t aware you were doing this. They also have plenty of their own worries around financing and accounting, which means you need to be aware of what they care about and help where you can.

Actually Understanding Finance & Accounting

Pick up a couple of books that can give you a high level of what finance and accounting terms mean so that you know how to speak the language of your finance team/CFO.

→ The HBR Guide to Finance Basics for Managers is a good start. After that, you’ll easily figure out what other books you might find interesting and how deep you’d like to go into the topic.

Being able to prove ROAS accurately

Invest in (marketing) analytics so that you can more accurately predict the return on your investments. We have spent the past two years getting good at this to make sure our investments are returning value for the business. This is not just important for budgeting; it’s also a direct way to give Marketing a seat at the table, as we can provide good answers to questions about what every additional $1M of (direct) marketing spend could return.


Building a data warehouse for SEO in Google BigQuery

Why do you want/need a data warehouse for SEO?

A data warehouse (& data lake) stores and structures data (through data pipelines) and then makes it possible to visualize it. That means it can also be used to create and power your SEO reporting infrastructure, especially when you’re dealing with lots of different data sources that you’re looking to combine. Or, if you simply have a ton of data, you’re likely looking to implement a solution like this, as it can handle far more than your laptop can.

Some quick arguments for having a warehouse:

  • You need to combine multiple data sources with lots of data.
  • You want to enable the rest of the organization to access the same data.
  • You want to provide a deeper analysis into the inner workings of a vendor or search engine.

It’s not all sunshine and rainbows. Setting up a warehouse can be quick and easy, but maintaining it is where the real work comes in. Especially when dealing with data formats and sources that change over time, it can create overhead. Be aware of that.

What can I do with a warehouse?

Imagine you’re a marketing team with a strong performance-marketing setup: you advertise in Google Ads while in SEO you compete for the same keywords to achieve great click share. It would be even more useful if your reporting could show the total number of clicks in search (paid and organic combined, not one or the other). By joining the two datasets at scale you can achieve this and visualize progress along the way. Excel/Google Sheets will give you the ability to do this once (or repeatedly, if you’re a true spreadsheet junkie), but not to have daily dashboards and share them with your colleagues easily. With a warehouse, you can store data from both ends (Google Ads and Google Search Console), merge the data, and visualize it later on.
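That paid-plus-organic join can be sketched as a single BigQuery query. The dataset and table names below are placeholders; your Google Ads and Search Console exports will have their own schemas:

```python
# A sketch of joining paid (Google Ads) and organic (Search Console)
# clicks per query in BigQuery. `my-project.ads.search_terms` and
# `my-project.gsc.search_analytics` are hypothetical table names.

TOTAL_SEARCH_CLICKS_SQL = """
SELECT
  COALESCE(ads.query, gsc.query) AS query,
  IFNULL(ads.paid_clicks, 0)     AS paid_clicks,
  IFNULL(gsc.organic_clicks, 0)  AS organic_clicks,
  IFNULL(ads.paid_clicks, 0) + IFNULL(gsc.organic_clicks, 0) AS total_clicks
FROM (
  SELECT search_term AS query, SUM(clicks) AS paid_clicks
  FROM `my-project.ads.search_terms`        -- placeholder table
  GROUP BY search_term
) AS ads
FULL OUTER JOIN (
  SELECT query, SUM(clicks) AS organic_clicks
  FROM `my-project.gsc.search_analytics`    -- placeholder table
  GROUP BY query
) AS gsc
ON ads.query = gsc.query
ORDER BY total_clicks DESC
"""

# With credentials configured, you could run it like:
# from google.cloud import bigquery
# rows = bigquery.Client().query(TOTAL_SEARCH_CLICKS_SQL).result()
```

The `FULL OUTER JOIN` matters: it keeps keywords that only appear on one side, so you can spot queries where you pay for clicks but have no organic presence (or vice versa).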

Seer Interactive wrote a good blog post about their decision to move to their own, homegrown, rank tracking solution. It provides an interesting insight into how they’re leveraging their internal warehouse for some of its data as well.

Do I actually need a warehouse?

Are you a small team, a solo SEO, or working on a small site? Make this a hobby project in your time off. At your scale, you likely don’t need a warehouse and can accomplish most things by connecting some of your data sources in a product like Google Data Studio. Smaller teams often have fewer (enterprise, duh!) SEO tools in their chest, so there is less data overall. At a smaller scale, a warehouse can easily be avoided and replaced by Google Sheets/Excel or a good visualization tool.

Why Google BigQuery?

Google BigQuery is RVshare’s choice for our marketing warehouse. Alternatives to Google BigQuery are Snowflake, Microsoft Azure, and Amazon’s Redshift. As we had a huge need for Google Ads data, and Google provides a full export into BigQuery for free, it was a no-brainer for us to start there and leverage their platform. If you don’t have that need, you can replicate most of this with the other services out there. For the sake of this article, as I have experience dealing with BQ, we’ll use that.

What are the costs?

It depends, but let me give you an insight into the actual costs of the warehouse for us. Google Search Console and Bing Webmaster Tools are free. Botify, Nozzle (SaaS pricing here), and Similar.ai are paid products, and you’ll need a contract agreement with them.

  • Google Search Console & Bing Webmaster Tools: Free.
  • Nozzle, Similar.ai, Botify: Requires contract agreements, reach out to me for some insight if you’re truly curious and seriously considering purchasing them.
  • StitchData: Starting at $1,000/year, depending on volume, although you’re likely fine with the minimum plan for just one data source.
  • SuperMetrics: $2,280/year for their Google BigQuery license, which we use to export Google Search Console data. There are cheaper alternatives, but given our legacy setup it’s not worth it for us to switch providers.
  • Google Cloud Platform – Google BigQuery: Storage in BigQuery is affordable, especially if you’re just importing a handful of data sources; it only gets expensive with very large data sets. So having the data itself is cheap, and if you optimize the way you process and visualize the data afterwards, you can save a lot on costs too. Querying/analysis averages $5 per TB scanned, and with small date ranges and only a few selected columns it’s hard to reach that quickly.
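To make the querying cost concrete, here is a back-of-the-envelope sketch using the ~$5 per TB on-demand rate mentioned above (check Google’s current pricing page, as rates change); the 20 GB/day figure is a made-up example:

```python
# Back-of-the-envelope BigQuery on-demand query pricing.
# Assumes the ~$5/TB-scanned rate cited in the text.

PRICE_PER_TB = 5.00

def query_cost_usd(bytes_scanned: float) -> float:
    """On-demand cost for a single query, in USD."""
    return bytes_scanned / 1e12 * PRICE_PER_TB

# Hypothetical: scanning 20 GB of Search Console data per day for a month
monthly_cost = query_cost_usd(20e9) * 30  # roughly a few dollars
```

This is why, for small date ranges and a few selected columns, the analysis side of BigQuery is rarely where your budget goes.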

Loading Vendors into Google BigQuery

A few years ago, you needed to develop your own data pipelines to stream data into Google BigQuery (BQ) and maintain the pipeline from the vendor to BQ yourself. This caused a lot of overhead and required having your own (data) engineers. Those days are clearly over, as plenty of SaaS vendors can facilitate this process for you at reasonable prices, as we just learned.

Bing Webmaster Tools & Google Search Console

Search Analytics reports from both Google and Bing are extremely useful, as they provide insight into volume, clicks, and CTR %. This helps you directly optimize your site for the right keywords. Both platforms have their own APIs that enable you to pull search analytics data. While Google’s is widely available through most data connectors, the Bing Webmaster Tools API is a different story. Find the resource link below for more context on how to load this data into your warehouse, as more steps are involved (and still nobody knows exactly what type of data that API actually returns).

Resources

→ Saving Bing Search Query Data from the Bing Webmaster Tools API

→ Saving Google Search Console data with StitchData or Supermetrics

→ Alternatively, read about the Google Search Console API here to implement a pipeline yourself

Rank Tracking: Nozzle

Nozzle is our current solution for rank tracking, at a relatively small scale. We chose them a few months ago, after primarily having our data in SEMrush, because they could make all our SERP data available to us via their BigQuery integration.

Technical SEO: Botify

Both at Postmates and RVshare I brought Botify in as it’s a great (enterprise) platform that combines log files, their crawl data, and visitor data with an insight into your technical performance.

Similar.ai

Lesser known is Similar.ai, which provides keyword data and entity extraction. It’s useful when you’re dealing with keywords at massive scale and want to understand their different categories, and it comes in especially handy when you’re creating topical clusters. With their Google Cloud Storage > Google BigQuery import, we’re able to show this data right next to our keyword data (from Google Search Console).

Bonus: Google Ads

If you’re advertising in paid search with Google Ads, it can be useful to combine organic keyword data with paid data. That’s why I like quickly setting up the Data Transfer Service with Google Ads so all reports are automatically synced. This is a free service between Google Ads and Google BigQuery; more information can be found here.

How to get started?

  1. Figure out which of the tools you currently use provide a way to export their data.
  2. Create a new project in Google Cloud Platform (or use an existing one) and save your project ID.
  3. Create a new account for StitchData and, where needed, a (paid) account for Supermetrics.
  4. Connect the right data sources to Google BigQuery or your preferred warehouse solution.

Good luck with the setup and let me know if I can help in any way. I’m curious how you’re getting value from your SEO warehouse or what use cases you’d like to solve with it. Leave a comment or find me on Twitter: @MartijnSch


Learnings from a year with Google Tag Manager Server Side

Over the last few years, I’ve written several times about Google Tag Manager, especially when I was at The Next Web, where we had an extensive integration. Something we ran into over time was the inability to send events that needed validation or weren’t triggered from the front-end. At the time, I asked the GTM product team what their thoughts were about sending server-side events. It wasn’t a primary use case for a publisher, but for many others it is. For example, you often want to validate data or check that an interaction has happened or passed a milestone (like verifying payment) before you can accept a purchase. In the web (front-end) version of Google Tag Manager, you either had to fire an event on hitting the purchase button or wait until the thank-you page was loaded. But sometimes that doesn’t guarantee the purchase completed, as the final confirmation takes place behind the scenes. This is where server-side Google Tag Manager comes in.

An intro to Server Side Google Tag Manager

Google Tag Manager server-side was originally released in beta in 2020, and RVshare has been in the program since the early days. Server-Side (SS) GTM leverages Google Cloud Platform (GCP) to host a GTM container for you that you then point your own DNS records to. By doing this, you ensure that ad/privacy blockers aren’t blocking the default Google scripts hosted on google.com. But mainly, it provides you with the ability to receive back-end requests. In our case, this means we have more flexibility to validate purchases or user signups before we fire a success event to SS GTM.

Learnings

New roads can be bumpy as you’re still trying to learn what should work and what new beta features might still not be ready for production. That’s why we gathered some learnings on our integrations that I wanted to share.

Server-Side Validation

Although we’ll likely all have to move to a world in which we run server-side containers, there are still plenty of reasons why smaller sites don’t have to migrate yet and can avoid the overhead. But if you’re dealing with use cases that require validation, I would urge you to take a look.

Examples are:

  • You want to verify someone’s identification before registering a signup.
  • You want to verify payment information or inventory status before registering a purchase.

Those examples require validation, but the site will likely already send the user to a thank-you page without knowing whether everything was actually approved. As a lot of this depends on business logic, you want to ensure that your marketing or product analytics tracking in Google Analytics 4 supports it; in my opinion, you want to align your reporting with the business metrics that the rest of the organization cares about.

How should you think about this for your setup? What are some of the business rules in place in your organization that your current analytics integration might not line up with perfectly? Chances are you’ll find an opportunity to improve this by using server-side validation and only sending events once they successfully pass.

Managing Server Capacity

On setup, GCP and SS GTM will create 3 (small) servers via Kubernetes by default. When you start receiving many events and sending them in parallel to many different clients, you’ll notice the load on your servers increase. We didn’t realize that autoscaling only goes up to a small maximum (six, if I remember correctly) by default. We’ve since improved this, but it caused us to miss partial data for a few days after a significant release. So check how many instances your setup can auto-scale to, to avoid any issues. Realistically, you can push that number up to 20+ without incurring meaningful additional costs, as the extra instances are only used under high CPU load and are automatically downsized after traffic slows down.

How to do this yourself?

As the procedure for this has changed since we ran these updates about six months ago, I recommend reading the instructions for reconfiguring the servers that App Engine creates; they will help you find the right command-line instructions to execute.
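For orientation, the ceiling lives in the App Engine scaling configuration. A hypothetical fragment for the standard environment might look like the following; field names differ between the standard and flexible environments (and the sGTM provisioning script may manage them for you), so treat this as a sketch and check the current docs:

```yaml
# Hypothetical App Engine (standard environment) scaling settings.
automatic_scaling:
  min_instances: 3    # the default sGTM setup starts with 3 small servers
  max_instances: 20   # raise the ceiling so traffic spikes aren't dropped
```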

Monitoring Request & Server Errors: Slack & Email Alerts

Look at the trend line in the report and notice the cliff. You never had to deal with this kind of issue previously, as Google managed GTM web containers for you. But with SS GTM, things can go wrong quickly, and you’ll need alerting to tell you about it. In this case, an error came up on Google Analytics’ side with ec.js, which wouldn’t load for enhanced ecommerce, causing 5XX errors on the managed servers. That eventually led to conversions not being recorded, and as that’s a mission-critical piece of our infrastructure, you can guess the implications.

How to do this yourself?

  1. Go to the Google Cloud Platform project associated with your GTM account; its project ID will look something like ‘gtm-xxx’.
  2. Use the search bar at the top to find the ‘Logging’ product; you can also get there via App Engine, which powers the servers for this.
  3. There you can find all the requests being sent to the server and debug what could be going on.

Saving request & error logs in Google Cloud Storage

Server-side events can come in all shapes and sizes (POST/GET requests) and can contain many things you care about. In our case, we send a server-side event on purchase with a full POST body of data. But, as we only just discovered, this body isn’t shown in the logging feature we just talked about; only the request URI is shown: GET /request-path. When you’re sending an event to /request-path?event_name=foo-bar with a body, this quickly causes issues, as you don’t have a place to debug that body. We opted to send the full request as a JSON file to Google Cloud Storage so that we can evaluate the complete request. As this part of our setup is still in development, I can’t share much more about it at this time.
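The idea is straightforward: serialize everything the request log drops and write it to a bucket. Our actual implementation lives inside server-side GTM, but a stdlib sketch of the concept looks like this; the bucket name, function names, and upload call are placeholders:

```python
# Sketch: capture the full request (path, query, POST body) as JSON
# so it can be stored in Google Cloud Storage for later debugging.
import json
import time

def build_log_record(path: str, query: dict, body: dict) -> str:
    """Serialize everything the request log would otherwise drop."""
    record = {
        "ts": time.time(),
        "path": path,    # e.g. /request-path
        "query": query,  # e.g. {"event_name": "foo-bar"}
        "body": body,    # the POST body missing from the request log
    }
    return json.dumps(record)

# With google-cloud-storage installed, the upload could then be:
# from google.cloud import storage
# bucket = storage.Client().bucket("my-request-logs")  # placeholder bucket
# bucket.blob(f"{time.time()}.json").upload_from_string(
#     build_log_record("/request-path", {"event_name": "foo-bar"}, post_body),
#     content_type="application/json")
```

One JSON file per request keeps the objects easy to inspect individually and, as a bonus, easy to load into BigQuery later if you ever want to query them at scale.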

Special thanks to the team of Marketlytics and Data to Value (Rick Dronkers/Peter Šutarik), as they’re our marketing analytics partners at RVshare.