The previous two blog posts in this series covered writing better job descriptions for SEO roles and the levels and seniority of SEOs. In this blog post, I mostly want to talk about how to grow as an SEO, the fundamental part of this series: how do you get better, what do you grow in and, most of all, what tools and resources do you have available?
“It’s not about resources, it’s about resourcefulness” – I’ll leave it up to you to Google who this quote is from. It’s not that hard to find, and it’s a quote that hit me a few years ago when I felt I was stalling in a role and needed to move forward. There is so much that you can do yourself to advance your career and learn. That’s why I wanted to focus in this blog post on the things that I’ve used, and in most cases am still using, to learn more about SEO.
I’ve tried to list as many different learning options as possible because, in the end, everybody learns in a different way. One of my favorite questions to ask in interviews is how people learn best. If they’re aware of what works for them, it for sure earns the candidate a bonus point.
Conferences & Meetups
While I was still living in Amsterdam, at some point I felt like I hadn’t missed a single meetup related to online marketing in a while. I went to a ton of them, and they were great. There is so much to learn at an event, especially soft skills: networking, talking, socializing, small talk. These skills are just as important as (I’d argue even more important than) the on-the-job skills (crawling, technical, content, etc.).
So let’s give you a selection of the blogs that I’ve been following over the years that helped me build my SEO knowledge. These are some of the ones on the list (read: it’s far from complete, I’ll keep updating the list with specific SEO blogs).
Follow people, follow experts. You can learn so much from the approach that other people are taking, even if it’s just to get a different insight or to learn a new tool. SEO is a rich field where everybody has their own tactics, and I feel strongly that every week I pick up some new tactic in SEO that I’d never thought of (mostly around research or authority building). We can’t know everything, but following others is a good tactic to keep learning.
Talk to people: they can tell and teach you more. For years I’ve asked people at companies that I admire if I could have coffee with them. If you’re reading this and you’re one of them, thank you again!
Don’t ask for trade secrets, but if you do your research it will strengthen the conversation. So let’s expand a little bit on that …
I just mentioned that having coffee is a great way to learn more, but make sure you come prepared. As we’re talking about SEO, run an audit of their site and ask them why they’ve done certain things a certain way. I’ve learned a great deal just from analyzing and researching the best sites and trying to figure out what their SEO strategies are, after which I got that confirmed by their teams over coffee. I’m not telling this to brag, but to give you an insight into what you can do to get more out of the meeting too. It will strengthen the conversation. You’re using somebody’s time, and she or he will likely appreciate it if you know what they’re talking about in more detail.
Creating Playbooks / Keeping track
Recently I shared the idea behind building a playbook for the first time, in a presentation that I gave at a conference. It’s something that I’ve actually been doing for a few years now. For about five years I’ve been saving job descriptions, mostly not of jobs that I wanted to be hired for myself, but of Marketing roles that I thought I was going to hire for one day or grow into. These days they’re a great archive (I have close to 200!) for whenever I need to fill a specific role.
But the same methodology applies to most parts of somebody’s work: most content teams have a style guide, and when you’re working in CRO you have templates to document and hypothesize your experiments. I felt that most of these ‘standards’ were missing across functions within Marketing, and specifically within (in-house) SEO teams.
What other tools are out there for others to use as well? What learning options have I missed and should I add to the post? Leave a comment here or on Twitter: @MartijnSch and I’ll make sure to keep this post updated, just like the others.
Growing as an SEO – This series
In this series I’ve also blogged about:
Since I joined RVshare, I’ve needed to think a lot about these questions (again): what people do I need to hire? What experience level do they need to be at? This made me reflect on hiring for my teams at Postmates and The Next Web and on my views on different levels in certain functions. As my background is mostly in SEO, I started to think about what levels I would form within a big SEO team and what their differences are. This is my first attempt at this framework and part of the series about growing as an SEO. The previous blog post talked about how to write a proper job description for an SEO role.
In this blog post I want to talk more about the different seniority levels, what do they mean? What kind of role are you looking for: specialist or generalist? What level are they at? And what kind of levels do you need for your own team and what might be the different responsibilities for the different roles and how do they change (over time).
Generalist versus Specialist
Are you a smaller or bigger company and how big is your SEO team? What are you really looking for on your team? What is your own background? Do you know enough about SEO yourself to successfully guide & lead an SEO person?
You’ve probably heard of the idea behind a T-shaped role. Do you expect somebody to know a lot in one specific area (specialist), or do you want that person to also know a lot about the other areas that relate to SEO? This visual is just the tip of the iceberg of other skills that you can expect from an SEO.
I started my career as a generalist, like most people. Back then, Springest had a need for traffic acquisition and I worked on their SEO, Affiliate Marketing and later on their Paid Acquisition (mainly AdWords). Next to that, I worked a lot with Google Analytics to learn more about the keywords that were driving performance (this was before ‘not provided’ got introduced).
Mostly in smaller organizations, I see marketing leaders or founders hire for this type of person. In most companies you need somebody early on who can test the waters for all the channels and manage more than just one thing. SEO usually isn’t the fastest growing channel for a company, as it takes a while to show results. That is a big reason why there aren’t a ton of people with a dedicated focus on SEO in most startups.
Later on, when I left Springest and joined The Next Web, I was much more of a specialist. I focused solely on SEO, although later on I added analytics and CRO (all before I led their Marketing team). This meant that I needed to be proficient in all the areas that were part of SEO: technical SEO, content (we had tons of editors to work with) and figuring out how to build out our authority at a big scale. All of this work was very much focused on SEO only and didn’t have much impact on other channels.
Most SEO roles that I see these days are similar: they’re usually part of a digital marketing team and/or are the only person on the team with a dedicated focus on SEO. They are often in contact with a product manager, a marketing manager and the people focused on content, design, and development that they need. But they’re the ones driving the specific roadmap for SEO.
Individual Contributor (IC) & Management (M) Roles
Not everybody is a generalist or a specialist, neither is everybody a manager or wants to focus on just one discipline. But for most people, it makes sense to belong to a specific ladder.
Individual Contributor Roles in SEO
With most companies, you’ll start at the bottom of the totem pole when you start your career in SEO. Most people will start right around the title of SEO Associate or SEO Specialist at the beginning of their career and work their way up the ladder. After a while, most of them will need to make the decision to either continue to be an Individual Contributor (IC) or move into the role of manager where they start managing (or better: leading) people.
- SEO Internship: We all need to build up experience, and what better way to do that than with an internship/apprenticeship? This role will usually get the support of the SEO team while you learn how SEO works. Most people that I’ve seen enter this role have a passion for online marketing and are studying something in a related field (or totally not, which is sometimes even better). You’re never in this role for long (at most 5 months): you tend to either like it or not, and if you do you can move up the SEO ladder.
- SEO Associate: In some cases, this role comes in between an internship and having the title SEO Specialist. This usually happens within enterprises where you’re dealing with bigger SEO teams. There is not a ton of difference between the role of an SEO Specialist and the SEO Associate, but usually SEO Specialists tend to have a little bit more experience (1-2 years as a maximum). They’re starter positions and sometimes the titles are intertwined.
- (Senior) SEO Specialist: For most people, this is where they’ll start: the SEO Specialist. I’ve been there myself; when I joined TNW this was my job title. I was the only person in the Marketing & Sales department dealing with SEO and was answering to the (at the time) CMO. This meant that I was working on all the aspects of SEO and was working with a development team. As you gain experience, and depending on the size of the organization and its HR structure, you could get the title Senior SEO Specialist after a while to reflect that additional experience.
- (Senior) SEO Manager: You’re growing, you’re basically now sort of managing the SEO process and you’re not answering usually to somebody who’s leading the SEO team anymore. You’re the one in charge of SEO but you’re not leading anybody specifically on the SEO team itself.
- (Senior) Head of SEO: The highest level that I usually see on SEO teams as an Individual Contributor. You’re not managing other people but work deeply on SEO, and you have the fundamental knowledge and resources around you to manage the whole process from start to finish. There aren’t a ton of companies that I know of that are able to support this role, as most companies will require you to become a manager.
Management Roles in SEO
Some people choose to go the route of the manager: they want to lead a team and be responsible for multiple people. This is where management & leadership skills become more important, as they’re no longer working 100% of their time hands-on on SEO.
- SEO Team Lead: This role likely makes sense by reading the job title. You’re part of a small SEO team and you’re the lead. I like to apply this seniority level on a team when it’s small and the ‘manager’ isn’t very experienced yet as a leader. It’s usually the case when they have moved over from the level of SEO Specialist and you decide to hire another SEO Specialist. Somebody has to lead the wolf pack and decide on a strategy. If the 1st SEO person has the ambition to step over to a more managerial role over time, this is a good start.
- (Senior) SEO Manager: You’re managing the SEO team and you work with some people outside your own team to get things done. This is usually the case when you’re part of a bigger Growth or Marketing team and you’re the one deciding what work is important to help the bigger team achieve its goals.
- Director of SEO: You can think strategically about SEO and you’re part of a bigger organization. That was my last title at Postmates. The overall Growth organization that we were part of was around 50 people, and we had multiple Directors of different functions (Growth Product, Growth Engineering, Paid Acquisition, etc.) reporting into our VP of Growth. You lead a team that can also work cross-functionally with other teams within and outside the same group.
- VP of SEO: Likely the highest seniority title that I’ve seen for in-house SEO is VP of SEO. There are a few companies, mainly in the United States, that use that title. They’re enterprise companies (in all the cases that I’ve seen, at least). Where they differ from a Director of SEO is that they’re focused on the bigger picture. They lead a team that is usually 1.5-2 times as big as the level below and are responsible for more than just SEO. A position like this is usually also heavily involved in functions like Public Relations, Brand Marketing, and Content Creation, depending on where those live in the rest of the organization.
‘Global … Head of SEO’ – Global companies & reflecting this on titles
@micahfk reached out through Twitter with a good point about the title “Global Head of SEO”. I’ve seen this level a few times myself as well, and I agree with his point that this title can in most cases carry more weight than a title on a Director level. In companies at scale, there will be a global team managing all of the enterprise’s SEO strategy, while on the local level (usually countries or regions) teams will work on the local execution (and often strategy). They’ll have similar titles, but usually the people who head up a Global team will rank higher on the organizational chart.
This framework is simplified and not perfect. It’s a first shot at assessing what roles there are in an SEO function from an in-house point of view. As I’ve never worked with/for an agency I’m sure their views on this would be different, I won’t blog about that. It’s up for grabs for somebody who has extensively worked on that side of the fence.
Work in Progress: This blog post is a work in progress. I hope to extend it over the upcoming weeks with more information on the responsibilities and areas that the different roles work on.
We have all been there, haven’t we? Do quotes like “SH*T, my sitemaps are broken”, “I have no-indexed half my pages” or “I have been kicked out of a search engine with way too many pages” sound familiar? Honestly, I can’t blame you. It’s getting harder and harder to keep track of all the changes that are being made regarding SEO on your site, and you’re likely the only person involved with SEO at your company while also trying to drive traffic through other channels. So let me give you a quick insight into what I usually track within bigger companies, where there is an actual team to react to issues that come up.
Writing a resume isn’t fun (IMHO) and writing job descriptions is probably even less fun. Over the last years I’ve written many of them, usually following a similar template that would help us define what the role is about. That isn’t always a good thing: depending on the seniority of the role, you want to make sure you use the right approach to hire and make it as personal as possible. That usually makes for better hiring; most of my best hires came through my network, from people I was at least aware of. Over the last months I’ve received many requests asking if I wanted to take a look at an SEO job description, if I knew people that were looking for a job, if I wanted to share it with my network, you get it. But what I started noticing is that most SEO job descriptions are incredibly generic and don’t really seem inviting to many people.
“We’re looking for somebody to set up our SEO strategy, we’re looking for somebody to work with our engineering and design team to create content. You’ll pick the right keywords for us to focus on”. Yada yada yada. You’ve seen and heard it all before. Obviously, when you’re on a job search in SEO you’ll come across all of these requirements and responsibilities easily. But I think companies need to do better, definitely in an area like Silicon Valley, to hire the right SEO talent or even to get them interested. There aren’t that many of us, and the information you give ‘us’ isn’t always great. That got me thinking about what information should be mentioned in job descriptions for SEOs. But I also wanted to take a look at what job descriptions look like right now:
Saving job descriptions
I must admit, I have a weird obsession: if I see well-written (or really poor) job descriptions, for whatever type of role in digital, growth, marketing, you name it, I have a tendency to save them (in Evernote). Over the years that has built up into a nice archive (150+ JDs) that I can use when writing new job descriptions for hiring. The 16+ companies amongst them include Airbnb, Uber, Groupon, Booking, Zillow, Hulu, Porch and Tesla, and the descriptions range from SEO Assistants to more senior positions like Senior Director of SEO. Add to that all the job descriptions that you can easily find on most job sites (LinkedIn, Glassdoor) and you can get a good enough understanding of what managers + recruiters are thinking about while sourcing/hiring for SEO roles.
Unfortunately, Postmates didn’t have a job description for me, as my previous boss asked me to fill this need within the Growth team. Otherwise I would have loved to share that original one.
What are companies looking for?
The perfect job description doesn’t exist, even when you’re in the right position and might be able to write your own. Most of them have some issues, so I decided to look at all the SEO job descriptions that I could find and see if there are any patterns in what companies are looking for. So let’s look at the two main areas of job descriptions:
Tag clouds are good for something I guess, that’s why I just threw in all the requirements for a dozen job descriptions and these were the main keywords that came up in the tagcloud. Some of the ones that stood out for me:
- Performance: This keyword was interesting to me so I did some digging on the context, I expected it to be a requirement to know about performance marketing. Turns out the overwhelming majority of companies wants better performance reporting around their SEO strategy.
- Content: People in SEO need to have a solid understanding of content, know how to create it and maybe even more important, know how to improve it.
- Technical: Guess what, these days SEOs need to be technical. As most of the job descriptions are from Bay Area companies, that doesn’t surprise me at all, considering that they work with product managers (or in some orgs even are PMs) and engineers most of the day. This is also important for technical audits, which are usually performed in-house.
- Strategies/Initiatives: SEOs need to be able to make strategic decisions. For most companies they’re one of the people working on what is usually the biggest traffic channel for the site, so they need to be able to think strategically, as they can make changes to a platform that have a bigger impact than just SEO.
- Team(s): They need to be great at working in teams (aka a team player), and in the more senior positions they also need to be great at building up or building out their own teams.
While analyzing this there were a few things that I was missing that I thought were interesting so at least I wanted to mention them.
- Agencies: A good portion of SEOs that I know work with agencies, but there was barely a mention in job descriptions about working with agencies, finding them, etc.
- ASO: Most companies that I went through had mobile apps, but ASO was never really part of the job description.
Requirements / Qualifications
- Experience in SEO: For starter roles this is usually not a requirement, as they can only have experience with the work that they’ve done on the side and not in an actual job/company.
- Experience in Analysis: Most SEOs need to be familiar, at least on a basic level, with a web analytics tool like Google Analytics, Omniture or Adobe Analytics so they can analyze their performance (one of the core responsibilities).
- Tools: Often I see experience with Google Search Console being mentioned, but I’d love to see more companies mention the other tools in their toolset too. In the end you won’t share that much information with your competition by telling them what tools you’re using.
- Delivering results: Although you can’t guarantee that your work will help, you need to be able to show the progress that you’ve made on other sites and the work that you’ve done there. If it didn’t result in an uplift, you should at least be able to explain why not and what your original hypothesis was.
What I feel is missing in the list of requirements & qualifications are a few things. What about the setup that you already have, or is the new hire diving into a new field of opportunity? Are you going to expand your business, are you operating in new niches? At some companies the future manager will already know what projects (s)he wants the new hire to work on.
- Tools: What is your current toolset? If somebody has exceptional expertise with a certain tool, that for sure would help. Anybody can learn more about a tool, but experience is important too.
- How often have they played ‘this’ game before? How many sites have they worked on, and what was the scale/business model of those sites? I have way more experience than average with publishers and marketplace models, probably more than other SEOs, while somehow I have barely worked for ecommerce sites and SaaS companies thus far. This also gives better insight into whether they have a certain ‘playbook’ for approaching certain issues.
Writing the Ultimate Job Description
I’m on a journey to change the world. OK, slowly. And one by one. But I believe we can do better: helping people find the right jobs will make them happier and increase the productivity and output of the company. The first step to get that started would be to improve job descriptions, so people have a better idea of what they’re getting into than a very generic one gives them. Not all bullet points will apply to every job description, but you likely get the point:
- Define the SEO strategy: we want to grow (X metric) by approximately XX% this year. SEO is one of the channels that we depend on, so we’re looking for somebody who could build out the channel after an intensive audit and figure out what opportunities we really have.
- Reporting: be able to use our analytics infrastructure to dive into customer & traffic data to find new insights and opportunities for us to grow SEO as a traffic channel.
- Reporting Up: be able to talk to our stakeholders and peers in the company about the performance and opportunities that you see within SEO. Be able to communicate the results of the work that we‘ve done.
- Analytical: be analytical and data driven, are you able to write SQL and work with large amounts of data? Great! We have some of our analysts ready to work with you in supporting the insights that you need to gather.
- Technical: we have developers ready to work with you, so it would help if you could code and be able to explain in detail what your wishes are for implementations regarding SEO and new features.
- Content: we’ve been wanting to create & produce more and better content. It would be great if you have worked with copywriters and are able to take our blog & content marketing efforts to the next level. We have copywriters that we work with and also our PR specialists.
- Build out the team: be a team leader and builder. Currently the team is 2 people that will be supporting you, but we hope to build out the team with your support. So we’d like to see experience leading people & teams.
- Performance: you need to be able to identify opportunities, build out the resources needed and along the way have a ton of fun while always striving for better results.
Requirements / Qualifications
- You have X years working experience in online/digital marketing and you know what channels are important for our type of business to be successful.
- You have worked on SEO for (multiple) big sites before; it is important to us that you can show experience building out a strategy for a bigger site (50,000+ pages).
- You have worked with web analytics tools and understand how you can use these insights to further improve user experience and optimize pages for search engines. Preferred tools would be: Google Analytics, Amplitude Analytics, Adobe Analytics, …, etc.
- Do you have experience writing, or have you worked with copywriters before? Great! This will help push forward our ideas on content marketing.
- You have experience managing different products/projects at the same time, our teams are divided between products/projects and some are cross functional (designers, engineers).
- You have worked before with tools that we already have in our toolset: Google Search Console, Bing Webmaster Tools, Majestic SEO, Screaming Frog, … , etc. but you’re free to look into other SEO tools (up to enterprise budget) and evaluate needs for our organization.
This is not even good enough yet, but hopefully a good start. In the job descriptions that I usually write I also try to give insights into the company, mention what the team looks like and what the perks & benefits of the role are. But most important is what type of person we’re looking for and how we think this role will help and support the bigger team as it grows. In the end it’s a two-way street and we want to make that clear from the start. You need somebody’s skills but you also want them to feel welcome and appreciated!
What do you think is really missing in job descriptions these days that should be reflected? What are you looking for in a next or first SEO role? Let me know, I’d love this post to become the ultimate SEO job description for the rest of the world. Hit me up at @MartijnSch on Twitter with feedback!
In my effort to write longer posts on a specific topic I thought it was time to shed some light on something that we’ve been working on during the last months at Postmates, and something that I never thought could become an interesting topic: sitemaps. They’re pretty boring in themselves: it’s a technology where you basically give search engines all the URLs for a site that you want them to know about (and index), and you take it from there. Even more so as most sites these days run on a CMS like WordPress, where tons of plugins can take care of this for you. Don’t get me wrong, do use them if you are on one! But as I work mainly for companies that don’t have a ‘standard’ CMS, I have worked multiple times on creating sitemaps and making their integrations work flawlessly. Over time that taught me a ton of things, and recently we discovered that certain additional features can help speed up the process. That’s why I think it was time to write a detailed essay on sitemaps ;). (*barf: definitive guide).
TLDR; How can sitemaps help you get better insights, how to set them up?
- Sitemaps will provide you with insights on what pages are submitted and which ones are indexed.
- You create sitemap files by uploading XML or TXT files with dumps of URLs.
- All your different content on pages can be added to sitemaps: images, video, news.
- Different fields for priority, last modified and frequency can give search engines insights into which URLs should be crawled first.
- Create multiple sitemaps with segments of pages, for example by product category.
- Add your sitemap index file to your robots.txt so it’s easy to find for a search engine.
- Submit your sitemap and ping sitemap files to search engines for quick discovery.
- Make sure all URLs in your sitemaps are working and returning a 200 status code, and think twice: do you want all of them to be discovered?
- Monitor your data and crawls through log files and Google Search Console.
When you start working on sitemaps there are a few things to keep in mind: the ideas that you have around them and the goal, meaning what problem of yours they are solving. For small sites (100 pages) I’m honestly not sure if I would support sitemaps. There are probably a lot of other projects that would have more impact on SEO/the business.
If you’re thinking about setting up sitemaps, there are a few goals that they will help you accomplish:
- Get better insights into what pages are valuable to your site.
- Provide search engines with the URLs that you want them to index, the fastest way to submit pages at scale.
Overall this means that you want to support the best sitemap infrastructure you can, as that will give you the best insights in the quickest way and, most of all, get your pages submitted and indexed as fast as possible.
Format: XML or text? Does the format matter? For most companies probably not, as they’re using a plugin to support their sitemaps. If you want to go more advanced and get better insights I would go with the XML format myself. From time to time we’re using text file sitemaps where we just dump all the URLs. They’ll get you a quick and dirty sitemap if you’re short on time or resources.
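To make the difference between the two formats concrete, here is a minimal sketch in Python that writes the same list of URLs both ways. The function names are my own for illustration and the example.com URLs are placeholders; this is not the setup we run, just the smallest version that follows the sitemaps.org format.

```python
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_xml_sitemap(urls):
    """Return an XML sitemap with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters like & in the URL for us.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def build_text_sitemap(urls):
    """Return a plain text sitemap: one URL per line, nothing else."""
    return "\n".join(urls) + "\n"


if __name__ == "__main__":
    pages = ["https://www.example.com/", "https://www.example.com/about"]
    print(build_xml_sitemap(pages))
    print(build_text_sitemap(pages))
```

The text version is clearly less work, but it has no place for extra fields per URL, which is why I’d lean towards XML once you want more than a plain dump.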
Types: There are multiple formats for sitemaps to support different content types.
- Pages: In there you’ll dump all the actual URLs that you have on the site and that you want a search engine to know about. You can add images for these specific pages to that sitemap as well, to ensure that the search engine understands what images are an important part of the page.
- Images: For both image search and making an impact with the pages themselves, you can add sitemaps for images.
- Videos: Video sitemaps used to have a bigger impact back in the day, as video listings were a more prominent part of the search results page. These days you mostly want to let search engines know about them as they’re usually part of an individual page.
- News: News is not really its own format as they’re just individual pages. But Google News sitemaps do have their own format. Creating a News Sitemap – Google.
- HREFLang: This is not really a type of content but it’s still important to think about. If your pages have a translated version, you want to make sure they’re being listed as the ‘duplicate’ version of that. Read more information about that here in Google’s support.
- Frequency: Does the page change on a regular basis? Some pages are going to be dynamic and will always change, but others will only change daily, weekly or monthly. It’s likely worth including this as a signal, in combination with the Last Modified field and the Last-Modified header.
- Last Modified: We do want to let a search engine know which pages have been updated/modified and which ones haven’t. That’s why I’d always recommend that organizations include this in their sitemap. In combination with the Last-Modified header, which we’ll talk about in the next step, it will be a good enough signal to assess if the page has been modified or not.
- Priority: This is a field that I wouldn’t spend too much time thinking about. On multiple occasions, Google has mentioned that they don’t put any value or effort into understanding this field. Some plugins use it and it won’t hurt, but for custom setups it’s not something that I would recommend adding.
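As a rough illustration of the fields above, this Python sketch builds a single sitemap entry with a last modified date and an optional change frequency, and leaves priority out. The helper name url_entry and the example URL are assumptions of mine; lastmod and changefreq are the element names from the sitemaps.org protocol.

```python
from datetime import date
from xml.etree import ElementTree as ET


def url_entry(loc, last_modified, change_frequency=None):
    """Build one <url> element with <loc>, <lastmod> and optional <changefreq>."""
    entry = ET.Element("url")
    ET.SubElement(entry, "loc").text = loc
    # The protocol accepts W3C dates such as 2018-01-15.
    ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
    if change_frequency is not None:
        # Valid values include always, hourly, daily, weekly, monthly, yearly, never.
        ET.SubElement(entry, "changefreq").text = change_frequency
    return entry


if __name__ == "__main__":
    entry = url_entry("https://www.example.com/product/1", date(2018, 1, 15), "weekly")
    print(ET.tostring(entry, encoding="unicode"))
```

In a real setup the date would come from wherever you store when the page content was last changed, not from when the sitemap was generated.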
Has the actual sitemap changed since the last time it’s been generated? Yes or No? In some cases your sitemap won’t change. You didn’t add any new products/articles. Have you ever run this in your terminal:
curl -I https://www.example.com/sitemap/sitemap_index.xml
Look at the headers: if you see a Last-Modified header, it signals when the file was last modified. We use it to tell search engines the last time the sitemap was updated. We combine this with serving a Last-Modified header on the URLs that are in the sitemaps. This won’t always work, as some pages can change from moment to moment (based on availability of products, for example).
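To show how a crawler (or your own monitoring) could use that header, here is a small Python sketch that compares a Last-Modified value with the time of a previous fetch. The function name and the plain dictionary of headers are assumptions for illustration; in practice the headers would come from your HTTP client, and header names there are usually case-insensitive.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def modified_since(headers, last_fetch):
    """Return True if the Last-Modified header is newer than last_fetch.

    headers is a plain dict of response headers; last_fetch must be a
    timezone-aware datetime. Without the header we assume it changed.
    """
    value = headers.get("Last-Modified")
    if value is None:
        return True
    return parsedate_to_datetime(value) > last_fetch


if __name__ == "__main__":
    headers = {"Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT"}
    previous = datetime(2015, 10, 20, tzinfo=timezone.utc)
    print(modified_since(headers, previous))
```

Treating a missing header as "changed" is the cautious choice here: it costs an extra download but never skips a real update.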
For better insights it’s really useful to segment your sitemaps. The limit per sitemap is 50,000 URLs, but there is no required minimum. You’ll see sitemaps being segmented in multiple ways, and based on those segments you can get more detailed insights: is one category of pages better indexed than another?
Categories: Most companies that I work with segment their pages by the categories they’ve defined themselves. This could be based on region or, for example, on product categories for an ecommerce site.
Static Pages: Something that most people with custom-built sites don’t realize is that there are usually still a ton of pages that aren’t backed by a database and that you want insights on too. Think about: contact, homepage, about us, services, etc. List all these pages in a separate sitemap (static_sitemap.xml) and include this file in your sitemap index too.
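As a sketch of what that segmentation could look like in code: the snippet below groups URLs by category and splits every category into files of at most 50,000 URLs. The file naming pattern and the category names are assumptions for the example, so pick whatever fits your site.

```python
from collections import defaultdict

MAX_URLS_PER_SITEMAP = 50_000


def segment_urls(pages, max_urls=MAX_URLS_PER_SITEMAP):
    """Group (category, url) pairs into sitemap files of at most max_urls URLs."""
    by_category = defaultdict(list)
    for category, url in pages:
        by_category[category].append(url)

    sitemaps = {}
    for category, urls in by_category.items():
        # Start a new numbered file every max_urls URLs.
        for part, start in enumerate(range(0, len(urls), max_urls), start=1):
            sitemaps[f"{category}_sitemap_{part}.xml"] = urls[start:start + max_urls]
    return sitemaps
```

With a mapping like this you know exactly which file every URL ends up in, which makes the per-segment reports in Google Search Console a lot easier to interpret.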
If you have multiple sitemaps (10-25+), you want to look into creating a sitemap index file. With this you can submit just 1 file, and the search engine will be able to find all the underlying sitemap files through it. This saves you from adding multiple sitemap URLs to Google Search Console/Bing Webmaster Tools and also means you only have to add 1 line to your robots.txt file. Technically it’s just another sitemap, one that lists the URLs of all the other sitemaps.
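Generating the index file itself is not much work. This is a minimal sketch, assuming you already have the list of sitemap URLs. The function name and the optional `lastmod` argument are my own choices for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", SITEMAP_NS)


def build_sitemap_index(sitemap_urls, lastmod=None):
    """Return a <sitemapindex> document that lists the given sitemap URLs."""
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = sitemap_url
        if lastmod:
            # Same date applied to every entry, to keep the sketch short.
            ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(index, encoding="unicode")
```

In practice you would probably want a separate last modified date per sitemap file, so that a crawler can skip the files that didn’t change.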
You want to make sure that on first entry a search engine knows about your sitemaps. Usually one of the first files a search engine’s crawler will look at is the robots.txt file, as it needs to know what it can/can’t look at on a site. As we just talked about the sitemap index, we’re going to list that one in the robots.txt file for your site, which should live at https://www.domain.com/robots.txt. It’s as simple as adding this one line to it:
Sitemap: https://www.domain.com/sitemap/sitemap_index.xml
Obviously the URL can be different based on where you have hosted your sitemap index file.
If you’re a big site you likely have servers that won’t go down and can take quite a hit. But if you have extensive sitemap files, they can easily get up to 50MB+, and that is not a file transfer that can be done in a matter of two seconds. It can also slow things down on both your end and the search engine’s end. That’s why we’ve started gzipping our sitemap files to make for a faster download and speed up that process. At the same time, you make it 1 step more complicated for people to copy-paste your data.
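Gzipping doesn’t need much code. A minimal sketch with Python’s standard library, where the function name is just something I picked for this example:

```python
import gzip


def compress_sitemap(xml_text):
    """Return the gzip-compressed bytes for a sitemap XML string."""
    return gzip.compress(xml_text.encode("utf-8"))


sitemap_xml = "<urlset>" + "<url><loc>https://www.example.com/</loc></url>" * 1000 + "</urlset>"
compressed = compress_sitemap(sitemap_xml)
# Repetitive XML compresses well, so the second number should be much smaller.
print(len(sitemap_xml.encode("utf-8")), len(compressed))
```

You would then write the bytes to a file ending in .xml.gz and reference that URL in your sitemap index.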
PING Search Engines
Guess what, it has an effect. I thought it was crazy too, but we found a tiny bit of proof that pinging a search engine actually results in something. As you’ll most likely only care about Google and Bing, we still have a way of letting them know about new or updated pages: pinging them with your sitemap.
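A ping is nothing more than a GET request with your sitemap URL as a parameter. Below is a minimal sketch of that. The endpoint URLs are an assumption based on the ping URLs as they were documented for Google and Bing, so verify in their current documentation that they are still supported before relying on them.

```python
import urllib.request
from urllib.parse import urlencode

# Assumed ping endpoints; check the search engines' documentation first.
PING_ENDPOINTS = {
    "google": "https://www.google.com/ping",
    "bing": "https://www.bing.com/ping",
}


def build_ping_url(engine, sitemap_url):
    """Build the ping URL with the sitemap URL as an encoded query parameter."""
    return f"{PING_ENDPOINTS[engine]}?{urlencode({'sitemap': sitemap_url})}"


def ping(engine, sitemap_url):
    """Fire the GET request and return the HTTP status code of the response."""
    with urllib.request.urlopen(build_ping_url(engine, sitemap_url), timeout=10) as response:
        return response.status
```

Keeping the URL building separate from the request makes it easy to log what you pinged and when, which helps when you compare it against your server logs later.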
Submit your sitemap
It’s probably not worth explaining, but you need to make sure that you can get insights into your XML sitemaps and the URLs that are listed in them. So make sure to submit your sitemaps to Google Search Console and Bing Webmaster Tools.
One project that is not widely known is PubSubHubbub. It’s mostly used by publishers to have subscribed services instantly notified (through a specific push protocol) when new URLs are published in a feed. The protocol works through an Atom feed (do you still know about that format?) that you provide. Once you have registered the feed with the right services, it becomes easier for them to be notified of new pages.
XML sitemaps aren’t easy to read for a regular person. If you’re not familiar with the XML format, it might be uncomfortable. Luckily, a while back people invented XSLT. It lets you ‘style’ the output of XML files into something more readable, which makes it easier to see certain elements in the sitemaps that you’ve listed. If you want to make them more readable, I would advise looking into: https://www.w3schools.com/xml/xsl_intro.asp.
Search engines like sites that are of high quality: the pages are the best, the URLs are always working and the site never goes down. Chances are high that not all of this applies to your sitemaps, as some pages might not be great. Some things to consider when you’re working on this:
- 301/302/404: Are all URLs in your sitemap responding with a 200 response like they should? In the best-case scenario none of your URLs respond with any other response code than that. In reality, most sitemaps contain some errors.
- NoIndex: Have you included URLs in your sitemap that are actually excluded by a noindex meta tag or header? Make sure that’s not the case.
- Robots.txt: An even bigger problem: are you telling the search engine about URLs that you actually don’t want them to look at?
- Canonical Pages: Is the URL that you’re listing the canonical/original URL, or are you listing pages that are still ‘stealing’ their content from another page, like a filter page? Do you really want to list these URLs in your sitemap?
Some of these signals might have a big or small impact, others won’t matter at all. But at least think about the implications they might have when you’re building out your sitemaps.
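Two of the checks above are easy to script. The sketch below flags sitemap URLs that are disallowed by your robots.txt rules and URLs that respond with anything other than a 200. The function names are hypothetical, and it assumes you already collected the status codes with a crawler of your choice.

```python
from urllib.robotparser import RobotFileParser


def blocked_by_robots(robots_txt, sitemap_urls, user_agent="*"):
    """Return the sitemap URLs that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in sitemap_urls if not parser.can_fetch(user_agent, url)]


def non_200(status_by_url):
    """Return the URLs whose status code is anything other than 200."""
    return {url: status for url, status in status_by_url.items() if status != 200}
```

The noindex and canonical checks work along the same lines, but they need the HTML or headers of every page, so they are more of a job for a crawler.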
Lately I’ve been working a ton with Apache Airflow. It’s the framework that we use at Postmates, invented by the great folks at Airbnb and mostly used for dealing with data pipelines: you want to do X, and if X succeeds you want it to go on to task Y. We’re using that for the generation of sitemaps. If we can generate all sitemaps, we want to have them pinged to the search engines. If that succeeds, we want to run some quality scripts. And when that is done, we want to be notified on both email and Slack at what time the script succeeded.
For some sitemaps we want it to run every day; for a specific segment we want it to run on an hourly basis. The insights from Airflow give us the details to see if it’s failing or not and will notify us when it succeeds/fails. With this setup, we have constant monitoring in place to ensure that sitemaps are being generated daily/hourly.
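To show the idea of ‘only go to the next task when the previous one succeeded’ without requiring an Airflow installation, here is a plain Python sketch. It is not the Airflow API and not our actual pipeline. The step functions are empty placeholders that I made up for the example.

```python
def run_pipeline(steps):
    """Run (name, callable) steps in order and stop at the first failure."""
    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as error:
            return {"status": "failed", "failed_step": name,
                    "completed": completed, "error": str(error)}
        completed.append(name)
    return {"status": "succeeded", "completed": completed}


# Hypothetical placeholder steps; real ones would do the actual work.
def generate_sitemaps():
    pass


def ping_search_engines():
    pass


def run_quality_checks():
    pass


result = run_pipeline([
    ("generate_sitemaps", generate_sitemaps),
    ("ping_search_engines", ping_search_engines),
    ("run_quality_checks", run_quality_checks),
])
print(result["status"])  # this is what you would send to email/Slack
```

What a scheduler like Airflow adds on top of this is the scheduling (daily/hourly), retries and the overview of past runs.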
Eventually you only want to know if your pages are of good enough quality that they’re being indexed by the search engine. So let’s see how we can see this in Google Search Console.
A useful report in Google Search Console is the Index Status report (Google Index > Index Status). It will show, for the property that you’ve added, how many pages have been indexed and what pages have been crawled. As the main goal of a sitemap is driving up the number of pages submitted to the Google index, the following step is making sure that they’re actually being indexed. This report will give you that first high-level overview.
Sitemap Validation: Errors & Amount of URLs
But what about the specifics of the sitemap: are the URLs being crawled properly, and are the URLs being submitted to the index? The sitemap reports give you this level of detail (in this case 98% is indexed, which makes sense: the 2% missing are some test products that Google seems to have ignored, luckily!). Remember what we talked about before regarding segmenting your pages? If you had done that, you would see for each sitemap what percentage of its pages was submitted/indexed. Very useful if you work on big sites where, for example, the internal link structure is lacking and you want to push that. These reports can (but don’t always) give you insights into what the balance between them could be.
- Are the URLs working (200 status code)? It’s a little-known fact, but Google doesn’t like following redirects or finding broken URLs in your sitemaps. Spend some time making sure that these pages aren’t in there, or add the right monitoring to prevent it from happening. Since we started gzipping our sitemaps that’s become a tiny bit harder, as you first need to unpack them. But for quality testing we still have scripts in place that can run a crawl of the sitemap on demand to see if all URLs in there are valid.
- Page Quality: Honestly, is this page really worth indexing in Google? Some pages are just not of the quality that they should be, so you should take that into account when building your sitemaps. Did you filter out the right pages?
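The unpacking step before such a crawl could look like the sketch below: it takes the bytes of a gzipped sitemap and returns the URLs that are listed in it. This is an illustration under my own naming, not the script we run.

```python
import gzip
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_gzipped_sitemap(data):
    """Decompress a .xml.gz sitemap (as bytes) and return the URLs in its <loc> elements."""
    root = ET.fromstring(gzip.decompress(data))
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]
```

The returned list can go straight into whatever you use to check status codes.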
Metrics & Analysis
So far we’ve talked about the whole setup and how to monitor results. Let’s go a little step further before we close this subject and look at the information in log files. It’s a topic that I’ve become more familiar with and have worked closely with over the last months too:
As log files can be stored on the web server that you’re also using for your regular pages, you can get additional insights into how often your sitemaps are being requested and whether there are any issues with them. As we work on them on a regular basis, it could be that they break. That’s why we monitor, for example, the status codes for the sitemap URLs, so that we can see when a certain sitemap doesn’t return a successful 200 status code.
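A minimal sketch of that kind of monitoring: count the status codes per sitemap URL from access log lines. It assumes your server writes the common/combined log format and that your sitemaps live under a /sitemap path. Both are assumptions, so adjust the pattern and prefix to your own setup.

```python
import re
from collections import Counter

# Matches the request path and status code in a common/combined log format line.
LOG_PATTERN = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3})')


def sitemap_status_counts(log_lines, sitemap_prefix="/sitemap"):
    """Count (path, status code) pairs for requests to sitemap URLs."""
    counts = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match and match.group("path").startswith(sitemap_prefix):
            counts[(match.group("path"), int(match.group("status")))] += 1
    return counts
```

Anything in the result with a status other than 200 is worth an alert, and the counts per hour will also show you how often a crawler comes back for a specific sitemap.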
Proving that pinging works
A while back we started to ping our sitemaps to Google and Bing. Both make it clear (Google) that if you have an existing sitemap and you want to resubmit it, this is a good way to do it. This sounded weird, as Google got rid of their ‘submit a URL’ feature for the index years ago, so we were skeptical about whether this would have any impact. It was really easy to implement: you just fire a GET request to a Google URL with the sitemap URL in it. What we noticed is that Google almost immediately tried to look at these URLs. As we refresh this specific sitemap every hour, we also ping it to Google every hour. Guess what happens: every hour for the last few weeks, they’ve looked at the sitemap. Who says you can’t influence crawlers? The result? If you want to ensure that Google is actually looking at a page and actively crawling it, our test suggests that pinging makes that happen.
Screenshot of this from a Kibana dashboard where we log server requests
What if you can’t ping? Usually I would only recommend pinging a search engine if your whole sitemap generation process is fully automated; it doesn’t make sense to open your browser or keep a tiny script around just for this. If you still want basically the same effect, use the Resubmit button in Google Search Console > Sitemaps.
This is not all of it and I’ve gone over some topics only briefly. I didn’t want to document everything, as there’s already a ton of information from Google and other sites about how to set up sitemaps specifically. In my case, we’re on a route to figure out how to make our sitemap setup near perfect. What I still want to investigate or analyze:
- Adding a Last-Modified header to pages in the sitemap: after pinging a sitemap, does Google look at all pages or just the ones that were modified?
- Segmenting them even further: let’s say I only add 100/1000 pages to a sitemap and just create more of them. Does that influence crawling, and do we get better insights?
If you want to learn more about sitemaps, look into the following resources about the concept, the idea behind it and the technical specification:
When I started writing I didn’t plan to have this become everything I know about sitemaps. But what did I miss? What optimizations can we apply to sitemaps in order to get better insights and speed up the crawling of pages? This is just one of the areas of technical SEO, but probably an important one if you’re looking for deeper insights into what Google or Bing think about your site. If you have questions or comments, feel free to give a shout on Twitter: @MartijnSch