Retrieving Search Analytics Data from the Google Search Console API for Bulk Keyword Research
Last year I blogged about using 855 properties to retrieve all your Search Analytics data. Just after that, Google luckily announced that the API limit of only the top 5,000 results had been lifted. Since then it’s been possible, at least in theory, to pull all your keywords from Google Search Console via the API (hint: you’re not able to get all the data).
Since I started at Postmates, now well over two months ago, one of the biggest projects I’ve taken on was getting insight into which markets and product categories we’re already performing OK in from an SEO perspective. With over 150,000 unique keywords weekly (and working on increasing that), it is quite hard to get a good grasp of what’s working and what isn’t, as we’re active in 50+ markets that influence the queries people are searching for. For example: show me all the queries over a longer period of time with only Mexican in the title, across all markets. That’s impossible from the interface. Clicking through the Search Analytics feature in Google Search Console was nice for checking specific keywords quickly, but overall it wouldn’t give detailed insight into what’s working and what’s not.
Some of the issues I was hoping to solve with this approach:
- Pull all your data on a daily basis so you can get an accurate picture of the number of clicks and how that changes over time for a query.
- Hopefully get some insights into the actual number of impressions. Google AdWords Keyword Tool data is still very valuable, but as it’s grouped it can be off on occasion. Google Search Console should be able to provide more accurate data on a specific keyword level.
- Use the data as a basis for further keyword research and categorization.
Having used the Google Search Console API a bit before, I was curious to see what I could accomplish by pulling in the data on a daily basis and making sense of it (and combining it with other data sets; maybe more on that in later blog posts). The setup works like this:
- Pull in all the keywords daily, grouped by landing page, so you know for sure that you get all the different keyword combinations and your data isn’t filtered by the API.
- Save the keyword if we haven’t saved it before, so we know when it showed up as a ‘first hit’.
- For every keyword that is returned, make another call to the API to get the country, landing pages and metrics for that specific query.
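The three steps above can be sketched roughly like this in Python. It’s a minimal sketch under a few assumptions, not the actual script: `run_query` is a hypothetical callable that sends one request body to the Search Analytics `query` endpoint and returns the parsed JSON response (with google-api-python-client that could be a thin wrapper around `service.searchanalytics().query(siteUrl=..., body=body).execute()`), `seen_keywords` stands in for whatever storage you use, and asking for the `page` and `query` dimensions together is just one reading of “grouped by landing page”.

```python
from typing import Callable, Dict, Iterator, List, Optional, Set, Tuple

# Hypothetical: sends one request body to the API and returns the JSON response.
QueryFn = Callable[[dict], dict]


def fetch_rows(run_query: QueryFn, day: str, dimensions: List[str],
               filters: Optional[List[dict]] = None,
               page_size: int = 5000) -> Iterator[dict]:
    """Page through one day of Search Analytics rows using startRow."""
    start_row = 0
    while True:
        body = {
            "startDate": day,
            "endDate": day,
            "dimensions": dimensions,
            "rowLimit": page_size,
            "startRow": start_row,
        }
        if filters:
            body["dimensionFilterGroups"] = [{"filters": filters}]
        # The response has no "rows" key when there is no data.
        rows = run_query(body).get("rows", [])
        yield from rows
        if len(rows) < page_size:
            break
        start_row += page_size


def daily_pull(run_query: QueryFn, day: str,
               seen_keywords: Set[str]) -> Tuple[List[str], Dict[str, List[dict]]]:
    """Run the three steps for one day; returns (first-seen keywords, details)."""
    # Step 1: pull every (landing page, query) combination for the day.
    keywords = {row["keys"][1]
                for row in fetch_rows(run_query, day, ["page", "query"])}

    # Step 2: remember keywords we have never stored before.
    first_seen = [kw for kw in sorted(keywords) if kw not in seen_keywords]
    seen_keywords.update(first_seen)

    # Step 3: one extra call per keyword for country, landing page and metrics.
    details = {}
    for kw in sorted(keywords):
        only_this_query = [{"dimension": "query", "operator": "equals",
                            "expression": kw}]
        details[kw] = list(fetch_rows(run_query, day, ["country", "page"],
                                      only_this_query))
    return first_seen, details
```

Passing `run_query` in as a function keeps the paging and bookkeeping separate from authentication, which also makes it easy to try the logic against canned responses first.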
In our case we categorize the keywords right after we pull them in, to see if they match a certain market or product category. So far this has been really useful for us, as it gives us much better options for dashboarding.
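To give an idea of what that categorization step could look like, here is a minimal sketch that matches each keyword against lists of terms. The markets, categories and terms below are made up for illustration (they are not our real lists), and simple whole-word matching is an assumption; a real setup may well need something smarter.

```python
import re
from typing import Dict, List, Optional

# Hypothetical term lists, for illustration only.
MARKETS: Dict[str, List[str]] = {
    "new york": ["new york", "nyc"],
    "los angeles": ["los angeles"],
}
CATEGORIES: Dict[str, List[str]] = {
    "mexican": ["mexican", "tacos", "burrito"],
    "pizza": ["pizza"],
}


def _first_match(keyword: str, terms_by_label: Dict[str, List[str]]) -> Optional[str]:
    """Return the first label with a term that appears as a whole word."""
    for label, terms in terms_by_label.items():
        for term in terms:
            if re.search(r"\b" + re.escape(term) + r"\b", keyword, re.IGNORECASE):
                return label
    return None


def categorize(keyword: str) -> Dict[str, Optional[str]]:
    """Tag a keyword with a market and a product category, if any match."""
    return {
        "market": _first_match(keyword, MARKETS),
        "category": _first_match(keyword, CATEGORIES),
    }
```

Storing these two labels next to each keyword is what makes a question like “all Mexican queries across all markets” a simple filter in a dashboard.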
What to look out for?
Some of the things that I ran into while building this out:
- The API heavily limits the keywords you get to see that only have impressions. I was able to retrieve some of that data, but on a daily basis the impression numbers are off by 50% from what I’m seeing in Google Search Console. Clicks, however, seem to show only a small difference. Win!
- Apparently they’re hiding some of the keywords as they qualify them as highly personal. So you’ll miss a certain percentage because of that.
- The rate limits of the Google Search Console API aren’t very nice: for over 5k keywords it takes quite long to pull in all the data, as you have to work within those limits.
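On the rate-limit point, one common way to cope is to retry with exponential backoff. The sketch below is generic and assumes a few things: `RateLimitError` is a hypothetical exception that your own request function raises when the API reports a quota error, and the delays are placeholders rather than values tuned to the real quotas.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical: raised by your request function on a quota/rate-limit error."""


def call_with_backoff(func: Callable[[], T], max_attempts: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, waiting base_delay * 2**attempt (plus jitter) between retries."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller decide what to do
            # Random jitter avoids retrying in lockstep with other workers.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
```

Wrapping each per-keyword request in `call_with_backoff(lambda: ...)` keeps the pull going when a quota is hit, at the cost of a longer total run time.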
Most of these items aren’t really an issue for us, as we have better sources for volume data anyway. In the future we’re hoping to gather more data from different sources to extend that. I’m hoping to blog about that sometime in the future.
June 19, 2017 - 3:17 pm
Thanks for this mate, I really need to look at testing this in my new role… ideally pushing it all into Tableau to start to dice it up and overlay with other data points.
Very interesting that impressions are so varied between API and interface but I guess you only really care about keywords that drive clicks as a lot of queries generating impressions might not be relevant.
October 18, 2017 - 6:51 pm
Hey Martijn, cheers for the awesome post – I’ve been trying to replicate this for some of my own clients, and I’ve been checking out your Ruby script on bulk adding subfolders to GSC. However it keeps giving me a 401 “authError”, “Invalid Credentials” response.
What gives? I’ve added in the API Key + Client Secrets/Access Token but still no luck. Can you provide any advice on where I’m going wrong?
Thanks again for the sweet blog posts!