Don’t think of robots.txt only as a file for blocking search engines from crawling pages on your site. There are creative uses for this file if you allow yourself to think outside the box.
One of the very first technical SEO fires I ever had to extinguish was one started by a robots.txt file. The startup where I was working launched a brand-new redesigned website, and we were excited to see the results. However, instead of going up, traffic inexplicably went down. Anyone who has ever had the misfortune of screwing up a robots file (spoiler alert: that’s what happened) will know that a robots noindex directive is not an instant kiss of search death. It’s more like a slow, drawn-out slide, especially on a large site.
Since I had never seen anything like this before, I was frantically checking everything else that might cause this slow dive. While I turned over every rock to find the issue, I chalked it up to bad engineering, an algo update, or a penalty. Finally, I found a clue. A URL where I thought a ranking had slipped was actually no longer indexed. Then I found another URL like that and another one.
Only then did I check the robots.txt and discover that it was set to noindex the entire site. When the brand-new site was pushed live, everything moved from the staging site to production, including the robots file, which had been set to noindex the staging site. The next time Googlebot revisited the homepage, it fetched the robots.txt and saw the noindex directive. According to the log data, Googlebot continued to fetch pages, but the rate started declining quickly. (For thoughts around why this is, I will be doing another blogpost on this topic.)
We fixed the robots file as fast as we could and then waited for Google to love us again. If you are ever in this situation, this is your warning that this is not an instant fix. Unfortunately, it took longer to recover our positions than it did to lose them, and I have seen this be the case every time I have worked on an issue like this over the last decade. It took about five days to lose all of our rankings and weeks to recover them.
Ever since I dealt with this issue, I have had a very healthy respect for robots files. Before reaching for one, I carefully consider whether the issue can be fixed another way, and I only add folders to a robots file if I really never want them to be indexed.
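If you want a guard against the staging-file mistake from the story above, here is a minimal sketch of a pre-launch check using Python’s standard-library `urllib.robotparser`. The function name `blocks_whole_site` and the sample rule strings are my own illustrations, not anything from the original incident. The extra scan for a `Noindex: /` line is an assumption on my part, because the standard parser simply ignores that nonstandard line.

```python
from urllib.robotparser import RobotFileParser


def blocks_whole_site(robots_text: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules stop `user_agent` from fetching the homepage."""
    lines = robots_text.splitlines()

    # Standard parsing: is "/" disallowed for this crawler?
    parser = RobotFileParser()
    parser.parse(lines)
    if not parser.can_fetch(user_agent, "/"):
        return True

    # Nonstandard check (assumption): flag a site-wide "Noindex: /" line,
    # which the standard-library parser silently ignores.
    for line in lines:
        rule = line.split("#", 1)[0].strip().lower().replace(" ", "")
        if rule == "noindex:/":
            return True
    return False


# Hypothetical file contents for a staging site and a production site.
staging_rules = "User-agent: *\nDisallow: /\n"
production_rules = "User-agent: *\nDisallow: /admin/\n"

print(blocks_whole_site(staging_rules))
print(blocks_whole_site(production_rules))
```

Something like this could run as part of a release checklist against the file that is about to go live, so a staging robots file never reaches production unnoticed.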
Robots.txt can have other uses
This healthy fear aside, I have also found some great uses for robots files that are completely harmless to your search visibility and could even cause some confusion among your competitors. Note that if you do any of these, you should put a comment character (#) in front of any line that is not a directive so you don’t inadvertently break your robots file. These are just examples of what you can do with a robots file once you start being creative.
- Recruiting: Use robots files to advertise open SEO roles. The only humans who are ever going to look at a robots file are either search engine engineers or search marketers. This would be a great place to grab their attention and encourage them to apply. For an example of this, check out TripAdvisor’s robots file.
- Branding: Showcase something fun about your brand like Airbnb does. They write out their brand name in ASCII art, which reflects their design sense, and make a casual reference to careers at the company.
- Mystery: Anything that is in a robots file as disallowed that appears to be a traditional file (and not a reference to scripts) will inevitably be something that people will want to check out if they find it. If you produce a technical tool, this might be the place where you can offer a discount to only the people that find the link in the robots file with a URL like “secret-discount”. If someone goes to the lengths to explore your robots file, they are most definitely going to navigate to a page that references a secret discount!
- Subterfuge: Any good marketer should always check out their competitors’ robots.txt files to see if the competition is doing anything worth keeping track of. Assume that your competitors are going to check out your robots file and do the same to you. This is where you can be sneaky by dropping in links and folders that don’t exist. Would you love to do a rebrand but can’t afford it? Your competitors don’t need to know that. Put in a disallow for a folder called /rebrand-assets. For added fun you can even put a date on it, so the competition might mark their calendars for an event that never happens. When you start being creative along these lines, the ideas are truly endless. You can make references to events you won’t be having, job descriptions you aren’t releasing, or even products with really cool names that will never exist. Just make sure you block anyone from seeing a real page with either password protection or a 404, so this does remain just a reference to an idea. Also, don’t go overboard into anything immoral or illegal!
- Taxonomy: A robots file really just exists to disallow folders and files, not to allow them, unless your default is disallow (like Facebook’s robots file) and you just want to allow a few pages. An exercise where you sit down with content or engineering to list the folders that should be allowed might be a good way to find out if there are folders that should not exist. Truthfully, the value in this is just the exercise, but you can carry it forward and actually lay out all the allowed folders in the robots file as a way of detailing the taxonomy.
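To show why the comment character matters for ideas like these, here is a small sketch that feeds a made-up robots file to Python’s standard-library `urllib.robotparser`. The message text and the two disallowed paths are hypothetical and simply echo the examples in the list. The plain-text lines start with `#`, so the parser skips them and still applies the real rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the message text and paths are made up for illustration.
ROBOTS_TXT = """\
# Hi there! If you are reading this, you might enjoy working on our SEO team.
# (Every line of plain text starts with "#" so crawlers skip it.)

User-agent: *
Disallow: /secret-discount
Disallow: /rebrand-assets/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The comment lines do not affect the rules: the homepage stays crawlable,
# while the two listed paths are disallowed.
homepage_allowed = parser.can_fetch("Googlebot", "/")
discount_allowed = parser.can_fetch("Googlebot", "/secret-discount")
print(homepage_allowed, discount_allowed)
```

Python’s parser is only a stand-in here. Each search engine runs its own parser, so it is worth confirming the finished file with that engine’s own testing tools before publishing it.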
As you can see, a robots file is not just a boring requirement (and it is a requirement: every site should have one). There is a lot you can do with it if you think of it as another asset or channel on your website. If you have other ideas on how to make robots files useful, I would love to hear them, so please email me!