A common question for any website with a global, multilingual audience is what the best structure might be for international SEO. Personally, I find all aspects of international SEO fascinating, since the cultural nuances involved make it a constant learning exercise. Qualitative aspects aside, there are very serious technical decisions to be made, and architecture is one of them.
When your website targets only one market, or at least only one language, your architecture and folder decisions are very simple. You will have a set of directories that all live under the root domain and ideally link to each other. Whether a user lives in California or New York, or even in Sydney, they will experience the website in exactly the same way.
From a Google perspective, provided there is no local filter applied, your visibility should be similar in any location. For the queries where you are the most relevant result, there is only one page to serve in each of these locations, so there is no intra-site conflict.
Expanding to new markets
The complexity begins when you add other languages. For example, you might want to expand into Spanish for the US market or French for the Canadian market. You will need to determine which pages will be available for these new audiences and where they will live on your website. If you put them in a language folder, for example /fr/ for French, does that mean you need to put English in /en/, or do you just have a French folder?
Now, to make it even more complex, what if you want to serve one set of French content to the Canadian market and another to the market in France? For those who are unaware, yes, the differences between the two markets can be fairly substantial. (Think about how different UK English is from US English. Yes, you can understand it, but it might not be preferable if you are trying to sell something.) Does this mean you now have to create a language folder with a country subfolder, /fr/ca/ and /fr/fr/, or do you do the reverse and have a country folder with a language subfolder, /ca/fr/ and still /fr/fr/?
This question might seem quite binary with regard to Canada and France, but we just created a bigger problem in the US. Do you now need a /us/en/ directory because you are serving a Spanish audience in the US? Or do you have an English directory, /en/, where the default content lives, with Spanish living in just /es/? This decision opens up a new can of worms, since you now need to decide what to do if you want to target Mexico.
If you have a default Spanish folder of /es/, you are likely going to have to split that folder up by Spanish-speaking country too, since Spanish is spoken differently in Spain and Latin America, and it differs considerably between Latin American countries as well. You will definitely want content for Argentina that is not the same as what you have for Mexico.
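To make the two layouts concrete, here is a minimal Python sketch that builds the directory prefix for a locale under either ordering. The function name locale_path and its order parameter are hypothetical, chosen only for illustration, and the sketch does not favor one ordering over the other.

```python
from typing import Optional


def locale_path(language: str, country: Optional[str] = None,
                order: str = "language-first") -> str:
    """Return a directory prefix such as /fr/ca/ or /ca/fr/ for a locale."""
    language = language.lower()
    if country is None:
        # Language-only folder, e.g. /es/ for default Spanish content.
        return f"/{language}/"
    country = country.lower()
    if order == "language-first":
        return f"/{language}/{country}/"
    if order == "country-first":
        return f"/{country}/{language}/"
    raise ValueError(f"unknown order: {order!r}")


# The same three locales under each layout.
locales = [("fr", "ca"), ("fr", "fr"), ("es", "mx")]
language_first = [locale_path(lang, ctry) for lang, ctry in locales]
country_first = [locale_path(lang, ctry, order="country-first")
                 for lang, ctry in locales]
```

Note that French for France lands on /fr/fr/ in both layouts, so the URLs alone will not tell you which convention a site follows. The ordering has to be decided and documented up front.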
Future growth planning
These are very important considerations as you start to build out your international strategy, and you want to make decisions that take your future growth plans into account. It would be quite unfortunate to assume that you will only ever target North America and then decide a couple of years later that South America is now a target market. Not only would you have to build for the multiple languages and dialects in South America, you would also upset the apple cart on your existing North American infrastructure.
While these are globalization decisions, we haven’t even touched on SEO yet. Google is great at parsing intent and strings in English, but in other languages it leaves a bit to be desired. There are a few reasons for this, but one of the biggest, at least according to Google’s Search Off the Record podcast, is that there is just not enough content in some languages for its algorithms to learn from properly.
Google has no prophetic ability to know which dialect a user might want, so if you allow it to index every country and language variation, you leave yourself exposed to users landing in the wrong place.
To try to mitigate this issue, Google introduced a way to inform it of language and country targeting called hreflang (which I will go into in a future post), but in my experience it is hardly foolproof. Hreflang annotations are easy to mess up, and even if you get them precisely right, they are only a suggestion to Google. Google is free to disregard your suggestion and rank your Mexican page in Argentina if it so chooses.
This can also happen in English, where a UK page is visible in Canada even though the hreflang points to a Canadian page. I have seen many instances where hreflang either did not solve an international visibility issue or even made it worse. My general preference with things like hreflang, and even canonicals, is not to rely on them when the consequences of their being ignored are high. I would much rather use absolute solutions, like blocking pages from being indexed or never creating the page at all.
So, with this in mind, my recommendation is not to index all sorts of directories that are full of near-duplicate content. (Keep in mind that while Canadian French might differ from French French, the two are mostly similar, which means all of that content is at risk of being marked as duplicate.) If you do need to create these pages, my recommendation would be to noindex them and make them available only to users who navigate to them on the site.
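For orientation, hreflang annotations are commonly written as link elements in a page's head, one per language or country variant. The Python sketch below renders such a set for the UK and Canada case above. The ALTERNATES map, the example.com URLs, and the hreflang_links helper are all placeholders of mine, not anything from a real site.

```python
from html import escape

# Hypothetical locale-to-URL map; example.com is a placeholder domain.
ALTERNATES = {
    "en-gb": "https://example.com/en/gb/",
    "en-ca": "https://example.com/en/ca/",
    "x-default": "https://example.com/en/",
}


def hreflang_links(alternates: dict) -> list:
    """Render one <link rel="alternate"> element per locale, sorted by code."""
    return [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in sorted(alternates.items())
    ]


tags = hreflang_links(ALTERNATES)
```

Each page in the set is expected to carry the full list, including an entry for itself, which is one reason these annotations are so easy to get wrong.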
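As a rough sketch of the noindex approach, the Python below decides whether a page should carry a robots meta tag based on its path. The list of regional folders and the robots_meta helper are assumptions for illustration; your own folder names will differ.

```python
# Hypothetical regional folders that duplicate the hybrid language pages.
REGIONAL_PREFIXES = ("/fr/ca/", "/fr/fr/", "/es/mx/", "/es/ar/")


def robots_meta(path: str) -> str:
    """Return a robots meta tag for the path, or "" to leave it indexable."""
    if path.startswith(REGIONAL_PREFIXES):
        # Near-duplicate regional variant: keep it out of the index.
        return '<meta name="robots" content="noindex">'
    # Hybrid language page: no tag, so default indexing applies.
    return ""
```

With these assumed folders, /fr/ca/pricing/ gets the noindex tag while /fr/pricing/ stays indexable. The regional pages still need to be crawlable for the tag to be seen.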
To summarize, I would recommend indexing one hybrid page for each major language that is popular in multiple locations (English, Spanish, Arabic, German, Portuguese, and French). If local pages matter, redirect users to the correct page for their location via a geo-detection script. If local pages don’t really matter and only the currency and/or contact info differs, change just those elements with a geo-detection script.
This advice would only change when there’s a substantial difference in how users in these countries search for the primary keywords. If the primary keywords are the same, and only the sentence phrasing and tail keywords change, minimize the headache and just have one page per language.
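The geo-detection choice described above can be sketched roughly as follows in Python. Everything named here is hypothetical: the LOCAL_PAGES and CURRENCY_BY_COUNTRY tables, the Decision type, and route_visitor. The sketch also assumes the visitor's country code has already been detected somewhere upstream, which is not shown.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical data: local pages that exist, and per-country currency.
LOCAL_PAGES = {("fr", "ca"): "/fr/ca/", ("fr", "fr"): "/fr/fr/"}
CURRENCY_BY_COUNTRY = {"ca": "CAD", "fr": "EUR", "us": "USD"}


@dataclass
class Decision:
    path: str                       # page to serve or redirect to
    redirect: bool                  # True when sent to a local page
    currency: Optional[str] = None  # element swapped in on the hybrid page


def route_visitor(language: str, country: str,
                  local_pages_matter: bool) -> Decision:
    """Pick a local-page redirect or the hybrid page with swapped elements."""
    language, country = language.lower(), country.lower()
    if local_pages_matter and (language, country) in LOCAL_PAGES:
        # Local pages matter and one exists: redirect the visitor to it.
        return Decision(path=LOCAL_PAGES[(language, country)], redirect=True)
    # Otherwise serve the indexed hybrid page and only swap local details.
    return Decision(path=f"/{language}/", redirect=False,
                    currency=CURRENCY_BY_COUNTRY.get(country))
```

A French-speaking visitor from Canada is redirected to /fr/ca/ when local pages matter. Otherwise they stay on /fr/ and only the currency changes.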
A couple of end notes:
I didn’t weigh in on how the directory structures should be set up for the user when SEO is excluded because I think this is going to be an individual decision.
Under no circumstances should you consider creating new domains for each country.
This post is a brief summary of the decisions you will need to make. Please reach out if you have questions!