i18n and pathauto
The internationalization module (commonly known as the i18n module) and the pathauto module are both great, but combining them creates a minor gotcha.
If you're the type to run statistics on your site, or otherwise look into how people are using it, and you have a fairly standard setup of i18n and pathauto, you'll discover that each node gets a path for each language you're using, regardless of what language the node is really in.
First off, it's important to realise that while 'da/some-english-title' just seems wrong, it's really working as designed. The language code in the path isn't the language code of the content, but the requested language.
Why do we need the requested language in the path? It becomes quite clear when looking at pages that actually use the requested language, like the default node listing provided by the node module. The node listing takes the requested language into account, and if so configured, only shows nodes in the requested language.
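To make the distinction concrete, here is a minimal sketch in Python (not Drupal's actual code) of how a path could be split into a requested language and an internal path, and how a listing might then filter on it. The function names, the dictionary shape of a node, and the site language list are all assumptions made for illustration.

```python
# Illustrative sketch only; names and data shapes are hypothetical,
# not taken from Drupal or the i18n module.

SITE_LANGUAGES = ("en", "da")   # assumed set of enabled languages
DEFAULT_LANGUAGE = "en"         # assumed site default


def split_language_prefix(path, site_languages=SITE_LANGUAGES,
                          default_language=DEFAULT_LANGUAGE):
    """Return (requested_language, remaining_path) for a request path."""
    first, _, rest = path.partition("/")
    if first in site_languages:
        return first, rest
    # No recognised prefix: fall back to the default language.
    return default_language, path


def node_listing(nodes, requested_language):
    """Keep only nodes whose own language matches the requested one."""
    return [n for n in nodes if n["language"] == requested_language]


nodes = [
    {"title": "Some English title", "language": "en"},
    {"title": "En dansk titel", "language": "da"},
]

requested, internal_path = split_language_prefix("da/node")
listing = node_listing(nodes, requested)
```

Note that the prefix says nothing about the language of any single node: 'da/node' only means the visitor asked for Danish, and the listing decides what to do with that.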
That in itself doesn't require the use of language prefixes, as the browser already sends the user's language preferences, but they do become necessary when we factor search engines into the equation.
If there's no language code in the path, search engines only index the default language, as there's really no way for them to know how many and which languages a given page is available in. (The server could return a 300 "Multiple Choices" response with a document listing the options, but that also requires each language version to have its own URI, which brings us back to something like language prefixes.) So the only way to get a search engine to index multiple language versions of a page is to give it a different URI for each language.
And what goes for overview pages like 'node' really goes for individual node pages too. While the language of the node is pretty distinct, there's no way for Drupal to know whether that's the case for the rest of the page too. You could have (and probably have, if you're using i18n) blocks that change depending on the language, and if you don't ensure that the user visiting your site gets the same language as the search engine that sent them there, you could have a situation where part of what they searched for isn't on the page anymore.
So i18n's default way of doing things makes a lot of sense.
But even with that in mind, you (like me) might come to the conclusion that you don't want each node 'duplicated' under each language prefix, and would rather have it exist solely under the right language. That's quite possible too:
The i18n module is smart enough to ignore aliases that already contain a language prefix, so the solution is to alter your pathauto patterns to include the prefix. Pathauto used to come with support for the language codes, but the code has been removed by the maintainer as he doesn't use the i18n module and can't guarantee that it works. As it really belongs in the i18n module anyway, that makes sense, but at the time of writing, it hasn't made its way into the latest stable release of i18n yet.
Someone has made a patch though, and it's easy to transform that into a module. I've done that and attached the file, with one minor change: when no language is set for the node/term, it will not insert a prefix. This allows for language neutral content to still be available under all language prefixes.
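The logic is simple enough to sketch. The following Python is not the attached module (which is a Drupal module), just a rough model of the behaviour described above, with made-up function names and an assumed list of site languages: the alias gets a prefix only when the content has a language, and an alias that already starts with a language code is left alone rather than being exposed under every prefix.

```python
# Rough model of the behaviour described above; not the attached module.
# Function names and the language list are hypothetical.

SITE_LANGUAGES = ("en", "da")   # assumed set of enabled languages


def alias_for(slug, language):
    """Build an alias, prefixing it only when the content has a language."""
    if language:
        return f"{language}/{slug}"
    # Language-neutral content: no prefix, so it stays reachable everywhere.
    return slug


def reachable_paths(alias, site_languages=SITE_LANGUAGES):
    """List the paths an alias ends up under, given the prefix rule."""
    first = alias.split("/", 1)[0]
    if first in site_languages:
        # Alias already carries a language prefix: it is used as-is.
        return [alias]
    # Otherwise the alias is available under each language prefix.
    return [f"{lang}/{alias}" for lang in site_languages]


english_alias = alias_for("some-english-title", "en")
neutral_alias = alias_for("about", "")
```

With these assumptions, the English node ends up only at 'en/some-english-title', while the language-neutral 'about' page is still reachable as both 'en/about' and 'da/about'.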
| Attachment | Size |
|---|---|
| i18npathauto-5.x-0.1.tar.gz | 601 bytes |