Whether and how an author can offer versions of a page in different languages through the language negotiation mechanism depends on the server and its settings. Here we discuss only the methods available in one widely used server, Apache, and mainly just one of its two alternative methods. As regards other servers, see WebServer Directory by WebServer Compare for links to the original documentation of different server software.
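The Apache documentation contains the section Content Negotiation, which describes two basic methods:
- Multiviews, where the server searches the directory for files whose names begin with the requested name and chooses among them
- type maps (typically .var files), which explicitly list the alternative files and their properties
For a slightly more detailed description of these methods, see the CERN page Language Negotiation.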
If Multiviews is enabled on Apache for the directory (for example with Options +MultiViews; note that Options All does not include it), then you can use language negotiation in the following, though somewhat limited, manner for a directory:
- Put the following lines into the .htaccess file in that directory:
    AddLanguage en .en
    AddLanguage fi .fi
    AddLanguage fr .fr
- Name the language versions with the corresponding extensions, e.g. foo.txt.en for the English version of foo.txt and foo.txt.fr for the French version. (You don't create a file named foo.txt, but a URL ending that way will work.)
Note that language negotiation works well for plain text files too;
the negotiation does not depend on the data format of the file.
- You can then use http://jkorpela.fi/multi/foo.txt as a generic URL that works via language negotiation. The specific language versions, like http://jkorpela.fi/multi/foo.txt.fr, can be used too whenever desired.
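The effect of the steps above can be illustrated with a small Python sketch. It is not Apache's actual algorithm, which weighs more factors; it assumes a simplified Accept-Language parser and matches only the full language tag or its primary part. The function names parse_accept_language and choose_variant are made up for this illustration.

```python
def parse_accept_language(header):
    """Parse an Accept-Language header into (language, q) pairs, best first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        parts = item.split(";")
        lang = parts[0].strip().lower()
        q = 1.0  # a language without an explicit q value has quality 1
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs.append((lang, q))
    # sorted() is stable, so languages with equal q keep their header order
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose_variant(base_name, available_languages, header):
    """Return e.g. 'foo.txt.fr', or None if no listed language is available."""
    for lang, q in parse_accept_language(header):
        if q <= 0:
            continue
        primary = lang.split("-")[0]
        # Simplification: try the full tag first, then its primary part
        for candidate in (lang, primary):
            if candidate in available_languages:
                return f"{base_name}.{candidate}"
    return None


languages = {"en", "fi", "fr"}  # matches the AddLanguage lines above
print(choose_variant("foo.txt", languages, "fr, en;q=0.8"))
print(choose_variant("foo.txt", languages, "de, fi;q=0.5, en;q=0.7"))
```

In this sketch a request that lists only unavailable languages yields None, which corresponds roughly to the case where the server has nothing acceptable to send.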
A simple example of applying the other method, the type-map method:
- There is a document in Finnish, http://jkorpela.fi/rfct.html, and a version of it in English, http://jkorpela.fi/rfcs.html
- There is a file named .htaccess and containing the line
    AddHandler type-map var
  which tells the server to handle files with names ending with .var in a special way. (This might be a system-default actually.)
- There is a file named rfc.var and with the following content:
    URI: rfcs.html
    Content-Type: text/html
    Content-Language: en

    URI: rfct.html
    Content-Type: text/html
    Content-Language: fi
This causes the URL http://jkorpela.fi/rfc.var to become operational, so that the server will respond by sending a Finnish version or an English version, according to the language preference settings in the user's browser.
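To make the structure of a type-map file concrete, here is a minimal Python sketch that reads such a file into a list of alternatives. It assumes the format described in the Apache documentation (header-like lines, alternatives separated by blank lines, lines starting with # treated as comments) and ignores finer points such as quality parameters. The function name parse_type_map is made up for this illustration.

```python
def parse_type_map(text):
    """Split type-map text into a list of dicts, one per alternative."""
    variants = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the description of one alternative
            if current:
                variants.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue  # comment line
        name, sep, value = line.partition(":")
        if sep:
            current[name.strip().lower()] = value.strip()
    if current:
        variants.append(current)
    return variants


# The content of rfc.var from the example above
RFC_VAR = """\
URI: rfcs.html
Content-Type: text/html
Content-Language: en

URI: rfct.html
Content-Type: text/html
Content-Language: fi
"""

for variant in parse_type_map(RFC_VAR):
    print(variant["content-language"], variant["uri"])
```

Header names are lowercased here only to make lookups simpler in the sketch.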
If a browser sends language preferences under which none of the versions is acceptable, Apache responds with the HTTP error code 406 Not Acceptable. This in itself can be confusing, but the terseness of the response causes further problems: the text that accompanies the error message contains links to the alternative versions, but each link shows only the relative URL and the language, given as a two-letter language code.
The situation can be improved to some extent by adding, into the .var file, after each alternative (below each Content-Language line), a line with the keyword Description: and a description of the alternative, e.g. the name of the page in its own language. For example, for the English version of the main page of this documentation I have written:
Description: Techniques for multilingual Web sites
By adding such descriptions, you can make the server response somewhat more understandable:
Not Acceptable
An appropriate representation of the requested resource /~jkorpela/multi/index.html could not be found on this server.
Available variants:
- index-en.htm Techniques for multilingual Web sites, type text/html, language en
- index-fi.htm Tekniikoita monikielisiä Web-sivustoja varten, type text/html, language fi
- index-de.htm Techniken für mehrsprachige Web-Sites, type text/html, language de
- index-sv.htm Tekniker för mångspråkiga Webbsajter, type text/html, language sv
You might be able to make the situation even better by creating a specific error page for the error code 406 and by applying the ErrorDocument directive to make Apache use that customized error page.
The best option, however, is probably to append a generic alternative to the list: an alternative with no Content-Language specified. The server sends such an alternative in response to a request that cannot be satisfied by any other alternative. That alternative could be a page that explains the available alternatives in English, with their names in their own languages. The page could additionally, for the general benefit of the user, advise the user on setting the browser's language preferences, at least by adding English there.
An example of such an alternative: my "generic" 404 error message page. That example is somewhat special, since there are specific pragmatic requirements on error page contents.
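The difference between a 406 response and a generic alternative can be sketched in Python as follows. This is a simplified model of the behaviour described above, not Apache's real selection logic; the function name select_variant and the file name index-any.htm are hypothetical, and the accepted languages are assumed to be already sorted by preference.

```python
def select_variant(variants, accepted_languages):
    """Return (status, uri) for a list of alternatives and preferred languages."""
    if not accepted_languages and variants:
        # No preference expressed: this sketch simply picks the first alternative
        return 200, variants[0]["uri"]

    fallback = None
    by_language = {}
    for variant in variants:
        language = variant.get("content-language")
        if language is None:
            # An alternative with no Content-Language acts as the generic one
            if fallback is None:
                fallback = variant
        else:
            by_language.setdefault(language.lower(), variant)

    for language in accepted_languages:
        variant = by_language.get(language.lower())
        if variant is not None:
            return 200, variant["uri"]

    if fallback is not None:
        return 200, fallback["uri"]
    return 406, None  # Not Acceptable


variants = [
    {"uri": "index-en.htm", "content-language": "en"},
    {"uri": "index-fi.htm", "content-language": "fi"},
]
print(select_variant(variants, ["de"]))

# Hypothetical generic alternative: no content-language key at all
variants.append({"uri": "index-any.htm"})
print(select_variant(variants, ["de"]))
```

With only the English and Finnish alternatives, a browser that accepts only German gets status 406 in this model; once the generic alternative is appended, the same request gets that page instead.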
Language negotiation for this documentation, Techniques for multilingual Web sites, has been implemented using the type-map method. (The server that was originally used by the author had been configured not to allow the use of Multiviews.) In detail, the method has been used as follows:
- The directory contains a file named .htaccess with the content
    AddHandler type-map html
  This means that for files with names ending with .html the server applies the language selection mechanism.
- For each URL ending with .html and intended to refer to pages in this directory, e.g. http://jkorpela.fi/multi/index.html, a file with the corresponding name is created, with content like the following (which makes it a type-map file):
    URI: index-en.htm
    Content-Type: text/html
    Content-Language: en

    URI: index-fi.htm
    Content-Type: text/html
    Content-Language: fi
  In this example, there are two language alternatives, English and Finnish, and the versions have been named index-en.htm and index-fi.htm. In this method, the names need not be systematically formed; you could use for example index.htm and hakemisto.htm if you like. (The names must not end with .html in this case though, since such files have been declared to be treated as type-map files now!)
With the server's settings (which in this case are the default settings of Apache), when a browser sends a request for http://jkorpela.fi/multi/ (with a trailing slash), the server first expands it to the URL http://jkorpela.fi/multi/index.html and then begins language selection. Note that the "URL" or "Location" box in the browser displays the original URL, since the expansion was made by the server without informing the browser; the browser has just received the content of the document that the server selected. This does not mean that URLs like http://jkorpela.fi/multi/index-fi.htm stop working; they just refer to the specific alternatives in a fixed way, bypassing the language selection mechanism.
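One way to check this behaviour is to request the same URL with different Accept-Language headers and compare the responses. The following Python sketch uses only the standard library; the helper names build_request and fetch_content_language are made up for this illustration, and it assumes that the server reports the chosen language in a Content-Language response header, which it may not always do.

```python
import urllib.request


def build_request(url, languages):
    """Build a request whose Accept-Language header lists the given languages."""
    header = ", ".join(languages)
    return urllib.request.Request(url, headers={"Accept-Language": header})


def fetch_content_language(url, languages, timeout=10):
    """Send the request and return the Content-Language header, if present.

    urlopen raises urllib.error.HTTPError for error responses such as 406.
    """
    request = build_request(url, languages)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Language")


# Building the request does not touch the network
request = build_request("http://jkorpela.fi/multi/", ["fi", "en;q=0.5"])
print(request.get_header("Accept-language"))  # urllib stores header names capitalized
```

Calling fetch_content_language on the generic URL with different language lists, and then on a specific URL such as the index-fi.htm one, would show whether the specific URL bypasses the selection.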
The Apache documentation uses the expression ".var file", but this does not mean that the file names must end with .var. The approach described above, using .html instead of .var, is a bit esoteric, but handy.
Note that this approach cannot be applied (well, cannot be conveniently applied) if the directory already contains normal HTML documents in files ending with .html and their URLs should keep working. The reason is that the server applies the type-map method to all files with names ending with the string specified in the AddHandler directive; thus all such files must be type-map files. In particular, the language-specific file names must not end with .html in our example; but .htm will do just fine.
Next section: Language selection in browsers.
2003-09-08 Jukka K. Korpela