A Web author may wish to "customize" the error messages that are sent to the browser (or other client) when a requested resource is not found. Normally the server in such cases just sends a general HTTP error code 404 (conventionally called and displayed as "Not Found"), and the browser then takes a general action that it applies in all such cases, so nothing is site-specific. An author may have a reason to make something site-specific happen instead.
Perhaps there have been lots of URLs that referred to the site but have now become non-functional due to a site rearrangement. That would be a gross mistake; cool URLs don't change. If you've made such a mistake, you should probably check what other mistakes you've made; see the alertboxes The Top Ten Mistakes of Web Design and The Top Ten New Mistakes of Web Design to avoid the worst problems in future. But let's assume that the mistake has been made, and it's impossible to use redirection or other mechanisms to fix things nicely. Or let's assume that you have some other reason, such as directing people using wrong URLs to a search page of yours so that they might find what they are looking for. Or you might have read Nielsen's alertbox Improving the Dreaded 404 Error Message, which presents some good arguments in favor of doing something about the problem of poor default error messages.
There's little point in creating an error page that is less informative than the default error page. In fact, your error page should be better than or at least as good as the default error page, for all users in all situations.
In particular, include an explanation in English, even if you also have an explanation in some other language. You might think that if all your pages are in, say, Estonian, only people who can read Estonian will try to access them. But on the Internet, virtually anything can happen. For example, someone might follow a casual link here or just mistype an address he read in a newspaper, turning the real address into something that refers to your server. If you had no customized error page, the user would probably see a default error page (sent by your server or shown by his browser) in English, or in some language he knows. So if you prevent that, make sure that there's something in English too on your error page.
However, putting long explanations in two or more languages into one page makes it rather big, and potentially confusing. Consider using language negotiation, so that each user gets a monolingual page, in his preferred language, if his browser settings correspond to his preferences. More information: Techniques for multilingual Web sites.
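To make the idea of language negotiation concrete, here is a minimal sketch in Python of how a monolingual error page could be chosen from the browser's Accept-Language header. The function names and the file names notfound.en.html and notfound.et.html are hypothetical, chosen only for illustration. In practice the server software usually does this for you (Apache, for example, has built-in content negotiation), so this only illustrates the principle.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    prefs = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0].lower()
        if not tag:
            continue
        q = 1.0  # a tag without an explicit q value has full weight
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        if q > 0:
            # Sort by descending weight, then by original position.
            prefs.append((-q, position, tag))
    return [tag for _, _, tag in sorted(prefs)]


def choose_error_page(header, available, default="en"):
    """Pick the best available variant, falling back to the English one."""
    for tag in parse_accept_language(header):
        if tag == "*":
            break
        if tag in available:
            return available[tag]
        primary = tag.split("-")[0]  # e.g. "en-gb" -> "en"
        if primary in available:
            return available[primary]
    return available[default]


# Hypothetical file names for the two language versions of the error page.
VARIANTS = {"en": "notfound.en.html", "et": "notfound.et.html"}
```

With these assumptions, choose_error_page("et, en;q=0.5", VARIANTS) selects the Estonian page, whereas a header that lists only languages you have no version for, such as "fi, sv;q=0.8", falls back to the English one.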
Make it very evident that an error page is an error page, not an odd-looking content page. The very first words should say that explicitly. Note that e.g. blind people will experience the page sequentially, so the sooner they hear that there's an error situation, the better. If you include something funny, put it after the simple explanation. It is not necessary to repeat the usual jargon "404 Not Found", if you can make things clear by other means, but it hardly hurts to use that expression, since so many people are familiar with it.
Nielsen's alertbox about error messages in general, Error Message Guidelines, is very useful when designing error pages, too.
At the general protocol level, the idea behind 404 customization is that when a server sends a 404 Not Found error code, it may, and indeed should, also send a document (normally, an HTML document) which explains the situation: an error document. Browsers and other user agents are not expected to treat that document as corresponding to the URL used but as explaining why there is no document corresponding to it. Typically a server sends by default a very generic error document like the following:
Not Found
The requested URL /foo was not found on this server.
Apache/1.3.9 Server at www.hut.fi Port 80
But server software and its settings may let an author affect what is sent, for example so that an author-specified error document is sent for all URLs that refer to his directory. (In principle, we should refer to URLs with a specific prefix, not directories; but typically servers map the path part of a URL to a path name in a file system.)
To see such a setting in action, try using a URL like
http://jkorpela.fi/asdfg
which does not refer to any resource. The server responds by sending an error code and an error document to the browser.
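The protocol-level behavior described above (a 404 status code accompanied by an explanatory document) can be illustrated with a toy server written with Python's standard library. This is only a sketch under my own assumptions, not how Apache or any real server is implemented; the names PAGES, ERROR_DOCUMENT, fetch and demo are made up for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content: one real page and one custom error document.
PAGES = {"/": b"<!DOCTYPE html><title>Home</title><p>Welcome.</p>"}
ERROR_DOCUMENT = (
    b"<!DOCTYPE html><title>Error: page not found</title>"
    b"<h1>Error: page not found</h1>"
    b"<p>The address you used does not refer to any page on this site.</p>"
)


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = 200, PAGES.get(self.path)
        if body is None:
            # The status stays 404; only the accompanying document is custom.
            status, body = 404, ERROR_DOCUMENT
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(host, port, path):
    """Return (status code, body) for a GET request."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


def demo():
    """Start the toy server, request a good and a bad URL, and stop it."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return fetch(host, port, "/"), fetch(host, port, "/asdfg")
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns the two (status, body) pairs. The point to notice is that the second response carries both the 404 code and the custom document, so a browser can show the explanation while programs still see that the URL does not work.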
In Microsoft Internet Explorer 5, there is a feature which causes a customized error message to be suppressed in favor of the browser's default message, if the customized error document is "too short". It has been reported that the limit appears to be 512 bytes. Although a value of 1024 bytes has also been mentioned, a test on IE 7 shows that "too short" means "512 bytes or less". The limit might be changeable by the user via registry settings, and the entire feature can be disabled, but it is unrealistic to expect most users to know such things. Thus, it is advisable to make sure that your error document is at least 513 bytes long (for plain ASCII HTML source, this equals the number of characters).
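A quick way to check the length is to count the bytes of the encoded HTML source. The following Python sketch assumes the 512-byte threshold reported above; the function names are hypothetical, and padding with an HTML comment is just one simple way to get past the limit (adding more useful explanatory text is a better one).

```python
IE_SHORT_ERROR_LIMIT = 512  # bytes; the threshold reported in the text above


def is_long_enough(document, limit=IE_SHORT_ERROR_LIMIT):
    """True if the encoded error document is longer than the limit."""
    return len(document) > limit


def pad_with_comment(document, limit=IE_SHORT_ERROR_LIMIT):
    """Append an HTML comment so that the document exceeds the limit."""
    missing = limit + 1 - len(document)
    if missing <= 0:
        return document  # already long enough, leave it alone
    opener, closer = b"\n<!-- ", b" -->"
    filler = b"x" * max(missing - len(opener) - len(closer), 0)
    return document + opener + filler + closer


short_page = b"<!DOCTYPE html><title>Error</title><p>Error: page not found.</p>"
padded_page = pad_with_comment(short_page)
```

Note that the functions take bytes, not text: a page containing non-ASCII characters has more bytes than characters when encoded as UTF-8, so count after encoding.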
Whether and how "customized error messages" are possible depends on the server software and its settings. It's a server issue, and HTML markup is not involved (except that the customized error document is usually an HTML document, of course).
After finding out what software your server runs (you might use e.g. Delorie's HTTP Header Viewer for the purpose; give it some URL referring to something on your server), see the list of links to documentation of different servers by WebServer Compare to find documentation of server software, and contact your local server administration (webmaster@server) to ask about applicable settings if needed.
In the Apache server software, which is rather widely used (and often imitated by other software), you can use the ErrorDocument directive. This means the following:
Put your error document, named for example notfound.html, into the directory where your Web documents are.
Create, in that directory, a file named .htaccess (note the leading period) and put the following into it:
ErrorDocument 404 /~jkorpela/notfound.html
Replace /~jkorpela/notfound.html by the address of your own customized error document. The address consists of whatever follows the server name (like www.cs.tut.fi) in the full URL of the error document when referred to directly (in this case, http://jkorpela.fi/notfound.html). Note that the address to be used is not relative to your own directory (a plain notfound.html wouldn't do) but to the server root.
Technically you could also use a full absolute URL like (in my case) http://jkorpela.fi/notfound.html but don't use absolute URLs here, since they make Apache send a wrong return code, namely 302 (Found), together with a redirection to the address specified in the ErrorDocument directive. This is all wrong, since it would indicate that the original URL works and the requested resource exists but temporarily resides under a different URL! This would cause quite a lot of confusion among users, search engines, etc. (The information related to 404 issues on MSN TV (previously WebTV) pages even recommends such a method; a previous version of WebTV's page about it called it "gentle deceit", but it is far from gentle!)
For example, suppose you have created a Web page and you wish to use a link checker (such as the W3C Link Checker) to verify that all of your links work in some technical sense at least. If you have actually mistyped a URL in a link, the link checker will note a status code of 404 as an error, so that you can fix the problem. But if the status code is 302, there is no error to be reported; the checker could at most issue an informative message about redirection.
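The difference can be sketched with a very small link checker in Python. This is not how the W3C Link Checker is implemented; it is only an illustration of the reasoning, and the names classify and check_link are made up. The checker sends one request, does not follow redirects, and reports what the status code says.

```python
import http.client
from urllib.parse import urlsplit


def classify(status):
    """Map an HTTP status code to a simple link checker verdict."""
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "redirect"  # at most an informative note, not an error
    if status == 404:
        return "broken"  # the case the author needs to hear about
    return "other problem"


def check_link(url, timeout=10):
    """Request url once, without following redirects, and classify the reply."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection_class = http.client.HTTPSConnection
    else:
        connection_class = http.client.HTTPConnection
    conn = connection_class(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)  # HEAD: the status is enough here
        status = conn.getresponse().status
    finally:
        conn.close()
    return status, classify(status)
```

With a correctly configured server, a mistyped URL gives classify(404), that is "broken". If the error document is specified with an absolute URL, the same mistyped URL gives classify(302), that is "redirect", and the broken link goes unnoticed.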
If you create your Web documents e.g. on a PC running Windows and separately upload them onto a server, you may find it difficult to create a file named .htaccess on Windows. Well, you could name it, say, access.txt and rename it when uploading; for example, typical FTP programs let you specify a different name for the destination (e.g. put access.txt .htaccess would work in a simple command-based FTP program like DOS FTP).
The customization applies to subdirectories too. On the other hand, you can create per-subdirectory customization, overriding the customization for the parent directory. For example, I have made such customization for my test directory due to its special nature, so an incorrect URL like http://jkorpela.fi/test/asdfg will cause a different error document to be sent.
Other error conditions can be handled in similar or analogous ways. As regards Apache, the documentation of the ErrorDocument directive contains a few examples like
ErrorDocument 401 /subscription_info.html
which might be a good idea for a paid-access directory: if you don't give a correct username/password combination, you'll see a customized error message page which gives you subscription information and hopefully also information that tells you what you should do if you are a paying customer who has forgotten the password. The following URL is for a trivial demo:
It depends on the browser whether and how it prompts for a username and password anew automatically when it has got the 401 response from the server, instead of displaying the message, but the user hopefully knows how to abort that process and make the browser show the message.
A few ideas:
Generally, when considering customization of error messages, read carefully the HTTP status code definitions. You might expect that some error occurs in certain conditions but it might arise otherwise too, and a customized message could then be misleading, or plain wrong.