Learning HTML 3.2 by Examples, section 3 General remarks on the syntax of HTML:

Media types

An Internet media type is, generally speaking, a property of a data set, describing both the general type of data (such as "text" or "image" or "application"; the last one refers to program-specific internal data formats) and, as a subtype, a specific format for the data. The concept was originally defined as "MIME content types".
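
As a small illustration of the type/subtype structure described above, the following Python sketch splits a media type string into its two parts. The function name split_media_type is made up for this example, and the handling of trailing parameters (such as "; charset=...") is an added assumption, not something discussed in this section.

```python
def split_media_type(value):
    """Split a media type such as 'text/html' into (type, subtype).

    Hypothetical helper for illustration only.
    """
    # Assumption: anything after the first semicolon is a parameter
    # (for example a charset) and is not part of the type itself.
    essence = value.split(";", 1)[0].strip().lower()
    main_type, slash, subtype = essence.partition("/")
    if not slash or not main_type or not subtype:
        raise ValueError(f"not a media type: {value!r}")
    return main_type, subtype


for example in ("text/html", "image/gif", "application/msword"):
    general, specific = split_media_type(example)
    print(f"{example}: general type {general!r}, subtype {specific!r}")
```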

Media types relate to HTML as follows:

The HTML 3.2 Reference Specification refers to RFC 1521, but that specification was superseded by RFC 2046 in November 1996. The procedure for registering types is given in RFC 2048. The official registry is kept at http://www.iana.org/assignments/media-types/

In addition to standardized media types, there are media types which are in fact supported by popular servers and browsers. Appendix B of Special Edition Using CGI (by QUE) lists many of them. For an online list, see Multipart Internet Mail Extensions (MIME) in The HTML Sourcebook, 3Ed, by Ian S. Graham.

You can check what media type information a server sends by using Delorie's HTTP Header Viewer.
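
The same check can also be done with a short script. The sketch below uses Python's standard urllib.request module to send a HEAD request and read the Content-Type header of the response. The function names are invented for this example, and it assumes that the server answers HEAD requests; some servers do not, in which case a normal GET request would be needed.

```python
import urllib.request


def media_type_from_header(header_value):
    """Return the bare media type from a Content-Type header value.

    Hypothetical helper: strips parameters such as '; charset=...'.
    """
    return header_value.split(";", 1)[0].strip().lower()


def fetch_content_type(url):
    """Send a HEAD request and return the raw Content-Type header, or None.

    Assumption: the server supports the HEAD method.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Type")
```

For example, fetch_content_type("http://www.iana.org/assignments/media-types/") returns whatever header value that server sends at the moment, and media_type_from_header reduces it to the type/subtype pair.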

An additional complication is that Internet Explorer does not follow the protocols in this area. It often ignores the media type announced in the Content-Type header and instead uses the last few characters of the URL to decide how to handle the data. (IE may also apply some "heuristics" based on the actual content of the data!) This means that, in addition to making sure that the server sends the correct media type information, one should try to name the file so that things might work on IE, too. Thus, one should stick to commonly used, conventional file name suffixes such as .DOC for MS Word documents, .XLS for MS Excel documents, and .TXT for plain text documents.
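
One rough way to spot a mismatch between a file name suffix and the declared media type is to compare the two with Python's standard mimetypes module, as sketched below. Note the assumptions: the suffix table is the one built into Python (possibly extended by the local system), not the table IE actually uses, and the function name suffix_matches is made up for this example. A match therefore only suggests that the name is conventional.

```python
import mimetypes


def suffix_matches(filename, declared_type):
    """Return True if the file name suffix conventionally maps to declared_type.

    Hypothetical helper. The mapping comes from Python's mimetypes table,
    which is only an approximation of what any given browser assumes.
    """
    guessed, _encoding = mimetypes.guess_type(filename)
    # Ignore parameters such as '; charset=...' in the declared type.
    declared = declared_type.split(";", 1)[0].strip().lower()
    return guessed is not None and guessed.lower() == declared


# A plain text file served under a conventional name and type.
print(suffix_matches("notes.txt", "text/plain"))
# The same type under a misleading name, which a suffix-based browser may mishandle.
print(suffix_matches("notes.gif", "text/plain"))
```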


Date of last update: 2010-12-16.
This page belongs to the free information site IT and communication, section Web authoring and surfing, by Jukka "Yucca" Korpela.