HTML5 is not SGML-based, and there will be no official DTD for it. SGML can express only a rather limited set of rules for syntax. Yet, a DTD is useful for validation.
SGML validation helps to detect unintentional low-level mistakes in code, such as a forgotten end tag (e.g., a missing </i> may turn large parts of text to italic) or a mistyped attribute name.
SGML validation also helps you check that you use only the tags and attributes you intended to use and, when dealing with old pages, detect the use of features that you would rather get rid of. This works best when you are in full command of the tags and attributes that are allowed. My DTD generator is a step in that direction.
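As an illustration of the low-level mistakes meant here, the following fragment (made up for this example, not taken from any real page) contains both a forgotten </i> end tag and a mistyped attribute name, hrf instead of href. A DTD-based validator would be expected to report both, whereas a browser would silently extend the italics and ignore the unknown attribute.

```html
<!-- Hypothetical fragment with two deliberate mistakes -->
<p>This is <i>emphasized text, but the end tag is missing.</p>
<p>See the <a hrf="notes.html">notes</a> for details.</p>
```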
Note that the HTML5 mode in the W3C Markup Validator, as well as in the Validator.nu service, applies a fixed set of rules, which generally reflects the current state of HTML5 drafts with some delay. This means, among other things, that it reports as errors the use of many tags and attributes that have been in HTML for a long time and are universally or almost universally supported by browsers. To people who wish or need to keep using such features extensively, such checkers are of limited usefulness. It is difficult to pick out the real syntax errors from a pile of messages expressing dislike for some markup.
To use my experimental HTML5 DTD, more exactly a DTD for a markup language closely resembling HTML5, take the following steps:
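Put the following declaration at the start of your document (replacing the existing doctype declaration, if present):
<!DOCTYPE HTML SYSTEM "absurl/html5.dtd">
Replace absurl with the absolute URL (starting with http://) of the DTD.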
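For illustration, a minimal complete document using such a declaration might look like the sketch below. The address http://www.example.com/dtd/ is only a placeholder for wherever the DTD is actually available, and the document content is invented for the example.

```html
<!DOCTYPE HTML SYSTEM "http://www.example.com/dtd/html5.dtd">
<!-- The URL above is a placeholder; substitute the real absolute URL of the DTD -->
<html lang="en">
<head>
<title>Test page</title>
</head>
<body>
<p>Hello world.</p>
</body>
</html>
```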
You can alternatively use the permissive HTML DTD, which additionally contains features mentioned in HTML5 drafts but declared obsolete there, such as the font tag.
You can also use my DTD generator to select a set of tags as you like.
IDREF in HTML 4.01) is more restrictive than in HTML5. Without this restriction, validators would not, for example, check the uniqueness of
The frame attribute is allowed in the table element, despite being declared obsolete in HTML5. This is needed to describe the shorthand attribute
data-attributes are not allowed, as the rule for allowing them cannot be expressed in SGML. (It would be possible to add the feature of allowing a given set of additional attributes.)
aria-attributes are not supported.
dataformatas (obsolete per HTML5) are not supported.
The math element is defined as having just flow content, and no other MathML markup is recognized (partly because it would complicate things a lot and would require the problematic entities).
The rb element (obsolete per HTML5) is not supported.
hidden="" as opposed to