The pragmatic guide to HTML: Principles
or
The HTML Anarchist’s leaflet

Use tags in a simple manner

<!doctype html>
<html lang=en>
<meta charset=utf-8>
<title>Demo</title>
<link rel=stylesheet href=basic.css>
<h1>This is a simple demo</h1>
<blockquote>
Indented <font class=special>text.</font>
</blockquote>
<i>Jukka K. Korpela</i>

Some traps to avoid

It is easy to exaggerate the pragmatic principle so that the author considers only the visual effects of tags, using whatever makes things look good. It is the total effect that matters. The total effect includes functionality and impression on users.

This may look like fieldset but isn’t.

Authors have used the fieldset element to create a block with a border that has rounded corners. This used to be the only convenient way to produce such rendering; nowadays CSS (border-radius) can do it, though still with somewhat limited browser support. The markup does not normally cause changes in functionality, but assistive software may announce the element to the user as a fieldset. This may mislead the user into thinking that some form fields are present. So it is better to use fieldset only according to its defined meaning, for grouping form fields. Besides, most browsers nowadays do not even use rounded corners for fieldsets by default!
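A bordered box with rounded corners does not need fieldset. The following is only a minimal sketch: the class name box and the border values are made up for illustration, and a browser without border-radius support would simply draw square corners.

```html
<!doctype html>
<html lang=en>
<meta charset=utf-8>
<title>Bordered box without fieldset</title>
<style>
/* "box" is a hypothetical class name; the values are arbitrary */
.box {
  border: 1px solid #666;
  border-radius: 0.5em;
  padding: 0.5em 1em;
}
</style>
<div class=box>
This has a rounded border, yet it is a plain div, not a group of form fields.
</div>
```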

The textarea element has fairly often been used just to create a scrollable area of text. In the early days of HTML, this was the only way to create such an area. Nowadays, there are better ways in CSS, and even the iframe element could be used. The main reason for avoiding such use of textarea is that it creates a presentation that most often indicates a text input area. Thus, using it for mere presentation of text may confuse users, at least mildly.
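One of the CSS ways is the overflow property on an element of limited height. This is a sketch under assumptions: the class name scroll and the dimensions are invented for the example.

```html
<!doctype html>
<html lang=en>
<meta charset=utf-8>
<title>Scrollable text without textarea</title>
<style>
/* "scroll" is a hypothetical class name */
.scroll {
  height: 6em;       /* limit the visible height */
  overflow: auto;    /* show a scroll bar only when the text does not fit */
  border: 1px solid #999;
  padding: 0.3em;
}
</style>
<div class=scroll>
Put the long text here. When it does not fit into the box, the browser
adds a scroll bar, and nothing suggests that the user could type into
the area.
</div>
```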

The distinction principle

Use different markup for elements that might conceivably be styled differently. For example, foreign-language phrases such as fait accompli might be rendered as normal text, or in italic, or maybe even in some special color. Similarly, there are several conceivable styles for a term when a definition is given: “By tactile we mean…” Contrast this with a scientific species name like Homo sapiens or a quantity symbol like m for mass in physics; these expressions are to appear in italic by the conventions of biology and physics.

<i class=foreign>fait accompli</i>
<i class=def>tactile</i>
<i>Homo sapiens</i>
<i>m</i> = 2,4 kg

Thus, if you decide that italic is the best default rendering for all of these cases, use the i tag by all means, but distinguish the cases by using suitable class names. The cases where italics is the only correct rendering do not need a class attribute, unless you have some special use for it (in, say, client-side scripting).

This way, you are prepared for requests like “show the defining occurrences in normal text style but with a yellow background” (like tactile and not tactile). Then you would not need to go through all of your markup to find your i elements and decide which of them are to be styled that way. You would just add one style sheet rule, like
.def {
  font-style: normal;
  background: #ffa;
  color: black;
  padding: 0 0.2em;
}

Why not use dfn for the defining occurrence? It would be “semantically correct” by the specifications, but then the default rendering would be italic on some browsers, normal text in others. (The risk of getting normal text is small, but why not use the safer tag?)

Lists

The three list elements ul, ol, and dl are often convenient tools for setting up bulleted, numbered, and description lists, respectively. Use them when they serve such purposes, but do not think that any list must be written using one of them.

It does not make much sense to use ul and then consider how to get rid of the bullets. Similarly, ol gives you simple browser-generated numbering. If that’s fine, use it; if you need some more complicated numbering, forget ol.
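A list with numbering that ol cannot easily generate can simply carry its numbers in the content. This is a sketch only: the class name steps, the numbering scheme, and the item texts are invented for illustration.

```html
<!doctype html>
<html lang=en>
<meta charset=utf-8>
<title>A numbered list without ol</title>
<style>
/* "steps" is a hypothetical class name */
.steps div {
  margin: 0.3em 0 0.3em 3em;  /* room on the left for the number */
  text-indent: -3em;          /* hang the number out into that room */
}
</style>
<div class=steps>
<div>A-1. Unpack the device.</div>
<div>A-2. Connect the cable.</div>
<div>B-1. Switch the power on.</div>
</div>
```

Since the numbers are part of the text, they survive copying and display without CSS, which browser-generated numbering does not always do.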

It might be argued that list elements are partly functional. A speech browser may announce a list by saying something like “a list of fifteen items”, and this may help the user. It may distract, too. A speech browser could also have a command for skipping an entire list. There are many things that could be done with markup, and some of them are actually being done in some software, but it would still be pointless to require that every list be marked up using specific HTML elements.

Tables

<table>
  <tr>
    <td id=nav>Navigation
    <td id=content>Content proper
    <td id=footer>Footer
</table>

Use tables for tabular data and, when appropriate, for layout. Tables are the most reliable and robust way of putting elements in columns in the broad sense. Complicated layout is messy and risky, but the solution is to simplify things, not to complicate them by using loads of div elements placed with CSS positioning.

The most vulgar forms of the anti-table movement tend to attack undeniably tabular data too, requiring tag soup approaches (div, div, div, span, span, span, with loads of tricky, kludgy, and unreliable CSS code) to them as well.

In the more educated circles, there are endless debates about what is a “real table” or “tabular data”. What you call a table tells more about you than about the data. This makes the vulgar form somewhat reasonable. Why waste time on splitting hairs when you can spend it on coding?

It’s usually pointless to debate against the anti-table movement, in any of its forms, just as it is pointless to debate on religion or to fight against superstitions. People change convictions, but debates tend to prevent rather than cause that.

Many sites that use tables for layout have serious problems, but they are caused by complexity of design, designing for some specific window width (“resolution”), and requirements on pixel-exactness. In most cases, the div + CSS school has just made the situation worse. They keep repeating dogmas they learned from somewhere, without ever stopping to cite any factual evidence. The most honest of their dogmas is “layout tables are outdated”.

For simply placing an element on the left or right of other content, consider using either the align attribute or the float property in CSS. However, that makes the other content “flow” around it, and if you want a different setup, you probably want a table.
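The float alternative might look like the following sketch. The class name side, the width, and the texts are assumptions made for the example.

```html
<!doctype html>
<html lang=en>
<meta charset=utf-8>
<title>Floating an element to the right</title>
<style>
/* "side" is a hypothetical class name */
.side {
  float: right;           /* place the box at the right edge */
  width: 12em;
  margin: 0 0 0.5em 1em;  /* keep the surrounding text off the box */
  border: 1px solid #999;
  padding: 0.3em;
}
</style>
<div class=side>A side note.</div>
<p>The main text starts beside the floated box. Once the text gets
longer than the box is tall, it continues below the box at full width.
That is the “flow” mentioned above; if the two parts should stay in
separate columns all the way down, float is the wrong tool.</p>
```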

If you expect to need to vary or modify the layout substantially without changing the markup, then the div approach is better than tables. IE (at least up to IE 9) does not let you change the basic rendering of table elements.

In order to style tables, use primarily CSS, simply because it is easier and offers more possibilities than presentational HTML attributes. However, for example, for a column of numeric data, consider using the HTML attribute align=right in each cell; this works even when CSS is off.
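The combination might be sketched as follows: CSS takes care of borders and spacing, while align=right in the markup keeps the numbers lined up even without CSS. The items and prices are invented for illustration.

```html
<!doctype html>
<html lang=en>
<meta charset=utf-8>
<title>A numeric column</title>
<style>
table { border-collapse: collapse; }
th, td { border: 1px solid #999; padding: 0.2em 0.5em; }
</style>
<table>
  <tr><th>Item<th>Price
  <tr><td>Coffee<td align=right>2.50
  <tr><td>Sandwich<td align=right>12.00
</table>
```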

Custom tags?

              </div>
            </div>
          </div>
        </div>
      </div>
    </div>
  </div>
</div>

People often get tired of writing or even reading markup like
<div class="nav">…</div>
instead of simple
<nav>…</nav>
It’s not just a matter of convenience of coding. If you need deeply nested elements, it gets difficult to keep track of </div> tags. Take a large HTML file and try to find out which major constructs are terminated by each of these tags. Well, HTML5 drafts contain new elements like nav just for such reasons.

However, HTML5 does not let you use the tags you like. It just adds a set of tags, selected largely to reflect class attribute values that authors seem to use frequently.

In practice, you can still use your own tags. Instead of clumsy markup like <span class="person">John</span>, you can use <person>John</person>. You can combine blocks (paragraphs, headings, etc.) into larger units using your own descriptive markup, e.g. <main>…</main> or <fig>…</fig>. This improves the readability of the markup and makes it easier to associate end tags with start tags.
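A sketch of how such markup could be styled follows. The tag names come from the examples above; the style values and the sample text are assumptions. Browsers treat unknown elements as inline, so a tag meant to act as a block needs display: block.

```html
<!doctype html>
<html lang=en>
<meta charset=utf-8>
<title>Custom tags</title>
<style>
/* Unknown elements are inline by default; make fig a block */
fig { display: block; margin: 1em 0; border-left: 0.3em solid #ccc; padding-left: 1em; }
person { font-weight: bold; }
</style>
<fig>
<!-- The p end tag is written out here so that </fig> is sure to match its start tag -->
<p><person>John</person> joined the project in spring.</p>
</fig>
```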

There are some problems, though:

These problems, except for the last one, relate to new HTML5 tags, too (though the second last is only marginally relevant).

So it’s your call—maybe. Your boss, your client, or your partners may have a word to say on using tags, and in education and training, you will probably be told to use the Right Markup, whatever it might be this week. But if you can choose, the benefits of “custom tags” probably outweigh the risks.

You could also use custom attributes, as in <person sex=male>John</person>, and corresponding attribute selectors in CSS, e.g. person[sex=male]. While this works on modern browsers, there is not much to be gained in comparison with using classes, and class selectors are better supported in old browsers. Remember that a class attribute value may contain several class names (well supported in browsers), e.g. class="employee male important".
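The two approaches can be compared side by side in a small sketch. The colors are arbitrary, and the second person is invented for the example.

```html
<!doctype html>
<html lang=en>
<meta charset=utf-8>
<title>Attribute selector versus class selector</title>
<style>
/* Attribute selector on a custom attribute */
person[sex=male] { color: navy; }
/* Class selectors: each class name can be addressed separately */
.male { color: navy; }
.important { font-weight: bold; }
</style>
<p><person sex=male>John</person> is styled through the attribute.</p>
<p><person class="employee male important">Jack</person> is styled through the classes.</p>
```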

Excuses

I was an advocate of semantic HTML markup for nearly 20 years. After much frustration, I was converted. A decisive event on the road to admitting the defeat of semantic HTML and drawing the conclusions was Ian “Hixie” Hickson’s message dated Sun Feb 12 13:25:52 PST 2012, with the title The blockquote element spec vs common quoting practices in the public WHATWG Working Group mailing list. He wrote, among other things: “The use case for most of the ‘semantic’ markup is [just] easier authoring and maintenance, in particular for selectors in CSS.”

Perhaps authors will use the new HTML5 tags widely, according to HTML5 rules, making the markup easier to read and modify for coworkers or others who work on the same markup. But I don’t see this as a big issue. And it is not realistic to expect (though admittedly possible) that browsers will do anything special with these elements (except render them as blocks and not inline elements) or that search engines will get enthusiastic about them.

I have intentionally written mostly about tags here. I know the difference between tag and element quite well. I even know that confusing them with each other can cause real trouble. But for the purposes of this presentation, tag is suitable. After all, we talk about tag soup, not element soup.