Writing block quotations -
why the HTML blockquote element is insufficient

A block of quoted text should normally be preceded by some introductory text that makes it clear that a quotation follows. It should be followed by an attribution of some kind, preferably with an exact citation of the source preceded by an expression like "Source:". It is usually not adequate to rely on things like the HTML blockquote element, partly because neither visual nor aural browsers make it sufficiently clear that a quotation is given.

Theory: the nice, structural blockquote

In a sense, the blockquote element is one of the few HTML elements with a well-defined meaning. It indicates a block of text quoted from an external source. This is an important structural issue, since the quoted text is not part of the document proper, and it need not reflect the author's views; it might have been included just in order to argue against it.

Since graphic browsers typically display blockquote as indented, it has become a fairly common practice to use it simply to indent text, irrespective of whether it is quoted or not. This might be tempting, since there is no element for merely indenting text in HTML. On the other hand, HTML should not be expected to have such an element, since indentation is not structural at all; it might be used to convey some structural information, but what that information would be is very ambiguous.

It's a rather good quick criterion for HTML tutorials to check what they say about blockquote. As I wrote years ago:

The problem is that there are so many HTML tutorials, and most of them are more or less garbage. A rule of thumb: Before starting to read an HTML tutorial, check what it says about the BLOCKQUOTE element; if it describes that the element is for block quotations, and perhaps gives an illustrating example, the tutorial is probably worth a closer look; if on the other hand the tutorial starts explaining that BLOCKQUOTE is for indentation - which is just as absurd as it sounds, and contradicts the HTML specifications - don't waste your time with it. (I take this example because BLOCKQUOTE is so simple an element that even without prior knowledge on HTML you should quickly realize what BLOCKQUOTE is for - and yet so many tutorial writers don't get that.)

Source: a Usenet posting 1998-06-29 in alt.html.

Practice: what does it sound like?

These two elements designate quoted text. BLOCKQUOTE is for long quotations (block-level content) and Q is intended for short quotations (inline content) that don't require paragraph breaks.

Was it absolutely clear that the preceding paragraph was quoted text? What about its source then? I used the correct element, blockquote, and even the cite attribute in the element to indicate the source of the citation, namely the discussion of the blockquote and q elements in the HTML 4 specification. But what graphic browsers typically do is just present the blockquote element with some extra margin on the left and possibly on the right too. Some older browsers used an italic font as well. If you are using a CSS enabled graphic browser, you will probably see a border around the block as well as a distinctive background for it. But this is not a universal convention for indicating text as quoted; it's a hint, at best. And in auditive presentation, e.g. by a screen reader, such information might get completely lost.

In speech, people often utter the words "quote" and "unquote" when presenting a longish quotation. This might not be stylistically ideal, but it reflects the very real need to indicate the start and end of a quotation exactly.

In visual appearance, any of the ways that browsers commonly use to present blockquote elements may fail to convey the message, or may give a wrong message. In particular, indentation is often used and understood as indicating emphasis or, less often, de-emphasis (excursory remarks). Although indentation of quoted text is probably a good idea, it is not sufficient as the only method of indicating text as quoted; yet popular browsers by default use indentation only.

At worst, and quite often, a blockquote sounds or looks like part of the normal flow of text. We should do something about it.

The natural way to indicate the start of a quotation is to use introductory words in preceding text. It is usually suitable to make some general statement that says that a quotation follows, instead of indicating the source accurately. The detailed credits are probably best presented after the quotation. In particular, if the credits contain links to online resources where the quoted text can be found, it is better that the user can choose whether to follow the link after hearing the quotation.

If the quotation is in a language different from that of the enclosing text, I suggest that you mention this explicitly in the introductory words, in addition to using the lang attribute. The user would then be better prepared for the change of language. Besides, there is no guarantee that the user even recognizes the language! (Knowing the language would be useful for getting human or automated help in understanding the text at least roughly.)
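
As a minimal sketch of such introductory words combined with the lang attribute, something like the following might do. The introductory sentence is only my illustration, and the Latin phrase is the one traditionally attributed to Caesar:

```html
<!-- Sketch only: the wording of the introductory sentence is illustrative. -->
<p>Caesar is said to have summed up the campaign in three Latin words:</p>
<blockquote lang="la">
<p>Veni, vidi, vici.</p>
</blockquote>
```

The introductory sentence both announces that a quotation follows and names its language, so the information is not lost even if the lang attribute is ignored by the browser.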

How to indicate the source?

Even in principle, blockquote is structurally insufficient, since there is no markup for specifying the source of the quotation in the content. The cite attribute is unsatisfactory, since its value is just an address (URL) and it is not required (or even expected) to be presented by default to the user. (The old HTML 3.0 draft contained the credit element for the purpose, but for some reason the idea was ignored later.)

Yet, indicating the source is very often required by various guidelines and practices, or even the law. In fact, it should be the rule rather than exception to indicate the source of a quotation.

We have no adequate markup, and the designation of the source is a small block of text but not a paragraph, so the div element seems to be what we should use. And we might just as well use a class attribute for it, both as a comment-like reminder and as a tool for applying style sheets to such expressions.

But should the source citation be part of the blockquote element, or should it appear after it? There might be some presentational benefits from putting it inside the blockquote, but logically, the citation is not part of the quoted text, and the definition says that blockquote designates quoted text.

What I recommend is the following structure:

<blockquote cite="URL">
The actual quotation.
</blockquote>
<div class="credit" align="right"><small>Source:
<cite><a href="URL">Title</a></cite>.</small></div>

This would make the information about the source appear after the quoted text, in smaller font size and right-aligned, in most visual presentations. In printed publications, it is customary to begin the information about the source with an em dash (—) rather than an expression like "Source:", but the latter is clearer and works better in speech presentation. Naturally, the word "Source" should appear in the language of the document. The class name credit is rather arbitrary; source might be better, but in any case, how you name your classes is normally an internal affair, irrelevant to the user.

The URLs above need not be exactly the same; the cite attribute could refer to the exact location whereas the visible citation would refer to a larger entity, such as an online version of a book as a whole. If the quoted document is not available online, the cite attribute and the link markup are omitted, of course.
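
To illustrate the difference between the two addresses, here is a sketch in which the cite attribute points to the exact passage and the visible link points to the work as a whole. Both URLs and the title are hypothetical placeholders:

```html
<!-- Both URLs and the title are hypothetical placeholders. -->
<blockquote cite="https://example.org/some-book/chapter-3.html#section-2">
The actual quotation.
</blockquote>
<div class="credit" align="right"><small>Source:
<cite><a href="https://example.org/some-book/">Title of the book</a></cite>.</small></div>
```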

If you wish to author in "puristic" HTML, you would omit the align attribute and the small tags, leaving all presentational suggestions to a style sheet. Despite being somewhat of an HTML "purist", I keep using presentational markup in this context, however. It can hardly cause any harm, and it may give useful hints to the reader. But the fine tuning of the presentation should be done in a style sheet; in fact, it cannot be done anywhere else.
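
For comparison, the "puristic" variant might look roughly like this. It is only a sketch: the style element is shown next to the markup for brevity, though it belongs in the head of the document (or in a linked style sheet), and the two declarations are merely one way to imitate what align and small suggest:

```html
<!-- Sketch: same structure as recommended above, presentation moved to CSS. -->
<style type="text/css">
.credit { text-align: right; font-size: 80%; }
</style>

<blockquote cite="URL">
The actual quotation.
</blockquote>
<div class="credit">Source:
<cite><a href="URL">Title</a></cite>.</div>
```

The cost is that with style sheets disabled or unsupported, the credit line looks like ordinary text; only the word "Source:" then marks it.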

Tuning the visual appearance

Quite often authors use blockquote just to get some indentation, e.g. just to produce a left margin that is assumed to be necessary, or to emphasize something, or to "set apart" an example. This is semantically all wrong if the material is not actual quoted text. But such widespread abuse implies that there is little hope of getting better presentation in browsers. If they started making it more obvious that a quotation is present, a large number of pages would look really odd, especially if they use nested blockquote markup to get more indentation.

Hence, authors may wish to consider using style sheets (CSS) to improve the presentation. There are many ways to try to make quoted text look sufficiently different from normal text, such as font face, line spacing, text and background color, etc. To get started with style sheets and to evaluate the possibilities, please refer to my annotated link list How to write style sheets (CSS).

Here's what I am using for block quotations, in my basic style sheet:

blockquote { border : solid #696 1px; padding : 3px; 
   margin-left: 3em; margin-bottom:0.2em;
   background: #f9f9f9 none; color: #000; }
.credit { text-align : right; page-break-before: avoid;
   font-family:Verdana,Arial,Helvetica,sans-serif; }
.credit small { font-size: 80%; } 
blockquote p { margin: 0; text-indent:1.5em; }
blockquote pre { margin: 0; }

In the context of my style sheet in general, this distinguishes block quotations from other material in several ways: no background image, close to white background color, a light green border around the block quotation; and the credits in a sans-serif font below the quotation proper, in the overall page style. The last two lines are intended to avoid excessive spacing between the quoted text and the border as well as to use "literary paragraph style" (first line indents) within block quotations.

This is just one set of ideas, of course. If, for example, your documents contain a large number of block quotations, you might prefer something quite different.

Should we imitate the style of the quoted document?

When quoting online HTML resources, the question arises whether we should use the presentational features of the quoted document in preference to our own style. For example, if our own document is black on white and the quoted document is white on black, this could get disturbing. For non-HTML and offline resources, the question arises whether we should try to imitate the presentation style as far as possible. For example, when quoting a book, should we use CSS (or HTML) to suggest similar font face, size, etc.? (For font size, hardly; for font face, it might be stylish.)

If we decide to imitate the style of the quoted document in our quotation, to some extent at least, then a technical problem arises: there is no simple way to say that a particular style sheet be used for a blockquote element. Even if the quoted document is well-designed, so that all stylistic suggestions are in a linked style sheet, we cannot just refer to that style sheet so that it will be applied to the content of the blockquote.

Instead, we would need to use a class (or id) attribute for the blockquote (say, class="foo") and to create a copy of the quoted document's style sheet, modify it by prefixing selectors by something suitable like blockquote.foo, and add a link element referring to that modified style sheet. This gets rather awkward, and I can't recommend it in the general case.
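
A sketch of the idea, assuming that the quoted document's style sheet contained just the two hypothetical rules shown in the comment. For self-containment the modified rules appear in a style element here, whereas the approach described above would put them in a separate linked style sheet:

```html
<!-- Hypothetical rules in the quoted document's own style sheet:
       body { background: #000; color: #fff; }
       p    { font-family: Georgia, serif; }
     Modified copy, restricted to the quotation: -->
<style type="text/css">
blockquote.foo   { background: #000; color: #fff; }
blockquote.foo p { font-family: Georgia, serif; }
</style>

<blockquote class="foo" cite="URL">
<p>The actual quotation.</p>
</blockquote>
```

Note that a rule for body cannot simply be prefixed, since blockquote.foo body would match nothing; in this sketch its declarations are moved onto blockquote.foo itself. Details like this are part of what makes the approach awkward.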

For quotations of non-HTML material, I would say that a reasonable imitation of the presentation style is nice, if you care to take the trouble. Naturally the imitation should not conflict with your basic ideas of making quotations look like quotations, or with accessibility principles. (For example, a fixed font size should not be imitated.)

Things would be easier if blockquote markup were defined so that it establishes a new "frame of reference", similar to that of iframe. This is not just a presentational issue. For example, most software for processing HTML documents fails to distinguish between heading elements inside blockquote and elsewhere. If you quote a piece of text containing a heading, it should have no relationship with heading markup in your document. Yet most programs for generating a table of contents from headings, for example, will probably get confused.
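
To make the heading problem concrete, consider a sketch like the following, with invented heading texts. A tool that simply collects all h2 elements would list "Summary of findings" between "First section" and "Second section", although it belongs to the quoted text and not to the quoting document:

```html
<!-- Sketch with invented heading texts. -->
<h1>My document</h1>
<h2>First section</h2>
<p>The quoted report begins as follows:</p>
<blockquote>
<h2>Summary of findings</h2>
<p>Quoted paragraph.</p>
</blockquote>
<h2>Second section</h2>
```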

Quotation marks

There is a very simple idea of indicating something as quoted text: quotation marks. They work well in visual presentation, and speech presentation might also be expected to recognize them and use a special tone of voice or perhaps to spell out the punctuation marks. (This however cannot really be relied on. But it could be useful as an additional indication.)

The naive approach would be something like the following:

blockquote:before { content: "\201C"; }
blockquote:after  { content: "\201D"; }

This works in some cases, as the following demonstrates, presuming that your browser has sufficient CSS support:

Veni, vidi, vici.

Note that e.g. Internet Explorer 6 lacks support for the CSS constructs used. But the quotation marks appear as intended e.g. on Opera 6 and Netscape 7.

However, if the content of blockquote is wrapped inside a block-level container like p or div, as it should be under HTML 4 Strict rules, the quotation marks appear on lines of their own! The following example demonstrates this, on supporting browsers. It is the same as the preceding example but with the phrase wrapped inside div:

Veni, vidi, vici.

The reason is that e.g. div markup implies a line break before and after, and as generated content the quotation marks appear before and after the content of the blockquote element, i.e. outside the div.
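
In markup terms, the two cases can be sketched as follows, with the style rules from above repeated in a style element so that the fragment can be tried on its own:

```html
<style type="text/css">
blockquote:before { content: "\201C"; }
blockquote:after  { content: "\201D"; }
</style>

<!-- Case 1: text directly inside blockquote;
     the marks appear on the same line as the text. -->
<blockquote>Veni, vidi, vici.</blockquote>

<!-- Case 2: text wrapped in a block-level div;
     each mark ends up on a line of its own. -->
<blockquote><div>Veni, vidi, vici.</div></blockquote>
```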

There's no simple solution to this. A clumsy non-CSS solution would be to insert the quotation marks into the content of blockquote. This would be illogical too, since those characters would not be part of the quoted text.

On the other hand, a widely and justly praised guide on writing style states:

Quotations of an entire line, or more, of verse, are begun on a fresh line and centred, but not enclosed in quotation marks.

Source: William Strunk, Jr.: The Elements of Style, part A Few Matters of Form.

So maybe we shouldn't try to use quotation marks for block quotations.

Conclusion

The blockquote element is mostly an illusion, and for adequate markup, better methods would be needed. In HTML as currently defined, blockquote should be used for bulky quotations but not relied upon. You should include both a "prelude" that verbally expresses that a quotation follows and a "postlude" that specifies the source or otherwise indicates that the quote ends. Moreover, CSS can be used to distinguish quotations visually.

You might ask whether this is too pessimistic: if you cannot rely on well-defined structural markup, what can you rely on in HTML authoring? The answer is that there are structural elements that we can reasonably rely on. We can expect browsers to present h1, h2, h3, and h4 in a manner that makes it obvious that they are headings. Similarly, p markup for paragraphs works well, and so do ul and ol for lists, for example.