This document suggests three ways of presenting an image with a caption in HTML. Styling in CSS is also discussed.
Sadly enough, there is no markup for image captions in HTML, unless you count the figcaption element in HTML5 proposals. What comes closest to semantically associating some text content with some image is putting them into a table so that the image is in one cell and the text is either in another cell or in a caption element. Then there’s the “semantically empty” approach, which is better than semantically wrong (such as suggestions to use definition list markup).
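There are two basic ways to use a table for an image and its caption, so as a whole, we have three alternative methods:
1. a two-cell table, with the image in one cell and the caption text in the other
2. a single-cell table containing the image, with a table caption (caption) element
3. a div element containing both the image and an inner div element, which contains the caption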
[Example rendering: an image with the caption “A Dalmatian” below it]
<table class="image">
<tr><td><img ...></td></tr>
<tr><td class="caption">caption text</td></tr>
</table>
This approach generates by default (i.e. if you don’t use
style sheets or additional attributes to affect the rendering)
a presentation that is illustrated on the right.
The image and the caption text are in two cells of a one-column table.
The markup above assigns a class, caption, for the caption text cell, but it’s there just to make styling easier. The same applies to class image assigned to the table. There is nothing magic in class names in HTML and CSS; they are just names chosen by an author as he finds convenient and hopefully descriptive to anyone who reads the code.
By default, text in a table cell (td element) is left-aligned, but you can change this by using e.g. align="center" in the td tag, or in CSS (e.g., td.caption { text-align: center; }).
A table is normally left-aligned by default and appears with no other content on either side of it. You can affect this using an align attribute in the table tag or, more flexibly, using CSS. It might be a good idea to set just some left margin for the table, using e.g. the CSS code table.image { margin-left: 2em; }.
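Putting the pieces above together, a minimal sketch of the two-cell table approach might look as follows. The file name dalmatian.jpg, the image dimensions, and the alt text are placeholders chosen for illustration only.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<title>Two-cell table caption (sketch)</title>
<style>
/* Indent the whole image-plus-caption table a little */
table.image { margin-left: 2em; }
/* Center the caption text within its cell */
td.caption { text-align: center; }
</style>
</head>
<body>
<table class="image">
<!-- dalmatian.jpg is a placeholder file name -->
<tr><td><img src="dalmatian.jpg" alt="A Dalmatian dog" width="200" height="150"></td></tr>
<tr><td class="caption">A Dalmatian</td></tr>
</table>
</body>
</html>
```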
Table with a caption element
<table class="image">
<caption align="bottom">caption text</caption>
<tr><td><img ...></td></tr>
</table>
This approach is similar to the first one, but instead of putting
the caption text into a cell, you put it inside a caption
element. It is by definition a caption for the entire table, but in this
case, the table has but one cell, containing the image.
By default, the caption would appear above the image, but the
attribute align="bottom"
puts it below the image.
You could do the same in CSS using table.image caption { caption-side: bottom; }, but this has been poorly supported: Internet Explorer versions before IE 8 do not support it.
If you wish to affect the horizontal alignment in a caption element, use the text-align property in CSS. For example, <caption align="bottom" style="text-align: left">.
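As a rough sketch, the single-cell table approach could combine the HTML attribute and the CSS property, so that the caption goes below the image even where caption-side is not supported. The image file name, dimensions, and alt text are placeholders.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<title>Table caption below an image (sketch)</title>
<style>
/* CSS counterpart of align="bottom"; as noted above, Internet Explorer support is a concern */
table.image caption { caption-side: bottom; text-align: left; }
</style>
</head>
<body>
<table class="image">
<!-- align="bottom" is kept as a fallback for browsers that ignore caption-side -->
<caption align="bottom">A Dalmatian</caption>
<!-- dalmatian.jpg is a placeholder file name -->
<tr><td><img src="dalmatian.jpg" alt="A Dalmatian dog" width="200" height="150"></td></tr>
</table>
</body>
</html>
```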
Nested div elements
<div class="image">
<img ...>
<div>caption text</div>
</div>
This is the simplest method, using just div
markup.
The inner div
element is used for two reasons: to make
the caption text appear on a line of its own, and to make it an element,
so that it can be referred to in CSS (using a selector like
div.image div
).
It might be argued that it is even simpler to omit the inner
div
markup and use just <br>
to create
a line break between the image and the caption. Even the outer div
markup could be omitted on similar grounds. However, the markup
presented here is the simplest reasonable alternative.
The use of div
makes it possible to treat the caption text and
the combination of the image and caption as styleable elements.
A div
element
has no top or bottom margin by default. You can change this in CSS. For example,
div.image { margin: 1em 0; }
would set a top
and bottom margin of 1em. On the other hand, the construct is often
preceded by an element that has a bottom margin, or followed by
an element that has a top margin, such as a paragraph or a heading,
so it does not need margins of its own.
The caption text is left-aligned by default. This can be changed
in different ways, but note that if you use align="center"
for the inner div
, the text will be horizontally
centered within the available space, not with respect to the image.
These three approaches give a tolerable rendering in non-CSS situations (showing the caption under the image), and they are each a relatively good starting point for styling. When using a table, you need to consider cell spacing and cell padding, which are by default nonzero. But there wouldn’t be strange browser idiosyncrasies to worry about. The rest really depends on the desired appearance as well as the properties of the image and the text.
[Example rendering: an image with the caption “A Dalmatian” below it]
Typically we’d probably want to set caption text size to a bit smaller than copy text, and maybe the font face to something different too, and we might wish to center the text (though this may depend on its length). In the first approach you could use the following:
.image .caption { font-size: 80%; font-family: Verdana, Arial, sans-serif; text-align: center; }
In the two other approaches, you would replace
.image .caption
by
.image caption
or
.image div
,
respectively.
For long caption texts, you need to decide whether they should wrap according to the width of the image or be set to some other width. It’s probably best to make the width the same as that of the image or (for narrow images) just a little wider.
By default, browsers handle the second approach (using a table
and a caption
element)
so that the text is wrapped to the same width as the image.
This is because they determine the width of the table according
to the cell containing the image. If you wish to make sure of this,
you could explicitly set the table width to the same as the image width.
In the first approach, you would need to be explicit about the table
width, either in CSS or in HTML.
In the above example, the caption element has a grey background to illustrate that it extends a bit to the left and to the right of the image width. This is usually not serious when the text there is centered.
The phenomenon is caused by default cell padding and cell spacing that browsers
apply when rendering a table.
If it becomes a problem, you can fix it in HTML by setting
cellspacing="0" cellpadding="0"
in the table
element
or in CSS by setting
table { border-collapse: collapse; } td { padding: 0; }
.
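The CSS fix can be sketched as below. As an assumption of this sketch, the rules are limited to tables of class image, so that other tables on the page keep their default spacing. The grey background is there only to make the width of the caption visible, and the image file name is a placeholder.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<title>Caption as wide as the image (sketch)</title>
<style>
/* Remove the default spacing between and inside cells */
table.image { border-collapse: collapse; }
table.image td { padding: 0; }
/* Background only for illustrating how wide the caption box is */
table.image caption { caption-side: bottom; background: #ccc; text-align: center; }
</style>
</head>
<body>
<table class="image">
<caption align="bottom">A Dalmatian</caption>
<!-- dalmatian.jpg is a placeholder file name -->
<tr><td><img src="dalmatian.jpg" alt="A Dalmatian dog" width="200" height="150"></td></tr>
</table>
</body>
</html>
```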
In the third approach, the caption text by default uses the
available width. The reason is that the width of a div
element
by default extends across the available width.
You could change the appearance by explicitly setting the width of the
outer div
element, e.g.
<div class="image" style="width:200px">
.
Using a style
attribute is a practical choice here,
since the width needs to depend on the specific image that appears
inside the element.
Of course, in many cases you could meaningfully use explicit
line breaks (with <br>
) markup inside the caption text,
especially if the text has fairly separate parts. For example, you could
write <div>A Dalmatian dog.<br><small>Drawing by Liisa Sarakontu.</small></div>
.
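A sketch of the div approach with these ideas combined might look like this. The width of 200px is assumed to match the image, and the image file name and alt text are placeholders.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<title>Caption with nested div elements (sketch)</title>
<style>
/* Vertical margins for the image-plus-caption block */
div.image { margin: 1em 0; }
/* Style the caption via the inner div; centering works relative to the set width */
div.image div { font-size: 80%; text-align: center; }
</style>
</head>
<body>
<!-- The width is set inline because it depends on this particular image -->
<div class="image" style="width:200px">
<img src="dalmatian.jpg" alt="A Dalmatian dog" width="200" height="150">
<div>A Dalmatian dog.<br><small>Drawing by Liisa Sarakontu.</small></div>
</div>
</body>
</html>
```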
As described above, the caption text can be centered relative to the image by setting a width for the text and using align="center" (HTML) or text-align: center (CSS) for it.
On the other hand, if you wish to center the image and
its caption as a whole horizontally, then you can simply use
align="center"
in the table
tag,
if you are using one of the table approaches. In the
div
approach, you would use CSS. You could use CSS
in the table approach too, of course.
Note that centering tables and other blocks is surprisingly
problematic. Many constructs that might be expected to center a block
will actually center each line instead, depending on browser.
Please refer to the excellent treatises by Nick Theodorakis:
Centering tables
and
Centering blocks with CSS.
The following example shows an image as centered so that the
caption under it is left-aligned to the left edge of the image.
A simple way to achieve this is to use the two-cell table approach, with
align="center"
for the table
element
and with the alignment of cells (td
) defaulted to
align="left"
.
[Example rendering: a centered image with the left-aligned caption “A Dalmatian dog. Drawing by Liisa Sarakontu.” below it]
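A similar result can be sketched in CSS with the div approach, assuming a browser that handles automatic side margins on a block with a set width (some old browsers do not). The width and the image file name are placeholders.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<title>Centered image with a left-aligned caption (sketch)</title>
<style>
/* A set width plus automatic side margins centers the block as a whole */
div.image { width: 200px; margin: 1em auto; }
/* The caption stays left-aligned, so it starts at the left edge of the image */
div.image div { text-align: left; }
</style>
</head>
<body>
<div class="image">
<!-- dalmatian.jpg is a placeholder file name -->
<img src="dalmatian.jpg" alt="A Dalmatian dog" width="200" height="150">
<div>A Dalmatian dog.<br><small>Drawing by Liisa Sarakontu.</small></div>
</div>
</body>
</html>
```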
Using the align
attribute in an img
element,
you can float an image so that it appears on the right or on the left of
some text, so that the text flows on the other side of the image.
You can use a more modern approach as well, the float
property in CSS. It’s more logically named as well, since this
is really not about alignment but about floating. Moreover, you should usually
set some left margin for an image floated on the right (and right margin for
an image floated on the left), and CSS is the only way to do this reasonably.
Thus, a simple way to float an image would be to use the attribute
style="float: right; margin-left: 0.5em"
in an img
tag.
[Example rendering: an image with the caption “A Dalmatian”, floated on the right]
It is almost as easy to do the same when the image has a caption.
Actually, such techniques were already used previously on this page.
In the table-based approaches, you can just use align="right"
in the table
, or float: right
for it in CSS.
In the third approach, it is clearly best to use the CSS method, since
there is no direct way to float a div
in HTML. Here, too,
CSS is the way to set a margin so that text does not come too close to the
image.
To end floating, you can either use
<br clear="all">
in HTML or
clear: both
in CSS (for the first element that
should appear with no floating elements on either side).
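As an illustration, floating a captioned image with the div approach and ending the floating at the next heading could be sketched as follows. The margin values, the sample text, and the image file name are arbitrary choices for this sketch.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<title>Floated image with a caption (sketch)</title>
<style>
/* Float the block to the right; margins keep the text away from it */
div.image { float: right; width: 200px; margin: 0 0 0.5em 0.5em; }
div.image div { font-size: 80%; text-align: center; }
/* Headings start below any floated block */
h2 { clear: both; }
</style>
</head>
<body>
<div class="image">
<!-- dalmatian.jpg is a placeholder file name -->
<img src="dalmatian.jpg" alt="A Dalmatian dog" width="200" height="150">
<div>A Dalmatian</div>
</div>
<p>This text flows on the left side of the floated image and its caption.</p>
<h2>This heading appears below the image</h2>
</body>
</html>
```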
If you have a set of images and you would like to present them as a collection on one page so that there are several images side by side, there are several approaches.
A common approach is to use a table, with images in one row, captions in another, then more images in a third row, etc. This approach does not linearize well, since when processed rowwise, the connection between images and captions is lost. But more importantly, it requires a fixed layout, with a fixed number of images in one row. This means that the page requires a minimum width to be viewed without horizontal scrolling, and on the other hand it does not utilize the full available width in a wide window.
The goal here is to make an image gallery adapt to the available width. For simplicity, let’s assume that the images are of equal size.
In the simplest case, you could just write img
elements in
succession. A browser will then present the images so that it puts as
many images side by side as fits to the available width.
In effect, a browser treats img
elements as big letters
and processes a string of images as if it were text consisting of such
letters. The following string of identical images illustrates this.
I use a space between the img
elements
in HTML source. This tends to cause some spacing between the images
on common browsers. Whether this is correct is debatable. In any
case, if you don’t want any spacing, don’t leave those
spaces or line breaks between img
elements. Instead,
you can put line breaks e.g. after the element name img
before the attributes, where they cause no effect. And if exact spacing
is important, do the same and use CSS properties to suggest specific
margin or padding.
If we wish to put captions under the images,
things become more complicated, but not much.
We can float the elements that contain an
image and its caption. We would use the methods discussed above,
except that we float to the left, using
float: left
in CSS or
align="left"
in HTML for a table.
We probably want to have some spacing in the gallery.
A simple way is to put some margin on the right and below each image.
For this, we can wrap the elements inside a div
element with some class, say class="gallery"
, and use
CSS code like the following:
.gallery table { float: left;
margin: 0 5px 20px 0; }
This leaves a 5 pixel space on the right of each image and a 20 pixel space below each image.
Remember to stop floating after the gallery, using the techniques mentioned above.
If the caption texts vary essentially in length,
you need to consider how to
make their boxes equal in size in rendering. This usually requires
you to guess a reasonable height for the boxes.
Moreover, to make the texts vertically aligned to the top
(that is, the bottom of the image), it is simplest to use the
two-cell table approach. In that case, you can simply use
valign="top"
(in HTML) or
vertical-align: top
(in CSS) for the cells.
In the next
example, the height of caption cells has
been set to 4em.
[Example gallery: five images side by side, four with the caption “An ornament” and one with the caption “This is a caption that is essentially longer than the other captions.”]
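A gallery along these lines could be sketched as below, using the two-cell table approach. The image file name ornament.gif, the 100 pixel image size, and the class name after-gallery are assumptions made for this sketch.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<title>Fluid gallery with captions (sketch)</title>
<style>
/* Each image-plus-caption table floats, so as many fit per row as the width allows */
.gallery table { float: left; width: 100px; margin: 0 5px 20px 0; border-collapse: collapse; }
.gallery td { padding: 0; }
/* Equal caption heights keep the rows even; text starts right below the image */
.gallery td.caption { height: 4em; vertical-align: top; font-size: 80%; }
/* Stop floating after the gallery */
.after-gallery { clear: both; }
</style>
</head>
<body>
<div class="gallery">
<table>
<tr><td><img src="ornament.gif" alt="" width="100" height="100"></td></tr>
<tr><td class="caption">An ornament</td></tr>
</table>
<table>
<tr><td><img src="ornament.gif" alt="" width="100" height="100"></td></tr>
<tr><td class="caption">This is a caption that is essentially longer than the other captions.</td></tr>
</table>
<table>
<tr><td><img src="ornament.gif" alt="" width="100" height="100"></td></tr>
<tr><td class="caption">An ornament</td></tr>
</table>
</div>
<p class="after-gallery">Text after the gallery.</p>
</body>
</html>
```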
A caption should not be confused with an
alt
attribute,
which specifies the textual alternative to be
presented in place of the image, when the image itself is not
presented (e.g., on a text-only browser). Neither of these
should be confused with the title
attribute,
which specifies an “advisory title” for an element,
typically implemented as a tooltip that is displayed when the
pointer is moved over the element.
If an image is purely decorative or just visualizes
something that has been said in the text, it is appropriate
to use an empty alternate text, alt=""
.
In that case, when accessing the page without images,
the page would appear as if the image were not there at all.
This however creates problems if the image has a caption.
The caption text would appear on its own, leaving the user
in confusion: what does this relate to? Thus, in such cases,
it might be suitable to include the caption text into the
image itself, using image processing software.
Normally, on the other hand, if an image has a caption, it is probably a
content image and the caption text just describes what the image is
about, instead of conveying its full message. Then the odds are that
it would be better to have the caption read first, giving those
users who have some way of accessing images (maybe the user is
just surfing with images disabled?) a basis for deciding whether to
try to access this particular image.
The easiest way to achieve this (and still make the caption appear
below the image in visual rendering) is to use the method
of a single-cell table and a caption
element with
align="bottom"
.
Unfortunately, there’s no way to suppress a caption in non-visual
rendering except by making the caption part of the image.
For example, if your page contains
some article that tells about some meeting and is illustrated
by a photo of the meeting, with a caption, then both the photo
and the caption should probably be omitted in non-visual rendering.
In that case it’s probably the least of evils to use a short alt
text like
"(photo of the meeting)"
.
Putting the caption into the image itself might not be practical enough,
and besides, it might be relevant to the user to know that an image
is available even if they cannot (for now) see the image.
For some additional notes, see
section
When an image says more than a thousand words
in Guidelines on alt
texts in
img
elements.
Why not dl markup?
For some odd reason, the suggestion to use dl
(Definition List) markup pops up fairly often. Logically, it makes no
sense; such markup
should be reserved for genuine definitions of terms,
as discussed in
Definition: a definition and an analysis.
Presentationally, it creates a rendering that is rather poor,
as shown below. Although it might be possible to tune the rendering
using CSS, this would be more difficult and less reliable than
styling simple div
elements.
The default rendering of
<dl> <dt> <img ...> </dt>
<dd>caption text</dd> </dl>
on your current browser is the following:
The reason why browsers render the construct that way has nothing
to do with images or captions. They render a dl
element
so that the dt
elements are indented somewhat and the
dd
elements are indented even more, and each of those elements
starts on a new line:
So this is why the caption text gets indented relative to the image. Such indentation is generally not suitable, since normally captions should be either left-aligned or centered with respect to the image. But if desired, the indentation can be achieved very simply, and with a controlled amount of indentation, in the approaches described above, e.g. by setting a left margin for the caption.
If a speech-based browser implemented a dl
element according to its defined
semantics (ignoring any examples in the specification that contradict
that), it would be natural to read
<dl><dt>xxx</dt><dd>yyy</dd></dl>
as follows: “Definition list. Term: xxx. Definition data: yyy.
End of definition list.” Current browsers probably don’t do that,
but would you really like to fear that some browsers start
behaving by the specs? (Maybe there is no fear, because the HTML5 drafts
effectively turn dl
to a list of paired items
with no real semantics.)
Using a definition list with a single dt
element and a single dd
element inside would be
semantically odd. A list can have just one element, though
it’s a rather pathetic list and makes sense in special
cases only. But this is not the main point. The point is that
neither an image nor its caption
is a term being defined. Well, except in a very special example like the
following, which illustrates the absurdity of using dl
markup for normal combinations of an image and its caption:
<dl><dt><img alt="mass" src="mass.gif"></dt>
<dd>a fundamental property of matter</dd></dl>
Here mass.gif would refer to an image that consists of the word “mass” in some appearance.
The dl
element is in practice just a visual layout trick, and a coarse
and unreliable trick at that. Quite often
the layout would not even be suitable but needs tedious styling.
Besides, the dl element
is more difficult to style than most elements, since its default
rendering is complicated and hard to describe, and there are quirks
in CSS implementations that make the styling even harder.
figure and figcaption markup
According to HTML drafts, figure markup can be used as a container for an illustration (such as one or more images), with a figcaption element inside it giving a caption for the image or images.
This means markup like the following:
<figure>
<img src="..." alt="...">
<figcaption>caption text</figcaption>
</figure>
This would solve the problem of semantic association between captions and images, if supported by relevant software. It remains to be seen whether search engines will recognize such associations.
For minimally acceptable rendering, you currently need at least the following (the script is needed for IE up to and including IE 8 to make it recognize the markup at all):
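<!-- Place this in the head, before any figure element is parsed -->
<script>
// Make IE 8 and older recognize the elements so that they can be styled
document.createElement("figure");
document.createElement("figcaption");
</script>
<style>
figure, figcaption { display: block; }
</style>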
You should probably also add some top and bottom margin
for figure
and also some left margin.
HTML5 drafts suggest a left margin of 40px, but this is currently
not what browsers usually do. So you should explicitly specify
the left margin you want.
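For instance, explicit margins for figure could be set as in the following sketch. The values 1em and 2em are arbitrary examples, not recommendations from any specification, and the image file name is a placeholder.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<title>figure with explicit margins (sketch)</title>
<style>
figure, figcaption { display: block; }
/* Top and bottom margin 1em, no right margin, left margin 2em */
figure { margin: 1em 0 1em 2em; }
figcaption { font-size: 80%; }
</style>
</head>
<body>
<figure>
<!-- dalmatian.jpg is a placeholder file name -->
<img src="dalmatian.jpg" alt="A Dalmatian dog" width="200" height="150">
<figcaption>A Dalmatian</figcaption>
</figure>
</body>
</html>
```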
In order to have the caption rendered e.g. below the image
in a box that is as wide as the image, it is probably best
to use a small script on the page. The script can traverse
the figure
elements on the page and set the width of such
an element equal to the img
element contained in it,
if there is just one img
element there.
Similar techniques can, of course, be also applied when some
other markup is used for image captions.
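Such a script might be sketched as follows. This is only one possible way to do it. It assumes a browser that supports addEventListener, and it uses the rendered width of the image (offsetWidth), which is known only after the page has loaded. The image file name is a placeholder.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<title>figure as wide as its image (sketch)</title>
<style>
figure, figcaption { display: block; }
figure { margin: 1em 0 1em 2em; }
</style>
</head>
<body>
<figure>
<!-- dalmatian.jpg is a placeholder file name -->
<img src="dalmatian.jpg" alt="A Dalmatian dog">
<figcaption>A Dalmatian dog, shown here with a caption that is long enough to wrap.</figcaption>
</figure>
<script>
// Wait until the page, including its images, has loaded, so that widths are known
window.addEventListener("load", function () {
  var figures = document.getElementsByTagName("figure");
  for (var i = 0; i < figures.length; i++) {
    var images = figures[i].getElementsByTagName("img");
    // Handle only the simple case: exactly one image, with a known width
    if (images.length === 1 && images[0].offsetWidth > 0) {
      figures[i].style.width = images[0].offsetWidth + "px";
    }
  }
});
</script>
</body>
</html>
```

Without scripting enabled, the caption simply uses the available width, as in the default rendering.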
For a different view on image captions, see
CSS: figures & captions
by Bert Bos. I don’t see any reason to use
paragraph (p
) markup in a simple structure
consisting of an image and its caption. But if you use it,
note that paragraphs typically have default rendering that
involves top and bottom margins, though they might be suppressed
if the paragraph is inside a table cell.
See also Scalable Figures and Captions with CSS and HTML by Robert J. O’Hara. It discusses, among other things, the distinction between a legend (extended prose) and a caption (a descriptive word or phrase only). Both are treated as captions in my document, but it is useful to note that there can be different “captions” that should be styled differently.
Technically, it is possible to include a caption in the image itself using a suitable graphics program. Although that’s a simple approach and although many programs generate such images automatically, it has essential drawbacks. Text that has been “burned” into an image is not directly accessible as text to programs, and its font cannot be changed the same way as the font of normal text can. If you need to change the text, you need to manipulate the image instead of simply editing text. Thus, if you wish to use the image in documents in different languages, things get awkward. Moreover, e.g. Google image search is based on searching for images using keywords, and Google associates words with images by their appearance near each other (and in some other ways). A caption text embedded into an image itself is of course not accessible to Google, but if the caption text appears as real text right after the image, Google may find the image when someone searches with words that appear in the caption.
Yet another approach is to wrap an image and its caption in a container and
declare it as an inline block, using
display: inline-block
as described in the
CSS 2.1 draft. This approach would have some rather nice features especially
in fluid galleries, but unfortunately browser support is still too small.
In particular, IE has some bugs (e.g., the default width is 100% if the container
is block-level in the HTML sense) and Firefox 2 lacks support. In some years, though,
this might become a feasible alternative.
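For browsers that do support inline blocks, a fluid gallery could be sketched like this. The image file name ornament.gif, the sizes, and the margins are assumptions of this sketch. Note that spaces and line breaks between the containers in the HTML source may add a little extra spacing, much as with successive img elements.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<title>Gallery with inline blocks (sketch)</title>
<style>
/* Each container behaves like a big letter on a text line, so rows wrap by themselves */
.gallery div.image { display: inline-block; vertical-align: top; width: 100px; margin: 0 5px 20px 0; }
.gallery div.image div { font-size: 80%; }
</style>
</head>
<body>
<div class="gallery">
<div class="image"><img src="ornament.gif" alt="" width="100" height="100"><div>An ornament</div></div>
<div class="image"><img src="ornament.gif" alt="" width="100" height="100"><div>This is a caption that is essentially longer than the other captions.</div></div>
<div class="image"><img src="ornament.gif" alt="" width="100" height="100"><div>An ornament</div></div>
</div>
<p>Text after the gallery; no clearing is needed, since nothing is floated.</p>
</body>
</html>
```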