This document lists the various space characters in Unicode. For a description, consult chapter 6, Writing Systems and Punctuation, and the block description for General Punctuation in the Unicode standard. This document also lists three characters that have no width and can thus be described as no-width spaces.
The third column of the following table shows the appearance of the space character, in the sense that the cell contains the words “foo” and “bar” in bordered boxes separated by that character. It is possible that your browser does not present all the space characters properly. This depends on the font used, on the browser, and on the fonts available in the system.
Code | Name of the character | Sample | Width of the character |
---|---|---|---|
U+0020 | SPACE | foo bar | Depends on font, typically 1/4 em, often adjusted |
U+00A0 | NO-BREAK SPACE | foo bar | As a space, but often not adjusted |
U+1680 | OGHAM SPACE MARK | foo bar | Unspecified; usually not really a space but a dash |
U+180E | MONGOLIAN VOWEL SEPARATOR | foobar | 0 |
U+2000 | EN QUAD | foo bar | 1 en (= 1/2 em) |
U+2001 | EM QUAD | foo bar | 1 em (nominally, the height of the font) |
U+2002 | EN SPACE (nut) | foo bar | 1 en (= 1/2 em) |
U+2003 | EM SPACE (mutton) | foo bar | 1 em |
U+2004 | THREE-PER-EM SPACE (thick space) | foo bar | 1/3 em |
U+2005 | FOUR-PER-EM SPACE (mid space) | foo bar | 1/4 em |
U+2006 | SIX-PER-EM SPACE | foo bar | 1/6 em |
U+2007 | FIGURE SPACE | foo bar | “Tabular width”, the width of digits |
U+2008 | PUNCTUATION SPACE | foo bar | The width of a period “.” |
U+2009 | THIN SPACE | foo bar | 1/5 em (or sometimes 1/6 em) |
U+200A | HAIR SPACE | foo bar | Narrower than THIN SPACE |
U+200B | ZERO WIDTH SPACE | foobar | 0 |
U+202F | NARROW NO-BREAK SPACE | foo bar | Narrower than NO-BREAK SPACE (or SPACE), “typically the width of a thin space or a mid space” |
U+205F | MEDIUM MATHEMATICAL SPACE | foo bar | 4/18 em |
U+3000 | IDEOGRAPHIC SPACE | foo bar | The width of ideographic (CJK) characters. |
U+FEFF | ZERO WIDTH NO-BREAK SPACE | foobar | 0 |
Previously, MONGOLIAN VOWEL SEPARATOR (U+180E) was classified as a space character; it is now classified as a format character (with no width). The characters ZERO WIDTH SPACE (U+200B) and ZERO WIDTH NO-BREAK SPACE (U+FEFF) were never classified as space characters in Unicode, despite their names.
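The classification can be checked programmatically. The following is a minimal sketch using Python's standard unicodedata module; the function name space_report is only illustrative. It lists the General Category of each character in the table above, where Zs means space separator and Cf means format character. The result for U+180E depends on the Unicode version bundled with the interpreter; the sketch assumes Unicode 6.3 or later (Python 3.4 or later).

```python
import unicodedata

# Code points from the table above.
CODE_POINTS = [
    0x0020, 0x00A0, 0x1680, 0x180E,
    *range(0x2000, 0x200C),  # U+2000..U+200B
    0x202F, 0x205F, 0x3000, 0xFEFF,
]


def space_report(code_points=CODE_POINTS):
    """Return (code, name, general category) for each code point."""
    return [
        (f"U+{cp:04X}", unicodedata.name(chr(cp)), unicodedata.category(chr(cp)))
        for cp in code_points
    ]


if __name__ == "__main__":
    for code, name, category in space_report():
        kind = "space separator" if category == "Zs" else "not a space separator"
        print(f"{code}  {category}  {name}  ({kind})")
```

With a sufficiently recent Unicode database, the three no-width characters should be reported as Cf and all the others as Zs.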
ZERO WIDTH SPACE, when supported, can be used to indicate a line breaking opportunity within a string. Similarly, ZERO WIDTH NO-BREAK SPACE can be used between two characters to “glue” them together, so that no line break appears between them even if normal processing rules would allow that.
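As an illustration of the first use, the sketch below inserts ZERO WIDTH SPACE after selected characters in a long string, such as a file path, so that a renderer supporting the character may break the line there. The function names and the default set of characters are assumptions made for this example, not part of any standard.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE


def add_break_opportunities(text, after="/-_"):
    """Insert ZERO WIDTH SPACE after each character listed in `after`."""
    out = []
    for ch in text:
        out.append(ch)
        if ch in after:
            out.append(ZWSP)
    return "".join(out)


def strip_break_opportunities(text):
    """Remove every ZERO WIDTH SPACE, restoring the original string."""
    return text.replace(ZWSP, "")
```

Because the inserted character is invisible, it is worth stripping it again before the string is compared, searched, or copied into a context such as a URL, where an extra character would matter.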
The characters U+2000…U+2006, when implemented in a font, usually have the specific width defined for them, though small deviations exist. Their widths are defined in terms of the em unit, i.e. the size of the font.
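Since these widths are fractions of the em, the nominal width at a given font size is simple arithmetic. The sketch below uses the values from the table above; the names NOMINAL_EM and nominal_width are illustrative, and the widths actually found in a font may deviate slightly.

```python
from fractions import Fraction

# Nominal widths in em, taken from the table above.
NOMINAL_EM = {
    "\u2000": Fraction(1, 2),  # EN QUAD
    "\u2001": Fraction(1),     # EM QUAD
    "\u2002": Fraction(1, 2),  # EN SPACE
    "\u2003": Fraction(1),     # EM SPACE
    "\u2004": Fraction(1, 3),  # THREE-PER-EM SPACE
    "\u2005": Fraction(1, 4),  # FOUR-PER-EM SPACE
    "\u2006": Fraction(1, 6),  # SIX-PER-EM SPACE
}


def nominal_width(ch, font_size):
    """Nominal width of `ch` in the same unit as `font_size` (e.g. pt or px)."""
    return NOMINAL_EM[ch] * font_size
```

For example, at a font size of 12 pt, THREE-PER-EM SPACE is nominally 4 pt wide and SIX-PER-EM SPACE 2 pt wide.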
The characters U+2007…U+200A and U+202F have no exact width assigned to them in the standard, and implementations may deviate considerably even from the suggested widths. Moreover, when concepts with the same names, such as “thin space”, are used in publishing software, the meanings can be rather different. For example, in InDesign, “thin space” is now 1/8 em (i.e. 0.125 em, as opposed to the suggested 0.2 em) and “hair space” only 1/24 em (i.e. about 0.042 em, whereas the width of a THIN SPACE glyph typically varies between 0.1 em and 0.2 em).
Web browsers and other programs may fail to render all space characters according to their definitions or descriptions. Many commonly used fonts lack some of the space characters. The situation has improved over the years, but caution is still needed especially when text data may need to be transferred from one program to another or may be viewed using different fonts.
Modern browsers can usually find a glyph for a character if some of the fonts in the system contain it. This does not always take place, however; see Guide to using special characters in HTML. Moreover, font substitution may cause undesired effects, since the widths of characters vary by font.
The use of various space characters of specific width, such as THIN SPACE, is often an unnecessary risk. Consider using other methods, such as the features of a text processing program or (on Web pages) CSS properties like padding, margin, word-spacing, and letter-spacing.
In text processing, Web page display, and other contexts, space characters are often “adjustable” in the sense that they are presented in different widths, especially to satisfy justification requirements. You might see this in effect in this paragraph. Justification often just makes spaces wider, though it may shrink them, too, especially in typesetting.
No-break spaces are defined in Unicode as having the same width as spaces. This does not specify what should happen to them in justification. The common practice has been to treat them as having fixed width (in each font), which means that in adjusted text, spaces and no-break spaces have different effects.
On web browsers, no-break spaces tended to be non-adjustable, but modern browsers generally stretch them on justification. Within justified text on web pages, authors may have used no-break spaces instead of normal spaces to prevent stretching (e.g., as in 5 m instead of 5 m). Due to changes in browser behavior, it is better to use fixed-width spaces instead. Among them, the four-per-em space (e.g., as in 5 m) usually best corresponds to the width of a normal unstretched space. However, the fixed-width spaces act as normal spaces in line breaking, so you may wish to use some technique to prevent undesired line breaks (e.g., as in <nobr>5 m</nobr>). Alternatively, consider using NARROW NO-BREAK SPACE, which is generally treated as non-stretchable in web browsers. It might be adequate in contexts where strings belong together so that they should not be split on two lines and could well be rendered with decreased spacing between them, e.g. in expressions like “10 kg” and “C. S. Lewis”.
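The NARROW NO-BREAK SPACE approach can be sketched as follows. The helper names quantity and tie_units, and the short default list of units, are assumptions made for this example; a real text would need its own list of units.

```python
import re

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE


def quantity(value, unit):
    """Join a number and a unit with NARROW NO-BREAK SPACE."""
    return f"{value}{NNBSP}{unit}"


def tie_units(text, units=("m", "kg")):
    """Replace a normal space between a digit and a known unit with NNBSP."""
    alternatives = "|".join(re.escape(u) for u in units)
    # Match "digit, space" only when a whole unit follows (word boundary).
    pattern = re.compile(rf"(\d) (?=(?:{alternatives})\b)")
    return pattern.sub(lambda m: m.group(1) + NNBSP, text)
```

The word-boundary check keeps the replacement from touching ordinary words that merely begin with a unit symbol, as in “3 men”.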
The change in the treatment of no-break spaces, though inconvenient, is consistent with changes in CSS specifications. For example, clause 7 Spacing of CSS Text Module Level 3 (Editor’s Draft 24 Jan. 2019) defines the no-break space, but not the fixed-width spaces, as a word-separator character, stretchable on justification.
The Unicode standard describes the adjustment process and the intended role of specific-width space characters as follows:
The fixed-width space characters (U+2000..U+200A) are derived from conventional (hot lead) typography. Algorithmic kerning and justification in computerized typography do not use these characters. However, where they are used (for example, in typesetting mathematical formulae), their width is generally font-specified, and they typically do not expand during justification. The exception is U+2009 THIN SPACE, which sometimes gets adjusted.
The EM QUAD character is canonically equivalent to EM SPACE. The intended difference seems to be in the code chart note for the latter: “may scale by the condensation factor of a font”. There is no such note for EN SPACE to make it any different from EN QUAD. It is not clear what “condensation factor” means here.
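A practical consequence of canonical equivalence is that Unicode normalization replaces the quads with the corresponding spaces, so the distinction does not survive in normalized text. A minimal sketch with Python's standard unicodedata module (the function name is illustrative):

```python
import unicodedata

EN_QUAD, EM_QUAD = "\u2000", "\u2001"
EN_SPACE, EM_SPACE = "\u2002", "\u2003"


def survives_normalization(ch, form="NFC"):
    """Return True if `ch` is left unchanged by the given normalization form."""
    return unicodedata.normalize(form, ch) == ch


if __name__ == "__main__":
    for ch in (EN_QUAD, EM_QUAD, EN_SPACE, EM_SPACE):
        normalized = unicodedata.normalize("NFC", ch)
        print(f"U+{ord(ch):04X} -> U+{ord(normalized):04X}")
```

Both NFC and NFD map EN QUAD to EN SPACE and EM QUAD to EM SPACE, while the two space characters themselves are left unchanged.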
The MEDIUM MATHEMATICAL SPACE character was added in Unicode version 4.0.
Regarding the non-breaking property of no-break space and other characters, see Unicode line breaking rules: explanations and criticism.
Microsoft’s page Space Characters Design Standards says: “In digital fonts there are only two kinds of space characters supported by most computers, the space and the no-break space.” This is somewhat misleading, since the support depends on fonts rather than computers, except for no-break space support, which depends on programs.
Alan Wood’s excellent Unicode resources contain a page on the General Punctuation block, with widths of space characters illustrated graphically.
See also: Styling spaces in CSS.
This paragraph is here for demonstration purposes only, and it contains normal SPACE characters between words.
This paragraph is here for demonstration purposes only, and it contains SIX-PER-EM SPACE characters instead of normal SPACE characters between words.
There are some graphic characters that can be used as symbols for a space. Though sometimes called visible spaces, they are not spaces at all but visible notations used to indicate the appearance of spaces in instruction manuals and descriptions of texts.
The following table lists some symbols, in decreasing order by practical usefulness. Their shapes vary by font; especially the last one varies a lot.
Symbol | Code | Name |
---|---|---|
␣ | U+2423 | OPEN BOX |
␢ | U+2422 | BLANK SYMBOL |
␠ | U+2420 | SYMBOL FOR SPACE |
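For instance, a string can be prepared for display in a manual by replacing each normal SPACE with one of these symbols. This is a minimal sketch; the function name show_spaces is illustrative, and OPEN BOX is used as the default only because it is listed first above.

```python
OPEN_BOX = "\u2423"      # ␣
BLANK_SYMBOL = "\u2422"  # ␢


def show_spaces(text, symbol=OPEN_BOX):
    """Replace each U+0020 SPACE with a visible symbol for display purposes."""
    return text.replace(" ", symbol)
```

The result is meant for presentation only, since the symbols are graphic characters and are not treated as spaces by software.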