Using accented letters and other ISO Latin 1 characters in Emacs

Introduction: character set standards

ISO 8859 is an international standard which defines several different eight-bit character sets. The first of them, ISO 8859-1, also called ISO Latin 1, is intended for the languages of Western Europe ("Western" is to be understood in a broad sense, mainly as opposed to the Slavic languages). It contains, among other things, accented letters like á and other national characters. Finnish readers may wish to consult my article ISO-Latin-1-merkistöstä for pragmatic information about the use of ISO Latin 1 characters in various languages and contexts. For information on character sets in general, see my tutorial on character code issues.

I have composed a compact description of ISO Latin 1, containing just the octal codes, the characters themselves, and their "official" English names.

ISO Latin 1 is often described as the standard eight-bit character set, and it is rather extensively supported by modern software. It is largely similar to the Windows character set. The difference is that ISO Latin 1 reserves the character codes 200 - 237 (octal) for control characters, whereas in Windows some of those codes are used for printable characters.
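The difference in the range 200 - 237 (octal) can be checked with a small script. The following is only a sketch in Python, which is not used elsewhere in this document. It assumes that "the Windows character set" means the Western European code page which Python calls cp1252, and the function name describe is made up for this illustration.

```python
import unicodedata

def describe(code):
    """Decode one byte value as ISO Latin 1 and as cp1252 (assumed Windows set)."""
    raw = bytes([code])
    latin1 = raw.decode("latin-1")       # always succeeds: every byte is defined
    try:
        windows = raw.decode("cp1252")   # some positions are unassigned in cp1252
    except UnicodeDecodeError:
        windows = None
    return latin1, windows

if __name__ == "__main__":
    for code in range(0o200, 0o240):
        latin1, windows = describe(code)
        kind = unicodedata.category(latin1)   # "Cc" means control character
        # Show the Windows character as a code point to keep the output ASCII.
        shown = f"U+{ord(windows):04X}" if windows is not None else "(unassigned)"
        print(f"{code:03o}  Latin 1 category: {kind}  Windows: {shown}")
```

Outside this range, for example at octal 351 (e with acute), the two decodings give the same character.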

A new standard, ISO 10646-1, is basically a 16-bit or 32-bit character set, containing ISO Latin 1 as a subset. The basic problem with it is that it is not yet supported by computer software, and due to its size it is necessary to define reasonable-size subsets of it for practical use. This definition work is currently in progress.
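The subset relation means that each ISO Latin 1 code is also the ISO 10646 code of the same character. A short Python sketch of this follows. It assumes that Python's character numbering (Unicode) coincides with ISO 10646, and it uses the UTF-16 and UTF-32 encodings as stand-ins for the 16-bit and 32-bit forms. The function names are invented for the example.

```python
def latin1_is_subset():
    """Check that byte value n in ISO Latin 1 decodes to code point n."""
    return all(bytes([n]).decode("latin-1") == chr(n) for n in range(256))

def wide_forms(char):
    """Return the big-endian 16-bit and 32-bit encodings of one character."""
    return char.encode("utf-16-be"), char.encode("utf-32-be")

if __name__ == "__main__":
    print(latin1_is_subset())
    # a with acute: octal 341 in ISO Latin 1, padded with zero bytes here
    print(wide_forms("\u00e1"))
```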

Using ISO Latin 1 under Unix

The use of ISO 8859-1 under Unix may require some operations by the user, as indicated in a previous reply, and of course it depends on the user's terminal or other equipment whether the characters are displayed correctly. On the computers of HUT Computing Centre, no special actions are required unless the user has modified the default environment, in which case he should consult the (old) instructions Ohjeita kahdeksanbittisyyteen siirtymisestä (in Finnish).

Using ISO Latin 1 in Emacs

In Emacs, one can use the ISO Accents mode, entered by the Emacs command

M-x iso-accents-mode

where M-x stands for meta-x, which is usually produced by pressing first the ESC (or Escape) key, then the x key.

The mode belongs to the GNU Emacs distribution - if you haven't got it, ask your local Unix support to install a newer version of Emacs.

In this mode, for example, typing the apostrophe character ' immediately followed by the letter a produces an accented a (á). Of course, you must be careful in this mode when you really want to enter e.g. ' as a character.

More exactly, in the ISO Accents mode the accent character keys `, ', ", ^, / and ~ do not self-insert; instead, they modify the following letter key so that it inserts an accented letter. Examples:

combination   result
`o            ò
'o            ó
"o            ö
^o            ô
/o            ø
~o            õ

Special combinations:

combination   description of result             result
~c            a c with cedilla                  ç
~d            an Icelandic eth (d with dash)    ð
~t            an Icelandic thorn                þ
"s            German sharp s                    ß
/a            a with ring                       å
/e            an a-e ligature                   æ
~<            left guillemet                    «
~>            right guillemet                   »
~!            an inverted exclamation mark      ¡
~?            an inverted question mark         ¿

Using an upper-case letter in the above-mentioned combinations results in the corresponding upper-case national character. For example, 'A produces Á.
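The combinations listed above can be modelled as a lookup from an accent key and a following key to one ISO Latin 1 character. The following Python sketch is only an illustration of the tables and does not reproduce how the mode is implemented in Emacs. The names COMBINING, SPECIAL and compose are invented here, and returning None for combinations not listed above is a choice made for this sketch, not documented behaviour of the mode.

```python
import unicodedata

# Accent keys that correspond to a Unicode combining mark.
COMBINING = {
    "`": "\u0300",  # grave
    "'": "\u0301",  # acute
    "^": "\u0302",  # circumflex
    "~": "\u0303",  # tilde
    '"': "\u0308",  # diaeresis
}

# Combinations that are not "letter plus accent"; written as escapes.
SPECIAL = {
    "~c": "\u00e7", "~d": "\u00f0", "~t": "\u00fe", '"s': "\u00df",
    "/a": "\u00e5", "/e": "\u00e6", "/o": "\u00f8",
    "~<": "\u00ab", "~>": "\u00bb", "~!": "\u00a1", "~?": "\u00bf",
}

def compose(prefix, key):
    """Return the ISO Latin 1 character for prefix+key, or None if there is none."""
    special = SPECIAL.get(prefix + key.lower())
    if special is not None:
        if key.isupper():
            upper = special.upper()
            # Sharp s has no single upper-case character in ISO Latin 1.
            return upper if len(upper) == 1 else None
        return special
    mark = COMBINING.get(prefix)
    if mark is None or not key.isalpha():
        return None
    composed = unicodedata.normalize("NFC", key + mark)
    try:
        composed.encode("latin-1")   # reject letters outside ISO Latin 1
    except UnicodeEncodeError:
        return None
    return composed

if __name__ == "__main__":
    for prefix, key in [("'", "a"), ("'", "A"), ("/", "o"), ("~", "c")]:
        # Print the octal code so that the output stays plain ASCII.
        print(prefix + key, format(ord(compose(prefix, key)), "03o"))
```

The sketch treats an upper-case key by converting the lower-case result to upper case, which matches the rule stated above for combinations such as 'A.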

The command M-x iso-accents-mode works as a toggle: when you are not in the mode, the command enters the mode, and when you are in the mode, the command leaves the mode.

Sometimes ISO Accents mode is inconvenient, especially if you forget to turn it off after entering the characters you needed. There are alternatives to using the ISO Accents mode for entering ISO Latin 1 characters:

Date of last update: 1998-09-20. Last technical modification: 2004-09-30.

Jukka Korpela.