Info on ISO 8601, the date and time representation standard

This document gives a short description of ISO 8601, the date and time representation standard. It also presents some arguments why it should be applied, especially in Web authoring. Sample code is given for printing date and time in ISO 8601 format in some programming languages. Links to more detailed technical resources are given. This document recommends the following simple format for dates:
1998-05-12 (year-month-day)
and the following format for combined date and time in international contexts:
1998-05-12T10:20Z
though it may improve readability to replace the letter T by a space.

Background: some problems

There are different date and time formats in use in different parts of the world and in different contexts. There are two major practical problems:

  1. A date designation such as 5/6/98 is ambiguous: which of the first two numbers is the month and which is the day? It is interpreted as the 5th of June, 1998, in most countries. However, in the United States it is generally interpreted as the 6th of May, 1998. This has caused a lot of confusion.
  2. Date designations that do not explicitly specify the century can cause serious problems in the 21st century. This "millennium problem" or "year 2000 problem" or "Y2K" is definitely not over yet. On the contrary, now that we are in the 21st century, we face the problem in everyday life.

These problems are combined in notations like 01/02/03. Does it mean 1st of February, 2003, or 2nd of January, 2003, or 2nd of March, 2001, or what? (In some notations, the year precedes the date.) If a product has the text "Use before 03/06/09" without explanation, what do you do?
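
The ambiguity is easy to reproduce in code. The following Python sketch parses the string 01/02/03 under three orderings and prints each result in ISO 8601 notation. The set of orderings and their labels are illustrative assumptions, not a survey of conventions actually in use:

```python
from datetime import datetime

# Three plausible readings of the same string; the labels are illustrative.
READINGS = {
    "day/month/year": "%d/%m/%y",
    "month/day/year": "%m/%d/%y",
    "year/month/day": "%y/%m/%d",
}

def interpretations(text):
    """Return every valid reading of `text` as an ISO 8601 date string."""
    results = {}
    for label, fmt in READINGS.items():
        try:
            results[label] = datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            pass  # under this reading the string is not a valid date
    return results

for label, iso in interpretations("01/02/03").items():
    print(f"{label}: {iso}")
```

Each reading yields a different date, whereas each of the ISO 8601 strings printed can be read in only one way.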

Practical problems are also caused by the ambiguity of time designations:

  1. Does 6:00 refer to six o'clock in the morning or six o'clock in the evening, i.e. 18:00? Adding "AM" or "PM" may help, but that would introduce a language-dependent feature into a notation which is essentially numeric and therefore language-neutral.
  2. What is the frame of reference for time designations? Especially on the Internet, people often refer to times without realizing that there are different time zones in use.

Although some of these problems could be rather easily solved by special solutions in special cases, it is evident that a uniform and universal date and time representation format is needed. For example, in a monolingual context, the first problem (ambiguity of notations like 5/6/98) could be solved by writing the month as a name, not with digits. But this too would introduce an unnecessary language dependency. For example, on a multilingual Web page, one would certainly like to have the "last updated" date expressed just once, in a language-neutral notation.

As an example, consider the following system message which is bilingual for a good reason (from real life, but abbreviated here, and with typos fixed):

* Internet-yhteydet poikki maanantaina 23.11. klo 17 - 20
* ---
* Internet outside of HUT will be inaccessible starting at 5 o'clock PM
* (1700 hours), on Monday, November 23. Estimated duration is
* till 8 o'clock PM (2000 hours) at most.

So the time is expressed in three ways. Users have been observed to get easily confused in such situations. The more you try to explain things in different ways, the more probable it is that the ways get mixed up. Moreover, it all becomes too long for short announcements, headlines, etc. By switching to simple ISO 8601 notations things could be expressed briefly and uniquely:

Internet-yhteydet poikki, Internet connections off:
1998-11-23T17/20 

During a transitional period we would, of course, need to accompany such information with text (in "finer print" when applicable) which expresses the time period in older, language dependent notations.

Automatic processing of data is easier to program if dates and times are expressed in a uniform, preferably fixed-length notation. The format should allow simple comparison and sorting of dates and times, which means that the notation should be either fully descending (with the most significant part, such as the year, expressed first, then the next significant part, such as the month, etc., down to seconds and parts of a second) or fully ascending (just the opposite). It should be noted that such uniformity would be most beneficial for small, tool-like programs, typically created by private persons or small companies. In a large project by a large software vendor, the cost of code for handling a wide variety of date and time formats is relatively small (although perhaps absolutely large).
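
As a small illustration of the sorting argument, the Python sketch below sorts the same four dates as plain strings, once in a fully descending fixed-length notation and once in a day/month/year notation. The sample dates are arbitrary; no date parsing is involved:

```python
# The same four dates in two notations.
iso_dates = ["1998-11-23", "1998-05-12", "2001-04-01", "1982-08-13"]
slash_dates = ["23/11/1998", "12/05/1998", "01/04/2001", "13/08/1982"]

# Plain string sorting, exactly as a generic sort utility would do it.
print(sorted(iso_dates))    # chronological order
print(sorted(slash_dates))  # ordered by day of month, not by date
```

Only the first list comes out in chronological order, because string comparison looks at the most significant component first.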

On the Internet, the notation of times and dates has always been problematic. In particular, the format of Internet e-mail messages, as defined on 1982-08-13 by RFC 822 (with some later modifications), remained valid for a very long time. It specified a relatively uniform notation for date and time. It allowed some variation, but the most common alternative was something like
Fri, 8 May 1998 15:57:33 +0300 (EET DST)
There was enough variation to make it difficult to write simple programs for processing such data, too little variation to please everyone. In 2001-04, RFC 2822 was published as a successor to RFC 822. It restricted the recommended date and time formats to the format exemplified above. Note that this format is hardly used outside the Internet.
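
For programs that receive such e-mail style dates, conversion to ISO 8601 is usually a library call away. The following is a minimal Python sketch using the standard library function email.utils.parsedate_to_datetime; the variable names are arbitrary:

```python
from datetime import timezone
from email.utils import parsedate_to_datetime

# Parse the e-mail style date; the trailing comment "(EET DST)" is ignored.
stamp = parsedate_to_datetime("Fri, 8 May 1998 15:57:33 +0300 (EET DST)")

# Same moment, with the original offset and converted to UTC.
local_text = stamp.isoformat()
utc_text = stamp.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

print(local_text)
print(utc_text)
```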

In addition, different programs use date and time formats differing from the one specified in RFC 822 and RFC 2822. To illustrate the diversity, let us take a look at the Proposed Standard RFC 2068; in the discussion of time and date formats, it says:

HTTP applications have historically allowed three different formats for the representation of date/time stamps:
  Sun, 06 Nov 1994 08:49:37 GMT  ; RFC 822, updated by RFC 1123
  Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
  Sun Nov  6 08:49:37 1994       ; ANSI C's asctime() format

The first format is preferred as an Internet standard and represents a fixed-length subset of that defined by RFC 1123 (an update to RFC 822). The second format is in common use, but is based on the obsolete RFC 850 date format and lacks a four-digit year.
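
A program that has to accept all three formats typically tries them one after another. The Python sketch below does this and emits a single ISO 8601 notation. It assumes that all times are in GMT, as the quoted examples are, and that day and month names are in English (the default "C" locale); the function name is hypothetical:

```python
from datetime import datetime

HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",  # RFC 822, updated by RFC 1123
    "%A, %d-%b-%y %H:%M:%S GMT",  # RFC 850 (two-digit year)
    "%a %b %d %H:%M:%S %Y",       # ANSI C asctime()
)

def http_date_to_iso(text):
    """Convert any of the three HTTP date formats to ISO 8601 in UTC."""
    for fmt in HTTP_DATE_FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # not this format, try the next one
        return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")
    raise ValueError(f"unrecognized HTTP date: {text!r}")

print(http_date_to_iso("Sun, 06 Nov 1994 08:49:37 GMT"))
print(http_date_to_iso("Sunday, 06-Nov-94 08:49:37 GMT"))
print(http_date_to_iso("Sun Nov  6 08:49:37 1994"))
```

Note that the two-digit year in the second format is resolved by a library convention, which is exactly the kind of guesswork a four-digit year avoids.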

How ISO 8601 can be used to address the problems

The ISO 8601 standard, or most officially ISO 8601:2004 Data elements and interchange formats -- Information interchange -- Representation of dates and times, was approved by ISO in 1988 and updated in 2000 and again in 2004. It defines a large number of alternative representations of dates, times, and time intervals. Thus, rather than being the date and time standard, it is just a general framework. To achieve uniformity, we must select one or a few formats from it and apply them consistently.

Luckily, it seems that people who know about ISO 8601 usually stick to the same simple alternatives. The following is an attempt to describe "best current practice" (in the informal sense of this phrase):

Date only format
Use a format like 1998-05-12, always expressing the year in full, followed by the month and then the day. Thus, the example means the 12th of May in 1998. Use exactly two digits for the month and exactly two digits for the day, using leading zeros when necessary. Notice that there is no time zone indication, although dates too are time zone dependent in principle; by default, times are relative to some local time zone. If this is of some concern for dates (i.e. you need to be very exact with them), you could express the date in UTC and append a Z to the date designation to indicate this. But in such cases, the combined date and time format is probably preferable (see below).
Time of the day only format, local (national) use
Use format like 14:15 or 14:15:00, always expressing hours and minutes and seconds (if present) each with exactly two digits. Express the time as local time in the time zone implied by the context. But whenever there is any possibility of misunderstanding what the time zone is, use the next option:
Time of the day only format, international use
Use a format like 14:15Z or 14:15:00Z, always expressing hours and minutes and seconds (if present) each with exactly two digits. Express the time as Coordinated Universal Time (UTC, formerly called Greenwich Mean Time, GMT); the appended Z letter indicates that the time is represented in UTC. Alternatively, use a local time with explicit zone designation as explained in the next item.
Time of the day only format, explicit zone
Append a zone designation in one of the formats +hh:mm, +hhmm, and +hh to a time denotation to indicate that the used local time zone is hh hours and mm minutes ahead of UTC. Examples: 12:15+02:00, 12:15+0200, 12:15+02. Select one of the formats and stick to it within a document. This format is suitable when the time zone may be relevant. An alphabetic time zone designation might even be added in parentheses, e.g. 12:15+02 (EET), but it is not sufficient alone. There is no standard on such designations, and the same string is used for different zones.
Combined date and time format
Use a format where the date designation is followed by the letter T and the time of the day designation, e.g. 1998-05-12T14:15Z. Note that the standard clearly requires the use of T in this context. However, such a notation is often regarded as odd-looking, and people who otherwise use ISO 8601 might deviate from it here by using a space instead.
Period of time format
Use a format where an indication of the start of the period is followed by the slash (solidus) character / and an indication of the end of the period. Of course, one of the formats mentioned above is used for the start and end. However, to allow reasonably short expressions, higher order components of the end designation can be omitted, in which case the corresponding values from the start designation are used. Examples:
1998-05-12T14:15Z/1998-05-13T16:00Z (time interval extending from one day to another)
1998-05-12T14:15Z/16:00Z (time interval within a day)
1998-05-12/15 (time interval from the 12th to 15th of May, 1998).
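
A program reading such abbreviated periods has to restore the omitted components of the end designation. The Python sketch below does so purely by position. It assumes the fixed-length notations recommended above, with the end written in the same form as the corresponding tail of the start; the function name is hypothetical and no validation is attempted:

```python
def expand_interval(interval):
    """Split 'start/end' and fill in components omitted from the end."""
    start, end = interval.split("/")
    if len(end) < len(start):
        # Borrow the missing higher order components from the start.
        end = start[: len(start) - len(end)] + end
    return start, end

print(expand_interval("1998-05-12T14:15Z/1998-05-13T16:00Z"))
print(expand_interval("1998-05-12T14:15Z/16:00Z"))
print(expand_interval("1998-05-12/15"))
```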

Note: The formats described above are compatible with the format described in the Dates and times subsection of the HTML 4.01 Specification for use with certain HTML constructs. However, the format specified there is stricter in the sense that only the combined date and time format is allowed and it must contain the seconds part, but more permissive in the sense that it allows other time zones than UTC, too.

For periods of time, notations such as 1980-85 have often been used. Even if you use an en dash (–) instead of a hyphen and/or surround that punctuation with spaces, misunderstandings may arise. According to ISO 8601, a notation like 2000-02 uniquely means the second month of year 2000, so it is risky to use it, or any similar notation, to denote years from 2000 to 2002. Using 2000/2002 would conform to the ISO 8601 standard, but it could easily be misunderstood as meaning "2000 or 2002". Thus, it is perhaps best to use a horizontal ellipsis (or, as a replacement, three consecutive dots) or the en dash, with the year written in four digits: 2000…2002 or 2000–2002. Writing the year in full would probably remove the possibility of misunderstanding when using the en dash or even when using a hyphen as a replacement for an en dash (2000-2002). But these notations do not conform to ISO 8601. It specifies that the slash (solidus) "/" is used as the separator, with the following somewhat vague note: "In certain application areas a double hyphen is used as a separator instead of a solidus.". (Notations like 2000--2002 were promoted by previous versions of the standard.)

The ISO 8601 standard does not specify whether a date or time (or date and time) designation refers to a singular point in time or a time period. In particular, a designation of a date can be used to refer to a full 24-hour day or a specific moment of time within it, probably by default the start of the day (00:00). Similarly, a time notation like 9:00 could refer to nine o'clock absolutely sharp or the period from 09:00 to 09:01 or anything else. When necessary, a specific agreement or verbal indication of the meaning can be given, or the most explicit notation with ISO 8601 could be used. For example, one could write 09:00:00 or 09:00:00/09:01:00 to distinguish between the two interpretations mentioned above.
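
When a date designation is meant as a full day, a program can make that explicit by spelling out the period. The following Python sketch shows one possible convention, namely a period from 00:00 of the given day to 00:00 of the next day; both the convention and the function name are assumptions made for illustration:

```python
from datetime import date, timedelta

def day_as_interval(day_text):
    """Spell out a date designation as an explicit start/end period."""
    day = date.fromisoformat(day_text)
    next_day = day + timedelta(days=1)
    return f"{day.isoformat()}T00:00/{next_day.isoformat()}T00:00"

print(day_as_interval("1998-05-12"))
print(day_as_interval("1998-05-31"))  # the end falls in the next month
```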

Within the European standardization organization CEN, a so-called CEN Workshop Agreement (CWA) on various notations has been prepared, and it specifies:

For the date and time conventions, the following numeric forms are recommended to be used in a language-independent, pan-European document.

Long date: 1996-04-28
Abbreviated date and time: 1996-04-28 17:22:06
Abbreviated long date: 1996-04-28
Numeric date: 1996-04-28
Time: 09:22:06

The 24 hour system is used in Europe. Thus the time of the day is given in the range from 00:00:00 to 23:59:59, and the possible leap second 23:59:60. No abbreviation is used for before or after noon.

NOTE The abbreviated date and time is given as the combination of the date format and the time format of ISO 8601; as opposed to the combined day-and-time format of said standard, which includes a "T" between day and time.

- CWA 14051-1, Information Technology - European generic locales - Part 1: General specifications, section 4.1.5 (page 10).

There was a European standard, EN 28601, with the same content as ISO 8601. It has, however, been withdrawn. Members of CEN are thus no longer required to have national standards on this issue.

In the modern approach to localization, data is internally stored and processed in a neutral format as far as possible. If localization is desired, such as the presentation of data in a particular language or notation, it is performed as close to the user as possible. This makes it possible to apply user-selected presentation principles. The approach is described in quite some detail in the Common Locale Data Repository (CLDR) material. Apparently, ISO 8601 is the suitable neutral format for dates and times.

Notes on the separators

The basic separators used according to ISO 8601 are the hyphen "-" in a date and the colon ":" in a time designation.

The ISO 8601 standard allows these separators to be omitted (e.g., 19980512 for a date), but expressions are much easier to read when separators are used. The separators also make it more obvious that a date or time is given; a string of digits could mean different things.

The separators can be omitted in internal data formats that are never visible to users. Sometimes they need to be omitted due to technical restrictions or special considerations. For example, if you use file names that correspond to dates (e.g. in news archives), a name like 19980512.html is probably more convenient than 1998-05-12.html.
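
Converting between the notation with separators and the one without is straightforward. Here is a minimal Python sketch; the function names are hypothetical, and going through a date object means that impossible dates are rejected along the way:

```python
from datetime import date, datetime

def to_basic(extended):
    """'1998-05-12' -> '19980512'."""
    return date.fromisoformat(extended).strftime("%Y%m%d")

def to_extended(basic):
    """'19980512' -> '1998-05-12'."""
    return datetime.strptime(basic, "%Y%m%d").date().isoformat()

# A file name for a news archive, as in the example above.
file_name = to_basic("1998-05-12") + ".html"
print(file_name)
print(to_extended("19980512"))
```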

The standard distinguishes the hyphen from the minus sign as well as hyphen-minus, often called ASCII hyphen. (These concepts are explained in the document Dashes and hyphens.) However, it mentions that both hyphen and minus may be mapped to hyphen-minus when the character repertoire is limited, and this is common practice. Moreover, programs that interpret date notations might expect to see hyphen-minus. In principle, however, U+2010 HYPHEN is the most appropriate character for use in ISO 8601 dates, when available (e.g., in text processing when using a font that contains it).
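
A program that must accept dates typed or typeset with U+2010 HYPHEN or with the minus sign can map those characters to hyphen-minus before parsing. This is a small Python sketch under that assumption; the function name is hypothetical:

```python
from datetime import date

# U+2010 HYPHEN and U+2212 MINUS SIGN are mapped to U+002D HYPHEN-MINUS.
TO_HYPHEN_MINUS = str.maketrans({"\u2010": "-", "\u2212": "-"})

def parse_date_lenient(text):
    """Parse an ISO 8601 date regardless of which hyphen character it uses."""
    return date.fromisoformat(text.translate(TO_HYPHEN_MINUS))

print(parse_date_lenient("1998\u201005\u201012"))  # written with U+2010
print(parse_date_lenient("1998-05-12"))            # written with hyphen-minus
```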

When ISO 8601 date notations are used in text (or in tables), there might be a risk of a line break after a hyphen. Although that would not be strictly wrong, it cannot be regarded as good presentation. However, technically it would be incorrect (and often ineffective) to use the non-breaking hyphen character. Usually the problem needs to be handled at levels other than character level, e.g. using markup (see notes on preventing line breaks on web pages).

Writing date and time in some programming languages

As an example of writing code which outputs a date in the ISO 8601 notation, here is the C code for getting the current date and printing it:

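      #include <stdio.h>
      #include <time.h>

      int main(void) {
          time_t now_t;
          struct tm now;
          time(&now_t);                /* current calendar time */
          now = *localtime(&now_t);    /* broken down into local time */
          printf("%4d-%02d-%02d",
                  now.tm_year+1900, now.tm_mon+1,
                  now.tm_mday);
          return 0;
      }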

Naturally, only the printf function call is affected by the date format used. Notice the use of zero in the field designator %02d to force the number to be written with exactly two digits, using leading zero if needed.

As another example, here is Perl code for getting the current date and time and writing it in UTC:

($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
  gmtime(time);
$t = sprintf "%4d-%02d-%02dT%02d:%02dZ\n",
  1900+$year,$mon+1,$mday,$hour,$min;
print $t;

In JavaScript programming, we can expect currently used browsers to support the toISOString method, which yields an ISO 8601 conformant notation. If you just need the date part, you can pick up a substring consisting of the first ten characters (because in ISO 8601, the date part is of fixed length):

var today = new Date();
var dateString = today.toISOString().substring(0, 10);

The following information about clumsier solutions is preserved here mostly for historical reasons: In JavaScript, there are various advanced date functions, such as getFullYear. Previously they were not supported by all JavaScript implementations, so it was safest to use just the basic date functions and "do it yourself" (performing Y2K corrections too). Although this is probably irrelevant nowadays, here is code that constructs an ISO 8601 conformant date notation into the value of the variable dateString using just a few old basic functions:

function getCorrectedYear(year) {
    year = year - 0; /* converting to a number */ 
    if (year < 70) return (2000 + year);
    if (year < 1900) return (1900 + year);
    return year; }

var today = new Date();
var d  = today.getDate();
if(d < 10) d = '0' + d;
var m = today.getMonth() + 1;
if(m < 10) m = '0' + m;
var y = getCorrectedYear(today.getYear());
var dateString = y + '-' + m + '-' + d;

Pete Forman has written more detailed ECMAscript code for determining the date and time.

Concerning Java, see section Dates and Times of the Java FAQ and code written by Simon Brooke.

If you use the strftime function (see Single UNIX® Specification for a description), the following format specification would be suitable: "%Y-%m-%dT%H:%M:%SZ" to get both date and time. This means that if you, as a Web author, use Server Side Includes (SSI), the following should cause the date and time denotation (corresponding to the moment when the server processes and sends the document) to be inserted in ISO 8601 format:

<!--#config timefmt="%Y-%m-%dT%H:%M:%SZ"-->
<!--#echo var="DATE_GMT"-->

But check server-specific documents and test that this works on your server! And if you just want to have the date inserted, you'd use timefmt="%Y-%m-%d".
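
The same strftime format specification can be tried out in any environment that exposes strftime. As an illustration, here is a minimal Python sketch using the standard time module; the function name is hypothetical:

```python
import time

def utc_timestamp(seconds=None):
    """Format a moment (default: now) as ISO 8601 date and time in UTC."""
    moment = time.gmtime(seconds)  # broken-down time in UTC
    return time.strftime("%Y-%m-%dT%H:%M:%SZ", moment)

print(utc_timestamp())   # the current moment
print(utc_timestamp(0))  # the start of the Unix epoch
```

Note that the Z in the format string is a literal, so the time passed to strftime must really be in UTC, which is why gmtime is used here rather than localtime.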

In PHP, you can write the current date and time on the server as follows:

   $now = substr_replace(strftime("%Y-%m-%dT%H:%M:%S%z"), ":", -2, 0);

   echo "Server datetime is ", $now;

In Fortran 90, the following code could be used to print the current date and time (without seconds) in the local time zone using the standard subroutine date_and_time:

      character*8 date
      character*10 time
      character*5 zone
      integer values(8)
      call date_and_time(date,time,zone,values)
      print 100, values(1), values(2), values(3), values(5), values(6)
 100  format(1X,I4,'-',I2.2,'-',I2.2,'T',I2.2,':',I2.2)
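
In Python, the standard datetime module can produce most of the notations recommended in this document directly. The following sketch is one way to do it, not the only one; the function name and the sample moment are hypothetical, and the moment is assumed to carry time zone information:

```python
from datetime import datetime, timedelta, timezone

def iso_formats(moment):
    """Return some recommended notations for a zone-aware datetime."""
    utc = moment.astimezone(timezone.utc)
    with_offset = moment.isoformat(timespec="minutes")
    return {
        "date": moment.date().isoformat(),
        "local time": moment.strftime("%H:%M"),
        "time with zone": with_offset.split("T")[1],
        "combined": with_offset,
        "combined, UTC": utc.strftime("%Y-%m-%dT%H:%MZ"),
    }

# A sample moment in a zone two hours ahead of UTC.
sample = datetime(1998, 5, 12, 14, 15, tzinfo=timezone(timedelta(hours=2)))
for name, text in iso_formats(sample).items():
    print(f"{name}: {text}")
```

For the current moment in the local time zone, datetime.now().astimezone() could be passed instead of the sample value.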

For more code related to formatting dates and times in ISO 8601 notation, see ISO 8601 Date and Time - Converting and implementing by Nikolai Sandved Aasen.

Language dependent notations

It is probably unnecessary to apply these notations in running monolingual text, where language-dependent traditional expressions with the month expressed with a word like "the 4th of July" or "4. heinäkuuta" can be used without problems. But separate date designations, such as date of issue or date of last update, and tabulated dates, are best presented using the variant of ISO 8601 outlined above.

The CLDR (Common Locale Data Repository) activity, coordinated by the Unicode Consortium, has defined a general-purpose formalism (a markup language, LDML) for specifying formats of date and time representation. It has also collected voluminous information about date and time formats in different locales (languages and language variants) represented in that formalism. The general idea is that internally, in data structures and binary files, dates and times should be represented in ISO 8601 format, but externally, when displaying data to users, they should be formatted according to the language of the context and ultimately according to each user's preferences. Of course, the user's preference could be ISO 8601, too.
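
The division of labour can be sketched in a few lines of Python. The display patterns below are hand-written and hypothetical, standing in for what a real application would take from CLDR data; only the stored value is in ISO 8601 notation:

```python
from datetime import date

# Hypothetical display patterns, hard-coded here purely for illustration.
DISPLAY_PATTERNS = {
    "iso": "{y:04d}-{m:02d}-{d:02d}",
    "fi": "{d}.{m}.{y}",
    "en-US": "{m}/{d}/{y}",
}

def display(stored, locale):
    """Format an internally stored ISO 8601 date for a given locale."""
    day = date.fromisoformat(stored)  # neutral internal representation
    pattern = DISPLAY_PATTERNS[locale]
    return pattern.format(y=day.year, m=day.month, d=day.day)

stored_date = "1998-05-12"
for locale in DISPLAY_PATTERNS:
    print(locale, display(stored_date, locale))
```

The stored value never changes; only the last step, closest to the user, depends on the locale or on the user's own preference.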

Links to more information