JavaScript and HTML: possibilities and caveats

In HTML, the form concept is simple and static. This is one reason why people often use client-side scripting, usually in JavaScript, to enhance the functionality of forms. In some cases, this is just fine: it might give extra comfort to some users, without disturbing others. But it is important to know the limitations. Specifically, one should never rely on JavaScript alone in the processing of data entered by the user; to make sure things work reliably, such processing needs to be repeated by a server-side script.


Notes on the HTML form concept

Please consult my document How to write HTML forms for general information on forms, such as tutorials, references, and basic notes on processing form data by server-side scripts. This section will just make a few additional notes.

The form concept in HTML is simple, if not simplistic. You might have difficulties in seeing this when you look at the description of forms in HTML specifications, especially in the most extensive one (HTML 4). The complexity of the specifications in this area is however mostly related to details.

A form in HTML is essentially just a data structure describing the fields of possible user input at a rough level. You can specify the generic type of an input field (one line of text, several lines of text, choice between given alternatives, form submission request, request to clear the form).

It is probably useful to state explicitly some examples of what you cannot do with forms in HTML:

Despite the limitations, forms have turned out to be very useful. In a sense, they are largely useful thanks to being simple and limited: almost all browsers have supported forms for a long time, so they are widely accessible in various browsing situations.

The work in progress on enhancing the HTML form concept illustrates the above-mentioned limitations of the current form concept. See XHTML Extended Forms Requirements.

Client-side scripting in general and JavaScript in particular

Client-side scripting in general

Generally speaking, client-side scripting in relation to HTML means that some relatively simple code, which is embedded into an HTML document or referred to in it, is executed by the user's Web client (browser, such as Netscape, Opera, or Internet Explorer). The code might, for example, dynamically modify the HTML document (more exactly, a copy of it as processed and displayed by the browser), or it might open a new window on the user's screen. The code is written in some specific language designed for the purpose.

Section Scripts in the HTML 4.0 specification describes the different interfaces between HTML and scripting languages. In particular, it specifies some "intrinsic events" which are part of the interface.

JavaScript

Currently the most popular client-side scripting language is JavaScript. See JavaScript meta FAQ for sources of information on JavaScript and its possibilities. It especially recommends the extensive JavaScript FAQ (by Martin Webb) which is divided into sections, such as JavaScript Form FAQ. The FAQs are large, but there is a nice search form for them.

Martin Webb's JavaScript Guidelines and Best Practice is a good overview and summary, and it helps in avoiding common errors.

There is a newer FAQ document, comp.lang.javascript FAQ, which is less extensive but more carefully written and probably better maintained than the FAQ resources mentioned above.

For references, see Netscape's JavaScript documentation, especially JavaScript Guide and JavaScript Reference, with the usual caveats: there are differences in JavaScript support between vendors, browsers, and browser versions. See also JScript documentation, which describes Microsoft's equivalent to JavaScript.

The international Usenet newsgroup for JavaScript is comp.lang.javascript. Please check the FAQs mentioned above before posting. You can also search the group using Deja.com.

Simple example: focusing in a form

To take a simple example, you could use the following for focusing on a form field:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<TITLE>Form test</TITLE>
<META HTTP-EQUIV="Content-Script-Type" CONTENT="text/javascript">
<BODY ONLOAD="document.theform.Comments.focus();">
<FORM ACTION="http://jkorpela.fi/cgi-bin/echo.cgi"
NAME="theform"
METHOD="POST">
<P>
Enter data:
<INPUT TYPE="TEXT" SIZE="40" NAME="Comments">
<BR>
<INPUT TYPE="SUBMIT" VALUE="Send">
</FORM>
</BODY>

You may wish to view a document containing the example on your current browser. If the browser does not support JavaScript or has JavaScript disabled, it may or may not focus on the input field, depending on the browser's default behavior; if it does not, the user probably has to click on the input area before starting to type. A browser which executes JavaScript code should automatically focus on the input area when the document is loaded. This is a very simple example of JavaScript code written for extra comfort to some users; the form would still work without it, although perhaps slightly less comfortably. (This also implies that we are not very worried about situations where the focusing does not take place, but see question The onLoad event handler does not always set the focus in a form field, are there any workarounds? in the JavaScript Forms FAQ.)

However, there is the view that automatic focusing is undesirable and that focusing should be left to the user. And especially on large pages it might really happen that the part of a page, including a form, has loaded, the user starts filling out the form, and then the page finishes loading and focus is moved away from the field where the user was!
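One way to reduce that risk is to focus the field only when the user has not yet started doing something else. The following is a minimal sketch, not a tested recipe: the function name focusIfIdle is hypothetical, and it assumes a browser that exposes document.activeElement, which older browsers do not.

```javascript
// Hypothetical helper: focus a field on load only if that cannot
// interrupt the user. "field" is a form control and "doc" is normally
// window.document; both are passed in to keep the sketch self-contained.
function focusIfIdle(field, doc) {
  var active = doc.activeElement;
  // The user is elsewhere if some element other than the body
  // (or the field itself) already has the focus.
  var userIsElsewhere = active && active !== doc.body && active !== field;
  // Also leave the field alone if something has been typed into it.
  if (userIsElsewhere || field.value !== '') {
    return false;
  }
  field.focus();
  return true;
}
```

In the focusing example, the ONLOAD attribute value would then be focusIfIdle(document.theform.Comments, document); instead of the direct focus() call.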

The META element is used to specify the scripting language used. In principle, the ONLOAD attribute value could be written in any scripting language. However, this is mainly theoretical; browsers that support scripting languages assume JavaScript as the default.

Validation problems with the NAME attribute

The HTML 4.0 Specification did not allow a NAME attribute for a FORM or IMG element. This means that such attributes give validation errors when validating against any HTML 4.0 DTD. However, the HTML 4.01 Specification (approved 1999-12-24), which contains several changes (usually small ones) as compared with HTML 4.0, allows the NAME attribute for FORM and IMG elements. Thus, you can now use the following document type declaration if you use those attributes:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
            "http://www.w3.org/TR/html4/loose.dtd">
(or perhaps Transitional removed and loose replaced by strict, if you don't use deprecated features; or Transitional replaced by Frameset and loose replaced by frameset for frameset documents).

Although this was expected to resolve the issue, the XHTML 1.0 specification re-introduced the problem, in the following sense: In XHTML 1.0, the Transitional version allows the name for form and img, the Strict version does not. Apparently, you can switch to the Transitional DTD if this is a problem.

Some historical notes on previous solutions and background:

Browsers which support JavaScript generally let you refer, in your JavaScript code, to a specific form by using a NAME attribute in the FORM tag, as in the focusing example above. However, the logical way to uniquely identify an element would be via an ID attribute - which is allowed by HTML 4.0 but unfortunately supported by browsers less widely than the NAME attribute. See the thread NAME is an IMG tag in December 1998 in the comp.infosystems.www.authoring.html newsgroup.

The WDG Validator documentation contains instructions for using a customized DTD. In this case it would suffice to use an HTML 4.0 DTD with the simple extension which allows NAME for FORM; see our example modified to refer to that DTD - it validates. But currently this is not needed any more, since you can just use the HTML 4.01 DTDs.

When referring to images in JavaScript, e.g. in order to dynamically replace an image by another, one normally uses a NAME attribute in an IMG element. This causes validation problems as explained above. One solution used was to refer to images in JavaScript using their numbers (e.g. document.images[0] when referring to the first image on a page). This could be feasible if there are just a few images on the page. But using numbers for such purposes is error-prone (do you remember to change the numbers if you add images? even when in a hurry to meet a deadline?), and it is discouraged in Martin Webb's JavaScript Guidelines and Best Practice; and since the approval of HTML 4.01, there's no reason any more to use that method.
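To illustrate the difference, here is a small sketch of replacing an image by name rather than by number. The function name swapImage and the image name used in the note below are made up for illustration; the sketch assumes that the browser makes named images available through document.images.

```javascript
// Hypothetical helper: replace the image whose NAME attribute is
// imageName. "doc" is normally window.document.
function swapImage(doc, imageName, newSrc) {
  // Look the image up by name, so that adding or removing other
  // images on the page does not change which image is affected.
  var img = doc.images ? doc.images[imageName] : null;
  if (!img) {
    return false; // no such image, or no image support: do nothing
  }
  img.src = newSrc;
  return true;
}
```

With markup like IMG NAME="logo", a call such as swapImage(document, 'logo', 'logo2.gif') keeps working even if an image is later added before the logo, which document.images[0] would not.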

A typical example: opening a link in a new window with specific properties

One of the most usual simple uses of JavaScript is to open a linked resource in a new browser window with specific properties like width and height. It is highly questionable whether this is a useful feature (see item 2 in The Top Ten New Mistakes of Web Design) but here we concentrate on the technical question of doing it the best way. This hopefully also illustrates the general idea of coding such things robustly.

So let's assume that you wish to make the word "foo" a link so that when the user clicks on it, the page somedoc.html will open in a new browser window which is 150 pixels wide and 150 pixels high. (Just simple properties, for illustration; see JavaScript window FAQ for other possibilities and Window Spawning and Remotes at WebReference for some illustrations.)

Let's start from the wrong way of doing it:

<script><!--
function popup(){
newwin = window.open('somedoc.html','','width=150,height=150');}
//--></script>
...
<a href="javascript:popup()">foo</a>

The problem with this is that when a browser does not support JavaScript, the link does not work at all. The user will see "foo" as a link but clicking on it does nothing. (By the way, search engines are probably "JavaScript challenged", so they won't find somedoc.html through that construct.)

It won't help to do something like the following in place of an <a href="javascript:.. link:

<a href="#" onClick="popup()">foo</a>

This approach is for some reason popular, but of course href="#" makes no sense. Although # is technically a URL reference, it means a reference to the start of the document. When JavaScript is disabled, the "link" thus degrades to something that causes the document to be positioned at the start, potentially confusing the user thoroughly.

A correct way is the following:
<script type="text/javascript"><!--
function popup(){
newwin = window.open('','somename',
  'width=150,height=150,resizable=1');}
//--></script>
...
<a href="somedoc.html" target="somename"
 onclick="popup();">foo</a>

There is a simple example page using this method. It also contains some further caveats and notes.

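When the browser supports JavaScript (and the support is enabled), clicking on the link will cause the value of the onclick attribute to be executed first. This will create a new window with the desired properties but not load anything into it yet. Next (or as the only operation, if JavaScript is not supported) the browser does normal link processing, getting somedoc.html and then displaying it in the window named somename, as specified by the target attribute. With JavaScript, that is the window just created; without it, the browser opens a new window with its default properties.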

As an alternative method, you could write:

<a href="somedoc.html" target="somename" onclick=
"return !window.open('somedoc.html','somename',
'width=150,height=150,resizable=1')">foo</a>

If the window is successfully opened, the window.open function returns a reference to the new window object, which counts as true in a Boolean context and is therefore changed to false by the negation operator !. Thus, on successful opening, the entire expression returns false, and this tells the browser not to perform the normal link processing (since here the JavaScript code is intended to replace rather than precede it).

Historical remark: This feature is not supported by browsers such as IE 3 which support JavaScript 1.0 only. This means that despite the return value false, they would try to follow the link normally after executing the JavaScript code. For our example, this would be just an unnecessary extra operation; but with the target attribute omitted, it would result in opening somedoc.html both in a new window and in the original window!
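If you use the alternative method on several links, the logic could be moved into a small function. The following is only a sketch: the name openInPopup is hypothetical, and it assumes that window.open yields a window reference on success and a value that converts to false (such as null) when no window was opened.

```javascript
// Hypothetical helper for use in an onclick attribute.
// "win" is normally the global window object; it is a parameter here
// only to keep the sketch self-contained.
function openInPopup(win, url, name, features) {
  var opened = win.open(url, name, features);
  // A window reference converts to true, so !opened is false on success,
  // which cancels the normal link processing. If nothing was opened,
  // return true so that the browser follows the href as usual.
  return !opened;
}

// Possible use in markup (href and target remain as the fallback):
// <a href="somedoc.html" target="somename"
//  onclick="return openInPopup(window, this.href, this.target,
//  'width=150,height=150,resizable=1')">foo</a>
```

The design point is the same as above: the link itself stays a normal link, and the script only decides whether the normal processing is still needed.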

The discussion above is largely based on the ideas in a news article by Nick Kew, applied to a particular purpose of using JavaScript.

Similar considerations apply to directing output from form submission to a new window with some specific properties. You would use an onsubmit attribute in a FORM tag, together with a target attribute in the same tag. Example:

<FORM ACTION="http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi"
target="FOLDOC" onsubmit=
"window.open('','FOLDOC','resizable=1,scrollbars=1,width=400,height=300')"
>
  <INPUT NAME="query">
  <INPUT TYPE="submit" NAME="action" VALUE="Search">
</FORM>

The example form searches for a word definition in the Free On-Line Dictionary of Computing.

Web authors who wish to open new windows rather often also ask how to create a button for closing a window. The short answer is that there is no need for that. If new windows can be opened on users' systems, then they can be closed by the user. You can create e.g. a "close button" using JavaScript - but you should then generate the markup and code for that using JavaScript, to avoid creating a button that looks like a close button but does not do anything when JavaScript is off. And the user could still, depending on the security settings of his browser, get a message telling that a page tries to close a window and asking for the user's confirmation. So you would very easily end up creating something confusing and more difficult to use than the normal way of closing windows in the user's system. (There are several items in the JavaScript windows FAQ which discuss various issues related to closing windows. It is rather technical and complicated, and the complexity illustrates how problems arise if you try to interfere with browsers' normal user interfaces.)
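If you nevertheless want such a button, generating it from JavaScript could look like the following sketch. The function name writeCloseButton and the button text are made up for illustration. Since the markup is written by the script itself, nothing appears on the page when JavaScript is off.

```javascript
// Hypothetical helper: write a close button into the document being
// loaded. "doc" is normally window.document, and the function is meant
// to be called from a script element at the place where the button
// should appear.
function writeCloseButton(doc) {
  doc.write(
    '<form action="">' +
    '<input type="button" value="Close window" onclick="window.close()">' +
    '</form>'
  );
}
```

The caveat about confirmation prompts still applies: whether window.close() is allowed without asking the user is up to the browser.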

A rather common special case of opening a new window in a specific size is a link to an image. It is natural to try to make the window as small as possible. One problem with this is that browsers may use different amounts of padding around an image, if you only link to an image as such. You can't affect that padding, but you might affect the padding around an image if it is part of an HTML page (or the sole content of an HTML page). Moreover, you might wish to have some explanatory text (image caption) in the window, too. Hence it might be better to make the link point to an HTML document containing the image (and little or nothing else). You might find it comfortable to construct the HTML document "on the fly" using JavaScript. Just remember to have an href attribute pointing to the image, as a fallback for non-JavaScript browsing.
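Constructing such a document "on the fly" could be sketched as follows. The function names escapeHtml and buildImagePage are hypothetical, and the zero margin and padding are just an assumption about what you might want around the image.

```javascript
// Escape characters that have a special meaning in HTML, so that a
// caption or file name cannot break the generated markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: build a minimal HTML document which contains
// just the image and its caption.
function buildImagePage(src, caption) {
  return '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">' +
    '<title>' + escapeHtml(caption) + '</title>' +
    '<body style="margin:0;padding:0">' +
    '<div><img src="' + escapeHtml(src) + '" alt="' +
    escapeHtml(caption) + '"></div>' +
    '<p>' + escapeHtml(caption) + '</p>' +
    '</body>';
}
```

In an onclick attribute you might then open an empty window with window.open, write the string into its document with document.write, call document.close, and return false. The href attribute pointing to the image remains as the fallback.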

Initializing an input field to contain current date

Suppose that you have an input field where a user should specify a date and that you would like to set the default to the current date. The first question is: current as of what? The time of creating the document, or the server sending it, or the user filling out the form? Cf. How do I display the current date or time in my document? But note that if you'd like to set the default to the date of filling out the form, then using JavaScript is quite suitable, and doesn't cause problems when it doesn't work; after all, it's just a matter of optionally setting a default.

The simplest approach is to write an input field normally, and use separate JavaScript code to assign an initial value to it. That is, you would not use a value attribute in an input element in HTML markup but assign the date (as a string) to the element's value property in JavaScript.

The next question is how to construct the date string. This depends on the desired date format. Beware that manipulation of dates in JavaScript is somewhat awkward due to browser incompatibilities and Y2K problems (they are real in JavaScript, especially after the start of 2000); see JavaScript Date FAQ.

Especially on the WWW, there are good reasons to use ISO 8601 date format (sample: 2000-11-06). The following code initializes a form field to current date in that format, when JavaScript is enabled, and leaves it initially blank otherwise:

<form ... name="testform">
<input name="datefield" size="10">
<script type="text/javascript"><!--
function getCorrectedYear(year) {
    year = year - 0;
    if (year < 70) return (2000 + year);
    if (year < 1900) return (1900 + year);
    return year;}
var today = new Date();
var d  = today.getDate();
if(d < 10) d = '0' + d;
var m = today.getMonth() + 1;
if(m < 10) m = '0' + m;
var y = getCorrectedYear(today.getYear());
var dateString = y + '-' + m + '-' + d;
document.testform.datefield.value = dateString;
//--></script>
...
</form>

It is generally best to put JavaScript code into separate files (referred to via script src="..." elements in HTML) or into script elements in the document head, whenever feasible, instead of "inlining" it as above. The code above is just a compact illustration.
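If you can assume browsers that support the getFullYear method of Date objects (it is not available in the oldest JavaScript implementations), the year correction becomes unnecessary. A sketch along those lines, with the hypothetical function names pad2 and formatIsoDate:

```javascript
// Pad a number below 10 with a leading zero; always returns a string.
function pad2(n) {
  return (n < 10 ? '0' : '') + n;
}

// Format a Date object as an ISO 8601 date, e.g. 2000-11-06.
function formatIsoDate(date) {
  return date.getFullYear() + '-' +
    pad2(date.getMonth() + 1) + '-' +  // getMonth() counts from 0
    pad2(date.getDate());
}
```

The field would then be initialized with document.testform.datefield.value = formatIsoDate(new Date()); and a separate function like this is also easy to move into an external script file.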

Yet another example: dynamically displaying the sum of fields

Assume that your form has some fields for numeric input by the user, such as quantity of items he is about to purchase. You might wish to include, as extra convenience to some users, a client-side script which dynamically calculates and displays a running total.

To illustrate the basic idea, we consider an extremely simplified version where the data to be displayed is just the sum of two fields. Let us assume that user input is via selecting an option from a SELECT menu, which means that he has a predefined set of alternatives.

If data input is via a text input field, there will be more complications. You would probably want to check the data for being numeric, and it would be more difficult to make the sum information change in a manner which cannot mislead the user. Note that the onchange attribute is handled by browsers so that an event is not interpreted to occur just because the user uses the mouse to move the cursor away from the input field, but it occurs when you focus on another field (e.g. by clicking on another field in order to start typing there, or by "tabbing" to the next field).

The code for it could be the following:

<script type="text/javascript"><!--
function updatesum() {
 document.form.sum.value =
(document.form.s1.options[document.form.s1.selectedIndex].text-0) +
(document.form.s2.options[document.form.s2.selectedIndex].text-0); }
//--></script>

<form name="form" action="address of server-side script">
Select a number:
<select name="s1" onChange="updatesum()">
<option selected>0<option>1<option>2<option>3<option>4
<option>5<option>6<option>7<option>8<option>9
</select>
and another number:
<select name="s2" onChange="updatesum()">
<option selected>0<option>1<option>2<option>3<option>4
<option>5<option>6<option>7<option>8<option>9
</select>
Their sum is: <input name="sum" readonly style="border:none"
value="(not computed)">
</form>


Notes on the example:

Remember to pay attention to the problem of what happens when JavaScript is off (see below). At least the page should not display a bogus value as the sum!
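For the text input variant mentioned earlier, the same principle (rather no sum than a bogus sum) could be sketched as follows. The function names parseQuantity and runningTotal are hypothetical, and the sketch assumes that quantities are unsigned integers.

```javascript
// Return the number if the text is a non-empty run of digits,
// otherwise null.
function parseQuantity(text) {
  return /^[0-9]+$/.test(text) ? parseInt(text, 10) : null;
}

// Compute the string to be shown as the sum of the given field values.
// If any value is not numeric, return an empty string, so that the
// page shows nothing rather than a misleading total.
function runningTotal(texts) {
  var total = 0;
  for (var i = 0; i < texts.length; i++) {
    var n = parseQuantity(texts[i]);
    if (n === null) {
      return '';
    }
    total += n;
  }
  return String(total);
}
```

An onchange handler would pass the current values of the fields, as an array, and assign the result to the value of the sum field.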

Limitations imposed by the use of JavaScript

You should be aware of and pay attention to the following limitations:

Perhaps your JavaScript code has no security risks. (Are you sure?) But many people think that JavaScript is insecure in general and therefore keep it disabled. They have other things to do than to install patches to something they don't really need, the JavaScript support. (After all, even if your JavaScript code is useful, there's a lot of JavaScript stuff on people's pages which is a nuisance, like new windows popping up for no good reason.)

This means that you should never rely on JavaScript being executed. You should first write your pages so that they work robustly, then perhaps consider adding some "spices" using JavaScript or other methods.

If you think that your page works better with JavaScript on, you might consider adding a note about this onto your page, so that users can consider switching JavaScript on or perhaps even switching from one browser to another. The note should be something brief and polite, such as "Note: This page works better if JavaScript is enabled, since ...", where "..." should contain a real, practical explanation. It would be natural to put such a note into a NOSCRIPT element (but note that there are some problems in its implementation on popular browsers). Nick Kew has presented the following example in a Usenet article of his:

Let's see. I'm on your order page, and want to order something:

  1. There's a plain order form with a submit button that works. Good, you have my order.
  2. There's a javascript order form with a submit button that works. Likewise.
  3. There's an order form that requires javascript and won't work without. I'll use the back button.
  4. There is an instance of (2) done nicely, with something like
    <noscript>
    <p>This Form includes embedded spreadsheet software that
    will update your order value as you select different items
    and quantities.  You are invited to use a Javascript-enabled
    browser to take advantage of this.</p></noscript>
    

    Now that's a Good usage, and might well persuade me to enable it, not just for the page, but for the entire site. Because the author has gained my respect.

Anyone tell me why there are 1000 instances of (3) for each instance of (4)?

The robust approach: processing backed up server side

This section discusses the use of JavaScript in conjunction with forms. See especially Graceful Degradation in Dan's Web Tips for examples of using JavaScript enhancements in other contexts in a manner which does not break the basic functionality when JavaScript is off.

When do you need a robust approach?

A common situation, the use of JavaScript for "navigational dropdown menus", is discussed in a separate document Navigational pulldown menus in HTML. It explains how one can write a form so that it works with JavaScript when enabled, and with a server-side script otherwise - although there are other robust solutions which are simpler. Here I will discuss some other cases.

The preceding example about focusing was a very simple one, and it used JavaScript to do something that would be nice to have done but not necessary. If JavaScript is not enabled, the focusing depends on the browser and the user may need to click on the field before starting to type. There is nothing special you need to do for such situations and nothing you can do. (If you try to teach people to use their browsers on a page of yours, you will probably just cause confusion, especially among people who use browsers other than you expect.)

For a somewhat different example, see my document Get RFC by number - a demonstration of combined client and server side scripting. There the point is that a small utility form gives the user quick access to some data which is available on the Internet in many ways. Thus, using a server-side backup for the JavaScript solution is not necessary. But it is a relatively simple thing to write, so there's no particular reason why one wouldn't do that.

But what should we do if there is something that must be done, such as checking for some input data (entered using a form) which really needs to be checked?

The answer is very simple:

  1. Perform the checks in the server-side script which processes the submitted data.
  2. You may additionally perform the checks in the client side, using JavaScript (or some other client-side scripting language). The benefit is that some users will have the extra comfort of having their input checked faster, so they can fix their errors more conveniently.

Naturally, this means duplication of code, typically in two different languages. It is up to you to decide whether it is worth doing. It is advisable to write the server-side checks first; you will need them anyway, and the service will be functional without client-side checking; and you might notice that you simply haven't got time or energy to do both, so it is better that you've done the indispensable thing.

Example: checking data for being numeric

Let us illustrate this idea with a very simple example. In my Getting Started with CGI Programming in C I present an example of a simple form and server-side script which processes form submission by interpreting the data as two integers and calculating and sending back their product. Naturally the server-side script contains checks against invalid data. Specifically, the checks can be coded in the C language as follows:

if(sscanf(data,"m=%ld&n=%ld",&m,&n)!=2)
  printf("<P>Error! Invalid data. Data must be numeric.");

Having taken care of this basic check, you might add client-side checking. This would imply that when such checking takes place, the user gets more immediate feedback, and invalid data will not be sent to the server. For example, we could code the checks in JavaScript essentially as follows (see the full example code for details): instead of just <INPUT TYPE="TEXT" NAME="m"> we write the following:

<INPUT TYPE="TEXT" NAME="m" ONBLUR=
"if(!ValidNumber(m.value)) { document.theform.m.value='';
document.theform.m.focus(); return false;}">
where ValidNumber is a JavaScript function which performs the check as follows:
function ValidNumber(thestring)
{
    for (var i = 0; i < thestring.length; i++) {
        var ch = thestring.substring(i, i+1);
        if (ch < "0" || ch > "9")
          {
          alert("The numbers may contain digits 0 thru 9 only!");
          return false;
          }
    }
    return true;
}

This is very different from the way the check is made in the server-side script. And in fact it makes a different, stricter check: it accepts unsigned integers only. Essentially, the code verifies the data when the user tries to move away from the input field; if the data is incorrect, the code issues a message, clears the field, and returns focus to it, so that the user can immediately type a valid value. (Clearing the field is somewhat brutal. Perhaps you should leave it as it is; an error processing script might just delete the nonnumeric characters, but then the user might get somewhat confused. Note that you could also use a script like the one described under Restricting a Field to Numbers Only at idocs Guide to HTML. It prevents the user from typing nonnumeric characters. Since the technique uses the onkeypress attribute which is not supported in all JavaScript implementations, you might additionally have the simpler check as above.)
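A gentler variant could report the problem and leave the field content as it is. The following sketch also accepts an optional sign, which brings it somewhat closer to what the server-side check accepts, though it is not an exact match for the behavior of sscanf. The function names isInteger and numberProblem and the message texts are made up for illustration.

```javascript
// True if the text is an optional sign followed by one or more digits.
function isInteger(text) {
  return /^[+-]?[0-9]+$/.test(text);
}

// Return a message describing what is wrong with the input, or an
// empty string if it is acceptable. The function has no side effects:
// it neither shows an alert nor changes the field.
function numberProblem(text) {
  if (text === '') {
    return 'Please enter a number.';
  }
  if (!isInteger(text)) {
    return 'Use digits only, optionally preceded by a sign.';
  }
  return '';
}
```

Unlike the loop above, this treats an empty field as an error. The caller decides how to present the message, for example next to the field rather than in an alert box.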

There is a collection of various basic routines which you could use as building blocks when writing JavaScript code for checking form data: Sample code for form validation.

Why checks must be done reliably

It is hopefully evident that if something needs to be checked, it needs to be checked reliably. But let's make some notes to illustrate the point:

Consider the risks involved if you rely on checks in JavaScript. Assume that you have a form for user input which will be used to store data into a database. Assume that some field in the data must be numeric, otherwise the database will be damaged. Assume that you use JavaScript to check for that, perhaps even so that when a user tries to type a non-numeric character into that field, he just can't do it. Now, relying on that alone would be very stupid. Anyone using a browser with JavaScript disabled would be able to enter and submit any data. In fact, invalid data can pass the checks made by our example code above even when JavaScript is enabled! There are differences between browsers as regards the interpretation of the ONBLUR attribute, so the user might still be able to leave the field with invalid data and to submit the form.

You shouldn't rely even on such restrictions which can be imposed (in a sense) in pure HTML, such as the length of a single-line text input field (using the MAXLENGTH attribute in an INPUT TYPE="TEXT" element; cf. How to limit the number of characters entered in a textarea). Probably all browsers will enforce the restriction, but consider what happens when someone saves a copy of your document onto his disk (you cannot prevent that) and edits it by changing or removing that attribute.
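The length limits thus need to be checked again by the script which receives the data. As a sketch, assuming for illustration that the receiving script is written in JavaScript and has the submitted fields available as an object of strings (the function name checkSubmittedFields and the field names in the note below are hypothetical):

```javascript
// Check submitted field values against maximum lengths.
// "fields" maps field names to submitted strings; "limits" maps field
// names to the maximum allowed length. Returns an array of error
// messages, which is empty if everything is acceptable.
function checkSubmittedFields(fields, limits) {
  var errors = [];
  for (var name in limits) {
    if (!Object.prototype.hasOwnProperty.call(limits, name)) {
      continue;
    }
    var value = fields[name];
    if (typeof value !== 'string') {
      errors.push(name + ': missing');
    } else if (value.length > limits[name]) {
      errors.push(name + ': longer than ' + limits[name] + ' characters');
    }
  }
  return errors;
}
```

For example, checkSubmittedFields({Comments: text}, {Comments: 40}) would catch a Comments value that exceeds 40 characters no matter what the browser did or did not enforce.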