Referring to Usenet newsgroups, articles and threads in HTML

The new version of Google Groups, misleadingly called "Google Groups Beta", has changed several essential features. This document has not been updated to reflect them. It is still uncertain to what extent Google Groups might revert to the old version in functionality.

Introduction

Sometimes you might wish to refer to Usenet newsgroups on a Web page (HTML document) of yours. Perhaps you'd like to say "these issues are discussed in the Usenet group ..." or "for other opinions on this, see the Usenet thread ..." or "a good example of this was given in the Usenet posting ...". And you would like to make such statements link to the group, thread, or article so that the reader can conveniently access the resource you are referring to. This means that you need a suitable URL.

There is a URL scheme (news:) specifically designed for referring to Usenet groups or articles. However, this method suffers from serious problems. Therefore we shall first consider a more practical approach here: linking via the Google Groups service (previously known as Deja, and before that as DejaNews). This means that you would not use a news: URL but an http: URL which effectively makes a query that accesses the Google Groups database.

When you consider referring to a group, you should not expect many people to find a link to a newsgroup particularly helpful, especially if the group is a very (hyper)active one like comp.infosystems.www.authoring.html. Linking to an FAQ (list of Frequently Asked Questions with answers) is more likely to be useful; you could of course link both to an FAQ and to the corresponding group. You can find a large repertoire of FAQs via the Internet FAQ archive, but there are also FAQs that haven't been included in it; reading a group for a few weeks should let you find out whether it has a maintained FAQ. (You really should not refer to a newsgroup if you don't know whether it has an FAQ and where it is!)

If a Usenet article contains information which is so valuable in your opinion that you wish to link to it on a Web page, you could ask the author for permission to store a copy of it (either as plain text or converted to HTML) in a file which you upload to a Web server. Then you would use a normal http: link to it, of course. Consider yourself lucky if the author responds that he will actually make the content of the article available on the Web himself and tell you the URL. Notice that in general the author of a news posting has copyright to it. Typically authors are willing to give permission for the operation described here, but it is the author's exclusive right to allow or deny the distribution of his work. (See the Web Law FAQ. Someone might argue that Google Groups violates authors' copyright, but that's a different story; it's probably best to see Google Groups as a generally approved "extension to Usenet" so that when you post to Usenet, you implicitly give permission to store the posting on Google Groups unless you explicitly request otherwise.)

Part I: Referring via Google Groups

About Google Groups

The Google Groups system, previously known as Deja News (or Deja), is based on an extensive, automatically maintained archive of Usenet postings. It lets users access the archive in various ways and also post articles. In the following, some methods of utilizing such possibilities in links are discussed.

Google Groups archives extend back to March 1995 and are updated effectively in real time, though it may take time before an article reaches a news server where it is "harvested" by Google Groups.

The system has undergone several changes, which have affected both the user interface and the ways in which groups, articles, and threads can be referred to with URLs. This has been rather frustrating at times; it's not nice to see how URLs stop working. Hopefully the system has stabilized now.

Linking to a group via Google Groups

To link to a group via Google Groups, you can use a URL of the form
http://groups.google.com/groups?group=groupname
The following link uses that method: news.announce.newusers.

This gives you "threaded" access to the group, with the threads that have new articles listed first.
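As a minimal sketch, the URL form above can be generated from a group name as follows. The function names group_url and group_link are made up for this illustration; only the base URL and the group parameter come from the text above.

```python
from html import escape
from urllib.parse import urlencode

# Base URL as given in the text; everything else here is illustrative.
GROUPS_BASE = "http://groups.google.com/groups"


def group_url(groupname):
    """Return a Google Groups URL of the form ...?group=groupname."""
    return GROUPS_BASE + "?" + urlencode({"group": groupname})


def group_link(groupname):
    """Return an HTML link to the group, using the group name as link text."""
    href = escape(group_url(groupname), quote=True)
    return '<A HREF="{}">{}</A>'.format(href, escape(groupname))


print(group_link("news.announce.newusers"))
```

The escaping is not strictly needed for ordinary group names, but it keeps the generated markup valid if the name ever contains a character such as &.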

You might also consider including a simple search form which performs a "customized" Google Groups search, with searching automatically limited to one group, imitating the search facilities on the Advanced Group Search page. Here is a simple example (I have a more complicated example too):

Search for articles in comp.infosystems.www.authoring.html:

What you need is a form with an action attribute referring to http://www.google.com/advanced_group_search, a field named group containing the group name, a field named q containing the search clause, and a submit field.
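The following sketch generates markup for such a form. The action URL and the field names group and q are taken from the description above; the GET method, the use of a hidden field for the group name, the field size, and the function name search_form_html are assumptions made for the illustration, not details confirmed by Google Groups.

```python
from html import escape

# Action URL as given in the text.
SEARCH_ACTION = "http://www.google.com/advanced_group_search"


def search_form_html(group):
    """Return HTML for a search form limited to one newsgroup.

    Assumptions: the form is submitted with GET, and the group name is
    passed in a hidden field so that the user only types the query.
    """
    g = escape(group, quote=True)
    return (
        '<FORM ACTION="{action}" METHOD="GET">\n'
        '<P>Search for articles in {g}:\n'
        '<INPUT TYPE="HIDDEN" NAME="group" VALUE="{g}">\n'
        '<INPUT TYPE="TEXT" NAME="q" SIZE="30">\n'
        '<INPUT TYPE="SUBMIT" VALUE="Search"></P>\n'
        '</FORM>'
    ).format(action=SEARCH_ACTION, g=g)


print(search_form_html("comp.infosystems.www.authoring.html"))
```

The printed markup can be pasted into a page as such; the generator is mainly useful if you want the same form for several groups.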

Linking to an article (posting) via Google Groups

When you have found an article via Google Groups or otherwise, there are different ways to set up a URL that can be used to refer to the article, in different formats.

If you have just a vague idea of what the article might be, you can use the Google Groups Advanced Group Search page. After having found the interesting article, you can follow the link to it, then the link "Complete Thread", and finally "View this article only". This gives you a page with a URL like http://groups.google.com/groups?hl=en&selm=35ECF6C6.177A%40helsinki.fi but note that the part hl=en& is redundant; it just specifies the human language to be used in the user interface, and you had better leave that to your visitors. To generalize, this means that you can use URLs of the form
http://groups.google.com/groups?selm=Message-ID
where Message-ID is the Usenet message ID, without the angle brackets (< and >) that enclose it in the article headers.

Example:

Hubert Partl has presented, in <A TITLE="Colors in HTML" HREF=
"http://groups.google.com/groups?selm=674ufh%24ui0%241%40www.univie.ac.at">a news article</A>,
a nice English summary of his <A HREF=
"http://www.boku.ac.at/htmleinf/hein52.html#color">article
on colors in HTML</A> (in German).

This looks like the following in your browser - try to follow the link to see what happens:

Hubert Partl has presented, in a news article, a nice English summary of his article on colors in HTML (in German).
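In the example URLs above, the characters $ and @ of the Message-ID appear percent-encoded as %24 and %40. The sketch below builds a selm URL from a Message-ID in the same way; the function name article_url is made up for the illustration, and encoding every reserved character is a cautious assumption rather than a documented requirement of Google Groups.

```python
from urllib.parse import quote

# URL prefix as given in the text.
SELM_PREFIX = "http://groups.google.com/groups?selm="


def article_url(message_id):
    """Return a Google Groups URL for the article with the given Message-ID.

    Enclosing angle brackets, as found in the Message-ID header,
    are removed before the ID is percent-encoded.
    """
    mid = message_id.strip()
    if mid.startswith("<") and mid.endswith(">"):
        mid = mid[1:-1]
    # safe="" makes quote() encode "@" and "$" too, as in the examples.
    return SELM_PREFIX + quote(mid, safe="")


print(article_url("<674ufh$ui0$1@www.univie.ac.at>"))
```

For the Message-ID used here, the result is the same URL as in the HTML example above.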

Linking to a thread via Google Groups

To refer to a thread, you can locate any article in the thread first. Then you can link to any article, typically the first one in the thread, if you wish to refer to the thread as a whole.

It is probably best to link to the initial article of a thread: its presentation on Google Groups contains a link to the complete thread, whereas the URL of the thread itself is clumsy.


Part II: news: URLs

Linking to a group with a news: URL

You can link to a Usenet newsgroup in an HTML document e.g. as follows:

You can find more information about this in the
<A TITLE="comp.infosystems.www.authoring.html"
 HREF="news:comp.infosystems.www.authoring.html">ciwah</A>
newsgroup.

Typically, such links are implemented in browsers so that when the link is selected, a special browsing mode is initiated. The user first sees a list of headings of (recent) articles posted to the group. The headings act as links through which individual postings are accessible. Such an interface to Usenet in a browser may allow the user to post to the group, too, and effectively use a Web browser as a newsreader.

Is this particularly useful? Well, if your reader knows how to read Usenet groups but prefers using some other software than the one embedded into or coupled with his current Web browser, the link will do no good. It might even confuse, if followed. If the link takes the user into reading a group the way he actually prefers, then you have saved him a few seconds. (If you just gave the name of the group, he might need to use e.g. cut and paste to access the group the way he normally does.)

Warning: The default configuration of popular Web browsers is often inadequate in this respect. If you use e.g. Internet Explorer or Netscape for posting to Usenet, you should at least find out how to configure them to make sure that they send the articles as plain text only, not as HTML or both as text and as HTML! (This needs to be handled by each user; here it is given as a warning to authors: if you provide links to newsgroups, this may lead inexperienced users into posting to Usenet without understanding the basics of Usenet and the netiquette.)

Linking to an article (posting) with a news: URL

It is theoretically possible to use a news: URL to link to individual articles posted into Usenet. You would put the Message-ID of the article immediately after the scheme part news:. Example:
news:EvvA43.20r@tac.nyc.ny.us

However, not all browsers support URLs of this form. More importantly, since news articles are transient in news servers, the URL will probably work for a short time only, perhaps a week, perhaps a month (and different times for different users). Thus, URLs of this type are of very limited usefulness.

The general syntax of news: and nntp: URLs

This section presents the formal syntax of news: and nntp: URLs, as defined in the specification of URL syntax, RFC 1738, which has been partly superseded by RFC 2396 but is still applicable as regards individual URL schemes. Some notes on extensions recognized by some browsers are also given.

RFC 1738 describes, in section 3.6 NEWS, that
news:groupname
and
news:message-ID
are valid forms of URL.

The groupname is a hierarchical ("dotted") name such as comp.infosystems.www.authoring.html and it can also be an asterisk * which refers to all groups. Some browsers also support a form which uses * for one part of the groupname, so that e.g. comp.infosystems.www.authoring.* refers to a set of groups.

When referring to an individual message, the message-ID is of the form id@domain where domain is the full domain name of the server through which the message was originally posted and id is a message identifier which is unique within that server. Depending on your news reader, you may or may not see the message-ID of a news posting when reading it. (See RFC 1036 for details on message-ID, but notice that the enclosing < and > are not used when a message-ID is within a URL.)
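As an illustration of the id@domain structure, the following sketch takes the value of a Message-ID header, removes the enclosing < and >, and builds the corresponding news: URL. The function names are made up for this example, and the validity check is deliberately crude: it only requires a non-empty id and domain around the first @.

```python
def split_message_id(header_value):
    """Split a Message-ID header value into its (id, domain) parts."""
    mid = header_value.strip()
    # The enclosing < and > belong to the header syntax, not to the URL.
    if mid.startswith("<") and mid.endswith(">"):
        mid = mid[1:-1]
    local, sep, domain = mid.partition("@")
    if not (local and sep and domain):
        raise ValueError("not of the form id@domain: " + header_value)
    return local, domain


def news_article_url(header_value):
    """Return a news: URL that refers to the article by its message-ID."""
    local, domain = split_message_id(header_value)
    return "news:" + local + "@" + domain


print(split_message_id("<EvvA43.20r@tac.nyc.ny.us>"))
print(news_article_url("<EvvA43.20r@tac.nyc.ny.us>"))
```

The sketch does not percent-encode anything; a message-ID containing characters that are not allowed in a URL would need additional handling.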

RFC 1738 also defines, in section 3.7 NNTP, a URL scheme which, in principle, lets you refer to a particular article by its article number within a server:
nntp://host:port/groupname/article-number
It also warns: "most NNTP servers currently on the Internet today are configured only to allow access from local clients, and thus nntp URLs do not designate globally accessible resources".

Browsers generally don't support the nntp: scheme. So the following example most probably won't work:
nntp://news.cs.hut.fi/alt.html/239157
whereas the article referred to might be reachable via
news:om3tpsgcubhopiobbet0h65ej6gn0bqc4f@4ax.com
(though usually it won't either).
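To make the structure of an nntp: URL concrete, here is a small sketch that splits one into host, port, group name, and article number using Python's standard urllib.parse module. The function name parse_nntp_url is made up for the illustration; the default port 119 is the standard NNTP port, used when the URL gives none.

```python
from urllib.parse import urlsplit

DEFAULT_NNTP_PORT = 119  # standard NNTP port, used if the URL has no port


def parse_nntp_url(url):
    """Split nntp://host:port/groupname/article-number into its parts."""
    parts = urlsplit(url)
    if parts.scheme != "nntp" or not parts.hostname:
        raise ValueError("not an nntp: URL with a host: " + url)
    segments = parts.path.strip("/").split("/")
    if len(segments) != 2 or not segments[1].isdigit():
        raise ValueError("expected /groupname/article-number: " + url)
    group, number = segments
    return parts.hostname, parts.port or DEFAULT_NNTP_PORT, group, int(number)


print(parse_nntp_url("nntp://news.cs.hut.fi/alt.html/239157"))
```

Parsing a URL this way says nothing about whether the server is reachable; as noted above, it usually is not.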

Most browsers, on the other hand, support an extended format of news: URLs which lets the server be specified. This format corresponds to the generic URL syntax. It is:
news://server/groupname
or news://server/article-number
where the server part is the full domain name of the news server or its IP address.

Example: news://otax.tky.hut.fi/otax.test
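The different forms of news: URL described in this section can be told apart mechanically. The sketch below is one way to do it, under simplifying assumptions: anything containing @ is taken to be a message-ID, a purely numeric part is taken to be an article number, and everything else (including * patterns) is taken to be a group name. The function name classify_news_url is made up for the illustration.

```python
def classify_news_url(url):
    """Return (kind, server, target) for a news: URL.

    kind is "message-id", "article-number", or "group"; server is None
    unless the extended news://server/... format is used.
    """
    scheme, sep, rest = url.partition(":")
    if not sep or scheme.lower() != "news":
        raise ValueError("not a news: URL: " + url)
    server = None
    if rest.startswith("//"):
        # Extended format: the server name comes first, then a slash.
        server, _, rest = rest[2:].partition("/")
    if not rest:
        raise ValueError("no group or article given: " + url)
    if "@" in rest:
        kind = "message-id"
    elif rest.isdigit():
        kind = "article-number"
    else:
        kind = "group"
    return kind, server, rest


for example in ("news:comp.infosystems.www.authoring.html",
                "news:EvvA43.20r@tac.nyc.ny.us",
                "news://otax.tky.hut.fi/otax.test"):
    print(example, classify_news_url(example))
```

A check like this could be used, for instance, to flag links in the extended format, since they depend on a particular server.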

Generally, as a Web author you cannot know which news server is accessible or most convenient to each user. So usually the extended format should be avoided. However, in very special situations it might make sense, mainly when you are referring to a group which is local to one server (i.e. not distributed elsewhere). The usefulness of such a link to readers naturally depends on the access restrictions of that server; normally you should not expect your local news server to be accessible outside your local area network or some similar community.

Sometimes a news server prompts for a login name and password, and in rare cases, one might wish to include them in an nntp: URL. However, the syntax of such URLs has no provisions for including a login name and a password. But, analogously with http: URLs, which have no such provisions by the specs either but are actually supported in extended format by quite a few browsers, and analogously with ftp: URLs and the general format of URLs, one could try a format like
nntp://user:password@servername/newsgroup
But when I tried it on IE 5, for a news server which is accessible on the Internet but only with a user name and password, I was unable to retrieve an article that exists there (as checked using a newsreader). I just got a message about a failed connection. Netscape 4 doesn't even recognize the nntp: URL scheme. On Opera 5 it works. But beware that RFC 2396 warns: "It is clearly unwise to use a URL that contains a password which is intended to be secret. In particular, the use of a password within the userinfo component of a URL is strongly disrecommended except in those rare cases where the password parameter is intended to be public".

Practical problems with news: URLs

When using news: URLs, you should take into account the following: