From: Jukka Korpela
Newsgroups: comp.infosystems.www.authoring.html
Subject: HTML is not a programming language but a data format
Date: Wed, 02 Aug 2000 17:41:14 +0300
Organization: Helsinki University of Technology
Message-ID: <4mtfosg72dpva54bkeiqrcfbpob6me3bec@4ax.com>
References: <39818CD1.1ED8665F@nospam.com> <3987CC0A.E54A3A19@idsi.net>

On Wed, 02 Aug 2000 03:21:48 -0400, Mike Lepore wrote under
Subject: Re: Hiding text for search engine indexing:

>Jukka Korpela wrote:
- -
>>There is no such thing as html programming; HTML is not a
>> programming language at all.
>
>The content of the memory of a computer is in two categories: data and
>programs.

Things can be categorized that way, and that's even useful for some
purposes. Of course, programs are just a special case of data - they
can be processed in various ways, like copied onto diskettes, sent
over the Internet, etc., just as other data can. But programs are data
that can be executed as machine instructions, or executed in
interpretive mode by an interpreter, or compiled into machine
instructions by a compiler. And we can use the word "data" in a
limited meaning too, as 'any data which is not a program'.

Now, in that categorization, HTML documents are data. The good old
HTML 2.0 specification, a great improvement in conceptual clarity over
its successors, says:

   The HyperText Markup Language (HTML) is a simple data format used
   to create hypertext documents that are portable from one platform
   to another.
   HTML documents are SGML documents with generic semantics that are
   appropriate for representing information from a wide range of
   domains.

>Programs are any content of memory that result in the selection
>of operations which the central processor is told to perform.

And how would that make an HTML document a program? It is true that
there are some (currently deprecated) constructs in HTML that can be
regarded as "instructions" in a sense. One might say that
<font color="red"> is an "instruction" to turn font color to red.
I wouldn't say so - it is more natural to interpret it as a
suggestion, or hint, concerning presentation - but if you do, then you
might say that a browser is an interpreter that executes such
instructions. But that would be _very_ remotely if at all analogous
to, say, a Perl interpreter executing a Perl program (script), which
is written in a full-blooded programming language.

And the categorization of a language is to be judged according to its
characteristic and typical constructs. As a whole, HTML is clearly a
markup language which is declarative ("here's a block quotation...
here's a heading, ... "), not procedural/imperative ("indent
so-and-so", "increase font size", ...).

Even if we regarded and used HTML as a procedural markup language (and
for such a purpose, HTML is remarkably limited), this wouldn't make it
a programming language or turn HTML documents into programs. An MS
Word document contains procedural markup - in a specific binary
format - for document appearance. If HTML documents were programs,
Word documents would be programs all the more - and I'm not even
referring to macros. So would TeX documents, nroff documents, etc.
Procedural markup is not programming; so surely structural markup
isn't either.

-- 
Yucca, ../
Brevis esse laboro, obscurus fio.
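
To illustrate the data-versus-program distinction drawn above, here is
a rough sketch using Python's standard html.parser module. The sample
document and the names ElementLister and list_elements are made up for
this illustration and are not taken from any specification. The only
program here is the Python code; the HTML is merely its input, and the
font element shows up as a piece of markup with an attribute, not as a
command that gets carried out.

```python
from html.parser import HTMLParser

# Hypothetical sample document, invented for this illustration.
DOCUMENT = """<h1>Heading</h1>
<blockquote>A block quotation.</blockquote>
<p>Some <font color="red">red</font> text.</p>"""


class ElementLister(HTMLParser):
    """Records the elements a document declares; nothing is executed."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.elements.append((tag, dict(attrs)))


def list_elements(document):
    """Return (tag, attributes) for every start tag, in document order."""
    parser = ElementLister()
    parser.feed(document)
    parser.close()
    return parser.elements


if __name__ == "__main__":
    for tag, attrs in list_elements(DOCUMENT):
        print(tag, attrs)
```

What to do with a heading or a block quotation, or whether to honor
the color hint at all, is decided by the processing program, not by
the document.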