English - the universal language on the Internet?

Abstract

Generally speaking, English is the universal language on the Internet, but it has no official status, and it never will have. The reasons for the position of English are imperialism and the economic and political importance of English-speaking countries. Linguistically, English is extremely unsuitable for international communication, and the actual wide use of English tends to polarize the world into Internet users and Internet illiterates.

The position of English can only be altered by major world-scale political and economic changes, such as the increasing importance of the European Union or a coalition between Japan and China. Such powers might wish, and be able, to promote a language other than English, possibly a constructed language, for international communication.

Alternatively, or in addition to this, the technology of machine translation may allow people to use their own language in international communication.

Preface

The impulse for writing this article was a discussion in the Usenet newsgroup sci.lang. The original question was "whether or not English should be made the universal language of the internet".

As several people remarked, English essentially is the universal language of the Internet. Nevertheless, the question, appropriately interpreted and elaborated, is worth a more delicate treatment.

The current situation

In general, the universal language on the Internet is English, or more exactly a vague collection of languages called "English" because their common origin is the national language spoken in England by the English. That national language has spread over the world, and several variants such as American (US) English, Australian English, etc. exist. A great number of people whose native language is none of these variants know English as a foreign language. They typically use a more or less simplified variant, e.g. one that excludes most of the idioms of British, American, Australian, etc. English. Of course, they make mistakes, and sometimes the "English" used as a foreign language on the Internet is almost incomprehensible to anyone else. In addition, people who use English as their native language often do not know how to spell difficult words, since they basically know English as a spoken language.

Thus, roughly speaking, the universal language of the Internet is clumsy, coarse and misspelled "English".

There are exceptions, such as national discussion forums in countries where English is not the native language of the majority. Even in such forums, English is often used, for instance when people from other countries wish to participate.

Why is it so?

Generally speaking, when a language has attained the position of a universal language, that position tends to reinforce and extend itself. Since "everyone" knows and uses English, people are almost forced to learn English, use it, and learn it better.

Even if you expect the majority of your readers to understand your native language, you may be tempted to use English when writing e.g. about research work. Usually researchers all over the world know English and use it a lot, and often the relevant terminology is more stable and well-known in English than in your own language. Thus, to maximize the number of interested people that can understand your text, you often select English even if the great majority of your readers have the same native language as you. Alternatively, you might write your texts both in your native language and in English, but this doubles the work needed for writing your document and possibly maintaining it. The maintenance problem is especially important for documents on the World Wide Web - the information system where one crucial feature is the ability to keep things really up to date. Consequently, the use of English in essentially national contexts tends to grow.

In Usenet news, the first discussion system on the Internet, the position of English in most international groups has been regarded as so obvious that people who post non-English articles to such groups - by accident or out of ignorance - have typically been flamed quickly. This is the sort of control that communities exercise in matters other than language, too. It has often been regarded as an example of the "democratic" nature of the system.

In more modern discussion and communication systems, such as web-based discussion forums, blogs, StackExchange, Facebook, etc., English is the language of international communication. This is regarded as more or less self-evident and is seldom mentioned explicitly; anyone who joins such a system sees the situation.

The universal language position, once gained, tends to be strong. But how is such a position gained?

During the history of mankind, there have been several more or less universal languages or lingua francas, such as Latin (and Greek) in the Roman empire, mediaeval Latin in Western Europe, and later French and English. Universality is of course relative; it means universality in the "known world" or "civilized world", or just in a large empire. No language has been really universal (global), but the current position of English comes closest. The position of a universal language has always been gained as a by-product of some sort of imperialism: a nation has conquered a large area and more or less assimilated it into its own culture, including language, thus forming an empire. Usually the language of the conqueror has become the language of the state and the upper class first, then possibly spread through the society, sometimes almost wiping out the original languages of the conquered areas. Sometimes - especially in the Middle Ages - the imperialism has had a definite cultural and religious nature which may have been more important than brute military and economic force.

As regards the English language, it would have remained a national language of the English, had the English not first conquered the rest of the British Isles and then many other parts of the world. Later, some English colonies in a relatively small part of America rebelled, formed the United States of America, and expanded a lot. They formed a federal state where a variant of the English language was one of the few really uniting factors. And that federal state became, as we all know, wealthy and important. It also exercised traditional imperialism, but more importantly it gained a very important role in world economy and politics. Whether you call the US influence imperialism or neo-imperialism is a matter of opinion, but it certainly has effects similar to those of classical imperialism on maintaining and expanding the use of English.

This probably sounds like political criticism, but it is intended to be descriptive only. Personally, I do not regard imperialism as an incarnation of the Evil; it has had both positive and negative effects, and in many cases imperialism has been a necessary step from chaos to civilization.

Effects of the importance of the Internet and English

The importance of the Internet grows rapidly in all fields of human life, including not only research and education but also marketing and trade as well as entertainment and hobbies. This implies that it becomes more and more important to know how to use Internet services and, as a part of this, to read and write English.

Of course, the majority of mankind cannot use the Internet nowadays or in the near future, since they live in countries which lack the necessary economic and technological infrastructure. But the Internet causes polarization in developed countries, too: people are divided into Internet users and Internet illiterates, and as the use of the Internet grows and often replaces traditional methods of communication, the illiterates may find themselves in an awkward position.

In general, it is easy to learn to use Internet services. The worst problems of Internet illiteracy are, in addition to lack of economic resources of course, wrong attitudes. Older people are usually not accustomed to living in a world of continuous and rapid change, and they may not realize the importance of the Internet or how easy it is to learn to use it.

But although Internet services themselves are, generally speaking, easy to learn and use, you will find yourself isolated on the Internet if you are not familiar with English. This means that knowledge or lack of knowledge of English is one of the most severe factors that cause polarization. Learning to use a new Internet service or user interface may take a few hours, a few days, or even weeks, but it takes years to learn a language so that you can use it in a fluent and self-confident manner. Of course, when you know some English, you can learn more just by using it on the Internet, but at least currently the general tendency among Internet users is to discourage people who have problems with the English language. Incorrect English is much more likely to draw a few flames than encouragement and friendly advice.

In different countries and cultures, English has different positions. There are countries where English is the native language of the majority, there are countries where English is a widely known second language, and there are countries where English has no special position. These differences add to the above-mentioned polarization. Specifically, it is difficult for people in former colonies of countries other than Great Britain (e.g. France, Spain, the Netherlands) to adapt to the necessity of learning English. Locally, it may be necessary to learn the language of the former colonial power since it is often an official language and the common language of educated people; globally, English is necessary for living on the Internet. And the more languages you have to learn well, the less time and energy you will have for learning other things.

An official language for the Internet?

There is no conceivable way in which any authority could define an official language for the Internet. The Internet as a whole is not controlled by anyone or anything, and this could only change if, by a miracle, all countries made an agreement on it or if the entire world were brought under the control of one government.

Thus, if the question "whether or not English should be made the universal language of the internet" is interpreted as concerning the official status of English, the answer is simply that English, or any other language, cannot be made the official universal language. It is fruitless to ask whether an impossible thing should be done.

But can things change?

Things can change, and they actually do, often with unpredictable speed. The rapid fall of the Soviet empire - including the loss of the role of Russian as a "universal" language within it - is an indication of this.

English can lose its position as a widely used (although not official) universal language in two ways. Either a new empire emerges and its language becomes universal, or a constructed language becomes very popular. I believe most people regard both of these alternatives as extremely improbable, if not impossible. Perhaps they are right, perhaps not.

I can see two empires that might emerge: the European Union and a yet nonexistent Japanese-Chinese empire.

The European Union (EU) is a political and economic formation which is moving towards federalism. In many respects, the European Union already is a federal state, with less independence and autonomy for its constituents than the states have in the United States. Although people may present the EU as the successor of previous empires such as the Roman empire and the empire of Charlemagne, it is quite possible that the EU never becomes a real empire, since it seems to be inherently bureaucratic. Every empire needs a bureaucracy, of course, to promote the aims of its ruler(s), but the EU lacks true rulers. But if the EU ever becomes a true empire with a prominent role in the world, the language of the empire will hardly be any of the national languages in the EU, except possibly English. It is possible that the builders of the empire will realize the need for a relatively neutral universal language, and adopt Esperanto or some other constructed language for official purposes. In fact, such a choice would be extremely rational at the present stage of the EU, since a considerable portion of EU expenses is now spent on translation and interpretation between the official languages of the EU. A single official language of the EU might or might not be adopted by people worldwide as a universal language for everyday communication, including communication on the Internet.

It is, however, more realistic to expect that if the EU ever has a single language for its administration and politics, it will be English, possibly a specifically "pan-European English" or "EU English". The EU has extensive style guides, both with language-independent principles and with rules for different languages. So we might say that for internal use within EU organizations, an "EU English" already exists.

Japan is probably too small, both as a country and as a nation, to create an empire with its own forces, despite its flourishing technology and economy and efficient social organization. But its potential combined with the vast human and other resources of China would certainly constitute a basis for an empire that successfully competes with the United States and the European Union, even if the latter powers were (economically) strongly allied. Both Japan and China would have a lot to gain from intensive mutual cooperation, or alliance, confederation, or federation.

A Japanese-Chinese empire would have a difficult choice of language. It might decide to accept the role of English as a universal language, both for continuity and because selecting either Japanese or Chinese (Mandarin) would put the Japanese-Chinese union at risk. Alternatively, it might seriously consider using a constructed language - most probably not Esperanto but a language which is culturally more neutral, i.e. not dominantly Indo-European, for instance something like Loglan or Lojban.

Is English a suitable universal language?

Apart from being widely used and known, English is extremely unsuitable as a universal language. There are several reasons for this.

Any national language, i.e. a language which is or was originally the language of a particular tribe or nation, has obvious defects when used for international communication:

These remarks apply to English, too, and especially to English. One of the worst relics of English is the orthography. English has a very rich repertoire of idioms, and it typically has several words which have the same basic meaning but different connotations and stylistic value. Especially in international contexts you can never know what words mean to people with different backgrounds. Thus, you may occasionally get your basic message understood in some way, but you cannot tell in which way. This is of course an inherent problem in all human communication, but the nature of English makes it a really big problem.

English is an eclectic language which tends to borrow words from other languages instead of constructing words for new concepts from older words with derivation or word composition. People often say that English has a rich vocabulary as if it were something to be proud of. The richness of the vocabulary results basically from word borrowing and implies that words for related concepts are typically not related to each other in any obvious, regular manner. Word borrowing makes a language more international in one sense, but in the essential sense it makes it less suitable for international communication, since learning the vocabulary is more difficult.

A constructed international language?

The discussion above shows that it would be highly desirable to have a constructed language for international communication. It is well known that a large number of attempts to that effect have been made, with few results. Advocates of the basic idea have hardly agreed on anything but the basic idea, and most constructed languages have had no use as a language. People who strongly support the idea have typically designed their own proposal, a perfect language, and they do not want to accept anything that is not perfect - "best" is the worst enemy of "good".

The very idea is not inherently unrealistic, but it can only be realized if strong economic and political interests are involved, such as the intended creation of a European or Japanese-Chinese empire. The best that the advocates of a constructed international language can wish for is that such empires emerge and that the United States remains an important power, so that the world will have a few strong empires which cannot beat each other but must live in parallel and in cooperation. In such a situation, it might turn out that it is unrealistic not to agree on a common language which is not any of the national languages.

The role of the Internet in this hypothetical development would be to create the informational infrastructure for the discussion of the construction of the language, the construction work itself, the spreading of information about the language, the use of the language, and the continuous development of the language. Most probably the language would first be used in parallel with English, and the initial use would be for purposes such as international agreements, where national languages are clearly insufficient. For instance, if you need to formulate an agreement between two countries, you definitely need a neutral common language instead of having the text in two languages, each text allowing its own interpretations.

An alternative: machine translation

An alternative view of the future is that after a few years or decades, no universal language is needed: machine translation will allow you to use your own language. If the machine translation tools had sufficient quality and speed, you could sit at your terminal writing your news article or an IRC message in, say, Finnish, and another person in New Zealand would read your text in English, thanks to automatic translation "on the fly".

During the last few decades, quite a lot of predictions and even promises have been presented regarding machine translation, but useful software and systems for it have not been available until recently. This has caused disappointment and pessimism to the extent that many people consider machine translation definitely unrealistic.

Actually, machine translation is operational for a wide range of texts, although corrective actions by human translators may be necessary. Corrections are needed to resolve ambiguities which exist due to the limitations of the software and to fix errors caused by the fact that translation of human languages requires extralinguistic information.

Presumably, fully automatic correct translation will never be possible. However, this does not exclude the possibility of using automatic translation extensively. It only means that we must be prepared to accept a risk of translation errors - a risk that decreases with advances in technology but never reaches zero. Such risks exist when human translators are used, too, and in many respects automatic translation can be more reliable. Both human beings and computer programs err, in different ways.

In addition to the advancement of translation techniques, there are several ways in which the risk of errors in automatic translation can be decreased:

Currently the operational machine translation software is essentially based on syntactic analysis, so that semantic information is implicit in the dictionaries used by the software. An alternative approach, based on some kind of semantic analysis in addition to syntax, does not appear to be practically applicable yet.

Final remarks

Machine translation and constructed international languages are alternative but not mutually exclusive solutions to the problem of communication between people with different native languages. They can be combined in several ways.

A constructed language might form the basis of a semantics-oriented machine translation system. It could be used as an intermediate language, thus reducing the problem of making m × n translators from m languages to n languages into the problem of making m + n translators.
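The counting argument can be made concrete with a small sketch. The language lists and the name "PIVOT" for the intermediate language are illustrative assumptions only, not part of any actual translation system; the code merely enumerates which translators would have to be built in each design.

```python
from itertools import product


def direct_pairs(sources, targets):
    """One dedicated translator for every (source, target) pair: m * n in total."""
    return list(product(sources, targets))


def pivot_pairs(sources, targets, pivot="PIVOT"):
    """One translator into the intermediate language per source,
    plus one translator out of it per target: m + n in total."""
    return [(s, pivot) for s in sources] + [(pivot, t) for t in targets]


# Hypothetical example: m = 4 source languages, n = 5 target languages.
sources = ["Finnish", "Japanese", "Swahili", "Hindi"]
targets = ["English", "Spanish", "Arabic", "Russian", "Portuguese"]

direct = direct_pairs(sources, targets)
pivot = pivot_pairs(sources, targets)

print(len(direct))  # 4 * 5 = 20 translators
print(len(pivot))   # 4 + 5 = 9 translators
```

The gap widens quickly as languages are added: with 20 source and 20 target languages the direct design needs 400 translators, the intermediate-language design only 40. The price is that every text is translated twice, so errors from both steps can accumulate.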

A constructed language, specifically designed to allow exact and unambiguous expression, might also be more suitable than English for the role of the language of "authorized" translations.

The design of a constructed language which might achieve general use is, of course, a very difficult and controversial issue. A few years ago I wrote an (unpublished) article The Effect of Computer Technology on the Design of Artificial Languages which emphasizes computer tractability as an essential point in constructed language design.