This information has been compiled mainly for people who make decisions on
localization of software or on translations of texts. It also helps people who
implement such decisions. This presentation deals with those features of Finnish
that impose requirements on software design or translation processes. No
previous knowledge of Finnish is assumed here.
In addition
to the common Latin letters, the letters ä and ö (in uppercase Ä and Ö) are necessary for Finnish texts. There is no accepted
way to replace them. The letters š and ž are
desirable, as they are part of official orthography, but in practice (though
not officially) they are often replaced by sh and zh.
A Finnish
word may be a compound word and it may contain several suffixes. A compound
word often corresponds to two or more words in another language. For example, keskushermosto is “central nervous system”. This means
that an English word, such as “central” or “nervous”, often cannot be
translated into Finnish without knowing at least some of the context.
The
suffixes often correspond to prepositions or other small words in other
languages. For example, taloissammekin consists of
the base word talo and four suffixes and means “in
our houses, too”. Thus, e.g. in translation from English to Finnish, it is
usually necessary to have at least a few consecutive words to work on, and it
is unrealistic to require “word-for-word” translations.
As a rule, a complete clause (with subject, verb, etc.) is the smallest feasible
unit of translation. When individual words and phrases, such as menu item texts
or button texts, need to be translated, they should be presented grouped by
context and with suitable explanations if possible.
Expressions like “five apples” or “5 apples” pose special problems when generated programmatically. For English, you can mostly use simple code that just appends “s” to the noun if the number is not one (1). In Finnish, the noun must be in a special case form, the partitive, e.g. 5 omenaa versus 1 omena or 5 hevosta versus 1 hevonen. This means that you either need to store the partitive forms of all nouns that may appear, in addition to the basic form, or to have a rather complicated algorithm that constructs the partitive forms. If you only store the partitive forms and use them even when the number is one (e.g., 1 omenaa, 1 hevosta), the result is understandable but odd-looking and ungrammatical, comparable to a presentation that uses “1 apples” and “1 horses” in English.
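The first option above, storing both forms of each noun, can be sketched as follows. This is only a minimal illustration: the table NOUN_FORMS and the function count_phrase are hypothetical names, and the table contains just the two nouns used as examples above.

```python
# Hypothetical lookup table: each noun is stored with its basic form
# (nominative singular) and its partitive singular form.
NOUN_FORMS = {
    "apple": {"basic": "omena", "partitive": "omenaa"},
    "horse": {"basic": "hevonen", "partitive": "hevosta"},
}


def count_phrase(number: int, noun_key: str) -> str:
    """Build a phrase like '5 omenaa' or '1 omena' from stored forms."""
    forms = NOUN_FORMS[noun_key]
    # The basic form is used only with the number one;
    # other numbers take the partitive.
    form = forms["basic"] if number == 1 else forms["partitive"]
    return f"{number} {form}"


print(count_phrase(5, "apple"))  # 5 omenaa
print(count_phrase(1, "horse"))  # 1 hevonen
```

The sketch keys the table by an English identifier so that the calling code never has to construct a Finnish form itself.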
In general,
word order cannot be preserved when translating into (or from) Finnish. The
normal order of parts of a clause is often different from the order in English.
For example, even a simple clause like “A new proposal was made” must be
translated using a different order: Tehtiin uusi esitys, putting the verb (tehtiin, “was made”) at the start. The
reason is that Finnish lacks articles, and the distinction that English makes
by using “a” or “the” must be made using other means, such as word order.
To take another example, a sentence like “There is a rat in the house” cannot be reasonably translated so that the order of the words for rat (rotta) and house (talo) is preserved. The natural Finnish expression is Talossa on rotta.
Although Finnish is often said to have “free word order”, the order is significant. It just often expresses different things than word order in English. Thus, a requirement that a specific order of words or expressions be preserved in translation is generally unrealistic.
Finnish has
a large number of inflected forms for nouns, adjectives, numerals, pronouns,
and verbs. In general, not all the forms can be derived
from the basic form alone. Two words may well have the same basic form but
different inflection. Therefore, when storing a word as a vocabulary entry,
inflection information should be stored as part of it.
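As a rough sketch of what such a vocabulary entry might look like, the following stores a few inflected forms next to the basic form. The class VocabularyEntry and its fields are hypothetical, and the word kuusi (which means both “six” and “spruce”) is an illustration chosen here, not an example from the text above; a real entry would need more forms or an inflection class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VocabularyEntry:
    """Hypothetical vocabulary entry that carries inflection information."""
    basic_form: str
    gloss: str
    genitive: str
    partitive: str


# Two distinct words share the basic form "kuusi" but inflect differently.
ENTRIES = [
    VocabularyEntry("kuusi", "six", "kuuden", "kuutta"),
    VocabularyEntry("kuusi", "spruce", "kuusen", "kuusta"),
]


def lookup(basic_form: str) -> list[VocabularyEntry]:
    """Return every entry with the given basic form."""
    return [entry for entry in ENTRIES if entry.basic_form == basic_form]


for entry in lookup("kuusi"):
    print(entry.gloss, entry.genitive, entry.partitive)
```

Looking up by the basic form alone returns two entries, which is why the basic form cannot serve as the sole key when inflected forms are needed.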
When
translating a word into Finnish, the sentence context is needed for the
selection of a proper form. For example, it is impossible to
give a single translation for the English word form “hats”, since it should be translated as
hattuja when used in an advertisement text like “new
hats for sale”, as hattua when occurring in “I have
five hats”, as hatut when used as a label in a
product catalog, etc.; and the phrase “in my hats” should be translated as a
whole as hatuissani.
Word
inflection is applied to proper names (including foreign names) and
abbreviations, too. In abbreviations, the colon “:” appears before the suffix,
e.g. EU:ssa “in the EU”.
Word division after the colon (as applied by some software) is not acceptable.
Sometimes
companies impose a requirement that a company name or a product name be used in
one form only. This is impossible in Finnish, as impossible as it would be to
write about a product in English without ever using any preposition before the
product name. When a trademark symbol is appended to
a name, it is written after the inflected form, e.g. Aspirin® (basic
form), Aspirinin® (genitive). The only way to avoid all inflection of a
name is to use a hyphenated compound word with the name as the first part and a
generic noun as the second part, so that the second part is inflected, e.g. Aspirin-lääke (lääke means
medicine), genitive Aspirin-lääkkeen. Such texts
look clumsy and artificial.
When
patterns such as “from … to …” need to be translated, the process should deal
with each pattern as a whole rather than translate just “from” and “to”. Those
prepositions simply have no translations as such in Finnish; they need to be
translated by attaching a suitable suffix to the next word. The suffix depends
on the context and on the word, and there may be a change in the word stem
involved. For example, “from Helsinki to Vantaa” should be translated as Helsingistä Vantaalle and “from Tampere to London” as Tampereelta Lontooseen.
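One way to treat the pattern as a whole is to store the needed inflected forms of each place name and let the pattern select among them. The sketch below assumes a hypothetical table PLACE_FORMS and function from_to; the forms Vantaalta, Tampereelle, and Lontoosta do not appear in the text above and are supplied here only to complete the table.

```python
# Hypothetical table of stored forms: the "from" and "to" forms of each
# place name differ in suffix, and the stem may change as well.
PLACE_FORMS = {
    "Helsinki": {"from": "Helsingistä", "to": "Helsinkiin"},
    "Vantaa": {"from": "Vantaalta", "to": "Vantaalle"},
    "Tampere": {"from": "Tampereelta", "to": "Tampereelle"},
    "London": {"from": "Lontoosta", "to": "Lontooseen"},
}


def from_to(origin: str, destination: str) -> str:
    """Translate the whole pattern 'from X to Y' using stored forms."""
    return f"{PLACE_FORMS[origin]['from']} {PLACE_FORMS[destination]['to']}"


print(from_to("Helsinki", "Vantaa"))  # Helsingistä Vantaalle
print(from_to("Tampere", "London"))   # Tampereelta Lontooseen
```

Note that the output contains no separate words for “from” and “to”; the pattern only decides which stored form of each name is used.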
The
importance of word inflection also means that search routines that simply operate
on words are of very limited usefulness for Finnish. A word may have dozens
(even hundreds) of inflected forms. Search engines like Google can deal with
this in a limited manner. In most situations, it is more or
less sufficient to have the ability to search with wildcards at the end
of a string. For example, “Helsin*”, where “*” is a wildcard, would find Helsinki, Helsinkiin, Helsingissä, and all the other inflected forms.
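A trailing-wildcard search of this kind can be sketched with the fnmatch module of the Python standard library. The function name wildcard_search and the sample word list are assumptions made for this illustration.

```python
import fnmatch


def wildcard_search(pattern: str, words: list[str]) -> list[str]:
    """Return the words that match a shell-style pattern such as 'Helsin*'."""
    # fnmatchcase compares case-sensitively on every platform.
    return [word for word in words if fnmatch.fnmatchcase(word, pattern)]


words = ["Helsinki", "Helsinkiin", "Helsingissä", "Vantaalle"]
print(wildcard_search("Helsin*", words))
# ['Helsinki', 'Helsinkiin', 'Helsingissä']
```

The pattern is cut before the point where the stem changes (Helsinki, but Helsingissä), so the user must pick a prefix that is common to all inflected forms.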
For similar reasons, automatic checks for consistency of use of terms generally fail if they do not recognize inflected forms. Although a Finnish noun has dozens of forms (when all possible suffixes are counted), typically only a handful of them occur in normal text when the noun is a term. This means that recognition of inflected forms can even be handled in a simplistic manner by listing the most common forms in the term glossary.
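The simplistic glossary approach might look like the following sketch. The names GLOSSARY and find_terms are hypothetical, and the listed forms of hattu (“hat”) are only the handful mentioned earlier in this text, not a complete paradigm.

```python
import re

# Hypothetical term glossary: each term lists its most common inflected forms.
GLOSSARY = {
    "hattu": {"hattu", "hatut", "hattua", "hattuja"},
}

# Reverse index from each listed form to the term it belongs to.
FORM_TO_TERM = {
    form: term for term, forms in GLOSSARY.items() for form in forms
}


def find_terms(text: str) -> list[str]:
    """Return the glossary terms whose listed forms occur in the text."""
    found = []
    for token in re.findall(r"\w+", text.lower()):
        if token in FORM_TO_TERM:
            found.append(FORM_TO_TERM[token])
    return found


print(find_terms("Minulla on viisi hattua"))  # ['hattu']
```

A form that is not listed, such as hatuissani, is silently missed, which is the price of the simplicity.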
Long words
are common in Finnish due to many suffixes and compound words. Words longer
than 20 characters appear often in business texts. For such reasons,
hyphenation is desirable. Without hyphenation, lines tend to be of different
lengths, causing either a very ragged right margin or (in a justified column)
very wide gaps between words.
As a rule, the length of a piece of text should be expected to vary greatly when translated into another language, even doubled or more. For this reason, fixed width settings on texts should be avoided or set rather liberally. For example, in user interfaces, a menu item like “Save As” is usually (and properly) translated into Finnish as Tallenna nimellä.
The basic
hyphenation rules are simple and easy to implement in software. However,
compound words require special attention. Good hyphenation requires either software
that knows how to recognize the components of compound words or manual
checking. Hyphenating Finnish texts with English hyphenation rules produces
unacceptable results.
In the use
of capital letters, Finnish generally follows continental European (e.g.
French) tradition rather than English practice. This means that normally only
the first letter of a sentence (or a sentence-like separate expression) and the
first letter of each proper noun are in upper case. Derivations of proper names,
such as englanti (English language) and englantilainen (English or Englishman or Englishwoman),
are not treated as proper names.
Capitalizing almost every word in a title of a work, which is common in English (e.g., “On the Origin of Species”), is definitely incorrect in Finnish. Capitalizing words for emphasis, as in “Very Important” (Hyvin Tärkeää), is not normal in Finnish and may make a very childish impression.
If text is
written in all upper case, care should be taken to make sure that ä and ö are capitalized, too.
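The following sketch shows how the problem can arise. The function ascii_upper is a hypothetical stand-in for an uppercasing routine that only knows the letters a to z; it is contrasted with Python's built-in str.upper, which is Unicode-aware.

```python
def ascii_upper(text: str) -> str:
    """Hypothetical ASCII-only uppercasing: handles a-z and nothing else."""
    return "".join(
        chr(ord(ch) - 32) if "a" <= ch <= "z" else ch for ch in text
    )


word = "tekstiä"
print(ascii_upper(word))  # TEKSTIä  (the ä is left in lower case)
print(word.upper())       # TEKSTIÄ
```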
For
business documents, a requirement on writing some words in all upper case is
often made. Typically, the words are company or product names or terms used in
a contract, such as COMPANY and CUSTOMER. Such style has traditionally not been
used in Finnish, and language authorities recommend against it, but it has
become increasingly common.
The
standard alphabetic order in Finnish is A B C D E F G H I J K L M N O P Q R S
(Š) T U V (W) X Y Z (Ž) Å Ä Ö. Letters in parentheses are treated as equivalent
to the preceding letter. However, it is increasingly common and now standard
to treat W as a letter of its own, placed after V.
Sorting
algorithms designed for English do not sort Finnish words correctly, since they
treat Å, Ä, and Ö as variants of A and O, rather than as separate letters at
the end of the alphabet. On the other hand, sorting tailored for Finnish often
treats W as a variant of V, instead of applying the modern approach.
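A minimal sketch of a sort key that follows the order described above is given below, with W as a letter of its own and Å, Ä, Ö at the end. The names ALPHABET, EQUIVALENT, and finnish_key are hypothetical, and the handling of characters outside the alphabet is a simplifying assumption. Production software would normally rely on a locale-aware collation library instead.

```python
# Alphabet in Finnish order, with W as a separate letter after V.
ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {letter: index for index, letter in enumerate(ALPHABET)}

# Letters treated as equivalent to another letter when sorting.
EQUIVALENT = {"š": "s", "ž": "z"}


def finnish_key(word: str) -> list[int]:
    """Map a word to a list of letter ranks usable as a sort key."""
    key = []
    for ch in word.lower():
        ch = EQUIVALENT.get(ch, ch)
        # Simplification: any other character sorts after all letters.
        key.append(RANK.get(ch, len(ALPHABET)))
    return key


names = ["Öljy", "Aalto", "Zeppelin", "Äiti", "Oulu", "Åbo"]
print(sorted(names, key=finnish_key))
# ['Aalto', 'Oulu', 'Zeppelin', 'Åbo', 'Äiti', 'Öljy']
```

Because the key ignores case and treats š and ž as fully equal to s and z, words that differ only in those respects keep their input order.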
Finnish
uses symmetric quotation marks: ”tekstiä”
and (within a quotation) ’tekstiä’. The opening and closing mark are
identical and correspond to the closing mark as used in English, e.g. “text” or
‘text’.
Finnish has no separate male or female pronoun. The same pronoun hän is used for both sexes. This may cause unintended ambiguity in translations. A common technique to avoid that is to use people’s names instead of pronouns when needed.