Background information

The development and use of machine translation systems and
computer-based translation tools

John Hutchins

Part 4

Translation workstations
Localisation of software 
Systems for personal computers

Translation workstations

In the 1990s, the possibilities for large-scale translation broadened with the appearance on the market of translation workstations (or translator workbenches). The original ideas for integrating various computer-based facilities for translators in one place go back to the early 1980s, in particular with the systems from ALPS. Translation workstations combine multilingual word processing, means of receiving and sending electronic documents, OCR facilities, terminology management software, facilities for concordancing, and in particular ‘translation memories’. A translation memory is a facility that enables translators to store original texts and their translated versions side by side, i.e. so that corresponding sentences of the source and target are aligned. The translator can thus search the translation memory for a phrase or even a full sentence in one language and have the corresponding phrases in the other language displayed. These may be either exact matches or approximations ranked according to closeness.
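The lookup described above can be illustrated with a minimal sketch. It is not the method of any particular workstation: the sample sentence pairs, the function name lookup, the 0.6 threshold, and the use of a character-based similarity ratio from Python's standard difflib module are all assumptions made for illustration. Commercial products are likely to use more elaborate matching.

```python
from difflib import SequenceMatcher

# Hypothetical toy translation memory: aligned (source, target) sentence pairs.
MEMORY = [
    ("Press the power button to start the printer.",
     "Appuyez sur le bouton d'alimentation pour démarrer l'imprimante."),
    ("Remove the paper tray before cleaning.",
     "Retirez le bac à papier avant le nettoyage."),
    ("Switch off the printer before cleaning.",
     "Éteignez l'imprimante avant le nettoyage."),
]


def lookup(query, memory, threshold=0.6):
    """Return (score, source, target) tuples scoring >= threshold, best first.

    A score of 1.0 is an exact match (ignoring case); lower scores are
    approximate ("fuzzy") matches. The threshold is an assumed cut-off.
    """
    scored = []
    for source, target in memory:
        # ratio() gives a similarity between 0.0 and 1.0 for the two strings.
        score = SequenceMatcher(None, query.lower(), source.lower()).ratio()
        if score >= threshold:
            scored.append((score, source, target))
    # Rank candidates by closeness, closest first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


if __name__ == "__main__":
    for score, source, target in lookup(
        "Press the power button to start the scanner.", MEMORY
    ):
        print(f"{score:.2f}  {source}  ->  {target}")
```

A sentence already in the memory comes back with a score of 1.0, while a slightly altered sentence returns the stored pair with a lower score, which the translator can then accept, edit, or reject.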

It is often the case in large companies that technical documents, manuals, etc. undergo numerous revisions. Large parts may remain unchanged from one version to the next. With a translation memory, the translator can locate and re-use already translated sections. Even if there is not an exact match, the versions displayed may be usable with minor changes. There will also be access to terminology databases, in particular company-specific terminology, for words or phrases not found in the translation memory. In addition, many translator workstations now offer full automatic translation using MT systems such as Systran, Logos, and Transcend. The translator can choose to use them either for the whole text or for selected sentences, and can accept or reject the results as appropriate (Heyn 1997).

There are now four main vendors of workstations: Trados (probably the most successful), STAR AG in Germany (Transit), IBM (the TranslationManager), and LANT in Belgium (the Eurolang Optimizer, previously sold by SITE in France). The translation workstation has revolutionised the use of computers by translators. They now have a tool over which they are in full control. They can use any of the facilities or none of them as they choose. As always, the value of each resource depends on the quality of the data. As in MT systems, the dictionaries and terminology databases demand effort, time and resources. Translation memories rely on the availability of suitably large corpora of authoritative translations – there is no point in using translations which are unacceptable (for whatever reason) to the company or the client.

Although widely used by administrators within the European Commission, the full-scale MT system Systran is relatively little used by the Commission’s professional translators. For them, the translation service is developing its own workstation facility, EURAMIS, i.e. European Advanced Multilingual Information System (Theologitis 1997). This combines access to the Commission’s own very large multilingual database (Eurodicautom), the dictionary resources of Systran, facilities for individual and group terminology database creation and maintenance (using Trados’ MultiTerm software), translation memory (again for individuals and groups), access to CELEX (the full-text database of European Union legislation and directives), software for document comparison (to detect where changes have taken place), and also, of course, access to the Systran MT systems themselves. The latter are now available from English into Dutch, French, German, Greek, Italian, Portuguese, and Spanish; from French into Dutch, English, German, Italian, and Spanish; from Spanish into English and French; and from German into English and French. The whole EURAMIS system is linked to other facilities such as authoring tools (spelling, grammar and style checkers, and multilingual drafting aids), the internal European Commission administrative network, and to outside resources on the Internet.

Localisation of software

One of the fastest growing areas for the use of computers in translation is in the industry of software localisation. Here the demand is for supporting documentation to be available in many languages at the time of the launch of new software. Translation has to be done quickly, but there is much repetition of information from one version to another. MT and, more recently, translation memories in translation workstations are the obvious solution (Schaeler 1996). Among the first in this field was the large software company SAP AG in Germany. They use two MT systems: METAL for German to English translation, and Logos for English to French, and plan to introduce further systems for other language pairs.

Most localisation, however, is based on the translation memory and workstation approach. Typical examples are Corel, Lotus, and Canon. It is interesting to note that much of this localisation activity is based in Ireland – thanks to earlier government and European Union support for the computer industry. However, localisation is a multi-national and global industry, with its own organisation (the Localization Industry Standards Association, based in Geneva) holding regular seminars and conferences on all continents. (For details see LISA Forum Newsletter.)

Localisation companies have been at the forefront of efforts in Europe to define standardised lexical resource and text handling formats, and to develop common network infrastructures. This is the OTELO project coordinated by Lotus in Ireland, with other members such as SAP, Logos, and GMS. The need for a general translation environment for a wide variety of translation memory, machine translation and other productivity tools is seen as fundamental to the future success of companies in the localisation industry.

Systems for personal computers

Software for personal computers began to appear in the early 1980s (with the Weidner MicroCAT system becoming particularly successful). Nearly all the main Japanese computer companies produced systems for translation to and from English, e.g. the PIVOT system from NEC, the ASTRANSAC system from Toshiba, HICATS from Hitachi, PENSEE from Oki and DUET from Sharp.

Outside Japan, systems for personal computers began to appear a little earlier, but from relatively few companies. The first American systems came in the early 1980s from ALPS and from Weidner. The ALPS products were intended primarily as aids for translation, providing tools for accessing and creating terminology resources, but they did include an interactive translation module. Although the products at first sold with some success, the producers concluded by the end of the decade that the market was not yet ready, and the products were in effect withdrawn. Instead, ALPS turned itself into a translation service (ALPNET), using its own tools internally. By contrast, Weidner sold a full translation system in a growing number of language pairs (English, French, German, Spanish), and the business flourished. Weidner produced two versions of its systems: MicroCAT for small personal computers, and MacroCAT for larger minicomputers or workstations. The company was then purchased by the Japanese company Bravis, and a Japanese version was marketed, but soon afterwards the owner decided that the MT market for personal computers was still undeveloped, and the business was sold. MicroCAT disappeared completely, but MacroCAT was purchased by Intergraph, who modified and developed it for its range of publishing software and sold it later as Transcend; recently Transcend was acquired by Transparent Language Inc. (For these developments see Hutchins 1993, 1994.)

At the end of the 1980s, most of the commercial systems on the market today appeared. First came the PC-Translator systems (from Linguistic Products, based in Texas) for low-end personal computers. Over the years, many language pairs have been produced and marketed, apparently successfully as far as sales are concerned. Next came Globalink with systems for French, German and Spanish to and from English. (There was also a Russian-English system deriving essentially from the original owner’s experience on the 1960s Georgetown project.) Within a few years, Globalink merged with MicroTac, a company which had been very successful in selling its cheap Language Assistant series of PC software – essentially automatic dictionaries, with minimal phrase translation facility. In the early 1990s, Globalink produced its now well-known ‘Power Translator’ series for translation of English to and from French, German and Spanish. More recently, Globalink has marketed the more advanced ‘Telegraph’ series of translation software products, and the company itself has been acquired by Lernout & Hauspie, a leading speech technology company.

Since the beginning of the 1990s, many other systems for personal computers have appeared. For Japanese and English there are now also LogoVista from the Language Engineering Corporation, and Tsunami and Typhoon from Neocor Technologies (also now owned by Lernout & Hauspie). From the former Soviet Union – where particularly in the 1960s and 1970s there was very active research on MT – we now have Stylus (recently renamed ProMT) and PARS, both marketing systems for Russian and English translation; Stylus also offers French, and PARS also Ukrainian. Other PC-based systems from Europe include: Hypertrans for translating between Italian and English; the Winger system for Danish-English, French-English and English-Spanish, now also marketed in North America; and TranSmart, the commercial version of the Kielikone system, for Finnish-English translation.

Vendors of older mainframe systems (Systran, Fujitsu, Metal, Logos) are being obliged to compete by downsizing their systems; many have done so with success, managing to retain most features of their mainframe products in the PC-based versions. Systran Pro and Systran Classic, for example, are Windows-based versions of the successful system developed since the 1960s for clients worldwide in a large range of languages; the large dictionary databases offered by Systran give these systems clear advantages over most other PC products. Both Systran Classic (for home use) and Systran Pro (for use by translators) are now sold for under five hundred dollars in many language pairs: English-French, English-German, English-Spanish; and for English to Italian and Japanese to English. The publishing company Langenscheidt acquired rights to sell a version of METAL, in collaboration with GMS (Gesellschaft für Multilinguale Systeme, now owned by Lernout & Hauspie) – the system is called ‘Langenscheidt T1’ and offers various versions for German and English translation. Also from Germany is the Personal Translator, a joint product of IBM and von Rheinbaben & Busch, based on the LMT (i.e. Logic-Programming based Machine Translation) transfer-based system under development since 1985. LMT itself is available as an MT component for the IBM TranslationManager. Both Langenscheidt T1 and the Personal Translator are intended primarily for the non-professional translator, competing therefore with Globalink and similar products. (For these developments see proceedings of MT conferences: AMTA, EAMT, MT Summit, and MT News International.)

Sales of commercial PC translation software have shown a dramatic rise. There are now estimated to be some 1000 different MT packages on sale (when each language pair is counted separately). The products of one vendor (Globalink) are present in at least 6000 stores in North America alone; and in Japan one system (Korya Eiwa from Catena, for English-Japanese translation) is said to have sold over 100,000 copies in its first year on the market. Though it is difficult to establish how much of the software purchased is regularly used (some cynics claim that only a very small proportion is tried out more than once), there is no doubting the growing volume of ‘occasional’ translation, i.e. by people from all backgrounds wanting renderings of foreign texts in their own language, or wanting to communicate in writing with others in other languages, and who are not deterred by poor quality. It is this latent market for low-quality translation, untapped until very recently, which is now being discovered and which is contributing to massive increases in sales of translation software.

