Background information

The development and use of machine translation systems and computer-based translation tools

John Hutchins

Part 5

MT on the Internet
Future needs and developments

Comparison of human and machine translation

MT on the Internet

At the same time, many MT vendors have been providing network-based translation services for on-demand translation, with human revision as an optional extra. In some cases these are client-server arrangements for regular users; in other cases, the service is provided on a trial basis, enabling companies to discover whether MT is worthwhile for their particular circumstances and in what form. Such services are provided, for example, by Systran, Logos, Globalink, Fujitsu, JICST and NEC.

Some companies have now been set up primarily for this purpose: LANT in Belgium is a major example, based on its rights to develop the METAL system and on the Eurolang Optimizer, which it also markets (Caeyers 1997). Its speciality is the customisation of controlled languages for use with its MT and translation memory systems. In late 1997 it launched its multilingual service for the translation of electronic mail, Web pages and attached files. In Singapore, there is MTSU (Machine Translation Service Unit of the Institute of Systems Science, National University of Singapore), which uses its own locally developed systems for translation from English into Chinese, Malay, Japanese and Korean (with Chinese its main strength), with editing by professional translators. The service provides large-scale translation over the Internet for many customers worldwide (mainly multinational organisations), including many of the localisation needs of software companies in the Chinese-language markets (LISA Forum Newsletter 4(3), August 1995, p.12).

A further sign of the influence of the Internet is the growing number of MT software products for translating Web pages. Japanese companies have led the way: nearly all the companies mentioned above have a product in this lucrative market; they have been followed quickly elsewhere (e.g. by Systran, Globalink, Transparent Language, LogoVista). As well as PC software for translating Web pages, we are now seeing Internet services adding translation facilities: the most recent example is the availability on AltaVista of versions of Systran for translating French, German and Spanish into and from English – with what success or user satisfaction it is too early to say (Yang and Lange 1998).

Equally significant has been the use of MT for electronic mail and for ‘chat rooms’. Two years ago CompuServe introduced a trial service based on the Transcend system for users of the MacCIM Support Forum. Six months later, the World Community Forum began to use MT for translating conversational e-mail. Usage has rocketed (Flanagan 1996). Most recently, CompuServe introduced its own translation service for longer documents, either as unedited ‘raw’ MT or with optional human editing. Soon CompuServe will offer MT as standard for all its e-mail. As for Internet chat, Globalink has joined with Uni-Verse to provide a multilingual service.

This use is not a matter of simple curiosity, although that is how it often begins. CompuServe records a high percentage of repeat large-volume users for its service, about 85% for unedited MT – a much higher percentage than might have been expected. It seems that most of the usage is for the assimilation of information, where poorer quality is acceptable. The crucial point is that customers are prepared to pay for the product – and CompuServe is inundated with complaints if the MT service goes down!

It is clear that the potential for MT on, via and for the Internet is now being fully appreciated – no company can afford to be left behind, and all the major players have ambitious plans, e.g. Lernout & Hauspie (McLaughlin and Schwall 1998), which has now acquired MT systems from Globalink, Neocor and AppTek as well as the old METAL system (from GMS).

Future needs and developments

Despite the recent growth of systems for personal computers and of Internet services, it is still true to say that there is nothing yet really suitable for the independent professional translator, i.e. for those not working for large companies or in translation organisations. It is known that some translators have tried to apply commercial PC-based software to their needs, but the amount of adaptation required and the generally poor output have made such software unsatisfactory and uneconomic. More suitable for the independent translator would be a cost-effective translation workstation. However, current workstations on the market are still too expensive for the individual translator. Although there is promise of low-cost computer tools for this potentially large market – e.g. terminology and concordancing software, and perhaps alignment software – there is no doubt that this segment is not being covered as well as many other areas.

Another area at present poorly served is the need for reliable but low-cost translation of documents into unknown foreign languages where users do not want to engage expert bilingual translators. There is no problem with translation into recipients’ own languages – PC systems can give adequate ‘rough’ versions for users to get some idea of the basic message – but for translation into an unknown language there are still no solutions. Recently there have been some cheap Japanese products which serve this specific ‘foreign language authoring’ demand in the case of writing business letters (based on standard phrases and document templates), but for other areas and for longer documents, where there is less ‘stereotyping’, there is nothing as yet. For translation into another language unknown (or poorly known) by the sender, what is really required is software which can be relied upon to provide good quality output (and most PC products are not good enough). A number of research groups are investigating interactive systems, where the sender composes an MT-friendly version of a letter or document in collaboration with the computer. With a sufficiently ‘normalised’ input text, the MT system can guarantee grammatically and stylistically correct output. As yet, however, this work (e.g. at GETA in France) is still at the laboratory stage (Boitet and Blanchon 1995).

The same is true for software combining MT with information access, information extraction, and summarisation software. There are no commercial systems yet on the market; developments are still at the research stage. The potential and the demand have been recognised: for example, in recent years, most research funds of the European Union have been focused not on MT or ‘pure’ natural language processing (as they were during the 1980s), but on projects for multilingual tools with direct applications in mind; many involve translation of some kind, usually within a restricted subject field and often in controlled conditions (Hutchins 1996; Schütz 1996). As just one example, the AVENTINUS project is developing a system for police forces in the area of drug control and law enforcement: information about drugs, criminals and suspects will be available on databases accessible in any of the European Union languages.

There is growing interest in such multilingual applications worldwide. The application that has received most attention has been ‘cross-language information retrieval’, i.e. software enabling users to search foreign language databases in their own languages. As yet most work has focused on the construction and operation of appropriate translation dictionaries, for the matching of query words against words or phrases in document databases (Bian and Chen 1998; Oard 1998) – although the provision of software for fast translation of original texts into the enquirer’s own language is naturally also envisaged (McCarley and Roukos 1998). Clearly it will not be long before commercial software is available for this application.
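The dictionary-based matching described above can be illustrated with a minimal sketch. It is not taken from any of the cited systems: the toy English-to-French dictionary, the three sample documents and the function names `translate_query` and `search` are all hypothetical, and ranking by simple term overlap is an assumption made only to keep the example short.

```python
# Hypothetical toy English-to-French translation dictionary (illustrative only).
BILINGUAL_DICT = {
    "machine": ["machine"],
    "translation": ["traduction"],
    "drug": ["drogue", "médicament"],
}

# Hypothetical French document collection: document id -> text.
DOCUMENTS = {
    "d1": "la traduction automatique par machine",
    "d2": "le trafic de drogue en Europe",
    "d3": "un nouveau médicament contre la grippe",
}


def translate_query(query, dictionary):
    """Replace each query word by all of its dictionary translations."""
    terms = []
    for word in query.lower().split():
        # Words missing from the dictionary are kept unchanged (e.g. names).
        terms.extend(dictionary.get(word, [word]))
    return terms


def search(query, dictionary, documents):
    """Return ids of matching documents, best match first.

    The score is the number of distinct translated query terms found in
    the document; ties are broken by document id.
    """
    terms = set(translate_query(query, dictionary))
    scores = {}
    for doc_id, text in documents.items():
        overlap = len(terms & set(text.lower().split()))
        if overlap:
            scores[doc_id] = overlap
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

In this sketch the English query "drug" retrieves both d2 and d3, because every dictionary translation is kept. That shows in miniature why the construction of the translation dictionary matters: an ambiguous entry widens the search to documents on unrelated subjects.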

The future application that is probably most desired by the general public is the translation of spoken language. But, from a commercial (and even research) perspective, the prospects for automatic speech translation are still distant (Krauwer et al. 1997). It was only in the 1980s that developments in speech recognition and synthesis made spoken language translation a feasible objective. In Japan a joint government and industry company, ATR, was established in 1986 near Osaka, and it is now one of the main centres for automatic speech translation. The aim is to develop a speaker-independent real-time telephone translation system for Japanese to English and vice versa, initially for hotel reservation and conference registration transactions. Other speech translation projects have been set up subsequently. The JANUS system is a research project at Carnegie-Mellon University and at Karlsruhe in Germany. The researchers are collaborating with ATR in a consortium (C-STAR), each developing speech recognition and synthesis modules for their own languages (English, German, Japanese). (One by-product of this research was mentioned earlier: the rapid-deployment project for custom-built systems in less-common languages.) The fourth major effort in speech translation is the long-term VERBMOBIL project funded by the German Ministry for Research and Technology, which began in May 1993. The aim is a portable aid for business negotiations as a supplement to users’ own knowledge of the languages (German, Japanese, English). Numerous German university groups are involved in fundamental research on dialogue linguistics, speech recognition and MT design; a prototype is nearing completion, and a demonstration product is targeted for early in the next century.

Speech translation is probably at present the most innovative area of computer-based translation research, and it is attracting the most funding and the most publicity. However, few experienced observers expect dramatic developments in this area in the near future – the development of MT for written language has taken many years to reach the present stage of widespread practical use in multinational companies, a wide range of PC-based products of variable quality and application, and growing use on networks and for electronic mail. Despite today’s high profile for written-language MT, researchers know that there is still much to be done to improve quality. Spoken-language MT has not yet reached even the stage of real-time testing in non-laboratory settings.

Comparison of human and machine translation

From this survey it should be apparent that the application of computers to the task of translating natural languages has not been and is unlikely to be a threat to the livelihood of professional translators. Those skills which the human translator can contribute will always continue to be in demand. There is no prospect, for example, that machine translation could ever attempt the translation of literary or legal texts. By contrast, for the rough translation of electronic texts on the Internet machine translation has no rival – human translators cannot compete in terms of speed, even if they were prepared to undertake poor quality translation of ephemeral material.

We may compare the relative merits of human and machine translation according to the categories of need and use outlined at the beginning of this paper. As far as the dissemination function (production of publishable translations) is concerned, human translation is more satisfactory and less costly overall whenever it is a question of translating one particular text in a unique subject domain (whether scientific, technical, medical, legal or literary). Machine translation demands the costly investment of dictionary maintenance and updating and the costly involvement of post-editing. This can be justifiable (i.e. cost-effective) only when large volumes of documentation within a particular domain are being translated. It is even more justifiable if translation is into more than one target language (when pre-editing and/or vocabulary and grammar control of original texts is possible), and when there is considerable repetition. For such tasks, the human translator would be overwhelmed by the scale of the task, by the boring repetitiveness and by the need to maintain terminological consistency. By contrast, the computer can handle large volumes and can automatically maintain consistency. In brief, machine translation is ideal for large scale and/or rapid translation of (boring) technical documentation, (highly repetitive) software localisation manuals, and real-time translation of weather reports. The human translator is (and will remain) unrivalled for non-repetitive linguistically sophisticated texts (e.g. in literature and law).
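The cost argument above can be made concrete with a small break-even calculation. This is only a sketch under a simplifying assumption that is not in the text: human translation is taken to cost a fixed amount per word, while MT incurs a one-off setup cost (dictionary maintenance and updating) plus a lower per-word post-editing cost. The function name `break_even_volume` and all parameter names are hypothetical.

```python
def break_even_volume(setup_cost, post_edit_cost_per_word, human_cost_per_word):
    """Volume in words above which MT with post-editing is cheaper.

    Assumed linear cost model (an illustration, not from the source):
      human cost = human_cost_per_word * volume
      MT cost    = setup_cost + post_edit_cost_per_word * volume

    Returns None when post-editing costs at least as much per word as
    human translation, since MT then never pays off under this model.
    """
    saving_per_word = human_cost_per_word - post_edit_cost_per_word
    if saving_per_word <= 0:
        return None
    return setup_cost / saving_per_word
```

With purely illustrative figures, `break_even_volume(10000, 0.25, 0.5)` gives 40000.0: below 40,000 words the setup cost is never recovered, which matches the point that a single text in a unique subject domain is better left to a human translator, while large volumes within one domain can justify the investment.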

For the translation of texts for assimilation, where the quality of output can be poorer than that for texts to be published, it is clear that machine translation is an ideal solution. Human translators are not prepared (and resent being asked) to produce ‘rough’ translations of scientific and technical documents that may be read by only one person who merely wants to find out the general content and information and is unconcerned whether everything is intelligible or not, and who is certainly not deterred by stylistic awkwardness or grammatical errors. Of course, such readers might prefer to have output better than that presently provided by most MT systems, but if the only alternative is no translation at all then machine translation is fully acceptable.

For the interchange of information, there may continue to be a role for the human translator in the translation of business correspondence (particularly if the content is sensitive or legally binding). But for the translation of personal letters, MT systems are likely to be increasingly used; and, for electronic mail and for the extraction of information from Web pages and computer-based information services, MT is the only feasible solution.

For spoken translation, by contrast, there will be a continuing market for the human translator. There is surely no prospect of automatic translation replacing the interpreter of diplomatic and business exchanges. Although there has been research on the computer translation of telephone enquiries within highly constrained domains, and future implementation can be envisaged in this area, for the bulk of telephone communication there is unlikely ever to be any substitute for human translation.

Finally, MT systems are opening up new areas where human translation has never featured: the production of draft versions for authors writing in a foreign language, who need assistance in producing an original text; the on-line translation of television subtitles; and the translation of information from databases. No doubt more such new applications will appear in the future. In these areas, as in others mentioned, there is no threat to the human translator because they were never included in the sphere of professional translation. There is no doubt that MT and human translation can and will co-exist in harmony and without conflict.
