When not to use MT and other translation tools

Terence Lewis

Abstract

The letter “L” often crops up in such decision-making. Love, literature, law: these are some of the areas where translation tools, particularly MT programs, might struggle to convey meaning in a useful way. Another “L” word is length; when your system is lexically challenged, the length of the document to be translated will be a key factor in determining whether the time spent entering new terms will be repaid. Creativity is another consideration. Sometimes the brief will be to “live and breathe” the environment of the target audience, to create a “new original”, to “surpass the original”. Even a good MT program will tie the translator’s hands then; revisers tinker, cut and paste, and scrub out complete sentences only when the program has made a complete mess. Translation memory tools regurgitate old translations; not much help if freshness and spontaneity are what is wanted. This intervention will take a – hopefully – entertaining look at what we may or may not expect our translation tools to deliver.

Hook and Hatton Ltd.

Hook and Hatton Ltd is a family-owned company providing specialist language services. Over the past four years it has concentrated on supplying MT services via e-mail. It delivers Dutch-English translations with its own MT program and also provides German-English and French-English MT output using third-party programs.

Hook & Hatton Ltd
34 Central Avenue
Whitehills
Northampton NN2 8DZ
Telephone: +44 1604 847278
Fax: +44 1604 821486

EAMT Workshop, Copenhagen, May 1997

When not to use MT and other translation tools

Discussion paper presented by Terence Lewis, Hook & Hatton Ltd

The question we are asking is when not to use translation tools. The first answer might be: when they provide unusable output. Usability, of course, is relative – it varies according to the purpose of the document, both for the author and the reader. These purposes may differ: a document that expresses a burning issue for the author may be considered irrelevant by the reader. Or a discarded draft dashed off by a specialist may be an essential piece in a particular reader’s intellectual jigsaw puzzle.

There are many cheap-and-cheerful translation packages on the market. The following translation was produced by one of them:

German: In jedem Fall ist eine Abstraktion und Reduzierung der Objektinformation vorzunehmen. Eine derartige Objektbeschreibung wird als Modellierung bezeichnet. Möglichkeiten zur Datenakquisition bieten Meßmaschinen, Methoden der Photogrammetrie und geodätische Verfahren.

English: An abstraction and reduction of object information is to be carried out always. A such object description is designated as a modeling. Possibilities for the data acquisition offer measuring machines, means of photogrammetry and geodesic procedures.

Figure 1

This is an example of very poor MT output that is still of some informational value to a reader with no knowledge of the source language. If that reader finds it gives him the knowledge he is seeking, he should use the MT software in question; if it does not, he should leave it on the shelf.

The above piece of information may fill a gap in somebody’s knowledge. We should probably decide against using translation tools if they leave large gaps in our knowledge. A text of foreign specialist terms punctuated with translated determiners, verbs and adjectives is hardly going to enlighten and inform. This degree of inadequacy is less likely in a controlled or restricted domain.
Here, it is nearly always possible to generate useful machine translation output. By definition there is near-total lexical, syntactic and semantic coverage. Weather forecasts, avalanche bulletins, job descriptions, motorway traffic flow: in all these areas, few terms are involved and the output is likely to be reliable and valuable. Can we get useful output if the input file is outside the domain? Can we create customised dictionaries? Does the grammar provide coverage of all possible syntax? That depends on how the system has been designed.

There would be little point in submitting an aircraft maintenance manual to the TAUM METEO system.

In practice, use of any MT program does not make much sense if the lexical deficit is disproportionately large in relation to the length of the text. If we have a text of, say, fewer than 5 pages with more than 30 unknown words, it is usually more cost-effective to have the text translated by a human translator than to go through the business of inputting all those new entries. But that’s a judgement call. Sometimes, if the subject is likely to crop up again, the dictionary update time can be treated as a sound investment.

Most purchasers of MT programs are likely to use their software in a general translation environment. Experience has shown us that there are a number of disciplines where we are better off using other approaches to translation than MT. Not a few of these seem to begin with the letter “L”. If we take a basic PC translation package straight out of the box it will give a varyingly useful translation of a technical text. Legal texts, however, seem to provide problems for all systems that are not specifically designed to deal with them. Figure 2 exemplifies a number of these problems.

English original: Agreements providing for both joint research and development and joint exploitations of the results may fall within Article 85 because the parties jointly determine how the products developed are manufactured or the processes developed are applied or how related intellectual property rights or know-how are exploited.

German output: (Übereinstimmungen, die für das beiden Gelenk sorgen, forschen, und die Entwicklung und exploitations von den Ergebnissen können innerhalb Artikel 85 fallen, weil gemeinsam die Parteien entscheiden, wie die Erzeugnisse, die entwickelt werden, hergestellt oder die Prozesse, die entwickelt werden, angewandt sind oder wie verwandte geistige Eigentumsrechte oder verwandtes geistiges Know-how ausgebeutet sind.
French output: Les accords fournissant les deux développement et recherche conjointe et exploitations conjointes des résultats peuvent tomber dans L’article 85 parce que les fêtes déterminent conjointement comment les produits développées sont fabriquées ou les procédés développées sont appliquées ou comment liées des droits intellectuels de propriété ou connues – comment sont exploitées.

Figure 2

Sentences including phrases such as “how the products developed are manufactured or the processes developed are applied” would confound most off-the-shelf German-English packages, although the French-English package used here gets that right after completely misreading “parties”. In fact, if we look at the above example we see that different sentences cause different problems for the two languages. Of course, the ridiculous German translation of “joint research and development” could easily be avoided by coding the four words as a single entity. Generally speaking, legal texts contain a multiplicity of clauses, often without any commas. This causes problems when translating into or from languages in which the verb is thrown to the end of the clause after a relative pronoun or a conjunction. These problems are highlighted by Figure 3.

English original: The vendor who may not be an alien shall not deposit the proceeds of the sale in an account held outside the national territory save in such circumstances as may be determined in sub-section 3 to which reference is also made in chapter 4 under the heading “Recognized accounts” and only subject to satisfactory verification of full compliance with the statutory currency transfer requirements by all bodies duly appointed for this purpose by the regulatory authority or any other agency having the powers to appoint bodies to verify compliance with the statutory currency transfer requirements or any other pertinent requirements.

German MT output: Der Verkäufer der darf sein Ausländer nicht sitze die Erlöse des Verkaufs auf einem Konto gehalten außerhalb des nationalen Territoriums sparen in solchen Umständen genau so kann entscheide in abteilung 3 dazu welcher Hinweis auch gemacht in Kapitel 4 unter der Überschrift erkannte Konten und einziges Thema zur zufriedenstellenden Belegung von vollem Einhalten den gesetzlichen Währungs Übertragungs Anforderungen von allen Körpern ordnungsmäßig ernannt hierzu Zweck durch die regulative Autorität oder jegliche andere Vermittlung hat die Mächte ernennen Körper Einhalten die gesetzlichen Währungs Anforderungen oder Anforderungen

Figure 3

The lack of commas also makes semantic disambiguation more difficult in the above text.

Of course, the combination of MT and TM approaches would certainly reduce the likelihood of errors.
Strings such as “shall not deposit the proceeds of the sale” and “save in such circumstances” could be entered into the translation archive and would then be delivered up as fixed entities; infelicities such as “sparen in solchen Umständen” would be systematically avoided. In our experience, this is certainly the most cost-effective way to “machine-translate” large volumes of legal text.

With legal translation it is often a question of building a bridge between two quite different legal systems. False friends can very easily send the reader off down the wrong path. The distinction between the Spanish Escritura de compraventa and the contrato privado de compraventa is quite alien to English law. The idea that you can legally purchase a property under a private agreement while the title deed remains in someone else’s name would strike most English lawyers as a preposterous proposition. Yet it is common practice in rural Spain. So even if an MT system got the translation of a term right, we have to ask ourselves what use the translation would be to a reader who did not understand these differences between English and Spanish legal systems. Another aspect of legal translation is that sometimes much hangs upon the precise interpretation of a single word. In view of this, one would have to question the usefulness of a computer-translated legal text unless it were thoroughly revised by somebody with a knowledge of the two legal systems involved.

Anyway, law is boring and it’s the middle of the afternoon. So let’s get onto love or lust – both “L” words. Reading recently about the Systran/Seiko strategic alliance aimed at developing a new generation of handheld translators – I just love the name handheld translators – I had the image of this hapless young man wandering around the night clubs of Europe clutching his brand-new Systran/Seiko handheld. A conversation something like this might take place:

Englishman abroad: Hi baby, you look fantastic. Let’s find somewhere quiet where we can get to know each other better. I love your eyes. You’ve got really great hair too. I think you’re really sexy.

Well-known English-French MT program: Hi bébé, vous regardez fantastiques. Le permettre trouver apaiser quelque part où nous pouvons obtenir connaître chacun autre meilleur. J’aime vos yeux. Vous avez obtenu réellement grande chevelure trop. Je vous pense êtes réellement sexys.

Well-known English-German MT program: Hallo baby, Sie phantastisch aussehen. Let’s finden irgendwo ruhig, wo wir können bekommen, einander besser zu wissen. Ich liebe Ihre Augen. Sie haben wirklich großes Haar auch bekommen. Ich glaube, daß Sie wirklich sexy sind.
Figure 4

Perhaps the result would be so amusing that it might actually serve as a conversation starter!

(Author’s note: This slide was intended as a humorous interlude in my talk; I have left it in the write-up as it serves to make a serious point about the unsuitability of many modes of linguistic discourse for computer processing.)

Talking of bars or night clubs makes me think of cross-dressing and that brings up the problem of cross-disciplinary documents. Such would be a legal document concerning highly specialised photogrammetric equipment to be used in geological surveys of rock faces. If the MT system does not allow the user to select at least three customisable subject dictionaries, the computer version of such a document would be of questionable value and the translation would be more usefully entrusted completely to a human translator.

We have talked a little about domains and disciplines. Another factor that plays a part in the decision on whether to use MT or any other translation tool is the purpose of the document and the purpose of the translation. A translation may face numerous challenges and constraints:

• The translation may be needed to assess the creativity of the original.
• A creative translation may be required.
• The translation should surpass the original.
• The translation will become the new original.
• Cultural bridging is required.

Let us look at each of these points. If we need to assess the creativity of the original – say a sales brochure – it’s unlikely that the computer translation is going to capture this. Only a human translator would be able to recognise creativity and interpret it for the target language. Then, the translation itself may need to be creative. It may need to relate to the target audience, live and breathe the culture of the target audience. No matter how good or bad a computer translation is, it will shape the final draft of the document. In fact, the better it is, the less likely it is that the reviser will make any drastic changes. In our experience, the good machine translation seduces the reviser into complacency. If the translation is grammatically and semantically correct, the MT reviser is loath to scrub it out and recast it sentence by sentence.

Another thing the MT program cannot do is to assess how much of what is assumed in the source text needs to be brought out or clarified for the target audience.

Figure 5 shows a translation generated by a computer program followed by a translation produced by the corporate translation department of the company concerned. The first difference we notice is in the title: The “Call on the Vecht” is a literal translation of the Dutch. The human translator probably felt that the target audience outside Holland would not necessarily know that the Vecht is a river, so he wrote “Call from the banks of the Vecht”.

There is nothing wrong with “Nearly a century later”, but “Now, nearly a century later” is far more engaging and somehow establishes a rapport with the reader. I happen to know that the human translation was produced by a Dutchman.
As an Englishman, if I were starting the translation from scratch I would depart far more radically from the source text while still retaining the overall meaning. If, on the other hand, I were given the MT output, I would be inclined to do little more than tinker with it. The result would be a grammatically correct, wooden piece of prose which would fail to fulfil the purposes invested in the Dutch original by that company’s PR department.

Dutch original:

Lokroep aan de Vecht
DSM Andeno Maarssen: traditie en ambitie

De historie van de lokatie in Maarssen (voorheen ACF Chemie) gaat terug tot

contains a large number of terms which would have to be explained in detail to the target audience, it is probably more useful to have it translated by a human translator. On the other hand, if we were designing a system to be used exclusively for legal translations, such explanations and distinctions would ideally be included in our sub-language dictionary module. Situational decision-making is as critical when assigning tasks to a set of tools as it is when giving out jobs in the translation pool.

Basically, however, it all comes down to money. If Deep Blue can search through the ramifications of hundreds of thousands of chess moves in seconds, it is possible to search corpora containing all possible combinations of a limited number of words and select the statistically most appropriate solution to a particular lexical or semantic problem. Why hasn’t anyone built a commercial program to do this? There seem to be faster ways to make a buck. (Author: This was another aside in my talk, which I have included here although it does not contribute anything to my general argument.)

In work environments, translation managers have to make the most cost-effective use of the translator’s time. If a translator can produce 1000 words of fully checked text an hour using MT or TM to generate a draft document and is costing me, his employer, around £20 an hour, and I can sell that text at the rate of £150 per thousand words, that translation tool is giving me a gross margin of £130 per 1000 words.

When considering whether to use or not to use translation tools in a professional translation environment we have to decide whether:

– a translator with a workbench tool
– a translator revising MT output
– or a translator who knows the subject and uses dictation software

is going to produce the greatest quantity of acceptable translation at the lowest cost.

We come back to the question of what kind of translation is needed.
Even poor MT output can be rendered usable with enough post-editing. If the customer will pay £50 per thousand words for lightly revised computer output and the translator can post-edit at a rate of 2000 words an hour, I am grossing less than when I can sell “ready-to-use translation” at £150 per thousand words. But if the customer will pay £15/1000 for pretty raw computer output and the system can produce 20,000 words an hour, that is far more profitable. At the end of the day, the question of when and when not to use MT comes down to what the customer wants and what he’s willing to pay for.
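
The sums behind these three cases can be set out in a few lines of Python. This is only a sketch under stated assumptions, not a costing model used by Hook & Hatton: the names Scenario, SCENARIOS and rank_by_hourly_margin are invented for illustration, the prices and throughputs are the ones quoted above, and the £20 an hour labour cost is the only cost considered. The paper gives that figure for the first case only; applying it to the post-editor and to whoever minds the system producing raw output is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One way of selling translation (hypothetical helper, not from the paper)."""
    name: str
    price_per_1000: float  # GBP the customer pays per 1000 words
    words_per_hour: int    # saleable words produced per hour
    cost_per_hour: float   # GBP labour cost per hour (assumed where the paper is silent)

    def margin_per_1000(self) -> float:
        # Labour cost attributable to 1000 words at this throughput.
        cost_per_1000 = self.cost_per_hour * 1000 / self.words_per_hour
        return self.price_per_1000 - cost_per_1000

    def margin_per_hour(self) -> float:
        # Hourly revenue minus hourly labour cost.
        revenue_per_hour = self.price_per_1000 * self.words_per_hour / 1000
        return revenue_per_hour - self.cost_per_hour


SCENARIOS = [
    # Figures quoted in the text: GBP 150 per 1000 words, 1000 words/hour, GBP 20/hour.
    Scenario("ready-to-use translation", 150.0, 1000, 20.0),
    # Assumption: the post-editor costs the same GBP 20/hour.
    Scenario("lightly revised MT output", 50.0, 2000, 20.0),
    # Assumption: raw output still ties up one GBP 20/hour member of staff.
    Scenario("pretty raw MT output", 15.0, 20000, 20.0),
]


def rank_by_hourly_margin(scenarios):
    """Return the scenarios ordered from most to least profitable per hour."""
    return sorted(scenarios, key=Scenario.margin_per_hour, reverse=True)


if __name__ == "__main__":
    for s in rank_by_hourly_margin(SCENARIOS):
        print(f"{s.name}: GBP {s.margin_per_1000():.2f} per 1000 words, "
              f"GBP {s.margin_per_hour():.2f} per hour")
```

On these assumptions the ranking matches the argument above: the margin per thousand words shrinks as the product gets rougher, but the margin per hour is what decides the matter, and there the raw output comes out ahead of ready-to-use translation, with lightly revised output in last place.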

