
Machine Translation: Is “good enough” good enough for you?
Perhaps surprisingly, Machine Translation (MT) is not a recent development; it is over 60 years old. In 1949, Warren Weaver, a U.S. mathematician and information technologist, wrote a memorandum entitled “Translation” that sparked research interest in, and government funding for, MT. Today, MT is commercially viable under certain circumstances, yet it remains an active research topic in academia. It is offered with varying degrees of sophistication by a number of commercial MT developers and by the language service providers (LSPs) who use their systems. MT can provide value in the localization process, but expectations must be realistic.
What is Machine Translation?
Machine Translation, also known as “Automated Translation,” is the use of computer software to translate one human language (the source language) into another (the target language). Early systems translated word by word, and when the word orders of the source and target languages differed, the translations were practically useless. To improve translations, MT developers added grammatical analyses of the source and target languages, resulting in what is known as Rule-based MT. To eliminate the labor-intensive construction of grammars, researchers more recently developed a new breed of MT, referred to as Statistical MT, which uses statistics to determine the best translation among many options.
Rule-based MT vs. Statistical MT
The two MT architectures that dominate the market are Rule-based MT (RbMT) and Statistical MT (SMT). RbMT systems determine the linguistic structure of the sentences in the source language, such as their subjects, verbs, and objects, the order of these constituents, and their semantics. These source language constituents are then mapped to their corresponding words and phrases in the target language. Moreover, an RbMT system will reorder the phrases to match the grammatical order of the target language. For instance, an RbMT system will translate a subject-verb-object clause in English into a subject-object-verb clause in German where German grammar requires the verb at the end, as in subordinate clauses. RbMT systems thus rely heavily on grammatical processing.
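The analyze, map, and reorder steps described above can be illustrated with a toy sketch. It is not how any commercial RbMT engine is built: the three-entry lexicon, the single reordering rule, and all function names are assumptions made for illustration, and the input is assumed to be already split into constituents. The target order models a German verb-final (subordinate) clause.

```python
# Toy rule-based transfer: English SVO clause -> German verb-final clause.
# The lexicon and the single reordering rule are illustrative assumptions.

LEXICON = {
    "the dog": "der Hund",
    "the cat": "die Katze",
    "sees": "sieht",
}

# Constituent order per language (hypothetical, one clause type only).
SOURCE_ORDER = ("subject", "verb", "object")
TARGET_ORDER = ("subject", "object", "verb")


def analyze(constituents):
    """Label already-chunked source constituents by their position (SVO)."""
    return dict(zip(SOURCE_ORDER, constituents))


def transfer(parsed):
    """Map each source constituent to its target-language phrase."""
    return {role: LEXICON[text] for role, text in parsed.items()}


def generate(translated):
    """Emit the translated constituents in target-language order."""
    return " ".join(translated[role] for role in TARGET_ORDER)


def translate_clause(constituents):
    """Run the three stages: analysis, lexical transfer, reordering."""
    return generate(transfer(analyze(constituents)))


print(translate_clause(["the dog", "sees", "the cat"]))
# prints: der Hund die Katze sieht
```

Even this tiny example needs a hand-written lexicon and ordering rule, which hints at why building full grammars is labor-intensive.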
In contrast to RbMT, SMT does not rely on the grammatical structures of the source and target languages. Instead, SMT takes large bilingual text databases, such as translation memories (TMs), “trains” on them, and produces a database of source words and phrases, their translations, and the likelihood (the statistics) that each translation is correct. During translation, SMT consults this database and selects the most likely translation. No grammar development is required.
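The “train, then look up the most likely translation” idea can be sketched in a few lines. This is a deliberately simplified illustration: the phrase pairs are invented, they are assumed to be already aligned (real systems must learn the alignment from whole sentences), and the function names are hypothetical. Production SMT systems also combine further components, such as a target-language model, which are omitted here.

```python
from collections import Counter, defaultdict

# Hypothetical pre-aligned phrase pairs, as might be extracted from a TM.
ALIGNED_PAIRS = [
    ("print", "drucken"),
    ("print", "drucken"),
    ("print", "Druck"),
    ("file", "Datei"),
    ("file", "Datei"),
    ("file", "Akte"),
    ("file", "Datei"),
]


def train(pairs):
    """Build a phrase table: source -> {target: relative frequency}."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    table = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        table[source] = {t: n / total for t, n in targets.items()}
    return table


def best_translation(table, source):
    """Return the most likely target phrase and its estimated probability."""
    candidates = table[source]
    target = max(candidates, key=candidates.get)
    return target, candidates[target]


phrase_table = train(ALIGNED_PAIRS)
print(best_translation(phrase_table, "file"))
# prints: ('Datei', 0.75)
```

The estimates are only as good as the training data, which is why large, approved bilingual text sets matter so much for SMT.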
The Potential of MT
MT offers a reduction in localization time, entry into new markets, and the opportunity to introduce new products and services.
- Decreased localization time – Human translators translate about 2,500 words per DAY. In contrast, depending on the MT system and hardware used, MT systems can conservatively translate over 5,000 words per MINUTE, and output can be significantly higher with the right system configuration. Faster localization means faster time to market for products and services.
- Entry into new markets/locales – MT systems can offer entry into new markets and locales where translations have not previously existed by providing quick, rough translations (“gists”). Without MT, the options are full-scale human translation or none at all.
- New products and services – By streamlining the localization process with MT, companies can introduce new localized products and services. Again, without MT the options are full-scale human translation or none at all.
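To put the throughput figures above in perspective, the short calculation below applies them to a project of 100,000 words. The project size is a hypothetical value chosen for illustration; the two rates are the ones quoted in the text. The MT figure covers raw output only and excludes any post-editing time.

```python
# Throughput figures quoted in the text.
HUMAN_WORDS_PER_DAY = 2_500
MT_WORDS_PER_MINUTE = 5_000  # conservative figure


def human_days(words):
    """Working days one human translator needs for the given word count."""
    return words / HUMAN_WORDS_PER_DAY


def mt_minutes(words):
    """Minutes an MT system needs for raw (unedited) output."""
    return words / MT_WORDS_PER_MINUTE


project_words = 100_000  # hypothetical project size

print(human_days(project_words))  # prints: 40.0
print(mt_minutes(project_words))  # prints: 20.0
```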
Caveats of MT
Buyers of MT must recognize that MT is not an “easy button” for free and perfect translations. Certain requirements must be met and realistic expectations need to be set.
Before engaging in MT, consider these essential requirements:
- Executive buy-in – Deploying MT is not a stand-alone effort by your LSP. First and foremost, it requires executive buy-in from your company, which means serious strategic and financial commitments that go beyond an immediate ROI. (Before investing in MT, contact Medialocate to determine whether your company is a good candidate for this approach.)
- Initial costs for setup – Implementation costs for MT can be high. They include evaluation licenses, terminology development for customization, evaluation and deployment (including the full vendor licensing fees), and maintenance.
- Extensive corporate- or domain-specific terminology – While systems come with general dictionaries, both RbMT and SMT need large, reviewed terminological databases to improve MT accuracy. These databases must already exist or be developed.
- Large sets of training texts – For SMT, large sets of source documents and their approved translations are needed for training. The larger the set, the better; 2 million words is a good start.
- No one-stop shopping – MT developers provide different sets of language pairs, some more than others. Do not expect one MT engine to cover all the language pairs your organization needs.
MT success is not automatic and there are a number of significant challenges:
- Little or no buy-in from your own managers – Localization managers may have had a bad experience with free online translation sites such as Google or Babelfish. These concerns can often be reduced with a small pilot project that shows the benefits of MT.
- Resistance from translators – Human translators may resist the introduction of MT. However, an important localization model is “MT + human post-editing,” where MT does the mundane work and human translators do the creative work of post-editing and polishing the translations.
- Realistic expectations – Users of MT should not expect 100% perfect translations. Depending on the application and the suitability of your source content, choose either the gisting approach or the MT plus post-editing model.
- Right applications – MT succeeds only when it is applied to suitable content. Systematically organized technical documentation and other well-written documents may be good sources, whereas marketing material and legal documents are not.
- ROI – Determine your ROI with MT compared to your other localization costs. Does MT with post-editing in fact save money and time when compared to human localization? The answer is not an automatic “yes”.
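The ROI question above can be framed as a simple break-even calculation. The sketch below is one possible way to set it up, not a pricing model: every rate and cost in it is a made-up placeholder, amounts are in cents to keep the arithmetic exact, and the function names are hypothetical. It compares human translation at a per-word rate against MT with a one-time setup cost plus a lower per-word post-editing rate.

```python
def human_cost(words, human_rate):
    """Total cost (in cents) of full human translation."""
    return words * human_rate


def mt_cost(words, setup_cost, post_edit_rate):
    """Total cost (in cents) of MT setup plus human post-editing."""
    return setup_cost + words * post_edit_rate


def break_even_words(setup_cost, human_rate, post_edit_rate):
    """Smallest word count at which MT is no more expensive than human
    translation, or None if post-editing is not cheaper per word."""
    saving_per_word = human_rate - post_edit_rate
    if saving_per_word <= 0:
        return None  # MT never pays off under these assumptions
    return -(-setup_cost // saving_per_word)  # ceiling division


# Hypothetical figures: 20 cents/word human, 12 cents/word post-editing,
# 4,000,000 cents (40,000 in currency units) for setup.
print(break_even_words(4_000_000, 20, 12))  # prints: 500000
print(mt_cost(300_000, 4_000_000, 12) < human_cost(300_000, 20))  # prints: False
print(mt_cost(600_000, 4_000_000, 12) < human_cost(600_000, 20))  # prints: True
```

With these placeholder numbers, MT only saves money above 500,000 words, which illustrates why the answer depends on volume and on how much post-editing the output actually needs.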
Summary
While MT may not be effective for everyone, it is worth exploring. Set your expectations, survey the various MT engines, determine which ones match your needs, evaluate them on your own data to measure accuracy and ROI, and go from there.
While that may sound easy, it is not. If you need assistance on your MT journey, Medialocate is uniquely positioned to help you along the way. For more information on the subject, or to schedule an “MT Feasibility Assessment” specific to your company, please contact us at info@medialocate.com.

