The fundamental purpose of e-Discovery is to organize and structure large volumes of documents, emails, call records, transcripts, contracts and other legal materials into collections for easier reference and analysis. Much of this documentation may be in different languages, and characters that are used in one language may be totally foreign to others. An e-Discovery system that can handle multilingual text and find the information you need quickly and correctly makes it far more likely that an information request will return all of the relevant documents.

Processing multilingual documents for legal review and translation calls for a scalable, specialized e-Discovery solution that can handle several tasks:

Data Collection
The first step in accurate e-Discovery is to have a program that can process multiple languages and quickly identify the character set (encoding) each document uses, so that the extracted text can be correctly converted from that character set into Unicode. When the characters cannot be identified automatically, the program should fall back to an appropriate code page rather than silently discarding the text.
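
As a rough illustration of that conversion step, the sketch below tries a short list of candidate encodings in order and keeps the first one that decodes without errors. The function name `decode_to_unicode` and the candidate list are assumptions made for this example, not part of any particular e-Discovery product. Real tools typically use statistical detection rather than a fixed list.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical candidate list. Order matters: permissive single-byte code
# pages such as cp1252 accept almost any byte sequence, so they go last.
DEFAULT_CANDIDATES = ("utf-8", "shift_jis", "cp1252")


def decode_to_unicode(
    raw: bytes, candidates: Sequence[str] = DEFAULT_CANDIDATES
) -> Tuple[str, Optional[str]]:
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
    # Nothing matched: keep the document, mark undecodable bytes with U+FFFD,
    # and report None so the file can be flagged for manual review.
    return raw.decode("utf-8", errors="replace"), None
```

For example, `decode_to_unicode("日本語".encode("shift_jis"))` returns the original text together with `"shift_jis"`, because the bytes are not valid UTF-8 but are valid Shift JIS.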

Data Filtering and Processing
Next, linguistic analysis is applied to the multilingual data: the parts of speech are identified so that the text can be broken down properly for searching and filtering. This produces a more accurate and comprehensive analysis.

Unicode assigns a unique number (a code point) to every character it covers, across languages. This means that Unicode compliance in whatever multilingual e-Discovery methods you choose tends to deliver higher accuracy. Some data, however, contains characters that a given code page cannot represent, and that’s why text converted through a code page may display substitute characters (like a question mark) that flag the item for further research. When you choose methods that process data with Unicode compliance, as well as code pages, you are using a more comprehensive and powerful solution.
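
To make the substitute-character behavior concrete, here is a minimal sketch using only Python's built-in codecs. The helper name `unmappable_characters` is an assumption for this example. It lists the characters in a string that a given code page cannot represent, which are the ones that would turn into question marks on conversion.

```python
from typing import List


def unmappable_characters(text: str, code_page: str) -> List[str]:
    """List the characters in text that code_page has no byte value for."""
    missing = []
    for ch in text:
        try:
            ch.encode(code_page)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


sample = "日本語 contract"

# Converting to a Western European code page substitutes "?" for each
# character the code page cannot represent.
lossy = sample.encode("cp1252", errors="replace")

# The same text survives a round trip through a Unicode encoding intact.
round_trip = sample.encode("utf-8").decode("utf-8")
```

Here `lossy` is `b"??? contract"`, while `round_trip` equals the original string. A review workflow could use a check like `unmappable_characters` to flag documents before a lossy conversion happens.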

In terms of processing data, tokenization (the breaking of text into searchable keywords) is what makes filtering and search possible. For languages that separate words with spaces, tokenization involves identifying special characters such as blank spaces, commas, or periods and using them as separators between words. As an example, if a document contains “cats or dogs”, the tokenizer looks for spaces and creates three keywords: “cats”, “or” and “dogs”. Some Asian languages, such as Chinese, Japanese, and Thai, don’t use spaces to distinguish words, so they need language-aware tokenization that finds word boundaries without relying on separators. Make sure the e-Discovery solution you’ve selected is able to parse your language data into words and sentences so that you can filter and search it.
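
The two cases can be sketched side by side. The first function splits on spaces, commas, and periods, as in the “cats or dogs” example. The second uses greedy longest-match against a word list, one simple technique for text without spaces. The function names and the tiny vocabulary are assumptions for this example only. Production systems generally rely on full dictionaries or statistical morphological analyzers.

```python
import re
from typing import List, Set


def tokenize_by_separators(text: str) -> List[str]:
    """Split on whitespace, commas, and periods, dropping empty pieces."""
    return [token for token in re.split(r"[\s,.]+", text) if token]


def tokenize_by_longest_match(text: str, vocabulary: Set[str]) -> List[str]:
    """Greedy forward longest-match segmentation for text without spaces.

    Unknown characters become single-character tokens.
    """
    max_len = max((len(word) for word in vocabulary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible piece first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical miniature vocabulary, for illustration only.
VOCABULARY = {"我", "喜欢", "猫", "和", "狗"}

english = tokenize_by_separators("cats or dogs")
chinese = tokenize_by_longest_match("我喜欢猫和狗", VOCABULARY)
```

With this vocabulary, `english` is `["cats", "or", "dogs"]` and `chinese` is `["我", "喜欢", "猫", "和", "狗"]`. Greedy matching can pick the wrong boundary when words overlap, which is one reason dedicated segmenters exist.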

Reviewing Documents
Once the documents are filtered and processed, they remain in their native language. It’s then up to you either to gather a team of native-speaking attorneys to review those documents, or to have them translated into English for review by English-speaking attorneys (or to use a combination of these methods).

When there are tight deadlines to be met, human legal translation may be too costly or too slow to be feasible. Having machine legal translation software in your e-Discovery solution will enable you to translate documents quickly. In such cases, however, post-editing by a human linguist will be required in order to produce the clearest and most usable documentation.

Document Production
The crux of successful document production is reaching an agreement with your opposition on multilingual production matters. For example, all parties must agree whether multilingual documents will be produced in their original language, in English, or in both. In addition, they must agree on production formatting, ordering and sequencing.

Having an agreement in place beforehand that covers production language, formatting, ordering, and sequencing is critically important to producing your discovery plan and avoiding disputes and repetition of tasks further down the road. Incorporating flexible production methods in your e-Discovery solutions enables you to deliver in the agreed-upon production framework, thus avoiding costly and time-consuming changes in the future.

Your legal team can easily harness the powerful advantages of multilingual e-Discovery by familiarizing itself with the legal and technical aspects of the process as it has been outlined for you here. Consider partnering with an experienced e-Discovery service provider who possesses the technological tools and expertise needed to wade through the processes of multilingual e-Discovery when you are involved in intricate litigation, investigations, or compliance matters.

