Fuzzy Matching Makes Translation Cents

Fuzzy matching is a great way to save both time and money on your translation and localization projects. Fuzzy matching is most useful when you update your translated or localized content on a regular basis.

Most of the content you develop is not static, but will undergo regular updates, additions and changes. Updates may be complete revisions, but more often than not, they are modifications of the existing content. No matter how big or small the update, when dealing with localization, all other language versions of the content must be updated as well. Over time, this can get expensive, especially for your more dynamic content. Fuzzy matching is one way to save you precious resources on your multilingual projects.

Applying fuzzy matching to an existing Translation Memory can result in significant reductions in time and cost to localize each target language version.

What Is Fuzzy Matching?

Fuzzy matching is part of an analysis your language service provider performs on project quotes to more precisely determine the cost and time required to complete the project. For projects that are regularly updated, this is especially helpful to reduce both cost and turnaround time.

Fuzzy matching is a feature of Computer-Assisted Translation (CAT) tools and one of the benefits of using a Translation Memory. The Translation Memory (also called TM or translation database) stores segments from previous translations as language pairs: each source segment together with its translation. As a feature of the TM, fuzzy matching allows you to analyze prior translations or repeated content within a document to achieve cost and time savings.

When a localization project is submitted, the Translation Memory is searched for segment matches. Segments are phrases, lines, sections of text or sometimes even complete sentences. Each segment of the new content is compared with the segments stored in the TM. A report is then returned detailing the additions, deletions and modifications relative to the existing text. When fuzzy matching is applied to the project, the report identifies new text where no match was found and fuzzy matches for edited or modified text. It also flags text that has been deleted so it can be removed.
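To make the comparison step concrete, here is a minimal sketch of how an analysis might label each new segment. It is only an illustration, not how any particular CAT tool works: the function names, the word-level scoring built on Python's difflib, and the 50% floor are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def match_percent(new_segment, tm_segment):
    """Word-level similarity between two segments, as a whole number 0-100.

    Assumption: real CAT tools use their own scoring; difflib is a stand-in.
    """
    a = new_segment.split()
    b = tm_segment.split()
    return round(SequenceMatcher(None, a, b).ratio() * 100)


def analyze(new_segments, tm_sources, fuzzy_floor=50):
    """Label each new segment 'exact', 'fuzzy' or 'new' against the TM sources."""
    report = []
    for segment in new_segments:
        if segment in tm_sources:
            # Word-for-word identical to a stored segment.
            report.append((segment, 100, "exact"))
            continue
        # Best score over all stored segments, capped at 99 for non-identical text.
        best = max((match_percent(segment, src) for src in tm_sources), default=0)
        best = min(best, 99)
        label = "fuzzy" if best >= fuzzy_floor else "new"
        report.append((segment, best, label))
    return report


# Made-up sample data for illustration only.
tm_sources = ["Press the red button to start", "Save the file"]
new_segments = [
    "Save the file",
    "Press the green button to start",
    "Contact your administrator",
]
for segment, score, label in analyze(new_segments, tm_sources):
    print(f"{label:5} {score:3}%  {segment}")
```

In this sketch, changing one word in a six-word segment still scores above 80%, while a sentence that shares no words with the TM is treated as new content.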

Here is a quick overview of the various results that come from analyzing content against the Translation Memory.

No Match Or New Content

If no match is found for a segment within the Translation Memory, then that segment is considered new text. When new content is discovered, it must be translated for all language versions. The language service provider is notified of new content that requires full translation.

Deleted Content

When your updated source content is compared against the Translation Memory and portions of the previous version are missing, then it is considered deleted text. When deleted content is discovered, the TM helps with removing the deleted text.

Repeated Content

When the content is analyzed and exact repetitions of segments or phrases are found, they are used to reduce the overall word count, which in turn saves time and money. Repeated content is just as valuable as fuzzy matches because once the segment has been translated the first time, it is treated as an exact match each time it repeats.
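As a rough illustration of how repetitions shrink the word count, the sketch below counts each distinct segment once. The function name and the sample segments are invented for this example, and counting words by splitting on spaces is a simplifying assumption.

```python
from collections import Counter


def repetition_summary(segments):
    """Compare the total word count with the words in distinct segments only."""
    counts = Counter(segments)
    total_words = sum(len(s.split()) for s in segments)
    # Each distinct segment only needs to be translated once.
    words_to_translate = sum(len(s.split()) for s in counts)
    return {
        "total_words": total_words,
        "words_to_translate": words_to_translate,
        "repeated_words": total_words - words_to_translate,
    }


# Made-up sample content: one segment appears twice.
print(repetition_summary(["Click OK.", "Click OK.", "Save the file."]))
```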

100% Perfect Match

When your project is compared with the existing Translation Memory, all content that is repeated word-for-word from existing translations will be given the 100% or perfect match designation. Because this content has already been translated, it does not need to be translated again. Finding 100% perfect matches reduces the overall word count of content that the linguists must translate, saving valuable time and reducing translation costs.

Fuzzy Match

Sometimes content analyzed against the Translation Memory is not a perfect match, but it is close. You may find content that varies from previous segments in your TM by only one or two words. In this case, it may be returned as a fuzzy match.

Fuzzy matches are typically returned on a scale from 50% up to 99%. Anything below 50% is not considered reliable enough to return to the translator and is ignored. Typically you and your language service provider will agree ahead of time on the minimum match percentage at which fuzzy matches will be presented to your linguists and below which they will be rejected. The threshold matters because a fuzzy match of 50% will be much less helpful for your translator than a 90% match would be.

Like new content, every fuzzy match still needs a translator's attention to incorporate the updates, but the closer the match, the less work is required. By considering the match percentage of each segment, you can estimate the amount of time and resources needed to fully translate the text.
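One common way to turn match percentages into an estimate is a weighted word count, where closer matches are charged at a smaller share of the full per-word rate. The sketch below shows the arithmetic only. The bands and percentages in RATE_BANDS are hypothetical placeholders, not real pricing; actual bands are whatever you and your language service provider agree on.

```python
# HYPOTHETICAL bands: (minimum match %, percent of the full per-word rate charged).
# These numbers are placeholders for illustration, not real pricing.
RATE_BANDS = [(100, 0), (85, 40), (50, 70), (0, 100)]


def rate_percent(match_percent, bands=RATE_BANDS):
    """Return the share of the full rate charged for a given match percentage."""
    for floor, percent in bands:
        if match_percent >= floor:
            return percent
    return 100  # Anything unmatched is treated as new content.


def weighted_word_count(segments, bands=RATE_BANDS):
    """segments is a list of (word_count, match_percent) pairs."""
    return sum(words * rate_percent(pct, bands) for words, pct in segments) / 100


# Four made-up 10-word segments: perfect match, 90% fuzzy, 60% fuzzy, new text.
sample = [(10, 100), (10, 90), (10, 60), (10, 0)]
print(weighted_word_count(sample))  # 40 raw words weigh in as 21.0 under these bands
```

Multiplying the weighted word count by the full per-word rate gives a cost estimate under these assumed bands.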

When a fuzzy match is found for a particular segment that the translator is working with, the TM suggests a translation that the translator can accept, reject or modify to meet the needs of the target language. This can help lower the per-word translation costs and speed up the turnaround times.
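The suggestion step can be pictured as a lookup that returns the stored translation of the closest source segment. This is a minimal sketch using Python's difflib.get_close_matches, which compares text character by character; the suggest function, the 0.5 cutoff and the sample TM entries are assumptions for illustration and do not reflect any specific CAT tool.

```python
from difflib import get_close_matches


def suggest(segment, tm, cutoff=0.5):
    """Return (tm_source, tm_target) for the closest TM entry, or None.

    tm maps source segments to their stored translations.
    cutoff=0.5 mirrors a 50% minimum match (an assumed threshold).
    """
    hits = get_close_matches(segment, list(tm), n=1, cutoff=cutoff)
    if not hits:
        return None  # Nothing close enough: the translator starts from scratch.
    source = hits[0]
    return source, tm[source]


# Made-up TM entries for illustration only.
tm = {
    "Press the red button to start": "Appuyez sur le bouton rouge pour commencer",
    "Save the file": "Enregistrez le fichier",
}

match = suggest("Press the green button to start", tm)
if match:
    source, target = match
    print("Closest TM segment:", source)
    print("Suggested translation to accept, reject or modify:", target)
```

The translator stays in control here: the suggestion is only a starting point, and in this example the word for "red" would still need to be changed.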

So as you can see, if you are looking to save time and resources during your next translation or localization project, then fuzzy matching makes cents (or, we should say, saves cents off your total per-word translation costs), and that just makes sense.

Have you used fuzzy matching for your localization or translation projects? Please tell us the risks and benefits in the comments below.

Language Scientific

Life Sciences Translation Services