Monday, August 15, 2016

Microsoft's free terminology website and unused data companies often don't know about (but pay for)

A few days ago, the blogster wrote about a simple viewer for translation memory exchange (.tmx) files.

The big picture was not explained: companies and institutions can spend many thousands or millions of dollars on translation services without ever making full use of the product.

How come?

Sure, customers like you or me will get a native-language user interface, translated online help, a neat manual for the greatest ever vacuum cleaner, an auto manual. Or sometimes a small slip of paper that makes the rounds on the web because its translations are hilarious or outright dangerous.

That is not what the blogster means by "product". A better term might be "intermediate product": the translation memory exchange files (.tmx) or termbase exchange files (.tbx) that are created by the people and machines who perform the translation work.
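To make "intermediate product" a little more concrete, here is a minimal sketch of how little code it takes to look something up in a .tmx file. The sample data, the function name find_translations and the language codes are made up for illustration. Real files exported by translation tools are much larger and often carry inline markup and region codes such as en-US, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# TMX 1.4 stores the language in xml:lang on each <tuv>;
# some older files use a plain "lang" attribute instead.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Tiny hand-written sample, NOT real company data.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>The shipment is delayed.</seg></tuv>
      <tuv xml:lang="de"><seg>Die Lieferung verzögert sich.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Please sign the contract.</seg></tuv>
      <tuv xml:lang="de"><seg>Bitte unterschreiben Sie den Vertrag.</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def find_translations(tmx_text, query, src="en", tgt="de"):
    """Return (source, target) pairs whose source segment contains query.

    Language codes are compared exactly (case-insensitively), so "en"
    does not match "en-US" in this simplified version.
    """
    root = ET.fromstring(tmx_text)
    src, tgt = src.lower(), tgt.lower()
    hits = []
    for tu in root.iter("tu"):  # one translation unit per <tu>
        segs = {}
        for tuv in tu.findall("tuv"):  # one language variant per <tuv>
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline tags
                segs[lang] = "".join(seg.itertext()).strip()
        if src in segs and tgt in segs and query.lower() in segs[src].lower():
            hits.append((segs[src], segs[tgt]))
    return hits


if __name__ == "__main__":
    for source, target in find_translations(SAMPLE_TMX, "contract"):
        print(source, "->", target)
```

A plain substring search like this is nowhere near the fuzzy matching that professional translation tools do, but it is enough to show that the data is readable with nothing more than a standard library.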

The overwhelming majority of consumers of translation (i.e. the companies and agencies paying for it) either do not know or do not care about the value of these "intermediate" resources.
These consumers won't make tmx or tbx data freely available within their own institution, let alone release it to the public with no strings attached.

Who cares whether the secretary who needs to write up a memo in a foreign language has access to a tmx that contains all the translations of the contracts and shipping documents the company has ever needed for that language?

Who gives a rat's something about the call center person who works in three languages, because it is cheaper, and struggles every day?

Wouldn't it be memorable if the engineers in the basement could communicate smoothly with their counterparts in a Japanese basement?

Oh, you send them to Google Translate or to Bing Translator because you didn't even realize that your company has subject-specific, domain-specific data that was produced by actual humans? Instead of a machine that confuses dork and dark, or does not quite understand that female and male parts in electrical engineering/electronics are not what WebMD serves up?

Or you use an awfully odd free smartphone travel app to figure out what that Chinese tourist you stuck into a German refugee shelter might have to say? This despite the fact that your bureaucracy is guaranteed to have a Chinese translation of "asylum" and one of "wallet theft" sitting around somewhere?

Microsoft has been one of the great exceptions for decades.

In the early days of the internet, you could download zipped "glossary" files from MS if you were patient and knew where to look.

Nowadays, Microsoft has a dedicated Language Portal for your terminology pleasure. Granted, the portal is not one of the fastest sites of the Redmond giant, but the choice makes up for that.

You can download bilingual terminology data (English plus the respective foreign language) for nearly 100 languages. Need a teaser before you accept anything out of Redmond for free?

How about Arabic, or Inuktitut, or Yoruba?

Note: Of course, you can argue that machines will take over soon, why bother? That's for another post.

[Update 8/16/2016] In an effort to be nice to the EU, the blogster decided to add this link to a page of the Terminology Coordination Unit of the European Parliament, which guides readers through the process of downloading their very own pieces of the world's largest language project ever.
