
Fanyi is a translation script that uses the free Microsoft Translator API to translate a newline-separated text file, line by line, into almost any language. I wrote it to support an internationalization proof of concept. It isn’t meant for production use, since the quality of the translations will be poor.

The script accepts a custom list of excluded words or phrases, so you can exempt brand names, HTML entities, and other untranslatable strings from translation. Each line is translated as a single unit: excluded strings are skipped during translation, then merged back into the translated line.
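
A minimal sketch of the skip-and-merge idea follows. It is not taken from fanyi.pl. The exclusion list, the `[[N]]` placeholder format, and the helper names `mask_excluded`, `restore_excluded`, and `fake_translate` are all assumptions made for illustration. A stand-in that uppercases text takes the place of the real API call, and whether a real translation service leaves such placeholders untouched is also an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical exclusion list, for illustration only.
my @excluded = ('Acme', 'Cornballer', '<b>', '</b>');

# Swap each excluded string for a numbered placeholder such as [[0]].
# Returns the masked line and an array ref of the strings removed.
sub mask_excluded {
    my ($line, @terms) = @_;
    my @saved;
    # Longest terms first, so a short term cannot split a longer one.
    for my $term (sort { length($b) <=> length($a) } @terms) {
        # push returns the new element count, so count - 1 is the index.
        $line =~ s/\Q$term\E/'[[' . (push(@saved, $term) - 1) . ']]'/ge;
    }
    return ($line, \@saved);
}

# Put the saved strings back in place of their placeholders.
sub restore_excluded {
    my ($line, $saved) = @_;
    $line =~ s/\[\[(\d+)\]\]/$saved->[$1]/g;
    return $line;
}

# Stand-in for the real API call: uppercases letters so the effect is visible.
sub fake_translate {
    my ($text) = @_;
    return uc $text;
}

my $input = "Read about Acme's new product, the Cornballer!";
my ($masked, $saved) = mask_excluded($input, @excluded);
print restore_excluded(fake_translate($masked), $saved), "\n";
# prints: READ ABOUT Acme'S NEW PRODUCT, THE Cornballer!
```

The excluded strings come through unchanged while the rest of the line is "translated". A placeholder made only of brackets and digits is used on the assumption that a translator has nothing to translate in it.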

Example

Translate a three-line file into Chinese, excluding HTML tags and two brand names (Acme and Cornballer).

Input:

Acme company product home page
Click <b>here</b> to login
Read about Acme's new product, the Cornballer!

Output:

Acme 公司产品主页
点击<b>这里</b>登录
阅读Acme的新产品, Cornballer!

Usage Synopsis

Single file conversion:

  perl fanyi.pl input_filename.txt output_filename.txt

Recursive conversion of every file in a directory tree, replacing each file in place:

  find /path/to/target_directory -type f -exec sh -c '
  perl /path/to/fanyi.pl "$1" "$1.new" &&
  mv "$1.new" "$1"
  ' sh {} \;