pinyin-dec: Pinyin Formatting

pinyin-dec (“Pinyin Decorate”) is a small command-line utility program for converting numbered Pinyin text to its proper form with diacritics. You can use it as a program (its main purpose), but you can also import its program module and use its internals as a programming library.

This software was written because people still commonly resort to Pinyin with tone numbers (i.e. plain ASCII text), simply because typing the diacritics is inconvenient. pinyin-dec reduces the fix to a single command that adds the diacritics for you automatically.

pinyin-dec is open-source software written in Python 3, and is available under the MIT License. Its basic usage is the following:

Usage: pinyin-dec [options] [string ...]

Decorate Pinyin text with its proper diacritics.

Options:
  -h, --help       print this help message and exit
  -v, --verbose    include information useful for debugging

Examples

A simple way to use pinyin-dec is like this:

$ pinyin-dec han4 yu3 pin1 yin1
hàn yǔ pīn yīn

Another example:

$ pinyin-dec chai2 mi3 you2 yan2 jiang4 cu4 cha2
chái mǐ yóu yán jiàng cù chá

It can also handle capitalized and conjoined Pinyin automatically:

$ pinyin-dec Han4yu3 Pin1yin1
Hànyǔ Pīnyīn

It knows about the v and u: conventions for ü, and converts them accordingly:

$ pinyin-dec NV3 nv3 NU:3 nu:3 LV4 lv4 LU:4 lu:4
NǙ nǚ NǙ nǚ LǛ lǜ LǛ lǜ
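
For readers curious how such a conversion can work, here is a minimal sketch of tone-mark placement for a single syllable. It is not pinyin-dec's actual code: the function name decorate_syllable and the TONE_MARKS table are made up for illustration, and the placement rule used (mark "a" or "e" if present, else the "o" of "ou", else the last vowel) is the commonly taught convention rather than anything taken from the program itself.

```python
import unicodedata

# Combining marks for tones 1-4; tone 5 (neutral tone) adds no mark.
# Hypothetical table for this sketch, not part of pinyin-dec.
TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030c", "4": "\u0300", "5": ""}


def decorate_syllable(syllable: str) -> str:
    """Turn one numbered syllable (e.g. 'nv3') into its marked form."""
    if not syllable or syllable[-1] not in TONE_MARKS:
        return syllable  # no tone number: leave the text untouched
    body, tone = syllable[:-1], syllable[-1]

    # Map the ASCII stand-ins (v and u:) onto the real letter u-umlaut first.
    for ascii_form, letter in (("u:", "\u00fc"), ("U:", "\u00dc"),
                               ("v", "\u00fc"), ("V", "\u00dc")):
        body = body.replace(ascii_form, letter)

    lower = body.lower()
    if "a" in lower:
        index = lower.index("a")
    elif "e" in lower:
        index = lower.index("e")
    elif "ou" in lower:
        index = lower.index("o")
    else:
        # Otherwise the mark goes on the last vowel (i, o, u or u-umlaut).
        index = max(lower.rfind(vowel) for vowel in "iou\u00fc")
    if index < 0:
        return syllable  # no vowel found: not a Pinyin syllable

    # Put the combining mark right after the chosen vowel, then compose.
    marked = body[: index + 1] + TONE_MARKS[tone] + body[index + 1:]
    return unicodedata.normalize("NFC", marked)


print(" ".join(decorate_syllable(s) for s in "han4 yu3 pin1 yin1".split()))
```

The sketch relies on Unicode NFC normalization to combine each vowel with its tone mark, which avoids a hand-written table of every accented letter. It does not check that the input is a valid Pinyin syllable, so a real tool would need stricter recognition than this.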

It ignores English words and other things that don't look like Pinyin:

$ pinyin-dec "She is a nv3han4zi."
She is a nǚhànzi.

It leaves punctuation alone too, as you can see:

$ pinyin-dec 'Confucius is "Kong3fu1zi3." Mencius is "Meng4zi3."'
Confucius is "Kǒngfūzǐ." Mencius is "Mèngzǐ."
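
One way to get the behavior shown above (English words and punctuation pass through, numbered syllables are converted) is to let a regular expression pick out only the pieces that end in a tone number. The sketch below is an assumption about how this could be done, not a description of pinyin-dec's internals; PINYIN_TOKEN, find_tokens and apply_to_tokens are hypothetical names, and the convert argument stands in for whatever function adds the diacritics.

```python
import re
from typing import Callable, List

# A run of letters (or the "u:" stand-in) ending in a tone digit 1-5.
# The lazy quantifier splits conjoined text like "Kong3fu1zi3" per syllable.
PINYIN_TOKEN = re.compile(r"(?:[A-Za-z]|[Uu]:)+?[1-5]")


def find_tokens(text: str) -> List[str]:
    """Return the substrings that look like numbered Pinyin syllables."""
    return PINYIN_TOKEN.findall(text)


def apply_to_tokens(text: str, convert: Callable[[str], str]) -> str:
    """Run convert() on each candidate syllable; copy everything else as-is."""
    return PINYIN_TOKEN.sub(lambda match: convert(match.group(0)), text)


# Bracket the candidates to make visible which parts would be converted.
print(apply_to_tokens("She is a nv3han4zi.", lambda s: "[" + s + "]"))
```

A pattern this loose would also match things like "mp3", so it only approximates "looks like Pinyin"; checking candidates against a list of valid syllables would be a natural refinement.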

If no arguments are given, it just reads text from the standard input:

$ echo 'Sha1 ji1 xia4 hou2.' | pinyin-dec
Shā jī xià hóu.

It also allows you to enter text and get the results one line at a time:

$ pinyin-dec
chi1 bu4 dao4 pu2 tao5 shuo1 pu2 tao5 suan1
chī bù dào pú tao shuō pú tao suān
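
The two input modes described above (arguments on the command line, or text on the standard input handled one line at a time) follow a common pattern for Unix filter programs. The sketch below shows that pattern under stated assumptions: the run function and its parameters are hypothetical, joining the arguments with spaces is a guess based on the example output, and convert stands in for the actual Pinyin formatting function.

```python
import io
import sys
from typing import Callable, List, Optional, TextIO


def run(args: List[str], convert: Callable[[str], str],
        stdin: Optional[TextIO] = None, stdout: Optional[TextIO] = None) -> None:
    """Convert the arguments if any were given, else filter stdin line by line."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout

    if args:
        # Arguments given: treat them as one line of text (assumed behavior).
        stdout.write(convert(" ".join(args)) + "\n")
        return

    for line in stdin:
        # Write and flush after every line so interactive use
        # shows each result as soon as the line is entered.
        stdout.write(convert(line))
        stdout.flush()


# Demonstration with in-memory streams and a stand-in converter.
demo_out = io.StringIO()
run([], str.upper, stdin=io.StringIO("first line\nsecond line\n"), stdout=demo_out)
print(demo_out.getvalue(), end="")
```

Flushing after each line is what makes the interactive session feel immediate; without it, output may be held in a buffer when the program is used in a pipeline.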

If you are a programmer, you can also use the program internals from any Python program by importing the program module and calling its functions:

>>> import pinyin_dec
>>> pinyin_dec.fix_pinyin('No zuo4 no die.')
'No zuò no die.'

Installation

Installing on a Unix-like platform such as Linux, BSD, or OS X is advised. If you are using Windows, installing in a Cygwin environment is recommended. You must first have Python 3 installed. Then, to install pinyin-dec, run pip3 to download and install the software:

# pip3 install pinyin-dec

If you do not have pip3 installed on your system, you can download the software manually and then install it the older way, by running its setup script:

# python3 setup.py install

Documentation

pinyin-dec includes a Unix manual page (“manpage”), which is installed with the software. You can type “man pinyin-dec” to read about what the software does, how to use it, and other program information.

Web tool

If you just want to try the functionality in pinyin-dec, you can use the pinyin-dec Web tool. Its form lets you input whatever text you like and get back formatted Pinyin as the result. This method is not as fast or as flexible as having the program installed on your own system, but it may be enough if you only need occasional Pinyin formatting and not all the functionality in pinyin-dec.
