Buddhist texts and resources for the cultivation path


三藏 Sanzang: CJK Machine Translation

Sanzang is a compact and simple cross-platform machine translation system. It is especially useful for translating from the CJK languages (Chinese, Japanese, and Korean), including ancient and otherwise difficult texts. Unlike most other machine translation systems, Sanzang is small and approachable. Users can develop their own translation rules, which are stored in a plain text file and applied at runtime. To learn more, check out our Introduction to Sanzang.

Demo: Sanzang on the Web

This is a simplified and limited Web interface for Sanzang that is useful for demonstrations or for short amounts of text (one or two fascicles of classical Buddhist texts). With it, you can try out some basic Sanzang functionality without downloading or installing any software. For anything beyond this very basic usage, the Sanzang Utils package below is more suitable.

sanzang-utils ZIP | sanzang-lib ZIP | sanzang-tables ZIP

Sanzang Utils is the set of tools implementing the Sanzang translation system. This package also contains related tools for editing translation tables and formatting CJK text. This software is written in Python 3, and is released as free and open-source software under the MIT License. Full documentation is included. You can also see the Sanzang Utils Tutorial, which teaches you how to install and use Sanzang Utils.

Sanzang Lib is a Python module, or library, that contains core functions of the Sanzang translation system. If you are a programmer, you can use this software to integrate Sanzang functions into your own programs. This software is written in Python 3, and it is free and open-source software released under the MIT License. Note: you only need this if you are a programmer and you want to write special programs.

Sanzang Tables is a package containing our current set of translation rules for translating classical texts from the Taishō Tripiṭaka into English. These translation rules can be used by the Sanzang Utils programs, which can apply the rules to generate translations. The development of these translation rules is a long-term project. Our aim is to have a fairly reliable translation table in the future which will ease reading and translation of the Chinese Buddhist canon.
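
To give a rough idea of what applying a table of translation rules at runtime can look like, here is a minimal Python 3 sketch. It is not the Sanzang Lib API and does not reproduce the actual Sanzang table format: the function names parse_table and apply_rules, the one-rule-per-line "source|target" layout, and the sample rules are all assumptions made for illustration. The sketch sorts rules so that longer source terms are applied first, then substitutes each term in turn.

```python
# Hypothetical sketch of table-driven substitution; NOT the sanzang-lib API.
# Assumed rule format (for illustration only): one "source|target" pair per line.

SAMPLE_TABLE = """\
我|I
如是我聞|thus have I heard
如是|thus
聞|hear
"""


def parse_table(text, delimiter="|"):
    """Parse rule lines into (source, target) pairs, longest source first."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        source, target = line.split(delimiter, 1)
        rules.append((source, target))
    # Longer source terms go first so a short rule cannot break up a longer match.
    rules.sort(key=lambda rule: len(rule[0]), reverse=True)
    return rules


def apply_rules(rules, text):
    """Replace each source term with its target, padded with spaces."""
    for source, target in rules:
        text = text.replace(source, " " + target + " ")
    # Collapse the runs of whitespace introduced by the padding.
    return " ".join(text.split())


if __name__ == "__main__":
    rules = parse_table(SAMPLE_TABLE)
    print(apply_rules(rules, "如是我聞"))  # thus have I heard
    print(apply_rules(rules, "我聞如是"))  # I hear thus
```

Sequential replacement like this is only safe when target text can never contain a source term, which is a reasonable assumption for CJK-to-English rules but not in general.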

The old program (you probably do not want this)

This is the legacy Sanzang program, which has been superseded by Sanzang Utils (above). The old Sanzang program is distributed in RubyGem format and hosted on RubyGems.org. It requires Ruby 1.9 or later and is licensed under the GNU GPL. Full documentation is included. It remains here for users who wish to continue using the old Ruby implementation of Sanzang.

Project Status

The Sanzang translation engine is ready for use. Running on a mid-range PC with a translation table of over 6000 rules, Sanzang Utils can generate translation listing files for the entire CBETA standard corpus (Taishō volumes 1-55, and 85) in less than 10 minutes. The next major phase is the development of a larger and more reliable translation table.
