
Open Chinese Convert (OpenCC, 開放中文轉換) is an open source project for converting between Traditional Chinese, Simplified Chinese and Japanese Kanji (Shinjitai). It supports character-level and phrase-level conversion, character variant handling, and regional vocabulary variants across Mainland China, Taiwan and Hong Kong. It is not a translation tool, for example between Mandarin and Cantonese.
Discussion (Telegram): https://t.me/open_chinese_convert
For details, see the OpenCC design philosophy (OpenCC 設計思想).
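winget install opencc installs the opencc.exe application directly, including the Jieba segmentation plugin.
npm install -g opencc installs the OpenCC Node.js CLI.
npm install -g opencc opencc-jieba installs both the OpenCC Node.js CLI and the Jieba segmentation plugin.
Online converter: https://opencc.js.org/converter?config=s2t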
npm install opencc
The npm package supports Node.js >=20.17. It uses bundled Node-API
prebuilds when available and falls back to a local node-gyp build when the
current platform does not have a matching prebuild.
To install the npm CLI:
npm install -g opencc
opencc -c s2t.json -i input.txt -o output.txt
The npm CLI supports basic text conversion. Plugins, --inspect, and
--segmentation require the native OpenCC CLI.
import { OpenCC } from 'opencc';

async function main() {
  const converter: OpenCC = new OpenCC('s2t.json');
  const result: string = await converter.convertPromise('汉字');
  console.log(result); // 漢字
}

main();
See demo.js and ts-demo.ts.
pip install opencc (Windows, Linux, macOS)
import opencc
converter = opencc.OpenCC('s2t.json')
converter.convert('汉字') # 漢字
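When many strings share one configuration, it may be convenient to create the converter once and reuse it. The sketch below relies only on the convert method shown above. The names Converter, convert_lines and UppercaseStandIn are hypothetical, and the stand-in exists solely so the snippet runs without the opencc package installed.

```python
from typing import Iterable, List, Protocol


class Converter(Protocol):
    """Anything with a convert(str) -> str method, e.g. opencc.OpenCC."""

    def convert(self, text: str) -> str: ...


def convert_lines(converter: Converter, lines: Iterable[str]) -> List[str]:
    """Convert each line with one shared converter (hypothetical helper)."""
    return [converter.convert(line) for line in lines]


class UppercaseStandIn:
    """Stand-in used only for this demonstration; not part of OpenCC."""

    def convert(self, text: str) -> str:
        return text.upper()


# With the real package installed, pass opencc.OpenCC('s2t.json') instead.
demo = convert_lines(UppercaseStandIn(), ["abc", "def"])
print(demo)
```

Passing the converter in as an argument keeps the helper independent of how the converter was constructed.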
// C++ API
#include "opencc.h"

int main() {
  const opencc::SimpleConverter converter("s2t.json");
  converter.Convert("汉字"); // 漢字
  return 0;
}
/* C API */
#include <string.h> /* strlen */
#include "opencc.h"

int main() {
  opencc_t opencc = opencc_open("s2t.json");
  const char* input = "汉字";
  char* converted = opencc_convert_utf8(opencc, input, strlen(input)); // 漢字
  opencc_convert_utf8_free(converted);
  opencc_close(opencc);
  return 0;
}
opencc --help
opencc_dict --help
The OpenCC CLI supports two diagnostic modes that output JSON instead of converted text:
--segmentation — Output segmentation result only (no conversion):
echo "他只看了几行日志,就一叶知秋,猜到整个系统是数据库连接池出了问题" | opencc -c s2twp.json --segmentation
# {"input":"他只看了几行日志,就一叶知秋,猜到整个系统是数据库连接池出了问题","segments":["他","只看","了几行","日志",",就","一叶知秋",",猜到","整个","系统","是","数据库","连接池","出了","问题"]}
--inspect — Output full inspection result (segmentation + per-stage conversion + final output):
echo "他只看了几行日志,就一叶知秋,猜到整个系统是数据库连接池出了问题" | opencc -c s2twp.json --inspect
# {"input":"他只看了几行日志,就一叶知秋,猜到整个系统是数据库连接池出了问题","segments":["他","只看","了几行","日志",",就","一叶知秋",",猜到","整个","系统","是","数据库","连接池","出了","问题"],"stages":[{"index":1,"segments":["他","只看","了幾行","日誌",",就","一葉知秋",",猜到","整個","系統","是","數據庫","連接池","出了","問題"]},{"index":2,"segments":["他","只看","了幾行","日誌",",就","一葉知秋",",猜到","整個","系統","是","資料庫","連線池","出了","問題"]},{"index":3,"segments":["他","只看","了幾行","日誌",",就","一葉知秋",",猜到","整個","系統","是","資料庫","連線池","出了","問題"]}],"output":"他只看了幾行日誌,就一葉知秋,猜到整個系統是資料庫連線池出了問題"}
# Pretty-print with jq:
echo "他只看了几行日志,就一叶知秋,猜到整个系统是数据库连接池出了问题" | opencc -c s2twp.json --inspect | jq .
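The --inspect JSON can also be post-processed in a script to list which stage altered which segment. The following is a minimal sketch, not part of OpenCC: the function name changes_by_stage is hypothetical, the sample is trimmed by hand to the last five segments of the example above (it is not real tool output for that shorter input), and the code assumes every stage keeps the same number of segments, as the example does.

```python
import json

# Hand-trimmed excerpt of the --inspect example above (last five segments).
RAW = """
{"input": "是数据库连接池出了问题",
 "segments": ["是", "数据库", "连接池", "出了", "问题"],
 "stages": [
   {"index": 1, "segments": ["是", "數據庫", "連接池", "出了", "問題"]},
   {"index": 2, "segments": ["是", "資料庫", "連線池", "出了", "問題"]},
   {"index": 3, "segments": ["是", "資料庫", "連線池", "出了", "問題"]}
 ],
 "output": "是資料庫連線池出了問題"}
"""


def changes_by_stage(report: dict) -> list:
    """Return (stage_index, before, after) for every segment a stage altered."""
    changes = []
    previous = report["segments"]
    for stage in report["stages"]:
        current = stage["segments"]
        # Assumption: segment counts stay aligned from one stage to the next.
        for before, after in zip(previous, current):
            if before != after:
                changes.append((stage["index"], before, after))
        previous = current
    return changes


report = json.loads(RAW)
# Sanity check: the segments should concatenate back to the input.
segments_cover_input = "".join(report["segments"]) == report["input"]
changes = changes_by_stage(report)
for index, before, after in changes:
    print(f"stage {index}: {before} -> {after}")
```

In this excerpt, stage 1 performs the character conversion and stage 2 substitutes the Taiwan phrases, while stage 3 changes nothing.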
These modes are useful for diagnosing conversion issues:
--segmentation to verify that the input is segmented as expected.
--inspect to see which conversion stage produces an unexpected result.
Rules:
--segmentation and --inspect are mutually exclusive.
s2t.json: Simplified Chinese to Traditional Chinese (OpenCC Standard)
t2s.json: Traditional Chinese (OpenCC Standard) to Simplified Chinese
s2tw.json: Simplified Chinese to Traditional Chinese (Taiwan Standard)
tw2s.json: Traditional Chinese (Taiwan Standard) to Simplified Chinese
s2hk.json: Simplified Chinese to Traditional Chinese (Hong Kong variant)
hk2s.json: Traditional Chinese (Hong Kong variant) to Simplified Chinese
s2twp.json: Simplified Chinese to Traditional Chinese (Taiwan Standard, with Taiwan Phrases)
tw2sp.json: Traditional Chinese (Taiwan Standard) to Simplified Chinese (Mainland China Phrases)
t2tw.json: Traditional Chinese (OpenCC Standard) to Traditional Chinese (Taiwan Standard)
tw2t.json: Traditional Chinese (Taiwan Standard) to Traditional Chinese (OpenCC Standard)
t2hk.json: Traditional Chinese (OpenCC Standard) to Traditional Chinese (Hong Kong variant)
hk2t.json: Traditional Chinese (Hong Kong variant) to Traditional Chinese (OpenCC Standard)
t2jp.json: Traditional Chinese Characters (Kyūjitai) to New Japanese Kanji (Shinjitai)
jp2t.json: New Japanese Kanji (Shinjitai) to Traditional Chinese Characters (Kyūjitai)
To load configuration files from a specific directory, set the OPENCC_DATA_DIR environment variable:
OPENCC_DATA_DIR=/path/to/your/config/dir opencc --help
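When calling the CLI from a script, the configuration name and OPENCC_DATA_DIR can be prepared together. This is a rough sketch under stated assumptions: opencc_command is a hypothetical helper, the allow-list covers only the built-in configuration names listed above (drop the check when using custom configuration files), and the code only builds the command without running it.

```python
import os
from typing import Dict, List, Optional, Tuple

# Built-in configuration names, copied from the list above.
KNOWN_CONFIGS = frozenset({
    "s2t.json", "t2s.json", "s2tw.json", "tw2s.json",
    "s2hk.json", "hk2s.json", "s2twp.json", "tw2sp.json",
    "t2tw.json", "tw2t.json", "t2hk.json", "hk2t.json",
    "t2jp.json", "jp2t.json",
})


def opencc_command(
    config: str, data_dir: Optional[str] = None
) -> Tuple[List[str], Dict[str, str]]:
    """Build argv and environment for an opencc call (hypothetical helper)."""
    if config not in KNOWN_CONFIGS:
        raise ValueError(f"unknown built-in config: {config}")
    env = dict(os.environ)  # copy, so the current process is left untouched
    if data_dir is not None:
        env["OPENCC_DATA_DIR"] = data_dir
    return ["opencc", "-c", config], env


argv, env = opencc_command("s2twp.json", "/path/to/your/config/dir")
print(argv)
# The pair could then be handed to subprocess.run(argv, env=env, ...).
```

Copying os.environ rather than mutating it keeps the override local to the child process.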
OpenCC now supports external C++ segmentation plugins. The first plugin is
opencc-jieba, which can be enabled through plugin-backed configs such as
s2t_jieba.json, s2tw_jieba.json, s2hk_jieba.json,
s2twp_jieba.json, and tw2sp_jieba.json.
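Because the plugin is optional, a caller may want to fall back to the standard config when it is absent. The sketch below is illustrative only: pick_config is a hypothetical helper, the pairing of each standard config with its _jieba counterpart is inferred from the file names, and the table holds only the five plugin-backed configs named above (others may exist).

```python
# Standard config -> plugin-backed config; pairing inferred from file names.
JIEBA_CONFIGS = {
    "s2t.json": "s2t_jieba.json",
    "s2tw.json": "s2tw_jieba.json",
    "s2hk.json": "s2hk_jieba.json",
    "s2twp.json": "s2twp_jieba.json",
    "tw2sp.json": "tw2sp_jieba.json",
}


def pick_config(base: str, plugin_available: bool) -> str:
    """Prefer the jieba-backed config when the plugin is present."""
    if plugin_available:
        # Configs without a known jieba variant are returned unchanged.
        return JIEBA_CONFIGS.get(base, base)
    return base


print(pick_config("s2twp.json", plugin_available=True))
print(pick_config("s2twp.json", plugin_available=False))
```

How plugin availability is detected is left to the caller, since the text above does not describe a detection mechanism.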
Notes:
The jieba plugin is optional and is not required for the default OpenCC build, Python package, or Node.js package.
opencc-jieba additionally depends on cppjieba and its dictionary resources. These dependencies are only needed when building or distributing the plugin itself.
g++ 4.6+ or clang 3.2+ is required.
make
build.cmd
bazel build //:opencc
make test
test.cmd
bazel test --test_output=all //src/... //data/... //python/... //test/...
make benchmark
See doc/benchmark.md for details.
Please update this list if your project uses OpenCC.
Apache License 2.0
Optional dependency used by the opencc-jieba plugin.
For an explanation of the differences between the three npm packages opencc, opencc-js and opencc-wasm, see:
https://github.com/nk2028/opencc-js/blob/HEAD/README-zh-TW.md#%E8%88%87-opencc-npm-package-%E7%9A%84%E5%8D%80%E5%88%A5
Please feel free to update this list if you have contributed to OpenCC.
1.2.0 (2026-05-18)
1.1.8 (2024-07-29)