English-Cantonese Lookup Table (英粵對照表): English index

Introduction:
This data set exposes the index used to search English words in the dictionary. The index is built by mapping the words seen in each entry's English explanation to its Cantonese word.

This may be useful for English->Cantonese translation.

The "English" terms are normalized to the US spelling variant using VarCon (https://github.com/en-wl/wordlist/blob/master/varcon/README ). A term prefixed with "!" has been stemmed with the Porter stemmer (see http://www.tartarus.org/~martin/PorterStemmer for implementations).
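As a rough illustration of how the "!" prefix might be used at query time, the sketch below looks up the exact term first and falls back to the "!"-prefixed stem. Everything here is an assumption made for illustration: the in-memory shape of the index (a dict from term to a list of (Cantonese word, score) pairs), the lowercasing of the query, the names `lookup`, `toy_index` and `toy_stem`, and the toy data. The published JSON/CSV layout may differ, and `toy_stem` is only a stand-in for a real Porter stemmer implementation.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical in-memory shape: key -> list of (Cantonese word, score).
Index = Dict[str, List[Tuple[str, float]]]


def lookup(index: Index, term: str, stem: Callable[[str], str]) -> List[Tuple[str, float]]:
    """Return entries for the exact term, else for the '!'-prefixed stem."""
    key = term.strip().lower()  # assumption: keys are lowercase
    if key in index:
        return index[key]
    return index.get("!" + stem(key), [])


# Toy data, made up for illustration only.
toy_index: Index = {
    "color": [("顏色", 87.0)],
    "!run": [("跑", 62.5)],
}


def toy_stem(word: str) -> str:
    # Stand-in for a real Porter stemmer; strips a trailing "ning" or "s".
    for suffix in ("ning", "s"):
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word


print(lookup(toy_index, "Color", toy_stem))
print(lookup(toy_index, "running", toy_stem))
```

In practice the query would also need the same spelling normalization as the index (for example "colour" to "color") before the lookup; that step is omitted here.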

The score in each entry estimates how important the English term is to the definition of the Cantonese word, using some form of tf–idf. The exact formula is ad hoc and will probably change. Scores range from 0 to 100, but the dataset only includes entries scoring above 40 to reduce noise.
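Since the score is a relevance estimate, a consumer of the data may want a stricter cutoff than the built-in 40 and a ranking of the survivors. The sketch below shows one way to do that; the (word, score) pair shape, the function name `best_matches`, and the placeholder entries are assumptions for illustration, not the dataset's actual layout or contents.

```python
from typing import List, Tuple


def best_matches(
    entries: List[Tuple[str, float]],
    min_score: float = 40.0,
    limit: int = 3,
) -> List[Tuple[str, float]]:
    """Keep entries scoring strictly above min_score, highest score first."""
    kept = [entry for entry in entries if entry[1] > min_score]
    return sorted(kept, key=lambda entry: entry[1], reverse=True)[:limit]


# Placeholder candidates for a single English term (hypothetical values).
candidates = [("word_a", 45.0), ("word_b", 92.0), ("word_c", 40.0), ("word_d", 71.5)]

print(best_matches(candidates, min_score=50.0))
```

Because the formula may change, scores are probably best compared only within one release of the data rather than across releases.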

Data License: public domain. Credits to words.hk appreciated.
Raw data: JSON format | CSV format -- Last updated: 2022-05-14 10:59:37