(beta: public testing version)

words.hk (粵典) data


Description:

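Counts the occurrences of every character in the corpus, including non-CJK characters, and returns the result as a JSON dictionary.

Used to measure the usage frequency of individual characters (as opposed to words) in the corpus.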
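A minimal sketch of the character count described above. The function name `count_characters` and the two-line sample corpus are hypothetical, and the real pipeline may normalize text or treat whitespace differently. This version simply counts every code point it sees.

```python
import json
from collections import Counter


def count_characters(texts):
    """Count every character (CJK or not) across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Counter.update on a string counts each character in it.
        counts.update(text)
    return dict(counts)


# Hypothetical two-line corpus, for illustration only.
sample_corpus = ["我哋食飯", "我食咗飯 OK"]
char_counts = count_characters(sample_corpus)

# ensure_ascii=False keeps the CJK keys readable in the JSON output.
char_json = json.dumps(char_counts, ensure_ascii=False)
print(char_json)
```

Note that this sketch also counts the space and the Latin letters, in line with the statement that non-CJK characters are included.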

Data license: public domain. Credit to words.hk is appreciated.

Data last updated: 21 March 2021, 13:50:33

Description:

Queries the database for the written forms of all words and counts how many times each one occurs in the article content, without attempting any word segmentation.

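Used to measure the usage frequency of the words currently in the database.
Note that this list only includes words that have been seen in the corpus (粵文庫). It does not include words that are recorded in words.hk (《粵典》) but never appear in the corpus.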
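Counting without segmentation can be sketched as a plain substring search for each known word. The names `count_known_words`, `known_words`, and `articles` are hypothetical, and whether the actual query counts overlapping matches is not stated. `str.count` counts non-overlapping occurrences only.

```python
def count_known_words(words, articles):
    """Count occurrences of each known word by substring search.

    No segmentation is attempted, so a short word is also counted
    when it appears inside a longer word.
    """
    counts = {}
    for word in words:
        if not word:
            continue  # an empty string would match at every position
        total = sum(article.count(word) for article in articles)
        if total > 0:
            # Words never seen in the articles are left out of the result.
            counts[word] = total
    return counts


# Hypothetical word list and articles, for illustration only.
known_words = ["食飯", "食", "飲茶"]
articles = ["我哋食飯", "今日食咗飯未"]
word_counts = count_known_words(known_words, articles)
print(word_counts)
```

In this sample, 食 is counted inside 食飯 as well, which is the expected side effect of skipping segmentation, and 飲茶 is omitted because it never appears in the articles.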

Data license: public domain. Credit to words.hk is appreciated.

Data last updated: 28 March 2021, 3:22:09

Description:

Queries the database for the Jyutping of every word and outputs it character by character wherever the pronunciation appears valid.

Can be used as {character => Jyutping} data.

The list does not include the variant characters (異體字) that the system recognizes.
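One way to build such a character-to-Jyutping table is sketched below. The validity check used here (one space-separated syllable per character) is an assumption, since the description does not define what "valid" means. The function name and the sample entries, including the deliberately malformed last one, are hypothetical.

```python
from collections import defaultdict


def chars_to_jyutping(entries):
    """Map each character to the sorted set of syllables seen for it."""
    mapping = defaultdict(set)
    for word, jyutping in entries:
        syllables = jyutping.split()
        # Assumed validity rule: exactly one syllable per character.
        if len(syllables) != len(word):
            continue
        for char, syllable in zip(word, syllables):
            mapping[char].add(syllable)
    return {char: sorted(found) for char, found in mapping.items()}


# Hypothetical entries; the last one is malformed on purpose.
sample_entries = [
    ("食飯", "sik6 faan6"),
    ("行", "haang4"),
    ("銀行", "ngan4 hong4"),
    ("啱啱好", "ngaam1 ngaam1"),
]
char_jyutping = chars_to_jyutping(sample_entries)
print(char_jyutping["行"])
```

A character with several readings, such as 行 in the sample, ends up with every reading seen across the entries.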

Data license: public domain. Credit to words.hk is appreciated.

Data last updated: 28 March 2021, 3:29:46

Description:

Contains the word list and pronunciations of all entries recorded in words.hk (粵典). The result is a dictionary (pun not intended) with the written characters of each entry as keys and the list of possible pronunciations for that entry as values. Words that contain variant characters (異體字) are marked with a "*".
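A rough sketch of how such a word list might be consumed. The exact position of the "*" marker is not specified above, so this example assumes it appears inside the key. That placement, the sample JSON, and the function name `split_variants` are all hypothetical.

```python
import json

# Hypothetical excerpt. Placing "*" inside the key is an assumption,
# not the documented layout of the real file.
sample_json = (
    '{"食飯": ["sik6 faan6"], '
    '"行": ["haang4", "hong4"], '
    '"*裏面": ["leoi5 min6"]}'
)


def split_variants(wordlist):
    """Separate entries marked with "*" from the unmarked ones."""
    regular, variants = {}, {}
    for written, pronunciations in wordlist.items():
        target = variants if "*" in written else regular
        # Strip the marker so both results use the plain written form.
        target[written.replace("*", "")] = pronunciations
    return regular, variants


wordlist = json.loads(sample_json)
regular, variants = split_variants(wordlist)
print(regular["行"])
```

If the marker turns out to be stored elsewhere, for example in the pronunciation strings, only the test inside `split_variants` would need to change.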

Data license: public domain. Credit to words.hk is appreciated.

Data last updated: 28 March 2021, 3:53:58