粵典 (words.hk) Data
Counts the occurrences of each character in the corpus, including non-CJK characters, and returns the result as a JSON dictionary.
Use it to measure the usage frequency of characters (not words) in the corpus.
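The character count described above could be sketched roughly as follows. This is only an illustration, not the script words.hk actually runs: `count_characters` and `sample_corpus` are hypothetical names, the sample text is made up, and skipping whitespace is an assumption the source does not state.

```python
import json
from collections import Counter


def count_characters(texts):
    """Count every character (CJK or not) across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Assumption: whitespace is not counted as a character.
        counts.update(ch for ch in text if not ch.isspace())
    return dict(counts)


# Hypothetical stand-in for the corpus articles.
sample_corpus = ["我哋去飲茶", "我哋OK"]
char_counts = count_characters(sample_corpus)

# ensure_ascii=False keeps the CJK characters readable in the JSON output.
print(json.dumps(char_counts, ensure_ascii=False))
```

Because the count is per character rather than per word, no word segmentation is needed at this stage.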
Data License: public domain. Credits to words.hk appreciated.
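Queries the database for all written forms of words and counts how often each one occurs, without attempting any segmentation of the article content.
Use it to measure the usage frequency of the words that currently exist in the database.
Note that this list only includes words that have been seen in the corpus (粵文庫). It does not include words that 粵典 records but that never appear in the corpus.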
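One way to picture counting without segmentation is a plain substring search for each known word, as in the sketch below. The names `count_word_occurrences`, `known_words` and `sample_articles` are hypothetical and the data is made up. Whether the real query counts a short word that sits inside a longer one is not stated in the source; this sketch does count it.

```python
def count_word_occurrences(words, articles):
    """Count non-overlapping substring matches of each word in the articles."""
    counts = {}
    for word in words:
        total = sum(article.count(word) for article in articles)
        # Words never seen in the corpus are left out of the result.
        if total > 0:
            counts[word] = total
    return counts


# Hypothetical word list (from the dictionary) and corpus articles.
known_words = ["飲茶", "茶", "食飯"]
sample_articles = ["我哋去飲茶", "飲茶定飲咖啡", "杯茶好熱"]

word_counts = count_word_occurrences(known_words, sample_articles)
print(word_counts)
```

In this sketch "食飯" is absent from the output because it never occurs in the sample articles, which mirrors the note above about words that are in the dictionary but not in the corpus.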
Data License: public domain. Credits to words.hk appreciated.
Queries the database for the Jyutping of every word and outputs it character by character whenever the reading appears valid.
Can be used as {character => Jyutping} data.
The list does not include the character variants that the system recognizes.
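The source does not say what "appears valid" means. The sketch below assumes two checks: the number of space-separated syllables must equal the number of characters, and each syllable must look like lowercase letters followed by a tone digit from 1 to 6. Both checks, the function name `char_readings`, and the sample entries are assumptions made for illustration only.

```python
import re
from collections import defaultdict

# Assumed validity pattern: lowercase letters followed by a tone digit 1-6.
SYLLABLE = re.compile(r"^[a-z]+[1-6]$")


def char_readings(entries):
    """Build {character: sorted Jyutping syllables} from (word, jyutping) pairs."""
    readings = defaultdict(set)
    for word, jyutping in entries:
        syllables = jyutping.split()
        # Skip entries whose syllables cannot be aligned one-to-one with characters.
        if len(syllables) != len(word):
            continue
        if not all(SYLLABLE.match(s) for s in syllables):
            continue
        for ch, syllable in zip(word, syllables):
            readings[ch].add(syllable)
    return {ch: sorted(s) for ch, s in readings.items()}


# Hypothetical (word, Jyutping) pairs standing in for the database rows.
sample_entries = [
    ("飲茶", "jam2 caa4"),
    ("行", "haang4"),
    ("銀行", "ngan4 hong4"),
    ("壞咗", "waai6"),  # syllable count mismatch, so this row is skipped
]

reading_table = char_readings(sample_entries)
print(reading_table)
```

A character that occurs in several words can collect more than one reading, so each value is a list rather than a single string.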
Data License: public domain. Credits to words.hk appreciated.
Contains the wordlist and pronunciations of all entries recorded in the dictionary. The result is a dictionary (pun not intended) with written forms as keys and, for each key, a list of its possible pronunciations as the value. Words that contain character variants (異體字) are marked with a "*".
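Reading such a dictionary could look like the sketch below. The JSON excerpt is a made-up sample in the shape described above, not real output, and `pronunciations` is a hypothetical helper. The source does not say where the "*" marker is placed, so the sketch leaves it out entirely.

```python
import json

# Hypothetical excerpt: written form => list of possible pronunciations.
sample_json = '{"飲茶": ["jam2 caa4"], "行": ["haang4", "hang4", "hong4"]}'
wordlist = json.loads(sample_json)


def pronunciations(word_list, written):
    """Return the recorded pronunciations for a written form, or an empty list."""
    return word_list.get(written, [])


print(pronunciations(wordlist, "行"))
print(pronunciations(wordlist, "唔存在"))
```

Returning an empty list for unknown keys keeps calling code simple, since it can loop over the result without a separate existence check.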
Data License: public domain. Credits to words.hk appreciated.