tok

hanlp.pretrained.tok.COARSE_ELECTRA_SMALL_ZH = 'https://file.hankcs.com/hanlp/tok/coarse_electra_small_zh_20210603_112321.zip'

Electra (Clark et al. 2020) small model trained on coarse-grained Chinese word segmentation (CWS) corpora. Its performance is P=97.08%, R=96.94%, F1=97.01%, which is much higher than that of the MTL model.
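
Each constant in this module is a plain download URL meant to be passed to hanlp.load(). A minimal usage sketch (the sample sentence is arbitrary and the shown segmentation is illustrative, not a guaranteed output):

    import hanlp

    # Load the coarse-grained Electra small tokenizer by its identifier;
    # the model is downloaded and cached locally on first use.
    tok = hanlp.load(hanlp.pretrained.tok.COARSE_ELECTRA_SMALL_ZH)

    # Calling the tokenizer on a sentence returns a list of tokens.
    print(tok('商品和服务'))  # e.g. ['商品', '和', '服务']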

hanlp.pretrained.tok.CTB6_CONVSEG = 'https://file.hankcs.com/hanlp/tok/ctb6_convseg_nowe_nocrf_20200110_004046.zip'

Conv model (Wang & Xu 2017) trained on the CTB6 dataset.

hanlp.pretrained.tok.LARGE_ALBERT_BASE = 'https://file.hankcs.com/hanlp/tok/large_cws_albert_base_20200828_011451.zip'

ALBERT model (Lan et al. 2020) trained on the largest CWS dataset in the world.

hanlp.pretrained.tok.PKU_NAME_MERGED_SIX_MONTHS_CONVSEG = 'https://file.hankcs.com/hanlp/tok/pku98_6m_conv_ngram_20200110_134736.zip'

Conv model (Wang & Xu 2017) trained on the PKU98 six-month dataset, with person names merged into single units.

hanlp.pretrained.tok.SIGHAN2005_MSR_CONVSEG = 'https://file.hankcs.com/hanlp/tok/convseg-msr-nocrf-noembed_20200110_153524.zip'

Conv model (Wang & Xu 2017) trained on the SIGHAN2005 MSR dataset.

hanlp.pretrained.tok.SIGHAN2005_PKU_BERT_BASE_ZH = 'https://file.hankcs.com/hanlp/tok/sighan2005_pku_bert_base_zh_20201231_141130.zip'

BERT model (Devlin et al. 2019) trained on the SIGHAN2005 PKU dataset.

hanlp.pretrained.tok.SIGHAN2005_PKU_CONVSEG = 'https://file.hankcs.com/hanlp/tok/sighan2005-pku-convseg_20200110_153722.zip'

Conv model (Wang & Xu 2017) trained on the SIGHAN2005 PKU dataset.
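
The identifiers above are also collected in the module's ALL dict (a convention shared by HanLP's pretrained submodules), which maps each constant name to its URL, so the catalogue can be enumerated programmatically:

    import hanlp

    # Print every registered tokenizer model name and its download URL.
    for name, url in hanlp.pretrained.tok.ALL.items():
        print(name, url)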