resources

Chinese Treebank

CTB8

hanlp.datasets.parsing.ctb8.CTB8_BRACKET_LINE_NOEC_TRAIN = 'https://wakespace.lib.wfu.edu/bitstream/handle/10339/39379/LDC2013T21.tgz#data/tasks/par/train.noempty.txt'

Training set for CTB8 constituency parsing without empty categories.

hanlp.datasets.parsing.ctb8.CTB8_BRACKET_LINE_NOEC_DEV = 'https://wakespace.lib.wfu.edu/bitstream/handle/10339/39379/LDC2013T21.tgz#data/tasks/par/dev.noempty.txt'

Development set for CTB8 constituency parsing without empty categories.

hanlp.datasets.parsing.ctb8.CTB8_BRACKET_LINE_NOEC_TEST = 'https://wakespace.lib.wfu.edu/bitstream/handle/10339/39379/LDC2013T21.tgz#data/tasks/par/test.noempty.txt'

Test set for CTB8 constituency parsing without empty categories.
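Each constant in this section is a plain string in which the part after `#` appears to name a file inside the downloaded archive. The following is a minimal sketch, assuming that convention, of separating the two parts with the Python standard library. The helper name `split_resource` is hypothetical and not part of HanLP.

```python
from urllib.parse import urldefrag

# Value copied from CTB8_BRACKET_LINE_NOEC_TRAIN as documented on this page.
CTB8_TRAIN = ('https://wakespace.lib.wfu.edu/bitstream/handle/10339/39379/'
              'LDC2013T21.tgz#data/tasks/par/train.noempty.txt')


def split_resource(url):
    """Hypothetical helper: return (archive_url, path_inside_archive).

    Assumes the '#' fragment is a path relative to the archive root.
    """
    archive_url, inner_path = urldefrag(url)
    return archive_url, inner_path


archive_url, inner_path = split_resource(CTB8_TRAIN)
print(archive_url)   # the archive to download
print(inner_path)    # the file expected inside it
```

The same split should apply to the dev and test constants, since they differ only in the fragment.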

CTB9

hanlp.datasets.parsing.ctb9.CTB9_BRACKET_LINE_NOEC_TRAIN = 'https://catalog.ldc.upenn.edu/LDC2016T13/ctb9.0_LDC2016T13.tgz#data/tasks/par/train.noempty.txt'

Training set for CTB9 constituency parsing without empty categories.

hanlp.datasets.parsing.ctb9.CTB9_BRACKET_LINE_NOEC_DEV = 'https://catalog.ldc.upenn.edu/LDC2016T13/ctb9.0_LDC2016T13.tgz#data/tasks/par/dev.noempty.txt'

Development set for CTB9 constituency parsing without empty categories.

hanlp.datasets.parsing.ctb9.CTB9_BRACKET_LINE_NOEC_TEST = 'https://catalog.ldc.upenn.edu/LDC2016T13/ctb9.0_LDC2016T13.tgz#data/tasks/par/test.noempty.txt'

Test set for CTB9 constituency parsing without empty categories.
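The `BRACKET_LINE` part of these names suggests that each file stores one bracketed constituency tree per line. Under that assumption, a line can be read with a small recursive parser such as the sketch below. The functions `parse_bracketed` and `leaves` and the sample sentence are illustrative only; they are not taken from HanLP or from the treebank.

```python
def parse_bracketed(line):
    """Parse one bracketed tree into nested (label, children) tuples.

    Leaves are plain strings. Assumes parentheses inside tokens are
    already escaped (for example as -LRB- / -RRB-).
    """
    tokens = line.replace('(', ' ( ').replace(')', ' ) ').split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == '(', 'expected an opening bracket'
        pos += 1
        # An unlabeled wrapper such as "( (S ...))" gets an empty label.
        if tokens[pos] in ('(', ')'):
            label = ''
        else:
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ')':
            if tokens[pos] == '(':
                children.append(read())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return read()


def leaves(tree):
    """Return the words of a parsed tree in left-to-right order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(leaves(child))
    return words


# Illustrative line, not copied from the corpus.
sample = '(IP (NP (NN 经济)) (VP (VV 发展)))'
tree = parse_bracketed(sample)
print(tree[0])
print(leaves(tree))
```

Nested tuples keep the sketch dependency-free; a real pipeline would more likely rely on the loader that ships with the parsing library.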

English Treebank

PTB

hanlp.datasets.parsing.ptb.PTB_TRAIN = 'https://github.com/KhalilMrini/LAL-Parser/archive/master.zip#data/02-21.10way.clean'

Training set of PTB without empty categories. PoS tags are automatically predicted using 10-fold jackknifing (Collins & Koo 2005).

hanlp.datasets.parsing.ptb.PTB_DEV = 'https://github.com/KhalilMrini/LAL-Parser/archive/master.zip#data/22.auto.clean'

Dev set of PTB without empty categories. PoS tags are automatically predicted using 10-fold jackknifing (Collins & Koo 2005).

hanlp.datasets.parsing.ptb.PTB_TEST = 'https://github.com/KhalilMrini/LAL-Parser/archive/master.zip#data/23.auto.clean'

Test set of PTB without empty categories. PoS tags are automatically predicted using 10-fold jackknifing (Collins & Koo 2005).
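For readers unfamiliar with jackknifing, the idea is that every sentence is tagged by a model that never saw it during training. The sketch below shows only the fold bookkeeping for a 10-way split. The round-robin assignment and the function name `jackknife_folds` are assumptions made for illustration; the exact split used to produce these files is not stated here, and no tagger is included.

```python
def jackknife_folds(n_sentences, k=10):
    """Yield (train_indices, held_out_indices) for each of k folds.

    Sentences are assigned to folds round-robin (an assumption).
    A tagger would be trained on train_indices and used to predict
    tags for held_out_indices only.
    """
    for fold in range(k):
        held_out = [i for i in range(n_sentences) if i % k == fold]
        train = [i for i in range(n_sentences) if i % k != fold]
        yield train, held_out


# Every sentence lands in exactly one held-out fold.
covered = []
for train, held_out in jackknife_folds(25, k=10):
    assert not set(train) & set(held_out)
    covered.extend(held_out)
print(sorted(covered) == list(range(25)))
```

Concatenating the predictions from all folds gives automatically tagged data whose tag errors resemble those a parser will see at test time.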