resources
Chinese Treebank
CTB8
- hanlp.datasets.parsing.ctb8.CTB8_BRACKET_LINE_NOEC_TRAIN = 'https://wakespace.lib.wfu.edu/bitstream/handle/10339/39379/LDC2013T21.tgz#data/tasks/par/train.noempty.txt'
  Training set for CTB8 constituency parsing without empty categories.
- hanlp.datasets.parsing.ctb8.CTB8_BRACKET_LINE_NOEC_DEV = 'https://wakespace.lib.wfu.edu/bitstream/handle/10339/39379/LDC2013T21.tgz#data/tasks/par/dev.noempty.txt'
  Dev set for CTB8 constituency parsing without empty categories.
- hanlp.datasets.parsing.ctb8.CTB8_BRACKET_LINE_NOEC_TEST = 'https://wakespace.lib.wfu.edu/bitstream/handle/10339/39379/LDC2013T21.tgz#data/tasks/par/test.noempty.txt'
  Test set for CTB8 constituency parsing without empty categories.
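Each constant above is a single string that appears to combine an archive URL with, after the `#`, the path of one file inside that archive. The sketch below separates the two parts using only the Python standard library. The helper name `split_resource` is hypothetical and is not part of HanLP; how HanLP itself resolves these strings is not shown here.

```python
from typing import Tuple
from urllib.parse import urldefrag


def split_resource(identifier: str) -> Tuple[str, str]:
    """Split 'archive_url#member_path' into (archive_url, member_path).

    Assumption: the text after '#' names a file inside the archive.
    """
    url, fragment = urldefrag(identifier)
    return url, fragment


# Value copied from CTB8_BRACKET_LINE_NOEC_TRAIN above.
CTB8_TRAIN = (
    'https://wakespace.lib.wfu.edu/bitstream/handle/10339/39379/'
    'LDC2013T21.tgz#data/tasks/par/train.noempty.txt'
)

archive_url, member_path = split_resource(CTB8_TRAIN)
print(archive_url)
print(member_path)
```

Keeping the archive and the member path in one string means a single constant identifies both what to download and which file to read from it.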
CTB9
- hanlp.datasets.parsing.ctb9.CTB9_BRACKET_LINE_NOEC_TRAIN = 'https://catalog.ldc.upenn.edu/LDC2016T13/ctb9.0_LDC2016T13.tgz#data/tasks/par/train.noempty.txt'
  Training set for CTB9 constituency parsing without empty categories.
- hanlp.datasets.parsing.ctb9.CTB9_BRACKET_LINE_NOEC_DEV = 'https://catalog.ldc.upenn.edu/LDC2016T13/ctb9.0_LDC2016T13.tgz#data/tasks/par/dev.noempty.txt'
  Dev set for CTB9 constituency parsing without empty categories.
- hanlp.datasets.parsing.ctb9.CTB9_BRACKET_LINE_NOEC_TEST = 'https://catalog.ldc.upenn.edu/LDC2016T13/ctb9.0_LDC2016T13.tgz#data/tasks/par/test.noempty.txt'
  Test set for CTB9 constituency parsing without empty categories.
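The `BRACKET_LINE` in these constant names suggests that each file holds one bracketed constituency tree per line. As a minimal sketch under that assumption, the code below parses one such line into nested tuples and lists its words with their POS tags. The function names and the sample sentence are hypothetical and are not taken from the treebank or from HanLP; the parser also assumes every tree has a labelled root and balanced brackets.

```python
import re


def parse_bracketed(line):
    """Parse one bracketed tree into nested (label, children) tuples.

    Leaves (words) are kept as plain strings.
    """
    tokens = re.findall(r'\(|\)|[^\s()]+', line)
    pos = 0

    def parse():
        nonlocal pos
        assert tokens[pos] == '(', 'a constituent must start with "("'
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ')':
            if tokens[pos] == '(':
                children.append(parse())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing ')'
        return (label, children)

    return parse()


def tagged_words(tree):
    """Return (word, tag) pairs from the preterminals, left to right."""
    label, children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return [(children[0], label)]
    pairs = []
    for child in children:
        pairs.extend(tagged_words(child))
    return pairs


# Hypothetical example line, not taken from the treebank.
sample = '(IP (NP (NN 经济)) (VP (VV 发展)))'
tree = parse_bracketed(sample)
print(tagged_words(tree))
```

Because the files are described as being without empty categories, a reader like this one does not need to filter out trace nodes before collecting the words.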
English Treebank
PTB
- hanlp.datasets.parsing.ptb.PTB_TRAIN = 'https://github.com/KhalilMrini/LAL-Parser/archive/master.zip#data/02-21.10way.clean'
  Training set of PTB without empty categories. PoS tags are automatically predicted using 10-fold jackknifing (Collins & Koo 2005).
- hanlp.datasets.parsing.ptb.PTB_DEV = 'https://github.com/KhalilMrini/LAL-Parser/archive/master.zip#data/22.auto.clean'
  Dev set of PTB without empty categories. PoS tags are automatically predicted using 10-fold jackknifing (Collins & Koo 2005).
- hanlp.datasets.parsing.ptb.PTB_TEST = 'https://github.com/KhalilMrini/LAL-Parser/archive/master.zip#data/23.auto.clean'
  Test set of PTB without empty categories. PoS tags are automatically predicted using 10-fold jackknifing (Collins & Koo 2005).
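Jackknifing means that the tags for each sentence are predicted by a tagger that never saw that sentence during training, so the predicted tags on the training set are about as noisy as those on unseen data. The sketch below illustrates the idea with a toy most-frequent-tag tagger. Everything in it is an assumption for illustration: the function names, the interleaved fold assignment, the fallback tag, and the sample data are hypothetical, and the tagger actually used to produce the PTB files is not specified in this page.

```python
from collections import Counter, defaultdict


def train_tagger(sentences):
    """Toy tagger: remember the most frequent tag for each word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def jackknife_tags(sentences, n_folds=10, unknown_tag='NN'):
    """Retag every sentence with a model trained on the other folds."""
    predicted = [None] * len(sentences)
    for k in range(n_folds):
        # Sentences whose index falls in fold k are held out of training.
        train = [s for i, s in enumerate(sentences) if i % n_folds != k]
        model = train_tagger(train)
        for i, sentence in enumerate(sentences):
            if i % n_folds == k:
                predicted[i] = [(w, model.get(w, unknown_tag))
                                for w, _ in sentence]
    return predicted


# Hypothetical gold-tagged data: 20 copies of one short sentence.
gold = [[('the', 'DT'), ('dog', 'NN')] for _ in range(20)]
predicted = jackknife_tags(gold)
print(predicted[0])
```

With real data, a word that occurs only in the held-out fold receives the fallback tag, which is exactly the kind of error the parser will also meet at test time.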