rank_srl
Span Ranking Semantic Role Labeling.
- class hanlp.components.mtl.tasks.srl.rank_srl.SpanRankingSemanticRoleLabeling(trn: Optional[str] = None, dev: Optional[str] = None, tst: Optional[str] = None, sampler_builder: Optional[hanlp.common.dataset.SamplerBuilder] = None, dependencies: Optional[str] = None, scalar_mix: Optional[hanlp.layers.scalar_mix.ScalarMixWithDropoutBuilder] = None, use_raw_hidden_states=False, lr=0.001, separate_optimizer=False, lexical_dropout=0.5, dropout=0.2, span_width_feature_size=20, ffnn_size=150, ffnn_depth=2, argument_ratio=0.8, predicate_ratio=0.4, max_arg_width=30, mlp_label_size=100, enforce_srl_constraint=False, use_gold_predicates=False, doc_level_offset=True, use_biaffine=False, loss_reduction='mean', with_argument=' ', **kwargs)
An implementation of “Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling” (He et al. 2018b). It generates candidate triples of (predicate, arg_start, arg_end) and ranks them.
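The candidate generation described above can be sketched in plain Python. This is a minimal illustration, not HanLP's implementation: the helper names (enumerate_spans, top_k, candidate_triples), the inclusive span convention, the floor-based budgets, and the toy scoring function are all assumptions made for this example. Only the parameter names argument_ratio, predicate_ratio and max_arg_width, and their default values, come from the signature above.

```python
import math
from itertools import product
from typing import Callable, List, Sequence, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token offsets (assumed convention)


def enumerate_spans(num_tokens: int, max_arg_width: int) -> List[Span]:
    """List every span that is at most max_arg_width tokens wide."""
    return [(start, end)
            for start in range(num_tokens)
            for end in range(start, min(start + max_arg_width, num_tokens))]


def top_k(items: Sequence, scores: Sequence[float], k: int) -> list:
    """Keep the k highest-scoring items, returned in their original order."""
    ranked = sorted(range(len(items)), key=lambda i: scores[i], reverse=True)
    return [items[i] for i in sorted(ranked[:k])]


def candidate_triples(predicate_scores: Sequence[float],
                      span_scorer: Callable[[Span], float],
                      argument_ratio: float = 0.8,
                      predicate_ratio: float = 0.4,
                      max_arg_width: int = 30) -> List[Tuple[int, int, int]]:
    """Pair the surviving predicates with the surviving argument spans."""
    num_tokens = len(predicate_scores)
    spans = enumerate_spans(num_tokens, max_arg_width)
    # Budgets grow with sentence length (assumed here to be floor(ratio * n)).
    num_args = max(1, math.floor(num_tokens * argument_ratio))
    num_preds = max(1, math.floor(num_tokens * predicate_ratio))
    arguments = top_k(spans, [span_scorer(s) for s in spans], num_args)
    predicates = top_k(list(range(num_tokens)), predicate_scores, num_preds)
    return [(p, start, end) for p, (start, end) in product(predicates, arguments)]


# Toy scores: token 1 and token 3 look most like predicates,
# and the stand-in span scorer simply prefers shorter spans.
predicate_scores = [0.1, 0.9, 0.2, 0.3, 0.0]
triples = candidate_triples(predicate_scores,
                            span_scorer=lambda span: float(span[0] - span[1]))
```

In a real model the scores would come from learned scorers over encoder states, and each surviving triple would then receive a label score; the sketch only shows how the two ratios and the width limit bound the number of candidates.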
- Parameters
trn – Path to training set.
dev – Path to dev set.
tst – Path to test set.
sampler_builder – A builder which builds a sampler.
dependencies – Its dependencies on other tasks.
scalar_mix – A builder which builds a ScalarMixWithDropout object.
use_raw_hidden_states – Whether to use raw hidden states from the transformer without any pooling.
lr – Learning rate for this task.
separate_optimizer – Use a separate, customized optimizer for this task.
lexical_dropout – Dropout applied to hidden states of encoder.
dropout – Dropout applied to all layers other than the encoder.
span_width_feature_size – Span width feature size.
ffnn_size – Feedforward size.
ffnn_depth – Number of layers of feedforward MLPs.
argument_ratio – Ratio of candidate arguments over number of tokens.
predicate_ratio – Ratio of candidate predicates over number of tokens.
max_arg_width – Maximum argument width.
mlp_label_size – Feature size for label representation.
enforce_srl_constraint – Enforce SRL constraints (number of core ARGs etc.).
use_gold_predicates – Use gold predicates instead of predicting them.
doc_level_offset – True to indicate that the offsets in jsonlines are document-level.
use_biaffine – True to use biaffine (Dozat & Manning 2017) instead of a linear layer for label prediction.
loss_reduction – The loss reduction used in aggregating losses.
with_argument – The delimiter between tokens in arguments to be used for joining tokens for outputs.
**kwargs – Not used.
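To make doc_level_offset and with_argument more concrete, here is a small sketch of what the two options imply. It assumes a jsonlines-style layout in which each sentence carries a list of (predicate, arg_start, arg_end, label) tuples with inclusive ends; that layout and the helper names to_sentence_level and argument_text are illustrative assumptions, not part of HanLP's API.

```python
from typing import List, Sequence, Tuple

Role = Tuple[int, int, int, str]  # (predicate, arg_start, arg_end, label), inclusive ends (assumed)


def to_sentence_level(sentences: Sequence[Sequence[str]],
                      doc_srl: Sequence[Sequence[Role]]) -> List[List[Role]]:
    """Shift document-level offsets so that each sentence starts at token 0."""
    shifted = []
    offset = 0  # number of tokens in all preceding sentences
    for tokens, roles in zip(sentences, doc_srl):
        shifted.append([(p - offset, start - offset, end - offset, label)
                        for p, start, end, label in roles])
        offset += len(tokens)
    return shifted


def argument_text(tokens: Sequence[str], start: int, end: int,
                  with_argument: str = ' ') -> str:
    """Join the tokens of one argument span using the chosen delimiter."""
    return with_argument.join(tokens[start:end + 1])


sentences = [["She", "left", "."], ["He", "saw", "the", "dog", "."]]
# Offsets counted from the start of the document, as doc_level_offset=True implies.
doc_srl = [[(1, 0, 0, "ARG0")],
           [(4, 3, 3, "ARG0"), (4, 5, 6, "ARG1")]]

sentence_srl = to_sentence_level(sentences, doc_srl)
arg1 = argument_text(sentences[1], 2, 3)                     # space-delimited
arg1_joined = argument_text(sentences[1], 2, 3, with_argument='')  # no delimiter
```

An empty delimiter may suit languages written without spaces between words, while the default single space suits languages such as English.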
- build_criterion(**kwargs)
Implement this method to build criterion (loss function).
- Parameters
**kwargs – The subclass decides the method signature.
- build_dataloader(data, transform: Optional[Callable] = None, training=False, device=None, logger: Optional[logging.Logger] = None, gradient_accumulation=1, **kwargs) -> torch.utils.data.dataloader.DataLoader
Build a dataloader for training or evaluation.
- Parameters
data – Either a path or a list of samples.
transform – The transform from MTL, which is usually [TransformerSequenceTokenizer, FieldLength('token')].
training – Whether this method is called on training set.
device – The device dataloader is intended to work with.
logger – Logger for printing messages that indicate progress.
cache – Whether the dataloader should be cached.
gradient_accumulation – Gradient accumulation to be passed to sampler builder.
**kwargs – Additional experimental arguments.
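One plausible reading of why gradient_accumulation is handed to the sampler builder is that the sampler shrinks each batch so that several micro-batches add up to one effective batch. The sketch below illustrates that idea only; the function micro_batches, the length-sorted ordering, and the integer division are assumptions for this example and may not match what a HanLP SamplerBuilder does.

```python
from typing import List, Sequence


def micro_batches(lengths: Sequence[int], batch_size: int,
                  gradient_accumulation: int = 1) -> List[List[int]]:
    """Group sample indices into length-sorted micro-batches.

    Each micro-batch holds batch_size // gradient_accumulation samples, so that
    accumulating gradients over gradient_accumulation steps covers roughly one
    full batch.
    """
    micro_size = max(1, batch_size // gradient_accumulation)
    # Sorting by length keeps similarly sized samples together and limits padding.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    return [order[i:i + micro_size] for i in range(0, len(order), micro_size)]


lengths = [5, 2, 9, 3, 7, 4]  # hypothetical token counts per sample
batches = micro_batches(lengths, batch_size=4, gradient_accumulation=2)
```

With these hypothetical lengths, a batch size of 4 and an accumulation factor of 2 yield micro-batches of two samples each.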
- build_metric(**kwargs)
Implement this to build metric(s).
- Parameters
**kwargs – The subclass decides the method signature.
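The page does not say which metric this task builds. As an illustration of the kind of object a build_metric implementation could return for span-based SRL, the following sketch computes micro-averaged precision, recall and F1 over labeled (predicate, arg_start, arg_end, label) tuples. The class name SpanF1 and its interface are hypothetical and are not taken from HanLP.

```python
from typing import Iterable, Tuple

Role = Tuple[int, int, int, str]  # (predicate, arg_start, arg_end, label)


class SpanF1:
    """Micro-averaged precision, recall and F1 over exact-match role tuples."""

    def __init__(self) -> None:
        self.true_positives = 0
        self.num_predicted = 0
        self.num_gold = 0

    def update(self, predicted: Iterable[Role], gold: Iterable[Role]) -> None:
        predicted, gold = set(predicted), set(gold)
        # A prediction counts only if predicate, both boundaries and label all match.
        self.true_positives += len(predicted & gold)
        self.num_predicted += len(predicted)
        self.num_gold += len(gold)

    @property
    def precision(self) -> float:
        return self.true_positives / self.num_predicted if self.num_predicted else 0.0

    @property
    def recall(self) -> float:
        return self.true_positives / self.num_gold if self.num_gold else 0.0

    @property
    def f1(self) -> float:
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r) if p + r else 0.0


metric = SpanF1()
metric.update(predicted=[(1, 0, 0, "ARG0"), (1, 2, 2, "ARG1")],
              gold=[(1, 0, 0, "ARG0"), (1, 2, 3, "ARG1")])
```

Here one of the two predicted roles matches a gold role exactly, while the other has a wrong end boundary and therefore counts as both a false positive and a missed gold role.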