biaffine_ner¶
Biaffine Named Entity Recognition.
class hanlp.components.ner.biaffine_ner.biaffine_ner.BiaffineNamedEntityRecognizer(**kwargs)[source]¶
An implementation of Named Entity Recognition as Dependency Parsing (Yu et al. 2020). It treats every possible span as a candidate entity and predicts its entity label; non-entity spans are assigned the NULL label so they can be excluded. Label prediction is done with a biaffine layer (Dozat & Manning 2017). Since it makes no assumptions about the spans, it naturally supports both flat and nested NER. A minimal sketch of the span-scoring scheme follows the parameter list.
- Parameters
**kwargs – Predefined config.
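To make the span-scoring scheme concrete, here is a minimal sketch of a biaffine span classifier in PyTorch. The class and all names in it are illustrative assumptions, not HanLP's actual internals:

    import torch
    import torch.nn as nn

    class BiaffineSpanScorer(nn.Module):
        """Scores every span (i, j) against every label, NULL included."""
        def __init__(self, hidden_size, ffnn_size, n_labels):
            super().__init__()
            # Two MLPs give each token a "start" view and an "end" view,
            # mirroring the head/dependent split of biaffine parsing.
            self.start_mlp = nn.Linear(hidden_size, ffnn_size)
            self.end_mlp = nn.Linear(hidden_size, ffnn_size)
            # One bilinear map per label; the +1 adds a bias feature.
            self.biaffine = nn.Parameter(
                torch.zeros(n_labels, ffnn_size + 1, ffnn_size + 1))

        def forward(self, hidden):  # hidden: [batch, seq_len, hidden_size]
            s = torch.relu(self.start_mlp(hidden))
            e = torch.relu(self.end_mlp(hidden))
            ones = hidden.new_ones(*s.shape[:-1], 1)
            s = torch.cat([s, ones], dim=-1)  # append bias feature
            e = torch.cat([e, ones], dim=-1)
            # scores[b, i, j, l]: score of span (i, j) carrying label l;
            # label 0 is conventionally reserved for NULL (non-entity).
            return torch.einsum('bxi,lij,byj->bxyl', s, self.biaffine, e)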
build_criterion(**kwargs)[source]¶
Implement this method to build the criterion (loss function).
- Parameters
**kwargs – The subclass decides the method signature.
build_dataloader(data, batch_size, shuffle, device, logger: logging.Logger = None, vocabs=None, sampler_builder=None, gradient_accumulation=1, **kwargs) → torch.utils.data.dataloader.DataLoader[source]¶
Build dataloaders for the training, dev and test sets. It is suggested to build vocabs in this method if they have not been built yet; a sketch of that pattern follows the parameter list.
- Parameters
data – Data representing samples, which can be a path or a list of samples.
batch_size – Number of samples per batch.
shuffle – Whether to shuffle this dataloader.
device – Device tensors should be loaded onto.
logger – Logger for reporting messages when the dataloader takes a long time to build or when vocabs have to be built.
**kwargs – Arguments from **self.config.
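A hedged sketch of that lazy-vocab pattern; the helper names and the vocabs.mutable check are assumptions, not HanLP's verified internals:

    from torch.utils.data import DataLoader

    class MyRecognizer(BiaffineNamedEntityRecognizer):
        def build_dataloader(self, data, batch_size, shuffle, device,
                             logger=None, vocabs=None, **kwargs):
            dataset = self.build_dataset(data)  # hypothetical helper
            if vocabs is not None and vocabs.mutable:  # assumed: vocabs not frozen yet
                # First call on the training set: populate and lock the vocabs.
                self.build_vocabs(dataset, logger, vocabs)
            return DataLoader(dataset, batch_size=batch_size, shuffle=shuffle)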
build_metric(**kwargs) → hanlp.metrics.f1.F1[source]¶
Implement this to build metric(s).
- Parameters
**kwargs – The subclass decides the method signature.
build_model(training=True, **kwargs) → torch.nn.modules.module.Module[source]¶
Build the model.
- Parameters
training – True if called during training.
**kwargs – **self.config.
build_optimizer(trn, epochs, lr, adam_epsilon, weight_decay, warmup_steps, transformer_lr, **kwargs)[source]¶
Implement this method to build an optimizer. Since the signature exposes both lr and transformer_lr, a typical implementation assigns separate learning rates to the pretrained encoder and the randomly initialized decoder; see the sketch after the parameter list.
- Parameters
**kwargs – The subclass decides the method signature.
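A minimal sketch of such a split-learning-rate optimizer with linear warmup, assuming PyTorch's AdamW; model.encoder and model.decoder are illustrative attribute names:

    from torch.optim import AdamW
    from torch.optim.lr_scheduler import LambdaLR

    def build_optimizer_sketch(model, trn, epochs, lr, adam_epsilon,
                               weight_decay, warmup_steps, transformer_lr):
        groups = [
            {'params': model.encoder.parameters(), 'lr': transformer_lr},
            {'params': model.decoder.parameters(), 'lr': lr},
        ]
        optimizer = AdamW(groups, eps=adam_epsilon, weight_decay=weight_decay)
        total_steps = len(trn) * epochs
        # warmup_steps may be a ratio of total steps (e.g. 0.1), as in fit().
        warmup = int(warmup_steps * total_steps) if warmup_steps < 1 else int(warmup_steps)
        scheduler = LambdaLR(optimizer, lambda step: min(
            (step + 1) / max(1, warmup),  # linear warmup
            max(0.0, (total_steps - step) / max(1, total_steps - warmup))))  # linear decay
        return optimizer, scheduler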
build_vocabs(dataset, logger, vocabs, lock=True, label_vocab_name='label', **kwargs)[source]¶
Override this method to build vocabs.
- Parameters
dataset – The dataset to collect vocab items from, usually the training set.
logger – Logger for reporting progress.
evaluate_dataloader(data: torch.utils.data.dataloader.DataLoader, criterion: Callable, metric, logger, ratio_width=None, output=False, **kwargs)[source]¶
Evaluate on a dataloader.
- Parameters
data – Dataloader, which can be built from any data source.
criterion – Loss function.
metric – Metric(s).
output – Whether to save outputs to a file.
**kwargs – Not used.
execute_training_loop(trn: torch.utils.data.dataloader.DataLoader, dev: torch.utils.data.dataloader.DataLoader, epochs, criterion, optimizer, metric, save_dir, logger: logging.Logger, devices, gradient_accumulation=1, **kwargs)[source]¶
Implement this to run the training loop; a sketch of the gradient-accumulation pattern it typically involves follows the parameter list.
- Parameters
trn – Training set.
dev – Development set.
epochs – Number of epochs.
criterion – Loss function.
optimizer – Optimizer(s).
metric – Metric(s).
save_dir – The directory to save this component.
logger – Logger for reporting progress.
devices – Devices this component and dataloader will live on.
ratio_width – The width of the dataset size measured in number of characters, used by the logger to align messages.
**kwargs – Other hyper-parameters passed from sub-class.
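gradient_accumulation trades memory for effective batch size: losses from several mini-batches are backpropagated before a single optimizer step. A hedged sketch of that pattern, with illustrative names:

    def fit_one_epoch(model, trn, criterion, optimizer, gradient_accumulation=1):
        model.train()
        optimizer.zero_grad()
        for i, batch in enumerate(trn):
            loss = criterion(model(batch), batch['label'])  # illustrative batch layout
            # Scale so the accumulated gradient matches one large batch.
            (loss / gradient_accumulation).backward()
            if (i + 1) % gradient_accumulation == 0:
                optimizer.step()
                optimizer.zero_grad()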
fit(trn_data, dev_data, save_dir, embed: hanlp.layers.embeddings.embedding.Embedding, context_layer, sampler='sorting', n_buckets=32, batch_size=50, lexical_dropout=0.5, ffnn_size=150, is_flat_ner=True, doc_level_offset=True, lr=0.001, transformer_lr=1e-05, adam_epsilon=1e-06, weight_decay=0.01, warmup_steps=0.1, grad_norm=5.0, epochs=50, loss_reduction='sum', gradient_accumulation=1, ret_tokens=True, tagset=None, sampler_builder=None, devices=None, logger=None, seed=None, **kwargs)[source]¶
A usage sketch follows the parameter list.
- Parameters
trn_data – Path to training set.
dev_data – Path to dev set.
save_dir – The directory to save trained component.
embed – Embeddings to use.
context_layer – A contextualization layer (transformer or RNN).
sampler – Sampler to use.
n_buckets – Number of buckets to use in the KMeans sampler.
batch_size – The number of samples in a batch.
lexical_dropout – Dropout applied to hidden states of context layer.
ffnn_size – Feedforward size for MLPs extracting the head/tail representations.
is_flat_ner – True for flat NER, otherwise nested NER.
doc_level_offset – True to indicate that the offsets in jsonlines are at the document level.
lr – Learning rate for the decoder.
transformer_lr – Learning rate for the encoder.
adam_epsilon – The epsilon to use in Adam.
weight_decay – The weight decay to use.
warmup_steps – The number of warmup steps.
grad_norm – Gradient norm for clipping.
epochs – The number of epochs to train.
loss_reduction – The loss reduction used in aggregating losses.
gradient_accumulation – Number of mini-batches per update step.
ret_tokens – A delimiter between tokens in entities so that the surface form of an entity can be rebuilt.
tagset – Optional tagset to prune entities outside of this tagset from datasets.
sampler_builder – The builder to build the sampler, which will override batch_size.
devices – Devices this component will live on.
logger – Any logging.Logger instance.
seed – Random seed to reproduce this training.
**kwargs – Not used.
- Returns
The best metrics on the training set.
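A hedged usage sketch of fit; the paths are placeholders, and the embedding import and constructor are assumptions about HanLP's embedding API rather than a verified recipe:

    from hanlp.components.ner.biaffine_ner.biaffine_ner import BiaffineNamedEntityRecognizer
    # Assumed import path and signature for a transformer embedding:
    from hanlp.layers.embeddings.contextual_word_embedding import ContextualWordEmbedding

    ner = BiaffineNamedEntityRecognizer()
    ner.fit(
        trn_data='data/ner/train.jsonlines',  # placeholder path
        dev_data='data/ner/dev.jsonlines',    # placeholder path
        save_dir='data/model/biaffine_ner',
        embed=ContextualWordEmbedding('token', 'bert-base-cased'),  # assumed signature
        context_layer=None,  # placeholder; a transformer or RNN layer per the docs above
        epochs=50,
    )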
fit_dataloader(trn: torch.utils.data.dataloader.DataLoader, criterion, optimizer, metric, logger: logging.Logger, linear_scheduler=None, history: hanlp.common.structure.History = None, gradient_accumulation=1, **kwargs)[source]¶
Fit onto a dataloader.
- Parameters
trn – Training set.
criterion – Loss function.
optimizer – Optimizer.
metric – Metric(s).
logger – Logger for reporting progress.
**kwargs – Other hyper-parameters passed from sub-class.
predict(data: Union[List[str], List[List[str]]], batch_size: int = None, ret_tokens=True, **kwargs)[source]¶
Predict on data fed by the user. Users should avoid calling this method directly, since it is not guarded with torch.no_grad and will introduce unnecessary gradient computation. Use __call__ instead; a usage sketch follows the parameter list.
- Parameters
data – Sentences or tokens.
batch_size – Decoding batch size.
**kwargs – Used in sub-classes.
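A hedged usage sketch of inference through __call__; the save_dir is a placeholder for a previously trained component, and load() is assumed to restore it:

    ner = BiaffineNamedEntityRecognizer()
    ner.load('data/model/biaffine_ner')  # assumed: restores a component saved by fit()
    # __call__ wraps predict() in torch.no_grad(), so prefer it at inference time.
    print(ner([['Tim', 'Cook', 'visited', 'Beijing', '.']]))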