constituency_dataset

class hanlp.datasets.parsing.loaders.constituency_dataset.ConstituencyDataset(data: Union[str, List], transform: Optional[Union[Callable, List]] = None, cache=None, generate_idx=None)

A Dataset to which a list of transform functions can be applied.

Parameters
  • data – The local or remote path to a dataset, or a list of samples where each sample is a dict.

  • transform – A predefined transform function, or a list of them.

  • cache – True to enable caching, so that transforms won’t be called twice.

  • generate_idx – Create an IDX field for each sample to store its order in the dataset. Useful for prediction when samples are re-ordered by a sampler.
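
The parameters above can be illustrated with a small, library-free sketch. It is not HanLP's implementation: the class `TinyTransformableDataset`, the transform `add_length`, and the field name `"idx"` are hypothetical stand-ins chosen for illustration, and the real class may differ in every detail. The sketch only shows the behaviour the parameter descriptions suggest: transforms run on access, caching avoids running them twice, and an index field lets re-ordered samples be put back in their original order.

```python
from typing import Callable, List, Optional, Union

IDX = "idx"  # hypothetical field name; the library's own constant may differ


class TinyTransformableDataset:
    """Illustrative stand-in for a transformable dataset (not HanLP code)."""

    def __init__(self, data: List[dict],
                 transform: Optional[Union[Callable, List[Callable]]] = None,
                 cache: bool = False, generate_idx: bool = False):
        # Accept a single callable or a list of callables.
        if transform is None:
            transform = []
        elif callable(transform):
            transform = [transform]
        self.transform = list(transform)
        self.data = [dict(sample) for sample in data]
        if generate_idx:
            # Record each sample's original position in the dataset.
            for i, sample in enumerate(self.data):
                sample[IDX] = i
        self._cache = {} if cache else None

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, index: int) -> dict:
        if self._cache is not None and index in self._cache:
            return self._cache[index]  # transforms are skipped on a cache hit
        sample = dict(self.data[index])  # copy so the raw sample stays untouched
        for fn in self.transform:
            sample = fn(sample)
        if self._cache is not None:
            self._cache[index] = sample
        return sample


calls = []  # records every time the transform actually runs


def add_length(sample: dict) -> dict:
    calls.append(sample["token"])
    sample["length"] = len(sample["token"])
    return sample


samples = [{"token": ["A", "cat"]}, {"token": ["Dogs", "bark", "loudly"]}]
dataset = TinyTransformableDataset(samples, transform=add_length,
                                   cache=True, generate_idx=True)

first = dataset[0]
again = dataset[0]  # served from the cache, so add_length is not called again

# Mimic a sampler that re-orders samples (longest first), then restore order.
reordered = sorted((dataset[i] for i in range(len(dataset))),
                   key=lambda s: -s["length"])
restored = sorted(reordered, key=lambda s: s[IDX])
```

In this sketch the second access to `dataset[0]` returns the cached object, and sorting by the `IDX` field recovers the original order after the length-based shuffle.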

load_file(filepath: str)

The actual file loading logic.

Parameters
  • filepath – The path to a dataset.
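
As a rough idea of what file-loading logic for a constituency treebank might look like, here is a minimal sketch. It rests on assumptions not stated on this page: that the file holds one bracketed tree per non-empty line, and that each sample is a dict with a `"constituency"` field. The function name `load_bracketed_file` is hypothetical, and the sketch keeps each tree as a raw string, whereas a real loader would likely parse it into a tree object.

```python
import os
import tempfile
from typing import Dict, Iterator


def load_bracketed_file(filepath: str) -> Iterator[Dict[str, str]]:
    """Hypothetical loader: yields one sample dict per non-empty line."""
    with open(filepath, encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines between trees
            # The field name "constituency" is an assumption for illustration.
            yield {"constituency": line}


# Write a tiny two-tree file to a temporary location for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("(S (NP (DT A) (NN cat)) (VP (VBZ sleeps)))\n\n")
    tmp.write("(S (NP (NNS Dogs)) (VP (VBP bark)))\n")
    path = tmp.name

loaded = list(load_bracketed_file(path))
os.remove(path)  # clean up the temporary file
```

Writing the loader as a generator means samples are produced one at a time, so the caller decides whether to materialise the whole file as a list.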