Tokenization
Split text into tokens
Part-of-Speech Tagging
Tag a word in a text with its part of speech
Named Entity Recognition
Detect span and type of entities in text
Dependency Parsing
Parse grammatical structure that defines the relationship between words into a tree
Constituency Parsing
Extract a constituency-based parse tree from a sentence
Semantic Dependency Parsing
Represent semantic relations between words as a graph
Semantic Role Labeling
Model predicate-argument structures of a sentence
Abstract Meaning Representation
Represent semantic relations between concepts as a graph
from hanlp_restful import HanLPClient HanLP = HanLPClient('https://www.hanlp.com/api', auth=None, language='mul') # Support en: English, zh: Chinese, ja: Japanese, mul: Multilingual HanLP('''In 2021, HanLPv2.1 delivers state-of-the-art multilingual NLP techniques to production environments. 2021年、HanLPv2.1は次世代の最先端多言語NLP技術を本番環境に導入します。 2021年 HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。''').pretty_print()
HanLP also provides native APIs that you can run on your local machine, preferably with a GPU.
import hanlp HanLP = hanlp.load(hanlp.pretrained.mtl.UD_ONTONOTES_TOK_POS_LEM_FEA_NER_SRL_DEP_SDP_CON_MMINILMV2L6) HanLP(['In 2021, HanLPv2.1 delivers state-of-the-art multilingual NLP techniques to production environments.', '2021年、HanLPv2.1は次世代の最先端多言語NLP技術を本番環境に導入します。', '2021年 HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。']).pretty_print()