io_util
- hanlp.utils.io_util.check_outdated(package='hanlp', version='2.1.1', repository_url='https://pypi.python.org/pypi/%s/json')
Given the name of a package on PyPI and a version (both strings), checks whether the given version is the latest available version of the package. Returns a 2-tuple (installed_version, latest_version). repository_url is a %-style format string that selects a different PyPI repository URL, e.g. test.pypi.org or a private repository; it is formatted with the package name. Adapted from https://github.com/alexmojaki/outdated/blob/master/outdated/__init__.py
- Parameters
package – Package name.
version – Installed version string.
repository_url – %-style URL template for the PyPI repository, formatted with the package name.
- Returns
Parsed installed version and latest version.
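The comparison described above can be sketched as follows. This is a minimal illustration, not HanLP's implementation: `parse_version` and `compare_with_latest` are hypothetical helpers, the response is a canned string instead of a live request, and the only assumption about the repository is that its JSON reply carries the latest version under `info.version`, as the public PyPI JSON endpoint does.

```python
import json

# The default template from the signature, formatted with the package name.
url = 'https://pypi.python.org/pypi/%s/json' % 'hanlp'


def parse_version(text):
    """Turn '2.1.1' into (2, 1, 1) so that versions compare numerically.

    Simplification: pre-release tags such as '2.1.0a1' are not handled.
    """
    return tuple(int(part) for part in text.split('.'))


def compare_with_latest(installed, pypi_json):
    """Return (installed_version, latest_version) as comparable tuples."""
    latest = json.loads(pypi_json)['info']['version']
    return parse_version(installed), parse_version(latest)


# Canned reply standing in for the body that `url` would return.
response = '{"info": {"version": "2.1.2"}}'
installed, latest = compare_with_latest('2.1.1', response)
is_outdated = installed < latest
```

Comparing tuples of integers avoids the string-comparison trap where '2.10.0' sorts before '2.9.0'.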
- hanlp.utils.io_util.get_exitcode_stdout_stderr(cmd)
Execute the external command and get its exitcode, stdout and stderr. See https://stackoverflow.com/a/21000308/3730690
- Parameters
cmd – Command.
- Returns
Exit code, stdout, stderr.
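One common way to collect all three values uses the standard `subprocess` module. The sketch below assumes `cmd` is a shell-style string and that the outputs come back as bytes; `run_command` is a hypothetical name and the library function may differ in both respects.

```python
import shlex
import subprocess
import sys


def run_command(cmd):
    """Run cmd without a shell and return (exitcode, stdout, stderr)."""
    args = shlex.split(cmd)  # split the string the way a POSIX shell would
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()  # waits for exit and drains both pipes
    return proc.returncode, out, err


# Usage: ask the current interpreter to print a number.
code, out, err = run_command(shlex.quote(sys.executable) + ' -c "print(42)"')
```

Calling `communicate()` instead of reading the pipes one after another avoids a deadlock when the child fills one pipe while the parent is blocked on the other.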
- hanlp.utils.io_util.get_resource(path: str, save_dir='/Users/hankcs/.hanlp', extract=True, prefix='https://file.hankcs.com/hanlp/', append_location=True, verbose=False)
Fetch the real (local) path of a resource (model, corpus, whatever), storing it under save_dir.
- Parameters
path – A local path (which will be returned as is) or a remote URL (which will be downloaded, decompressed, then returned).
save_dir – Where to store the resource (Default value = hanlp.utils.io_util.hanlp_home()).
extract – Whether to unzip it if it’s a zip file (Default value = True).
prefix – A prefix that marks a URL (path) as official when the URL starts with it. Official resources do not go to the thirdparty folder under HANLP_HOME.
append_location – Whether to put unofficial files in a thirdparty folder.
verbose – Whether to print log messages.
- Returns
The real path to the resource.
- hanlp.utils.io_util.hanlp_home()
Home directory for HanLP resources.
- Returns
Data directory in the filesystem for storage, for example when downloading models.
This home directory can be customized with the following shell command or equivalent environment variable on Windows systems.
$ export HANLP_HOME=/data/hanlp
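The override can also be set from Python before any resource is fetched. The lookup below is a sketch of the behaviour described above, not the library's code: `resolve_home` is a hypothetical helper, and the `~/.hanlp` fallback is an assumption inferred from the default shown in the signatures on this page (the real default is platform dependent).

```python
import os


def resolve_home(default='~/.hanlp'):
    """Return the HANLP_HOME override if set, else the expanded default."""
    return os.environ.get('HANLP_HOME') or os.path.expanduser(default)


# Equivalent of `export HANLP_HOME=/data/hanlp` for the current process.
os.environ['HANLP_HOME'] = '/data/hanlp'
home = resolve_home()
```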
- hanlp.utils.io_util.hanlp_home_default()
Default data directory, which depends on the platform and environment variables.
- hanlp.utils.io_util.path_from_url(url, save_dir='/Users/hankcs/.hanlp', prefix='https://file.hankcs.com/hanlp/', append_location=True)
Map a URL to a local path.
- Parameters
url – Remote URL.
save_dir – The root folder to save this file.
prefix – The prefix of official website. Any URLs starting with this prefix will be considered official.
append_location – Whether to put unofficial files in a thirdparty folder.
- Returns
The real path that this URL is mapped to.
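The official/unofficial split can be illustrated with a small mapping function. Everything beyond the parameters documented above is an assumption made for the sketch: `map_url_to_path` is a hypothetical name, the host name is added under `thirdparty` to keep sources apart, and with `append_location=False` only the file name is kept. The example URLs and paths are made up.

```python
import os
from urllib.parse import urlparse

OFFICIAL_PREFIX = 'https://file.hankcs.com/hanlp/'


def map_url_to_path(url, save_dir, prefix=OFFICIAL_PREFIX, append_location=True):
    """Map a URL to a path under save_dir (illustrative layout only)."""
    parsed = urlparse(url)
    relative = parsed.path.lstrip('/')
    if not append_location:
        # Assumption: flatten to the bare file name.
        return os.path.join(save_dir, os.path.basename(relative))
    if url.startswith(prefix):
        # Official resource: keep the part of the URL after the prefix.
        return os.path.join(save_dir, url[len(prefix):])
    # Unofficial resource: isolate it under thirdparty/<host>/ (assumption).
    return os.path.join(save_dir, 'thirdparty', parsed.netloc, relative)


official = map_url_to_path('https://file.hankcs.com/hanlp/mtl/model.zip', '/data/hanlp')
unofficial = map_url_to_path('https://example.com/a/b.zip', '/data/hanlp')
```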
- hanlp.utils.io_util.replace_ext(filepath, ext) -> str
Replace the extension of filepath with ext.
- Parameters
filepath – The file path whose extension is to be replaced.
ext – The new extension.
- Returns
A new path.
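A minimal equivalent can be built on `os.path.splitext`. The sketch assumes that `ext` includes the leading dot, which the description above does not state; `swap_extension` is a hypothetical name.

```python
import os


def swap_extension(filepath, ext):
    """Return filepath with its last extension replaced by ext ('.json' etc.)."""
    root, _ = os.path.splitext(filepath)  # splits off only the last suffix
    return root + ext


new_path = swap_extension('corpus/train.tsv', '.json')
```

Because `splitext` removes only the final suffix, a name such as `archive.tar.gz` becomes `archive.tar` plus the new extension.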
- hanlp.utils.io_util.stdout_redirected(to='/dev/null', stdout=None)
Redirect stdout elsewhere. Copied from https://stackoverflow.com/questions/4675728/redirect-stdout-to-a-file-in-python/22434262#22434262
- Parameters
to – Target device.
stdout – Source device.
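For comparison, a lighter Python-level variant can be written with the standard `contextlib.redirect_stdout`. This is a different technique from the one in the linked answer: it swaps `sys.stdout` only, so output written directly to the file descriptor by C extensions or child processes is not captured. `quiet` is a hypothetical name.

```python
import contextlib
import os


@contextlib.contextmanager
def quiet(to=os.devnull):
    """Send Python-level print output to `to` for the duration of the block."""
    with open(to, 'w') as target, contextlib.redirect_stdout(target):
        yield


with quiet():
    print('this line is discarded')
```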
- hanlp.utils.io_util.uncompress(path, dest=None, remove=True, verbose=False)
Uncompress a file and clean up uncompressed files once an error is triggered.
- Parameters
path – The path to a compressed file
dest – The dest folder.
remove – Remove archive file after decompression.
verbose – True to print log messages.
- Returns
Destination path.
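The extract-then-clean-up behaviour can be sketched for zip archives with the standard library. This is an illustration under stated assumptions, not the library's code: `unzip_with_cleanup` is a hypothetical name, only zip files are handled, and `dest` defaults to the archive path without its extension.

```python
import os
import shutil
import zipfile


def unzip_with_cleanup(path, dest=None, remove=True):
    """Extract a zip archive and delete partial output if extraction fails."""
    if dest is None:
        dest = os.path.splitext(path)[0]  # assumption: sibling folder
    existed = os.path.exists(dest)
    try:
        with zipfile.ZipFile(path) as archive:
            archive.extractall(dest)
    except Exception:
        # Remove the half-extracted folder, but only if this call created it.
        if not existed:
            shutil.rmtree(dest, ignore_errors=True)
        raise
    if remove:
        os.remove(path)  # drop the archive once its contents are safe
    return dest
```

Cleaning up on failure means a later retry starts from a known state instead of mistaking a partial folder for a finished download.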