Install DagsHub:
pip install dagshub
To stream this data directly from DagsHub:
from dagshub.streaming import DagsHubFilesystem
fs = DagsHubFilesystem(".", repo_url="https://dagshub.com/DagsHub-Datasets/cotonoha-dic-dataset")
fs.listdir("s3://cotonoha-dic")
Description
Japanese Tokenizer Dictionaries for use with MeCab.
Additional information
Documentation
This dataset includes dictionaries for tokenization and morphological
analysis of Japanese for use with MeCab. It contains NINJAL’s UniDic, a
smaller modified version of UniDic for situations that require it, and the
legacy IPADic dictionary.
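As a rough illustration of how one of these dictionaries might be used once downloaded, the sketch below builds the option string that selects a dictionary directory through MeCab's `-d` flag and, optionally, an output format through `-O`. It assumes the `mecab-python3` binding (imported as `MeCab`) is installed; the helper names `mecab_args` and `tokenize` and the path `/opt/dic/unidic` are hypothetical and not part of this dataset. Paths containing whitespace are rejected here rather than guessing at MeCab's quoting rules.

```python
import os


def mecab_args(dicdir, output_format=None):
    """Build a MeCab option string that selects a dictionary directory.

    `dicdir` is a hypothetical local path to an unpacked dictionary.
    `output_format` is passed to MeCab's -O option (for example "wakati").
    """
    path = os.fspath(dicdir)
    if any(ch.isspace() for ch in path):
        # Assumption: avoid quoting issues by refusing such paths outright.
        raise ValueError("dictionary path must not contain whitespace")
    args = ["-d", path]
    if output_format:
        args += ["-O", output_format]
    return " ".join(args)


def tokenize(text, dicdir, output_format="wakati"):
    """Tokenize `text` with MeCab, or return None if the binding is missing."""
    try:
        import MeCab  # provided by the mecab-python3 package (assumed installed)
    except ImportError:
        return None
    tagger = MeCab.Tagger(mecab_args(dicdir, output_format))
    return tagger.parse(text)


if __name__ == "__main__":
    # Only prints the option string; no dictionary is needed for this step.
    print(mecab_args("/opt/dic/unidic", "wakati"))
```

Keeping the option-string construction separate from the `MeCab.Tagger` call means the same helper can point at UniDic, the smaller UniDic variant, or IPADic by changing only the directory argument.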
Update frequency
Infrequently (typically less than once a year)
Managed by
Cotonoha
License
Versions of UniDic offered here are available under the GPL, LGPL, or BSD license.
IPADic is offered under a unique BSD-like license; see the link below.
https://github.com/polm/ipadic-py/blob/master/ipadic/dicdir/COPYING