WikiSum: Coherent Summarization Dataset for Efficient Human-Evaluation
Install DagsHub:
pip install dagshub
To stream this data directly from DagsHub:
from dagshub.streaming import DagsHubFilesystem
fs = DagsHubFilesystem(".", repo_url="https://dagshub.com/DagsHub-Datasets/wikisum-dataset")
fs.listdir("s3://wikisum")
Description
This dataset provides how-to articles from wikihow.com together with their summaries, each written as a coherent paragraph. The dataset itself is available in wikisum.zip; each entry contains the article, the summary, the wikiHow URL, and an official fold (train, val, or test). In addition, human evaluation results are available in wikisum-human-eval.zip. They consist of human evaluations of summaries produced by the Pegasus system, the annotators' responses regarding the difficulty of the task, and the words they marked as unknown.
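Since every entry carries an official fold, a loader will usually begin by grouping records by that field. The following is a minimal sketch, not the dataset's documented schema: it assumes the extracted archive holds one JSON object per line, and the key names `article`, `summary`, `url`, and `fold` are hypothetical stand-ins for whatever the real file uses. The sample records and their URLs are placeholders, not real data.

```python
import json
from collections import defaultdict

# Official fold names as listed in the description.
FOLDS = ("train", "val", "test")


def split_by_fold(lines, fold_key="fold"):
    """Group JSON-lines records by fold.

    `fold_key` is an assumed key name; adjust it to match the real file.
    """
    splits = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        fold = record[fold_key]
        if fold not in FOLDS:
            raise ValueError(f"unexpected fold: {fold!r}")
        splits[fold].append(record)
    return dict(splits)


# Placeholder records standing in for the extracted file (not real data).
sample = [
    json.dumps({"article": "Step 1 ...", "summary": "Do A.",
                "url": "https://www.wikihow.com/Placeholder-A", "fold": "train"}),
    json.dumps({"article": "Step 1 ...", "summary": "Do B.",
                "url": "https://www.wikihow.com/Placeholder-B", "fold": "train"}),
    json.dumps({"article": "Step 1 ...", "summary": "Do C.",
                "url": "https://www.wikihow.com/Placeholder-C", "fold": "test"}),
]

splits = split_by_fold(sample)
for name, records in splits.items():
    print(name, len(records))
```

With a real file, any iterable of lines can be passed in place of `sample`, for example an open file handle. If the archive turns out to use a different layout, such as a single JSON array or CSV, only the parsing step inside the loop would need to change.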
Additional information
Documentation
Update frequency
Not currently being updated
Managed by
License
The dataset is published under CC-NC-SA-3.0.
The human evaluation is published under CC-SA-4.0.