T5 Summarisation Using PyTorch Lightning, DVC, DagsHub, and HuggingFace Spaces



title: t5s
emoji: 💯
colorFrom: yellow
colorTo: red
sdk: streamlit
app_file: app.py
pinned: false

t5s



This repository holds not only the code for the project but also the data, models, pipelines, and experiments. That makes the project easy to reproduce on any machine, and it lets you contribute data, models, and code to it.

Have a great idea for how to improve the model? Want to add data and metrics to make it more explainable/fair? We'd love to get your help.

Installation

To use and run the DVC pipeline, install the t5s package:

pip install t5s

Usage


First, clone the repo containing the code:

t5s clone 

Then create the directories the pipeline requires:

t5s dirs

Next, define the parameters for the run:

t5s start [-h] [-d DATASET] [-s SPLIT] [-n NAME] [-mt MODEL_TYPE]
                 [-m MODEL_NAME] [-e EPOCHS] [-lr LEARNING_RATE]
                 [-b BATCH_SIZE]
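
When the parameters come from a script or notebook, the flags in the usage line above can be assembled programmatically. The following is a minimal sketch, not part of t5s itself: `build_start_command` is a hypothetical helper, and the values passed at the bottom are placeholders chosen for illustration. Check `t5s start -h` for the values the CLI actually accepts.

```python
import shlex

# Option names mapped to the short flags listed in the usage line above.
START_FLAGS = {
    "dataset": "-d",
    "split": "-s",
    "name": "-n",
    "model_type": "-mt",
    "model_name": "-m",
    "epochs": "-e",
    "learning_rate": "-lr",
    "batch_size": "-b",
}


def build_start_command(**options):
    """Return the argv list for `t5s start` with the given options."""
    argv = ["t5s", "start"]
    for key, value in options.items():
        if key not in START_FLAGS:
            raise ValueError(f"unknown option: {key}")
        argv += [START_FLAGS[key], str(value)]
    return argv


# Placeholder values for illustration only.
command = build_start_command(model_type="t5", epochs=5, batch_size=8)
print(shlex.join(command))
```

Once t5s is installed, the resulting list can be passed to `subprocess.run(command, check=True)`. Building an argv list rather than a single string avoids shell quoting problems if a value contains spaces.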

Then pull the models from DVC:

t5s pull

To run the training pipeline:

t5s run
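
The steps up to this point can also be chained from Python so that a failure stops the sequence. This is a sketch under assumptions: `run_steps` is a hypothetical helper, and `t5s start` is called without flags on the assumption that every option is optional, as the usage line suggests. It defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess

# The commands from the steps above, in the order this README gives them.
TRAINING_STEPS = [
    ["t5s", "clone"],
    ["t5s", "dirs"],
    ["t5s", "start"],
    ["t5s", "pull"],
    ["t5s", "run"],
]


def run_steps(steps, dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    check=True makes subprocess raise CalledProcessError on a non-zero
    exit code, so later steps never run after a failed one.
    """
    planned = []
    for argv in steps:
        line = shlex.join(argv)
        print("$", line)
        if not dry_run:
            subprocess.run(argv, check=True)
        planned.append(line)
    return planned


# Dry run: nothing is executed, so this is safe to try before installing t5s.
planned = run_steps(TRAINING_STEPS)
```

Call `run_steps(TRAINING_STEPS, dry_run=False)` to execute the commands for real.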

Before pushing, make sure that the DVC remote is set up correctly:


dvc remote modify origin url https://dagshub.com/{user_name}/summarization.dvc
dvc remote modify origin --local auth basic
dvc remote modify origin --local user {user_name}
dvc remote modify origin --local password {your_token}
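
To avoid typing the token into a shell history, the four commands above can be generated from environment variables. This is a minimal sketch: `dvc_remote_commands` is a hypothetical helper, and `DAGSHUB_USER` and `DAGSHUB_TOKEN` are variable names assumed for this example, not something t5s or DVC defines.

```python
import os


def dvc_remote_commands(user_name, token, remote="origin"):
    """Build the four `dvc remote modify` commands shown above as argv lists."""
    url = f"https://dagshub.com/{user_name}/summarization.dvc"
    base = ["dvc", "remote", "modify", remote]
    return [
        base + ["url", url],
        # --local writes to .dvc/config.local, which DVC keeps out of Git,
        # so the credentials below are not committed.
        base + ["--local", "auth", "basic"],
        base + ["--local", "user", user_name],
        base + ["--local", "password", token],
    ]


# Hypothetical variable names; the fallbacks are obvious placeholders.
user = os.environ.get("DAGSHUB_USER", "your-user-name")
token = os.environ.get("DAGSHUB_TOKEN", "your-token")

for argv in dvc_remote_commands(user, token):
    # Mask the token so it does not end up in terminal scrollback or logs.
    shown = ["****" if part == token else part for part in argv]
    print(" ".join(shown))
```

Each returned list can be passed to `subprocess.run(argv, check=True)` from inside the cloned repo.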

Finally, push the model to DVC:

t5s push

To push this model to the HuggingFace Hub for inference, run:

t5s upload

To test the model and visualise the results, run:

t5s visualize

This creates a Streamlit app for testing.