Fastai community entry to 2020 Reproducibility Challenge




Reformer Reproducibility Experiments

Entry to 2020 Papers With Code Reproducibility Challenge

Our Reproducibility Challenge Submission

  • Our OpenReview paper submission to the challenge can be found here

Installation

Setup

If you don't already have one, it's a good idea to create a virtual environment and install the package into it:

python3 -m venv my_env
source ./my_env/bin/activate

Install

Then you can install the package via pip:

pip install reformer-fastai
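To confirm that the install landed in the environment you expect, a quick check like the one below may help. It is only a sketch: it assumes the import name is `reformer_fastai` (the pip name with an underscore), and the helper names `in_virtualenv` and `is_installed` are illustrative, not part of the library.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def is_installed(module_name: str = "reformer_fastai") -> bool:
    """Return True when the module can be located, without importing it.

    The default module name is an assumption about the package's import name.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("reformer_fastai importable:", is_installed())
```

If the second line prints False while the first prints True, the package was most likely installed with a different interpreter's pip.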

Contributing

This project used nbdev for all development; see the nbdev docs here to install nbdev and get started. Once you have nbdev installed, we suggest you follow the recommended contributor workflow.

Running Experiments

A pip-installed version of this library is needed to run experiments. All experiments are run using the run_exp command, followed by the task name and then the parameters for that task. run_exp --help displays a list of all parameters with a brief description of each. For brevity, only one example is shown below: how to run a Reformer Language Model experiment. A list of all experiment commands can be found here.

Example: Reversible Language Model

Below is the command used to generate the results in Section 4.4, "Effect of reversible layers", of our submission paper.

run_exp "lm_rev" \
        --n_epochs=10 \
        --bs=2 \
        --max_seq_len=4096 \
        --grad_accum=8 \
        --save_model=True \
        --clip=0.5 \
        --seed=444 \
        --precision=2 \
        --do_wandb_logging=False
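For readers unfamiliar with the reversible layers this experiment studies, the idea can be sketched in a few lines of plain Python. This is a toy illustration of the general reversible residual scheme, not the code in this library: the functions `f` and `g` stand in for the attention and feed-forward sublayers, and all names here are hypothetical.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Fn = Callable[[Vector], Vector]


def _add(a: Vector, b: Vector) -> Vector:
    return [x + y for x, y in zip(a, b)]


def _sub(a: Vector, b: Vector) -> Vector:
    return [x - y for x, y in zip(a, b)]


def rev_forward(x1: Vector, x2: Vector, f: Fn, g: Fn) -> Tuple[Vector, Vector]:
    """Forward pass: y1 = x1 + f(x2), y2 = x2 + g(y1)."""
    y1 = _add(x1, f(x2))
    y2 = _add(x2, g(y1))
    return y1, y2


def rev_inverse(y1: Vector, y2: Vector, f: Fn, g: Fn) -> Tuple[Vector, Vector]:
    """Recover the inputs from the outputs by undoing the two steps in reverse."""
    x2 = _sub(y2, g(y1))
    x1 = _sub(y1, f(x2))
    return x1, x2


if __name__ == "__main__":
    # Toy stand-ins for the two sublayers (assumptions for illustration only).
    f = lambda v: [0.5 * t for t in v]
    g = lambda v: [t * t for t in v]
    y1, y2 = rev_forward([1.0, -2.0], [0.25, 4.0], f, g)
    print(rev_inverse(y1, y2, f, g))
```

Because the inputs can be recomputed from the outputs (up to floating-point rounding), intermediate activations do not need to be stored for the backward pass, which is where the memory saving comes from.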

Hyperparameters Used

The main hyperparameters used are documented in the Experiment Commands page and the Experiment Configs page.

Results

A full description of our results, including charts and tables, can be found in our paper here on OpenReview. Our results are summarised as follows:

Claims around speed on longer sequences and reduced memory footprint were validated: as sequence length increased, Locality Sensitive Hashing ("LSH") Attention became faster, and increasing the number of hashes improved performance. We could not match the performance of a traditional Transformer with Reformer. Some experiments were not run for as long as in the paper due to a lack of computational resources, so the under-performance of our Reformer may be due to under-training, implementation differences, or nuances in JAX vs PyTorch. In addition, exploding gradients were encountered with mixed precision training, and several model settings were found to be unstable depending on the random seed or learning rate.
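As background for the LSH results, the bucketing step can be sketched with the random-projection scheme described in the Reformer paper: project a vector with a random matrix R and take the argmax over the concatenation [xR; -xR]. The snippet below is a minimal pure-Python sketch under that assumption; it is not this library's implementation, and the function names are hypothetical.

```python
import random
from typing import List

Matrix = List[List[float]]


def make_rotation(dim: int, n_buckets: int, rng: random.Random) -> Matrix:
    """Random Gaussian projection of shape [dim, n_buckets // 2]."""
    if n_buckets % 2 != 0:
        raise ValueError("n_buckets must be even")
    return [[rng.gauss(0.0, 1.0) for _ in range(n_buckets // 2)]
            for _ in range(dim)]


def lsh_bucket(x: List[float], rotation: Matrix) -> int:
    """Bucket id = argmax over the projections and their negations."""
    half = len(rotation[0])
    proj = [sum(x[i] * rotation[i][j] for i in range(len(x)))
            for j in range(half)]
    scores = proj + [-p for p in proj]
    return max(range(len(scores)), key=scores.__getitem__)


def lsh_hashes(x: List[float], rotations: List[Matrix]) -> List[int]:
    """One bucket id per hash round, each round using its own projection."""
    return [lsh_bucket(x, r) for r in rotations]


if __name__ == "__main__":
    rng = random.Random(444)
    rotations = [make_rotation(dim=4, n_buckets=8, rng=rng) for _ in range(3)]
    query = [0.3, -1.2, 0.8, 0.1]
    print(lsh_hashes(query, rotations))
```

Attention is then restricted to positions that share a bucket. Using several hash rounds lowers the chance that two similar vectors never land in the same bucket, at the cost of extra compute, which is consistent with the trade-off summarised above.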

Resources

Authors' Code and Resources

More Code

Data

Tokenizers used with these datasets can be found here

enwik8

WMT14

Explainers

Related