
ML Reproducibility Challenge - Fall 2022

Reproducibility Nov 22, 2022

The Papers with Code Reproducibility Challenge is back for its SIXTH edition, and so is DagsHub's sponsorship!

DagsHub is excited to sponsor the fall edition of the ML Reproducibility Challenge (MLRC) 2022 by compensating participants $500 per paper reproduced! (see specific details below)

DagsHub Reproducibility – Supporting the challenge

DagsHub was sparked by the idea of machine learning reproducibility. When machine learning research and projects are reproducible and accessible, they help advance the entire ML field. This is why we care so much about supporting and promoting the Papers with Code Reproducibility Challenge.

We are excited to announce that, for the third time(!), we will support participants taking on the reproducibility challenge, and we hope this encourages more teams to join and have a positive, lasting impact on the field.

The challenge is open to papers presented at the top ML conferences: NeurIPS 2022, ICML 2022, ICLR 2022, ACL 2022, EMNLP 2022, CVPR 2022, ECCV 2022, AAAI 2022, IJCAI-ECAI 2022, ACM FAccT 2022, and SIGIR 2022. Plus, this year's challenge supports papers published in top ML journals in 2022, including JMLR, TACL, and TMLR.

The goal is for these papers to be reproducible in a verifiable and reliable way, with the code, data, models, and experiments tracked in DagsHub.

Supporting you

We've created a channel on our Discord Community just for this challenge. To incentivize community members to spend their time on this challenge, and since many papers require expensive computing resources to reproduce, we are offering participants $500 per paper reproduced, as long as it meets the guidelines.

What you get

  • The internal satisfaction and experience of having reproduced a scientific paper
  • Your paper featured within the DagsHub community and homepage
  • A great project to showcase on your résumé
  • $500 per paper (reproduced according to the below instructions)

How do I actually participate?

See Reproducibility Challenge for more details on the criteria, and how to prepare and submit your report.

  • Select a paper presented at one of the conferences listed above and aim to replicate the main claim described in the paper.
  • Join our Discord community's #ml-reproducibility channel.
  • Use existing code released by the authors, or create a minimal implementation, to verify the main claims and results made by the paper.
  • Use DagsHub to track your code, data, models, and experiments.
    • You should use DagsHub storage to host your data, models, and artifacts to make sure everything is open-source and reproducible. If you require large amounts of storage (over 10GB), reach out to us first to make sure we can accommodate you.
    • Use MLflow to log experiments and artifacts to the DagsHub remote MLflow server. Register models as you prefer, but make sure the best version of your model is registered under Final (see the sketch after this list).
  • Fill in a Reproducibility Report and commit it as part of your code to DagsHub. If you're not sure how to fill it in, here is an article that will help.
  • Write a summary of your findings and ablation studies on the DagsHub Report page. Please address the following topics:
    • Was the paper reproducible?
    • Did you uncover any key insights?
    • (If applicable) Describe the dataset: source, data type, distribution, etc.
    • (If applicable) Describe the data processing method.
    • Describe the model: architecture, performance, etc.
  • Ensure you meet the ML Reproducibility criteria .
  • Submit your findings to the following places:
  • We will contact you to review your DagsHub submission.
    • Papers accepted by the challenge review qualify automatically.
    • Papers not accepted might qualify at DagsHub's discretion
  • Selected authors will have their work featured on the DagsHub community and homepage.
  • Authors will also be invited to share their work at a dedicated event.
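
For reference, here is a minimal sketch of what the MLflow step above might look like, using a toy scikit-learn model as a stand-in for your actual reproduction. The `<user>`, `<repo>`, and `<token>` placeholders are assumptions; replace them with your own DagsHub username, repository name, and access token.

```python
import os
import mlflow
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Hypothetical placeholders: substitute your DagsHub username, repo, and token.
os.environ["MLFLOW_TRACKING_USERNAME"] = "<user>"
os.environ["MLFLOW_TRACKING_PASSWORD"] = "<token>"
mlflow.set_tracking_uri("https://dagshub.com/<user>/<repo>.mlflow")

# Toy experiment standing in for the paper you are reproducing.
X, y = make_classification(n_samples=500, random_state=0)
model = LogisticRegression(max_iter=1000)

with mlflow.start_run(run_name="reproduction-baseline"):
    mlflow.log_param("max_iter", 1000)
    model.fit(X, y)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    # Log the model and register it in one call; per the guidelines,
    # the best version should be registered under the name "Final".
    mlflow.sklearn.log_model(model, "model", registered_model_name="Final")
```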

Timeline for the challenge

See Reproducibility Challenge for more details.

  • Challenge Starts: August 18th, 2022
  • Final submission deadline: February 3rd, 2023 (11:59 PM AoE)
    • You can submit your report at any time before the deadline.
    • Since writing the report takes time and you might want to refine your results, final submissions to DagsHub will be accepted up to two weeks after the challenge deadline (February 20th, 2023).
  • Authors notified for compensation: May 29th, 2023

Receiving the award

We will award every participant who meets the above-listed guidelines $500 towards computing costs. Exact details regarding the form and method of compensation will be communicated to eligible participants.

If you are new to DagsHub

DagsHub is like GitHub for machine learning and data projects: a place where data teams can work on and host their project components (code, data, models, experiments, and pipelines) under one roof. Using DagsHub's capabilities makes it easy to reproduce results.

Get Started

Now you’re off!🚀

Choose the paper you want to reproduce and let the games begin! 🧩🎉

Good luck!
