
Wrapping up the Papers with Code ML Reproducibility Challenge - Spring 2021

Collaboration Dec 14, 2021

The Spring 2021 edition of the ML Reproducibility Challenge is officially over, and we have some inspiring projects contributed by the DagsHub community to share!

Data Science Reproducibility is one of the core reasons DagsHub was established, and we're constantly developing new tools and integrations to support a fully reproducible workflow. This is why we're so excited about the Papers with Code Reproducibility Challenge and decided to support its participants for the second time (Spoiler Alert: we're also supporting the Fall 2021 Challenge!).

"Independent verification of data is a fundamental principle of scientific research across the disciplines. The self-correcting mechanisms of the scientific method depend on the ability of researchers to reproduce the findings of published studies in order to strengthen evidence and build upon existing work."  Nature

In the Spring 2021 edition, we supported 3 teams that submitted fully open source and reproducible ML papers that you can now easily use! Before we dive into the projects, we want to give our kudos to Papers with Code for organizing the event and our community members who invested time and effort in reproducing papers and making them accessible.


So without further ado, I'd like to welcome the reproduced papers of the Spring 2021 edition!


Contextual Decomposition Explanation Penalization

The paper
The repository

Contributors: Shailesh, Azhar, and Midhush

“For an explanation of a deep learning model to be effective, it must provide both insights into a model and suggest a corresponding action in order to achieve some objective. Too often, the litany of proposed explainable deep learning methods stop at the first step, providing practitioners with insight into a model, but no way to act on it.” Authors of the paper

The paper proposes Contextual Decomposition Explanation Penalization (CDEP), which uses explanations to penalize the model during training so that spurious correlations are not learned. CDEP decomposes the feature space into relevant and irrelevant features with the help of explanations and penalizes the model for relying on the irrelevant ones. For example, in the ISIC skin cancer classification task, the dataset contains biased images in which positive patients wear band-aids. Using CDEP, the model can be trained to ignore the band-aid feature, which is a spurious correlation, and learn the correct features instead. In the repository, CDEP has been applied to different architectures across multiple modalities.

From the official paper: Figure S4, heatmaps for benign samples from ISIC
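To make the objective concrete, here is a minimal sketch in PyTorch of what an explanation-penalized loss can look like. The `irrelevant_importance` tensor is a hypothetical stand-in for the contextual decomposition scores of a region marked as irrelevant (such as a band-aid mask); the real decomposition is implemented in the repository.

```python
import torch.nn.functional as F

def cdep_style_loss(logits, labels, irrelevant_importance, lam=1.0):
    # Standard prediction loss on the classification task.
    pred_loss = F.cross_entropy(logits, labels)
    # Penalize any importance the explanation assigns to features
    # marked as irrelevant (e.g. band-aid pixels).
    expl_penalty = irrelevant_importance.abs().mean()
    return pred_loss + lam * expl_penalty
```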

In this project, Shailesh, Azhar, and Midhush re-implemented the original PyTorch project in TensorFlow. This required them to write PyTorch's `unpool` function from scratch for TensorFlow. Their implementation later became a patch pushed to the TensorFlow Addons repository, making them contributors to two open-source projects for the price of one!
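Their patch lives in the TensorFlow Addons repository; purely as an illustration of the idea, max-unpooling can be expressed in TensorFlow by scattering pooled values back to the positions recorded by `tf.nn.max_pool_with_argmax`. This sketch assumes `include_batch_in_index=True` so that `argmax` holds flat indices into the whole batch:

```python
import tensorflow as tf

def max_unpool_2d(pooled, argmax, output_shape):
    # Invert max pooling: scatter each pooled value back to the flat
    # index where its maximum was originally found.
    flat_size = tf.reduce_prod(output_shape)
    scattered = tf.scatter_nd(
        indices=tf.reshape(argmax, [-1, 1]),
        updates=tf.reshape(pooled, [-1]),
        shape=tf.reshape(flat_size, [1]),
    )
    return tf.reshape(scattered, output_shape)

x = tf.random.normal([1, 4, 4, 1])
pooled, argmax = tf.nn.max_pool_with_argmax(
    x, ksize=2, strides=2, padding="SAME", include_batch_in_index=True)
unpooled = max_unpool_2d(pooled, argmax, tf.shape(x, out_type=tf.int64))
```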


Self-supervision for Few-shot Learning

The paper
The repository

Contributors: Arjun and Haswanth

In this paper, the researchers investigate the role of self-supervised learning (SSL) in the context of few-shot learning (FSL). Although recent research has shown the benefits of SSL on large unlabeled datasets, its utility on small datasets is relatively unexplored. They found that SSL reduces the relative error rate of few-shot meta-learners by 4%-27%, even when the datasets are small and use only the images within the datasets themselves.

From the official paper: Combining supervised and self-supervised losses for few-shot learning.

“We chose this paper because few-shot learning is an emerging and increasingly popular paradigm of machine learning, and self-supervised learning seems to be an easy way to get better performance in FSL with no bells and whistles.” Arjun and Haswanth
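To give a rough sense of what combining the two losses looks like, here is a minimal PyTorch sketch pairing a supervised cross-entropy loss with a rotation-prediction self-supervised loss. `backbone`, `classifier`, and `rotation_head` are hypothetical modules for illustration, not names from the authors' codebase:

```python
import torch
import torch.nn.functional as F

def joint_loss(backbone, classifier, rotation_head, images, labels, alpha=1.0):
    # Supervised branch: standard cross-entropy on the class labels.
    sup_loss = F.cross_entropy(classifier(backbone(images)), labels)

    # Self-supervised branch: predict which of four rotations was applied.
    k = torch.randint(0, 4, (images.size(0),), device=images.device)
    rotated = torch.stack(
        [torch.rot90(img, int(r), dims=(1, 2)) for img, r in zip(images, k)])
    ssl_loss = F.cross_entropy(rotation_head(backbone(rotated)), k)

    return sup_loss + alpha * ssl_loss
```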

Arjun and Haswanth reproduced and verified the paper's main results on five benchmark datasets, building on top of the authors' codebase. In addition, they implemented the domain selection algorithm from scratch and verified its benefits.

The paper used an image size of 224x224, which intrigued Arjun and Haswanth, so they decided to investigate how image size affects model performance. They reduced the image size to 84x84, a widely used setting in FSL papers, along with a correspondingly smaller architecture. They found that this setting deteriorated the model's performance.


Interpretable GAN Controls

The paper
The repository

Contributors: Vishnu and Midhush

This paper describes a simple technique to analyze GAN models and create interpretable controls for image synthesis, such as changes of viewpoint, aging, lighting, and time of day.

From the official paper: Figure 1: Sequences of image edits performed using control discovered with our method, applied to three different GANs.

Vishnu and Midhush used StyleGAN and StyleGAN2 models to reproduce the paper's results. The method works by computing the PCA of the mapping network's outputs for a number of sampled latent vectors. This yields a basis for the mapping network's output space, in which a new vector can be edited by varying its PCA coordinates. The edited vector is then fed to the synthesis network to obtain an image with the modified attributes.
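The core of that procedure fits in a few lines; in this sketch, `mapping` and `synthesis` are hypothetical stand-ins for StyleGAN's two sub-networks, assumed to accept and return NumPy arrays:

```python
import numpy as np
from sklearn.decomposition import PCA

def discover_directions(mapping, n_samples=10_000, n_components=80, latent_dim=512):
    # Sample latent vectors, push them through the mapping network, and
    # fit PCA to get a basis for the intermediate latent space.
    z = np.random.randn(n_samples, latent_dim)
    return PCA(n_components=n_components).fit(mapping(z))

def edit(w, pca, component, strength):
    # Moving along one principal direction changes an interpretable
    # attribute such as viewpoint, aging, or lighting.
    return w + strength * pca.components_[component]

# image = synthesis(edit(w, pca, component=1, strength=3.0))
```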

Vishnu and Midhush converted the original PyTorch implementation to TensorFlow and verified the claims presented in the paper. They trained the model with benchmark datasets used in the paper, such as FFHQ, LSUN Car, and CelebA-HQ. To further validate their implementation, they also tested the model's performance on datasets not used in the original paper, such as Beetles and Anime Portraits.

“Initially, we tried to recreate images with identical RGB values using the original PyTorch code and our modified code in TensorFlow. However, due to differences in the random number generators in PyTorch and TensorFlow, the random values were not the same even with the same seed. This resulted in minute differences in the background artifacts of some of the generated images. Once we identified this as the cause for the minor differences, we were able to plug in PyTorch’s random number generator in our TensorFlow implementation and successfully reproduce those images as well. Ultimately, we were able to verify all the claims pertaining to the StyleGAN and StyleGAN2 models.” Vishnu and Midhush
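One simple way to achieve that kind of cross-framework determinism is to draw the random values with PyTorch's seeded generator and hand them to TensorFlow as plain arrays; a minimal illustration of the idea:

```python
import torch
import tensorflow as tf

torch.manual_seed(0)
z = torch.randn(1, 512)                 # sampled with PyTorch's RNG
z_tf = tf.convert_to_tensor(z.numpy())  # identical values on the TensorFlow side
```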

Summary

We want to thank all the amazing Data Scientists who took part in this challenge. The DagsHub team enjoyed working with each and every one of you and learned a lot in the process. You made a great impact on the community and moved us one step closer to Open Source Data Science. As mentioned, we are supporting the Fall 2021 edition of the Papers with Code Reproducibility Challenge. If you want to take part and move the field of machine learning forward, go to the new guidelines page and join our Discord community to get started! The team is here to help!


Nir Barazida

Along with Dean Pleban

MLOps Team Lead @ DagsHub
