RPPP – Reddit Post Popularity Predictor
A project with two goals:
1. Given a Reddit post, predict how popular it will be (what its score will be)
2. Showcase a remote working file system with DVC

Puneetha Pai cb58a280f3 Change: commit back only when dvc.lock file changes 1 month ago
.dvc 3caabdf8f5 Fix: remove unused dvc remote google cache 2 months ago
data 7a9631b1dd Refactor: reorder files 2 months ago
models c1ea478718 Clean: param and metrics logging 2 months ago
src 58bf4bbc0c Fix Build: lint error and Jenkins file syntax change 2 months ago
test 20786d2e70 Fix: unused pytest import 2 months ago
.dockerignore e26c2233ad Add: Jenkins pipeline definition 2 months ago
.dvcignore e26c2233ad Add: Jenkins pipeline definition 2 months ago
.gitattributes 3ad4fd1944 Update: Pipeline once 2 months ago
.gitignore e48b1b5101 Remove: unused commands 1 month ago
CONTRIBUTING.md d9735a5dd8 Add contributing guide 10 months ago
Dockerfile e48b1b5101 Remove: unused commands 1 month ago
Jenkinsfile cb58a280f3 Change: commit back only when dvc.lock file changes 1 month ago
README.md 0aa625a470 Update 'README.md' 10 months ago
blog.md 2fc1e9512a Add: final PR reveiw usecase 1 month ago
dvc.lock 07408cd10a Fix Build: Add actual dvc.lock file 2 months ago
dvc.yaml c1ea478718 Clean: param and metrics logging 2 months ago
params.yaml e48b1b5101 Remove: unused commands 1 month ago
remote-wfs-setup.md cac18be035 Add 'remote-wfs-setup.md' 10 months ago
requirements.txt fd3dbf59f7 Fix: add dvc to requirements 2 months ago

Data Pipeline

Legend: DVC Managed File, Git Managed File, Metric, Stage File, External File

README.md

RPPP - Reddit Post Popularity Predictor

This project attempts to predict whether a Reddit submission will be popular, based on its features.

We currently provide models for r/MachineLearning only, based on the submission title and body.
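
To make "based on submission title and body" concrete, here is a minimal sketch of how one submission could be turned into a training example. It is not the project's actual pipeline: the `Submission` class, the `to_example` helper, and the `POPULARITY_THRESHOLD` value are all hypothetical, and this README does not state what score counts as popular.

```python
from __future__ import annotations

import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical cutoff: the README does not define which score counts as "popular".
POPULARITY_THRESHOLD = 50


@dataclass
class Submission:
    """Illustrative container for the fields mentioned in this README."""
    title: str
    body: str
    score: int


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def to_example(post: Submission) -> tuple[Counter, int]:
    """Return (bag-of-words counts over title + body, binary popularity label)."""
    tokens = tokenize(post.title + " " + post.body)
    label = int(post.score >= POPULARITY_THRESHOLD)
    return Counter(tokens), label
```

The word counts could then be fed to any text classifier; which one the project actually uses is defined in `src`, not here.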

DVC Remote Working File System

This project is also an exploration of the DVC remote WFS (working file system) workflow. To set up your remote WFS, read the Remote WFS Setup guide.

Contributing

Contributions Are Very Welcome!

Read the Contribution Guide for more information.

Ideas to work on:

  • Combine the textual and numerical classifiers into one model!
  • Add UI to test if your post is going to be successful!
  • Add MOAR data! (other subreddits, more from r/ML)
  • Improve model performance (there is a lotttt to improve)!
  • Add memes: Add MOAR MEMES
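
For the first idea in the list, one simple starting point is late fusion: keep the two classifiers separate and average their predicted probabilities. This is only a sketch under that assumption. "One model" could equally mean a single model trained on joint features, and the function names, the `text_weight` parameter, and the `cutoff` value are hypothetical.

```python
def combine_probabilities(p_text: float, p_numeric: float, text_weight: float = 0.5) -> float:
    """Weighted average of the two classifiers' predicted probabilities of 'popular'."""
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be between 0 and 1")
    return text_weight * p_text + (1.0 - text_weight) * p_numeric


def predict_popular(p_text: float, p_numeric: float,
                    text_weight: float = 0.5, cutoff: float = 0.5) -> bool:
    """Label a post popular when the fused probability reaches the cutoff."""
    return combine_probabilities(p_text, p_numeric, text_weight) >= cutoff
```

The weight would normally be tuned on held-out data rather than fixed at 0.5.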