Open Source Data Pipeline 🐶

Welcome to DagsHub’s Data Pipeline contribution project for Hacktoberfest 2023!

In this exciting Hacktoberfest challenge, DagsHub invites you to build data pipelines using DVC for automation and versioning of Open Source Machine Learning projects.

What is DagsHub?

DagsHub is a centralized platform to host and manage machine learning projects including code, data, models, experiments, annotations, model registry, and more! DagsHub does the MLOps heavy lifting for its users. Every repository comes with configured S3 storage, an experiment tracking server, and an annotation workspace - all using popular open-source tools like MLflow, DVC, Git, and Label Studio.

What's the Challenge?

DagsHub is excited to introduce the DVC Data Pipeline Contribution Challenge. In this challenge, we invite you to contribute DVC (Data Version Control) data pipelines to open-source projects on DagsHub. DVC pipelines are essential for efficiently managing, versioning, and sharing data workflows in machine learning and data science projects.

How Can You Participate?

Here's a step-by-step guide to get involved in this challenge:

1. Choose a Project: Explore open-source projects on DagsHub and select one that interests you. It can be any project that uses data pipelines or would benefit from one.
2. Create the DVC Pipeline: Fork the project under your own account, then use DVC to design and run a data pipeline that suits the project's needs. Ensure it follows best practices for data versioning, reproducibility, and scalability.
3. Document Your Pipeline: As you build the pipeline, maintain clear and concise documentation describing its purpose, data sources, processing steps, and any dependencies. This documentation is crucial for future users and contributors and should be added to the project’s README file.
4. Tag Your Project: Add the relevant tags to the DagsHub repository and its files, including dvc, data-pipeline, hacktoberfest, and hacktoberfest-2023.
5. Submit Your Contribution: Open a Pull Request to the project on DagsHub.
6. Proof of Contribution: Open a Pull Request to this repository (DagsHub/open-source-data-pipeline) with the README.md, dvc.yaml, and dvc.lock files and a link to the DagsHub repo.
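
To make the steps above more concrete, here is a minimal sketch of what a two-stage pipeline definition could look like. It is written in Python and only renders the text of a `dvc.yaml` file; it does not call DVC itself. The stage names (`prepare`, `train`), the script paths, and the data paths are all hypothetical placeholders, not taken from any particular project, so adapt them to the repository you choose.

```python
"""Render a minimal dvc.yaml for a two-stage pipeline (illustrative sketch)."""

# Hypothetical stages: every script and data path below is a placeholder.
STAGES = {
    "prepare": {
        "cmd": "python src/prepare.py data/raw.csv data/prepared.csv",
        "deps": ["src/prepare.py", "data/raw.csv"],
        "outs": ["data/prepared.csv"],
    },
    "train": {
        "cmd": "python src/train.py data/prepared.csv models/model.pkl",
        "deps": ["src/train.py", "data/prepared.csv"],
        "outs": ["models/model.pkl"],
    },
}


def render_dvc_yaml(stages):
    """Return dvc.yaml text with cmd, deps, and outs for each stage."""
    lines = ["stages:"]
    for name, spec in stages.items():
        lines.append(f"  {name}:")
        lines.append(f"    cmd: {spec['cmd']}")
        for key in ("deps", "outs"):
            lines.append(f"    {key}:")
            for path in spec[key]:
                lines.append(f"      - {path}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the rendered file; redirect to dvc.yaml to use it.
    print(render_dvc_yaml(STAGES), end="")
```

Saving the output as `dvc.yaml` in the forked project and running `dvc repro` should execute the stages in dependency order and write `dvc.lock`, which records the hashes of the dependencies and outputs. Because `train` lists `data/prepared.csv` as a dependency and `prepare` lists it as an output, DVC can infer the ordering between the two stages without it being stated explicitly.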

Why Join the Challenge?

Participating in the DagsHub DVC Data Pipeline Contribution Challenge offers numerous benefits:

Skill Enhancement: Sharpen your DVC skills and gain hands-on experience in creating robust data pipelines.
Collaborative Learning: Collaborate with open-source project maintainers and fellow contributors, expanding your network and knowledge.
Contribution to Open Source: Contribute to the open-source community by enhancing the data workflows of valuable projects.
Visibility: Showcase your expertise to a wider audience within the data science and machine learning community.

DagsHub / open-source-data-pipeline connected to https://github.com/DagsHub/open-source-data-pipeline.git
