When people work smarter, and work together, they can build awesome things.
Data scientists are struggling with this.
The basis for frictionless collaboration is the ability to understand what others have done, and the ability to pick up where they left off.
DAGsHub is a web platform based on open source tools, optimized for data science and oriented towards the open source community.
It is a central location where projects can be hosted, discovered, and collaborated on.
Leverage the same best practices used in software engineering to get more done. DAGsHub helps
automate your workflow, so you can focus on work instead of coordination.
Share intermediate results from your pipeline with any collaborator, so nobody has to re-run all the pre-processing code, and re-run only the parts of the pipeline that you changed.
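To make the idea of re-running only the changed parts of a pipeline concrete, here is a minimal sketch in Python. It is an illustration only, not how DAGsHub or DVC implement it: the names `fingerprint`, `run_stage`, and the `stage_cache.json` file are hypothetical, and the approach (hashing a stage's input files and skipping the stage when the hash is unchanged) is an assumed simplification.

```python
import hashlib
import json
from pathlib import Path


def fingerprint(paths):
    """Return one SHA-256 digest covering the names and contents of all files."""
    digest = hashlib.sha256()
    for path in sorted(str(p) for p in paths):
        digest.update(path.encode())
        digest.update(Path(path).read_bytes())
    return digest.hexdigest()


def run_stage(name, deps, outs, action, cache_file="stage_cache.json"):
    """Run `action` only if the dependencies changed since the last run.

    Hypothetical helper: returns True if the stage ran, False if it was skipped.
    """
    cache_path = Path(cache_file)
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    current = fingerprint(deps)
    outputs_exist = all(Path(o).exists() for o in outs)
    if cache.get(name) == current and outputs_exist:
        return False  # inputs unchanged and outputs present: skip the stage
    action()
    cache[name] = current
    cache_path.write_text(json.dumps(cache, indent=2))
    return True
```

In this sketch, a collaborator who already has the stage outputs and an up-to-date cache file skips the stage entirely, while editing any dependency causes only that stage to run again.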
Ease of use also means it takes less time to get new team members and collaborators caught up and contributing.
Reproduce any experiment from previous versions of your data pipeline. It’s as simple as switching a
branch and running a single command.
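As a rough sketch of what "switching a branch and running a single command" could look like when scripted, the Python snippet below assumes the project is a Git repository with a DVC pipeline and that both `git` and `dvc` are installed. The helper names `reproduce_commands` and `reproduce` are hypothetical; `git checkout` and `dvc repro` are the standard Git and DVC commands.

```python
import subprocess


def reproduce_commands(branch):
    """Return the commands that switch to `branch` and re-run the pipeline."""
    return [
        ["git", "checkout", branch],  # switch to the experiment's branch
        ["dvc", "repro"],             # re-run the pipeline stages that are out of date
    ]


def reproduce(branch, runner=subprocess.run):
    """Run each command in order, stopping at the first failure."""
    for command in reproduce_commands(branch):
        runner(command, check=True)
```

Calling `reproduce("my-experiment")` (the branch name is a placeholder) would run both commands in the current working directory.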
Experiment with different hyperparameters in parallel. Each run will automagically save its results in an organized, easy-to-access location.
Giant spreadsheets of hyperparameters and inconsistent naming conventions are a
thing of the past.
DVC tracks data & code versions. Every time you run a stage in your pipeline, your data is saved automatically, so you don’t need to worry about accidentally ruining your data.
You can visualize your pipeline (as DAGs) and see how it changes over time, and get any version of your model with a click.
DAGsHub is built on DVC,
an open source version control system for machine learning projects, which works seamlessly with
We are completely agnostic to your ecosystem, programming language and libraries. Fully modular and awesome!
Also, we are completely free for open source projects!