A: DAGsHub is a web platform for data version control and collaboration for data scientists and machine learning engineers.
A: It’s like GitHub for data science and machine learning.
####Q: Why can’t I just use Git?
A: Regular Git handles large files poorly, and large files are central to many data science and machine learning projects.
git-lfs is an extension to git that can be used to version large files, but that's only half of the problem.
Git and git-lfs don't version the data pipeline. This means that when something in your pipeline is modified, you won't know that the end of the pipeline (e.g. the trained model) should be reproduced.
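To make the "you won't know" problem concrete, here is a toy Python sketch of what detecting a modified pipeline involves: record a fingerprint of each stage's inputs, then compare it against the current inputs. This is an illustration only, not how DVC is actually implemented; the file names, the `train` stage, and the helper functions are all hypothetical.

```python
import hashlib


def fingerprint(contents: bytes) -> str:
    """Content hash used to notice when a file has changed."""
    return hashlib.sha256(contents).hexdigest()


def stale_stages(stages, files, recorded):
    """Return the stages whose dependencies differ from the last recorded run.

    stages:   stage name -> list of dependency file names
    files:    file name -> current contents (bytes)
    recorded: stage name -> {dependency name: fingerprint at last run}
    """
    stale = []
    for name, deps in stages.items():
        current = {dep: fingerprint(files[dep]) for dep in deps}
        if current != recorded.get(name):
            stale.append(name)
    return stale


# Hypothetical project: one training stage that depends on data and code.
files = {"data.csv": b"a,b\n1,2\n", "train.py": b"print('train')\n"}
stages = {"train": ["data.csv", "train.py"]}

# Pretend the stage was just run, and remember what its inputs looked like.
recorded = {"train": {dep: fingerprint(files[dep]) for dep in stages["train"]}}
before = stale_stages(stages, files, recorded)

# Someone edits the data; the trained model is now out of date.
files["data.csv"] = b"a,b\n1,3\n"
after = stale_stages(stages, files, recorded)
```

Plain Git records that `data.csv` changed, but nothing in it says that the model trained from that file is now stale. That bookkeeping is the missing half.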
A: The short answer: YES.
The longer answer is that DAGsHub is built on git and DVC, which is an open source command-line tool built for data and pipeline versioning. You use git for the exact same things you would in a regular code project, and you use DVC on top for the DS/ML versioning stuff. DAGsHub adds visualizations and automation features on top of that.
A: The great thing about DVC is that it doesn’t affect code versioning. You still use plain old git for that.
DVC adds commands for DS and ML on top of that, but the syntax is similar to git, so it’s not entirely unfamiliar. Most git commands have a direct equivalent in DVC.
A: In a nutshell: DAGsHub is for DVC what GitHub is for git.
DVC is great, and so is git. But both are command-line tools, and that comes with some limitations, which DAGsHub addresses.
First of all, there is no convenient interface to visualize your pipeline and get an overview of your project metrics. DAGsHub shows your pipeline as a, wait for it, DAG (!!!), where every node is a file, with important details and a direct link to the file itself. This is especially important for team projects, where you want everyone on the same page and seeing the same high-level picture.
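If the DAG idea is new to you, the sketch below shows the general shape in Python: files are nodes, an edge points from a file to whatever is produced from it, and walking the edges tells you what sits downstream of a change. The file names and the `downstream` helper are made up for illustration and say nothing about how DAGsHub stores or renders pipelines.

```python
from collections import deque

# Hypothetical pipeline: each file maps to the files produced from it.
EDGES = {
    "raw_data.csv": ["features.csv"],
    "featurize.py": ["features.csv"],
    "features.csv": ["model.pkl"],
    "train.py": ["model.pkl"],
    "model.pkl": ["metrics.json"],
}


def downstream(edges, changed):
    """List every file that depends, directly or indirectly, on `changed`."""
    seen = set()
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


affected_by_data = downstream(EDGES, "raw_data.csv")
affected_by_training_code = downstream(EDGES, "train.py")
```

In this toy graph, a change to `raw_data.csv` reaches the features, the model, and the metrics, while a change to `train.py` reaches only the model and the metrics.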
You can send someone a link to your DAGsHub repo, and give them a way to explore your project, including downloading your data and models from any past version, experiment, or branch, without forcing them to clone or run any code.
Building on the powerful foundations of git and DVC, we have many more features in the works, which should make life easier for everyone.
A: NOTHING! This is why we love DVC so much. Just like git, it is non-intrusive and not bloated. You just install the program and it works.
A: Nope. Completely, 100% language and library agnostic. DVC and DAGsHub don’t care if you’re using Python or R, Keras or PyTorch.
A: Actually, we can. You can make a DAGsHub account and mirror a repo if you don’t want to migrate. That way you can manage your code on GitHub and get most of the awesome features DAGsHub has to offer here.
After logging in, go to: https://dagshub.com/repo/migrate to mirror your project from GitHub or any other existing repo.
A: Starting at a whopping $0, DAGsHub is completely free for open source projects. Private repos are currently free with up to 2 additional collaborators. If you need more collaborators, early access to new features, or have other special requests, you can contact us through our plans page for more details.
A: You can start with the tutorial.