DAGsHub Storage¶
What is it?¶
DAGsHub's Onboard Storage is an HTTP remote cache for DVC. Every repository has it, and anyone can use it without a degree in DevOps or a billing account with a cloud provider. This means you can easily store and version your data and models alongside your code.
How does it work?¶
It works the same way as getting a git remote URL for your git repository: you create a repository, and it automatically provides you with a DVC remote URL. When pushing or pulling data from this URL, you use your existing DAGsHub credentials (via HTTPS basic authentication).
This means you automatically get the same access control as for the git repository that holds your code: public repo data is publicly readable, but only maintainers of the project can push data or read data from a private repo. Just set up your DAGsHub DVC remote, and start working!
Setting up DAGsHub as remote¶
- Go to your repository homepage.
- Copy your DVC remote URL.
- Open a terminal in your project.
- Add a DVC remote:
    dvc remote add origin --local <--dvc-remote-url-->
That's it! You're all set to pull the repository data!
Pushing files or using a private repo¶
- Set the DVC remote to use basic auth:
    dvc remote modify origin --local auth basic
  Why --local?
  Everything you configure without --local ends up in the .dvc/config file, which is tracked by git and will appear in your repository. Personal info like authentication details should always be kept local.
- Set your credentials:
    dvc remote modify origin --local user <--user-->
    dvc remote modify origin --local ask_password true
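Because anything configured without --local lands in the git-tracked .dvc/config, a quick check before committing can catch a leaked credential. The following is a minimal sketch, not part of DAGsHub or DVC: the function name `check_tracked_config` is hypothetical, and it assumes the config file stores settings as `key = value` lines.

```shell
# Hypothetical helper: fail if a git-tracked DVC config file contains
# credential keys. Assumes settings are stored as "key = value" lines.
check_tracked_config() {
    config_file="${1:-.dvc/config}"

    # A missing file cannot leak anything.
    [ -f "$config_file" ] || return 0

    # Look for "user = ..." or "password = ..." settings, ignoring indentation.
    # "ask_password" does not match because the key must start the line.
    if grep -Eq '^[[:space:]]*(user|password)[[:space:]]*=' "$config_file"; then
        echo "credentials found in $config_file; set them again with --local" >&2
        return 1
    fi
    return 0
}
```

Running `check_tracked_config` with no argument inspects .dvc/config in the current directory; a non-zero exit status means the file should not be committed as it is.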
Use access tokens instead of filling in your password
Important Note: Using this method for authentication without following the instructions closely might result in pushing your password or access token to a public repository. Please use it with caution.
If you prefer not to enter your password every time you push to your DVC remote, or you are using a service machine which is not interactive, you can use this alternative setup.
- Create an access token in the tokens settings menu. The token is shown immediately after you create it. Copy it.
- Set your credentials:
    dvc remote modify origin --local user <--user-->
    dvc remote modify origin --local password <--access token-->
  Note: if you already ran
    dvc remote modify origin --local ask_password true
  you will need to unset it by running
    dvc remote modify origin --local --unset ask_password
- That's it! You can now pull data from your remote cache.
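On a non-interactive machine, the token-based steps above could be wrapped in a small function that reads the credentials from the environment instead of typing them into a script. This is only a sketch under stated assumptions: `configure_dagshub_auth`, `DAGSHUB_USER`, `DAGSHUB_TOKEN`, and the `DVC` override are names invented for this example, not something DAGsHub or DVC defines.

```shell
# Hypothetical helper for non-interactive machines (CI, servers).
# DAGSHUB_USER and DAGSHUB_TOKEN are variable names invented for this sketch.
# Set DVC=echo to preview the commands instead of running them.
configure_dagshub_auth() {
    remote="${1:-origin}"
    dvc_bin="${DVC:-dvc}"

    if [ -z "${DAGSHUB_USER:-}" ] || [ -z "${DAGSHUB_TOKEN:-}" ]; then
        echo "DAGSHUB_USER and DAGSHUB_TOKEN must be set" >&2
        return 1
    fi

    # --local keeps every setting out of the git-tracked .dvc/config.
    "$dvc_bin" remote modify "$remote" --local auth basic &&
    "$dvc_bin" remote modify "$remote" --local user "$DAGSHUB_USER" &&
    "$dvc_bin" remote modify "$remote" --local password "$DAGSHUB_TOKEN"
}
```

With the two variables exported by the machine's secret store, calling `configure_dagshub_auth origin` applies the same three settings as the manual steps. Previewing with `DVC=echo` prints the token, so avoid doing that where output is logged.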
Pull data¶
dvc pull -r origin
Push data¶
- First, make sure you are using DVC version 1.10 or greater.
- Then you can run:
    dvc push -r origin
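The version requirement is easy to get wrong when compared as text, since "1.9" sorts after "1.10" alphabetically. A minimal sketch of a numeric check follows; `version_at_least` is a hypothetical helper, and it assumes version strings of the form major.minor with optional trailing parts.

```shell
# Hypothetical helper: succeed if version $1 is at least major.minor $2.
# Compares numerically, so 1.9 is correctly treated as older than 1.10.
version_at_least() {
    have_major=${1%%.*}
    have_rest=${1#*.}
    have_minor=${have_rest%%.*}
    # Drop any non-numeric suffix from the minor part (e.g. "10rc1" -> "10").
    have_minor=${have_minor%%[!0-9]*}

    want_major=${2%%.*}
    want_rest=${2#*.}
    want_minor=${want_rest%%.*}

    [ "$have_major" -gt "$want_major" ] && return 0
    [ "$have_major" -lt "$want_major" ] && return 1
    [ "$have_minor" -ge "$want_minor" ]
}
```

Assuming `dvc --version` prints a bare version string, the check could guard a push like this: `version_at_least "$(dvc --version)" 1.10 && dvc push -r origin`.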