Create a Project on DagsHub

This part of the Get Started section focuses on the configuration process when creating a project on DagsHub. We will cover how to create a DagsHub repository, connect it to your local computer, configure DVC, and set DagsHub storage as remote storage.
No prior setup is required to start from this point.

Sign Up

  • To create a new user on DagsHub, we will use the DagsHub Signup page.
  • We recommend signing up with your GitHub account, but you can also use your good old email. If you sign up with GitHub, you will be redirected to set your DagsHub password.

Important

You will use your DagsHub password frequently, so choose one that you will remember.

Create a DagsHub Repository

  • Now, we would like to create a new repository on DagsHub. Click on the 'Create' button and choose the 'New Repository' option.
  • You'll be redirected to the repository settings dialog.
  • Enter 'hello-world' as the repository name and select Python in the .gitignore selector. Then click the 'Create Repository' button at the bottom.

  • Congratulations - you created your first DagsHub repository!

Clone the Repository

Now, we'll clone the Git remote, which is stored on DagsHub, to our local computer.

  • Go to the repository page, click on the remote button and copy the Git remote link.
  • From your CLI, change the directory to where you wish to clone the repository and git-clone it using the copied link.

    cd path/to/folder
    git clone https://dagshub.com/<DagsHub-user-name>/hello-world.git .
    
Recommended: Create and Activate a Virtual Environment

Linux/Mac:

    python3 -m venv <virtual-environment-name>
    echo <virtual-environment-name> >> .gitignore
    source <virtual-environment-name>/bin/activate

Windows:

    py -m venv <virtual-environment-name>
    echo <virtual-environment-name> >> .gitignore
    <virtual-environment-name>\Scripts\activate.bat

  • Note: To verify that the virtual environment is active, check that its name appears in parentheses at the left of your prompt.
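
The prompt check can also be done from the command line. The sketch below assumes a Bash-like shell and relies on the VIRTUAL_ENV environment variable, which the standard venv activate script sets. The function name venv_status is our own and is not part of any tool.

```shell
# Report whether a Python virtual environment is active in this shell.
# venv_status is a hypothetical helper name; VIRTUAL_ENV is exported by
# the venv activate script and is empty or unset otherwise.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: ${VIRTUAL_ENV}"
    else
        echo "no virtual environment active"
    fi
}

venv_status
```

If the second message is printed, run the activate command again in the same terminal session.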

Set Up DVC

To use DVC, we have to initialize and configure it in our local repository. DagsHub keeps this process short: it takes only the following six commands.

  • We will start by installing DVC in the virtual environment and initializing it.

    pip install dvc
    dvc init
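
As a quick sanity check after these two commands, you can confirm that the .dvc directory exists. This is a minimal sketch for a Bash-like shell; dvc_initialized is a hypothetical helper name, and it only checks for the directory, not for a valid DVC setup.

```shell
# dvc_initialized is a hypothetical helper: it checks only for the .dvc
# directory that `dvc init` creates in the repository root.
dvc_initialized() {
    repo_dir="${1:-.}"
    if [ -d "${repo_dir}/.dvc" ]; then
        echo "DVC is initialized in ${repo_dir}"
    else
        echo "DVC is not initialized in ${repo_dir}"
        return 1
    fi
}

# Check the current directory; `|| true` keeps the shell going either way.
dvc_initialized . || true
```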
    

Configure DagsHub as DVC Remote Storage

In order to host the data & models alongside our code, we need to create a DVC storage remote. What this usually means is signing up for a cloud account, creating a storage bucket, configuring permissions, etc. This process can be a hassle, even if you are familiar with it. To save you the trouble, we created a free, zero-configuration DVC remote called DagsHub Storage!

When you create a DagsHub project, it is automatically configured with its own DagsHub Storage remote. To configure it locally, all you need to do is copy and paste four commands from your DagsHub repository to your CLI.

  • Copy the commands from the DagsHub repository to your CLI

    dvc remote add origin https://dagshub.com/<DagsHub-user-name>/hello-world.dvc
    dvc remote modify origin --local auth basic
    dvc remote modify origin --local user <DagsHub-user-name>
    dvc remote modify origin --local password <Token>
    

  • For more information about DagsHub storage, visit the reference page.

  • If you still want to set up your own cloud remote storage, please refer to our setup external remote storage page.
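
If you prefer to type the four commands rather than copy them, the sketch below prints them for a given user and repository so you can review them before running anything. It assumes a Bash-like shell; print_dagshub_remote_commands is a hypothetical helper name, and the token is deliberately left as a placeholder. The --local flag writes to .dvc/config.local, which DVC keeps out of Git, so your credentials are not committed.

```shell
# print_dagshub_remote_commands is a hypothetical helper. It only PRINTS
# the four commands from this section; it does not run DVC.
print_dagshub_remote_commands() {
    user="$1"
    repo="$2"
    echo "dvc remote add origin https://dagshub.com/${user}/${repo}.dvc"
    echo "dvc remote modify origin --local auth basic"
    echo "dvc remote modify origin --local user ${user}"
    # The token stays a placeholder; paste your own when you run the command.
    echo "dvc remote modify origin --local password <Token>"
}

print_dagshub_remote_commands "<DagsHub-user-name>" hello-world
```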
Checkpoint

Check that the local DVC configuration matches the following (on Windows, use type instead of cat). Only the settings added with --local appear in this file; the remote URL is stored in .dvc/config.

    cat .dvc/config.local
    ['remote "origin"']
        auth = basic
        user = <DagsHub-user-name>
        password = <Token>
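
The checkpoint can also be scripted. The following is a minimal sketch for a Bash-like shell; check_dvc_local_config is a hypothetical helper name. It looks only for the auth and user settings and does not validate the credentials themselves.

```shell
# check_dvc_local_config is a hypothetical helper: it greps the local DVC
# config for the settings written by `dvc remote modify --local`.
check_dvc_local_config() {
    config_file="${1:-.dvc/config.local}"
    if [ ! -f "${config_file}" ]; then
        echo "missing: ${config_file}"
        return 1
    fi
    missing=0
    for key in 'auth = basic' 'user = '; do
        # -F matches the key as a fixed string, not a regular expression.
        if ! grep -q -F -- "${key}" "${config_file}"; then
            echo "not found: ${key}"
            missing=1
        fi
    done
    [ "${missing}" -eq 0 ] && echo "local DVC config looks complete"
    return "${missing}"
}

check_dvc_local_config .dvc/config.local || true
```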

Version and push DVC Configurations

We've initialized and configured DVC in our local directory. These actions created and updated the .dvc directory and the .dvcignore file. These are configuration files for our project and should be tracked with Git.
Rule of thumb: Git will track every file that ends with '.dvc'.

  • Check the local repository status

    git status -s 
      A  .dvc/.gitignore
      A  .dvc/config
      A  .dvc/plots/confusion.json
      A  .dvc/plots/confusion_normalized.json
      A  .dvc/plots/default.json
      A  .dvc/plots/linear.json
      A  .dvc/plots/scatter.json
      A  .dvc/plots/smooth.json
      A  .dvcignore
      M .gitignore
    
  • Add, commit, and push the new and modified files with Git

    git add .dvc .dvcignore .gitignore
    git commit -m "Initialize DVC"
    git push
    
  • Check the new status of the DagsHub repository
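
Before checking the repository page, you can confirm locally that nothing is left uncommitted. The sketch below assumes a Bash-like shell with Git installed; dvc_files_committed is a hypothetical helper name. It checks only for uncommitted changes to the three paths added above and does not verify that the push reached DagsHub.

```shell
# dvc_files_committed is a hypothetical helper. `git status --porcelain`
# prints nothing for paths that have no pending changes.
dvc_files_committed() {
    repo_dir="${1:-.}"
    if ! pending="$(git -C "${repo_dir}" status --porcelain -- .dvc .dvcignore .gitignore 2>/dev/null)"; then
        echo "not a Git repository: ${repo_dir}"
        return 2
    fi
    if [ -z "${pending}" ]; then
        echo "DVC configuration is committed"
    else
        echo "uncommitted changes:"
        echo "${pending}"
        return 1
    fi
}

dvc_files_committed . || true
```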

So far, we've created our very first DagsHub project, cloned it to our local computer, and configured our Git and DVC remotes. In the next parts, we will learn how to: