
Create a Project on DagsHub

This part of the Get Started section focuses on the configuration process when creating a project on DagsHub. We will cover how to create a DagsHub repository, connect it to your local computer, configure DVC, and set DagsHub storage as remote storage.
You don't need to configure anything beforehand to start the project from this point.

!!! illustration "Video for this tutorial"
    Prefer to follow along with a video instead of reading? Check out the video for this section below:

<center>
<iframe width="400" height="225" src="https://www.youtube.com/embed/ECbVxGqS0f0" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</center>

Sign Up

  • To create a new user on DagsHub, we will use the DagsHub Signup page.
  • We recommend signing up with your GitHub account, but you can also use your good old email. If you sign up with GitHub, you will be redirected to set your DagsHub password.

!!! Important
    You will use your DagsHub password frequently, so please choose one that you will remember.

Create a DagsHub Repository

  • Now, we would like to create a new repository on DagsHub. Click on the 'Create' button and choose the 'New Repository' option.
![create-repository](assets/0-create-repository.png)
  • You'll be redirected to the repository settings dialog.
  • Fill in the name of the repository as 'hello-world' and add Python to the .gitignore file selector. Then click the 'Create Repository' button at the bottom.
[![new-repository-settings](assets/1-new-repo-settings.png){: style="height:60%;width:60%"}](assets/1-new-repo-settings.png){target=_blank}
  • Congratulations - you created your first DagsHub repository!

Clone the Repository

Now, we'll clone the Git remote, which is stored on DagsHub, to our local computer.

  • Go to the repository page, click on the remote button and copy the Git remote link.
![create-repository](assets/2-git-remote.png)
  • From your CLI, change the directory to where you wish to clone the repository and git-clone it using the copied link.
=== "Mac, Linux, Windows"
    ```bash
    cd path/to/folder
    git clone https://dagshub.com/<DagsHub-user-name>/hello-world.git .
    ```

We recommend you create and activate a virtual environment before moving forward.

??? info "Recommended: Create and Activate a Virtual Environment"
    - Make sure you're in the project directory when following this.
    - If you're using Python 2, replace venv with virtualenv in the commands below.
    - The name of the virtual environment is for you to choose. The convention is 'env' or 'venv'.
    - We will add the virtual environment name to the .gitignore file, so Git will not track it.

    === "Mac, Linux"
        ```bash
        python3 -m venv <virtual-environment-name>
        echo <virtual-environment-name> >> .gitignore
        source <virtual-environment-name>/bin/activate
        ```
    === "Windows"
        ```shell
        py -m venv <virtual-environment-name>
        echo <virtual-environment-name> >> .gitignore
        <virtual-environment-name>\Scripts\activate.bat
        ```

    - **<u>Note</u>**: *To verify that the virtual environment is active, check that its name appears in parentheses on the left of your prompt.*
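Besides the prompt prefix, the activation script on Mac and Linux also exports the `VIRTUAL_ENV` variable, which gives a second way to check. The following is a minimal sketch for bash-like shells; the function name `venv_status` is hypothetical and chosen only for illustration.

```shell
# Hypothetical helper: report whether a Python virtual environment is active.
# The activate script exports VIRTUAL_ENV; the variable is unset otherwise.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: ${VIRTUAL_ENV}"
    else
        echo "inactive"
    fi
}

# Usage:
#   venv_status
```

If the function prints `inactive`, run the `source` command from the tab above again in the same terminal session.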

Set Up DVC

To use DVC, we have to initialize and configure it in our local repository. DagsHub makes this easy: you only need to run the following six commands.

  • We will start by installing DVC in the virtual environment and initializing it.

    === "Mac, Linux, Windows"
        ```bash
        pip install dvc
        dvc init
        ```

Configure DagsHub as DVC Remote Storage

In order to host the data & models alongside our code, we need to create a DVC storage remote. What this usually means is signing up for a cloud account, creating a storage bucket, configuring permissions, etc. This process can be a hassle, even if you are familiar with it. To save you the trouble, we created a free, zero-configuration DVC remote called DagsHub Storage!

When you create a DagsHub project, it is automatically configured with its own DagsHub Storage remote. To configure it locally, all you need to do is copy and paste four commands from your DagsHub repository to your CLI.

  • Copy the commands from the DagsHub repository to your CLI

    === "Mac, Linux, Windows"
        ```bash
        dvc remote add origin https://dagshub.com/<DagsHub-user-name>/hello-world.dvc
        dvc remote modify origin --local auth basic
        dvc remote modify origin --local user <DagsHub-user-name>
        dvc remote modify origin --local password <Token>
        ```

    ![dvc-commands](assets/3-copy-dvc-commands.png)
  • For more information about DagsHub storage, visit the reference page.

  • If you still want to set up your own cloud remote storage, please refer to our setup external remote storage page.
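Because the last three commands use `--local`, DVC writes the user name and token to `.dvc/config.local` rather than `.dvc/config`, and `dvc init` lists that file in `.dvc/.gitignore`. As a minimal sketch for Mac and Linux shells, the helper below asks Git whether the file is ignored, so you can confirm the token will not be committed. The name `credentials_are_ignored` is hypothetical and is not part of Git or DVC.

```shell
# Hypothetical helper, not part of Git or DVC.
# Succeeds (exit code 0) only if Git ignores the given path.
# Defaults to .dvc/config.local, where `--local` settings are stored.
credentials_are_ignored() {
    git check-ignore -q "${1:-.dvc/config.local}"
}

# Usage, from the project root:
#   credentials_are_ignored && echo "credentials file is ignored by Git"
```

If the check fails, avoid running `git add .dvc` until `.dvc/.gitignore` contains a `/config.local` entry.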

??? checkpoint "Checkpoint"

Check that the current DVC configuration matches the following:

=== "Mac, Linux"
    ```bash
    cat .dvc/config.local
        ['remote "origin"']
            url = https://dagshub.com/<DagsHub-user-name>/hello-world.dvc
            auth = basic
            user = <DagsHub-user-name>
            ask_password = true
    ```
=== "Windows"
    ```bash
    type .dvc\config.local
        ['remote "origin"']
            url = https://dagshub.com/<DagsHub-user-name>/hello-world.dvc
            auth = basic
            user = <DagsHub-user-name>
            ask_password = true
    ```
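The checkpoint can also be scripted. The sketch below is a hedged example for Mac and Linux shells: `check_dvc_remote` is a hypothetical helper, not a DVC command. It reads `.dvc/config` and `.dvc/config.local` together, on the assumption that DVC may store the remote URL in the first file and the `--local` settings in the second, and it only looks for the `origin` remote, a DagsHub URL ending in `.dvc`, and `auth = basic`.

```shell
# Hypothetical helper, not part of DVC.
# Checks the DVC config files for the "origin" remote settings from the checkpoint.
check_dvc_remote() {
    # Default to both config files when no arguments are given.
    [ "$#" -gt 0 ] || set -- .dvc/config .dvc/config.local
    # Merge whatever files exist; missing files are skipped silently.
    merged=$(cat "$@" 2>/dev/null || true)
    printf '%s\n' "$merged" | grep -q 'remote "origin"' \
        || { echo "no origin remote"; return 1; }
    printf '%s\n' "$merged" | grep -Eq 'url = https://dagshub\.com/.+\.dvc' \
        || { echo "unexpected url"; return 1; }
    printf '%s\n' "$merged" | grep -q 'auth = basic' \
        || { echo "auth is not basic"; return 1; }
    echo "ok"
}

# Usage, from the project root:
#   check_dvc_remote
```

The function prints `ok` when all three settings are found and a short reason otherwise. It does not validate the user name or token.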

Version and push DVC Configurations

We've initialized and configured DVC in our local directory. These actions created and updated the .dvc directory and the .dvcignore file. These are configuration files for our project and should be tracked with Git. Rule of thumb: Git will track every file that ends with '.dvc'.

  • Check the local repository status

    === "Mac, Linux, Windows"
        ```bash
        git status -s
            A .dvc/.gitignore
            A .dvc/config
            A .dvc/plots/confusion.json
            A .dvc/plots/confusion_normalized.json
            A .dvc/plots/default.json
            A .dvc/plots/linear.json
            A .dvc/plots/scatter.json
            A .dvc/plots/smooth.json
            A .dvcignore
            M .gitignore
        ```

  • Add, commit, and push the new and modified files with Git

    === "Mac, Linux, Windows"
        ```bash
        git add .dvc .dvcignore .gitignore
        git commit -m "Initialize DVC"
        git push
        ```

  • Check the new status of the DagsHub repository

![](assets/4-repo-stat-after-push.png)

So far, we've created our very first DagsHub project, cloned it to our local computer, and configured our Git and DVC remotes. In the next parts, we will learn how to:
