Repository API

class dagshub.common.api.repo.RepoAPI(repo: str, host: str | None = None, auth: Any | None = None)
__init__(repo: str, host: str | None = None, auth: Any | None = None)

Class for interacting with the API of a repository

Parameters:
  • repo – Repository name in the format user/repo

  • host (optional) – URL of the DagsHub instance the repo is on

  • auth (optional) – Authentication to use to connect
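
A minimal sketch of constructing the client, assuming the dagshub package is installed. The helpers split_repo_name and make_repo_api are hypothetical names introduced here for illustration. They only check the user/repo shape before calling the documented constructor, with auth left at its default of None.

```python
from typing import Optional, Tuple


def split_repo_name(repo: str) -> Tuple[str, str]:
    """Split 'user/repo' into (user, repo); raise ValueError on any other shape."""
    parts = repo.split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'user/repo', got {repo!r}")
    return parts[0], parts[1]


def make_repo_api(repo: str, host: Optional[str] = None):
    """Validate the name, then build a RepoAPI (hypothetical convenience wrapper)."""
    split_repo_name(repo)
    # Imported lazily so the validation helper works without dagshub installed.
    from dagshub.common.api.repo import RepoAPI

    return RepoAPI(repo, host=host)


# Usage (needs the dagshub package):
#   api = make_repo_api("user/repo")
```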

get_repo_info() -> RepoAPIResponse

Get information about the repository
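
As an illustration, the sketch below formats a one-line summary from the response. It reads only fields listed for RepoAPIResponse under Response Structures; describe_repo is a hypothetical helper, and the SimpleNamespace object is a stand-in for a real response so the snippet runs offline.

```python
from types import SimpleNamespace


def describe_repo(info) -> str:
    """Build a short summary from a RepoAPIResponse-like object."""
    visibility = "private" if info.private else "public"
    return (
        f"{info.full_name} ({visibility}), "
        f"default branch: {info.default_branch}, stars: {info.stars_count}"
    )


# Stand-in with the same attribute names as RepoAPIResponse (made-up values).
sample = SimpleNamespace(
    full_name="user/repo", private=False, default_branch="main", stars_count=3
)
summary = describe_repo(sample)

# With a real client: describe_repo(api.get_repo_info())
```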

get_branch_info(branch: str) -> BranchAPIResponse

Get information about the specified branch

Parameters:

branch – Name of the branch to get information about

get_commit_info(sha: str) -> CommitAPIResponse

Get information about a specific commit

Parameters:

sha – SHA of the commit.

get_connected_storages() -> List[StorageAPIEntry]

Get storages that are connected to the repository

list_path(path: str, revision: str | None = None, include_size: bool = False) -> List[ContentAPIEntry]

List contents of a repository directory

Parameters:
  • path – Path of the directory

  • revision – Branch or commit SHA. Defaults to the tip of the default branch.

  • include_size – Whether to include sizes for files. Calculating sizes might take more time.
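
Since list_path returns the entries of a single directory, walking a whole subtree means calling it repeatedly. The sketch below is one way to do that. It assumes that directory entries report type == "dir" and that entry.path is a repo-relative path that can be passed back to list_path; neither is stated in this reference. iter_files and FakeRepo are hypothetical names, and FakeRepo only mimics the list_path signature so the snippet runs without network access.

```python
from types import SimpleNamespace
from typing import Iterator, Optional


def iter_files(api, path: str, revision: Optional[str] = None) -> Iterator[str]:
    """Yield the paths of all files under `path`, descending into directories."""
    stack = [path]
    while stack:
        current = stack.pop()
        for entry in api.list_path(current, revision=revision):
            if entry.type == "dir":  # assumption: directories are typed "dir"
                stack.append(entry.path)
            else:
                yield entry.path


class FakeRepo:
    """Offline stand-in that mimics RepoAPI.list_path with made-up content."""

    tree = {
        "data": [
            SimpleNamespace(path="data/a.csv", type="file"),
            SimpleNamespace(path="data/raw", type="dir"),
        ],
        "data/raw": [SimpleNamespace(path="data/raw/b.csv", type="file")],
    }

    def list_path(self, path, revision=None, include_size=False):
        return self.tree[path]


files = sorted(iter_files(FakeRepo(), "data"))
```

An explicit stack is used instead of recursion so that deep directory trees do not hit Python's recursion limit.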

list_storage_path(path: str, include_size: bool = False) -> List[ContentAPIEntry]

List contents of a folder in a connected storage bucket

Parameters:

path – Path of the storage directory in the format of <scheme>/<bucket-name>/<path>

Path example: s3/my-bucket/prefix/path/to/file
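
Bucket locations are often written as URLs such as s3://my-bucket/prefix, while the path format here uses plain slashes. The hypothetical helper below converts one form to the other. It is a string-manipulation sketch only and does not check whether the scheme is one that DagsHub supports.

```python
from urllib.parse import urlparse


def to_storage_path(url: str) -> str:
    """Convert '<scheme>://<bucket>/<path>' to '<scheme>/<bucket>/<path>'."""
    parsed = urlparse(url)
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"expected '<scheme>://<bucket>/<path>', got {url!r}")
    prefix = parsed.path.strip("/")
    parts = [parsed.scheme, parsed.netloc]
    if prefix:
        parts.append(prefix)
    return "/".join(parts)


storage_path = to_storage_path("s3://my-bucket/prefix/path/to/file")

# With a real client: api.list_storage_path(to_storage_path("s3://my-bucket/prefix"))
```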

get_file(path: str, revision: str | None = None) -> bytes

Download file from repo.

Parameters:
  • path – Path of the file in the repo.

  • revision – Git branch or revision from which to download the file.

Returns:

The content of the file.

Return type:

bytes
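
Because get_file returns raw bytes, text formats need to be decoded by the caller. A minimal sketch for a JSON file follows, assuming the file is UTF-8 encoded. load_json_file and FakeRepo are hypothetical; FakeRepo stands in for a real client so the snippet runs offline.

```python
import json
from typing import Any, Optional


def load_json_file(api, path: str, revision: Optional[str] = None) -> Any:
    """Download a file with get_file and parse it as UTF-8 encoded JSON."""
    raw = api.get_file(path, revision=revision)
    return json.loads(raw.decode("utf-8"))


class FakeRepo:
    """Offline stand-in that mimics RepoAPI.get_file with made-up content."""

    def get_file(self, path, revision=None):
        return b'{"lr": 0.01, "epochs": 5}'


params = load_json_file(FakeRepo(), "params.json")
```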

get_storage_file(path: str) -> bytes

Download file from a connected storage bucket.

Parameters:

path – Path in the bucket in the format of <scheme>/<bucket-name>/<path>.

Path example: s3/my-bucket/prefix/path/to/file

Returns:

The content of the file.

Return type:

bytes

download(remote_path: str | PathLike, local_path: str | PathLike = '.', revision: str | None = None, recursive=True, keep_source_prefix=False, redownload=False, download_storages=False)

Downloads the contents of the repository at “remote_path” to the “local_path”

Parameters:
  • remote_path – Path in the repository of the folder or file to download.

  • local_path – Where to download the files. Defaults to current working directory.

  • revision – Repo revision or branch. If not specified, the default repo branch is used. Ignored when downloading from buckets.

  • recursive – Whether to download files recursively.

  • keep_source_prefix

    Whether to keep the path of the folder in the download path or not.
    Example: Given remote_path src/data and file test/file.txt
    if True: will download to <local_path>/src/data/test/file.txt
    if False: will download to <local_path>/test/file.txt

  • redownload – Whether to redownload files that already exist on the local filesystem. The downloader doesn’t do any hash comparisons and only checks if a file already exists in the local filesystem or not.

  • download_storages – When downloading the whole repo, the integrated storages are not downloaded by default. Set this to True to download them as well.
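
The keep_source_prefix rule can be restated as a small path computation. The sketch below is a model of the documented behavior, not the library's implementation, and local_target is a hypothetical helper. It reproduces the example given for remote_path src/data and file test/file.txt.

```python
from pathlib import PurePosixPath


def local_target(
    local_path: str,
    remote_path: str,
    file_in_folder: str,
    keep_source_prefix: bool = False,
) -> PurePosixPath:
    """Where a file inside `remote_path` lands under `local_path`, per the rule above."""
    base = PurePosixPath(local_path)
    if keep_source_prefix:
        # The remote folder's own path is kept in the destination.
        base = base / remote_path
    return base / file_in_folder


with_prefix = local_target("out", "src/data", "test/file.txt", keep_source_prefix=True)
without_prefix = local_target("out", "src/data", "test/file.txt")

# The corresponding call on a real client would be:
#   api.download("src/data", "out", keep_source_prefix=True)
```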

property default_branch: str

Name of the repository’s default branch

property full_name: str

Full name of the repo in <owner>/<reponame> format

last_commit(branch: str | None = None) -> CommitAPIResponse

Returns info about the last commit of a branch.

Parameters:

branch – Branch to get the last commit of. Defaults to the repository’s default branch (the default_branch property).

last_commit_sha(branch: str | None = None) -> str

Returns the SHA hash of the last commit of a branch.

Parameters:

branch – Branch to get the last commit of. Defaults to the repository’s default branch (the default_branch property).
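
One use for the SHA is pinning later reads to a fixed commit, so that results do not change if the branch moves between calls. The sketch below resolves the SHA once and passes it as the revision to list_path, which accepts a branch or commit SHA. list_pinned and FakeRepo are hypothetical; FakeRepo mimics the two method signatures with made-up values so the snippet runs offline.

```python
from typing import Optional


def list_pinned(api, path: str, branch: Optional[str] = None):
    """Resolve the branch tip once, then list `path` at exactly that commit."""
    sha = api.last_commit_sha(branch)
    return sha, api.list_path(path, revision=sha)


class FakeRepo:
    """Offline stand-in that records which revision list_path was asked for."""

    def __init__(self):
        self.seen_revision = None

    def last_commit_sha(self, branch=None):
        return "abc123"  # made-up SHA for illustration

    def list_path(self, path, revision=None, include_size=False):
        self.seen_revision = revision
        return []


fake = FakeRepo()
sha, entries = list_pinned(fake, "data", branch="main")
```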

property repo_url: str

URL of the repo on DagsHub

Format: https://dagshub.com/<user>/<repo>

Response Structures

These structures are returned as responses by some RepoAPI functions. Examples of most of them can be found by searching for the relevant functions in the API Docs.

class dagshub.common.api.responses.RepoAPIResponse(id: int, owner: 'UserAPIResponse', name: str, full_name: str, description: str, private: bool, fork: bool, parent: Optional['RepoAPIResponse'], empty: bool, mirror: bool, size: int, html_url: str, ssh_url: str | None, clone_url: str, website: str | None, stars_count: int, forks_count: int, watchers_count: int, open_issues_count: int, default_branch: str, created_at: str, updated_at: str, permissions: Dict[str, bool] | None)
class dagshub.common.api.responses.UserAPIResponse(id: int, login: str, full_name: str, avatar_url: str | None, public_email: str | None, website: str | None, company: str | None, description: str | None, username: str)
class dagshub.common.api.responses.BranchAPIResponse(name: str, commit: 'CommitAPIResponse')
class dagshub.common.api.responses.CommitAPIResponse(id: str, message: str, url: str, author: Optional['GitUser'], committer: Optional['GitUser'], added: List[str] | None, removed: List[str] | None, modified: List[str] | None, timestamp: str)
class dagshub.common.api.responses.GitUser(name: str, email: str, username: str)
class dagshub.common.api.responses.StorageAPIEntry(name: str, protocol: str, list_path: str)
class dagshub.common.api.responses.ContentAPIEntry(path: str, type: str, size: int, hash: str, versioning: str, download_url: str, content_url: str | None)
class dagshub.common.api.responses.StorageContentAPIResult(entries: List[dagshub.common.api.responses.ContentAPIEntry], next_token: str | None)