File Uploading
- dagshub.upload_files(repo, local_path, commit_message='Upload files using DagsHub client', remote_path=None, bucket=False, **kwargs)
Upload file(s) into a repository.
- Parameters:
repo (str) – Repo name in the form of <username>/<reponame>.
local_path (Union[str, PathLike]) – File or directory to be uploaded.
commit_message (optional) – Specify a commit message.
remote_path (Optional[str]) – Specify the path to upload the file to. Defaults to the relative component of local_path to CWD.
bucket (bool) – Upload the file(s) to the DagsHub Storage bucket.
For kwarg docs look at Repo.upload().
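A minimal sketch of calling upload_files(), assuming a hypothetical repository alice/my-project and a local data/ directory. The split_repo_name and upload_path helpers are illustrative and not part of the client; dagshub is imported inside the function so the validation helper works without the package installed.

```python
from typing import Optional, Tuple


def split_repo_name(repo: str) -> Tuple[str, str]:
    """Check that `repo` looks like "<username>/<reponame>" and split it."""
    owner, sep, name = repo.partition("/")
    if not sep or not owner or not name or "/" in name:
        raise ValueError(f"expected '<username>/<reponame>', got {repo!r}")
    return owner, name


def upload_path(repo: str, local_path: str, message: str,
                remote_path: Optional[str] = None) -> None:
    """Validate the repo name, then hand off to dagshub.upload_files()."""
    import dagshub  # imported lazily; requires the dagshub package

    split_repo_name(repo)
    dagshub.upload_files(repo, local_path,
                         commit_message=message, remote_path=remote_path)


# Hypothetical usage (needs network access and DagsHub credentials):
# upload_path("alice/my-project", "data/", "Add raw data")
```

Validating the name up front is a design choice of this sketch: it fails before any network request is made.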
- dagshub.upload.create_repo(repo_name, org_name='', description='', private=False, auto_init=False, gitignores='Python', license='', readme='', template='custom', host='')
Creates a repository on DagsHub for the current user or an organization passed as an argument.
- Parameters:
repo_name (str) – Name of the repository to be created.
org_name (optional) – Organization that will own the repo. Alternative to creating a repository owned by you.
description (str) – Repository description.
private (bool) – Set to True to make repository private.
auto_init (bool) – Set to True to create an initial commit with README, .gitignore and LICENSE.
gitignores (str) – Which gitignore template(s) to use in a comma separated string.
license (str) – Which license file to use.
readme (str) – Readme template to initialize with.
template (str) – Which project template to use, options are:
"none" - creates an empty repo
"custom" - creates a repo with your specified gitignores, license and readme
"notebook-template"
"cookiecutter-mlops"
"cookiecutter-dagshub-dvc"
By default, creates an empty repo if none of gitignores, license or readme were provided. Otherwise, the template is "custom".
host (str) – URL of the DagsHub instance to host the repo on.
Note
To learn more about the templates, visit https://dagshub.com/docs/feature_guide/project_templates/
- Returns:
Repo object of the repository created.
- Return type:
Repo
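The template options can be checked before a repository is created. The sketch below assumes the template names listed above are the complete set; repo_options and create_private_repo are hypothetical helpers, not part of the client.

```python
from typing import Any, Dict

# Template names as listed in the parameter documentation.
KNOWN_TEMPLATES = {
    "none",
    "custom",
    "notebook-template",
    "cookiecutter-mlops",
    "cookiecutter-dagshub-dvc",
}


def repo_options(template: str = "custom", private: bool = False,
                 description: str = "") -> Dict[str, Any]:
    """Build keyword arguments for create_repo(), rejecting unknown templates."""
    if template not in KNOWN_TEMPLATES:
        raise ValueError(f"unknown template {template!r}")
    return {"template": template, "private": private, "description": description}


def create_private_repo(repo_name: str, template: str = "custom",
                        description: str = ""):
    """Create a private repository and return the resulting Repo object."""
    from dagshub.upload import create_repo  # requires the dagshub package

    options = repo_options(template, private=True, description=description)
    return create_repo(repo_name, **options)


# Hypothetical usage (needs network access and DagsHub credentials):
# repo = create_private_repo("my-project", template="notebook-template")
```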
- dagshub.upload.create_dataset(repo_name, local_path, glob_exclude='', org_name='', private=False)
Create a new repository on DagsHub and upload an entire folder dataset to it.
- Parameters:
repo_name (str) – Name of the repository to be created.
local_path (str) – Local path where the dataset to upload is located.
glob_exclude (str) – Glob pattern to exclude certain files from being uploaded.
org_name (optional) – Organization that will own the repo. Alternative to creating a repository owned by you.
private – Set to True to make the repository private.
- Returns:
Repo object of the repository created.
- Return type:
Repo
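Before uploading a whole folder it can help to preview what glob_exclude would leave out. The sketch below assumes shell-style matching against file names via Python's fnmatch; the client's exact matching rule may differ, so treat the preview as an approximation. preview_upload and create_dataset_from are hypothetical helpers.

```python
import fnmatch
import os
from typing import List


def preview_upload(local_path: str, glob_exclude: str = "") -> List[str]:
    """List files under local_path whose names do NOT match glob_exclude."""
    kept = []
    for root, _dirs, files in os.walk(local_path):
        for filename in files:
            # Assumption: the pattern is matched against the bare file name.
            if glob_exclude and fnmatch.fnmatch(filename, glob_exclude):
                continue
            full_path = os.path.join(root, filename)
            kept.append(os.path.relpath(full_path, local_path))
    return sorted(kept)


def create_dataset_from(repo_name: str, local_path: str, glob_exclude: str = ""):
    """Create the repository and upload the folder, returning the Repo object."""
    from dagshub.upload import create_dataset  # requires the dagshub package

    return create_dataset(repo_name, local_path, glob_exclude=glob_exclude)
```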
- class dagshub.upload.Repo(owner, name, username=None, password=None, token=None, branch=None)
- __init__(owner, name, username=None, password=None, token=None, branch=None)
Class that can be used to upload files into a repository on DagsHub.
Warning
This class is not thread safe. Uploading files in parallel can lead to unexpected outcomes.
- Parameters:
owner (str) – User or org that owns the repository.
name (str) – Name of the repository.
token (optional) – Token to use for authentication. If unset, uses the cached token or goes through OAuth.
username (Optional[str]) – Username to log in with (alternative to token).
password (Optional[str]) – Password to log in with (alternative to token).
branch (Optional[str]) – Branch to upload files to.
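Because the class is not thread safe, one way to share a single Repo between threads is to serialize calls with a lock. This is a sketch under that assumption, not a guarantee from the library; SerializedUploader and make_uploader are hypothetical names, and the owner and repository passed in are up to the caller.

```python
import threading
from typing import Any, Optional


class SerializedUploader:
    """Wrap an object with an `upload` method so calls run one at a time."""

    def __init__(self, repo: Any) -> None:
        self._repo = repo
        self._lock = threading.Lock()

    def upload(self, *args: Any, **kwargs: Any) -> Any:
        # Only one thread at a time may enter the underlying upload call.
        with self._lock:
            return self._repo.upload(*args, **kwargs)


def make_uploader(owner: str, name: str,
                  branch: Optional[str] = None) -> SerializedUploader:
    """Build a lock-protected wrapper around dagshub.upload.Repo."""
    from dagshub.upload import Repo  # requires the dagshub package

    return SerializedUploader(Repo(owner, name, branch=branch))
```

The lock only protects threads inside one process; separate processes uploading to the same branch are outside the scope of this sketch.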
- upload(local_path, commit_message='Upload files using DagsHub client', remote_path=None, bucket=False, **kwargs)
Upload a file or a directory to the repo.
- Parameters:
local_path (Union[str, PathLike]) – Path to file or directory to be uploaded.
commit_message – Specify a commit message.
remote_path (Optional[str]) – Specify the path to upload the file/dir to. If unspecified, sets the value to the relative component of local_path to CWD. If local_path is not relative to CWD, remote_path is the last component of the local_path.
bucket (bool) – Upload to the DagsHub Storage bucket (S3-compatible) without versioning. If this is set to True, commit_message will be ignored.
The kwargs are the parameters of upload_files().
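The default for remote_path described above can be illustrated with a small pure function. This is a sketch of the documented rule only, not the client's actual code; default_remote_path is a hypothetical helper and the paths are made up. It uses PurePosixPath so the result does not depend on the local filesystem.

```python
from pathlib import PurePosixPath


def default_remote_path(local_path: str, cwd: str) -> str:
    """Return the remote path the documented rule would pick for local_path."""
    base = PurePosixPath(cwd)
    local = PurePosixPath(local_path)
    if not local.is_absolute():
        # Relative paths are interpreted against the working directory.
        local = base / local
    try:
        # Inside CWD: keep the component relative to CWD.
        return local.relative_to(base).as_posix()
    except ValueError:
        # Outside CWD: fall back to the last path component.
        return local.name


# Hypothetical usage with a Repo instance (needs the dagshub package):
# repo.upload("data/train.csv")   # remote path would be "data/train.csv"
```

Passing remote_path explicitly avoids relying on the working directory at all, which may be preferable in scripts that change directories.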
- upload_files(files, directory_path='', commit_message='Upload files using DagsHub client', versioning='auto', new_branch=None, last_commit=None, force=False, quiet=False)
Upload a list of binary files to the specified directory. This function is lower level than upload(), but useful when you don’t have the files stored on the filesystem.
- Parameters:
files (List[Tuple[str, BinaryIO]]) – List of tuples of (path in repo, BinaryIO) of files to upload.
directory_path (str) – Directory in repo relative to which to upload files.
commit_message (Optional[str]) – Commit message.
versioning (str) – Which versioning system to use to upload a file. Possible options: "git", "dvc", "auto" (default, best effort guess).
new_branch (Optional[str]) – Create a new branch with this name.
last_commit (Optional[str]) – Consistency argument - last revision of the files you want to commit on top of. Exists to prevent accidental overwrites of data.
force (bool) – Force the upload of a file even if it is already present on the server. Sets last_commit to be the tip of the branch.
quiet (bool) – Don’t show messages about starting/successfully completing an upload. Set to True when uploading a directory.
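Since upload_files() takes (path in repo, BinaryIO) tuples, in-memory data can be wrapped in io.BytesIO. The sketch below assumes repo is an already constructed dagshub.upload.Repo; to_upload_entries and upload_in_memory are hypothetical helpers, and the choice of versioning="git" is only an example of one documented option.

```python
import io
from typing import Any, BinaryIO, Dict, List, Tuple


def to_upload_entries(blobs: Dict[str, bytes]) -> List[Tuple[str, BinaryIO]]:
    """Turn {path in repo: raw bytes} into the tuple list upload_files() expects."""
    return [(path, io.BytesIO(data)) for path, data in blobs.items()]


def upload_in_memory(repo: Any, blobs: Dict[str, bytes],
                     directory_path: str = "",
                     message: str = "Upload generated files") -> None:
    """Upload in-memory files in one commit; `repo` is a dagshub.upload.Repo."""
    repo.upload_files(
        to_upload_entries(blobs),
        directory_path=directory_path,
        commit_message=message,
        versioning="git",  # documented options: "git", "dvc", "auto"
    )


# Hypothetical usage:
# upload_in_memory(repo, {"metrics.json": b'{"accuracy": 0.9}'}, "reports")
```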
- directory(path) → DataSet
Create a DataSet object that allows you to stage multiple files before pushing them all to DagsHub in a single commit with commit().
- Parameters:
path (str) – The path of the directory in the repository relative to which the files will be uploaded.
- Return type:
DataSet
- upload_files_to_bucket(local_path, remote_path, max_workers=8, **kwargs)
Upload a file or directory to an S3 bucket, preserving the directory structure.
- Parameters:
local_path (Path) – Path to the local directory or file to upload.
remote_path (str) – The directory path within the S3 bucket.
max_workers (int) – The maximum number of threads to use.
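A short sketch of a bucket upload, assuming repo is an existing dagshub.upload.Repo. The upload_dir_to_bucket helper and its pre-checks are illustrative additions; only the final call uses the documented method.

```python
from pathlib import Path
from typing import Any


def upload_dir_to_bucket(repo: Any, local_dir: str, remote_dir: str,
                         max_workers: int = 8) -> None:
    """Check the inputs, then upload the path to the repo's storage bucket."""
    local = Path(local_dir)
    if not local.exists():
        raise FileNotFoundError(f"nothing to upload at {local}")
    if max_workers < 1:
        raise ValueError("max_workers must be at least 1")
    # `repo` is expected to be a dagshub.upload.Repo instance.
    repo.upload_files_to_bucket(local, remote_dir, max_workers=max_workers)


# Hypothetical usage:
# upload_dir_to_bucket(repo, "data/raw", "datasets/raw", max_workers=4)
```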
- class dagshub.upload.wrapper.DataSet(repo, directory)
Not to be confused with DataEngine’s datasets. This class represents a folder with files that are going to be uploaded to a repo.
- add(file, path=None)
Add a file to upload. The file will not be uploaded unless you call commit().
- Parameters:
file (Union[str, BinaryIO]) – Path to the file on the filesystem OR the contents of the file.
path (Union[str, Path, None]) – Where to store the file in the repo.
- add_dir(local_path, glob_exclude='', commit_message=None, **upload_kwargs)
Add and upload an entire directory to the DagsHub repository.
By default, this uploads a dvc folder.
- Parameters:
local_path (str) – Local path of the directory to upload.
glob_exclude – Glob pattern to exclude some files from being uploaded.
commit_message – Message of the commit with the upload.
The keyword arguments are passed to Repo.upload_files().
- commit(commit_message='Upload files using DagsHub client', *args, **kwargs)
Commit files added with add() to the repo.
- Parameters:
commit_message – Message of the commit with the upload.
Other positional and keyword arguments are passed to Repo.upload_files().
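The staging workflow (directory(), then add(), then commit()) can be sketched as below, assuming repo is an existing dagshub.upload.Repo. stage_and_commit is a hypothetical helper; it passes file contents as io.BytesIO objects together with an explicit path for each file.

```python
import io
from typing import Any, Dict


def stage_and_commit(repo: Any, directory: str, files: Dict[str, bytes],
                     message: str) -> int:
    """Stage in-memory files under `directory` and push them in one commit."""
    dataset = repo.directory(directory)  # returns a DataSet
    for path, data in files.items():
        # Nothing is uploaded yet; add() only stages the file.
        dataset.add(file=io.BytesIO(data), path=path)
    dataset.commit(message)  # a single commit for all staged files
    return len(files)


# Hypothetical usage:
# stage_and_commit(repo, "data", {"a.txt": b"hello"}, "Add a.txt")
```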