Datapoint
- class dagshub.data_engine.model.datapoint.Datapoint(datapoint_id: int, path: str, metadata: Dict[str, Any], datasource: 'Datasource')
- datapoint_id: int
ID of the datapoint in the database
- path: str
Path of the datapoint, relative to the root of the datasource
- metadata: Dict[str, Any]
Dictionary with the metadata
- datasource: Datasource
Datasource this datapoint is from
- delete_metadata(*fields: str)
Delete metadata from this datapoint.
The deleted values can still be accessed using a versioned query with its time set before the deletion.
- Parameters:
fields – fields to delete
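Example: a minimal sketch of calling delete_metadata only for fields the datapoint actually carries. The helper name drop_fields is hypothetical, and skipping absent fields is an assumption of this sketch, not documented behavior. It relies only on the metadata attribute and the delete_metadata method described here.

```python
from typing import Iterable, List


def drop_fields(datapoint, unwanted: Iterable[str]) -> List[str]:
    """Delete the unwanted fields that are present on the datapoint.

    `datapoint` is expected to be a Datapoint (or anything exposing the
    documented `metadata` dict and `delete_metadata(*fields)` method).
    Returns the names of the fields that were passed to delete_metadata.
    """
    # Hypothetical helper: keep only the fields found in the local metadata.
    present = [field for field in unwanted if field in datapoint.metadata]
    if present:
        # delete_metadata takes the field names as positional arguments.
        datapoint.delete_metadata(*present)
    return present
```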
- delete(force: bool = False)
Delete this datapoint.
This datapoint will no longer show up in queries.
This does not delete the datapoint’s file; it only removes the data from the datasource.
You can still query this datapoint and its associated metadata with versioned queries whose time is before the deletion time.
You can re-add this datapoint to the datasource by uploading new metadata to it with, for example, Datasource.metadata_context. This will create a new datapoint with a new id and new metadata records. Datasource scanning will not add this datapoint back.
- Parameters:
force – Skip the confirmation prompt
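Example: a hedged sketch of deleting several datapoints without being prompted for each one. The helper delete_matching and its predicate argument are hypothetical; the sketch assumes you already hold an iterable of Datapoint objects and uses only the path attribute and delete(force=...) documented here. Because force=True skips the confirmation prompt, it is worth checking the predicate on a small sample first.

```python
from typing import Callable, Iterable, List


def delete_matching(datapoints: Iterable, predicate: Callable) -> List[str]:
    """Delete every datapoint for which predicate(datapoint) is true.

    Returns the paths of the datapoints that were deleted.
    """
    deleted = []
    # Materialize first so deletion cannot disturb the iteration.
    for datapoint in list(datapoints):
        if predicate(datapoint):
            # force=True skips the confirmation prompt.
            datapoint.delete(force=True)
            deleted.append(datapoint.path)
    return deleted
```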
- save()
Commit changes to metadata made with one or more dictionary-style assignments.
Example:
specific_data_point['metadata_field_name'] = 42
specific_data_point.save()
- property download_url
URL that can be used to download the datapoint’s file from DagsHub
- Type:
str
- property path_in_repo
Path of the datapoint in repo
- Return type:
- get_blob(column: str, cache_on_disk=True, store_value=False) → bytes
Returns the blob stored in a binary column
- Parameters:
column – where to get the blob from
cache_on_disk – whether to store the downloaded blob on disk. If you store the blob on disk, then it won’t need to be re-downloaded in the future. The contents of datapoint[column] will change to be the path of the blob on the disk.
store_value – whether to store the blob in memory on the field attached to this datapoint, which will make its value accessible later using datapoint[column]
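Example: a minimal sketch of reading a blob and decoding it. It assumes, purely for illustration, that the binary column holds UTF-8 encoded JSON; the helper name load_json_blob is hypothetical. Only the get_blob signature documented above is used.

```python
import json
from typing import Any


def load_json_blob(datapoint, column: str) -> Any:
    """Fetch the blob in `column` and parse it as JSON.

    Assumption of this sketch: the blob is UTF-8 encoded JSON.
    """
    # cache_on_disk=True avoids re-downloading the blob later;
    # store_value=False keeps the raw bytes out of datapoint[column].
    raw = datapoint.get_blob(column, cache_on_disk=True, store_value=False)
    return json.loads(raw.decode("utf-8"))
```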
- download_file(target: PathLike | str | None = None, keep_source_prefix=True, redownload=False) → PathLike
Downloads the datapoint’s file to the location given by target.
- Parameters:
target – Where to download the file (either a directory, or the full path). If not specified, then downloads to the datasource's default location.
keep_source_prefix – If True, includes the prefix of the datasource in the download path.
redownload – Whether to redownload a file if it exists on the filesystem already.
Note
We don’t do any hashsum checks, so if it’s possible that the file has been updated, set redownload to True.
- Returns:
Path to the downloaded file
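Example: a hedged sketch of forcing a fresh download when the remote file may have changed, as the note above suggests. The helper name fetch_fresh is hypothetical; it passes only the documented target, keep_source_prefix, and redownload parameters and wraps the returned value in pathlib.Path for convenience.

```python
from pathlib import Path


def fetch_fresh(datapoint, target_dir: str) -> Path:
    """Download the datapoint's file into target_dir, replacing any local copy."""
    local = datapoint.download_file(
        target=target_dir,
        keep_source_prefix=True,  # keep the datasource prefix in the download path
        redownload=True,          # no hashsum checks are done, so always re-fetch
    )
    # download_file returns a PathLike; normalize it to a pathlib.Path.
    return Path(local)
```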