Data Types

class dagshub.data_engine.dtypes.MetadataFieldType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Backing types in the Data Engine’s database

BOOLEAN = 'BOOLEAN'

Python’s bool

INTEGER = 'INTEGER'

Python’s int

DATETIME = 'DATETIME'

Python’s datetime.datetime

FLOAT = 'FLOAT'

Python’s float

STRING = 'STRING'

Python’s str

BLOB = 'BLOB'

Python’s bytes
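
The correspondence between Python types and backing types listed above can be sketched as a small lookup. This is only an illustration: `FieldType` is a local stand-in that mirrors the documented enum values, and `backing_type_for` is a hypothetical helper, not part of the dagshub client.

```python
from datetime import datetime
from enum import Enum


class FieldType(Enum):
    """Local stand-in mirroring the documented MetadataFieldType values."""
    BOOLEAN = "BOOLEAN"
    INTEGER = "INTEGER"
    DATETIME = "DATETIME"
    FLOAT = "FLOAT"
    STRING = "STRING"
    BLOB = "BLOB"


# Order matters: bool is a subclass of int, so it has to be checked first.
_PYTHON_TO_FIELD_TYPE = [
    (bool, FieldType.BOOLEAN),
    (int, FieldType.INTEGER),
    (datetime, FieldType.DATETIME),
    (float, FieldType.FLOAT),
    (str, FieldType.STRING),
    (bytes, FieldType.BLOB),
]


def backing_type_for(value) -> FieldType:
    """Hypothetical helper: pick the backing type for a Python value."""
    for py_type, field_type in _PYTHON_TO_FIELD_TYPE:
        if isinstance(value, py_type):
            return field_type
    raise TypeError(f"No backing type for {type(value).__name__}")
```

Checking `bool` before `int` is the one subtle point: `isinstance(True, int)` is true in Python, so a naive lookup would classify booleans as integers.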

class dagshub.data_engine.dtypes.DagshubDataType

Inheritors of this ABC define custom types

Each custom type is backed by a primitive type and may also carry additional tags, which DagsHub uses to enhance the experience (for example, the annotation tag set on annotation types).

backing_field_type

primitive type in the data engine database

Type:

dagshub.data_engine.dtypes.MetadataFieldType | None

custom_tags

additional tags applied to this type

Type:

Set[str] | None
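
To make the two attributes concrete, here is a minimal sketch of what a custom type could look like. It uses local stand-ins rather than the real classes: `FieldType`, `DataTypeSketch`, and `AnnotationSketch` are hypothetical names, the tag value is made up, and whether the real library declares these as class-level attributes is an assumption.

```python
from abc import ABC
from enum import Enum
from typing import Optional, Set


class FieldType(Enum):
    """Stand-in for MetadataFieldType; only the member used here."""
    BLOB = "BLOB"


class DataTypeSketch(ABC):
    """Stand-in for DagshubDataType with the two documented attributes."""
    backing_field_type: Optional[FieldType] = None
    custom_tags: Optional[Set[str]] = None


class AnnotationSketch(DataTypeSketch):
    """Hypothetical custom type: blob-backed, with one illustrative tag."""
    backing_field_type = FieldType.BLOB
    custom_tags = {"example-annotation-tag"}  # made-up tag value


def describe(dtype: type) -> str:
    """Summarize a custom type as 'Name: BACKING [tag, ...]'."""
    tags = sorted(dtype.custom_tags or [])
    backing = dtype.backing_field_type.value if dtype.backing_field_type else "None"
    return f"{dtype.__name__}: {backing} {tags}"
```

The point of the split is that the database only needs to understand the primitive backing type, while the tags tell the client and UI how to interpret the stored value.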

class dagshub.data_engine.dtypes.Int

Basic python int

class dagshub.data_engine.dtypes.DateTime

Basic python datetime.datetime

Note

The DagsHub backend receives an integer millisecond timestamp (a UTC timestamp) and, optionally, a timezone.

Metadata of type datetime is always stored in the database as UTC time. When a query is run on this field, there are three possibilities:

  • Metadata was saved with a timezone, in which case it will be used.

  • Metadata was saved without a timezone, in which case UTC will be used.

  • with_time_zone specifies a time zone, in which case it overrides whatever is stored in the database.
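
The three cases above can be sketched with the standard library alone. This is an illustration of the described behavior, not the backend's implementation: `to_millis` and `resolve_query_time` are hypothetical helper names, and the sketch uses a whole-second time to keep the arithmetic exact.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def to_millis(t: datetime) -> int:
    """Integer millisecond UTC timestamp, i.e. int(t.timestamp() * 1000)."""
    return int(t.timestamp() * 1000)


def resolve_query_time(
    millis: int,
    saved_tz: Optional[timezone],
    with_time_zone: Optional[timezone] = None,
) -> datetime:
    """Hypothetical helper mirroring the three documented cases."""
    if with_time_zone is not None:
        tz = with_time_zone   # an explicit override wins
    elif saved_tz is not None:
        tz = saved_tz         # timezone saved with the metadata
    else:
        tz = timezone.utc     # nothing saved: fall back to UTC
    return datetime.fromtimestamp(millis / 1000, tz=tz)


ist = timezone(timedelta(hours=5, minutes=30))
t = datetime(2022, 4, 5, 15, 30, tzinfo=ist)
millis = to_millis(t)

as_saved = resolve_query_time(millis, saved_tz=ist)
as_utc = resolve_query_time(millis, saved_tz=None)
overridden = resolve_query_time(
    millis, saved_tz=ist, with_time_zone=timezone(timedelta(hours=-4))
)
```

All three results denote the same instant; only the timezone used to present it differs.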

Example:

import dateutil.parser

# `datasource`, `path`, and `name` are placeholders for your datasource,
# a datapoint path, and a metadata field name.
datapoints = datasource.all()
t = dateutil.parser.parse("2022-04-05T15:30:00.99999+05:30")

# The dagshub client sends int(t.timestamp() * 1000) to the backend,
# along with the +05:30 offset
datapoints[path][name] = t
datapoints[path].save()

# or:
# send only a millisecond timestamp, without a timezone (it will be saved as UTC)
datapoints[path][name] = int(t.timestamp() * 1000)

class dagshub.data_engine.dtypes.String

Basic python str

class dagshub.data_engine.dtypes.Blob

Basic python bytes

Note

DagsHub doesn’t return the values of blob fields by default; it returns their hashes instead. See Datapoint.get_blob() to learn how to download the blob value.

class dagshub.data_engine.dtypes.Float

Basic python float

class dagshub.data_engine.dtypes.Bool

Basic python bool

class dagshub.data_engine.dtypes.LabelStudioAnnotation

LabelStudio annotation. Backing type is blob. Has the annotation tag set.

Annotations of this type get automatically converted in the metadata into MetadataAnnotations objects that simplify adding and saving annotations.

class dagshub.data_engine.dtypes.Voxel51Annotation

Voxel51 annotation. Backing type is blob. Has the annotation tag set.

class dagshub.data_engine.dtypes.Document

Field for large text values, stored as a blob. Document fields can’t be filtered on, but they let you store arbitrarily large text, beyond the 512 characters otherwise allowed.