title | description |
---|---|
DagsHub Data Engine - Visualizing Data | Documentation on using Data Engine to visualize data |
Data Engine provides an easy way to visualize data points and their enrichments. By displaying data with enrichments such as annotations, predictions, and metadata - you can quickly:
This way you can make sense of the datasets you’re using to train and improve your model.
DagsHub's Dataset Viewer allows you to visualize datasources, datasets, as well as individual queries. You can select the metadata you'd like to view. It can also overlay annotations for columns in the Label Studio format!
To visualize a query while working with the Python client, run the following:
# Query datasource
query = ds["annotation"].is_null() # query of your choice
# Visualize the queried datasource
query.visualize(visualizer="dagshub")
This should return a link, which you can follow to explore the query directly within the DagsHub UI.
To visualize queries using only the DagsHub UI, see the documentation on querying and creating subsets, which covers the web query builder.
Once you explore a dataset, you may want to filter out subsets to work with further (e.g. retraining with upweighted loss). Doing so with the DagsHub Dataset Viewer is easy!
For example: in the sawit dataset project, some data is annotated using a vision model. This is generally less reliable than human annotators. I can filter them out by checking if the annotator field is equal to 'human':
Next, I can save this query as a dataset, giving it an appropriate name:
Once you do the same for your project, use the get_datasets() command or navigate to your repository and check the datasets tab to see your new dataset.
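The filter-and-save flow described above can also be sketched with the Python client. This is a minimal sketch, not the definitive workflow: it assumes `ds` is a Data Engine datasource already loaded in your session and that the metadata field is named `annotator`, as in the example. The helper name `save_human_subset` and the dataset name `human-annotated` are hypothetical and chosen only for illustration.

```python
def save_human_subset(ds, dataset_name="human-annotated"):
    """Filter a datasource to human-annotated datapoints and save the result.

    Assumptions: `ds` is a Data Engine datasource and `annotator` is the
    metadata field used in the example above. The function name and the
    default dataset name are hypothetical.
    """
    # Keep only datapoints whose annotator field equals "human"
    query = ds["annotator"] == "human"
    # Save the query as a named dataset in the repository
    query.save_dataset(dataset_name)
    return query
```

Calling `save_human_subset(ds)` on your own datasource should make the new dataset appear in the repository's datasets tab.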
Metadata can be edited either through the UI or the Python client. To start, select the datapoint you'd like to update:
Similarly, you can use the UI to edit and add individual fields.
To batch update metadata for a large set of datapoints, it is recommended to use the client API.
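As a rough illustration of a batch update through the client, the sketch below collects many updates and applies them in a single metadata context. It assumes `ds` is a Data Engine datasource that exposes `metadata_context()` and that the context offers `update_metadata(path, fields)`. The helper name `batch_update_metadata` and the example paths are hypothetical.

```python
def batch_update_metadata(ds, updates):
    """Apply many metadata updates in one batch.

    `updates` maps a datapoint path to a dict of metadata fields.
    Assumption: `ds.metadata_context()` yields a context object whose
    `update_metadata(path, fields)` queues an update that is uploaded
    when the `with` block exits. The function name is hypothetical.
    """
    with ds.metadata_context() as ctx:
        for path, fields in updates.items():
            # Queue one update per datapoint path
            ctx.update_metadata(path, fields)
    # Return the number of datapoints that were updated
    return len(updates)
```

With a real datasource, a call could look like `batch_update_metadata(ds, {"images/001.jpg": {"annotator": "human"}})`, where the path is a placeholder for a datapoint in your own datasource.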
You can also use DagsHub to annotate your datasets or to convert them to a dataloader for training or evaluation.