
Visualizing your data

Data Engine provides an easy way to visualize data points and their enrichments. By displaying data with enrichments such as annotations, predictions, and metadata, you can quickly:

  • See your data points and better understand your datasets
  • Discover use cases in which your model is underperforming
  • Create a new dataset from a subset of your data using visual filters
  • Send your data for annotation or re-annotation

This way you can make sense of the datasets you’re using to train and improve your model.

Visualize data locally

Data Engine’s local visualization is currently available as an integration with the open-source Voxel51 tool. To visualize your datasets, follow these steps:

  1. First, install the Voxel51 package:

    pip install fiftyone
    

  2. To visualize a data source, a dataset, or query results, use the .visualize() function:

    # Query datasource
    query = ds["annotation"].is_null()
    
    # Visualize the queried datasource
    query.visualize()
    

    This function will open a visualization instance on your local machine.

    Visualization Local Instance

    Behind the scenes, Data Engine checks which files need to be available for visualization, automatically creates a Voxel51-compatible dataset, and launches a new local Voxel51 instance with the built-in DagsHub integration.
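The two steps above can be combined into one small script. The following is a minimal sketch, not the official recipe: the `load_datasource` and `visualize_unannotated` helpers are hypothetical names introduced here, the `dagshub.data_engine.datasources.get_datasource` call is assumed to be available from the installed `dagshub` client, and the repository and datasource names are placeholders.

```python
def load_datasource(repo, name):
    """Fetch a Data Engine datasource by repository and name.

    Assumption: the dagshub client is installed (pip install dagshub) and
    exposes datasources.get_datasource; this is not stated in the text above.
    """
    from dagshub.data_engine import datasources  # imported lazily on purpose

    return datasources.get_datasource(repo=repo, name=name)


def visualize_unannotated(ds, field="annotation"):
    """Open the local viewer on data points whose `field` is empty."""
    query = ds[field].is_null()  # same filter as in step 2
    query.visualize()            # opens the local Voxel51 instance
    return query


# Example usage (placeholders, replace with your own values):
# ds = load_datasource("<user>/<repo>", "<datasource-name>")
# visualize_unannotated(ds)
```

Keeping the query in its own function makes it easy to swap the `annotation` field for any other enrichment column you want to inspect.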

All Voxel51 capabilities, such as filtering, sorting, and tagging, are available out of the box with the Data Engine visualize command. Data Engine visualizations also add a few capabilities that a regular Voxel51 instance does not have:

Save query results as a dataset

After filtering your dataset through the Voxel51 UI, you can save your filtered results as a new Data Engine dataset. To do that, simply navigate to the DagsHub tab by clicking on the DagsHub icon:

DagsHub Tab Navigation

Click on the ‘Save dataset’ button:

Save query as dataset

Provide a name for your dataset and select the checkbox to keep the filters applied in the visualization instance (the ones in the left panel):

Name dataset

To see your new dataset, use the get_datasets() command, or navigate to your repository and open the Datasets tab.
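As a rough illustration of the get_datasets() route, the sketch below lists the datasets saved in a repository. It assumes that get_datasets() is exposed as `dagshub.data_engine.datasets.get_datasets` and takes the repository as `"<user>/<repo>"`; the `list_saved_datasets` wrapper and its `get_datasets` parameter are hypothetical and exist only so the function can be tried without network access.

```python
def list_saved_datasets(repo, get_datasets=None):
    """Return the Data Engine datasets saved in `repo` as a list.

    Assumption: when no getter is injected, the dagshub client provides
    dagshub.data_engine.datasets.get_datasets(repo).
    """
    if get_datasets is None:
        from dagshub.data_engine import datasets  # imported lazily on purpose

        get_datasets = datasets.get_datasets
    return list(get_datasets(repo))


# Example usage (placeholder repository name):
# for dataset in list_saved_datasets("<user>/<repo>"):
#     print(dataset)
```

The dataset you saved from the DagsHub tab should appear in the returned list under the name you gave it.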

Edit metadata

You can update data points’ metadata directly from the local visualization instance. To do this, navigate to the DagsHub tab and click on the ‘Update metadata for selected’ button.

Update Metadata Button

Choose the field you would like to change, enter the new value, and click ‘save’. The metadata is updated immediately, and the change applies to all selected data points.

Update Metadata Button

Metadata editing is currently supported for primitive-type columns only. Adding new enrichment fields from the visualization instance is not yet supported.

Voxel51 capabilities

See the Voxel51 documentation for other capabilities.

Annotate your data or create a dataloader for training

You can also use DagsHub to annotate your datasets or to convert them to a dataloader for training or evaluation.