The Data Engine will allow you to share your queries and results with your teammates so they can continue where you left off.
			Create production-grade, training-ready datasets for machine learning
		
				
			
			We provide an out-of-the-box solution with a clear display of your datasets, query capabilities, annotations, and lineage, and ultimately a faster way to experiment and improve models.
		
		 
							We cover every step in creating training-ready datasets
			Features
		
				
		
			Seamless connection to your existing storage
		
				
		A simple interface for connecting your external storage, no DevOps needed.
We currently support S3, Google Cloud, and S3-compatible storage, with more to be added in the near future.
			Dataset versioning and lineage
		
				
		A clear, organized display of your datasets,
including visual lineage that connects datasets, models, experiments, labels, and predictions.
			Data querying
		
				
		Select the data points most relevant to improving a model where its performance is low. Filter, sort, and search for similar examples, then save the result as a new training-ready version of the dataset.
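To make the idea concrete, here is a minimal sketch in plain Python of what filtering, sorting, and similarity search over a dataset can look like. It is not the Data Engine API: the records, field names (`confidence`, `embedding`), and helper functions are all hypothetical and exist only for illustration.

```python
import math

# Hypothetical records: ids, labels, confidences, and embeddings are made up.
records = [
    {"id": "img_001", "label": "cat", "confidence": 0.95, "embedding": [1.0, 0.0]},
    {"id": "img_002", "label": "dog", "confidence": 0.40, "embedding": [0.0, 1.0]},
    {"id": "img_003", "label": "dog", "confidence": 0.55, "embedding": [0.1, 0.9]},
    {"id": "img_004", "label": "cat", "confidence": 0.30, "embedding": [0.9, 0.1]},
]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def low_confidence(rows, threshold):
    """Filter to rows below the threshold, sorted least confident first."""
    return sorted(
        (r for r in rows if r["confidence"] < threshold),
        key=lambda r: r["confidence"],
    )


def most_similar(rows, query, k):
    """Return the k rows whose embeddings are closest to the query row."""
    others = [r for r in rows if r["id"] != query["id"]]
    return sorted(
        others,
        key=lambda r: cosine_similarity(r["embedding"], query["embedding"]),
        reverse=True,
    )[:k]


# 1. Filter and sort: find where the (hypothetical) model is least confident.
hard = low_confidence(records, threshold=0.6)

# 2. Similarity search: find the example closest to the hardest one.
neighbors = most_similar(records, hard[0], k=1)

# 3. Save the selection as a new, de-duplicated subset of record ids.
new_version_ids = list(dict.fromkeys(r["id"] for r in hard + neighbors))
```

In this sketch the new subset is just a list of ids; a real workflow would persist that selection as a named, versioned dataset.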
			Annotations
		
				
		Annotate relevant data points in one click with zero setup. Use existing models to label your data automatically, then fine-tune the labels manually.
			Experiments and retraining
		
				
		Use subsets of your data to experiment and retrain your model by streaming them directly to your pipeline, and track your experiments within DagsHub.
 
							