Open Source Data Science Datasets

Path: .

No description

dataset nlp dvc git github

0 0

Path: .

Argument Quality Dataset for fine-tuning

dataset nlp git github

0 0

Path: .

In this project, I want to train a Named Entity Recognition model to identify the columns of any CSV file.

dataset nlp git github

0 0

Path: .

Showcasing DagsHub Annotations, Label Studio integration, Discussions, and other related features

dataset nlp audio computer vision tabular label studio

0 0 0

Path: .

Transactions messages NLP

dataset model nlp dvc label studio git

0 0 0

Path: .

This repository contains the code to import and integrate the book and rating data that we work with. It imports and integrates data from several sources into homogeneous tabular outputs; import scripts are primarily Rust, with Python implementing the analyses.

dataset nlp dvc git github

3 0

Path: .

Code and data pipeline for the Omdena UAE Chapter Challenge "Abu Dhabi Open Data Intelligence: Empowering Analytics with Falcon LLM Voice Bot". Dates: 04-Sep to 12-Nov 2023

dataset nlp

0 0 0

Path: .

Testing DagsHub orgs

dataset nlp image classification label studio git mlflow github

0 0

Path: .

"Hyderabad, India Chapter" - Chatbot for Interview Preparation using NLP

dataset nlp

0 0 0

Path: .

Classification of mail text in scanned PDF images

dataset nlp classification object detection image classification dvc git

0 0 0

Path: datasets

A DagsHub implementation of BioBERT: a pre-trained biomedical language representation model for biomedical text mining

dataset model nlp named entity recognition dvc git

2 0 0

Path: data tests

DPT is a QA bot designed to help answer questions about DagsHub. It is a fork of the brilliant buster project. Using DagsHub's documentation as reference and sentence-transformers/all-MiniLM-L6-v2 for sentence similarity, we identify documents that contain information relevant to a given query. These documents are then passed to OpenAI's GPT-3.5 Turbo, which uses them together with the query in a prompt to return a hopefully helpful answer to the user.

dataset nlp question answering chatbot dvc git

0 0 0

morrisalp / unikud

Updated 6 months ago

Path: . data

UNIKUD is an open-source tool for adding vowel signs (nikud) to Hebrew text with deep learning, using absolutely no rule-based logic.

dataset model nlp dvc git mlflow github

0 1

Path: .

A subset of the LAION Aesthetics V2 dataset that contains only images with an aesthetics score of 6.5 or higher.

dataset nlp computer vision text-to-image generation dvc git

4 0 0

Path: .

databricks-dolly-15k is an open source dataset of instruction-following records generated by thousands of Databricks employees in several of the behavioral categories outlined in the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization.

dataset nlp dvc git

0 0 0

Path: .

Code for the TriviaQA reading comprehension dataset

dataset nlp dvc git github

0 0

Path: data

Fastai community entry to 2020 Reproducibility Challenge

dataset nlp dvc git github

1 0

Path: .

No description

dataset nlp dvc git

0 0 0

Path: .

The WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia. The dataset is available under the Creative Commons Attribution-ShareAlike License.

dataset nlp language modelling dvc git

0 0 0

Path: .

The purpose of the project is to make available a standard training and test setup for language modeling experiments.

dataset nlp language modelling dvc git

0 0 0
