Type:  dataset model Task:  transfer learning Data Domain:  audio Framework:  pytorch

README.md

Urdu-ASR-SOTA

Automatic Speech Recognition using Facebook wav2vec2-xls-r-300m model and mozilla-foundation common_voice_8_0 Urdu Dataset.

wav2vec2-large-xls-r-300m-Urdu

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the common_voice dataset.

It achieves the following results on the evaluation set:

  • Loss: 0.9889
  • Wer: 0.5607
  • Cer: 0.2370
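For readers unfamiliar with the two metrics above, here is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly defined: the edit distance between reference and hypothesis, divided by the reference length. The helper names `edit_distance`, `wer`, and `cer` are illustrative only and are not taken from this repository's eval.py, which may normalise text differently before scoring.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length (spaces included)."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # Toy strings for illustration, not data from the evaluation set
    print(wer("the cat sat", "the cat sit"))
    print(cer("abc", "abd"))
```

Under these definitions, a WER of 0.5607 would mean roughly 56 word-level edits per 100 reference words.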

Evaluation Commands

To evaluate on mozilla-foundation/common_voice_8_0 with the test split:

python3 ./eval.py --model_id ./Model --dataset ./Data --config ur --split test --chunk_length_s 5.0 --stride_length_s 1.0 --log_outputs

from datasets import load_dataset, Audio
from transformers import pipeline

# Local paths to the fine-tuned model and the Common Voice Urdu data
model = "Model"
data = load_dataset("Data", "ur", split="test", delimiter="\t")

def path_adjust(batch):
    # Prefix each clip's file name with the local clips directory
    batch["path"] = "Data/ur/clips/" + str(batch["path"])
    return batch

data = data.map(path_adjust)

# Decode the audio files and resample to 16 kHz, the rate the model expects
sample_iter = iter(data.cast_column("path", Audio(sampling_rate=16_000)))
sample = next(sample_iter)

# Transcribe one sample in 5 s chunks with a 1 s stride
asr = pipeline("automatic-speech-recognition", model=model)
prediction = asr(sample["path"]["array"], chunk_length_s=5, stride_length_s=1)
prediction
# => {'text': 'اب یہ ونگین لمحاتانکھار دلمیں میںفوث کریلیا اجائ'}
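The `chunk_length_s=5` and `stride_length_s=1` arguments above split long audio into overlapping windows. The sketch below illustrates one way such windows can be laid out, assuming each window is 5 s long and the start advances by the chunk length minus the stride on both sides (3 s here). That stepping rule is my understanding of the transformers pipeline's behaviour, not something stated in this README, and `chunk_windows` is a hypothetical helper written for illustration.

```python
def chunk_windows(total_s, chunk_length_s=5.0, stride_length_s=1.0):
    """Return (start, end) times in seconds of overlapping windows covering the audio.

    Assumption: consecutive windows advance by chunk_length_s - 2 * stride_length_s,
    so neighbouring windows share 2 * stride_length_s seconds of audio.
    """
    step = chunk_length_s - 2 * stride_length_s
    if step <= 0:
        raise ValueError("chunk_length_s must be greater than 2 * stride_length_s")
    windows = []
    start = 0.0
    while True:
        end = min(start + chunk_length_s, total_s)
        windows.append((start, end))
        if end >= total_s:
            break
        start += step
    return windows


if __name__ == "__main__":
    # A hypothetical 10-second clip
    for window in chunk_windows(10.0):
        print(window)
```

The overlap gives the model context on both sides of each window, so words that fall on a window boundary are less likely to be cut in half.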

Eval results on Common Voice 8 "test" (WER, %):

Without LM    With LM (run ./eval.py)
56.21         46.37
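To put the two figures side by side, the short calculation below works out the absolute and relative WER reduction from adding the language model. The variable names are illustrative; the two input values are the ones reported in the table.

```python
# WER values (in percent) as reported in the results table
wer_without_lm = 56.21
wer_with_lm = 46.37

# Absolute reduction in percentage points
absolute_reduction = wer_without_lm - wer_with_lm

# Relative reduction as a percentage of the no-LM baseline
relative_reduction = absolute_reduction / wer_without_lm * 100

print(f"Absolute reduction: {absolute_reduction:.2f} points")
print(f"Relative reduction: {relative_reduction:.1f}%")
```

This gives a drop of about 9.84 points, or roughly 17.5% relative to the no-LM baseline.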
About

https://huggingface.co/kingabzpro/wav2vec2-large-xls-r-300m-Urdu

Comments

Abid Ali Awan

commented in commit e2f5996ee7 on branch master

2 years ago

I am open to contributions and suggestions, so keep them coming.

Omar Farooq FastNU

commented in commit e2f5996ee7

1 year ago

I got 17% WER and I am still far from SOTA.
