title | emoji | colorFrom | colorTo | sdk | sdk_version | app_file | pinned | license |
---|---|---|---|---|---|---|---|---|
Urdu ASR SOTA | 👨🎤 | pink | blue | gradio | 2.8.11 | Gradio/app.py | false | apache-2.0 |
Automatic Speech Recognition using Facebook's wav2vec2-xls-r-300m model and mozilla-foundation common_voice_8_0 Urdu Dataset.
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the common_voice dataset.
It achieves the following results on the evaluation set:
Install all dependencies from `requirment.txt`, then run the code below to predict the text:
import torch
from datasets import load_dataset, Audio
from transformers import pipeline
# Local Model/ and Data/ directories (see the directory structure)
model = "Model"
data = load_dataset("Data", "ur", split="test", delimiter="\t")
def path_adjust(batch):
    # Prefix each clip file name with the clips directory
    batch["path"] = "Data/ur/clips/" + str(batch["path"])
    return batch
data = data.map(path_adjust)
# Decode the audio files and resample them to 16 kHz
sample_iter = iter(data.cast_column("path", Audio(sampling_rate=16_000)))
sample = next(sample_iter)
asr = pipeline("automatic-speech-recognition", model=model)
# Transcribe in 5-second chunks with 1 second of overlap on each side
prediction = asr(
    sample["path"]["array"], chunk_length_s=5, stride_length_s=1)
prediction
# => {'text': 'اب یہ ونگین لمحاتانکھار دلمیں میںفوث کریلیا اجائ'}
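The `chunk_length_s=5` and `stride_length_s=1` arguments above split long audio into overlapping windows. The following is a simplified sketch of that idea, not the library's exact implementation: `chunk_windows` is a hypothetical helper, and it assumes the same stride is applied on both sides of every chunk except at the start and end of the audio.

```python
def chunk_windows(num_samples, sampling_rate=16_000,
                  chunk_length_s=5.0, stride_length_s=1.0):
    """Return (start, end, left_stride, right_stride) in samples per chunk.

    Hypothetical helper: an approximation of chunked inference with
    strides, written for illustration only.
    """
    chunk_len = int(round(chunk_length_s * sampling_rate))
    stride = int(round(stride_length_s * sampling_rate))
    # Each new chunk advances by the part that is not overlap.
    step = chunk_len - 2 * stride
    if step <= 0:
        raise ValueError("chunk length must exceed twice the stride")

    windows = []
    for start in range(0, num_samples, step):
        end = min(start + chunk_len, num_samples)
        is_last = start + chunk_len >= num_samples
        # No overlap is discarded at the very start or very end.
        left = 0 if start == 0 else stride
        right = 0 if is_last else stride
        windows.append((start, end, left, right))
        if is_last:
            break
    return windows


# A 12-second clip at 16 kHz, with the settings used above.
rate = 16_000
windows = chunk_windows(num_samples=12 * rate, sampling_rate=rate)
for start, end, left, right in windows:
    kept_from = (start + left) / rate
    kept_to = (end - right) / rate
    print(f"chunk {start / rate:.0f}s-{end / rate:.0f}s, "
          f"kept {kept_from:.0f}s-{kept_to:.0f}s")
```

The overlapping edges give the model context on both sides of a chunk boundary; only the central part of each chunk is kept, so the kept regions tile the clip without gaps.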
To evaluate on `mozilla-foundation/common_voice_8_0` with the `test` split, copy and paste the following command into the terminal.
python3 eval.py --model_id Model --dataset Data --config ur --split test --chunk_length_s 5.0 --stride_length_s 1.0 --log_outputs
Alternatively, run the shell script:
bash run_eval.sh
Boosting Wav2Vec2 with n-grams in 🤗 Transformers
Install kenlm and pyctcdecode before running the notebook.
pip install https://github.com/kpu/kenlm/archive/master.zip pyctcdecode
Without LM | With LM |
---|---|
56.21 | 46.37 |
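The table does not label its metric. Assuming the two figures are word error rates (WER) in percent, the sketch below shows how WER is commonly computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. `word_error_rate` is a hypothetical helper written for illustration; `eval.py` may compute the metric differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Toy English example: one substitution and one deletion in six words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {wer:.2%}")

# Relative reduction from adding the language model, using the table values
# (assumed to be WER percentages).
without_lm, with_lm = 56.21, 46.37
relative_reduction = (without_lm - with_lm) / without_lm
print(f"Relative reduction: {relative_reduction:.1%}")
```

Under that assumption, the language model removes roughly 17.5% of the errors relative to decoding without it.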
<root directory>
|
.- README.md
|
.- Data/
|
.- Model/
|
.- Images/
|
.- Sample/
|
.- Gradio/
|
.- Eval Results/
|     |
|     .- With LM/
|     |
|     .- Without LM/
|     | ...
.- notebook.ipynb
|
.- run_eval.sh
|
.- eval.py
This project was the result of the HuggingFace Robust Speech Recognition Challenge. I was one of the winners, with four state-of-the-art ASR models. Check out my SOTA checkpoints.
https://huggingface.co/kingabzpro/wav2vec2-large-xls-r-300m-Urdu
Commented in commit e2f5996ee7 on branch master, 2 years ago (outdated): I am open to contributions and suggestions, so keep them coming.
Commented in commit e2f5996ee7, 1 year ago (outdated): I got 17% WER and I am still far from SOTA.