| title | description |
|---|---|
| Log and track Hugging Face Transformer experiments with DagsHub | Log and track Hugging Face Transformers experiments with DagsHub with minimal code changes for collaboration, reproducibility, and data-driven decisions. |
The Hugging Face Transformers library is an open-source machine-learning library built on top of PyTorch and TensorFlow that provides a set of pre-trained models for natural language processing tasks. With Hugging Face Transformers, developers and researchers can easily fine-tune the pre-trained models on their own datasets, or train their own models from scratch.
With DagsHub, you can easily log the experiments you run with Hugging Face Transformers to a remote server with minimal changes to your code.
This includes versioning raw and processed data with DVC, as well as logging experiment metrics, parameters, and trained models with MLflow. This integration enables you to continue using the familiar MLflow interface, while also facilitating collaboration with others, comparing results from different runs, and making data-driven decisions with ease.
DagsHub leverages the hooks provided by Hugging Face's Transformers library to inject code at specific points during the training run. These code snippets log information about the training run, such as metrics and artifacts, to the DagsHub remote, using information provided through environment variables set before the trainer runs.
Log your transformer experiments in 3 simple steps:
=== "MacOS, Linux, Windows"

    ```bash
    pip install dagshub
    ```
```python
import dagshub
import os

dagshub.init(repo_name='Repository-Name', repo_owner='Username')
os.environ["HF_DAGSHUB_LOG_ARTIFACTS"] = "True"  # optional; if disabled, only metrics are logged!
```
`dagshub.init` configures your DagsHub account and repository, including the remote MLflow tracking server and DagsHub Storage, on your local machine. If the repository you provide as input doesn't exist, it will automatically be created for you.

Running this command requires authenticating your DagsHub user. If you want to automate this process, you need to set your DagsHub token under the `DAGSHUB_USER_TOKEN` environment variable.
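For non-interactive environments (e.g. CI jobs or scheduled training runs), a minimal sketch of that token setup, where the token value below is a placeholder you replace with your own:

```python
import os

# Export the token before any dagshub call so that authentication
# does not fall back to an interactive (browser-based) login.
os.environ["DAGSHUB_USER_TOKEN"] = "your-dagshub-token"  # placeholder value

# With the token in place, dagshub.init() runs without prompting:
# import dagshub
# dagshub.init(repo_name='Repository-Name', repo_owner='Username')
```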
!!! Important "You need to set the environment variable before you initialize the Trainer"
??? Note "Optional Environment Variables"
    The following optional environment variables can be configured:

    ```python
    os.environ["HF_DAGSHUB_MODEL_NAME"] = "model name"  # name under which the trained model is logged
    os.environ["BRANCH"] = "branch"  # defaults to 'main'
    ```
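How the hook consumes these switches can be pictured with a simplified, illustrative sketch. This is not DagsHub's actual implementation, and treating an unset `HF_DAGSHUB_MODEL_NAME` as `None` is an assumption for illustration only:

```python
import os

def read_dagshub_config():
    """Illustrative only: collect the integration's environment
    switches before the Trainer starts."""
    return {
        # Artifact logging is opt-in via HF_DAGSHUB_LOG_ARTIFACTS.
        "log_artifacts": os.environ.get("HF_DAGSHUB_LOG_ARTIFACTS", "False").lower() == "true",
        # No default is assumed here for the model name; None means "unset".
        "model_name": os.environ.get("HF_DAGSHUB_MODEL_NAME"),
        # The branch defaults to 'main', as noted above.
        "branch": os.environ.get("BRANCH", "main"),
    }

os.environ["HF_DAGSHUB_LOG_ARTIFACTS"] = "True"
config = read_dagshub_config()
```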
=== "MacOS, Linux, Windows"

    ```python
    from transformers import TrainingArguments, Trainer

    training_args = TrainingArguments(output_dir="experiment-name")
    trainer = Trainer(..., args=training_args)
    ```
Great job! The integration is now complete. Transformers will automatically detect that the DagsHub integration is active and include our hook in your pipeline. Consequently, every run will be logged to your DagsHub repository.
The artifacts created during training are not overwritten if the same experiment is run multiple times. However, each experiment is still logged and can be tracked.
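Putting the steps together, a minimal end-to-end sketch. The model checkpoint and repository names below are placeholders, and the fine-tuning specifics follow the standard Transformers `Trainer` API rather than anything DagsHub-specific:

```python
import os
import dagshub
from transformers import (AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# Steps 1-2: point the run at your DagsHub repository and enable artifact logging
# (environment variables must be set before the Trainer is initialized).
dagshub.init(repo_name='Repository-Name', repo_owner='Username')
os.environ["HF_DAGSHUB_LOG_ARTIFACTS"] = "True"

# Step 3: configure the Trainer as usual; the integration's hook picks up
# metrics (and artifacts, if enabled) during training.
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
training_args = TrainingArguments(output_dir="experiment-name")
trainer = Trainer(model=model, args=training_args)  # pass train_dataset/eval_dataset as in any Transformers run
# trainer.train()  # each call is logged as a run in your DagsHub repository
```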