
LLM Tracing

Overview

This project demonstrates how to trace interactions with Large Language Models (LLMs) using MLflow, LangChain, and DagsHub. It enables tracking and evaluating LLM responses to improve response quality and simplify debugging.

Features

  • Integration with DagsHub for MLflow experiment tracking
  • LangChain support for structured LLM interactions
  • MLflow autologging for automated trace logging
  • Evaluation framework for assessing model responses

Installation

To set up the project, install the required dependencies:

pip install dagshub mlflow langchain langchain_openai
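MLflow's trace logging for LangChain landed in relatively recent releases, so a newer MLflow is assumed here; the version pins below are illustrative, not tested requirements:

pip install "dagshub>=0.3" "mlflow>=2.14" langchain langchain-openai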

Usage

1. Initialize the Repository

import dagshub

dagshub.init(repo_owner='Dean', repo_name='LLM_Tracing', mlflow=True)
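After init, MLflow's tracking URI should point at the DagsHub-hosted MLflow server. A quick sanity check (the printed URL follows DagsHub's owner/repo.mlflow convention):

import mlflow

# Expected to print something like https://dagshub.com/Dean/LLM_Tracing.mlflow
print(mlflow.get_tracking_uri())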

2. Set Up OpenAI API Key

from getpass import getpass
OPENAI_API_KEY = getpass("OPENAI Key:")
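Optionally, export the key to the environment as well. Some integrations, including the evaluation step below, read OPENAI_API_KEY from the environment rather than taking it as an argument:

import os

# Make the key visible to any library that reads it from the environment
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY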

3. Define and Run LLM Prompt

import mlflow
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

mlflow.set_experiment("LangChain Tracing")
mlflow.langchain.autolog()  # log a trace automatically for every chain invocation

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7, max_tokens=1000, api_key=OPENAI_API_KEY)

prompt_template = PromptTemplate.from_template(
    "Answer the question as if you are {person}, fully embodying their style. "
    "The question is: {question}"
)

chain = prompt_template | llm | StrOutputParser()

response = chain.invoke({
    "person": "Linus Torvalds",
    "question": "Can I just set everyone’s access to sudo to make things easier?"
})
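With autologging enabled, each invocation above is recorded as a trace under the "LangChain Tracing" experiment. LangChain's batch API is traced the same way; the second persona below is purely illustrative:

responses = chain.batch([
    {"person": "Linus Torvalds",
     "question": "Can I just set everyone’s access to sudo to make things easier?"},
    {"person": "Ada Lovelace",
     "question": "What would you build with a modern computer?"},
])
print(responses[0])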

4. Evaluate Model Responses

import openai  # needed below: log_model references the openai task object
import pandas as pd

eval_data = pd.DataFrame({
    "inputs": ["What is MLflow?", "What is Spark?"],
    "ground_truth": [
        "MLflow is an open-source platform for managing the ML lifecycle.",
        "Apache Spark is an open-source, distributed computing system."
    ]
})

with mlflow.start_run():
    system_prompt = "Answer the following question in two sentences"
    logged_model_info = mlflow.openai.log_model(
        model="gpt-4",
        task=openai.chat.completions,
        artifact_path="model",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": "{question}"},
        ],
    )
    results = mlflow.evaluate(
        logged_model_info.model_uri,
        eval_data,
        targets="ground_truth",
        model_type="question-answering",
    )
    print(results.metrics)
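Beyond the aggregate metrics, mlflow.evaluate also records a per-row results table; the key below is MLflow's default table name:

# Per-question inputs, model outputs, and scores side by side
eval_table = results.tables["eval_results_table"]
print(eval_table.head())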

5. Save Notebook to DagsHub

from dagshub.notebook import save_notebook

save_notebook(
    repo="Dean/LLM_Tracing",  # matches the repo initialized in step 1
    path="LLM_Tracing_Tutorial.ipynb",
    commit_message="Add notebook",
    versioning="git",
)
