Promptfoo: LLM evals & red teaming


promptfoo is a developer-friendly local tool for testing LLM applications. Stop the trial-and-error approach - start shipping secure, reliable AI apps.

Website · Getting Started · Red Teaming · Documentation · Discord

Quick Start

# Install and initialize project
npx promptfoo@latest init

# Run your first evaluation
npx promptfoo eval

See Getting Started (evals) or Red Teaming (vulnerability scanning) for more.
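
For orientation, here is a minimal sketch of what a hand-written config might look like. It assumes the default `promptfooconfig.yaml` filename and promptfoo's built-in `echo` provider, which returns the rendered prompt as the output, so no API key is needed. The `promptfoo-demo` directory, the prompt text, and the `name` variable are illustrative and not part of the original README; the file that `init` generates may look different.

```shell
# Write an illustrative config into a scratch directory (hypothetical name).
mkdir -p promptfoo-demo

cat > promptfoo-demo/promptfooconfig.yaml <<'EOF'
# Illustrative config, not the one generated by `promptfoo init`.
description: Minimal echo-provider sketch
prompts:
  - "Say hello to {{name}}"
providers:
  - echo   # assumed: returns the rendered prompt as the output
tests:
  - vars:
      name: World
    assert:
      - type: contains
        value: World
EOF

# To evaluate it (needs Node.js and network access for npx):
#   npx promptfoo@latest eval -c promptfoo-demo/promptfooconfig.yaml
```

Replacing `echo` with a real provider ID from the documentation should turn this into an actual model evaluation, with the `contains` assertion checked against the model's response.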

What can you do with Promptfoo?

  • Test your prompts and models with automated evaluations
  • Secure your LLM apps with red teaming and vulnerability scanning
  • Compare models side-by-side (OpenAI, Anthropic, Azure, Bedrock, Ollama, and more)
  • Automate checks in CI/CD
  • Share results with your team

Here's what it looks like in action:

[Screenshot: prompt evaluation matrix in the web viewer]

It works on the command line too:

[Screenshot: prompt evaluation matrix on the command line]

It can also generate security vulnerability reports:

[Screenshot: gen AI red team report]

Why Promptfoo?

  • 🚀 Developer-first: Fast, with features like live reload and caching
  • 🔒 Private: Runs 100% locally - your prompts never leave your machine
  • 🔧 Flexible: Works with any LLM API or programming language
  • 💪 Battle-tested: Powers LLM apps serving 10M+ users in production
  • 📊 Data-driven: Make decisions based on metrics, not gut feel
  • 🤝 Open source: MIT licensed, with an active community

Learn More

Contributing

We welcome contributions! Check out our contributing guide to get started.

Join our Discord community for help and discussion.
