podbench / agent environment fleet

run an agent

live, against the real model and verifier · needs OPENROUTER_API_KEY or ANTHROPIC_API_KEY
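
The key requirement above can be checked before starting a live run. This is a minimal sketch, assuming the keys are read from environment variables with the names shown on the page. The helper `pick_api_key` and its preference order are hypothetical, not podbench's own code.

```python
import os

# Variable names come from the page; the lookup order is an assumption.
KEY_NAMES = ("OPENROUTER_API_KEY", "ANTHROPIC_API_KEY")


def pick_api_key(env=None):
    """Return (name, value) for the first non-empty key, or None if neither is set."""
    env = os.environ if env is None else env
    for name in KEY_NAMES:
        value = env.get(name, "").strip()
        if value:
            return name, value
    return None


if __name__ == "__main__":
    found = pick_api_key()
    if found is None:
        print("set OPENROUTER_API_KEY or ANTHROPIC_API_KEY to run an agent live")
    else:
        # Print only the variable name, never the secret itself.
        print(f"using {found[0]}")
```

Passing a plain dict as `env` keeps the check testable without touching the real environment.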

your runs (this browser only)
pass rate (programmatic verifier)
spend (real API cost)
cache hit rate (input from cache)

when	environment	model	result	reward	trust	steps	cost	cache
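
The summary figures above (runs, pass rate, spend, cache hit rate) could plausibly be derived from the per-run rows. The sketch below is only one interpretation: the `Run` record and its field names are hypothetical, and it assumes "cache hit rate" means the share of input tokens served from cache, which the page does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Run:
    # Hypothetical per-run record; field names are illustrative only.
    passed: bool               # verdict from the programmatic verifier
    cost_usd: float            # API cost of the run
    input_tokens: int          # total input tokens sent
    cached_input_tokens: int   # input tokens read from cache


def summarize(runs):
    """Aggregate a list of runs into the four summary figures."""
    n = len(runs)
    total_in = sum(r.input_tokens for r in runs)
    cached_in = sum(r.cached_input_tokens for r in runs)
    return {
        "runs": n,
        # None rather than 0 when there is nothing to average over.
        "pass_rate": sum(r.passed for r in runs) / n if n else None,
        "spend_usd": sum(r.cost_usd for r in runs),
        "cache_hit_rate": cached_in / total_in if total_in else None,
    }


if __name__ == "__main__":
    demo = [Run(True, 0.5, 1000, 250), Run(False, 0.25, 1000, 750)]
    print(summarize(demo))
```

Returning `None` for the empty case mirrors a dashboard that shows a placeholder until at least one run exists.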