run an agent
live, against the real model and verifier · needs ANTHROPIC_API_KEYquick samples:
your session
loading…your runs
—
this browser only
pass rate
—
programmatic verifier
spend
—
real API cost
cache hit rate
—
input from cache
your recent runs
| when | environment | model | result | reward | steps | cost | cache |
|---|---|---|---|---|---|---|---|
| Loading… | |||||||