Agentic data-analysis challenge
benchAnalyst
Build a tool-using agent that answers analytical questions about a large, deliberately messy dataset. Your agent reaches the data only through a constrained API: read-only SQL over the tables, and keyword search and fetch over the documents. Each question comes with a fixed budget of API calls. Answers are graded against a held-out key.
Get your API key
Getting started
- Select your name to get your API key.
- Download the starter kit (API client, example agent, README).
- Build your agent; it queries, searches, and fetches through the API.
- Submit answers through the API and check the leaderboard.
Rules
- Read-only. SQL is SELECT-only; documents are search-and-fetch only.
- Budget. 30 API calls per question (queries, searches, and fetches count; reading the schema does not).
- Grading. Answers are checked against a held-out key. There is no per-question feedback.
- Results. Scores on the public set update live on the leaderboard below; the final set is scored after the deadline.
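
The read-only and budget rules above can be mirrored on the client side, so an agent fails fast locally instead of spending calls it cannot afford. The following is a minimal Python sketch, not the starter kit's actual client: `BudgetedClient`, `BudgetExceeded`, and the `schema`/`query`/`search`/`fetch` method names are hypothetical, and `backend` stands in for whatever client object the starter kit provides. The SELECT check is a rough local heuristic; the server's own validation may differ (for example, it may or may not accept `WITH ... SELECT`).

```python
import re
from typing import Any


class BudgetExceeded(RuntimeError):
    """Raised locally when the per-question call budget is used up."""


class BudgetedClient:
    """Wraps a (hypothetical) API client and tracks calls for one question."""

    def __init__(self, backend: Any, budget: int = 30) -> None:
        self._backend = backend  # assumed to expose schema/query/search/fetch
        self.budget = budget     # 30 calls per question, per the rules
        self.used = 0

    @property
    def remaining(self) -> int:
        return self.budget - self.used

    def _spend(self) -> None:
        # Refuse before forwarding, so the local count never exceeds the budget.
        if self.used >= self.budget:
            raise BudgetExceeded(f"all {self.budget} calls used for this question")
        self.used += 1

    def schema(self) -> Any:
        # Reading the schema does not count against the budget.
        return self._backend.schema()

    def query(self, sql: str) -> Any:
        # Heuristic guard: reject anything that does not start with SELECT,
        # without spending a call on it.
        if not re.match(r"\s*select\b", sql, re.IGNORECASE):
            raise ValueError("only SELECT statements are allowed")
        self._spend()
        return self._backend.query(sql)

    def search(self, keywords: str) -> Any:
        self._spend()
        return self._backend.search(keywords)

    def fetch(self, doc_id: str) -> Any:
        self._spend()
        return self._backend.fetch(doc_id)
```

Create one wrapper per question so the counter starts fresh, and check `remaining` before beginning a multi-step lookup. Whether the server counts rejected or failed calls against the budget is not stated here; this sketch counts every call it forwards, including ones that then fail.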