Agent Track

Agent Track evaluates coding agents that iteratively generate, verify, and optimize kernels.

What It Tests

The track measures how well an agent can debug and optimize kernels autonomously, using execution feedback to guide each revision.
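To make the generate, verify, and optimize cycle concrete, here is a minimal Python sketch of such a feedback loop. It is an illustration only, not the benchmark's actual harness: `run_agent_loop`, `Attempt`, `toy_generate`, and `toy_evaluate` are hypothetical names, and the toy evaluator returns canned results instead of compiling or running a real kernel.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical record of one agent attempt; not part of any real harness.
@dataclass
class Attempt:
    source: str
    correct: bool
    latency_ms: Optional[float]
    feedback: str

# evaluate(source) -> (correct, latency_ms, feedback text shown to the agent)
EvalResult = Tuple[bool, Optional[float], str]


def run_agent_loop(
    generate: Callable[[str], str],
    evaluate: Callable[[str], EvalResult],
    max_iters: int = 5,
) -> Optional[Attempt]:
    """Return the fastest correct attempt, or None if none was correct."""
    best: Optional[Attempt] = None
    feedback = ""  # the first generation sees no feedback
    for _ in range(max_iters):
        source = generate(feedback)                        # generate
        correct, latency_ms, feedback = evaluate(source)   # verify and time
        attempt = Attempt(source, correct, latency_ms, feedback)
        # optimize: keep only correct attempts that beat the current best
        if correct and latency_ms is not None:
            if best is None or latency_ms < best.latency_ms:
                best = attempt
    return best


def toy_generate(feedback: str) -> str:
    """Stand-in for an agent: picks a candidate based on the last feedback."""
    if feedback == "":
        return "v0_buggy"
    if "mismatch" in feedback:
        return "v1_correct_slow"
    return "v2_correct_fast"


def toy_evaluate(source: str) -> EvalResult:
    """Stand-in for compiling and running a kernel; results are canned."""
    table = {
        "v0_buggy": (False, None, "output mismatch at index 3"),
        "v1_correct_slow": (True, 2.0, "correct; 2.0 ms"),
        "v2_correct_fast": (True, 1.0, "correct; 1.0 ms"),
    }
    return table[source]


if __name__ == "__main__":
    best = run_agent_loop(toy_generate, toy_evaluate)
    print(best.source if best else "no correct kernel found")
```

The loop separates correctness from speed: an attempt that fails verification never becomes the best candidate, however fast it is, and its error text is passed back to the generator for the next round.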

When to Use

  • Testing agent frameworks (Claude Code, OpenCode)

  • Evaluating kernel-specialized agents (AutoKernel, AKO4ALL)

  • Generating production-ready kernels

Quick Start

cd agent_bench
bash test_ops.sh add --device-count 1