Setup

Configure your environment for the Agent Track evaluation.

Install the Claude Code CLI into the same environment as torch and vllm, then copy the example config:

# In your KernelGenBench environment
npm install -g @anthropic-ai/claude-code
cp agent_bench/config.example.yaml agent_bench/config.yaml

Edit config.yaml:

paths:
  python: /path/to/your/python  # Python with torch + vllm + kernelgenbench
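
Before running anything, it can help to confirm that the interpreter named in `paths.python` really can import the required packages. The following is a minimal sketch, not part of KernelGenBench: `check_python_modules` is a hypothetical helper name, and the import name of the kernelgenbench package is not stated here, so the example only lists `torch` and `vllm`.

```shell
# Hypothetical helper (not shipped with the benchmark):
# report whether the given interpreter can import each named module.
check_python_modules() {
    py="$1"
    shift
    missing=0
    for mod in "$@"; do
        if "$py" -c "import $mod" >/dev/null 2>&1; then
            echo "ok: $mod"
        else
            echo "missing: $mod"
            missing=1
        fi
    done
    # Exit status is 0 only if every module imported.
    return "$missing"
}

# Example (substitute the value you set in paths.python):
#   check_python_modules /path/to/your/python torch vllm
```

A non-zero exit status means at least one module failed to import with that interpreter.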

If Claude Code is installed in a different environment from torch/vllm, copy the example config:

cp agent_bench/config.example.yaml agent_bench/config.yaml

Edit config.yaml so that paths.python points at the interpreter in the KernelGenBench environment:

paths:
  python: /path/to/envs/kernelgenbench/bin/python

Before running, prepend the directory containing the Claude Code executable to PATH:

export PATH="/path/to/claude_tool/bin:$PATH"
cd agent_bench && bash test_ops.sh add --device-count 1
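
To confirm that the PATH change took effect, you can check that the agent executable resolves before launching a run. This is a small sketch under stated assumptions: `resolve_agent_bin` is a hypothetical helper, and it assumes the executable is named `claude`, the default given for `agent.bin`.

```shell
# Hypothetical helper: print the resolved location of the agent CLI,
# or fail with a message if it cannot be found on PATH.
resolve_agent_bin() {
    name="${1:-claude}"
    if resolved=$(command -v "$name"); then
        echo "$resolved"
    else
        echo "not found on PATH: $name" >&2
        return 1
    fi
}

# Example:
#   resolve_agent_bin          # looks for "claude"
#   resolve_agent_bin my_agent # looks for a differently named executable
```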

| Field | Description |
|---|---|
| `paths.python` | Python interpreter with torch + vllm + kernelgenbench |
| `agent.bin` | Path to agent CLI executable (default: `claude`) |

Ensure your API keys are set:

# Anthropic Claude
export ANTHROPIC_API_KEY=your_key

# OpenAI / OpenAI-compatible
export OPENAI_API_KEY=your_key
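
A quick way to catch a missing key before a run is to test the variables directly. The sketch below uses a hypothetical helper, `have_any_env`, and assumes that having either one of the two keys set is sufficient for your chosen provider. It reports only the variable name, never the key value. Note that it inspects the current shell, so a variable that was assigned without `export` will pass here but will not be visible to child processes.

```shell
# Hypothetical helper: succeed if at least one of the named
# variables is set to a non-empty value in the current shell.
have_any_env() {
    for var in "$@"; do
        # Indirect lookup of the variable named in $var.
        eval "val=\${$var:-}"
        if [ -n "$val" ]; then
            echo "set: $var"
            return 0
        fi
    done
    echo "none set: $*" >&2
    return 1
}

# Example:
#   have_any_env ANTHROPIC_API_KEY OPENAI_API_KEY
```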

cd agent_bench

# Quick test with single operator
bash test_ops.sh add --device-count 1