Setup
Configure your environment for the Agent Track evaluation.
Option A: Single Environment (Recommended)
Install the Claude Code CLI into the same environment as torch and vllm:
# In your KernelGenBench environment
npm install -g @anthropic-ai/claude-code
cp agent_bench/config.example.yaml agent_bench/config.yaml
Edit config.yaml:
paths:
  python: /path/to/your/python  # Python with torch + vllm + kernelgenbench
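Before running the benchmark, it can help to confirm that the interpreter named under paths.python actually imports the required packages. The following is a minimal sketch, not part of agent_bench: `can_import` and `PYTHON_BIN` are hypothetical names, and the path is the placeholder from the config above. The import name of kernelgenbench is not stated here, so only torch and vllm are checked.

```shell
# Hypothetical helper (not part of agent_bench): succeeds if the given
# interpreter can import the given module.
can_import() {
    "$1" -c "import $2" >/dev/null 2>&1
}

# Placeholder path; use the value you set for paths.python in config.yaml.
PYTHON_BIN="${PYTHON_BIN:-/path/to/your/python}"

for module in torch vllm; do
    if can_import "$PYTHON_BIN" "$module"; then
        echo "ok: $module"
    else
        echo "missing: $module (interpreter: $PYTHON_BIN)"
    fi
done
```

A "missing" line usually means that paths.python points at a different environment than the one where the packages were installed.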
Option B: Separate Environments
If Claude Code is installed in a different environment from torch and vllm, copy the example config:
cp agent_bench/config.example.yaml agent_bench/config.yaml
Edit config.yaml:
paths:
  python: /path/to/envs/kernelgenbench/bin/python
Before running, prepend the directory that contains the Claude Code executable to PATH:
export PATH="/path/to/claude_tool/bin:$PATH"
cd agent_bench && bash test_ops.sh add --device-count 1
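If the run cannot find the agent CLI, checking PATH resolution first can narrow the problem down. The sketch below assumes the Claude Code executable is named `claude`; `on_path` is a hypothetical helper and not part of agent_bench.

```shell
# Hypothetical helper: succeeds if the named command resolves on PATH.
on_path() {
    command -v "$1" >/dev/null 2>&1
}

if on_path claude; then
    echo "claude found at: $(command -v claude)"
else
    echo "claude not found; prepend its bin directory to PATH first"
fi
```

Run this in the same shell session that will launch test_ops.sh, since an exported PATH only applies to that session and its children.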
Configuration Fields
| Field | Description |
|---|---|
| | Python interpreter with torch + vllm + kernelgenbench |
| | Path to agent CLI executable (default: |
API Credentials
Ensure your API keys are set:
# Anthropic Claude
export ANTHROPIC_API_KEY=your_key
# OpenAI / OpenAI-compatible
export OPENAI_API_KEY=your_key
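To see which of these variables are visible in the current shell without printing their values, a small check along these lines may be useful. It is only a sketch: `key_is_set` is a hypothetical helper, not part of agent_bench.

```shell
# Hypothetical helper: succeeds if the variable named by $1 is set and non-empty.
key_is_set() {
    # Indirect lookup; the ":-" default keeps this safe under "set -u".
    eval "value=\${$1:-}"
    [ -n "$value" ]
}

for name in ANTHROPIC_API_KEY OPENAI_API_KEY; do
    if key_is_set "$name"; then
        echo "$name is set"
    else
        echo "$name is not set"
    fi
done
```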
Verify Setup
cd agent_bench
# Quick test with single operator
bash test_ops.sh add --device-count 1