LLM Track#
LLM Track evaluates LLMs on direct kernel generation with Pass@K metric.
What It Tests#
Base model capability to generate GPU kernels without execution feedback.
When to Use#
Evaluating base model code generation
Comparing different LLM providers
Quick benchmark with lower cost
Quick Start#
python scripts/generate_kernel_and_verify.py \
--op-name aten::add \
--single-test \
--server-type openai