Hints
This section introduces Hints and explains how they are handled during compilation.
Hints introduction
Hints provides a non-invasive mechanism for injecting performance hints, enabling hardware-aware optimizations while maintaining full compatibility with native Triton code. The mechanism is simple: programmers add inline comments (#@hint: <hint_name>) to the corresponding Triton operations (for example, tl.load). During compilation, these hints are encoded as MLIR (Multi-Level Intermediate Representation) attributes, which lets the mid-end and backend apply hardware-aware optimizations and multi-platform dynamic adaptation based on an elastic verification strategy.
This mechanism provides the following characteristics:
Native compatibility: Hints are optional—kernels remain valid Triton and run correctly with the original Triton compiler.
Low learning overhead: Hints are added via lightweight comments (flagtree_hints) without changing core Triton syntax.
Enhanced compiler extensibility: New optimizations can be introduced by evolving hint schemas and MLIR attributes, avoiding language-level operation/syntax extensions.
Enhanced performance capability: Hardware-aware hints unlock additional compiler optimizations to better utilize hardware features.
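The native-compatibility point can be checked with a small sketch. Because a hint is an ordinary Python comment, the parser never sees it, so a hinted kernel and its unhinted twin produce the same AST. The kernel is kept as a string so the sketch runs without Triton or a GPU. The hint name shared_memory and the kernel itself are illustrative assumptions, not taken from the Hints documentation.

```python
import ast

# Kernel source held as text; it is only parsed, never executed.
# "shared_memory" is a hypothetical hint name used for illustration.
HINTED = '''
@triton.jit
def copy_kernel(src_ptr, dst_ptr, n, BLOCK: tl.constexpr):
    offs = tl.program_id(0) * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n
    x = tl.load(src_ptr + offs, mask=mask)  #@hint: shared_memory
    tl.store(dst_ptr + offs, x, mask=mask)
'''

# The same kernel with the hint comment removed.
PLAIN = HINTED.replace("  #@hint: shared_memory", "")


def same_ast(a: str, b: str) -> bool:
    """Return True when two sources parse to identical Python ASTs."""
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


hint_is_invisible_to_parser = same_ast(HINTED, PLAIN)
```

Here hint_is_invisible_to_parser evaluates to True, which is why a compiler that does not look for #@hint: comments treats the kernel as plain Triton.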
For how to use Hints, see Use Hints.
Hints in the compilation process
Hints extends TTIR operations with attributes to enable hardware-aware optimizations. The implementation involves AST processing, TTIR attribute encoding, and backend pass distribution.
AST Processing: Hints are processed in two stages:
Parsing (python/triton/runtime/jit.py): The parse() method uses Python’s tokenize module to scan source code for #@hint: comments, extracts hint names, and maps them to line numbers. These hints are stored in a line_flagtree_hints dictionary and attached to the AST function definition node.
Create Op (python/triton/compiler/code_generator.py, python/triton/language/core.py, and python/triton/language/semantic.py): During code generation, when encountering tl.load calls, the code generator retrieves hints from the line number mapping and passes them as the flagtree_hints parameter to load(). The semantic layer then forwards this parameter to the builder’s create_load() method, which encodes hints as TTIR operation attributes.
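The two stages above can be approximated with the standard library alone. The following is a minimal sketch, not the actual code in jit.py or code_generator.py: collect_line_hints and LoadHintVisitor are hypothetical names, the hint name shared_memory is an assumption, and the sketch only matches a hint that sits on the line where the tl.load call starts.

```python
import ast
import io
import tokenize

HINT_PREFIX = "#@hint:"


def collect_line_hints(source: str) -> dict:
    """Stage 1: map 1-based line numbers to the hint name in a #@hint: comment."""
    hints = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and tok.string.startswith(HINT_PREFIX):
            hints[tok.start[0]] = tok.string[len(HINT_PREFIX):].strip()
    return hints


class LoadHintVisitor(ast.NodeVisitor):
    """Stage 2: look up the hint for each tl.load call by its line number."""

    def __init__(self, line_hints: dict):
        self.line_hints = line_hints
        self.load_hints = []  # list of (line number, hint name or "")

    def visit_Call(self, node: ast.Call) -> None:
        func = node.func
        is_tl_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "tl"
        )
        if is_tl_load:
            hint = self.line_hints.get(node.lineno, "")
            self.load_hints.append((node.lineno, hint))
        self.generic_visit(node)


# Source is only tokenized and parsed, so tl does not need to be importable.
SOURCE = '''def kernel(src_ptr, dst_ptr, offs, mask):
    x = tl.load(src_ptr + offs, mask=mask)  #@hint: shared_memory
    y = tl.load(dst_ptr + offs, mask=mask)
    tl.store(dst_ptr + offs, x + y, mask=mask)
'''

line_flagtree_hints = collect_line_hints(SOURCE)
visitor = LoadHintVisitor(line_flagtree_hints)
visitor.visit(ast.parse(SOURCE))
```

In this sketch the first load picks up the hint from line 2 and the second load gets an empty string. In the real pipeline the retrieved value would be passed on as the flagtree_hints parameter rather than stored in a list.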
TTIR Attribute Extension: Hints are encoded as attributes on TTIR operations (for example, tt.load operations carry hint attributes), enabling mid-end and backend passes to access and process them.
Backend Pass Distribution: Hints processing passes are dispatched in backend compilers (for example, third_party/[backend_name]/backend/compiler.py). Each backend registers appropriate passes based on the hints it supports (for example, add_process_shared_memory_hint() for the NVIDIA backend).
Pass Implementation Locations: Hints processing passes are implemented in the following directories:
Backend-specific directories: Each backend may implement hint-specific passes in its own directory (for example, third_party/nvidia/).
Linalg/FLIR directories: Common Linalg passes that process hints during structured-to-memref conversions.
TLE directories: TLE-related passes that may interact with hints during transformations.
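The idea of per-backend pass registration can be illustrated with a small registry. This is a sketch under stated assumptions rather than the code in any backend's compiler.py: HINT_PASSES, register_hint_pass, and build_hint_pipeline are hypothetical helpers, the pipeline is a plain list standing in for an MLIR pass manager, and skipping hints a backend does not support is a choice made for this sketch. Only the name add_process_shared_memory_hint comes from the text above.

```python
# Hypothetical registry: backend name -> {hint name -> function that adds a pass}.
HINT_PASSES = {}


def register_hint_pass(backend: str, hint: str):
    """Decorator that records which pass a backend runs for a given hint."""
    def wrap(fn):
        HINT_PASSES.setdefault(backend, {})[hint] = fn
        return fn
    return wrap


@register_hint_pass("nvidia", "shared_memory")
def add_process_shared_memory_hint(pipeline: list) -> None:
    # Stand-in for adding a real pass to a pass manager.
    pipeline.append("process_shared_memory_hint")


def build_hint_pipeline(backend: str, hints_in_kernel: set) -> list:
    """Add one pass per hint the backend supports; skip everything else."""
    pipeline = []
    supported = HINT_PASSES.get(backend, {})
    for hint in sorted(hints_in_kernel):  # sorted for a deterministic order
        adder = supported.get(hint)
        if adder is not None:
            adder(pipeline)
    return pipeline


nvidia_pipeline = build_hint_pipeline("nvidia", {"shared_memory", "unknown_hint"})
other_pipeline = build_hint_pipeline("other_backend", {"shared_memory"})
```

Keeping the mapping per backend means a kernel carrying a hint that one backend understands can still be compiled by another backend, which simply adds no pass for it in this sketch.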