Add new operators and vendor backends#
Add new operators#
When adding a new operator, modify these files:
- backends/flaggems/impl/*.py - Add FlagGems implementation
- backends/flaggems/flaggems.py - Add method to backend class
- backends/flaggems/register_ops.py - Register OpImpl
- backends/reference/impl/*.py - Add PyTorch implementation (if applicable)
- backends/reference/reference.py - Add method to backend class
- backends/reference/register_ops.py - Register OpImpl
- backends/vendor/<vendor>/impl/*.py - Add vendor-specific implementation (optional)
- backends/vendor/<vendor>/<vendor>.py - Add method to vendor backend class
- backends/vendor/<vendor>/register_ops.py - Register vendor OpImpl
- ops.py - Add abstract method declaration
Note: Not all operators require a reference implementation. For example, attention_backend only has FlagGems and vendor implementations since it returns a backend class path rather than executing a computation.
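The file-by-file steps above can be sketched in a single file. This is only an illustration under stated assumptions: Ops, OpImpl, Registry, the operator my_op, and the impl_id "reference.example" are simplified, hypothetical stand-ins, and the real classes in ops.py and the register_ops.py modules carry more fields and behavior.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical stand-in for the abstract declaration added to ops.py.
class Ops(ABC):
    @abstractmethod
    def my_op(self, x: float) -> float: ...


# Hypothetical, reduced stand-in for OpImpl.
@dataclass
class OpImpl:
    op_name: str
    impl_id: str
    fn: Callable
    priority: int


# Hypothetical stand-in for the dispatch registry.
class Registry:
    def __init__(self) -> None:
        self.impls_by_op: Dict[str, List[OpImpl]] = {}

    def register_many(self, impls: List[OpImpl]) -> None:
        for impl in impls:
            self.impls_by_op.setdefault(impl.op_name, []).append(impl)


# Plays the role of backends/reference/reference.py: the backend method.
class ReferenceBackend(Ops):
    def my_op(self, x: float) -> float:
        return 2.0 * x


# Plays the role of backends/reference/register_ops.py: the registration.
def register_builtins(registry: Registry) -> None:
    backend = ReferenceBackend()
    registry.register_many([
        OpImpl(op_name="my_op", impl_id="reference.example",
               fn=backend.my_op, priority=50),
    ])


registry = Registry()
register_builtins(registry)
```

The same three pieces (abstract declaration, backend method, OpImpl registration) repeat for each backend that implements the operator.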
Add vendor backends#
The dispatch system supports three ways to integrate vendor backends:
1. Built-in vendor backends - Located in backends/vendor/ (recommended for core vendors)
2. External plugin packages - Distributed as separate Python packages
3. Environment-based plugins - Loaded via VLLM_FL_PLUGIN_MODULES
Option 1: Built-in vendor backend#
Directory structure:
backends/vendor/<vendor_name>/
├── __init__.py
├── <vendor_name>.py # Backend class
├── register_ops.py # Registration function
└── impl/ # Operator implementations
    ├── __init__.py
    ├── activation.py
    ├── normalization.py
    ├── rotary.py
    └── attention.py # (optional) Vendor-specific attention backend
Step 1: Create backend class (<vendor_name>.py):
from ...base import Backend


class <VendorName>Backend(Backend):
    # Cached result of the availability check (None = not checked yet)
    _available = None

    @property
    def name(self) -> str:
        return "<vendor_name>"

    @property
    def vendor(self) -> str:
        return "<vendor_name>"  # Required for vendor backends

    def is_available(self) -> bool:
        if <VendorName>Backend._available is None:
            try:
                import <vendor_library>
                <VendorName>Backend._available = True
            except ImportError:
                <VendorName>Backend._available = False
        return <VendorName>Backend._available

    def silu_and_mul(self, x):
        # Import lazily so the vendor library is only loaded when used
        from .impl.activation import silu_and_mul_<vendor>
        return silu_and_mul_<vendor>(x)
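The cached availability check in the template can be tried out with a small runnable sketch. Everything here is illustrative: Backend is a reduced stand-in for the base class in base.py, LibraryBackend and its two subclasses are hypothetical, and the standard-library json module stands in for a vendor library.

```python
import importlib
from abc import ABC, abstractmethod
from typing import Optional


class Backend(ABC):  # hypothetical, reduced stand-in for the real base class
    @property
    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    def is_available(self) -> bool: ...


class LibraryBackend(Backend):
    """Backend whose availability depends on one importable library."""

    library: str = ""
    _available: Optional[bool] = None  # None means "not checked yet"

    @property
    def name(self) -> str:
        return self.library

    def is_available(self) -> bool:
        cls = type(self)
        if cls._available is None:
            try:
                importlib.import_module(cls.library)
                cls._available = True  # stored on the subclass, not the parent
            except ImportError:
                cls._available = False
        return cls._available


class JsonBackend(LibraryBackend):
    library = "json"  # always importable, so this backend is available


class MissingBackend(LibraryBackend):
    library = "no_such_vendor_library_xyz"  # import fails, so unavailable


present = JsonBackend().is_available()
absent = MissingBackend().is_available()
```

Caching on the class means the import is attempted at most once per backend type, regardless of how many instances are created.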
Step 2: Create registration module (register_ops.py):
from ....types import OpImpl, BackendImplKind, BackendPriority


def register_builtins(registry):
    from .<vendor_name> import <VendorName>Backend

    backend = <VendorName>Backend()
    impls = [
        OpImpl(
            op_name="silu_and_mul",
            impl_id="vendor.<vendor_name>",
            kind=BackendImplKind.VENDOR,
            fn=backend.silu_and_mul,
            vendor="<vendor_name>",
            priority=BackendPriority.VENDOR,  # 100
        ),
    ]
    registry.register_many(impls)
Step 3: Register in builtin_ops.py:
try:
    from .backends.vendor.<vendor_name>.register_ops import register_builtins as register_<vendor>
    register_<vendor>(registry)
except Exception as e:
    logger.debug(f"<Vendor> operators not available: {e}")
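The try/except wrapper in Step 3 keeps one missing vendor from blocking the others. A minimal sketch of that idea follows; safe_register, the two register functions, and the list used as a registry are hypothetical helpers for illustration, not part of the project.

```python
import logging
from typing import Callable, List

logger = logging.getLogger("dispatch_demo")


def safe_register(registry: List[str], register_fn: Callable, label: str) -> bool:
    """Run one registration function; log and carry on if it fails."""
    try:
        register_fn(registry)
        return True
    except Exception as exc:
        logger.debug("%s operators not available: %s", label, exc)
        return False


def register_ok(registry: List[str]) -> None:
    registry.append("silu_and_mul")


def register_broken(registry: List[str]) -> None:
    # Simulates a vendor whose library is not installed.
    raise ImportError("vendor library is not installed")


registry: List[str] = []
results = [
    safe_register(registry, register_ok, "VendorA"),
    safe_register(registry, register_broken, "VendorB"),
]
```

The failing vendor is logged at debug level and skipped, while operators from the working vendor remain registered.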
Option 2: External plugin package#
Create a separate package with entry points:
# setup.py
from setuptools import setup

setup(
    name="vllm-plugin-<vendor>",
    entry_points={
        "vllm_fl.plugin": [
            "<vendor> = vllm_fl_<vendor>.register_ops:register_builtins",
        ],
    },
)
Install and use:
pip install vllm-plugin-<vendor>
# Plugin auto-discovered via entry points
Option 3: Environment-based plugin#
export VLLM_FL_PLUGIN_MODULES=my_custom_backend.register_ops
The module should provide a register_builtins(registry) function.
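A loader for this option could look like the sketch below. It is illustrative only: load_env_plugins is a hypothetical function, the comma-separated format for multiple modules is an assumption, and a fake in-memory module with a plain list as registry replaces a real plugin.

```python
import importlib
import os
import sys
import types


def load_env_plugins(registry, env_var: str = "VLLM_FL_PLUGIN_MODULES") -> list:
    """Import each module named in `env_var` and call its register_builtins."""
    loaded = []
    raw = os.environ.get(env_var, "")
    # Assumption: several modules may be listed, separated by commas.
    for name in (part.strip() for part in raw.split(",")):
        if not name:
            continue
        module = importlib.import_module(name)
        register = getattr(module, "register_builtins", None)
        if callable(register):
            register(registry)
            loaded.append(name)
    return loaded


# Demo: a fake plugin module placed in sys.modules instead of on disk.
demo = types.ModuleType("my_custom_backend_demo")
demo.register_builtins = lambda registry: registry.append("silu_and_mul")
sys.modules[demo.__name__] = demo
os.environ["VLLM_FL_PLUGIN_MODULES"] = demo.__name__

registry = []
loaded = load_env_plugins(registry)
```

Modules without a callable register_builtins are skipped in this sketch; a real loader might prefer to log or raise instead.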
Priority levels#
Use constants from types.py:
- BackendPriority.DEFAULT (150) - FlagGems
- BackendPriority.VENDOR (100) - Vendor backends
- BackendPriority.REFERENCE (50) - PyTorch
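To show how these values could drive selection, here is a minimal sketch. It assumes that the available implementation with the highest numeric priority wins, which this page does not state, so treat that rule as an assumption. Candidate, pick, and the impl_id strings are hypothetical, and BackendPriority is re-declared locally with the values listed above.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class BackendPriority(IntEnum):
    # Local copy of the values listed above.
    DEFAULT = 150
    VENDOR = 100
    REFERENCE = 50


@dataclass
class Candidate:  # hypothetical, reduced view of a registered implementation
    impl_id: str
    priority: int
    available: bool


def pick(candidates: List[Candidate]) -> Optional[Candidate]:
    """Assumed rule: highest priority among the available candidates."""
    usable = [c for c in candidates if c.available]
    return max(usable, key=lambda c: c.priority, default=None)


candidates = [
    Candidate("default.example", BackendPriority.DEFAULT, available=False),
    Candidate("vendor.example", BackendPriority.VENDOR, available=True),
    Candidate("reference.example", BackendPriority.REFERENCE, available=True),
]
chosen = pick(candidates)
```

Under this assumed rule, an unavailable higher-priority implementation is skipped and the next one down is used.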
Test your backend#
from vllm_fl.dispatch import get_default_manager

manager = get_default_manager()
manager.ensure_initialized()

# Check registration
snap = manager.registry.snapshot()
for op_name, impls in snap.impls_by_op.items():
    for impl in impls:
        if impl.vendor == "<vendor_name>":
            print(f"{op_name}: {impl.impl_id}, available={impl.is_available()}")
Enable debug output:
export VLLM_FL_LOG_LEVEL=DEBUG
Vendor backend checklist#
- Backend class inherits from Backend
- vendor property returns vendor name (not None)
- is_available() checks hardware/library availability
- register_ops.py uses BackendImplKind.VENDOR
- impl_id follows format: vendor.<vendor_name>
- Priority set to BackendPriority.VENDOR (100)
- Error handling for missing dependencies
- (Optional) attention_backend() returns vendor-specific attention backend class path
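The registration-related items in the checklist can be checked mechanically. The sketch below is a hypothetical helper, not part of the project: OpImpl and BackendImplKind are reduced local stand-ins (the OTHER member exists only for the demo), and check_vendor_impl is an invented name.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class BackendImplKind(Enum):  # stand-in; OTHER is only here for the demo
    VENDOR = "vendor"
    OTHER = "other"


@dataclass
class OpImpl:  # hypothetical, reduced stand-in
    op_name: str
    impl_id: str
    kind: BackendImplKind
    vendor: Optional[str]
    priority: int


VENDOR_PRIORITY = 100  # value of BackendPriority.VENDOR per this page


def check_vendor_impl(impl: OpImpl) -> List[str]:
    """Return the checklist items that `impl` violates (empty if none)."""
    problems = []
    if impl.kind is not BackendImplKind.VENDOR:
        problems.append("kind must be BackendImplKind.VENDOR")
    if not impl.vendor:
        problems.append("vendor must be set (not None)")
    elif impl.impl_id != f"vendor.{impl.vendor}":
        problems.append("impl_id must follow vendor.<vendor_name>")
    if impl.priority != VENDOR_PRIORITY:
        problems.append("priority must be BackendPriority.VENDOR (100)")
    return problems


good = OpImpl("silu_and_mul", "vendor.acme", BackendImplKind.VENDOR, "acme", 100)
bad = OpImpl("silu_and_mul", "acme", BackendImplKind.OTHER, None, 50)
good_problems = check_vendor_impl(good)
bad_problems = check_vendor_impl(bad)
```

A check like this could run in a unit test over every OpImpl a vendor's register_builtins produces.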
Current vendor backends#
| Vendor | Device | Library | Attention Backend |
|---|---|---|---|
|  | NVIDIA GPU |  | - (uses vLLM native) |
|  | Huawei NPU |  |  |
See backends/vendor/template/ for a template to create new vendor backends.
Multi-process safety#
OpManager supports multi-process environments:
- Uses os.register_at_fork() to automatically reset state after fork
- PID detection ensures independent initialization per process
- Thread-safe registry and cache operations
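The three mechanisms above can be combined as in the following sketch. It is not OpManager's implementation: ForkSafeManager and its attributes are hypothetical, and the code only shows how a fork hook, a PID check, and a lock might fit together. os.register_at_fork() exists on Unix only, so the sketch guards the call.

```python
import os
import threading


class ForkSafeManager:
    """Hypothetical manager that re-initializes itself in forked children."""

    def __init__(self) -> None:
        self._lock = threading.RLock()
        self._initialized = False
        self._pid = os.getpid()
        self.init_count = 0  # exposed so the demo can observe re-initialization
        if hasattr(os, "register_at_fork"):  # not available on Windows
            os.register_at_fork(after_in_child=self._reset_after_fork)

    def _reset_after_fork(self) -> None:
        # A lock held by another thread at fork time would never be
        # released in the child, so the child gets a fresh lock.
        self._lock = threading.RLock()
        self._initialized = False
        self._pid = os.getpid()

    def ensure_initialized(self) -> None:
        with self._lock:
            if self._initialized and self._pid == os.getpid():
                return
            # Either first use, or a PID mismatch: the process changed
            # without the fork hook having run.
            self._pid = os.getpid()
            self.init_count += 1
            self._initialized = True


manager = ForkSafeManager()
manager.ensure_initialized()
manager.ensure_initialized()  # second call is a no-op in the same process
```

The PID check acts as a fallback for cases where the fork hook did not run, so each process still initializes independently.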