Vendor Integration

Chip vendors integrate by adding a backend directory under dispatch/backends/vendor/, starting from a copy of the template:

cp -r sglang_fl/dispatch/backends/vendor/template/ \
      sglang_fl/dispatch/backends/vendor/my_chip/

The following table lists the existing vendors:

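| Vendor | Directory | Hardware detection |
| --- | --- | --- |
| NVIDIA CUDA | vendor/cuda/ | sgl_kernel is importable |
| Huawei Ascend | vendor/ascend/ | torch_npu is importable |
| Template | vendor/template/ | Always False (reference only) |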

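To integrate a new vendor, implement two files (my_chip.py and register_ops.py), plus the operator implementations under impl/: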

1. Backend class (my_chip.py)

from sglang_fl.dispatch.backends import Backend


class MyChipBackend(Backend):
    _available = None

    @property
    def name(self) -> str:
        return "my_chip"

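    def is_available(self) -> bool:
        # Probe the SDK once and cache the result on the class.
        if MyChipBackend._available is None:
            try:
                import my_chip_sdk  # placeholder name for the vendor SDK
                MyChipBackend._available = my_chip_sdk.device_count() > 0
            except ImportError:
                # SDK not installed: treat the hardware as absent.
                MyChipBackend._available = False
        return MyChipBackend._available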

    def silu_and_mul(self, obj, x):
        from .impl.activation import silu_and_mul_my_chip
        return silu_and_mul_my_chip(obj, x)

    def rms_norm(self, obj, x, residual=None):
        from .impl.normalization import rms_norm_my_chip
        return rms_norm_my_chip(obj, x, residual)

    def rotary_embedding(self, obj, query, key, cos, sin, position_ids,
                         rotary_interleaved=False, inplace=True):
        from .impl.rotary import rotary_embedding_my_chip
        return rotary_embedding_my_chip(
            obj, query, key, cos, sin, position_ids, rotary_interleaved, inplace
        )

2. Registration (register_ops.py)

import functools
from sglang_fl.dispatch.types import OpImpl, BackendImplKind, BackendPriority


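def _bind_is_available(fn, is_available_fn):
    # Wrap fn unchanged and attach the backend's availability check
    # to the wrapper as the `_is_available` attribute.
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)
    wrapper._is_available = is_available_fn
    return wrapper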


def register_builtins(registry) -> None:
    from .my_chip import MyChipBackend

    backend = MyChipBackend()
    is_avail = backend.is_available

    impls = [
        OpImpl(
            op_name="silu_and_mul",
            impl_id="vendor.my_chip",
            kind=BackendImplKind.VENDOR,
            fn=_bind_is_available(backend.silu_and_mul, is_avail),
            vendor="my_chip",
            priority=BackendPriority.VENDOR,
        ),
        OpImpl(
            op_name="rms_norm",
            impl_id="vendor.my_chip",
            kind=BackendImplKind.VENDOR,
            fn=_bind_is_available(backend.rms_norm, is_avail),
            vendor="my_chip",
            priority=BackendPriority.VENDOR,
        ),
        OpImpl(
            op_name="rotary_embedding",
            impl_id="vendor.my_chip",
            kind=BackendImplKind.VENDOR,
            fn=_bind_is_available(backend.rotary_embedding, is_avail),
            vendor="my_chip",
            priority=BackendPriority.VENDOR,
        ),
    ]
    registry.register_many(impls)

3. Operator implementations (impl/)

Each operator function takes standardized arguments (the same as in vllm-plugin-FL):

| Operator | Signature |
| --- | --- |
| silu_and_mul | fn(obj, x: Tensor) -> Tensor |
| rms_norm | fn(obj, x: Tensor, residual: Optional[Tensor] = None) -> Tensor \| tuple[Tensor, Tensor] |
| rotary_embedding | fn(obj, query, key, cos, sin, position_ids, rotary_interleaved=False, inplace=True) -> tuple[Tensor, Tensor] |

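The obj argument gives access to the layer's attributes (obj.weight, obj.variance_epsilon, and so on). These attribute names are identical in SGLang and vLLM, so the same implementation can serve both frameworks.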
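
To make the signatures concrete, here is a rough sketch of what two impl/ functions could look like, written in plain PyTorch rather than with a vendor kernel. The function names `silu_and_mul_reference` and `rms_norm_reference` are hypothetical. The residual handling (add the residual first, then return both the normalized output and the updated residual) follows the usual fused add-and-normalize convention and is an assumption here; check it against the framework's own layer before relying on it.

```python
from typing import Optional, Tuple, Union

import torch
import torch.nn.functional as F


def silu_and_mul_reference(obj, x: torch.Tensor) -> torch.Tensor:
    # Split the last dimension in half: SiLU(first half) * second half.
    d = x.shape[-1] // 2
    return F.silu(x[..., :d]) * x[..., d:]


def rms_norm_reference(
    obj, x: torch.Tensor, residual: Optional[torch.Tensor] = None
) -> Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]:
    orig_dtype = x.dtype
    x = x.to(torch.float32)  # accumulate in float32 for stability
    if residual is not None:
        # Assumed convention: add the residual before normalizing and
        # hand the sum back as the new residual.
        x = x + residual.to(torch.float32)
        residual = x.to(orig_dtype)
    variance = x.pow(2).mean(dim=-1, keepdim=True)
    x = x * torch.rsqrt(variance + obj.variance_epsilon)
    out = (x * obj.weight.to(torch.float32)).to(orig_dtype)
    return out if residual is None else (out, residual)
```

Both functions read layer state only through obj (obj.weight and obj.variance_epsilon), so any object exposing those two attributes can be passed in. A vendor implementation would keep the same signature and replace the body with calls into its own kernels.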

Automatic vendor backend discovery

At startup the plugin scans dispatch/backends/vendor/*/register_ops.py. If is_available() returns True, that vendor's operators are registered. No other files need to be modified.
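
The scan described above could be implemented along the following lines. This is an illustrative sketch, not the plugin's actual loader: `discover_vendor_backends` is a hypothetical name, and the sketch assumes each vendor directory is an importable sub-package whose register_ops module exposes `register_builtins(registry)`. It leaves the availability decision to the `_is_available` hook bound at registration time.

```python
import importlib
import pkgutil


def discover_vendor_backends(vendor_package: str, registry) -> list:
    """Import <vendor_package>.<name>.register_ops for every sub-package
    and call its register_builtins(registry).

    Returns the sorted names of the vendor sub-packages that were loaded.
    """
    package = importlib.import_module(vendor_package)
    loaded = []
    for info in pkgutil.iter_modules(package.__path__):
        if not info.ispkg:
            continue  # only vendor directories are candidates
        module_name = f"{vendor_package}.{info.name}.register_ops"
        try:
            module = importlib.import_module(module_name)
        except ModuleNotFoundError as exc:
            if exc.name == module_name:
                continue  # this directory has no register_ops.py
            raise  # something imported by register_ops is missing
        module.register_builtins(registry)
        loaded.append(info.name)
    return sorted(loaded)
```

Importing through the package, rather than executing each file on its own, keeps relative imports such as `from .my_chip import MyChipBackend` working. Judging by the copy command at the top of this page, the package name would presumably be `sglang_fl.dispatch.backends.vendor`.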