Release Notes

v0.1.0

Note

This is a preview release. The version number shown is a pre-release identifier and may change upon final release. Content in this preview is for reference only and does not constitute a commitment or warranty for the final product.

Initial release of sglang-plugin-FL.

  • Added features

    • Three-layer operator replacement architecture for SGLang:

      • Layer 1: ATen operator replacement via FlagGems Triton kernels

      • Layer 2: SGLang fused kernel dispatch (SiluAndMul, RMSNorm, RotaryEmbedding)

      • Layer 3: Distributed communication via CommunicatorFL (FlagCX / torch.distributed)

    • Non-intrusive plugin architecture using SGLang entry_points

    • Per-operator backend selection with automatic fallback

    • YAML configuration and environment variable control

    • Bridge layer decoupling framework-specific parameters from standardized op signatures

    • Vendor auto-discovery mechanism: the same backends work for both sglang-plugin-FL and vllm-plugin-FL

    • Support for NVIDIA CUDA and Huawei Ascend, extensible to other hardware

    • Verified models: Qwen3.6-27B, Qwen3.6-35B-A3B, Qwen2.5-14B-Instruct

    • Dispatch logging and ATen replacement logging for debugging

    • Precision bisection workflow for numerical debugging
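
As a rough illustration of what "per-operator backend selection with automatic fallback" can look like, the following is a minimal sketch in plain Python. It is not the sglang-plugin-FL API: the `OpDispatcher` class, its methods, the backend names `vendor` and `reference`, and the list-based `silu_and_mul` functions are all hypothetical and chosen only for this example. The sketch assumes that a backend signals "cannot handle this call" by raising `NotImplementedError`; the actual plugin may use a different signal.

```python
import math
from typing import Callable, Dict, List, Tuple


class OpDispatcher:
    """Hypothetical per-operator dispatcher (illustration only)."""

    def __init__(self) -> None:
        # op name -> {backend name -> implementation}
        self._impls: Dict[str, Dict[str, Callable]] = {}
        # op name -> backend names in preference order
        self._order: Dict[str, List[str]] = {}

    def register(self, op: str, backend: str, fn: Callable) -> None:
        self._impls.setdefault(op, {})[backend] = fn

    def prefer(self, op: str, backends: List[str]) -> None:
        # Set the preference order for a single operator.
        self._order[op] = list(backends)

    def call(self, op: str, *args, **kwargs) -> Tuple[str, object]:
        impls = self._impls.get(op, {})
        # Without an explicit preference, try backends in registration order.
        order = self._order.get(op, list(impls))
        errors = []
        for backend in order:
            fn = impls.get(backend)
            if fn is None:
                continue  # preferred backend was never registered
            try:
                return backend, fn(*args, **kwargs)
            except NotImplementedError as exc:
                # Assumed fallback signal: record it and try the next backend.
                errors.append(f"{backend}: {exc}")
        raise RuntimeError(f"no usable backend for {op!r}: {errors}")


def silu_and_mul_vendor(x: List[float]) -> List[float]:
    # Stand-in for a vendor kernel that rejects this input.
    raise NotImplementedError("unsupported input")


def silu_and_mul_reference(x: List[float]) -> List[float]:
    # Reference SiluAndMul: silu(first half) * second half, elementwise.
    half = len(x) // 2
    return [a / (1.0 + math.exp(-a)) * b for a, b in zip(x[:half], x[half:])]


dispatcher = OpDispatcher()
dispatcher.register("silu_and_mul", "vendor", silu_and_mul_vendor)
dispatcher.register("silu_and_mul", "reference", silu_and_mul_reference)
dispatcher.prefer("silu_and_mul", ["vendor", "reference"])

used_backend, result = dispatcher.call("silu_and_mul", [0.0, 1.0, 2.0, 3.0])
print(used_backend, result)
```

Here the preferred `vendor` backend declines the call, so the dispatcher falls back to `reference` and reports which backend actually ran. Returning the backend name alongside the result is one simple way to feed the kind of dispatch logging mentioned above.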