Getting Started

This section covers the requirements for installing vllm-plugin-FL and guides you through installing vLLM and vllm-plugin-FL on different hardware platforms and running an inference task.

  • Requirements
    • Software Requirements
    • Supported hardware platforms
    • Model Compatibility
  • Install software for running an inference task
    • Additional setup steps for running an inference task on Huawei Ascend
    • Additional setup steps for running an inference task with CUDA
    • Dispatch operators
  • Run an offline batched inference

By FlagOS Community

© Copyright 2025-2026, FlagOS Community.

Last updated on Jun 09, 2026.