Requirements
Software Requirements
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.10 - 3.13 | Required |
| PyTorch | >= 2.7.1 | Required |
| vLLM | 0.13.0 | Required, from official release or fork |
| FlagGems | 5.0.0 | Required for operator dispatch |
| FlagCX | 0.9.0 | Optional, for multi-chip communication |
| FlagTree | 0.4.0 | Ascend NPU only |
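The version constraints above can be checked before launching anything. The following is a minimal sketch using only the standard library, not an official tool of this project. The distribution names (`torch`, `vllm`, `flag_gems`) are assumptions about how the packages are registered with pip, and the helpers `parse`, `satisfies`, `python_ok`, and `check_environment` are hypothetical names introduced for illustration. The optional and Ascend-only components (FlagCX, FlagTree) are left out.

```python
import sys
from importlib import metadata

# Assumed pip distribution names; adjust them to match your installation.
REQUIRED = {
    "torch": (">=", "2.7.1"),
    "vllm": ("==", "0.13.0"),
    "flag_gems": ("==", "5.0.0"),
}


def parse(version):
    """Reduce a version string such as '2.7.1+cu128' to a tuple like (2, 7, 1).

    Local build tags and pre-release suffixes are ignored, so this is a
    coarse comparison rather than a full PEP 440 implementation.
    """
    core = version.split("+")[0]
    parts = []
    for piece in core.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def satisfies(installed, op, required):
    """Return True if the installed version meets the '>=' or '==' constraint."""
    if op == ">=":
        return parse(installed) >= parse(required)
    return parse(installed) == parse(required)


def python_ok(info=sys.version_info):
    """Return True for Python 3.10 through 3.13 inclusive."""
    return (3, 10) <= tuple(info[:2]) <= (3, 13)


def check_environment():
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    if not python_ok():
        problems.append("Python: need 3.10 - 3.13")
    for dist, (op, required) in REQUIRED.items():
        try:
            installed = metadata.version(dist)
        except metadata.PackageNotFoundError:
            problems.append(f"{dist}: not installed")
            continue
        if not satisfies(installed, op, required):
            problems.append(f"{dist}: found {installed}, need {op} {required}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Running the script prints one line per unmet requirement and prints nothing when every check passes.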
Supported Hardware Platforms
The following table lists the supported chip vendors and their verification status:
| Chip Vendor | Status | Notes |
|---|---|---|
| NVIDIA | Supported | |
| Ascend | Supported | Requires FlagTree and eager execution |
| Pingtouge-Zhenwu | Supported | |
| Iluvatar | Supported | Requires FlagTree and eager execution |
| MetaX | Supported | |
| Moore Threads | Supported | |
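For the vendors marked as requiring eager execution, vLLM exposes an `enforce_eager` engine argument (also available as the `--enforce-eager` command-line flag) that disables graph capture. The sketch below shows one way to apply it per vendor. The helpers `engine_kwargs` and `build_llm` and the `EAGER_ONLY_VENDORS` set are hypothetical names introduced for illustration, and `Qwen/Qwen3-4B` is assumed to be the Hugging Face ID of the Qwen3-4B model. Whether the plugin needs further vendor-specific settings is not covered here.

```python
# Vendors whose Notes column reads "Requires FlagTree and eager execution".
EAGER_ONLY_VENDORS = {"ascend", "iluvatar"}


def engine_kwargs(vendor: str) -> dict:
    """Return extra keyword arguments for vllm.LLM on the given chip vendor."""
    if vendor.strip().lower() in EAGER_ONLY_VENDORS:
        return {"enforce_eager": True}
    return {}


def build_llm(vendor: str, model: str = "Qwen/Qwen3-4B"):
    """Construct a vLLM engine, forcing eager mode where the table requires it."""
    # Imported lazily so engine_kwargs can be used without vLLM installed.
    from vllm import LLM

    return LLM(model=model, **engine_kwargs(vendor))
```

For example, `build_llm("Ascend")` would pass `enforce_eager=True`, while `build_llm("NVIDIA")` leaves vLLM's default execution mode untouched.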
Model Compatibility
In principle, vllm-plugin-FL can run any model available in vLLM, provided the model uses no unsupported operators. The following models have been verified end to end:
| Model | Status | Example |
|---|---|---|
| Qwen3.5-397B-A17B | Supported | |
| Qwen3-Next-80B-A3B | Supported | |
| Qwen3-4B | Supported | |
| MiniCPM-o 4.5 | Supported | |
| GLM-5 | Supported | |
| Qwen3.5-35B-A3B | Supported | |
| BAAI/bge-m3 | Supported | |