Low
osv
GHSA-hpv8-x276-m59f
vLLM Vulnerable to Remote DoS via Special-Token Placeholders
Published May 5, 2026
CVSS 3.1
Summary
This report explains a Token Injection vulnerability in vLLM’s multimodal processing. Unauthenticated, text-only prompts that spell out special tokens are interpreted as control tokens rather than as plain text. When image or video placeholder sequences are supplied without matching data, vLLM indexes into empty grids during input-position computation, raising an unhandled IndexError that terminates the worker or degrades availability. Multimodal paths that rely on image_grid_thw/video_grid_thw are affected. Severity: High (remote DoS). Reproduced on vLLM 0.10.0 with Qwen2.5-VL.
Details
- Affected component: multimodal input position computation.
- File/functions (paths are indicative):
  - vllm/model_executor/layers/rotary_embedding.py
    - get_input_positions_tensor(...)
    - _vl_get_input_positions_tensor(...)
- Failure mechanism:
  - The code counts detected vision tokens and then indexes video_grid_thw/image_grid_thw accordingly.
  - When user input carries placeholder tokens but no actual multimodal payload, these grids are empty. The code does not bounds-check before indexing.
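The failure mechanism above can be reproduced in isolation. The following is a minimal sketch in plain Python, not vLLM's code: the token ids and the function name `collect_video_grids` are hypothetical, and the grids are modelled as plain lists of (t, h, w) tuples.

```python
# Hypothetical token ids, chosen only for illustration.
VISION_START_ID = 1
VIDEO_TOKEN_ID = 2


def collect_video_grids(input_tokens, video_grid_thw):
    """Mimic the unchecked pattern: count video placeholders, then index the grid."""
    # Tokens that directly follow a vision-start marker.
    vision_tokens = [
        nxt
        for cur, nxt in zip(input_tokens, input_tokens[1:])
        if cur == VISION_START_ID
    ]
    video_nums = sum(1 for tok in vision_tokens if tok == VIDEO_TOKEN_ID)

    grids = []
    for video_index in range(video_nums):
        # No bounds check: raises IndexError when the grid has fewer
        # entries than the prompt has video placeholders.
        t, h, w = video_grid_thw[video_index]
        grids.append((t, h, w))
    return grids


if __name__ == "__main__":
    prompt = [10, VISION_START_ID, VIDEO_TOKEN_ID, VIDEO_TOKEN_ID, 11]
    print(collect_video_grids(prompt, [(1, 2, 2)]))  # payload present
    try:
        collect_video_grids(prompt, [])  # placeholder without payload
    except IndexError as exc:
        print("IndexError:", exc)
```

The placeholder count comes from the prompt while the grid length comes from the attached media, so a text-only request controls the first and leaves the second at zero.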
Representative snippet (context):
# vllm/model_executor/layers/rotary_embedding.py
@classmethod
def _vl_get_input_positions_tensor(
    cls,
    input_tokens,
    hf_config,
    image_grid_thw,
    video_grid_thw,
    ...,
):
    # detect video tokens
    video_nums = (vision_tokens == video_token_id).sum()
    # later in processing
    t, h, w = (
        video_grid_thw[video_index][0],  # IndexError if no video data
        video_grid_thw[video_index][1],
        video_grid_thw[video_index][2],
    )
Abbreviated call path:
OpenAI API request
→ vllm.v1.engine.core: step/execute_model
→ vllm.v1.worker.gpu_model_runner: _update_states/execute_model
→ vllm.model_executor.layers.rotary_embedding: get_input_positions_tensor
→ _vl_get_input_positions_tensor
→ IndexError: list index out of range
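Because the exception surfaces deep in the worker, one way to contain it is to reject mismatched requests before position computation runs. The sketch below illustrates that idea under stated assumptions; it is not the project's actual patch. The helper name `check_placeholder_counts` and its keyword parameters are hypothetical, and the grids are assumed to be lists (or None) rather than tensors.

```python
def check_placeholder_counts(
    input_tokens,
    image_grid_thw,
    video_grid_thw,
    *,
    vision_start_id,
    image_token_id,
    video_token_id,
):
    """Raise ValueError when placeholders outnumber the supplied grid entries."""
    # Tokens that directly follow a vision-start marker.
    following = [
        nxt
        for cur, nxt in zip(input_tokens, input_tokens[1:])
        if cur == vision_start_id
    ]
    image_nums = following.count(image_token_id)
    video_nums = following.count(video_token_id)

    # Treat a missing grid as empty (assumption: None means "no payload").
    image_entries = len(image_grid_thw) if image_grid_thw is not None else 0
    video_entries = len(video_grid_thw) if video_grid_thw is not None else 0

    if image_nums > image_entries:
        raise ValueError(
            f"{image_nums} image placeholder(s) but {image_entries} image grid entries"
        )
    if video_nums > video_entries:
        raise ValueError(
            f"{video_nums} video placeholder(s) but {video_entries} video grid entries"
        )
    return image_nums, video_nums
```

Raising a ValueError at validation time lets the serving layer return a client error for the single offending request instead of propagating an unhandled IndexError through the engine.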
PoC
Environment
- vLLM: 0.10.0
- Model: Qwen/Qwen2.5-VL-3B-Instruct
- Launch server:
python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen2.5-VL-3B-Instruct \
  --port 8000
Request (text-only, no image/video data)
cat > request.json
Affected AI Products
vllm