VulnWatch
Severity: Medium · Source: GitHub · GHSA-fj2m-qvh9-jq4q

local-deep-research is Vulnerable to HTML Injection via Unescaped User Input in PDF Export (`pdf_service.py:_markdown_to_html`)

Published May 11, 2026 · CVSS 5.0

Summary

PDFService._markdown_to_html() constructs an HTML document by interpolating user-controlled values directly into an f-string without any HTML escaping. The affected values are title (sourced from research.title or research.query) and the metadata key-value pairs. An authenticated attacker can craft a research query containing HTML special characters to inject arbitrary HTML tags into the document that WeasyPrint processes during PDF export. This injection can be chained into a Server-Side Request Forgery (SSRF) that bypasses the application's existing SSRF defenses in ssrf_validator.py.


Details

Vulnerable code: src/local_deep_research/web/services/pdf_service.py, lines 171–176

# pdf_service.py:171-176
if title:
    html_parts.append(f"{title}")   # ← title is not escaped

if metadata:
    for key, value in metadata.items():
        html_parts.append(f'')  # ← key/value are not escaped
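One possible remediation, sketched under assumptions, is to pass every interpolated value through `html.escape` from the Python standard library before it reaches the f-string. The function name `build_head_parts` and the `<title>`/`<meta>` wrapper markup below are illustrative placeholders, not the project's actual code or its official patch.

```python
from html import escape


def build_head_parts(title=None, metadata=None):
    """Hypothetical stand-in for the vulnerable section of _markdown_to_html."""
    html_parts = []

    if title:
        # escape() rewrites &, <, > and (by default) both quote characters
        # as entities, so the value can no longer close or open a tag.
        html_parts.append(f"<title>{escape(str(title))}</title>")

    if metadata:
        for key, value in metadata.items():
            # Both the key and the value land inside quoted attributes,
            # so both need quote-aware escaping.
            html_parts.append(
                f'<meta name="{escape(str(key))}" content="{escape(str(value))}">'
            )

    return html_parts


if __name__ == "__main__":
    # Hypothetical hostile inputs, for illustration only.
    for part in build_head_parts(
        title='</title><img src="http://attacker.example/">',
        metadata={"note": '" /><b>x</b>'},
    ):
        print(part)
```

With escaping in place, the hostile characters arrive in the generated document as inert entities rather than markup.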

Data flow trace:

User input: research.query
        │
        ▼
research_routes.py:1321
  pdf_title = research.title or research.query
        │
        ▼
research_routes.py:1325-1326
  export_report_to_memory(report_content, format, title=pdf_title)
        │
        ▼
pdf_service.py:107
  PDFService.markdown_to_pdf(markdown_content, title=pdf_title)
        │
        ▼
pdf_service.py:137
  _markdown_to_html(markdown_content, title, metadata)
        │
        ▼
pdf_service.py:172
  f"{title}"   ← injection point, no escaping
        │
        ▼
pdf_service.py:112
  HTML(string=html_content)   ← WeasyPrint renders the injected HTML

research.query is a string submitted by the user via POST /api/start_research, stored as-is in the database, and retrieved without any sanitization. When the user triggers POST /api/v1/research//export/pdf, this value is embedded unescaped into the HTML document processed by WeasyPrint.
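The flow described above can be reproduced in miniature. The following is a simplified sketch, not the application's code: the `Research` dataclass, the `render_title_unsafely` helper, the `<title>` wrapper, and the payload are all hypothetical stand-ins chosen to show that an f-string passes stored input through verbatim.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Research:
    """Hypothetical stand-in for the stored research record."""
    query: str
    title: Optional[str] = None


def render_title_unsafely(research: Research) -> str:
    # Mirrors `pdf_title = research.title or research.query` from the trace.
    pdf_title = research.title or research.query
    # The wrapper tag is a placeholder; the unescaped interpolation is the point.
    return f"<title>{pdf_title}</title>"


# Hypothetical payload: no title is set, so the raw query becomes the title.
record = Research(query='</title><img src="http://attacker.example/probe">')
html_out = render_title_unsafely(record)
print(html_out)
```

Because nothing between storage and rendering transforms the string, whatever markup the query contains becomes part of the document handed to the renderer.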

Injection point 1: `` tag breakout

Input:    
Rendered: 

When WeasyPrint encounters the injected `` tag, it issues an HTTP GET request to the URL in the tag's src attribute by default.
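As a defense-in-depth idea alongside escaping, the renderer's resource loading could be restricted so that an injected tag cannot reach the network. The sketch below assumes WeasyPrint's `url_fetcher` hook on `HTML(...)` and its `default_url_fetcher` helper, both of which should be checked against the installed WeasyPrint version. The names `ALLOWED_SCHEMES`, `is_fetch_allowed`, and `restricted_url_fetcher` are hypothetical, and the allowlist of `data:` URIs only is an assumption about what exported reports need.

```python
from urllib.parse import urlparse

# Assumption: exported reports only need inline data: URIs, never network fetches.
ALLOWED_SCHEMES = {"data"}


def is_fetch_allowed(url: str) -> bool:
    """Return True only when the URL scheme is on the allowlist."""
    return urlparse(url).scheme.lower() in ALLOWED_SCHEMES


def restricted_url_fetcher(url, *args, **kwargs):
    """Hypothetical fetcher that refuses any resource outside the allowlist."""
    if not is_fetch_allowed(url):
        raise ValueError(f"Blocked resource fetch: {url!r}")
    # Imported lazily so the allowlist logic can be exercised without WeasyPrint.
    from weasyprint import default_url_fetcher
    return default_url_fetcher(url, *args, **kwargs)
```

Such a function would be supplied at render time, for example `HTML(string=html_content, url_fetcher=restricted_url_fetcher)`. The trade-off is that legitimately remote images in a report would also stop loading.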

Injection point 2: `` attribute breakout

Input:    " />
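Attribute contexts need quote-aware escaping, because a bare double quote is enough to end the attribute value. The comparison below is a minimal sketch. The `<meta>` template and the payload are hypothetical, and `html.escape` is the standard-library function, whose `quote` parameter defaults to `True`.

```python
from html import escape

# Hypothetical payload that starts by closing the attribute value.
payload = '" /><b>injected</b>'

# No escaping: the quote ends the attribute and the tag markup is live.
unsafe = f'<meta name="note" content="{payload}">'

# quote=False neutralises < and > but leaves the double quote intact,
# so the attribute value is still terminated early.
text_only = f'<meta name="note" content="{escape(payload, quote=False)}">'

# Default escaping (quote=True) also converts the quote to &quot;.
attr_safe = f'<meta name="note" content="{escape(payload)}">'

for label, rendered in [("unsafe", unsafe), ("text_only", text_only), ("attr_safe", attr_safe)]:
    print(f"{label}: {rendered}")
```

Only the last variant keeps the whole payload inside the attribute value.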

Affected AI Products

ollama llama