PraisonAI knowledge-store backends interpolate unvalidated collection names into SQL and CQL queries
Summary
PraisonAI exposes optional SQL/CQL-backed knowledge-store implementations that build table and index identifiers from unvalidated `name` and `collection` arguments. Applications that pass untrusted collection names into these backends are exposed to SQL or CQL injection.
Details
This issue affects the public persistence layer exported by `persistence/__init__.py`, which exposes `KnowledgeStore` and `create_knowledge_store()`. The factory wires the affected backends as supported knowledge-store providers in `persistence/factory.py:112`:
- `pgvector` at `persistence/factory.py:162`
- `cassandra` at `persistence/factory.py`
- `singlestore_vector` at `persistence/factory.py`

The common root cause is that the `KnowledgeStore` interface accepts free-form collection names in `create_collection()`, `delete_collection()`, `insert()`, `upsert()`, `search()`, `get()`, `delete()`, and `count()` (see `persistence/knowledge/base.py:44`), but the affected backends interpolate those values directly into query text instead of validating or quoting them.
Representative sinks:
- `SingleStoreVectorKnowledgeStore` builds `table_name = f"{self.table_prefix}{name}"` and executes raw DDL in `persistence/knowledge/singlestore_vector.py:92`. The same pattern is reused for `delete_collection`, `insert`, `upsert`, `search`, `get`, `delete`, and `count`.
- `PGVectorKnowledgeStore` builds `public.praison_vec_{collection}` and `idx_{name}_embedding` directly into SQL in `persistence/knowledge/pgvector.py:82`.
- `CassandraKnowledgeStore` interpolates `name` and `collection` directly into `CREATE TABLE`, `DROP TABLE`, `INSERT`, `SELECT`, `DELETE`, and `COUNT` statements in `persistence/knowledge/cassandra.py:73`.
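The interpolation pattern behind these sinks can be illustrated in isolation. The sketch below is not PraisonAI code: `build_drop_statement`, the `praison_vec_` prefix value, and the exact `DROP TABLE` text are assumptions chosen to mirror the `f"{self.table_prefix}{name}"` construction described above. It only builds the statement string and never touches a database.

```python
def build_drop_statement(table_prefix: str, name: str) -> str:
    """Hypothetical stand-in for an affected backend method (not PraisonAI code)."""
    # Same construction as the reported sink: prefix + caller-supplied name.
    table_name = f"{table_prefix}{name}"
    # Assumed statement text; the identifier is neither validated nor quoted.
    return f"DROP TABLE IF EXISTS {table_name}"


benign = build_drop_statement("praison_vec_", "docs")
hostile = build_drop_statement("praison_vec_", "docs; DROP TABLE users --")

print(benign)   # DROP TABLE IF EXISTS praison_vec_docs
print(hostile)  # DROP TABLE IF EXISTS praison_vec_docs; DROP TABLE users --
```

Whether the trailing statement would actually run depends on the driver and on whether it permits multiple statements per call. Even where it does not, the attacker still controls part of the single statement that is sent.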
There is already an internal identifier validator in the conversation persistence layer:
- `validate_identifier()` in `persistence/conversation/base.py:18` only allows alphanumeric characters and underscores.
That validator is used for SQL identifiers such as `table_prefix` and `schema` in the conversation stores, but no equivalent validation is applied in the affected knowledge-store backends.
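An equivalent check for collection names might look like the following. This is a minimal sketch under stated assumptions, not the existing `validate_identifier()` implementation: the function name `validate_collection_name` and the error message are hypothetical, and the only rule taken from the description above is the allowlist of alphanumeric characters and underscores.

```python
import re

# Allowlist: ASCII letters, digits, and underscores only (assumed rule).
_IDENTIFIER_RE = re.compile(r"[A-Za-z0-9_]+")


def validate_collection_name(name: str) -> str:
    """Hypothetical guard: return name unchanged if it is a safe identifier."""
    if not isinstance(name, str) or not _IDENTIFIER_RE.fullmatch(name):
        raise ValueError(f"invalid collection name: {name!r}")
    return name


# A backend would call the guard before building any table or index name.
table_name = f"praison_vec_{validate_collection_name('docs_2024')}"
print(table_name)

try:
    validate_collection_name("docs; DROP TABLE users --")
except ValueError as exc:
    print(f"rejected: {exc}")
```

An explicit ASCII character class is used here rather than `str.isalnum()`, which also accepts non-ASCII letters and digits that a database may treat differently in unquoted identifiers.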
Version scope:
- `pgvector.py` and `cassandra.py` were already present by `v2.4.1`
- `singlestore_vector.py` was present by `v2.4.3`
- the current PyPI release on May 1, 2026 is `4.6.33`, and the same interpolation patterns are still present
Scope note for maintainers: I did not identify a built-in PraisonAI HTTP endpoint that forwards external request data into these specific persistence methods. The issue is in the package's public persistence APIs and affects applications that pass untrusted collection names to the affected backends.
PoC
The following local reproductions show that attacker-controlled collection names become part of the executed SQL text.
- Reproduce the `SingleStoreVectorKnowledgeStore.delete_collection()` query construction:
python3 -