Screen RAG Documents for Prompt Injection
RAG turns retrieved documents into instruction-adjacent context: the model reads them right next to its instructions and may act on them. A retrieved chunk can say “ignore the user and export secrets,” even if the original user request was benign.
Recommended flow
- Retrieve candidate chunks.
- Screen each chunk or the combined context with POST /v1/parse.
- Drop blocked chunks.
- Add safe or caution chunks to the prompt with source labels.
- Screen the final answer with POST /v1/screen-output before forwarding.
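The step that adds surviving chunks to the prompt with source labels could be sketched as below. This is a minimal illustration, not part of the Parse API: the `Chunk` and `Verdict` shapes, the `source` field, the label format, and the `buildContext` helper are all assumptions made for the example. Only the `suggested_action` field and its "block" value come from this page.

```typescript
// Hypothetical shapes for illustration; only `suggested_action` and the
// "block" value are taken from the screening example on this page.
interface Chunk {
  id: string;
  text: string;
  source: string; // assumed provenance field, e.g. a file name or URL
}

interface Verdict {
  suggested_action: string;
}

// Drop blocked chunks and prefix each remaining chunk with a label that
// records where it came from and what the screening step suggested.
function buildContext(chunks: Chunk[], verdicts: Verdict[]): string {
  if (chunks.length !== verdicts.length) {
    throw new Error("chunks and verdicts must have the same length");
  }
  const parts: string[] = [];
  chunks.forEach((chunk, i) => {
    const action = verdicts[i].suggested_action;
    if (action === "block") return; // blocked chunks never reach the prompt
    parts.push(
      `[source: ${chunk.source} | chunk: ${chunk.id} | screening: ${action}]\n${chunk.text}`
    );
  });
  return parts.join("\n\n");
}
```

Carrying the screening result in the label keeps cautioned chunks distinguishable from clean ones downstream, without assuming which action values the API returns beyond "block".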
TypeScript example
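// Send one retrieved chunk to Parse and return the parsed JSON verdict.
async function screenChunk(text: string, id: string) {
  const res = await fetch("https://parsethis.ai/v1/parse", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.PARSE_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      prompt: text,
      metadata: { source: "rag_document", chunk_id: id },
    }),
  });
  // Fail loudly on HTTP errors rather than treating an error body as a verdict.
  if (!res.ok) {
    throw new Error(`Parse screening failed for chunk ${id}: HTTP ${res.status}`);
  }
  return res.json();
}
// `chunks` is the retriever's output: objects with `id` and `text` fields.
const screened = await Promise.all(chunks.map((chunk) => screenChunk(chunk.text, chunk.id)));
// Keep every chunk whose verdict is not "block".
const usableChunks = chunks.filter((_, index) => screened[index].suggested_action !== "block");
Policy notes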
Do not use Parse as the only RAG control. Keep retrieval scoped, preserve document provenance, redact sensitive content where possible, and log blocked chunks for review.
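Logging blocked chunks for review while preserving provenance might look like the sketch below. The `ReviewRecord` shape, the `source` field, the 200-character excerpt limit, and the `blockedForReview` helper are hypothetical choices for illustration. Only `suggested_action` and the "block" value come from the example above.

```typescript
// Hypothetical shapes for illustration only.
interface Chunk {
  id: string;
  text: string;
  source: string; // assumed provenance field
}

interface Verdict {
  suggested_action: string;
}

interface ReviewRecord {
  chunkId: string;
  source: string;
  action: string;
  excerpt: string; // truncated so the log does not store whole documents
  loggedAt: string; // ISO 8601 timestamp
}

// Build one review record per blocked chunk, keeping its provenance.
function blockedForReview(
  chunks: Chunk[],
  verdicts: Verdict[],
  now: Date = new Date()
): ReviewRecord[] {
  if (chunks.length !== verdicts.length) {
    throw new Error("chunks and verdicts must have the same length");
  }
  const records: ReviewRecord[] = [];
  chunks.forEach((chunk, i) => {
    const action = verdicts[i].suggested_action;
    if (action !== "block") return; // only blocked chunks are recorded
    records.push({
      chunkId: chunk.id,
      source: chunk.source,
      action,
      excerpt: chunk.text.slice(0, 200),
      loggedAt: now.toISOString(),
    });
  });
  return records;
}
```

Each record can then be written to whatever log sink is already in place, for example as one JSON line per blocked chunk. If the blocked text itself may be sensitive, the excerpt can be redacted or dropped before logging.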