Architecture Pattern

🧠 Knowledge Agent (No Embeddings)

Replace vector databases and embeddings with a filesystem and bash running in a Vercel Sandbox. The agent uses grep, find, and cat for deterministic retrieval. Per-query cost dropped from about $1.00 to about $0.25.

The Problem with Traditional RAG

⚠️ Expensive Embedding Pipeline

Generating and storing embeddings for every document chunk costs money and time. Re-embedding whenever content changes is a maintenance burden.

⚠️ Semantic Mismatch

Vector similarity can return semantically 'close' but factually irrelevant chunks. Precision is lower than exact text matching for many use cases.

⚠️ Opaque Retrieval

It is hard to debug why certain chunks were retrieved. The embedding → similarity → ranking pipeline offers little transparency.

⚠️ Infrastructure Overhead

A vector database such as Pinecone, Qdrant, or Weaviate adds another service to manage. Cost scales with document count and query volume.

The Sandbox Solution

Instead of embeddings, give the AI agent access to a Vercel Sandbox with your knowledge base as files. The agent uses standard Unix tools to search and retrieve relevant content:

find . -name '*.md' -type f → Discover available documentation files

grep -rn 'fluid compute' docs/ → Search for specific terms across all files

cat docs/platform/compute.md → Read full file content for context

head -50 docs/api-reference.md → Preview the beginning of large files

wc -l docs/**/*.md → Understand document structure and size (in bash, ** only recurses into subdirectories when globstar is enabled with shopt -s globstar)
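One way to keep retrieval read-only is to check each command against an allowlist before running it. The sketch below is a local stand-in built on Python's subprocess module, not the Vercel Sandbox API. The names ALLOWED_TOOLS, BLOCKED_FIND_FLAGS, parse_command, is_allowed, and run_readonly are hypothetical, and the blocked-flag list is illustrative rather than exhaustive.

```python
import shlex
import subprocess

# Read-only tools taken from the command list above; anything else is refused.
ALLOWED_TOOLS = {"find", "grep", "cat", "head", "wc"}
# find flags that can modify files or spawn other programs (not exhaustive).
BLOCKED_FIND_FLAGS = {"-exec", "-execdir", "-ok", "-okdir", "-delete"}


def parse_command(command: str) -> list[str]:
    """Split a command string into argv, raising ValueError if it is not allowed."""
    argv = shlex.split(command)  # raises ValueError on unbalanced quotes
    if not argv:
        raise ValueError("empty command")
    if argv[0] not in ALLOWED_TOOLS:
        raise ValueError(f"tool not allowed: {argv[0]}")
    if argv[0] == "find" and BLOCKED_FIND_FLAGS.intersection(argv[1:]):
        raise ValueError("find flag not allowed")
    return argv


def is_allowed(command: str) -> bool:
    """Return True if the command passes the allowlist checks."""
    try:
        parse_command(command)
        return True
    except ValueError:
        return False


def run_readonly(command: str, cwd: str, timeout_s: float = 5.0) -> str:
    """Run an allowed command inside cwd and return its standard output.

    No shell is involved, so globs such as docs/**/*.md are passed literally
    and pipes or redirects have no effect.
    """
    argv = parse_command(command)
    result = subprocess.run(
        argv, cwd=cwd, capture_output=True, text=True, timeout=timeout_s
    )
    return result.stdout
```

For example, run_readonly("grep -rn 'fluid compute' docs/", cwd="/path/to/knowledge-base") would return the matching lines, while a command starting with rm raises ValueError before anything runs. The knowledge-base path here is a placeholder.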

💰 Cost Comparison

❌ Embeddings RAG: ~$1.00/query

Embedding generation: $0.02
Vector DB query: $0.10
LLM with retrieved chunks: $0.88
Plus: monthly vector DB hosting

✅ Sandbox Agent: ~$0.25/query

Sandbox execution: $0.05
grep/cat retrieval: $0.00
LLM with exact text: $0.20
No external DB hosting needed
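The arithmetic behind the two totals can be checked with a short script. The per-query figures are copied from the breakdown above; savings_for and the query volume passed to it are hypothetical, and monthly vector DB hosting is left out because the comparison gives no figure for it.

```python
from decimal import Decimal

# Per-query figures copied from the comparison above (USD).
embeddings_rag = {
    "embedding generation": Decimal("0.02"),
    "vector DB query": Decimal("0.10"),
    "LLM with retrieved chunks": Decimal("0.88"),
}
sandbox_agent = {
    "sandbox execution": Decimal("0.05"),
    "grep/cat retrieval": Decimal("0.00"),
    "LLM with exact text": Decimal("0.20"),
}

rag_total = sum(embeddings_rag.values(), Decimal("0"))
agent_total = sum(sandbox_agent.values(), Decimal("0"))
savings_per_query = rag_total - agent_total
savings_pct = savings_per_query / rag_total * 100


def savings_for(queries: int) -> Decimal:
    """Total savings in USD for a given number of queries (hosting excluded)."""
    return savings_per_query * queries


print(f"Embeddings RAG: ${rag_total} per query")
print(f"Sandbox agent:  ${agent_total} per query")
print(f"Savings:        ${savings_per_query} per query ({savings_pct:.0f}%)")
```

The components sum to $1.00 and $0.25, so the sandbox approach saves $0.75 per query, a 75% reduction.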

When to Use Each Approach

Scenario	Sandbox Agent	Embeddings RAG
Structured docs (Markdown, code)	✅ grep is exact	⚠️ Embeddings lose structure
Small-medium corpus (< 10K files)	✅ File system scales fine	⚠️ Overkill for small sets
Exact keyword search needed	✅ Deterministic results	❌ Semantic only; can miss exact matches
Millions of unstructured docs	❌ File system too slow	✅ Vector index scales
Semantic similarity required	⚠️ Keyword-based	✅ Meaning-based retrieval
Multi-modal (images, PDFs)	❌ Text files only	✅ Multi-modal embeddings

Architecture Flow

User asks a question

Natural language query sent to the AI agent.

Agent decides search strategy

The LLM determines which files and directories to search based on the query, then issues tool calls that run grep, find, and cat.

Sandbox executes commands

Vercel Sandbox runs the bash commands in an isolated environment against the knowledge base files.

Agent reads results

Search results (file paths, matching lines, file contents) returned to the LLM context.

Agent answers with citations

The LLM generates an answer that cites exact file paths and line numbers, so every claim is traceable to its source.
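The five steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the implementation from the original post: Citation, Step, parse_grep_output, and run_agent are hypothetical names, decide stands in for the LLM's tool-calling step, and execute stands in for whatever runs commands in the sandbox. Citations are collected from grep -rn output, which prints each hit as path:line:text.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Citation:
    path: str
    line: int
    text: str


@dataclass
class Step:
    command: Optional[str] = None  # next command to run in the sandbox, or
    answer: Optional[str] = None   # the final answer text


# grep -rn prints "path:line:matched text" for every hit.
GREP_LINE = re.compile(r"^(?P<path>[^:]+):(?P<line>\d+):(?P<text>.*)$")


def parse_grep_output(output: str) -> list[Citation]:
    """Turn grep -rn output into citations; lines in other formats are skipped."""
    citations = []
    for raw in output.splitlines():
        match = GREP_LINE.match(raw)
        if match:
            citations.append(
                Citation(match["path"], int(match["line"]), match["text"].strip())
            )
    return citations


def run_agent(
    question: str,
    decide: Callable[[str, list[tuple[str, str]]], Step],  # stand-in for the LLM
    execute: Callable[[str], str],                         # stand-in for the sandbox
    max_steps: int = 5,
) -> str:
    """Alternate between choosing a command and running it until an answer appears."""
    transcript: list[tuple[str, str]] = []  # (command, output) pairs shown to the LLM
    citations: list[Citation] = []
    for _ in range(max_steps):
        step = decide(question, transcript)
        if step.answer is not None:
            refs = ", ".join(f"{c.path}:{c.line}" for c in citations)
            return f"{step.answer} [{refs}]" if refs else step.answer
        if step.command is None:
            raise ValueError("decide() returned neither a command nor an answer")
        output = execute(step.command)
        transcript.append((step.command, output))
        citations.extend(parse_grep_output(output))
    return "No answer within the step budget."
```

Passing decide and execute as plain callables keeps the loop independent of any particular model provider or sandbox SDK, and the max_steps cap bounds how many commands a single question can trigger.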
