🧠 Knowledge Agent (No Embeddings)
Replace vector databases + embeddings with filesystem + bash in Vercel Sandbox. The agent uses grep, find, and cat for deterministic retrieval. Estimated cost dropped from about $1.00 to about $0.25 per query.
The Problem with Traditional RAG
⚠️ Expensive Embedding Pipeline
Generating + storing embeddings for every document chunk costs money and time. Re-embedding on content changes is a maintenance burden.
⚠️ Semantic Mismatch
Vector similarity can return semantically 'close' but factually irrelevant chunks. Precision is lower than exact text matching for many use cases.
⚠️ Opaque Retrieval
Hard to debug why certain chunks were retrieved. No transparency in the embedding → similarity → ranking pipeline.
⚠️ Infrastructure Overhead
Pinecone/Qdrant/Weaviate adds another service to manage. Cost scales with document count and query volume.
The Sandbox Solution
Instead of embeddings, give the AI agent access to a Vercel Sandbox with your knowledge base as files. The agent uses standard Unix tools to search and retrieve relevant content:
- `find . -name '*.md' -type f` → Discover available documentation files
- `grep -rn 'fluid compute' docs/` → Search for specific terms across all files
- `cat docs/platform/compute.md` → Read full file content for context
- `head -50 docs/api-reference.md` → Preview the beginning of large files
- `wc -l docs/**/*.md` → Understand document structure and size
💰 Cost Comparison
❌ Embeddings RAG: ~$1.00/query
- Embedding generation: $0.02
- Vector DB query: $0.10
- LLM with retrieved chunks: $0.88
- Plus: monthly vector DB hosting
✅ Sandbox Agent: ~$0.25/query
- Sandbox execution: $0.05
- grep/cat retrieval: $0.00
- LLM with exact text: $0.20
- No external DB hosting needed
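The totals above can be checked with a few lines of arithmetic. This is a minimal sketch that only re-adds the per-query figures listed in the comparison; the dictionary and function names are illustrative, and amounts are kept in integer cents to avoid floating-point rounding. Monthly vector DB hosting is left out because the comparison gives no figure for it.

```python
# Per-query cost components copied from the comparison, in US cents.
EMBEDDINGS_RAG = {
    "embedding generation": 2,
    "vector DB query": 10,
    "LLM with retrieved chunks": 88,
}
SANDBOX_AGENT = {
    "sandbox execution": 5,
    "grep/cat retrieval": 0,
    "LLM with exact text": 20,
}


def total_cents(components: dict[str, int]) -> int:
    """Sum the per-query cost components."""
    return sum(components.values())


rag = total_cents(EMBEDDINGS_RAG)        # 100 cents
agent = total_cents(SANDBOX_AGENT)       # 25 cents
savings_pct = 100 * (rag - agent) / rag  # 75.0

print(f"RAG: ${rag / 100:.2f}  Sandbox: ${agent / 100:.2f}  Savings: {savings_pct:.0f}%")
```

Of the 75 cents saved per query, 68 come from the LLM line ($0.88 down to $0.20), so most of the reduction is attributed to the model call rather than to removing the vector database.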
When to Use Each Approach
| Scenario | Sandbox Agent | Embeddings RAG |
|---|---|---|
| Structured docs (Markdown, code) | ✅ grep is exact | ⚠️ Embeddings lose structure |
| Small-medium corpus (< 10K files) | ✅ File system scales fine | ⚠️ Overkill for small sets |
| Exact keyword search needed | ✅ Deterministic results | ❌ Semantic only, can miss exact matches |
| Millions of unstructured docs | ❌ File system too slow | ✅ Vector index scales |
| Semantic similarity required | ⚠️ Keyword-based | ✅ Meaning-based retrieval |
| Multi-modal (images, PDFs) | ❌ Text files only | ✅ Multi-modal embeddings |
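The table can be read as a small decision rule. The sketch below is one hypothetical encoding of it, not a recommendation from the source: the 10,000-file cutoff comes from the table, the 1,000,000 cutoff is an assumed reading of "millions", and the "benchmark both" answer for corpus sizes in between is an assumption, since the table does not cover that range.

```python
def suggest_approach(file_count: int, needs_semantic: bool = False,
                     multimodal: bool = False) -> str:
    """Map the table's scenarios to a suggestion (illustrative only)."""
    # Rows "Semantic similarity required" and "Multi-modal" favor embeddings.
    if multimodal or needs_semantic:
        return "embeddings"
    # Row "Small-medium corpus (< 10K files)" favors the sandbox agent.
    if file_count < 10_000:
        return "sandbox"
    # Row "Millions of unstructured docs" favors a vector index
    # (1,000,000 is an assumed threshold for "millions").
    if file_count >= 1_000_000:
        return "embeddings"
    # The table is silent between those sizes; measuring is the safe answer.
    return "benchmark both"
```

For example, `suggest_approach(500)` returns `"sandbox"`, while `suggest_approach(500, needs_semantic=True)` returns `"embeddings"`.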
Architecture Flow
User asks a question
Natural language query sent to the AI agent.
Agent decides search strategy
The LLM determines which files and directories to search based on the query, then issues tool calls for grep, find, and cat.
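One way to handle the command the model proposes is to parse it into an argument list and check the program name before anything runs. This is a minimal sketch under assumptions: the allow-list and the function name `parse_tool_call` are hypothetical, and the source does not say how its agent validates tool calls.

```python
import shlex

# Hypothetical allow-list: the read-only tools named in this article.
ALLOWED_COMMANDS = {"grep", "find", "cat", "head", "wc"}


def parse_tool_call(command: str) -> list[str]:
    """Split a proposed command into argv and reject unknown programs."""
    argv = shlex.split(command)  # honors quotes, e.g. 'fluid compute' stays one argument
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        raise ValueError(f"command not allowed: {command!r}")
    return argv
```

Returning an argument list means the command can later be executed without a shell, so characters such as `;` or `|` are passed as literal arguments instead of being interpreted. A name check alone is only a first filter (for example, `find` accepts `-exec`), so the sandbox's isolation remains the real boundary.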
Sandbox executes commands
Vercel Sandbox runs the bash commands in an isolated environment against the knowledge base files.
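For local experimentation, the execution step can be approximated with Python's `subprocess` module. This is a stand-in, not the Vercel Sandbox API: the function name `run_in_workdir` is hypothetical, and setting a working directory provides no isolation, which is exactly what a real sandbox adds.

```python
import subprocess
from pathlib import Path


def run_in_workdir(argv: list[str], workdir: Path, timeout_s: float = 10.0) -> dict:
    """Run one command without a shell, using workdir as its working directory.

    Local stand-in only: this does not isolate the process the way a sandbox does.
    """
    completed = subprocess.run(
        argv,
        cwd=workdir,          # relative paths such as docs/ resolve against the knowledge base
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode output to str
        timeout=timeout_s,    # raises subprocess.TimeoutExpired if the command hangs
    )
    return {
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }
```

A call such as `run_in_workdir(["grep", "-rn", "fluid compute", "docs/"], Path("kb"))` would search a local `kb/` directory (a made-up path). Note that grep exits with status 1 when nothing matches, so a non-zero exit code is not always an error.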
Agent reads results
Search results (file paths, matching lines, file contents) are returned to the LLM context.
Agent answers with citations
The LLM generates an answer with exact file paths and line numbers as citations, so the answer is traceable to its sources.
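Citations of this kind fall out of the `grep -rn` output format, where each match is printed as `path:line:text`. The sketch below parses that format into structured citations; the `Citation` type and function names are hypothetical, and it assumes file paths contain no colons.

```python
from typing import NamedTuple


class Citation(NamedTuple):
    path: str
    line: int
    text: str


def parse_grep_output(stdout: str) -> list[Citation]:
    """Parse `grep -rn` output lines of the form path:line:text."""
    citations = []
    for raw in stdout.splitlines():
        parts = raw.split(":", 2)  # at most 3 fields, so colons in the text survive
        if len(parts) != 3 or not parts[1].isdigit():
            continue  # skip anything that is not path:line:text
        path, line_no, text = parts
        citations.append(Citation(path, int(line_no), text.strip()))
    return citations


def format_citation(c: Citation) -> str:
    """Render a citation as path:line for inclusion in the answer."""
    return f"{c.path}:{c.line}"
```

Given an invented sample line such as `docs/platform/compute.md:12:## Fluid compute`, `parse_grep_output` yields one `Citation` and `format_citation` renders it as `docs/platform/compute.md:12`.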