How It Works
The optimization pipeline that powers forgetless.
Pipeline
1. Content Ingestion
Your content (text, files, conversations) is collected and prepared. Files are read lazily - only when needed.
2. Smart Chunking
Content is split into semantic chunks that respect natural boundaries - paragraphs, functions, headers, or messages.
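As a sketch of what boundary-respecting chunking can look like, here is a minimal paragraph-based chunker. The function name, the character-based size limit, and the merge strategy are illustrative assumptions, not forgetless's actual API; the real chunker also understands functions, headers, and messages, and works in tokens rather than characters.

```rust
/// Split raw text into chunks at blank-line (paragraph) boundaries,
/// merging consecutive paragraphs until a chunk would exceed `max_chars`.
fn chunk_by_paragraphs(text: &str, max_chars: usize) -> Vec<String> {
    let mut chunks: Vec<String> = Vec::new();
    let mut current = String::new();
    for para in text.split("\n\n").filter(|p| !p.trim().is_empty()) {
        // Start a new chunk if adding this paragraph would overflow the limit.
        if !current.is_empty() && current.len() + 2 + para.len() > max_chars {
            chunks.push(current.clone());
            current.clear();
        }
        if !current.is_empty() {
            current.push_str("\n\n");
        }
        current.push_str(para.trim());
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

fn main() {
    let text = "First paragraph.\n\nSecond paragraph.\n\nA third one.";
    println!("{:?}", chunk_by_paragraphs(text, 40));
}
```

Because splits only ever happen at paragraph boundaries, no chunk starts or ends mid-sentence.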
3. Local Embedding
Each chunk is embedded locally using the all-MiniLM-L6-v2 model. No external API calls are required.
4. Hybrid Scoring
Chunks are scored using priority, recency, semantic similarity, and position signals.
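One common way to combine several signals into a single score is a weighted sum. The sketch below does exactly that for the four signals listed here; the struct, field names, and weight values are made-up illustrations, not forgetless's internals.

```rust
/// One candidate chunk's raw signals, each normalized to 0.0..=1.0.
/// (Field names are illustrative, not forgetless's API.)
struct Signals {
    priority: f32, // user-defined importance level
    recency: f32,  // 1.0 = newest content, decaying toward 0.0
    semantic: f32, // similarity to the query embedding
    position: f32, // bonus for preserving context flow
}

/// Hedged sketch of a hybrid score as a weighted sum of the four
/// signals. The weights here are assumptions chosen for illustration.
fn hybrid_score(s: &Signals) -> f32 {
    0.35 * s.priority + 0.25 * s.recency + 0.30 * s.semantic + 0.10 * s.position
}

fn main() {
    let s = Signals { priority: 1.0, recency: 0.5, semantic: 0.8, position: 0.2 };
    println!("score = {}", hybrid_score(&s));
}
```

Because every signal is normalized to the same range, the weights directly express each signal's relative influence.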
5. Budget Selection
Top-scoring chunks are selected to fit within your token budget. Critical content is always preserved.
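A simple way to realize "top-scoring chunks within a budget, critical content always kept" is a greedy pass over chunks sorted by criticality and then score. This is a sketch under that assumption; the struct, field names, and the greedy strategy are illustrative, not forgetless's implementation.

```rust
/// A candidate chunk with its hybrid score and token count.
/// (Illustrative types; not forgetless's API.)
struct Scored {
    text: String,
    score: f32,
    tokens: usize,
    critical: bool, // critical chunks are always preserved
}

/// Keep every critical chunk, then fill the remaining budget with the
/// highest-scoring non-critical chunks (greedy sketch).
fn select_within_budget(mut chunks: Vec<Scored>, budget: usize) -> Vec<String> {
    chunks.sort_by(|a, b| {
        b.critical
            .cmp(&a.critical)
            .then(b.score.partial_cmp(&a.score).unwrap())
    });
    let mut used = 0;
    let mut kept = Vec::new();
    for c in chunks {
        if c.critical || used + c.tokens <= budget {
            used += c.tokens;
            kept.push(c.text);
        }
    }
    kept
}

fn main() {
    let chunks = vec![
        Scored { text: "A".into(), score: 0.9, tokens: 50, critical: false },
        Scored { text: "B".into(), score: 0.5, tokens: 60, critical: false },
        Scored { text: "C".into(), score: 0.1, tokens: 30, critical: true },
    ];
    println!("{:?}", select_within_budget(chunks, 90));
}
```

Note that critical chunks bypass the budget check entirely, which is one way to guarantee they are never dropped.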
6. Context Assembly
Selected chunks are assembled into coherent, optimized context ready for your LLM.
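One way to keep the assembled context coherent is to restore the chunks' original document order before joining them, regardless of the order selection produced. A minimal sketch (the index-tagging scheme and the separator are assumptions):

```rust
/// Re-order selected chunks by their original position and join them,
/// so the final context reads in document order. (Illustrative sketch.)
fn assemble(mut selected: Vec<(usize, String)>) -> String {
    selected.sort_by_key(|(idx, _)| *idx); // restore original order
    selected
        .into_iter()
        .map(|(_, text)| text)
        .collect::<Vec<_>>()
        .join("\n\n")
}

fn main() {
    let picked = vec![(2, "later chunk".to_string()), (0, "earlier chunk".to_string())];
    println!("{}", assemble(picked));
}
```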
Scoring Signals
```rust
Priority  // User-defined importance levels
Recency   // Recent content scored higher
Semantic  // Query similarity via embeddings
Position  // Context flow preservation
```
Result
```rust
Input:       1,847,291 tokens
Output:      127,843 tokens
Compression: 14.5x
```