How It Works

The optimization pipeline that powers forgetless.

Pipeline

1. Content Ingestion: Your content (text, files, conversations) is collected and prepared. Files are read lazily, only when needed.

2. Smart Chunking: Content is split into semantic chunks that respect natural boundaries such as paragraphs, functions, headers, or messages.
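A minimal sketch of boundary-aware chunking, assuming the simplest case (paragraph boundaries only; the real chunker also understands functions, headers, and messages, and the size limit here is illustrative):

```rust
/// Split text into chunks at paragraph boundaries (blank lines),
/// merging consecutive paragraphs until a rough size limit is reached.
fn chunk_paragraphs(text: &str, max_chars: usize) -> Vec<String> {
    let mut chunks: Vec<String> = Vec::new();
    let mut current = String::new();
    for para in text.split("\n\n").map(str::trim).filter(|p| !p.is_empty()) {
        // Start a new chunk if adding this paragraph would exceed the limit.
        if !current.is_empty() && current.len() + para.len() + 2 > max_chars {
            chunks.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push_str("\n\n");
        }
        current.push_str(para);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

fn main() {
    let text = "First paragraph.\n\nSecond paragraph.\n\nThird.";
    // With a 20-character limit, each paragraph ends up in its own chunk.
    for chunk in chunk_paragraphs(text, 20) {
        println!("--\n{chunk}");
    }
}
```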

3. Local Embedding: Each chunk is embedded locally with the all-MiniLM-L6-v2 model. No external API calls are required.
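The model invocation itself is out of scope here, but once chunks are embedded (all-MiniLM-L6-v2 produces 384-dimensional vectors), semantic comparison reduces to cosine similarity. A self-contained sketch of that comparison:

```rust
/// Cosine similarity between two embedding vectors of equal length.
/// Returns 1.0 for identical directions, 0.0 for orthogonal ones.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0 // a zero vector has no direction to compare
    } else {
        dot / (norm_a * norm_b)
    }
}

fn main() {
    // Tiny 3-dimensional stand-ins for real 384-dimensional embeddings.
    let query = [1.0, 0.0, 1.0];
    let close = [0.9, 0.1, 0.8];
    let far = [0.0, 1.0, 0.0];
    println!("close: {}", cosine_similarity(&query, &close));
    println!("far:   {}", cosine_similarity(&query, &far));
}
```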

4. Hybrid Scoring: Chunks are scored using priority, recency, semantic similarity, and position signals.
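One way to combine the four signals is a weighted sum over values normalized to [0, 1]; the weights below are illustrative assumptions, not the values forgetless actually uses:

```rust
/// The four scoring signals for a chunk, each normalized to [0, 1].
struct Signals {
    priority: f32, // user-defined importance
    recency: f32,  // newer content scores higher
    semantic: f32, // query similarity from embeddings
    position: f32, // preserves context flow
}

/// Weighted sum of the signals. Example weights only; they sum to 1.0
/// so the hybrid score stays in [0, 1].
fn hybrid_score(s: &Signals) -> f32 {
    0.35 * s.priority + 0.25 * s.recency + 0.25 * s.semantic + 0.15 * s.position
}

fn main() {
    let s = Signals { priority: 1.0, recency: 0.5, semantic: 0.8, position: 0.2 };
    println!("score = {:.3}", hybrid_score(&s));
}
```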

5. Budget Selection: Top-scoring chunks are selected to fit within your token budget. Critical content is always preserved.
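Selection under a budget can be sketched as a greedy pass: critical chunks are taken unconditionally, then the highest-scoring remainder that still fits. This is an assumed strategy consistent with the description above, not necessarily the exact algorithm:

```rust
struct Chunk {
    text: String,
    tokens: usize,
    score: f32,
    critical: bool, // critical chunks are always kept
}

/// Greedy budget selection: critical chunks first, then remaining chunks
/// in descending score order while they fit the token budget.
fn select_within_budget(mut chunks: Vec<Chunk>, budget: usize) -> Vec<Chunk> {
    chunks.sort_by(|a, b| {
        b.critical
            .cmp(&a.critical)
            .then(b.score.total_cmp(&a.score))
    });
    let mut used = 0;
    let mut selected = Vec::new();
    for c in chunks {
        if c.critical || used + c.tokens <= budget {
            used += c.tokens;
            selected.push(c);
        }
    }
    selected
}

fn main() {
    let chunks = vec![
        Chunk { text: "a".into(), tokens: 50, score: 0.9, critical: false },
        Chunk { text: "b".into(), tokens: 60, score: 0.8, critical: false },
        Chunk { text: "c".into(), tokens: 10, score: 0.1, critical: true },
    ];
    // Budget of 70: "c" is critical, "a" fits, "b" no longer does.
    for c in select_within_budget(chunks, 70) {
        println!("kept: {}", c.text);
    }
}
```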

6. Context Assembly: Selected chunks are assembled into coherent, optimized context, ready for your LLM.
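Keeping the assembled context coherent mostly means restoring document order: selection ranks chunks by score, so assembly re-sorts them by original position before joining. A minimal sketch, assuming chunks carry their original index:

```rust
/// Reassemble selected chunks in original document order so the final
/// context reads coherently, then join them for the prompt.
/// Each entry is (original_index, chunk_text).
fn assemble(mut selected: Vec<(usize, String)>) -> String {
    // Selection may have reordered chunks by score; undo that here.
    selected.sort_by_key(|(idx, _)| *idx);
    selected
        .into_iter()
        .map(|(_, text)| text)
        .collect::<Vec<_>>()
        .join("\n\n")
}

fn main() {
    let ctx = assemble(vec![(3, "later section".into()), (1, "earlier section".into())]);
    println!("{ctx}");
}
```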

Scoring Signals

```rust
Priority  // User-defined importance levels
Recency   // Recent content scored higher
Semantic  // Query similarity via embeddings
Position  // Context flow preservation
```

Result

```rust
Input:       1,847,291 tokens
Output:        127,843 tokens
Compression:      14.5x
```