Chunking
Chunking is content-aware. `forgetless` picks a strategy from `ContentType`, then applies token targets, overlap, and deduplication before scoring begins.
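To make the target, overlap, and deduplication steps concrete, here is a minimal sketch of overlapping chunking. It is not `forgetless`'s implementation: the function name `chunk_with_overlap` is hypothetical, whitespace-separated words stand in for real tokens, and deduplication is assumed to mean dropping chunks with identical text.

```rust
use std::collections::HashSet;

/// Hypothetical helper: split `text` into chunks of at most `target` tokens.
/// Each chunk starts with the last `overlap` tokens of the previous chunk.
/// Tokens are approximated by whitespace-separated words (an assumption).
fn chunk_with_overlap(text: &str, target: usize, overlap: usize) -> Vec<String> {
    assert!(target > 0 && overlap < target, "overlap must be smaller than target");
    let tokens: Vec<&str> = text.split_whitespace().collect();
    let step = target - overlap; // how far the window advances each iteration
    let mut seen = HashSet::new();
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < tokens.len() {
        let end = (start + target).min(tokens.len());
        let chunk = tokens[start..end].join(" ");
        // Deduplication: keep only the first occurrence of identical chunk text.
        if seen.insert(chunk.clone()) {
            chunks.push(chunk);
        }
        if end == tokens.len() {
            break; // the window reached the end of the input
        }
        start += step;
    }
    chunks
}
```

With a target of 4 and an overlap of 1, the seven-word input `a b c d e f g` yields two chunks that share the word `d`. A larger overlap preserves more context across boundaries at the cost of more chunks to score.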
Chunk presets
| Preset | Target tokens | Max tokens | Good for |
|---|---|---|---|
| `ChunkConfig::default()` | 512 | 1024 | General text and mixed project context. |
| `ChunkConfig::for_code()` | 256 | 512 | Codebases where smaller functions need independent ranking. |
| `ChunkConfig::for_conversation()` | 200 | 400 | Chat transcripts and message histories. |
| `ChunkConfig::for_speed()` | 1000 | 2000 | Fast coarse compression with larger chunks. |
| `ChunkConfig::for_quality()` | 256 | 512 | More selective ranking with finer chunk boundaries. |
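The trade-off between presets is easier to see with a quick estimate of how many chunks each one produces. The sketch below copies the numbers from the table into a hypothetical `Preset` struct (it does not use the crate's `ChunkConfig`) and ignores overlap and content-aware boundaries, so treat the results as rough approximations.

```rust
/// Hypothetical mirror of one row of the preset table (not the crate's type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Preset {
    name: &'static str,
    target_tokens: usize,
    max_tokens: usize,
}

/// Values copied from the preset table above.
const PRESETS: [Preset; 5] = [
    Preset { name: "default", target_tokens: 512, max_tokens: 1024 },
    Preset { name: "for_code", target_tokens: 256, max_tokens: 512 },
    Preset { name: "for_conversation", target_tokens: 200, max_tokens: 400 },
    Preset { name: "for_speed", target_tokens: 1000, max_tokens: 2000 },
    Preset { name: "for_quality", target_tokens: 256, max_tokens: 512 },
];

/// Look up a preset row by its short name.
fn preset_by_name(name: &str) -> Option<Preset> {
    PRESETS.iter().copied().find(|p| p.name == name)
}

/// Approximate chunk count if every chunk lands on the target size
/// (ceiling division; overlap is ignored).
fn estimated_chunks(total_tokens: usize, preset: Preset) -> usize {
    (total_tokens + preset.target_tokens - 1) / preset.target_tokens
}

/// Fewest chunks possible if every chunk grows to the maximum size.
fn min_chunks(total_tokens: usize, preset: Preset) -> usize {
    (total_tokens + preset.max_tokens - 1) / preset.max_tokens
}
```

Under these assumptions, a 10,000-token document comes out at about 10 chunks with the speed preset and about 40 with the code or quality preset, which is why the smaller targets allow more selective ranking.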
Content types
| Type | Detected from | Notes |
|---|---|---|
| Text | Fallback default | Used for plain text and unknown extensions. |
| Code | `.rs`, `.py`, `.ts`, `.tsx`, `.go`, `.java`, and similar | Optimized for source-oriented chunking. |
| Markdown | `.md`, `.markdown` | Keeps markdown documents out of the plain-text path. |
| Conversation | Explicit config path | Best for message-oriented histories. |
| Structured | `.json`, `.yaml`, `.toml`, `.xml` | Useful for config and data files. |
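The extension mapping in the table can be sketched as a small lookup. This is an illustration only: `ContentKind` and `detect_kind` are hypothetical names rather than the crate's `ContentType` API, only the extensions listed in the table are covered, and the case-insensitive match is an assumption. `Conversation` is left out because the table says it is set explicitly rather than detected.

```rust
use std::path::Path;

/// Hypothetical stand-in for the detectable content types in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ContentKind {
    Text,
    Code,
    Markdown,
    Structured,
}

/// Guess a content kind from a file path's extension.
/// Anything unrecognized, or without an extension, falls back to `Text`.
fn detect_kind(path: &str) -> ContentKind {
    let ext = Path::new(path)
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    match ext.as_deref() {
        Some("rs" | "py" | "ts" | "tsx" | "go" | "java") => ContentKind::Code,
        Some("md" | "markdown") => ContentKind::Markdown,
        Some("json" | "yaml" | "toml" | "xml") => ContentKind::Structured,
        _ => ContentKind::Text,
    }
}
```

For example, `detect_kind("src/main.rs")` returns `ContentKind::Code`, while a file with no extension such as `LICENSE` falls through to `ContentKind::Text`.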
Customization
```rust
use forgetless::{ChunkConfig, Config, ForgetlessConfig, ScoringConfig};

let advanced = ForgetlessConfig::new(
    Config::default().context_limit(64_000),
)
.with_chunk(
    // Start from the quality preset, then set each limit explicitly.
    ChunkConfig::for_quality()
        .with_target_tokens(256)
        .with_max_tokens(512)
        .with_min_tokens(10)
        .with_deduplication(true),
)
.with_scoring(ScoringConfig {
    semantic_weight: 0.6,
    keyword_weight: 0.25,
    priority_weight: 0.15,
});
```
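The three scoring weights in the example sum to 1.0, which suggests they blend per-chunk signals into a single rank. The source does not say how they are combined, so the sketch below assumes a plain weighted sum; the `Weights` struct and `combined_score` function are hypothetical and not part of `forgetless`.

```rust
/// Hypothetical weights mirroring the three `ScoringConfig` fields.
struct Weights {
    semantic: f64,
    keyword: f64,
    priority: f64,
}

/// Assumed combination rule: a linear weighted sum of three signals,
/// each expected to be in the range 0.0 to 1.0.
fn combined_score(w: &Weights, semantic: f64, keyword: f64, priority: f64) -> f64 {
    w.semantic * semantic + w.keyword * keyword + w.priority * priority
}
```

Under this assumption, a chunk with a semantic score of 0.5, a perfect keyword match, and no priority boost would score 0.6 × 0.5 + 0.25 × 1.0 = 0.55. Keeping the weights summing to 1.0 keeps the combined score in the same 0.0 to 1.0 range as its inputs.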