Getting Started

Install the crate, choose a token budget, attach content and files, then call `run()` to produce an optimized context block.

Install the crate

```shell
cargo add forgetless
```

The current crate version in the repository is 0.1.1.

```toml
[dependencies]
forgetless = "0.1.1"
```

Optional features

```shell
# HTTP server
cargo add forgetless --features server

# Apple Silicon GPU inference
cargo add forgetless --features metal

# NVIDIA GPU inference
cargo add forgetless --features cuda

# CPU acceleration
cargo add forgetless --features accelerate
cargo add forgetless --features mkl
```
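The same features can be enabled directly in `Cargo.toml` instead of via `cargo add`. This fragment shows the equivalent dependency entry, using the version and feature names from the commands above:

```toml
[dependencies]
forgetless = { version = "0.1.1", features = ["server", "metal"] }
```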

Build the first pipeline

```rust
use forgetless::{Config, Forgetless, WithPriority};

let result = Forgetless::new()
    .config(Config::default().context_limit(48_000))
    .add(WithPriority::critical(system_prompt))
    .add(conversation_history)
    .add_file("spec.pdf")
    .add_file("screenshot.png")
    .query("What changed in the API contract?")
    .run()
    .await?;

println!("{}", result.content);
println!("compression: {:.1}x", result.compression_ratio());
```

Attach content and files with priorities

```rust
use forgetless::{FileWithPriority, Forgetless, WithPriority};

let result = Forgetless::new()
    .add_pinned("You are preparing a release summary.")
    .add(WithPriority::high(recent_messages))
    .add(WithPriority::low(archived_notes))
    .add_file(FileWithPriority::critical("system.md"))
    .add_file(FileWithPriority::high("design-review.pdf"))
    .add_files(["src/lib.rs", "src/builder.rs"])
    .run()
    .await?;
```
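The priority tiers used above behave like an ordered type: higher tiers win when the token budget forces cuts. The sketch below is a hypothetical mirror, not the crate's actual `Priority` definition; only the `Critical`, `High`, and `Low` names appear in the examples, and the derived ordering is an assumption.

```rust
// Hypothetical mirror of the priority tiers. With derived Ord,
// variants declared later compare greater, so Critical is highest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Priority {
    Low,
    High,
    Critical,
}

// Sorts ascending, so the most important tier ends up last.
fn rank(mut tiers: Vec<Priority>) -> Vec<Priority> {
    tiers.sort();
    tiers
}
```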

Add in-memory bytes

```rust
use forgetless::{Forgetless, Priority};

let png_bytes = std::fs::read("diagram.png")?;

let result = Forgetless::new()
    .add_bytes(b"temporary notes", "text/plain")
    .add_bytes_p(&png_bytes, "image/png", Priority::High)
    .run()
    .await?;
```

In the current builder implementation, non-text byte payloads become placeholder content records; they are not parsed as rich binary inputs.
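As a rough illustration of that behavior, a placeholder record might be derived from the MIME type alone. This is a hypothetical helper, not the crate's API:

```rust
// Hypothetical sketch: text payloads are usable as-is, while any
// non-text payload is reduced to a short descriptive placeholder
// instead of being parsed.
fn byte_record(mime: &str, len: usize) -> String {
    if mime.starts_with("text/") {
        format!("<{} bytes of text>", len)
    } else {
        format!("<unparsed {} payload, {} bytes>", mime, len)
    }
}
```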

Runtime configuration

| Option | Default | What it changes |
| --- | --- | --- |
| `context_limit` | 128,000 | Maximum output token budget. |
| `vision_llm` | false | Loads the local SmolVLM vision model for image descriptions. |
| `context_llm` | false | Loads the local SmolLM2-135M model for post-selection polishing. |
| `chunk_size` | 512 | Target chunk size passed into the chunker. |
| `parallel` | true | Stored on the config; current file reads still use parallel processing. |
| `cache` | true | Stored on the config; the current global embedding cache is not toggled by this flag. |

```rust
use forgetless::{Config, Forgetless};

let result = Forgetless::new()
    .config(
        Config::default()
            .context_limit(96_000)
            .chunk_size(256)
            .vision_llm(true)
            .context_llm(true)
            .parallel(true)
            .cache(true),
    )
    .add_file("diagram.png")
    .run()
    .await?;
```
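To make the role of `chunk_size` concrete, here is an illustrative character-based splitter. The crate's real chunker is separate and presumably token-aware, so treat this purely as a sketch of what the parameter controls:

```rust
// Illustrative only: splits text into pieces of at most `chunk_size`
// characters. The crate's chunker targets `chunk_size` as a token
// count, not a character count.
fn chunk_text(text: &str, chunk_size: usize) -> Vec<String> {
    text.chars()
        .collect::<Vec<char>>()
        .chunks(chunk_size)
        .map(|piece| piece.iter().collect())
        .collect()
}
```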

Read the result

```rust
result.content
result.total_tokens
result.chunks

result.stats.input_tokens
result.stats.output_tokens
result.stats.chunks_processed
result.stats.chunks_selected
result.stats.processing_time_ms

result.compression_ratio()
```
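As a rough model of these fields, here is a hypothetical mirror of the token stats. The crate does not spell out the formula for `compression_ratio()`, so the input-over-output definition below is an assumption:

```rust
// Hypothetical mirror of the token counts above. Assumes
// compression_ratio() means input tokens divided by output tokens,
// so a 2.0 ratio means the context was halved.
struct Stats {
    input_tokens: usize,
    output_tokens: usize,
}

impl Stats {
    fn compression_ratio(&self) -> f64 {
        if self.output_tokens == 0 {
            return 0.0; // avoid division by zero on empty output
        }
        self.input_tokens as f64 / self.output_tokens as f64
    }
}
```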