As invested in context as you are.
If you're building a production LLM workflow, you need a context layer just as deliberate as the rest of your system.
Forgetless turns oversized prompts, files, screenshots, PDFs, and raw bytes into a budget-aware payload that still preserves the signal.
That means fewer manual cutdowns, fewer broken prompts, and more reliable inputs before every model call.
Questions, answered.
Forgetless is a Rust context optimizer for LLM workflows. It previews, ranks, chunks, and compresses oversized prompts, files, screenshots, PDFs, and raw byte inputs into a budget-aware context block so production systems can stay within strict token budgets while keeping the important signal.
Forgetless is published as a Rust crate. You add it with cargo add forgetless, build a pipeline through the Forgetless builder by chaining add, add_file, and query, then call run to produce an optimized context block. An optional HTTP server, enabled with the server feature, exposes the same optimization over a multipart API for use from other languages.
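To make the chaining pattern concrete, here is a toy builder in plain Rust. It is not the forgetless crate and does not reproduce its signatures: ToyPipeline is a hypothetical stand-in that borrows the method names add, query, and run only to show the call shape, and its ranking by query-word overlap is an assumption made for the example.

```rust
// Toy stand-in for illustration only; this is NOT the forgetless crate's API.
#[derive(Default)]
struct ToyPipeline {
    inputs: Vec<String>,
    query: Option<String>,
}

impl ToyPipeline {
    fn new() -> Self {
        Self::default()
    }

    /// Queue a piece of text content.
    fn add(mut self, text: &str) -> Self {
        self.inputs.push(text.to_string());
        self
    }

    /// Record the question the context should serve.
    fn query(mut self, q: &str) -> Self {
        self.query = Some(q.to_string());
        self
    }

    /// Consume the builder and return the inputs, most query-relevant first.
    fn run(self) -> Vec<String> {
        let terms: Vec<String> = self
            .query
            .unwrap_or_default()
            .split_whitespace()
            .map(|w| w.to_lowercase())
            .collect();
        // Relevance here is simply how many query words appear in the text.
        let overlap = |text: &str| {
            let lower = text.to_lowercase();
            terms.iter().filter(|t| lower.contains(t.as_str())).count()
        };
        let mut ranked = self.inputs;
        // Stable sort keeps insertion order among equally relevant inputs.
        ranked.sort_by_key(|text| std::cmp::Reverse(overlap(text.as_str())));
        ranked
    }
}
```

Each method takes the builder by value and returns it, which is what makes the calls chain; run consumes the builder so a finished pipeline cannot be reused by accident.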
You can add text content, files on disk, and in-memory byte payloads. Inputs can be tagged with priorities such as critical, high, and low using WithPriority and FileWithPriority so the optimizer knows which content to preserve first when the budget is tight.
You set a token budget through the context_limit configuration option, which defaults to 128,000 tokens. Forgetless counts tokens, scores and selects chunks to fit within that budget, and returns statistics including input and output tokens and the compression ratio for the run.
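As a rough illustration of priority-aware selection under a token budget, the sketch below keeps the most important chunks that fit and reports simple statistics. Everything in it is an assumption for the example, not the crate's implementation: the Priority, Chunk, and Stats types are hypothetical, tokens are approximated as whitespace-separated words, selection is a simple greedy pass, and the compression ratio is taken to mean input tokens divided by output tokens.

```rust
// Hypothetical sketch: none of these types come from the forgetless crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Priority {
    Low,
    High,
    Critical,
}

#[derive(Debug, Clone)]
struct Chunk {
    text: String,
    priority: Priority,
    /// Relevance score, assumed to come from an earlier ranking step.
    score: f32,
}

#[derive(Debug)]
struct Stats {
    input_tokens: usize,
    output_tokens: usize,
}

impl Stats {
    /// Assumed definition: input tokens divided by output tokens.
    fn compression_ratio(&self) -> Option<f64> {
        if self.output_tokens == 0 {
            return None;
        }
        Some(self.input_tokens as f64 / self.output_tokens as f64)
    }
}

/// Crude stand-in for a tokenizer: one token per whitespace-separated word.
fn count_tokens(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Greedily keep the most important chunks that fit within `context_limit`.
fn select_within_budget(chunks: &[Chunk], context_limit: usize) -> (Vec<Chunk>, Stats) {
    let input_tokens: usize = chunks.iter().map(|c| count_tokens(&c.text)).sum();

    // Visit chunks from most to least important: priority first, then score.
    let mut order: Vec<usize> = (0..chunks.len()).collect();
    order.sort_by(|&a, &b| {
        chunks[b]
            .priority
            .cmp(&chunks[a].priority)
            .then(chunks[b].score.total_cmp(&chunks[a].score))
    });

    let mut keep = vec![false; chunks.len()];
    let mut output_tokens = 0usize;
    for i in order {
        let cost = count_tokens(&chunks[i].text);
        if output_tokens + cost <= context_limit {
            keep[i] = true;
            output_tokens += cost;
        }
    }

    // Emit kept chunks in their original order so the context reads naturally.
    let selected: Vec<Chunk> = chunks
        .iter()
        .zip(&keep)
        .filter(|(_, k)| **k)
        .map(|(c, _)| c.clone())
        .collect();

    (selected, Stats { input_tokens, output_tokens })
}
```

With a budget of 7 tokens and three chunks of 3, 4, and 3 tokens tagged low, critical, and high, the critical and high chunks are kept and the low one is dropped, even if the low chunk has the best relevance score. A production optimizer would likely use a real tokenizer and could compress chunks instead of dropping them outright.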
Yes. Forgetless includes optional local AI helpers in Rust, including embeddings with cosine similarity, an optional local LLM for post-selection polishing, and vision helpers for describing images. These are gated behind opt-in configuration and Cargo features rather than enabled by default.
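For readers unfamiliar with the term, cosine similarity measures how closely two embedding vectors point in the same direction. The function below is a minimal, dependency-free sketch of that formula. The name cosine_similarity and the choice to return None for mismatched or zero-length vectors are assumptions for this example, not the crate's API.

```rust
/// Cosine similarity between two embedding vectors.
/// Returns None when the vectors differ in length, are empty,
/// or either one has zero magnitude (the similarity is undefined there).
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|y| y * y).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

Vectors pointing the same way score close to 1, orthogonal vectors score 0, and opposite vectors score close to -1, so a chunk whose embedding scores higher against the query embedding is treated as more relevant.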
Yes. Forgetless is open source and the code is available on GitHub at github.com/pzzaworks/forgetless.