Skip to content

Compaction

Long conversations and agent runs accumulate tokens until they hit the model’s context window limit. Research shows that even full, uncompacted context degrades model performance as length grows — compaction is a tradeoff, not pure loss.

GHOST uses a two-phase compaction system that balances token recovery with information preservation.

The boundary between “old” and “current” is the current turn — everything from the last user text message onward is preserved in full.

When estimated token usage exceeds the configured threshold (default: 85% of the context window), GHOST masks tool results and tool call inputs for all messages before the current turn. Masked entries become compact placeholders like [tool_result: web_search — first 100 chars... (truncated)].

This is free (no LLM call), introduces zero hallucination risk, and recovers thousands of tokens from large tool outputs. Anthropic recommends this as the first step before any summarization.

If masking alone isn’t sufficient, GHOST summarizes the masked pre-turn content into a structured summary block via a single LLM call. The summarization input is capped at 50,000 characters. The summary uses mandatory sections (Task, Decisions, State, Files, Context) that act as checklists preventing silent information drops.

Phase 2 errors are logged and gracefully degraded — the masked history is used as a fallback. Context overflow errors trigger automatic compaction followed by a retry.

GHOST uses a section-based compaction prompt inspired by Factory.ai’s benchmark of 36K production messages, where structured summaries scored 3.70 vs 3.44 for free-form and 3.35 for opaque approaches. The mandatory sections are:

  • Task — what the OPERATOR asked for
  • Decisions — key choices with reasoning
  • State — what was done, what remains
  • Files — paths read, created, or modified
  • Context — names, preferences, domain details

The explicit “Files” section addresses a known weakness: all compression methods score poorly on artifact tracking (2.19–2.45/5.0) without explicit file path sections.

~/.config/ghost/config.toml
[compaction]
threshold = 0.85 # Trigger at 85% context usage
mask_preview_chars = 100 # Characters to preview in masked results
instructions = "Always preserve code snippets in full." # Optional
KeyDefaultDescription
threshold0.85Context usage ratio that triggers compaction
mask_preview_chars100Characters shown in masked tool result previews
instructionsExtra text appended to the compaction prompt

Agents use the same two-phase compaction during tool loops. Agents can override compaction parameters in their Lua config:

agents/my-agent/agent.lua
return {
name = "my-agent",
-- ...
compaction = {
instructions = "Preserve all URLs and the current TODO list. "
.. "Drop verbose page content.",
},
}

Available overrides: threshold, mask_preview_chars, instructions. Any field not specified falls back to the default.