codex-agent-mem

v1.0.x verification results

These are reproducible, sanitized results generated from synthetic fixtures.

Execution context:

Snapshot

| Scenario | Source tokens | Pack tokens | Saved | not_modified | Tools | Lazy init | Read-only |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Small project continuity | 1,841 | 253 | 86.26% | true | 4 | false->true | true |
| Medium agent workflow | 4,855 | 270 | 94.44% | true | 4 | false->true | true |
| Large repeated audit | 9,731 | 269 | 97.24% | true | 4 | false->true | true |
| Sub-agent handoff example | 6,523 | 276 | 95.77% | true | 4 | false->true | true |
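
The Saved column can be re-derived from the two token columns as 1 - pack / source. The short check below is a sketch of that arithmetic; the helper name `saved_percent` is made up for illustration and is not part of the tool.

```python
# Token counts copied from the snapshot table: (source tokens, pack tokens).
SNAPSHOT = {
    "Small project continuity": (1841, 253),
    "Medium agent workflow": (4855, 270),
    "Large repeated audit": (9731, 269),
    "Sub-agent handoff example": (6523, 276),
}


def saved_percent(source_tokens: int, pack_tokens: int) -> float:
    """Share of source tokens avoided when the pack is sent instead (hypothetical helper)."""
    return round(100.0 * (1 - pack_tokens / source_tokens), 2)


for name, (source, pack) in SNAPSHOT.items():
    print(f"{name}: {saved_percent(source, pack):.2f}% saved")
```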

Token savings by scenario

Small project continuity

A short project where the agent should remember objective, constraints and open work without repeating the whole discussion.

source [############################] 100.00%
pack   [####........................] 13.74%
saved  [########################....] 86.26%

Medium agent workflow

A realistic multi-step implementation with repeated decisions, pending work and Definition of Done (DoD) requirements.

source [############################] 100.00%
pack   [##..........................] 5.56%
saved  [##########################..] 94.44%

Large repeated audit

A long audit where the same constraints and decisions would normally be re-sent many times.

source [############################] 100.00%
pack   [#...........................] 2.76%
saved  [###########################.] 97.24%

Sub-agent handoff example

A project where an explorer sub-agent audits context and a worker sub-agent implements a bounded change.

source [############################] 100.00%
pack   [#...........................] 4.23%
saved  [###########################.] 95.77%
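
The bars in this section appear consistent with a 28-character width in which the filled part is the percentage rounded to the nearest cell. The sketch below reproduces that layout under this assumption; `render_bar` and `BAR_WIDTH` are illustrative names, not the project's own code.

```python
BAR_WIDTH = 28  # assumption: inferred from the bars shown in this section


def render_bar(percent: float, width: int = BAR_WIDTH) -> str:
    """Render a fixed-width ASCII bar: '#' for the filled share, '.' for the rest."""
    filled = round(width * percent / 100.0)
    filled = max(0, min(width, filled))  # clamp out-of-range percentages
    return "[" + "#" * filled + "." * (width - filled) + "]"


for label, percent in (("source", 100.00), ("pack", 13.74), ("saved", 86.26)):
    print(f"{label:<6} {render_bar(percent)} {percent:.2f}%")
```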

Sub-agent example:

Repeated context avoided

known_pack_hash lets the agent ask whether a pack changed before re-sending it.

| Scenario | Result |
| --- | --- |
| Small project continuity | not_modified=true |
| Medium agent workflow | not_modified=true |
| Large repeated audit | not_modified=true |
| Sub-agent handoff example | not_modified=true |
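
The exchange behind `known_pack_hash` resembles a conditional fetch: the caller sends the hash it already holds and gets the full pack back only if the hash differs. The sketch below illustrates that idea only. The SHA-256 scheme, the `pack_hash` response field and both function names are assumptions; only `known_pack_hash` and `not_modified` come from this document.

```python
import hashlib
import json
from typing import Any, Optional


def pack_hash(pack: dict[str, Any]) -> str:
    """Hash a canonical JSON form of the pack (assumed scheme, for illustration)."""
    canonical = json.dumps(pack, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def context_pack_response(
    pack: dict[str, Any], known_pack_hash: Optional[str] = None
) -> dict[str, Any]:
    """Return the pack only when the caller's known hash is missing or stale."""
    current = pack_hash(pack)
    if known_pack_hash == current:
        # Caller already has this exact pack: skip the body.
        return {"not_modified": True, "pack_hash": current}
    return {"not_modified": False, "pack_hash": current, "pack": pack}


pack = {"objective": "ship v1", "open_work": ["write tests"]}
first = context_pack_response(pack)
second = context_pack_response(pack, known_pack_hash=first["pack_hash"])
print(first["not_modified"], second["not_modified"])
```

In this sketch the second call carries no pack body, which is where the repeated context is avoided.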

Runtime safety

| Metric | Result |
| --- | --- |
| Minimal profile tools | mem_open_work, mem_completion_check, mem_context_pack, mem_health_runtime |
| Tool count in minimal profile | 4 |
| Lazy initialization before DB-backed tool | false |
| Lazy initialization after context pack | true |
| Mutating tool tested in read-only mode | mem_snapshot_create |
| Mutating tool blocked | true |

This is the v1.0.0 fixture baseline. In v1.0.1, the minimal profile also includes mem_session_list, mem_scope_resolve, and mem_bootstrap_context so broad workspaces can resolve scope before loading active context.
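
The two runtime behaviors measured above, lazy initialization and read-only blocking, can be pictured with a small stand-in. This is a sketch under stated assumptions, not the project's implementation: the `MemRuntime` class, the `ReadOnlyError` exception and the use of an in-memory SQLite database are all hypothetical. Only the tool names come from the table.

```python
import sqlite3


class ReadOnlyError(RuntimeError):
    """Raised when a mutating tool is called in read-only mode (hypothetical)."""


class MemRuntime:
    # Only mem_snapshot_create is named as mutating in this document.
    MUTATING_TOOLS = {"mem_snapshot_create"}

    def __init__(self, read_only: bool = False) -> None:
        self.read_only = read_only
        self._db = None  # no database is opened at startup

    @property
    def initialized(self) -> bool:
        return self._db is not None

    def _ensure_db(self) -> sqlite3.Connection:
        # Open the database on first use only (lazy initialization).
        if self._db is None:
            self._db = sqlite3.connect(":memory:")  # stand-in storage
        return self._db

    def call(self, tool: str) -> dict:
        # Reject mutating tools before touching the database.
        if self.read_only and tool in self.MUTATING_TOOLS:
            raise ReadOnlyError(f"{tool} is blocked in read-only mode")
        self._ensure_db()
        return {"tool": tool, "ok": True}


runtime = MemRuntime(read_only=True)
print(runtime.initialized)        # before any DB-backed tool
runtime.call("mem_context_pack")
print(runtime.initialized)        # after the context pack call
```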

Response diet

Text shown to the model can be kept compact while the structured payload remains available to MCP clients.

| Scenario | Compact text chars | Balanced text chars | Verbose text chars |
| --- | --- | --- | --- |
| Small project continuity | 160 | 209 | 20,715 |
| Medium agent workflow | 160 | 209 | 42,943 |
| Large repeated audit | 160 | 209 | 76,114 |
| Sub-agent handoff example | 160 | 209 | 54,625 |
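
In the table, the compact and balanced sizes are the same for every scenario while the verbose size grows with the fixture. One way to get that pattern is to render a short summary for the model and attach the full payload separately. The sketch below shows the idea only; the field names (`scenario`, `items`, `open_work`, `text`, `structured`) and both functions are hypothetical and do not describe the tool's actual response format.

```python
import json


def render_text(payload: dict, verbosity: str = "compact") -> str:
    """Build the text shown to the model at a chosen verbosity (illustrative)."""
    summary = f"{payload['scenario']}: {len(payload['items'])} items"
    if verbosity == "compact":
        return summary
    if verbosity == "balanced":
        return f"{summary}; open work: {len(payload['open_work'])}"
    if verbosity == "verbose":
        # Verbose text embeds the whole payload, so it grows with the data.
        return json.dumps(payload, indent=2)
    raise ValueError(f"unknown verbosity: {verbosity}")


def build_response(payload: dict, verbosity: str = "compact") -> dict:
    """Pair the short text with the untouched structured payload."""
    return {"text": render_text(payload, verbosity), "structured": payload}


payload = {
    "scenario": "Small project continuity",
    "items": [{"id": i, "note": "decision " * 5} for i in range(20)],
    "open_work": ["write tests", "update docs"],
}
for level in ("compact", "balanced", "verbose"):
    print(level, len(build_response(payload, level)["text"]))
```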

Telemetry smoke

Interpretation

These numbers are not a universal guarantee. They show reproducible behavior on public synthetic fixtures. Savings are expected to be highest when an agent would otherwise resend the same project context across sessions.