AI & Development · 7 min read

Images That Won’t Die and Directories That Shouldn’t Be There

Opus 4.7 update (April 16, 2026): This post is more urgent than when we wrote it. Opus 4.7 introduces high-resolution image support (2576px, up from 1568px) and a new tokenizer that inflates token counts by up to 35%. A 500KB screenshot that cost ~62,500 tokens per turn on 4.6 can cost ~85,000+ tokens on 4.7. Combined with 4.7’s 2.4x Q5h burn rate from adaptive thinking overhead, image accumulation in conversation history is now one of the fastest ways to drain your quota. Enable CACHE_FIX_IMAGE_KEEP_LAST=3 in the interceptor immediately if you’re on 4.7.

The cache bugs in Parts 2-4 were about the mechanism of caching — prefix matching, TTL tiers, block ordering. But there’s another class of cost problem in Claude Code: things that end up in your conversation history and stay there far longer than they should.

Two discoveries in particular stood out: images that persist across sessions as raw base64, and working directories from unrelated projects that inject themselves into your system prompt.


The Image Persistence Problem

@Renvect reported on GitHub issue #40524 that a single pasted image was being “carried 4x in the conversation” and persisting across 6 session resumes, with approximately 122,000 tokens auto-injected on resume from accumulated image data alone.

We traced the mechanism through the source code:

Step 1: Encoding. When the Read tool processes an image file (PNG, JPG, etc.), it encodes it as a base64 image content block in the tool result (FileReadTool.ts, lines 784-799). This is the correct behavior — the model needs to see the image.

Step 2: Persistence. That tool result — with the full base64 payload — stays in the messages array. On every subsequent API call, the entire conversation history is sent, including every previous tool result with every image still encoded in full (claude.ts, lines 1266-1315).

Step 3: No automatic cleanup. Images are only removed from conversation history by three mechanisms:

  • The /compact command (replaces images with [image] placeholder)
  • Hitting the 100-media-item limit
  • API-side context window management

There is no automatic summarization, no progressive degradation, and no TTL-based expiry for images in conversation history. A 500KB screenshot you asked Claude to read on turn 3 is still being sent as 62,500 tokens of base64 on turn 50.

The cost math

The token cost scales linearly with image size. At Opus input rates ($5/MTok):

Image Size                   Base64 Tokens   Cost Per Turn
200KB (small screenshot)     ~25,000         $0.125
500KB (typical screenshot)   ~62,500         $0.31
1MB (high-res capture)       ~125,000        $0.63
5MB (API maximum)            ~625,000        $3.13

These compound with accumulation. Three typical screenshots (500KB each) add ~187,500 tokens to every API call. Over 10 turns, that’s 1.9 million tokens of image data re-sent — $9.38 on Opus just for carrying images the model already saw.
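The table and totals above follow a simple rule of thumb: roughly one token per 8 bytes of image data, priced at the stated $5/MTok input rate. A quick sketch of that model (the ratio and rate are taken from the figures above; treat them as estimates, not API guarantees):

```javascript
// Rough cost model for base64 images carried in conversation history.
// The ~1 token per 8 bytes ratio and the $5/MTok Opus input rate match
// the table above; both are estimates, not API-documented constants.
const BYTES_PER_TOKEN = 8;
const OPUS_INPUT_PER_MTOK = 5.0;

function imageTokens(sizeBytes) {
  return Math.round(sizeBytes / BYTES_PER_TOKEN);
}

function costPerTurn(sizeBytes) {
  return (imageTokens(sizeBytes) / 1_000_000) * OPUS_INPUT_PER_MTOK;
}

// Total tokens re-sent if `count` images ride along for `turns` API calls.
function carriedTokens(sizeBytes, count, turns) {
  return imageTokens(sizeBytes) * count * turns;
}

console.log(imageTokens(500_000));          // 62500
console.log(carriedTokens(500_000, 3, 10)); // 1875000 (~$9.38 at $5/MTok)
```

Running this for three 500KB screenshots over 10 turns reproduces the ~1.9 million token figure above.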

And on the 5-minute TTL tier (see Part 4), those image tokens trigger cache_creation rebuilds every time the cache expires. You’re not just carrying them — you’re paying the write premium to re-cache them every 5 minutes.

The interceptor fix

We added image stripping to our fetch interceptor. The CACHE_FIX_IMAGE_KEEP_LAST=N environment variable controls it:

// Only strip images inside tool_result blocks (Read tool output).
// User-pasted images are preserved.
let strippedCount = 0;
let strippedBytes = 0;

if (block.type === "tool_result" && Array.isArray(block.content)) {
  // Replace image blocks in place so the modified content is actually sent.
  block.content = block.content.map((item) => {
    if (item.type === "image") {
      strippedCount++;
      strippedBytes += item.source?.data?.length || 0;
      return {
        type: "text",
        text: "[image stripped from history — file still on disk]",
      };
    }
    return item;
  });
}

Set to CACHE_FIX_IMAGE_KEEP_LAST=3, this keeps images from the last 3 user turns and replaces older ones with a text placeholder. The original files remain on disk — if Claude needs to re-examine an image, it can read it again with the Read tool. The cost of one re-read is far less than carrying the image on every subsequent turn.

We deliberately preserve user-pasted images (images directly in user message content, not inside tool_result blocks). Those are intentional context that the user chose to include.
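The keep-last-N policy can be sketched end to end. This is a simplified model of the interceptor’s pass over the messages array — the message shapes follow the Anthropic Messages API (tool results arrive in user-role messages), but the cutoff logic here is our reconstruction, not the interceptor’s exact code:

```javascript
// Strip images from tool_result blocks in all but the last `keepLast`
// user turns. Image blocks sitting directly in user content (user-pasted
// images) pass through untouched, since only tool_result blocks are mapped.
function stripOldImages(messages, keepLast) {
  // Index of the keepLast-th most recent user message; everything at or
  // after this index keeps its images.
  const userIdxs = messages
    .map((m, i) => (m.role === "user" ? i : -1))
    .filter((i) => i >= 0);
  const cutoff =
    userIdxs.length > keepLast ? userIdxs[userIdxs.length - keepLast] : 0;

  return messages.map((msg, i) => {
    if (i >= cutoff || !Array.isArray(msg.content)) return msg;
    const content = msg.content.map((block) => {
      if (block.type !== "tool_result" || !Array.isArray(block.content)) {
        return block;
      }
      const inner = block.content.map((item) =>
        item.type === "image"
          ? { type: "text", text: "[image stripped from history — file still on disk]" }
          : item
      );
      return { ...block, content: inner };
    });
    return { ...msg, content };
  });
}
```

Note that the function returns new message objects rather than mutating history, so the original array can still be logged or diffed.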


Cross-Project Directory Contamination

@Renvect discovered something else in their system prompt: 8 additional working directories from unrelated projects were being injected into a single Claude Code session. Directories like .swarm/contracts, .agents, .swarm/mcp-server — none of which had anything to do with the active project.

How it happens

Claude Code supports additional working directories through two mechanisms:

1. The --add-dir CLI flag. When you use --add-dir /path/to/other/project, that directory is added to the session’s scope. Reasonable enough. The problem: the directory is persisted permanently to settings.json and survives across all future sessions. There’s no automatic cleanup, no expiry, and no prompt telling you it’s still active.

2. Symlink detection. If your working directory involves symlinks (process.env.PWD differs from the resolved real path), Claude Code auto-discovers the real path and may add it as an additional directory (permissionSetup.ts, lines 918-928). This is invisible to the user.

What each directory injects

Each additional directory can silently contribute:

  • Plugins — Claude Code reads .claude/settings.json from every additional directory and merges plugin configurations into the active session
  • MCP servers — .mcp.json files are discovered by walking parent directories upward, so an additional directory can introduce MCP servers the user never configured for this project
  • CLAUDE.md files — if the CLAUDE_CODE_ADDITIONAL_DIRECTORIES_CLAUDE_MD environment variable is set, instructions from other projects’ CLAUDE.md files are loaded into context

There is no limit on the number of additional directories. No cap, no warnings, no visibility into what they’re contributing.

In @Renvect’s case, 8 directories from unrelated projects — contract systems, agent frameworks, MCP server experiments — were all injecting context into a single session. Each one contributing tools, plugins, and configuration the user hadn’t thought about in weeks.

The emergent cost chain

Here’s where it connects to everything else:

  1. Additional directories inject plugins
  2. Plugins load MCP servers
  3. MCP servers register tools
  4. More tools means more tool schema tokens at the top of the cache hierarchy
  5. Tool schema changes between sessions bust the cache (Part 2, Bug 3)
  6. Tool results from MCP tools may contain images
  7. Those images persist in conversation history (previous section)
  8. Image tokens compound on every turn
  9. More tokens push you closer to the quota boundary
  10. Crossing the boundary triggers TTL downgrade (Part 4)

A directory you added three weeks ago for an unrelated project can silently increase your cache miss rate, inflate your per-turn token count, and contribute to a TTL downgrade — with no indication that it’s happening.


The Common Thread

Both of these issues share a pattern: things that accumulate without visibility or cleanup.

Images accumulate in conversation history because there’s no automatic management. Directories accumulate in settings because additions are persisted but never reviewed. In both cases, the cost impact grows silently over time and compounds with the cache bugs described in earlier posts.

The fix in both cases is the same principle: make the accumulation visible and give users (or their tools) control over it. Our interceptor handles images automatically. For directories, the defense is simpler — periodically review your settings.json and remove --add-dir entries you no longer need.

In Part 6, we’ll step back from the technical details and look at what this investigation says about the state of AI tooling more broadly.


This is Part 5 of a six-part series on Claude Code’s cache management. Previous: Part 4 — The TTL Discovery. Next: Part 6 — What This Says About AI Tooling.


Veritas Supera IT Solutions (VSITS LLC) builds AI-augmented systems for technical teams. If your organization is working with AI tooling and running into problems like these, let’s talk.