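Section 2: Context Compression Mechanism
Compression Algorithms and Memory Management

Author: 小学子 📚 | Date: April 2, 2026 | Phase 4 · Module 4

# Phase 4 · Module 4 · Section 2: Context Compression Mechanism

Core Questions

How does context compression work? What triggers autoCompact? What does the message history look like after compaction?

◇ Where This Section Fits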


        Claude Code overall architecture
        
        ┌──────────────────────────────────────────────────────────────────┐
        │  Query engine layer                                              │
        │                                                                  │
        │  query()                                                         │
        │  └── autoCompact() ──> context compression ← this section        │
        └──────────────────────────────────────────────────────────────────┘

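1. Overview of compact

1.1 What Is compact


        Problem: the context length is limited
        
        Limits:
        - Claude Opus: 200K tokens
        - Claude Sonnet: 200K tokens
        - Claude Haiku: 200K tokens
        
        A long-running conversation will eventually exhaust the context window.
        
        Solution: compact
        - Compress the message history into a summary
        - Keep the key information (decisions, conclusions)
        - Free up context space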

1.2 Compression Flow


        ┌──────────────────────────────────────────────────────────────────┐
        │  Original message history                                        │
        │                                                                  │
        │  [user] Hello                                                    │
        │  [assistant] Hi! How can I help?                                 │
        │  [assistant] Let me read the file...                             │
        │  [tool] File content...                                          │
        │  [assistant] The file contains...                                │
        │  ... (100+ messages)                                             │
        └──────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
        ┌──────────────────────────────────────────────────────────────────┐
        │  Messages after compaction                                       │
        │                                                                  │
        │  [system] Conversation summary: User asked about the codebase... │
        │  [assistant] The file contains...                                │
        │  ... (most recent messages)                                      │
        └──────────────────────────────────────────────────────────────────┘

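2. autoCompact Trigger Conditions

2.1 Trigger Threshold

Source location: `src/services/compact/compact.ts`


        // Trigger threshold, as a fraction of the context window
        const AUTO_COMPACT_THRESHOLD = 0.8;  // 80%
        
        // Check: contextUsage is a ratio in [0, 1]
        function shouldAutoCompact(contextUsage: number): boolean {
          return contextUsage > AUTO_COMPACT_THRESHOLD;
        }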
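
As a worked example of the threshold, the sketch below converts the 80% ratio into a token count for a 200K-token window. `tokensUntilAutoCompact` is a hypothetical helper added for illustration; it is not taken from the source file.

```typescript
const AUTO_COMPACT_THRESHOLD = 0.8; // 80%, as in the snippet above

function shouldAutoCompact(contextUsage: number): boolean {
  // Strictly greater than: exactly 80% does not trigger compaction.
  return contextUsage > AUTO_COMPACT_THRESHOLD;
}

// Hypothetical helper: tokens left before the threshold is crossed.
function tokensUntilAutoCompact(usedTokens: number, modelLimit: number): number {
  const thresholdTokens = Math.floor(modelLimit * AUTO_COMPACT_THRESHOLD);
  return Math.max(0, thresholdTokens - usedTokens);
}

const remaining = tokensUntilAutoCompact(150_000, 200_000); // 10000
```

Under these assumptions, a 200K window starts compacting once usage passes 160K tokens.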

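2.2 Five-Question Analysis

Question 1: Why not wait until 100% before compacting?


        // Problem: the API needs room left over for:
        // 1. The assistant's next reply
        // 2. Tool call results
        // 3. The system prompt
        
        // If compaction waited until 100%:
        // - The reply might be cut off before it is complete
        // - Tool results might not fit
        // - Compaction would become an emergency step, which hurts the user experience
        
        // The 80% threshold leaves a buffer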
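
The buffer argument can be checked with simple arithmetic. The following is a minimal sketch; the reserve sizes are assumed values chosen for illustration, and the system prompt is assumed to be already counted in `usedTokens`.

```typescript
// Illustrative budget check. The reserve sizes are assumptions, not values from the source.
interface TurnBudget {
  modelLimit: number;        // total context window, in tokens
  usedTokens: number;        // tokens already in the conversation (system prompt included)
  replyReserve: number;      // room for the assistant's next reply
  toolResultReserve: number; // room for pending tool results
}

// Positive result: the next turn fits. Negative result: it does not.
function remainingAfterReserves(b: TurnBudget): number {
  return b.modelLimit - b.usedTokens - b.replyReserve - b.toolResultReserve;
}

const atEighty: TurnBudget = {
  modelLimit: 200_000,
  usedTokens: 160_000,
  replyReserve: 8_000,
  toolResultReserve: 20_000,
};
const atFull: TurnBudget = { ...atEighty, usedTokens: 200_000 };

const slackAtEighty = remainingAfterReserves(atEighty); // 12000
const slackAtFull = remainingAfterReserves(atFull);     // -28000
```

At 80% usage the next turn still fits with room to spare, while at 100% the same turn overshoots the window.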

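Question 2: How is context usage measured?


        // Compute the fraction of the context window currently in use
        // (mutableMessages, countTokens, and getModelLimit are defined elsewhere)
        async function getContextUsage(): Promise<number> {
          const messages = mutableMessages;
          const tokenCount = await countTokens(messages);
          const modelLimit = getModelLimit();
          return tokenCount / modelLimit;  // ratio in [0, 1]
        }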
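
A self-contained version of the same calculation might look like the sketch below. It swaps the real token counter for a rough heuristic of about 4 characters per token, which is an assumption made only so the example runs on its own; `estimateTokens` and `contextUsage` are hypothetical names.

```typescript
interface Message {
  role: 'user' | 'assistant' | 'tool';
  content: string;
}

// Assumed heuristic: roughly 4 characters per token.
// A real implementation would ask the model's tokenizer instead.
function estimateTokens(messages: Message[]): number {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / 4);
}

// Fraction of the context window in use, as a ratio in [0, 1].
function contextUsage(messages: Message[], modelLimit: number): number {
  return estimateTokens(messages) / modelLimit;
}

const sample: Message[] = [
  { role: 'user', content: '12345678' },  // 8 characters
  { role: 'assistant', content: '1234' }, // 4 characters
];
const usage = contextUsage(sample, 12); // 3 estimated tokens / 12 = 0.25
```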

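Question 3: Can users disable autoCompact?


        // Disable through an environment variable
        process.env.DISABLE_AUTO_COMPACT = '1';
        
        // Command-line flag (as given in these notes; run from a shell, not from TypeScript):
        //   claude --no-auto-compact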
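
One way such a switch could be read is sketched below. The parsing rules (which values count as "disabled") are assumptions for illustration, and `isAutoCompactEnabled` is a hypothetical helper; the environment is passed in as a plain object so the example does not depend on Node typings.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: unset, empty, '0', or 'false' leave autoCompact on;
// any other value (for example '1') turns it off.
function isAutoCompactEnabled(env: Env): boolean {
  const raw = env['DISABLE_AUTO_COMPACT'];
  if (raw === undefined || raw === '') return true;
  return raw === '0' || raw.toLowerCase() === 'false';
}

const enabledByDefault = isAutoCompactEnabled({});
const disabledByFlag = isAutoCompactEnabled({ DISABLE_AUTO_COMPACT: '1' });
```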

Question 4: How do autoCompact and /compact differ?

Aspect	autoCompact	/compact
Trigger	Automatic	Manual
Timing	Context usage exceeds 80%	User decides
Notification	None	Yes

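Question 5: Does compaction lose information?


        // Compression strategy: details are condensed, key information is kept
        interface CompressionStrategy {
          preserve: [
            'decisions',    // key decisions
            'conclusions',  // conclusions
            'preferences',  // user preferences
            'current_work', // work in progress
          ];
          summarize: [
            'discussion',   // back-and-forth discussion
            'exploration',  // exploratory steps
            'refinement',   // iterative refinement
          ];
        }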

3. Compression Algorithm

3.1 Scoring Message Importance


        function evaluateMessageImportance(message: Message): number {
          // High importance: contains a decision or a conclusion
          if (containsDecision(message)) return 1.0;
          if (containsConclusion(message)) return 1.0;
        
          // Medium importance: user instructions and tool results
          if (message.type === 'user') return 0.8;
          if (message.type === 'tool_result') return 0.6;
        
          // Low importance: discussion
          if (isDiscussion(message)) return 0.3;
        
          return 0.5;  // default
        }
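
To show how such scores could drive the keep-or-summarize decision, here is a minimal sketch. The keyword patterns are hypothetical stand-ins for `containsDecision` and `containsConclusion`, whose real logic is not shown in these notes, and `partition` is an illustrative helper.

```typescript
interface Message {
  type: 'user' | 'assistant' | 'tool_result';
  text: string;
}

// Hypothetical keyword heuristics; real detection would be more involved.
const DECISION = /\b(decided|we will|let's use)\b/i;
const CONCLUSION = /\b(in conclusion|root cause|therefore)\b/i;

function importance(m: Message): number {
  if (DECISION.test(m.text) || CONCLUSION.test(m.text)) return 1.0;
  if (m.type === 'user') return 0.8;
  if (m.type === 'tool_result') return 0.6;
  return 0.5; // default
}

// Split history into messages kept verbatim and messages folded into the summary.
function partition(messages: Message[], keepAtOrAbove: number) {
  const keep = messages.filter((m) => importance(m) >= keepAtOrAbove);
  const summarize = messages.filter((m) => importance(m) < keepAtOrAbove);
  return { keep, summarize };
}

const history: Message[] = [
  { type: 'user', text: 'Which database should we pick?' },
  { type: 'assistant', text: 'Comparing a few options first.' },
  { type: 'tool_result', text: 'benchmark output' },
  { type: 'assistant', text: 'We decided to use Postgres.' },
];
const split = partition(history, 0.8);
```

With a cutoff of 0.8, the user question and the decision are kept verbatim, while the exploration and the benchmark output are candidates for summarization.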

3.2 Generating the Summary


        async function generateSummary(messages: Message[]): Promise<string> {
          // Ask the model to write the summary
          const summary = await model.generate({
            system: 'Summarize the conversation concisely. Include: decisions made, current state, and key conclusions.',
            messages: messages,
          });
        
          return summary;
        }
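
The snippet above depends on a `model` object that is not defined in these notes. A testable variant, sketched below, passes the model call in as a function; the `Summarizer` type and the transcript format are assumptions made for the example.

```typescript
interface Message {
  role: string;
  content: string;
}

// Hypothetical signature for whatever performs the model call.
type Summarizer = (systemPrompt: string, transcript: string) => Promise<string>;

const SUMMARY_PROMPT =
  'Summarize the conversation concisely. Include: decisions made, current state, and key conclusions.';

async function generateSummary(messages: Message[], callModel: Summarizer): Promise<string> {
  // Flatten the history into one "[role] content" line per message.
  const transcript = messages.map((m) => `[${m.role}] ${m.content}`).join('\n');
  const summary = (await callModel(SUMMARY_PROMPT, transcript)).trim();
  // An empty summary would silently erase the history, so treat it as a failure.
  if (summary === '') throw new Error('model returned an empty summary');
  return summary;
}
```

Injecting the call keeps the summarization step independent of any particular client library and makes it easy to substitute a fake in tests.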

3.3 Question Analysis

Question 1: What does the message format look like after compaction?


        // The compact boundary message
        interface SystemCompactBoundaryMessage {
          type: 'system';
          subtype: 'compact_boundary';
          content: string;  // summary text
          timestamp: number;
        }
        
        // The message history after compaction
        mutableMessages = [
          {
            type: 'system',
            subtype: 'compact_boundary',
            content: 'Summary: User asked about... Assistant analyzed...',
            timestamp: Date.now(),
          },
          // most recent messages follow
        ];
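
Putting the boundary message and the recent messages together could look like the sketch below. `applyCompaction` and its `keepRecent` parameter are hypothetical; the notes do not say how many trailing messages survive verbatim.

```typescript
interface ChatMessage {
  type: 'user' | 'assistant';
  content: string;
}

interface CompactBoundary {
  type: 'system';
  subtype: 'compact_boundary';
  content: string; // summary text
  timestamp: number;
}

type HistoryEntry = ChatMessage | CompactBoundary;

// keepRecent (assumed parameter): number of trailing messages kept verbatim.
function applyCompaction(
  history: ChatMessage[],
  summary: string,
  keepRecent: number,
  now: number = Date.now(),
): HistoryEntry[] {
  const boundary: CompactBoundary = {
    type: 'system',
    subtype: 'compact_boundary',
    content: summary,
    timestamp: now,
  };
  // slice(-0) would return the whole array, so handle zero explicitly.
  const recent = keepRecent > 0 ? history.slice(-keepRecent) : [];
  return [boundary, ...recent];
}

const before: ChatMessage[] = [
  { type: 'user', content: 'Read the file' },
  { type: 'assistant', content: 'The file contains...' },
  { type: 'user', content: 'Now explain it' },
  { type: 'assistant', content: 'The file describes...' },
];
const after = applyCompaction(before, 'Summary: user asked about a file.', 2, 0);
```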

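Question 2: How are tool call results preserved?


        // Tool results are usually important, so they are preserved by default
        function shouldPreserveToolResult(message: Message): boolean {
          // Results of reading a file are preserved
          if (message.tool_name === 'Read') return true;
        
          // Results of writing or modifying a file may be critical
          if (message.tool_name === 'Write') return true;
        
          // Search results can be compressed
          if (message.tool_name === 'Grep') return false;
        
          return true;
        }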
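
The same policy can be written as a lookup table, which makes it easy to apply to a batch of results. This is only a sketch: `COMPRESSIBLE_TOOLS` and `splitToolResults` are hypothetical names, and the table simply mirrors the three rules above.

```typescript
interface ToolResult {
  tool_name: string;
  output: string;
}

// Tools whose results may be compressed; everything else is preserved by default.
const COMPRESSIBLE_TOOLS = new Set<string>(['Grep']);

function shouldPreserveToolResult(r: ToolResult): boolean {
  return !COMPRESSIBLE_TOOLS.has(r.tool_name);
}

function splitToolResults(results: ToolResult[]) {
  return {
    preserved: results.filter(shouldPreserveToolResult),
    compressible: results.filter((r) => !shouldPreserveToolResult(r)),
  };
}

const results: ToolResult[] = [
  { tool_name: 'Read', output: 'file body' },
  { tool_name: 'Grep', output: '42 matches' },
  { tool_name: 'Write', output: 'ok' },
];
const buckets = splitToolResults(results);
```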

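Question 3: Does compaction change the order of messages?


        // Message order is not changed.
        // A summary message is placed at the front; the messages it covers are dropped.
        
        [all messages before compaction]
          ↓
        [{summary}, ...messages_after_boundary]
          ↓
        [{summary}, ...messages_after_boundary, ...new messages]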
4. Effects of Compaction

4.1 Conversation Continuity


        // Before compaction
        User: "Read the file"
        Assistant: "The file contains..."
        User: "Now explain it"
        Assistant: "The file describes..."
        
        // After compaction
        [Summary] User asked about the file. Assistant read and analyzed it.
        User: "Now explain it"
        Assistant: "The file describes..."

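4.2 Question Analysis

Question 1: Can the model tell messages before compaction from those after it?


        // The summary tells the model what happened earlier
        {
          type: 'system',
          subtype: 'compact_boundary',
          content: 'Earlier in the conversation, you read a file and analyzed it...'
        }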

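Question 2: Can compaction be rolled back?


        // Rollback is not supported
        // /clear cannot be undone
        // /compact cannot be undone
        
        // Advice for users: to keep the history, use /resume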

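Question 3: What does compaction cost?


        // Compaction itself takes one API call
        // Cost: the generated summary is roughly 100-500 output tokens;
        //       the call also reads the history being summarized as input tokens
        // Benefit: frees 50-80% of the context space
        
        // On balance it is worth it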
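
A quick calculation makes the trade-off concrete. The numbers below are assumed for illustration only (120K tokens of old history condensed into a 500-token summary, with 40K tokens of recent messages kept), and `estimateCompaction` is a hypothetical helper.

```typescript
interface CompactionEstimate {
  summarizedTokens: number; // old history that is read once and then replaced
  summaryTokens: number;    // size of the generated summary
  keptTokens: number;       // recent messages kept verbatim
}

function estimateCompaction(e: CompactionEstimate) {
  const before = e.summarizedTokens + e.keptTokens;
  const after = e.summaryTokens + e.keptTokens;
  return { before, after, freedFraction: (before - after) / before };
}

const estimate = estimateCompaction({
  summarizedTokens: 120_000,
  summaryTokens: 500,
  keptTokens: 40_000,
});
// before = 160000, after = 40500, freedFraction is about 0.747
```

Under these assumptions the context shrinks by roughly 75%, which falls inside the 50-80% range quoted above.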

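5. Thought Questions

Thought Question 1: What happens if compaction fails?

Answer:


        try {
          await compact();
        } catch (error) {
          try {
            // If full compaction fails, fall back to partial compaction
            await partialCompact();
          } catch (partialError) {
            // If that also fails, tell the user
            displayWarning('Context is full. Please use /clear to continue.');
          }
        }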
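
The fallback order (full compaction, then partial compaction, then a warning) can be sketched as a small function that reports which path was taken. The three callbacks are injected so the example is self-contained; `compactWithFallback` and `CompactOutcome` are hypothetical names.

```typescript
type CompactOutcome = 'full' | 'partial' | 'failed';

async function compactWithFallback(
  compact: () => Promise<void>,
  partialCompact: () => Promise<void>,
  warn: (message: string) => void,
): Promise<CompactOutcome> {
  try {
    await compact();
    return 'full';
  } catch {
    // Full compaction failed; try the cheaper fallback next.
  }
  try {
    await partialCompact();
    return 'partial';
  } catch {
    // Both attempts failed; only now is the user warned.
    warn('Context is full. Please use /clear to continue.');
    return 'failed';
  }
}
```

Returning the outcome lets the caller decide whether to retry the pending request or stop and wait for the user.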

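Thought Question 2: How can compaction be optimized?

Answer:

1. Incremental compaction: compress only the content that pushes usage past the threshold

2. Smart summarization: keep the key information

3. Chunked compaction: compress in segments so that no single pass discards a large amount at once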
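
Chunked compaction, the third idea above, can be sketched as follows. The chunk size and the `summarize` callback are assumptions for illustration; a real summarizer would call the model rather than count messages.

```typescript
// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Summarize each chunk separately, so one bad summary affects only its own segment.
function compactInChunks(
  messages: string[],
  size: number,
  summarize: (segment: string[]) => string,
): string[] {
  return chunk(messages, size).map(summarize);
}

const summaries = compactInChunks(
  ['m1', 'm2', 'm3', 'm4', 'm5'],
  2,
  (segment) => `summary of ${segment.length} message(s)`,
);
```

With five messages and a chunk size of two, this produces three summaries, the last covering a single message.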

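Thought Question 3: Does compaction affect tools?

Answer:


        // Tool results are evaluated before compaction
        // Important results are carried over into the summary
        
        // Examples:
        // Read('a.txt') → kept in the summary
        // Grep('pattern') → may be compressed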

6. Further Reading

File	Core content
`src/services/compact/compact.ts`	Compaction implementation
`src/commands/compact/`	/compact command

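7. Next Section Preview

The next section takes a closer look at context and tool calls:

- How tool results affect the context

- Tool history within the context

- Tool calls and state management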

