
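Hands-On Project 3: A Secure Code Execution Platform

This article builds a secure code execution platform that runs code in multiple languages, visualizes execution results, and requires approval for dangerous operations. The project brings together three core capabilities: the sandbox system, human-in-the-loop collaboration, and streaming output.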

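Project Overview

The secure code execution platform provides the following core features:

  • Multi-language execution: supports Python, JavaScript, Go, and other languages
  • Sandbox isolation: all code runs in an isolated environment, protecting the host system
  • Result visualization: displays execution output and resource usage in real time
  • Approval for dangerous operations: sensitive operations require human confirmation
  • Execution history: every run is recorded and can be reviewed later

Figure: The five core features of the secure code execution platform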

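Technical Architecture

Secure code execution platform
├── Main agent (ExecutionManager)
│   ├── Request parsing
│   ├── Security checks
│   └── Result formatting
├── Sandbox backends
│   ├── Modal sandbox
│   ├── Daytona sandbox
│   └── Local Docker sandbox
├── Human-in-the-loop
│   ├── Code review
│   └── Execution approval
└── Monitoring
    ├── Resource monitoring
    └── Execution logs

Figure: The system's four-layer architecture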

Project Initialization

Directory Structure

code-executor/
├── src/
│   ├── agents/
│   │   └── executor-agent.ts
│   ├── backends/
│   │   └── sandbox-backend.ts
│   ├── tools/
│   │   └── execution-tools.ts
│   ├── security/
│   │   └── code-analyzer.ts
│   └── index.ts
├── package.json
└── tsconfig.json

Install Dependencies

bash
npm init -y
npm install @langchain/langgraph-agents @langchain/anthropic zod
npm install -D typescript @types/node

Sandbox Backend Configuration

typescript
// src/backends/sandbox-backend.ts
import { ModalSandbox } from "@langchain/langgraph-agents/sandbox";
import { CompositeBackend, StateBackend } from "@langchain/langgraph-agents/backends";

export function createModalSandboxBackend() {
  return (rt: any) => {
    const sandbox = new ModalSandbox({
      image: "python:3.11",
      timeout: 300,
      memory: 512,
      cpu: 1,
    });

    return new CompositeBackend(
      sandbox,
      {
        "/logs/": new StateBackend(rt),
        "/results/": new StateBackend(rt),
      }
    );
  };
}

Daytona Sandbox Configuration

typescript
import { DaytonaSandbox } from "@langchain/langgraph-agents/sandbox";

export function createDaytonaSandboxBackend() {
  return (rt: any) => {
    const sandbox = new DaytonaSandbox({
      workspace: "code-executor",
      persistWorkspace: false,
    });

    return new CompositeBackend(
      sandbox,
      {
        "/logs/": new StateBackend(rt),
        "/results/": new StateBackend(rt),
      }
    );
  };
}

Choosing Between Sandboxes

typescript
type SandboxType = "modal" | "daytona" | "local";

export function createSandboxBackend(type: SandboxType) {
  switch (type) {
    case "modal":
      return createModalSandboxBackend();
    case "daytona":
      return createDaytonaSandboxBackend();
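    case "local":
      // createLocalDockerBackend is not shown in this article; it must be defined elsewhere in this file.
      return createLocalDockerBackend();
    default:
      throw new Error(`Unknown sandbox type: ${type}`);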
  }
}

Figure: Comparing the three sandbox backends
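The local Docker backend is referenced above but never shown. As a rough idea of what such a backend might hand to the Docker CLI, the sketch below builds an argument list for `docker run` with resource caps and networking disabled. The `DockerLimits` shape, the `buildDockerRunArgs` helper, and the process cap of 128 are assumptions made for illustration; only the Docker flags themselves are standard.

```typescript
// Hypothetical limits shape; these names do not come from the article's libraries.
interface DockerLimits {
  image: string; // container image, e.g. "python:3.11"
  memoryMb: number; // memory cap in megabytes
  cpus: number; // CPU cap
  allowNetwork: boolean; // when false, the container gets no network at all
}

// Build the argument list for `docker run` using standard Docker CLI flags.
function buildDockerRunArgs(limits: DockerLimits, command: string[]): string[] {
  const args = [
    "run",
    "--rm", // remove the container when it exits
    "--memory", `${limits.memoryMb}m`,
    "--cpus", String(limits.cpus),
    "--pids-limit", "128", // cap on process count (assumed value)
    "--read-only", // mount the root filesystem read-only
  ];
  if (!limits.allowNetwork) {
    args.push("--network", "none");
  }
  return [...args, limits.image, ...command];
}

const dockerArgs = buildDockerRunArgs(
  { image: "python:3.11", memoryMb: 512, cpus: 1, allowNetwork: false },
  ["python3", "-c", "print(1)"]
);
console.log(["docker", ...dockerArgs].join(" "));
```

Returning an argument array instead of a single shell string means the values can be passed to a process spawner without going through a shell, which avoids quoting problems in user-supplied code.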

Code Security Analysis

typescript
// src/security/code-analyzer.ts
interface SecurityAnalysis {
  safe: boolean;
  riskLevel: "low" | "medium" | "high" | "critical";
  warnings: string[];
  blockedPatterns: string[];
}

const DANGEROUS_PATTERNS = [
  { pattern: /rm\s+-rf/i, risk: "critical", desc: "Dangerous delete command" },
  { pattern: /eval\s*\(/i, risk: "high", desc: "Dynamic code execution" },
  { pattern: /exec\s*\(/i, risk: "high", desc: "Command execution" },
  { pattern: /subprocess/i, risk: "medium", desc: "Subprocess invocation" },
  { pattern: /os\.system/i, risk: "high", desc: "System command execution" },
  { pattern: /import\s+socket/i, risk: "medium", desc: "Network access" },
  { pattern: /open\s*\([^)]*['"]\/etc/i, risk: "high", desc: "System file access" },
  { pattern: /curl|wget/i, risk: "medium", desc: "Network download" },
];

export function analyzeCodeSecurity(code: string): SecurityAnalysis {
  const warnings: string[] = [];
  const blockedPatterns: string[] = [];
  let maxRisk: SecurityAnalysis["riskLevel"] = "low";

  for (const { pattern, risk, desc } of DANGEROUS_PATTERNS) {
    if (pattern.test(code)) {
      warnings.push(`${desc} (risk: ${risk})`);

      if (risk === "critical") {
        blockedPatterns.push(desc);
        maxRisk = "critical";
      } else if (risk === "high" && maxRisk !== "critical") {
        maxRisk = "high";
      } else if (risk === "medium" && maxRisk === "low") {
        maxRisk = "medium";
      }
    }
  }

  return {
    safe: blockedPatterns.length === 0,
    riskLevel: maxRisk,
    warnings,
    blockedPatterns,
  };
}

Figure: The four-level risk model used by the security analysis
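The four risk levels only become useful once each maps to an action. The sketch below is one way to express that mapping, assuming that critical findings are blocked outright, high-risk findings pause for approval, and everything else runs. The `RISK_ORDER` table, `highestRisk`, and `decideAction` are hypothetical helpers and are not part of the analyzer above.

```typescript
type RiskLevel = "low" | "medium" | "high" | "critical";
type Action = "run" | "needs_approval" | "block";

// Levels listed from least to most severe, so the index doubles as a rank.
const RISK_ORDER: RiskLevel[] = ["low", "medium", "high", "critical"];

// Return the most severe level in the list; an empty list counts as "low".
function highestRisk(levels: RiskLevel[]): RiskLevel {
  return levels.reduce<RiskLevel>(
    (max, level) =>
      RISK_ORDER.indexOf(level) > RISK_ORDER.indexOf(max) ? level : max,
    "low"
  );
}

// Assumed policy: block critical, ask a human for high, run the rest.
function decideAction(level: RiskLevel): Action {
  if (level === "critical") return "block";
  if (level === "high") return "needs_approval";
  return "run";
}

console.log(decideAction(highestRisk(["medium", "high", "low"])));
```

Ranking by array index replaces the chain of `else if` comparisons with a single comparison, so adding a fifth level later would only mean extending `RISK_ORDER`.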

Execution Tools

typescript
// src/tools/execution-tools.ts
import { tool } from "@langchain/core/tools";
import { z } from "zod";
import { analyzeCodeSecurity } from "../security/code-analyzer";

export const analyzeCodeTool = tool(
  async ({ code, language }) => {
    const analysis = analyzeCodeSecurity(code);

    return JSON.stringify({
      language,
      codeLength: code.length,
      lineCount: code.split("\n").length,
      ...analysis,
    }, null, 2);
  },
  {
    name: "analyze_code",
    description: "Analyze code for security risks",
    schema: z.object({
      code: z.string().describe("The code to analyze"),
      language: z.string().describe("Programming language"),
    }),
  }
);

export const formatResultTool = tool(
  async ({ stdout, stderr, exitCode, executionTime }) => {
    const status = exitCode === 0 ? "✅ Success" : "❌ Failure";

    let output = `## Execution result ${status}\n\n`;
    output += `**Exit code**: ${exitCode}\n`;
    output += `**Execution time**: ${executionTime}ms\n\n`;

    if (stdout) {
      output += `### Standard output\n\`\`\`\n${stdout}\n\`\`\`\n\n`;
    }

    if (stderr) {
      output += `### Standard error\n\`\`\`\n${stderr}\n\`\`\`\n\n`;
    }

    return output;
  },
  {
    name: "format_result",
    description: "Format the execution result",
    schema: z.object({
      stdout: z.string().describe("Standard output"),
      stderr: z.string().describe("Standard error"),
      exitCode: z.number().describe("Exit code"),
      executionTime: z.number().describe("Execution time in milliseconds"),
    }),
  }
);

export const saveExecutionLogTool = tool(
  async ({ code, language, result, timestamp }) => {
    const filename = `/logs/${timestamp}-${language}.json`;

    const log = {
      timestamp,
      language,
      code,
      result,
      savedAt: new Date().toISOString(),
    };

    return { filename, content: JSON.stringify(log, null, 2) };
  },
  {
    name: "save_execution_log",
    description: "Save the execution log",
    schema: z.object({
      code: z.string(),
      language: z.string(),
      result: z.any(),
      timestamp: z.string(),
    }),
  }
);

Figure: Data flow through the execution tool chain

Main Agent Configuration

typescript
// src/agents/executor-agent.ts
import { createDeepAgent } from "@langchain/langgraph-agents";
import { createSandboxBackend } from "../backends/sandbox-backend";
import { analyzeCodeSecurity } from "../security/code-analyzer"; // used by interruptOn below
import { analyzeCodeTool, formatResultTool, saveExecutionLogTool } from "../tools/execution-tools";

export function createCodeExecutor(sandboxType: "modal" | "daytona" | "local" = "modal") {
  return createDeepAgent({
    model: "claude-sonnet-4-20250514",
    name: "code-executor",
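    systemPrompt: `You are a secure code execution platform. Your responsibilities are:

1. Receive code execution requests from the user
2. Analyze the code for security risks
3. Run the code in a sandbox environment
4. Return formatted execution results

Security policy:
- All code runs in an isolated sandbox
- High-risk operations require human approval
- Dangerous system commands must not be executed
- Network access and resource usage are restricted

Supported languages:
- Python (python3)
- JavaScript (node)
- Go (go run)
- Bash (bash)

Execution flow:
1. Use analyze_code to check the code's security
2. If it is safe, use execute to run the code
3. Use format_result to format the output
4. Use save_execution_log to record a log entry`,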
    tools: [analyzeCodeTool, formatResultTool, saveExecutionLogTool],
    backend: createSandboxBackend(sandboxType),
    interruptOn: {
      execute: {
        condition: (args: any) => {
          const analysis = analyzeCodeSecurity(args.command || "");
          return analysis.riskLevel === "high" || analysis.riskLevel === "critical";
        },
      },
    },
  });
}

Figure: The agent's execution flow and the interruptOn interrupt mechanism
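The system prompt lists a runner for each supported language (python3, node, go run, bash) and leaves the agent to pick the right command. A small lookup table can make that mapping explicit and reject anything outside the list. This is a minimal sketch: the `RUNNERS` table, the `main.*` file names, and `resolveRunner` are hypothetical and not part of the agent configuration above.

```typescript
type Language = "python" | "javascript" | "go" | "bash";

interface Runner {
  file: string; // file the submitted code would be written to (assumed name)
  argv: string[]; // command and arguments used to run that file
}

// One entry per language named in the system prompt.
const RUNNERS: Record<Language, Runner> = {
  python: { file: "main.py", argv: ["python3", "main.py"] },
  javascript: { file: "main.js", argv: ["node", "main.js"] },
  go: { file: "main.go", argv: ["go", "run", "main.go"] },
  bash: { file: "main.sh", argv: ["bash", "main.sh"] },
};

// Look up the runner for a language name, ignoring case.
function resolveRunner(language: string): Runner {
  const key = language.toLowerCase();
  // Own-property check so inherited names like "toString" are not accepted.
  if (!Object.prototype.hasOwnProperty.call(RUNNERS, key)) {
    throw new Error(`Unsupported language: ${language}`);
  }
  return RUNNERS[key as Language];
}

console.log(resolveRunner("Go").argv.join(" "));
```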

Human-in-the-Loop: Execution Approval

typescript
// src/approval/execution-approval.ts
import { HumanMessage, AIMessage } from "@langchain/core/messages";

interface ApprovalRequest {
  code: string;
  language: string;
  riskLevel: string;
  warnings: string[];
}

interface ApprovalResponse {
  approved: boolean;
  modifiedCode?: string;
  reason?: string;
}

export async function requestExecutionApproval(
  agent: any,
  request: ApprovalRequest
): Promise<ApprovalResponse> {
  console.log("\n⚠️  Human approval required");
  console.log("================");
  console.log(`Language: ${request.language}`);
  console.log(`Risk level: ${request.riskLevel}`);
  console.log(`Warnings:\n${request.warnings.map((w) => `  - ${w}`).join("\n")}`);
  console.log("\nCode:");
  console.log("```");
  console.log(request.code);
  console.log("```\n");

  const readline = await import("readline");
  const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout,
  });

  return new Promise((resolve) => {
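    rl.question("Approve execution? (y/n/e[edit]): ", async (answer) => {
      rl.close();

      // Trim so that stray whitespace around the answer does not cause a rejection.
      switch (answer.trim().toLowerCase()) {
        case "y":
        case "yes":
          resolve({ approved: true });
          break;
        case "e":
        case "edit": {
          // Braces scope the declaration to this case.
          const modifiedCode = await editCode(request.code);
          resolve({ approved: true, modifiedCode });
          break;
        }
        default:
          resolve({ approved: false, reason: "Execution rejected by the user" });
      }
    });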
  });
}

async function editCode(originalCode: string): Promise<string> {
  console.log("\nEnter the modified code (type END on its own line to finish):");

  const readline = await import("readline");
  const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout,
  });

  const lines: string[] = [];

  return new Promise((resolve) => {
    rl.on("line", (line) => {
      if (line === "END") {
        rl.close();
        resolve(lines.join("\n"));
      } else {
        lines.push(line);
      }
    });
  });
}

Figure: The three paths through human approval
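The three approval paths (approve, edit, reject) depend only on the text the reviewer types, so that part can be pulled out into a pure function and tested without a terminal. The sketch below assumes the same accepted answers as the prompt above; `parseApprovalAnswer` and the `ApprovalDecision` type are hypothetical names.

```typescript
type ApprovalDecision = "approve" | "edit" | "reject";

// Map the reviewer's raw answer to a decision. Anything unrecognized is a rejection,
// so an accidental keypress never approves execution.
function parseApprovalAnswer(answer: string): ApprovalDecision {
  switch (answer.trim().toLowerCase()) {
    case "y":
    case "yes":
      return "approve";
    case "e":
    case "edit":
      return "edit";
    default:
      return "reject";
  }
}

console.log(parseApprovalAnswer(" Yes "));
```

Defaulting to rejection keeps the approval step fail-closed: an empty line or a typo has the same effect as answering "n".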

Usage Examples

Basic Code Execution

typescript
// src/index.ts
import { createCodeExecutor } from "./agents/executor-agent";

async function executeCode() {
  const agent = createCodeExecutor("modal");

  const result = await agent.invoke({
    messages: [
      {
        role: "human",
        content: `
Please run the following Python code:

\`\`\`python
import math

def calculate_primes(n):
    primes = []
    for num in range(2, n + 1):
        is_prime = True
        for i in range(2, int(math.sqrt(num)) + 1):
            if num % i == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(num)
    return primes

result = calculate_primes(100)
print(f"Primes up to 100: {result}")
print(f"{len(result)} primes in total")
\`\`\`
`,
      },
    ],
  });

  console.log(result.messages.at(-1)?.content);
}

// Surface failures instead of leaving an unhandled promise rejection.
executeCode().catch(console.error);

Multi-Language Execution

typescript
async function multiLanguageExecution() {
  const agent = createCodeExecutor("daytona");

  const codes = [
    {
      language: "python",
      code: `print("Hello from Python!")`,
    },
    {
      language: "javascript",
      code: `console.log("Hello from JavaScript!");`,
    },
    {
      language: "go",
      code: `
package main
import "fmt"
func main() {
    fmt.Println("Hello from Go!")
}`,
    },
  ];

  for (const { language, code } of codes) {
    const result = await agent.invoke({
      messages: [
        {
          role: "human",
          content: `Run the following ${language} code:\n\`\`\`${language}\n${code}\n\`\`\``,
        },
      ],
    });

    console.log(`\n=== ${language.toUpperCase()} ===`);
    console.log(result.messages.at(-1)?.content);
  }
}

Blocking Dangerous Code

typescript
async function dangerousCodeDemo() {
  const agent = createCodeExecutor("modal");

  const result = await agent.invoke({
    messages: [
      {
        role: "human",
        content: `
Run the following code:

\`\`\`python
import os
os.system("rm -rf /tmp/*")  # this should be blocked
\`\`\`
`,
      },
    ],
  });

  console.log(result.messages.at(-1)?.content);
}

Streaming Output: Live Execution Progress

typescript
async function streamingExecution() {
  const agent = createCodeExecutor("modal");

  const messages = [
    {
      role: "human",
      content: `
Run a long-running Python script:

\`\`\`python
import time

for i in range(5):
    print(f"Step {i + 1}/5")
    time.sleep(1)

print("Done!")
\`\`\`
`,
    },
  ];

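  console.log("🚀 Starting execution...\n");

  // With several stream modes and subgraphs enabled, each item is a
  // [namespace, mode, chunk] tuple; the chunk's shape depends on the mode.
  for await (const [, mode, chunk] of await agent.stream(
    { messages },
    { streamMode: ["updates", "messages", "custom"], subgraphs: true }
  )) {
    if (mode === "messages") {
      // "messages" chunks are [messageChunk, metadata] pairs.
      const [message] = chunk;
      if (typeof message.content === "string" && message.content) {
        process.stdout.write(message.content);
      }
    }

    if (mode === "updates") {
      // "updates" chunks map node names to their state updates.
      console.log(`\n📋 Update from: ${Object.keys(chunk).join(", ")}`);
    }

    if (mode === "custom") {
      // "custom" chunks carry whatever the tools or sandbox choose to emit.
      console.log(`\n📤 Output: ${JSON.stringify(chunk)}`);
    }
  }

  console.log("\n\n✅ Execution finished");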
}

Frontend Integration

tsx
import { useStream } from "@langchain/langgraph-sdk/react";
import { useState } from "react";
import Editor from "@monaco-editor/react";

function CodeExecutor() {
  const [code, setCode] = useState("");
  const [language, setLanguage] = useState("python");

  const stream = useStream({
    assistantId: "code-executor",
    apiUrl: "http://localhost:2024",
  });

  const executeCode = async () => {
    await stream.submit({
      messages: [
        {
          role: "human",
          content: `Run the following ${language} code:\n\`\`\`${language}\n${code}\n\`\`\``,
        },
      ],
    });
  };

  const handleApproval = async (approved: boolean) => {
    if (stream.isInterrupted) {
      await stream.resume({ action: approved ? "approve" : "reject" });
    }
  };

  return (
    <div className="code-executor">
      <div className="editor-section">
        <div className="toolbar">
          <select value={language} onChange={(e) => setLanguage(e.target.value)}>
            <option value="python">Python</option>
            <option value="javascript">JavaScript</option>
            <option value="go">Go</option>
            <option value="bash">Bash</option>
          </select>
          <button onClick={executeCode} disabled={stream.isLoading}>
            ▶ Run
          </button>
        </div>
        <Editor
          height="300px"
          language={language}
          value={code}
          onChange={(value) => setCode(value || "")}
          theme="vs-dark"
        />
      </div>

      {stream.isInterrupted && (
        <div className="approval-dialog">
          <h3>⚠️ Approval required</h3>
          <p>A potentially dangerous operation was detected. Continue with execution?</p>
          <div className="actions">
            <button onClick={() => handleApproval(true)}>✅ Approve</button>
            <button onClick={() => handleApproval(false)}>❌ Reject</button>
          </div>
        </div>
      )}

      <div className="output-section">
        <h3>Execution result</h3>
        <pre className="output">
          {stream.messages.at(-1)?.content || "Waiting for execution..."}
        </pre>
      </div>
    </div>
  );
}

Security Configuration

Resource Limits

typescript
const RESOURCE_LIMITS = {
  modal: {
    timeout: 300,
    memory: 512,
    cpu: 1,
    diskSpace: 1024,
  },
  daytona: {
    timeout: 600,
    memory: 1024,
    cpu: 2,
    diskSpace: 2048,
  },
};
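A limits table like this is most useful as a ceiling: a request may ask for less than the backend allows, but never more. The sketch below shows one way to apply that rule. The `ResourceLimits` interface, the `LIMITS` copy of the table, and `clampToLimits` are hypothetical, and the units are whatever the sandbox backend expects, since the article does not state them.

```typescript
interface ResourceLimits {
  timeout: number;
  memory: number;
  cpu: number;
  diskSpace: number;
}

// Same values as the table above, repeated here so the sketch is self-contained.
const LIMITS: Record<"modal" | "daytona", ResourceLimits> = {
  modal: { timeout: 300, memory: 512, cpu: 1, diskSpace: 1024 },
  daytona: { timeout: 600, memory: 1024, cpu: 2, diskSpace: 2048 },
};

// Use the requested value when it is a finite number within [1, max]; otherwise
// fall back to the ceiling (missing or invalid) or clamp into range.
function clampValue(value: number | undefined, max: number): number {
  if (value === undefined || !Number.isFinite(value)) return max;
  return Math.min(Math.max(value, 1), max);
}

function clampToLimits(
  requested: Partial<ResourceLimits>,
  ceiling: ResourceLimits
): ResourceLimits {
  return {
    timeout: clampValue(requested.timeout, ceiling.timeout),
    memory: clampValue(requested.memory, ceiling.memory),
    cpu: clampValue(requested.cpu, ceiling.cpu),
    diskSpace: clampValue(requested.diskSpace, ceiling.diskSpace),
  };
}

console.log(clampToLimits({ timeout: 60, memory: 4096 }, LIMITS.modal));
```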

Network Policy

typescript
const NETWORK_POLICY = {
  allowOutbound: false,
  allowedDomains: ["pypi.org", "npmjs.com"],
  blockedPorts: [22, 23, 25, 3306, 5432],
};

Figure: Security configuration overview: resource limits and network policy
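The policy object does not say how its three fields interact. One plausible reading, assumed here, is that blocked ports always win, and that when `allowOutbound` is false the `allowedDomains` list acts as the only set of exceptions. The sketch below encodes that reading; `NetworkPolicy`, `POLICY`, and `isConnectionAllowed` are hypothetical names.

```typescript
interface NetworkPolicy {
  allowOutbound: boolean;
  allowedDomains: string[];
  blockedPorts: number[];
}

// Same values as the policy above, repeated so the sketch is self-contained.
const POLICY: NetworkPolicy = {
  allowOutbound: false,
  allowedDomains: ["pypi.org", "npmjs.com"],
  blockedPorts: [22, 23, 25, 3306, 5432],
};

// Assumed semantics: blocked ports are always refused; otherwise outbound traffic
// is allowed either globally or only to listed domains and their subdomains.
function isConnectionAllowed(
  host: string,
  port: number,
  policy: NetworkPolicy
): boolean {
  if (policy.blockedPorts.includes(port)) return false;
  if (policy.allowOutbound) return true;
  const normalized = host.toLowerCase();
  return policy.allowedDomains.some(
    (domain) => normalized === domain || normalized.endsWith(`.${domain}`)
  );
}

console.log(isConnectionAllowed("pypi.org", 443, POLICY));
```

Matching on the exact domain or a dot-prefixed suffix means a subdomain of pypi.org passes while a lookalike such as evilpypi.org does not.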

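Summary

This article built a secure code execution platform that combines:

  1. Sandbox system: Modal and Daytona sandboxes run code in isolation
  2. Human-in-the-loop: dangerous operations require human approval
  3. Security analysis: code is checked for risks before it runs
  4. Streaming output: execution progress and output are shown in real time
  5. Multi-language support: Python, JavaScript, Go, and Bash

In the next article, we will build a multi-agent collaboration workflow that simulates a complete software development team.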
