ml-intern：Hugging Face 开源的自主 ML 工程师 Agent

项目简介

ml-intern 是 Hugging Face 团队开源的自主研究型 ML 实习生 Agent。它能够自主读取论文、训练模型并发布与机器学习相关的高质量代码，深度访问 Hugging Face 文档、数据集和云计算资源。项目于今日登顶 GitHub Trending，累计获得 1,682 颗星，今日新增 530 星。

快速开始

安装

1
2
3
4


git clone git@github.com:huggingface/ml-intern.git
cd ml-intern
uv sync
uv tool install -e .

环境配置

在项目根目录创建 .env 文件：

1
2
3


ANTHROPIC_API_KEY=    # 使用 Anthropic 模型时需要
HF_TOKEN=             # Hugging Face Token
GITHUB_TOKEN=         # GitHub 个人访问令牌

如果未设置 HF_TOKEN，CLI 会在首次启动时提示输入。

使用方式

交互模式（启动聊天会话）：

1

ml-intern

无头模式（单次提示，自动批准）：

1

ml-intern "fine-tune llama on my dataset"

可选参数：

1
2
3


ml-intern --model anthropic/claude-opus-4-6 "your prompt"
ml-intern --max-iterations 100 "your prompt"
ml-intern --no-stream "your prompt"

系统架构

ml-intern 由四个核心组件驱动：

组件	功能
ContextManager	消息历史管理、自动压缩（170k token 上限）、会话上传到 HF
ToolRouter	工具路由：HF 文档/仓库/数据集、GitHub 代码搜索、沙箱、规划、MCP 服务
Doom Loop Detector	检测重复工具调用模式，注入纠正提示防止死循环
submission_loop	操作队列处理与事件分发

Agent 执行循环

接收用户消息 → 添加至 ContextManager
LLM 调用（litellm.acompletion）
解析 tool_calls[]
审查检查（涉及 jobs、沙箱、破坏性操作时请求用户批准）
通过 ToolRouter 执行工具
将结果加回 ContextManager
重复循环（最多 300 次迭代）

事件系统

Agent 通过内置事件队列（event_queue）对外广播运行状态，便于监控和与外部系统集成：

事件名	说明
`processing`	开始处理用户输入
`ready`	Agent 准备就绪
`assistant_chunk`	流式 token 块
`assistant_message`	完整的 LLM 响应文本
`tool_call`	工具被调用及其参数
`tool_output`	工具执行结果
`approval_required`	请求用户批准敏感操作
`turn_complete`	Agent 当前轮次处理完成
`compacted`	上下文已被压缩
`shutdown`	Agent 关闭

扩展开发

添加内置工具

编辑 agent/core/tools.py：

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15


def create_builtin_tools() -> list[ToolSpec]:
    return [
        ToolSpec(
            name="your_tool",
            description="What your tool does",
            parameters={
                "type": "object",
                "properties": {
                    "param": {"type": "string", "description": "Parameter description"}
                },
                "required": ["param"]
            },
            handler=your_async_handler,
        ),
    ]

接入 MCP 服务器

编辑 configs/main_agent_config.json：

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12


{
    "model_name": "anthropic/claude-sonnet-4-5-20250929",
    "mcpServers": {
        "your-server-name": {
            "transport": "http",
            "url": "https://example.com/mcp",
            "headers": {
                "Authorization": "Bearer ${YOUR_TOKEN}"
            }
        }
    }
}

总结

ml-intern 将 ML 工程师的日常工作（读论文、调参、训练、发布）自动化成一个可驱动的 Agent，核心优势包括：

深度集成 Hugging Face 生态（文档、数据集、模型仓库）
内置 Doom Loop 检测，防止 Agent 陷入无效循环
支持 MCP 服务器扩展，可接入任意外部工具
完善的事件系统，便于下游系统监控和集成
同时支持交互式对话与无头自动化两种使用场景

项目地址：https://github.com/huggingface/ml-intern