Leo/Arcrun

Leo c2a2f82ade docs(arcrun): SDD llm-interface: make the AI-operator experience first-class

Motivation: three days of mira dogfooding surfaced 14 pain points, 7 of which come purely from the missing LI.
arcrun's design has so far focused on humans (u6u-gui / docs); its usability for AI
was never treated as a first-class concern.

The SDD trio (matrix/arcrun/.agents/specs/llm-interface/):

requirements.md
  - personas (Claude Code as the primary / users' private agents / SDK users)
  - scope covers 5 systems (cypher-executor / registry / u6u-mcp / u6u-gui / kbdb)
  - 10 FRs: onboarding / CRUD parity / dry-run / structured trace /
    programmatic errors / feedback tool / implicit telemetry /
    skill blocks / examples / weekly closed loop
  - 5 NFRs: compatibility / multi-transport / stable error contract /
    exportable feedback / quantified coverage

design.md
  - 5-layer LI model: AGENTS.md / arcrun-mcp / Skills / Examples / Telemetry
  - full list of 25 MCP tools in 5 categories
  - error_code enum v1
  - coverage matrix (GUI actions vs MCP / 31 cypher-executor routes vs LI)
  - complete AGENTS.md template
  - u6u-mcp → arcrun-mcp migration plan (90-day deprecation)
  - weekly_review workflow YAML template

tasks.md
  - 5 milestones (M1 collect data / M2 gap-fill / M3 skills + examples /
    M4 closed loop / M5 rename)
  - estimate: 23 working days (~5 weeks)
  - M1 is a hard prerequisite (without data there is no way to tell whether a change was right)

Audit baseline (compiled with 4 parallel Explore agents):
  - cypher-executor: 31 HTTP routes, 9 of them AI-essential
  - u6u-mcp: 15 tools; missing update/delete/history/validate/feedback
  - u6u-gui: 8 human actions can be mirrored in the LI / 3 visual ones need not be
  - kbdb: 50 routes in 13 groups; the LI goes through abstracted tools and does not expose them directly

Also updates the SDD index in .claude/rules/04-current-progress.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-16 14:58:21 +08:00

Tasks: LI (LLM Interface) for arcrun

SDD: design.md + requirements.md (same directory)
Progress markers: [ ] pending / [🔄] doing / [x] done / [⏸] blocked

Milestone 1: Measurable (collect data first)

Goal: within 1 week, wire up the data pipeline through which the platform itself measures how well AI agents can use it.

M1.1 AGENTS.md v1

Write arcrun/AGENTS.md (following the design.md §5 template)
CI hook: a change to the repo's AGENTS.md automatically syncs the KBDB block
arcrun_get_onboarding MCP tool (reads the KBDB block)

M1.2 Implicit telemetry collection

Create KBDB template agent-telemetry (slots: event_type, workflow_name, error_code, duration_ms, api_key_hash, agent_user_agent)
Add a telemetry write at the end of cypher-executor webhook-handlers.executeWebhookGraph (record both success and failure)
cypher-executor routes/webhooks-named.ts: emit a deploy event on push
cypher-executor routes/cypher.ts: validate failure → validation_error event
Helper: SHA-256 of the api_key, truncated to 16 characters
Privacy check: never log workflow content, only the name
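
The hashing helper and the name-only privacy rule could be sketched roughly as follows. This is a minimal illustration rather than the project's implementation: `hashApiKey` and `buildTelemetryEvent` are hypothetical names, hex encoding of the digest is an assumption (the plan only says "16 characters"), and `node:crypto` is assumed to be available in the runtime.

```typescript
import { createHash } from "node:crypto";

// Slot names follow the agent-telemetry template listed above.
interface TelemetryEvent {
  event_type: string;
  workflow_name: string;
  error_code: string | null;
  duration_ms: number;
  api_key_hash: string;
  agent_user_agent: string;
}

// SHA-256 the API key and keep the first 16 hex characters, so events can be
// grouped per key without the key itself ever being stored.
// Assumption: hex encoding; the plan only says "truncated to 16 characters".
function hashApiKey(apiKey: string): string {
  return createHash("sha256").update(apiKey, "utf8").digest("hex").slice(0, 16);
}

// Hypothetical builder. It accepts the workflow name only, so workflow
// content cannot reach the event by construction.
function buildTelemetryEvent(input: {
  eventType: string;
  workflowName: string;
  errorCode?: string;
  startedAtMs: number;
  finishedAtMs: number;
  apiKey: string;
  userAgent: string;
}): TelemetryEvent {
  return {
    event_type: input.eventType,
    workflow_name: input.workflowName,
    error_code: input.errorCode === undefined ? null : input.errorCode,
    duration_ms: input.finishedAtMs - input.startedAtMs,
    api_key_hash: hashApiKey(input.apiKey),
    agent_user_agent: input.userAgent,
  };
}
```

Because the builder never receives the workflow definition, the "only the name" rule is enforced by the function signature rather than by reviewer discipline.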

M1.3 Explicit feedback tool

Create KBDB template agent-feedback (slots: issue_type, workflow_name, retry_count, blocked, suggested_fix, agent_user_agent)
Add tool arcrun_report_feedback to u6u-mcp
Lock the issue_type enum with a Zod schema
On write, take user_id from the partner-auth middleware
On write, add tags automatically (agent-feedback, issue:{type})
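
The write path for feedback might look something like the sketch below. It uses a plain type guard instead of Zod so that it stays dependency-free. The `issue_type` values are invented for illustration, since the plan does not list the real enum, and `buildFeedbackBlock` is a hypothetical name.

```typescript
// Hypothetical values: the plan does not list the real issue_type enum.
const ISSUE_TYPES = ["missing_tool", "unclear_error", "docs_gap", "other"] as const;
type IssueType = (typeof ISSUE_TYPES)[number];

// Slot names follow the agent-feedback template listed above.
interface FeedbackSlots {
  issue_type: IssueType;
  workflow_name: string;
  retry_count: number;
  blocked: boolean;
  suggested_fix: string;
  agent_user_agent: string;
}

interface FeedbackBlock {
  user_id: string;
  slots: FeedbackSlots;
  tags: string[];
}

function isIssueType(value: unknown): value is IssueType {
  return ISSUE_TYPES.some((allowed) => allowed === value);
}

// Reject unknown issue types, then attach user_id and tags on the server side.
function buildFeedbackBlock(
  raw: Omit<FeedbackSlots, "issue_type"> & { issue_type: unknown },
  userId: string, // assumed to come from the auth middleware, never from the caller
): FeedbackBlock {
  const issueType = raw.issue_type;
  if (!isIssueType(issueType)) {
    throw new Error("invalid issue_type: " + String(issueType));
  }
  return {
    user_id: userId,
    slots: {
      issue_type: issueType,
      workflow_name: raw.workflow_name,
      retry_count: raw.retry_count,
      blocked: raw.blocked,
      suggested_fix: raw.suggested_fix,
      agent_user_agent: raw.agent_user_agent,
    },
    tags: ["agent-feedback", "issue:" + issueType],
  };
}
```

Deriving the tags from the validated value means the `issue:{type}` tag can never drift from the stored slot.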

M1.4 Acceptance

Develop mira with Claude Code for 1-2 days so that telemetry + feedback accumulate naturally
Use curl kbdb-get.arcrun.dev type=agent-telemetry to confirm data is present
Use curl kbdb-get.arcrun.dev type=agent-feedback to confirm the enum is enforced

Milestone 2: gap-fill (fill in the missing MCP tools)

Goal: anything a human can do in the GUI, an AI can do through MCP.

M2.1 New cypher-executor routes

GET /executions/:id: return a structured trace (read from the EXEC_CONTEXT KV)
- Existing data: the trace array in graph-executor.ts; KV persistence still needs to be confirmed
GET /workflows/:name/executions?limit=10: IDs + summaries of the most recent N executions
- Needs ANALYTICS_KV or a new index
GET /executions/paused: list currently paused executions
- Scan the paused:* prefix in the EXEC_CONTEXT KV
POST /preview: dry-run, no KV writes
- Reuse GraphExecutor, with env.EXEC_CONTEXT swapped for an in-memory mock
POST /webhooks/named/:name/diff: diff between the old and new YAML
GET /my-telemetry?limit=N: users view their own telemetry
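
Two of the ideas above, the in-memory stand-in for dry-runs and the prefix scan for paused executions, can be illustrated together. The sketch assumes a small store surface (`get` / `put` / `list` with a `prefix` option) and a `paused:<execution id>` key layout. Both are assumptions; the real EXEC_CONTEXT binding and key format may differ.

```typescript
// Assumed minimal store surface; the real EXEC_CONTEXT binding may differ.
interface ContextStore {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
  list(options?: { prefix?: string }): Promise<{ keys: { name: string }[] }>;
}

// In-memory implementation: everything written here is discarded with the
// object, which is what a dry-run wants.
class InMemoryStore implements ContextStore {
  private readonly entries = new Map<string, string>();

  async get(key: string): Promise<string | null> {
    const value = this.entries.get(key);
    return value === undefined ? null : value;
  }

  async put(key: string, value: string): Promise<void> {
    this.entries.set(key, value);
  }

  async list(options: { prefix?: string } = {}): Promise<{ keys: { name: string }[] }> {
    const prefix = options.prefix === undefined ? "" : options.prefix;
    const names = Array.from(this.entries.keys())
      .filter((name) => name.startsWith(prefix))
      .sort();
    return { keys: names.map((name) => ({ name })) };
  }
}

// List paused execution IDs by scanning the "paused:" key prefix.
// Assumption: keys look like "paused:<execution id>".
async function listPausedExecutionIds(store: ContextStore): Promise<string[]> {
  const { keys } = await store.list({ prefix: "paused:" });
  return keys.map((key) => key.name.slice("paused:".length));
}
```

A preview handler could pass a fresh `InMemoryStore` wherever the executor expects the persistent store, so the same execution code runs without leaving state behind.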

M2.2 MCP tools (wrapping existing + new endpoints)

arcrun_validate_yaml: wraps /validate
arcrun_get_execution_trace
arcrun_list_recent_executions
arcrun_list_paused_executions
arcrun_resume_execution: wraps /workflows/resume
arcrun_list_workflows: already exists, but verify
arcrun_get_workflow
arcrun_delete_workflow
arcrun_preview_workflow
arcrun_diff_workflow
arcrun_list_recipes / arcrun_create_recipe
arcrun_list_auth_recipes / arcrun_create_auth_recipe
arcrun_my_telemetry

M2.3 Unify the error contract

Define error_code enum v1 (design.md §3.1.4)
Uniform error wrapping for all u6u-mcp tools (helper function)
Uniform error response for all cypher-executor routes (including error_code + next_actions)
Write tests: at least one case per error_code
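
One possible shape for the shared error wrapper is sketched below. The three error codes and their `next_actions` hints are placeholders, since the real enum lives in design.md §3.1.4 and is not reproduced here. `ContractError`, `toErrorBody`, and `withErrorContract` are hypothetical names.

```typescript
// Hypothetical codes for illustration; the real enum is defined in design.md §3.1.4.
const ERROR_CODES = ["VALIDATION_FAILED", "WORKFLOW_NOT_FOUND", "INTERNAL_ERROR"] as const;
type ErrorCode = (typeof ERROR_CODES)[number];

interface ErrorBody {
  ok: false;
  error_code: ErrorCode;
  message: string;
  next_actions: string[];
}

// Hypothetical hints telling the caller what to try next for each code.
const NEXT_ACTIONS: Record<ErrorCode, string[]> = {
  VALIDATION_FAILED: ["call arcrun_validate_yaml and fix the reported fields"],
  WORKFLOW_NOT_FOUND: ["call arcrun_list_workflows to check the name"],
  INTERNAL_ERROR: ["retry once", "call arcrun_report_feedback if it persists"],
};

class ContractError extends Error {
  readonly code: ErrorCode;

  constructor(code: ErrorCode, message: string) {
    super(message);
    this.code = code;
    // Keep instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, ContractError.prototype);
  }
}

function toErrorBody(code: ErrorCode, message: string): ErrorBody {
  return { ok: false, error_code: code, message, next_actions: NEXT_ACTIONS[code] };
}

// Wrap a handler so every thrown error leaves in the same shape.
function withErrorContract<A, R>(handler: (args: A) => R): (args: A) => R | ErrorBody {
  return (args: A) => {
    try {
      return handler(args);
    } catch (err) {
      if (err instanceof ContractError) return toErrorBody(err.code, err.message);
      return toErrorBody("INTERNAL_ERROR", err instanceof Error ? err.message : String(err));
    }
  };
}
```

Typing `NEXT_ACTIONS` as `Record<ErrorCode, string[]>` makes the compiler flag any code added to the enum without a matching hint, which complements the "one test case per error_code" task.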

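M2.4 Acceptance

Simulate a zero-knowledge AI (new conversation) deploying a hello workflow by following AGENTS.md
Measure: fewer than 5 MCP calls in total from list_components to a successful run_workflow
Compare against the human GUI path: parity in click count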

Milestone 3: skill blocks + examples

Goal: an AI writing its first workflow does not have to guess; it has templates and playbooks.

M3.1 Seed skill blocks (5)

skill-build_watcher_workflow: cron + filter + trigger pattern
skill-debug_paused_workflow: the claude_api callback flow + how to trace it
skill-migrate_http_to_trigger_workflow: switching from self-fetch to trigger_workflow
skill-rag_with_arcrun: KBDB search + claude_api assembly
skill-add_new_wasm_component: write in TinyGo + push + register in the allowlist

M3.2 MCP tools

arcrun_list_skills(tag?)
arcrun_get_skill(id)
arcrun_publish_skill: the AI saves back what it has learned

M3.3 Seed examples (10)

webhook-to-slack
cron-watcher
llm-classify
rag-search-answer
email-summary
pdf-to-blocks
github-issue-bot
daily-digest
parallel-fanout
error-retry

Each one bundles workflow.yaml + description.md + tags.json and lives in arcrun/registry/examples/{slug}/.

M3.4 Examples index + search

CI builds the examples → KBDB block type=workflow-example (with YAML + tags + description)
arcrun_search_examples(use_case) MCP tool (backed by KBDB /search)

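Milestone 4: closed loop

Goal: once data is flowing, the platform digests it itself and produces a roadmap.

M4.1 Weekly review workflow

Write polaris/mira/arcrun/agent_feedback_weekly_review.yaml (based on the design.md §4.5 template)
cron 0 9 * * 1 (Mondays at 09:00 UTC)
acr push
Trigger it manually once as a test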
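
As a quick sanity check on the schedule, the cron expression `0 9 * * 1` can be mirrored in a few lines. This is only an illustration of what the expression means (minute 0, hour 9, any day of month, any month, weekday 1 = Monday), evaluated in UTC as stated above. The function names are hypothetical and this is not how the scheduler itself works.

```typescript
// True when the instant falls on a Monday at exactly 09:00 UTC,
// which is when "0 9 * * 1" fires if the scheduler runs in UTC.
function matchesWeeklyReview(at: Date): boolean {
  return at.getUTCDay() === 1 && at.getUTCHours() === 9 && at.getUTCMinutes() === 0;
}

// Next firing time strictly after `from`.
function nextWeeklyReview(from: Date): Date {
  // Start at 09:00 UTC on the same calendar day, then step one day at a time
  // until the candidate is in the future and lands on a Monday.
  const next = new Date(
    Date.UTC(from.getUTCFullYear(), from.getUTCMonth(), from.getUTCDate(), 9, 0, 0),
  );
  while (next.getTime() <= from.getTime() || next.getUTCDay() !== 1) {
    next.setUTCDate(next.getUTCDate() + 1);
  }
  return next;
}

// Starting from the commit timestamp shown above (a Saturday):
const commitTime = new Date("2026-05-16T14:58:21+08:00");
console.log(nextWeeklyReview(commitTime).toISOString()); // 2026-05-18T09:00:00.000Z
```

Note that 09:00 UTC is 17:00 at UTC+08:00, which matters if the review is expected to arrive on a Monday morning local time.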

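M4.2 LLM aggregation prompt

Write the prompt: feed feedback + telemetry to Claude → produce the top 5 pain points + recommendations
Fixed output format: markdown sections (pain point / evidence / recommendation / severity)
Store as KBDB block type=arcrun-roadmap

M4.3 Notification

notify_telegram node: push to leo
Also write it to the mira stream (so leo sees it in a familiar interface)

M4.4 Acceptance

Run for a full week → receive the first roadmap
After review, leo picks 1-2 issues to fix
Run a second week → confirm those issues have dropped off the top list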

Milestone 5: rename + cleanup

Goal: finish branding the LI and retire u6u-mcp.

M5.1 Deployment

Change the u6u-mcp Worker's wrangler name → arcrun-mcp
Add route mcp.arcrun.dev/*
Keep the old route mcp.finally.click/* for 90 days

M5.2 Tool rename

Add an arcrun_* alias for every u6u_* tool
Calls to an old tool → response includes a deprecation_warning field
AGENTS.md uses only the new names
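
The alias-plus-warning behaviour could be sketched as a small registration helper. The registry shape, `ToolHandler`, and `registerWithAlias` are hypothetical and not taken from the u6u-mcp code. The only details drawn from the plan are the `u6u_` → `arcrun_` prefix swap and the `deprecation_warning` field.

```typescript
// Hypothetical handler and registry shapes, for illustration only.
type ToolHandler = (args: Record<string, unknown>) => Record<string, unknown>;

// Register a tool under its new arcrun_* name and keep the legacy u6u_* name
// as an alias whose responses carry a deprecation_warning field.
function registerWithAlias(
  registry: Map<string, ToolHandler>,
  legacyName: string,
  handler: ToolHandler,
): void {
  const legacyPrefix = "u6u_";
  if (!legacyName.startsWith(legacyPrefix)) {
    throw new Error("expected a u6u_* tool name, got " + legacyName);
  }
  const newName = "arcrun_" + legacyName.slice(legacyPrefix.length);

  // New name: the handler is exposed unchanged.
  registry.set(newName, handler);

  // Old name: same result, plus a warning that points at the replacement.
  registry.set(legacyName, (args) => ({
    ...handler(args),
    deprecation_warning: legacyName + " is deprecated; call " + newName + " instead",
  }));
}
```

Keeping both names on one handler means the old and new tools cannot diverge during the 90-day window, and removing an alias later is a single delete from the registry.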

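M5.3 Docs

Final version of arcrun/AGENTS.md
Mark u6u-mcp/README.md as deprecated and point to the new location
Update the "actual deployment status" appendix in matrix/arcrun/.agents/specs/llm-interface/design.md

M5.4 Observe for 90 days

Monitor usage of the old tool names
Zero usage → remove the aliases
Write a retrospective: time-to-first-workflow for AI using arcrun, before vs. after the LI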

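Backlog (not scheduled yet)

B.1 Separate SDD for KBDB MCP

The LI scope covers only KBDB's agent-* templates
The full KBDB AI interface (type=note/page/triplet/template/record, etc.) gets its own SDD, kbdb-llm-interface
It interacts most closely with the mira KM system

B.2 Multi-agent isolation

When several AIs share the same ak_xxx, separate their telemetry by agent_user_agent
Advanced: a sub-namespace per AI (mira / cursor / custom agents)

B.3 AGENTS.md i18n

v1 Chinese only (leo + in-house use)
v2 English version (for open-source users)

B.4 Automatic skill extraction

Patterns produced by weekly_review are automatically packaged as skill drafts
leo reviews and approves → publish

B.5 SDK parity (python-sdk / js-sdk)

The SDKs provide the same 25 methods as MCP
Keeps arcrun AI-friendly for people who do not want to use MCP
Falls within the scope of the sdk-and-website SDD

B.6 LI dashboard on the GUI side

Add /li-dashboard to u6u-gui to show the user's own telemetry / feedback
Does not block the LI launch (leo can read the raw KBDB blocks for now)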

Dependencies

M1 (data collection)
   ↓
M2 (MCP gap-fill)
   ↓
M3 (skill + examples)        ← can run in parallel with the later part of M2
   ↓
M4 (closed loop) ←─── needs 1-2 weeks of accumulated M1 data
   ↓
M5 (rename)

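Effort estimate

Milestone	Effort	Blocked by
M1	5 working days	None
M2	5 working days	M1 done (telemetry must be in place to verify M2 changes)
M3	5 working days	M2 done (write skills only once the tool interface has settled)
M4	3 working days	1 week of accumulated M1 data
M5	5 working days	M2-M4 done
Total	23 working days (~5 weeks)

The actual pace depends on leo's schedule; changes can be made while the system is in use, and it need not be done in one go. M1 is a hard prerequisite: without data, there is no way to know whether a change was right.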
