docs(arcrun): SDD llm-interface: make the AI operator experience first-class

Motivation: three days of mira dogfooding surfaced 14 pain points, 7 of which come down purely to a missing LI (LLM Interface). arcrun's design so far has focused on humans (u6u-gui / docs); AI usability of arcrun was never treated as a first-class concern.

SDD three-file set (matrix/arcrun/.agents/specs/llm-interface/):

requirements.md
- personas (Claude Code as the primary user / users' private agents / SDK users)
- scope covers 5 systems (cypher-executor / registry / u6u-mcp / u6u-gui / kbdb)
- 10 FRs: onboarding / CRUD parity / dry-run / structured trace / programmatic errors / feedback tool / implicit telemetry / skill blocks / examples / weekly closed loop
- 5 NFRs: compatibility / multi-transport / stable error contract / exportable feedback / quantified coverage

design.md
- 5-layer LI model: AGENTS.md / arcrun-mcp / Skills / Examples / Telemetry
- full list of 25 MCP tools in 5 categories
- error_code enum v1
- coverage matrix (GUI actions vs MCP / 31 cypher-executor routes vs LI)
- complete AGENTS.md template
- u6u-mcp → arcrun-mcp migration plan (90-day deprecation)
- weekly_review workflow YAML template

tasks.md
- 5 milestones (M1 collect data / M2 gap-fill / M3 skill+examples / M4 closed loop / M5 rename)
- estimate: 23 working days (~5 weeks)
- M1 is a hard prerequisite (without data there is no way to tell whether a change helped)

Audit baseline (compiled with 4 parallel Explore agents):
- cypher-executor: 31 HTTP routes, 9 of them AI-essential
- u6u-mcp: 15 tools, missing update/delete/history/validate/feedback
- u6u-gui: 8 human actions that can have LI parity / 3 visual ones that do not need it
- kbdb: 50 routes in 13 groups; LI goes through abstracted tools rather than exposing them directly

Also updates the SDD index in .claude/rules/04-current-progress.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 14:58:21 +08:00
parent 521624261d
commit c2a2f82ade
4 changed files with 1103 additions and 0 deletions
@@ -0,0 +1,244 @@
+# Tasks: LI (LLM Interface) for arcrun
+
+> SDD: design.md + requirements.md (same directory)
+> Progress markers: `[ ]` pending / `[🔄]` doing / `[x]` done / `[⏸]` blocked
+
+---
+
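+## Milestone 1: measurable (collect data first)
+
+Goal: within 1 week, connect the data pipeline that lets the platform itself record how well AI agents are using it.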
+
+### M1.1 AGENTS.md v1
+- [ ] Write `arcrun/AGENTS.md` (following the design.md §5 template)
+- [ ] CI hook: when the repo's `AGENTS.md` changes, automatically sync it to the KBDB block
+- [ ] `arcrun_get_onboarding` MCP tool (reads the KBDB block)
+
+### M1.2 Implicit telemetry collection
+- [ ] Create KBDB template `agent-telemetry` (slots: event_type, workflow_name, error_code, duration_ms, api_key_hash, agent_user_agent)
+- [ ] cypher-executor: write telemetry at the end of `webhook-handlers.executeWebhookGraph` (record both success and failure)
+- [ ] cypher-executor: `routes/webhooks-named.ts` push emits a deploy event
+- [ ] cypher-executor: `routes/cypher.ts` validate failure → validation_error event
+- [ ] Helper: SHA-256 of the api_key, truncated to 16 characters
+- [ ] Privacy check: never log workflow content, only the name
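
The telemetry items above could be sketched roughly as follows. This is a minimal illustration, not the cypher-executor implementation: `WorkflowRun`, `hashApiKey`, `toTelemetryEvent`, and the `run_success` / `run_failure` event names are hypothetical, and the sketch assumes the 16-character truncation applies to the hex digest. It uses Node's `node:crypto`; a Worker would more likely use Web Crypto.

```typescript
import { createHash } from "node:crypto";

// Slot names mirror the `agent-telemetry` template in the task list.
// The "run_success" / "run_failure" values are assumptions for illustration.
interface TelemetryEvent {
  event_type: "run_success" | "run_failure" | "deploy" | "validation_error";
  workflow_name: string;
  error_code: string | null;
  duration_ms: number;
  api_key_hash: string;
  agent_user_agent: string;
}

// Hypothetical description of one finished run.
interface WorkflowRun {
  workflowName: string;
  workflowYaml: string; // available on the run, deliberately never copied into telemetry
  apiKey: string;
  userAgent: string;
  startedAtMs: number;
  finishedAtMs: number;
  errorCode: string | null;
}

// SHA-256 the API key and keep the first 16 hex characters (assumed encoding).
function hashApiKey(apiKey: string): string {
  return createHash("sha256").update(apiKey, "utf8").digest("hex").slice(0, 16);
}

// Build the event for both outcomes; only the workflow name leaves the run.
function toTelemetryEvent(run: WorkflowRun): TelemetryEvent {
  return {
    event_type: run.errorCode === null ? "run_success" : "run_failure",
    workflow_name: run.workflowName,
    error_code: run.errorCode,
    duration_ms: run.finishedAtMs - run.startedAtMs,
    api_key_hash: hashApiKey(run.apiKey),
    agent_user_agent: run.userAgent,
  };
}
```

Building the event from an explicit field list, rather than spreading the run object, is one way to make the privacy check structural: workflow content cannot leak unless someone adds a field for it.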
+
+### M1.3 Explicit feedback tool
+- [ ] Create KBDB template `agent-feedback` (slots: issue_type, workflow_name, retry_count, blocked, suggested_fix, agent_user_agent)
+- [ ] Add tool `arcrun_report_feedback` to u6u-mcp
+- [ ] Lock the issue_type enum with a Zod schema
+- [ ] On write, take `user_id` from the partner-auth middleware
+- [ ] On write, auto-fill tags (`agent-feedback`, `issue:{type}`)
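
A rough sketch of the feedback write path described above. The task list calls for a Zod schema; to stay dependency-free this sketch hand-rolls the same check in plain TypeScript. The `ISSUE_TYPES` values and the `buildFeedback` helper are hypothetical (the real enum is not listed in this file), and `userId` stands in for whatever the partner-auth middleware provides.

```typescript
// Hypothetical enum values; the actual issue_type list lives in the SDD, not here.
const ISSUE_TYPES = ["missing_tool", "unclear_error", "docs_gap", "other"] as const;
type IssueType = (typeof ISSUE_TYPES)[number];

// Slots follow the `agent-feedback` template, plus user_id and tags added on write.
interface FeedbackRecord {
  issue_type: IssueType;
  workflow_name: string;
  retry_count: number;
  blocked: boolean;
  suggested_fix: string;
  agent_user_agent: string;
  user_id: string;
  tags: string[];
}

function isIssueType(value: unknown): value is IssueType {
  return typeof value === "string" && (ISSUE_TYPES as readonly string[]).indexOf(value) !== -1;
}

// Validate the raw tool input, then attach user_id and the automatic tags.
function buildFeedback(raw: Record<string, unknown>, userId: string): FeedbackRecord {
  const issueType = raw["issue_type"];
  if (!isIssueType(issueType)) {
    throw new Error(`invalid issue_type: ${String(issueType)}`);
  }
  const retryCount = raw["retry_count"];
  return {
    issue_type: issueType,
    workflow_name: String(raw["workflow_name"] ?? ""),
    retry_count: typeof retryCount === "number" ? retryCount : 0,
    blocked: raw["blocked"] === true,
    suggested_fix: String(raw["suggested_fix"] ?? ""),
    agent_user_agent: String(raw["agent_user_agent"] ?? "unknown"),
    user_id: userId, // taken from auth context, never from the tool input
    tags: ["agent-feedback", `issue:${issueType}`],
  };
}
```

Taking `user_id` as a separate argument keeps an agent from reporting feedback on behalf of another user, even if it puts a `user_id` key in the tool input.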
+
+### M1.4 Acceptance
+- [ ] Use Claude Code for mira development for 1-2 days so telemetry + feedback accumulate naturally
+- [ ] Use `curl kbdb-get.arcrun.dev type=agent-telemetry` to confirm data is present
+- [ ] Use `curl kbdb-get.arcrun.dev type=agent-feedback` to confirm the enum is enforced
+
+---
+
+## Milestone 2: gap-fill (add the missing MCP tools)
+
+Goal: anything a human can do in the GUI, an AI can do through MCP.
+
+### M2.1 New cypher-executor routes
+
+- [ ] `GET /executions/:id`: returns a structured trace (read from EXEC_CONTEXT KV)
+  - Existing data: the trace array in graph-executor.ts; KV persistence still needs confirming
+- [ ] `GET /workflows/:name/executions?limit=10`: IDs + summaries of the most recent N executions
+  - Needs ANALYTICS_KV or a new index
+- [ ] `GET /executions/paused`: lists currently paused executions
+  - Scans the `paused:*` prefix in EXEC_CONTEXT KV
+- [ ] `POST /preview`: dry-run, no KV writes
+  - Reuses GraphExecutor, with env.EXEC_CONTEXT swapped for an in-memory mock
+- [ ] `POST /webhooks/named/:name/diff`: diff between old and new YAML
+- [ ] `GET /my-telemetry?limit=N`: lets users view their own telemetry
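
The `paused:` prefix scan and the in-memory mock for `/preview` could look roughly like this. Everything here is a hypothetical sketch: `ExecContextStore` is a simplified, synchronous stand-in (Workers KV itself is asynchronous and its `list` call returns a richer result), and `pauseExecution` / `preview` are illustrative names, not GraphExecutor APIs.

```typescript
// Hypothetical, synchronous stand-in for the EXEC_CONTEXT KV binding.
interface ExecContextStore {
  get(key: string): string | undefined;
  put(key: string, value: string): void;
  listKeys(prefix: string): string[];
}

// In-memory implementation, usable as the dry-run mock.
class InMemoryExecContext implements ExecContextStore {
  private readonly data = new Map<string, string>();

  get(key: string): string | undefined {
    return this.data.get(key);
  }

  put(key: string, value: string): void {
    this.data.set(key, value);
  }

  listKeys(prefix: string): string[] {
    const out: string[] = [];
    this.data.forEach((_value, key) => {
      if (key.indexOf(prefix) === 0) out.push(key);
    });
    return out.sort();
  }
}

const PAUSED_PREFIX = "paused:";

// Illustrative executor step: persist a paused marker under the prefix.
function pauseExecution(store: ExecContextStore, executionId: string, state: object): void {
  store.put(PAUSED_PREFIX + executionId, JSON.stringify(state));
}

// What `GET /executions/paused` would return: IDs found by the prefix scan.
function listPausedExecutionIds(store: ExecContextStore): string[] {
  return store.listKeys(PAUSED_PREFIX).map((key) => key.slice(PAUSED_PREFIX.length));
}

// Dry-run: execute against a throwaway store so the real one is never written.
function preview(run: (store: ExecContextStore) => void): ExecContextStore {
  const scratch = new InMemoryExecContext();
  run(scratch);
  return scratch;
}
```

Because the executor only sees the store interface, the same code path runs in both modes and the dry-run guarantee comes from which store is injected rather than from `if (dryRun)` branches.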
+
+### M2.2 MCP tools (wrapping existing + new endpoints)
+
+- [ ] `arcrun_validate_yaml`: wraps `/validate`
+- [ ] `arcrun_get_execution_trace`
+- [ ] `arcrun_list_recent_executions`
+- [ ] `arcrun_list_paused_executions`
+- [ ] `arcrun_resume_execution`: wraps `/workflows/resume`
+- [ ] `arcrun_list_workflows`: already exists, needs confirming
+- [ ] `arcrun_get_workflow`
+- [ ] `arcrun_delete_workflow`
+- [ ] `arcrun_preview_workflow`
+- [ ] `arcrun_diff_workflow`
+- [ ] `arcrun_list_recipes` / `arcrun_create_recipe`
+- [ ] `arcrun_list_auth_recipes` / `arcrun_create_auth_recipe`
+- [ ] `arcrun_my_telemetry`
+
+### M2.3 Unified error contract
+
+- [ ] Define the `error_code` enum v1 (design.md §3.1.4)
+- [ ] u6u-mcp: wrap errors uniformly across all tools (helper function)
+- [ ] cypher-executor: uniform error response on all routes (including error_code + next_actions)
+- [ ] Write tests: at least one case per error_code
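
One possible shape for the shared error helper, as a sketch only. The three `ERROR_CODES` values and the `next_actions` mapping are placeholders (the real enum is defined in design.md §3.1.4, which is not part of this file); `fail`, `errorBody`, and `wrapTool` are hypothetical names.

```typescript
// Placeholder codes for illustration; the real list is in design.md §3.1.4.
const ERROR_CODES = ["VALIDATION_FAILED", "WORKFLOW_NOT_FOUND", "INTERNAL_ERROR"] as const;
type ErrorCode = (typeof ERROR_CODES)[number];

interface ErrorBody {
  ok: false;
  error_code: ErrorCode;
  message: string;
  next_actions: string[];
}

// Assumed mapping from error code to the MCP tools an agent should try next.
const NEXT_ACTIONS: Record<ErrorCode, string[]> = {
  VALIDATION_FAILED: ["arcrun_validate_yaml"],
  WORKFLOW_NOT_FOUND: ["arcrun_list_workflows"],
  INTERNAL_ERROR: ["arcrun_report_feedback"],
};

function errorBody(code: ErrorCode, message: string): ErrorBody {
  return { ok: false, error_code: code, message, next_actions: NEXT_ACTIONS[code].slice() };
}

// Throw an Error tagged with a known code.
function fail(code: ErrorCode, message: string): never {
  const err = new Error(message) as Error & { code: ErrorCode };
  err.code = code;
  throw err;
}

function codeOf(err: unknown): ErrorCode | undefined {
  if (!(err instanceof Error)) return undefined;
  const code = (err as { code?: unknown }).code;
  return typeof code === "string" && (ERROR_CODES as readonly string[]).indexOf(code) !== -1
    ? (code as ErrorCode)
    : undefined;
}

// Wrap a tool handler so every failure leaves as the same structured body.
function wrapTool<A, R>(handler: (args: A) => R): (args: A) => R | ErrorBody {
  return (args: A) => {
    try {
      return handler(args);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return errorBody(codeOf(err) ?? "INTERNAL_ERROR", message);
    }
  };
}
```

Typing `NEXT_ACTIONS` as `Record<ErrorCode, string[]>` makes the compiler reject a new error code that has no suggested follow-up, which complements the "one test case per error_code" item.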
+
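+### M2.4 Acceptance
+
+- [ ] Simulate a zero-knowledge AI (fresh conversation) deploying a hello workflow by following AGENTS.md
+- [ ] Measure: total MCP calls from `list_components` to a successful `run_workflow` < 5
+- [ ] Compare against the human GUI path for click-for-click parity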
+
+---
+
+## Milestone 3: skill blocks + examples
+
+Goal: an AI writing its first workflow does not have to guess; it has templates and playbooks.
+
+### M3.1 Seed skill blocks (5)
+
+- [ ] `skill-build_watcher_workflow`: cron + filter + trigger pattern
+- [ ] `skill-debug_paused_workflow`: the claude_api callback flow + how to trace it
+- [ ] `skill-migrate_http_to_trigger_workflow`: moving from self-fetch to trigger_workflow
+- [ ] `skill-rag_with_arcrun`: KBDB search + claude_api composition
+- [ ] `skill-add_new_wasm_component`: write in TinyGo + push + register on the allowlist
+
+### M3.2 MCP tools
+
+- [ ] `arcrun_list_skills(tag?)`
+- [ ] `arcrun_get_skill(id)`
+- [ ] `arcrun_publish_skill`: lets an AI save back what it has learned
+
+### M3.3 Seed examples (10)
+
+- [ ] `webhook-to-slack`
+- [ ] `cron-watcher`
+- [ ] `llm-classify`
+- [ ] `rag-search-answer`
+- [ ] `email-summary`
+- [ ] `pdf-to-blocks`
+- [ ] `github-issue-bot`
+- [ ] `daily-digest`
+- [ ] `parallel-fanout`
+- [ ] `error-retry`
+
+Each example contains `workflow.yaml` + `description.md` + `tags.json` and lives under `arcrun/registry/examples/{slug}/`.
+
+### M3.4 Examples index + search
+
+- [ ] CI builds the examples → KBDB block type=`workflow-example` (including YAML + tags + description)
+- [ ] `arcrun_search_examples(use_case)` MCP tool (goes through KBDB `/search`)
+
+---
+
+## Milestone 4: closed loop
+
+Goal: once data is flowing, the platform digests it on its own and produces a roadmap.
+
+### M4.1 Weekly review workflow
+
+- [ ] Write `polaris/mira/arcrun/agent_feedback_weekly_review.yaml` (based on the design.md §4.5 template)
+- [ ] cron `0 9 * * 1` (Mondays at 09:00 UTC)
+- [ ] `acr push`
+- [ ] Trigger it manually once as a test
+
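+### M4.2 LLM aggregation prompt
+
+- [ ] Write the prompt: feed feedback + telemetry to Claude → output the Top 5 pain points + recommendations
+- [ ] Fixed result format: markdown sections (pain point / evidence / recommendation / severity)
+- [ ] Store as KBDB block type=`arcrun-roadmap`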
+
+### M4.3 Notification
+
+- [ ] notify_telegram node: push to leo
+- [ ] Also write it into the mira river (河道) feed, so leo can read it in a familiar interface
+
+### M4.4 Acceptance
+
+- [ ] Run for a full week → receive the first roadmap
+- [ ] leo reviews it and picks 1-2 issues to fix
+- [ ] Run a second week → confirm those issues have dropped off the top list
+
+---
+
+## Milestone 5: rename + cleanup
+
+Goal: finish branding the LI and retire u6u-mcp.
+
+### M5.1 Deployment
+
+- [ ] Change the u6u-mcp Worker's wrangler name → `arcrun-mcp`
+- [ ] Add route `mcp.arcrun.dev/*`
+- [ ] Keep the old route `mcp.finally.click/*` for 90 days
+
+### M5.2 Tool rename
+
+- [ ] Add an `arcrun_*` alias for every `u6u_*` tool
+- [ ] Calls to an old tool name → response includes a `deprecation_warning` field
+- [ ] AGENTS.md uses the new names throughout
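
The alias-plus-warning scheme above might be wired up along these lines. This is a sketch under assumptions: `ToolHandler`, `registerWithAlias`, and the `u6u_list_workflows` tool name are hypothetical, and the real MCP server's registration API may look quite different.

```typescript
// Hypothetical handler shape: plain JSON-like args in, JSON-like result out.
type ToolResult = Record<string, unknown>;
type ToolHandler = (args: Record<string, unknown>) => ToolResult;

// Register one handler under both names. The new `arcrun_*` name returns the
// result untouched; the old `u6u_*` name adds a `deprecation_warning` field.
function registerWithAlias(
  registry: Map<string, ToolHandler>,
  oldName: string,
  handler: ToolHandler,
): string {
  if (oldName.indexOf("u6u_") !== 0) {
    throw new Error(`expected a u6u_* tool name, got ${oldName}`);
  }
  const newName = "arcrun_" + oldName.slice("u6u_".length);
  registry.set(newName, handler);
  registry.set(oldName, (args) => ({
    ...handler(args),
    deprecation_warning: `${oldName} is deprecated; use ${newName} instead`,
  }));
  return newName;
}
```

Keeping a single handler behind both names means the two cannot drift apart during the 90-day window, and removing the alias later is a one-line deletion of the old registration.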
+
+### M5.3 Docs
+
+- [ ] Final version of `arcrun/AGENTS.md`
+- [ ] Mark `u6u-mcp/README.md` as deprecated and point to the new location
+- [ ] Update the "actual deployment status" appendix in `matrix/arcrun/.agents/specs/llm-interface/design.md`
+
+### M5.4 Observe for 90 days
+
+- [ ] Monitor usage of the old tool names
+- [ ] Zero usage → remove the aliases
+- [ ] Write a retrospective: time-to-first-workflow for an AI using arcrun, before vs. after the LI work
+
+---
+
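+## Backlog (not scheduled yet)
+
+### B.1 Separate SDD for KBDB MCP
+- The LI scope only covers KBDB's `agent-*` templates
+- The full KBDB AI interface (type=note/page/triplet/template/record, etc.) gets its own SDD, `kbdb-llm-interface`
+- It interacts most closely with the mira KM system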
+
+### B.2 Multi-agent isolation
+- When several AIs share the same ak_xxx, separate their telemetry by agent_user_agent
+- Advanced: a sub-namespace per AI (mira / cursor / custom agents)
+
+### B.3 AGENTS.md i18n
+- v1 Chinese only (leo + in-house use)
+- v2 English version (for open-source users)
+
+### B.4 Automatic skill extraction
+- Patterns surfaced by weekly_review are automatically packaged as skill drafts
+- leo reviews and approves → publish
+
+### B.5 SDK parity (python-sdk / js-sdk)
+- The SDKs expose the same 25 methods as MCP
+- Keeps arcrun AI-friendly for people who do not want to use MCP
+- Falls under the sdk-and-website SDD scope
+
+### B.6 LI dashboard on the GUI side
+- u6u-gui adds `/li-dashboard` showing the user's own telemetry / feedback
+- Does not block the LI launch (leo can read the raw KBDB blocks for now)
+
+---
+
+## Dependencies
+
+```
+M1 (data collection)
+   ↓
+M2 (MCP gap-fill)
+   ↓
+M3 (skill + examples)        ← can run in parallel with the later part of M2
+   ↓
+M4 (closed loop) ←─── needs 1-2 weeks of accumulated M1 data
+   ↓
+M5 (rename)
+```
+
+---
+
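+## Effort estimate
+
+| Milestone | Effort | Blocked by |
+|---|---|---|
+| M1 | 5 working days | None |
+| M2 | 5 working days | M1 done (telemetry must be in place first to validate M2 changes) |
+| M3 | 5 working days | M2 done (write skills only once the tool interface has settled) |
+| M4 | 3 working days | 1 week of accumulated M1 data |
+| M5 | 5 working days | M2-M4 done |
+| **Total** | **23 working days (~5 weeks)** | |
+
+The actual pace depends on leo's schedule; the work can be adjusted while in use and does not have to be done in one go. **M1 is a hard prerequisite**: without data collection, there is no way to know whether a change was the right one.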