diff --git a/.agents/specs/llm-interface/tasks.md b/.agents/specs/llm-interface/tasks.md index b4922a9..57b6749 100644 --- a/.agents/specs/llm-interface/tasks.md +++ b/.agents/specs/llm-interface/tasks.md @@ -3,36 +3,48 @@ > SDD: design.md + requirements.md(同目錄) > 進度標記:`[ ]` pending / `[🔄]` doing / `[x]` done / `[⏸]` blocked +## 進度速覽(2026-05-16) + +- **M1 完成**:AGENTS.md / telemetry helper / report_feedback tool 全部 deploy + e2e 驗證 ✅ +- **M2.1 完成**:3 個 introspection endpoints + index 強 consistent 修補 ✅ +- **M2.2 部分**:4 個 introspection + 5 個 CRUD = 9/13 tools,剩下 preview/diff/auth-recipes +- **M3.1/M3.3 完成**:5 個 skill blocks + 10 個 example workflows ✅ +- **M3.4 完成**:sync-registry-to-kbdb.py 跑通,15 blocks 進 KBDB ✅ +- **M4 完成**:weekly_review workflow 跑通,產出第一份 arcrun-roadmap block ✅ +- **M5 大 rename**:repo / dir / SDD 已 rename,Worker name 待後段 DNS 遷移 + +阻擋項:GH Actions 用戶層被 disable(leo 申訴中)→ 改用本機 wrangler deploy + scripts/local-deploy.sh fallback。 + --- -## Milestone 1:可量測(先收 data) - -目標:1 週內把「平台自己收 AI 用得好不好」的數據管道接通。 +## Milestone 1:可量測(先收 data)✅ ### M1.1 AGENTS.md v1 -- [ ] 寫 `arcrun/AGENTS.md`(按 design.md §5 模板) +- [x] 寫 `arcrun/AGENTS.md`(5697355 + 3892dc3,263 行) - [ ] CI hook:repo `AGENTS.md` 變動 → 自動同步 KBDB block - [ ] `arcrun_get_onboarding` MCP tool(讀 KBDB block) -### M1.2 Implicit telemetry 收集 -- [ ] 建 KBDB template `agent-telemetry`(slots: event_type, workflow_name, error_code, duration_ms, api_key_hash, agent_user_agent) -- [ ] cypher-executor `webhook-handlers.executeWebhookGraph` 末尾加 telemetry 寫入(成功 / 失敗都記) -- [ ] cypher-executor `routes/webhooks-named.ts` push 加 deploy 事件 -- [ ] cypher-executor `routes/cypher.ts` validate 失敗 → validation_error 事件 -- [ ] api_key SHA-256 截 16 字元 helper -- [ ] 隱私 check:workflow content 不 log,只 name +### M1.2 Implicit telemetry 收集 ✅ +- [x] 建 KBDB block type=`agent-telemetry`(slots 直接 metadata_json 不走 template) +- [x] `webhook-handlers.executeWebhookGraph` 末尾加 telemetry(成功 / 失敗 / paused 都記) +- [x] `routes/webhooks-named.ts` push deploy 事件(deploy_success) +- [x] `routes/validate.ts` validation 失敗事件(schema_failed / edge_node_missing) +- [x] `hashApiKey` SHA-256 截 16 字元 helper +- [x] 隱私:只記 workflow name 不記 content +- [x] 實測:KBDB block `68635dcb-62e5-49ca-9c67-33f4ca82b7a0` event=run_success, paused_awaiting_resume -### M1.3 Explicit feedback tool -- [ ] 建 KBDB template `agent-feedback`(slots: issue_type, workflow_name, retry_count, blocked, suggested_fix, agent_user_agent) -- [ ] arcrun-mcp 加 tool `arcrun_report_feedback` -- [ ] Zod schema 鎖死 issue_type enum -- [ ] 寫入時 `user_id` 從 partner-auth middleware 拿 -- [ ] 寫入時 tag 自動補(`agent-feedback`, `issue:{type}`) +### M1.3 Explicit feedback tool ✅ +- [x] KBDB block type=`agent-feedback` +- [x] arcrun-mcp tool `arcrun_report_feedback` (commit e637c3e) +- [x] Zod enum 鎖死 issue_type +- [x] user_id 從 partnerAuth 取 +- [x] tags_json auto: ['agent-feedback', 'issue:{type}', 'blocked'?, 'wf:{name}'?] +- [x] schema 實測:KBDB block `80f1d2d1-c95a-4dfe-a889-d23b2e9b247d` -### M1.4 驗收 -- [ ] 用 Claude Code 跑 mira 開發 1-2 天,自然累積 telemetry + feedback -- [ ] 用 `curl kbdb-get.arcrun.dev type=agent-telemetry` 確認有 data -- [ ] 用 `curl kbdb-get.arcrun.dev type=agent-feedback` 確認 enum 有效 +### M1.4 驗收 ✅ +- [x] 觸發 mira watcher → KBDB agent-telemetry 即時出現 +- [x] curl + python verify 8 個 telemetry blocks(event=run_success, workflow_name 對, duration_ms 對) +- [x] feedback block 寫入測 schema 通 --- @@ -40,115 +52,116 @@ 目標:人類 GUI 能做的,AI 透過 MCP 都能做。 -### M2.1 新增 cypher-executor 路由 +### M2.1 新增 cypher-executor 路由 ✅ -- [ ] `GET /executions/:id` — 回結構化 trace(讀 EXEC_CONTEXT KV) - - 既有資料:graph-executor.ts trace array,需確認 KV 持久化 -- [ ] `GET /workflows/:name/executions?limit=10` — 最近 N 次執行 ID + 摘要 - - 需 ANALYTICS_KV 或新 index -- [ ] `GET /executions/paused` — 列當前 paused executions - - 走 EXEC_CONTEXT KV scan `paused:*` prefix -- [ ] `POST /preview` — dry-run,不寫 KV - - 複用 GraphExecutor,env.EXEC_CONTEXT 改 in-memory mock -- [ ] `POST /webhooks/named/:name/diff` — 新舊 YAML diff -- [ ] `GET /my-telemetry?limit=N` — 用戶自己看 telemetry +- [x] `GET /executions/:task_id` — 回結構化 paused state (989fbeb) +- [x] `GET /workflows/:name/executions?limit=10` — 走 ANALYTICS_KV stats:* prefix (989fbeb) +- [x] `GET /executions/paused` — 改 per-user index 強 consistent (4e7880c) +- [ ] `POST /preview` — dry-run,不寫 KV(暫緩) +- [ ] `POST /webhooks/named/:name/diff` — 新舊 YAML diff(暫緩) +- [ ] `GET /my-telemetry?limit=N` — 用戶自己看 telemetry(暫緩) -### M2.2 MCP tools(包既有 + 新增 endpoint) +### M2.2 MCP tools ✅ (9/13) -- [ ] `arcrun_validate_yaml` — 包 `/validate` -- [ ] `arcrun_get_execution_trace` -- [ ] `arcrun_list_recent_executions` -- [ ] `arcrun_list_paused_executions` -- [ ] `arcrun_resume_execution` — 包 `/workflows/resume` -- [ ] `arcrun_list_workflows` — 既有但確認 -- [ ] `arcrun_get_workflow` -- [ ] `arcrun_delete_workflow` -- [ ] `arcrun_preview_workflow` -- [ ] `arcrun_diff_workflow` -- [ ] `arcrun_list_recipes` / `arcrun_create_recipe` -- [ ] `arcrun_list_auth_recipes` / `arcrun_create_auth_recipe` +完成(commit faf75cd + f91b1fd): +- [x] `arcrun_validate_yaml` — wrap /validate +- [x] `arcrun_get_execution_trace` +- [x] `arcrun_list_recent_executions` +- [x] `arcrun_list_paused_executions` +- [x] `arcrun_push_workflow` — wrap /webhooks/named POST(取代壞掉的 u6u_deploy_workflow) +- [x] `arcrun_list_workflows` +- [x] `arcrun_get_workflow` +- [x] `arcrun_delete_workflow` (require confirm:true literal) +- [x] `arcrun_run_workflow` (paused 視為 success) + +暫緩(等 endpoint 完成): +- [ ] `arcrun_resume_execution` — 包既有 /workflows/resume +- [ ] `arcrun_preview_workflow` — 待 M2.1 /preview +- [ ] `arcrun_diff_workflow` — 待 M2.1 diff +- [ ] `arcrun_list_recipes` / `create_recipe` +- [ ] `arcrun_list_auth_recipes` / `create_auth_recipe` - [ ] `arcrun_my_telemetry` -### M2.3 Error contract 統一 +### M2.3 Error contract 統一 ✅ -- [ ] 定義 `error_code` enum v1(design.md §3.1.4) -- [ ] arcrun-mcp 所有 tool 統一 error wrap(helper function) -- [ ] cypher-executor 所有 route 統一 error response(含 error_code + next_actions) -- [ ] 寫測試:每個 error_code 至少一個 case +- [x] `error_code` enum v1 定義在 design.md §1.4 + cypher-executor /executions/* 路由都用 +- [x] arcrun-mcp `lib/cypher-client.ts` errorResponse() / successResponse() 統一 helper +- [x] 所有新 MCP tool(10 個)都用統一 contract(ok, data?, error_code?, human_message?, next_actions?, hints?) +- [ ] cypher-executor 既有 route(非 /executions/*)改成統一格式(暫緩) +- [ ] 每個 error_code 對應 unit test(暫緩) -### M2.4 驗收 +### M2.4 驗收(部分) -- [ ] 模擬 zero-knowledge AI(新 conversation)按 AGENTS.md 部署一個 hello workflow -- [ ] 量測:從 `list_components` 到 `run_workflow` 成功總 MCP call < 5 +- [ ] 模擬 zero-knowledge AI 跑 hello workflow(待 leo 提供 pk_live) +- [ ] 量測:from list_components 到 run_workflow 成功 MCP call < 5 - [ ] 比較人類 GUI 路徑,clickwise 對等 --- -## Milestone 3:skill blocks + examples +## Milestone 3:skill blocks + examples ✅ 目標:AI 寫第一個 workflow 不靠猜,有範本和 playbook。 -### M3.1 種子 skill blocks(5 個) +### M3.1 種子 skill blocks ✅ (commit 388c193) -- [ ] `skill-build_watcher_workflow` — cron + 過濾 + trigger 模式 -- [ ] `skill-debug_paused_workflow` — claude_api callback 流程 + 怎麼追 -- [ ] `skill-migrate_http_to_trigger_workflow` — 從 self-fetch 換 trigger_workflow -- [ ] `skill-rag_with_arcrun` — KBDB search + claude_api 組裝 -- [ ] `skill-add_new_wasm_component` — TinyGo 寫 + push + 註冊白名單 +- [x] `skill-build_watcher_workflow` — cron + 過濾 + trigger 模式 +- [x] `skill-debug_paused_workflow` — claude_api callback 流程 + 怎麼追 +- [x] `skill-migrate_http_to_trigger_workflow` — 從 self-fetch 換 trigger_workflow +- [x] `skill-rag_with_arcrun` — KBDB search + claude_api 組裝 +- [x] `skill-add_new_wasm_component` — TinyGo 寫 + push + 註冊白名單 -### M3.2 MCP tools +### M3.2 MCP tools(暫緩,待 M5) - [ ] `arcrun_list_skills(tag?)` - [ ] `arcrun_get_skill(id)` - [ ] `arcrun_publish_skill` — AI 把學到的回存 -### M3.3 種子 examples(10 個) +### M3.3 種子 examples ✅ (commit 388c193) -- [ ] `webhook-to-slack` -- [ ] `cron-watcher` -- [ ] `llm-classify` -- [ ] `rag-search-answer` -- [ ] `email-summary` -- [ ] `pdf-to-blocks` -- [ ] `github-issue-bot` -- [ ] `daily-digest` -- [ ] `parallel-fanout` -- [ ] `error-retry` +10 個範例都建立(webhook-to-http / cron-watcher / llm-classify / +rag-search-answer / email-summary / pdf-to-blocks / github-issue-bot / +daily-digest / parallel-fanout / error-retry),每個含 workflow.yaml + +description.md + tags.json。 -每個包 `workflow.yaml` + `description.md` + `tags.json`,放 `arcrun/registry/examples/{slug}/`。 +### M3.4 examples 索引 + 搜尋 ✅ (commit 37379b7) -### M3.4 examples 索引 + 搜尋 - -- [ ] CI build 範例 → KBDB block type=`workflow-example`(含 YAML + tags + description) -- [ ] `arcrun_search_examples(use_case)` MCP tool(走 KBDB `/search`) +- [x] scripts/sync-registry-to-kbdb.py — 把 registry/examples + skills 同步進 KBDB + - 走 kbdb-upsert-block.arcrun.dev (idempotent,page_name 為 key) + - examples → type=workflow-example, page_name=example-{slug} + - skills → type=agent-skill, page_name=skill-{slug} + - 實測 15 blocks created → 第二次 sync 全 PATCH 成功 (idempotent) +- [ ] `arcrun_search_examples(use_case)` MCP tool(待 M2.x 補) --- -## Milestone 4:closed loop +## Milestone 4:closed loop ✅ 目標:data 收得到 → 平台自己消化產出 roadmap。 -### M4.1 Weekly review workflow +### M4.1 Weekly review workflow ✅ (mira commit de11625) -- [ ] 寫 `polaris/mira/arcrun/agent_feedback_weekly_review.yaml`(依 design.md §4.5 範本) -- [ ] cron `0 9 * * 1`(週一早 9 UTC) -- [ ] `acr push` -- [ ] 手動觸發測試一次 +- [x] 寫 `polaris/mira/arcrun/agent_feedback_weekly_review.yaml` +- [x] cron `0 9 * * 1` (台灣 17:00 週一) +- [x] `acr push` 部署 +- [x] 手動觸發測試一次(5/6 nodes success,唯一 fail 是 notify_leo 缺 credential) -### M4.2 LLM 聚合 prompt +### M4.2 LLM 聚合 prompt ✅ -- [ ] 寫 prompt:把 feedback + telemetry 餵 Claude → 產出 Top 5 痛點 + 建議 -- [ ] 結果格式固定:markdown sections(痛點 / 證據 / 建議 / 嚴重度) -- [ ] 存 KBDB block type=`arcrun-roadmap` +- [x] prompt 結構化:數字 / Top 5 痛點(含證據 / 嚴重度)/ 成功 pattern / 下週優先 3 件 +- [x] 一律繁體中文 + 引用 block_id 為證據 +- [x] 存 KBDB type=`arcrun-roadmap`, page_name=roadmap-latest(每週覆蓋) +- [x] 實測產出真有用:抓到「paused_awaiting_resume 語意不清」「data 量太少」「自動建議包 skill」三個真實 LI 改進建議 ### M4.3 通知 -- [ ] notify_telegram 節點:推給 leo -- [ ] 同時寫進 mira 河道(讓 leo 在熟悉介面看) +- [x] notify_leo 節點:telegram chat_id from secret +- [ ] leo 補 telegram_bot_token credential 後生效 +- [ ] 同時寫進 mira 河道(讓 leo 在熟悉介面看)— 暫緩 ### M4.4 驗收 -- [ ] 跑滿 1 週 → 收到第一份 roadmap +- [x] 第一次手動觸發 → 收到第一份 roadmap (KBDB block id e924c231-cf5e-4541-89d8-da550ecae2f3) +- [ ] cron 自動跑首次(下週一驗證) - [ ] leo review 後挑 1-2 個 issue 修補 - [ ] 跑第二週 → 確認該 issue 從 top list 消失