arcrun — AI workflow execution engine (clean history)
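Self-hosted open source: WASM components + recipes + cypher-executor, running on your own Cloudflare account.

This is the rebuilt clean-history starting point (a GCP SA key that was once committed by mistake has been removed; the old history is kept in richblack/arcrun and in a local backup branch).

Contains:
- acr init --self-hosted installer (creates KV/R2, pulls precompiled wasm via codeload, runs wrangler deploy, seeds recipes)
- recipe push gatekeeping (data-exfiltration reminder + connectivity check)
- 19 legitimate components as precompiled wasm (claude_api/km_writer/kbdb_upsert_block excluded: they violate DECISIONS §1)
- CLI / cypher-executor / registry / complete SDD

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>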
@@ -0,0 +1,61 @@
# Tasks — Resumable Workflow

> Corresponding SDD: [design.md](design.md)
> Last updated: 2026-05-07

**Status legend**: `[ ]` to do / `[🔄]` in progress / `[x]` done

---

## Phase 1: Callback support on the Mira daemon

- [x] 1.1 Update `/opt/mira/mira-daemon.js` (Hetzner mira container) so `/execute` accepts `params.callback_url`
- [x] 1.2 `fireCallback` function: when a task is done or failed, POST to `callback_url` with body `{task_id, success, data?, error?}`
- [x] 1.3 Callback retry: 4 attempts (immediate, then 1s/5s/30s backoff); log if all of them fail
- [x] 1.4 Patch script written at `/tmp/patch-mira-daemon.py` and copied into the container with `docker cp` (note: rebuilding the image discards the patch; re-apply it or commit it properly into the Dockerfile/git repo)
- [x] 1.5 Real end-to-end verification: the daemon log shows `[Mira callback] task=task_2_... POST https://cypher.arcrun.dev/workflows/resume OK 200` (2026-05-07 07:24:04, plus a short task_3 test)

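The retry policy in tasks 1.2 and 1.3 can be sketched roughly as follows. This is a minimal illustration, not the code in `mira-daemon.js`: the `PostFn`/`SleepFn` injection, the `RETRY_DELAYS_MS` name, and the warning and give-up log lines are assumptions; only the body shape, the 4-attempt schedule, and the success log format come from the tasks above.

```typescript
// Hypothetical sketch of fireCallback with the 4-attempt schedule from task 1.3.
type CallbackBody = {
  task_id: string;
  success: boolean;
  data?: unknown;
  error?: string;
};

// Immediate attempt, then waits of 1s, 5s, and 30s before each retry.
const RETRY_DELAYS_MS = [0, 1000, 5000, 30000];

// Transport and sleep are injected (an assumption) so the policy can be tested offline.
type PostFn = (url: string, body: CallbackBody) => Promise<number>; // resolves to HTTP status
type SleepFn = (ms: number) => Promise<void>;

const realSleep: SleepFn = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function fireCallback(
  url: string,
  body: CallbackBody,
  post: PostFn,
  sleep: SleepFn = realSleep,
): Promise<boolean> {
  for (let attempt = 0; attempt < RETRY_DELAYS_MS.length; attempt++) {
    const delay = RETRY_DELAYS_MS[attempt];
    if (delay > 0) await sleep(delay);
    try {
      const status = await post(url, body);
      if (status >= 200 && status < 300) {
        console.log(`[Mira callback] task=${body.task_id} POST ${url} OK ${status}`);
        return true;
      }
      console.warn(`[Mira callback] attempt ${attempt + 1} got HTTP ${status}`);
    } catch (err) {
      console.warn(`[Mira callback] attempt ${attempt + 1} failed: ${String(err)}`);
    }
  }
  // All attempts failed: log and give up, as task 1.3 describes.
  console.error(`[Mira callback] task=${body.task_id} gave up after ${RETRY_DELAYS_MS.length} attempts`);
  return false;
}
```

In a real daemon `post` would wrap an HTTP client call that sends the JSON body and returns the response status. Treating any non-2xx status as retryable is a simplification made here; the actual daemon may distinguish permanent failures.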
## Phase 2: cypher-executor resumable runtime

- [x] 2.1 Write `paused-runs.ts` (81 lines): `persistPausedRun` / `loadPausedRun` / `consumePausedRun` plus the `isResumablePending` detector, with a 24h TTL
- [x] 2.2 Update the Component case in `graph-executor.ts`: detect pending → write to KV + throw `WorkflowPaused`
- [x] 2.3 Update `cypher-handlers.ts`: catch `WorkflowPaused` → return `{success:true, paused:true, task_id, run_id, paused_node_id, trace, graph}`
- [x] 2.4 Automatic `callback_url` injection: when `componentId==='claude_api'`, set `mergedContext.callback_url` from `PUBLIC_BASE_URL`, defaulting to cypher.arcrun.dev/workflows/resume

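To make the pause bookkeeping from tasks 2.1 and 2.2 concrete, here is a minimal sketch. It is not the contents of `paused-runs.ts`: an in-memory `Map` stands in for KV, and the `PausedRun` fields, keying by `task_id`, the `PausedRunStore` class, and the pending-output shape checked by `isResumablePending` are all assumptions. Only the function names, `WorkflowPaused`, and the 24h TTL come from the tasks above.

```typescript
// Hypothetical sketch: an in-memory Map stands in for the KV namespace.
const PAUSED_RUN_TTL_MS = 24 * 60 * 60 * 1000; // 24h TTL from task 2.1

interface PausedRun {
  run_id: string;
  task_id: string;
  paused_node_id: string;
  ctx: Record<string, unknown>;
}

interface StoredRun {
  value: PausedRun;
  expiresAt: number;
}

class PausedRunStore {
  private entries = new Map<string, StoredRun>();

  // The clock is injectable so expiry can be tested without waiting.
  constructor(private now: () => number = () => Date.now()) {}

  persistPausedRun(run: PausedRun): void {
    this.entries.set(run.task_id, {
      value: run,
      expiresAt: this.now() + PAUSED_RUN_TTL_MS,
    });
  }

  loadPausedRun(taskId: string): PausedRun | null {
    const hit = this.entries.get(taskId);
    if (!hit) return null;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(taskId); // expired: behave as if it never existed
      return null;
    }
    return hit.value;
  }

  // Load and delete in one step so a duplicate callback finds nothing.
  consumePausedRun(taskId: string): PausedRun | null {
    const run = this.loadPausedRun(taskId);
    if (run) this.entries.delete(taskId);
    return run;
  }
}

// Thrown by the executor after the paused state has been persisted.
class WorkflowPaused extends Error {
  constructor(public readonly run: PausedRun) {
    super(`workflow paused at node ${run.paused_node_id}`);
    this.name = "WorkflowPaused";
  }
}

// Assumed shape of a "still running" component output; the real detector may differ.
function isResumablePending(
  output: unknown,
): output is { status: "pending"; task_id: string } {
  if (typeof output !== "object" || output === null) return false;
  const o = output as Record<string, unknown>;
  return o["status"] === "pending" && typeof o["task_id"] === "string";
}
```

In the executor's Component case, the flow would be roughly: run the component, and if `isResumablePending(output)` holds, call `persistPausedRun(...)` and then `throw new WorkflowPaused(run)` so the handler can answer with `paused:true`.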
## Phase 3: resume endpoint

- [x] 3.1 Write `routes/resume.ts`: `POST /workflows/resume`, `consumePausedRun` → `resumeFromPaused`
- [x] 3.2 Add a `resumeFromPaused()` method to graph-executor: treat callback_result as the paused node's output, spread it into ctx, and continue from the downstream nodes
- [x] 3.3 Idempotency verified: a second callback returns `{noop:true, reason:"state 不存在或過期"}` (the reason reads "state does not exist or has expired")
- [x] 3.4 cypher-executor deployed as v0580980b
- [x] 3.5 Mount /workflows/resume in index.ts

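The consume-then-resume behaviour in tasks 3.1 to 3.3 could look something like the sketch below. It is an illustration under assumptions, not the code in `routes/resume.ts`: `handleResume`, the `PausedState` fields (including `downstream`), the result shape, and the use of a `Map` in place of KV are hypothetical. The callback body shape and the noop response come from the tasks above.

```typescript
// Hypothetical sketch of an idempotent resume handler.
type CallbackBody = {
  task_id: string;
  success: boolean;
  data?: unknown;
  error?: string;
};

interface PausedState {
  run_id: string;
  paused_node_id: string;
  ctx: Record<string, unknown>;
  downstream: string[]; // assumed: ids of the nodes that follow the paused node
}

type ResumeResult =
  | { noop: true; reason: string }
  | { resumed: true; run_id: string; ctx: Record<string, unknown>; next_nodes: string[] };

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Treat the callback result as the paused node's output and spread it into ctx.
function resumeFromPaused(state: PausedState, callback: CallbackBody): ResumeResult {
  const output = callback.success ? callback.data : { error: callback.error };
  const ctx: Record<string, unknown> = {
    ...state.ctx,
    ...(isRecord(output) ? output : {}),
    [state.paused_node_id]: output,
  };
  // A real executor would now run the downstream nodes; this sketch only reports them.
  return { resumed: true, run_id: state.run_id, ctx, next_nodes: state.downstream };
}

function handleResume(paused: Map<string, PausedState>, body: CallbackBody): ResumeResult {
  const state = paused.get(body.task_id);
  if (!state) {
    // Already consumed or expired: report a no-op so daemon retries are harmless.
    return { noop: true, reason: "state 不存在或過期" };
  }
  paused.delete(body.task_id); // consume before resuming, so a duplicate cannot re-run
  return resumeFromPaused(state, body);
}
```

Deleting the state before resuming is what makes a repeated callback safe: the daemon may deliver the same callback more than once under its retry policy, and only the first delivery finds state to resume.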
## Phase 4: claude_api container passes callback_url through

- [x] 4.1 Update `claude_api/main.go`: add CallbackURL to Input; change the default timeout to 120s
- [x] 4.2 Rebuild the wasm + redeploy claude-api.arcrun.dev (v f926e3dd)
- [x] 4.3 Real end-to-end verification: the daemon receives callback_url → once the task is done it POSTs to cypher-executor/workflows/resume → 200 OK

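The `callback_url` that this phase passes through originates on the executor side (task 2.4). A minimal sketch of that injection is shown here; the `withCallbackUrl` name and the `Env` type are hypothetical, and it assumes `PUBLIC_BASE_URL` holds only an origin to which `/workflows/resume` is appended, which the task list does not confirm.

```typescript
// Hypothetical sketch of the callback_url injection described in task 2.4.
const DEFAULT_RESUME_URL = "https://cypher.arcrun.dev/workflows/resume";

interface Env {
  PUBLIC_BASE_URL?: string; // assumed to be an origin such as "https://host"
}

function withCallbackUrl(
  componentId: string,
  mergedContext: Record<string, unknown>,
  env: Env,
): Record<string, unknown> {
  // Only the claude_api component gets a callback URL injected.
  if (componentId !== "claude_api") return mergedContext;
  const base = env.PUBLIC_BASE_URL?.replace(/\/+$/, ""); // drop trailing slashes
  const callback_url = base ? `${base}/workflows/resume` : DEFAULT_RESUME_URL;
  return { ...mergedContext, callback_url };
}
```

For example, `withCallbackUrl("claude_api", { prompt: "hi" }, {})` yields a context whose `callback_url` is the default resume URL, while any other component id returns the context unchanged.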
## Phase 5: End-to-end integration test

- [ ] 5.1 Use MCP `u6u_execute_workflow` to run wiki synthesis with a 5KB+ draft
- [ ] 5.2 The first response should be `{paused, task_id, run_id}`
- [ ] 5.3 Wait for the daemon callback to arrive (a /workflows/resume hit shows up in the log)
- [ ] 5.4 Confirm the wiki page is actually written to KBDB (even if the original MCP call has already disconnected)
- [ ] 5.5 The trace contains the complete node record (paused → resumed)

---

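## Risk tracking

- Risk 1: when the daemon callback arrives, cypher.arcrun.dev may not be awake yet (CF Worker cold start) → the first retry catches it (covered by the daemon retry policy)
- Risk 2: v1 has no final_callback to the original client → users have to check status themselves
  - Accepted: the mira river (河道) UI can periodically refetch the wiki page, or use the existing KBDB trigger mechanism
  - v2 adds final_callback to handle this uniformly
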
## Recorded for v2

- Nested pending (multiple paused nodes in one run)
- Pending inside foreach (item-level resume)
- final_callback to the original client (pass final_callback_url at trigger time)
- poll_task component (for external APIs that have no webhook)