arcrun — AI workflow execution engine (clean history)

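Self-hosted open source: WASM components + recipes + cypher-executor, running on your own Cloudflare account. This is the clean starting point of a rebuilt history (it removes a GCP SA key that was once committed by mistake; the old history is kept in richblack/arcrun and in a local backup branch). Includes:
- acr init --self-hosted installer (creates KV/R2, pulls precompiled wasm via codeload, runs wrangler deploy, seeds recipes)
- recipe push gatekeeping (data-exfiltration warning + connectivity check)
- precompiled wasm for 19 approved components (claude_api/km_writer/kbdb_upsert_block excluded: they violate DECISIONS §1)
- CLI / cypher-executor / registry / full SDD

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>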
2026-06-03 15:52:38 +08:00
commit 922a57fe34
485 changed files with 89356 additions and 0 deletions
@@ -0,0 +1,240 @@
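+# SDD: arcrun Recipe System (Container + Recipe Pattern)
+
+> Created 2026-05-07. A platform gap found by hitting a wall while dogfooding: writing the wiki synthesis workflow.
+> Core principle: **one WASM component = a container; its content (the recipe) lives in the database**.
+> n8n writes a separate node for every API; arcrun uses "container + recipe" to keep the component count down.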
+
+---
+
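+## 1. Problem
+
+### 1.1 Where we hit the wall
+
+While writing the mira wiki synthesis workflow (7-B):
+- Flow: `kbdb_get(stale)` → foreach → `kbdb_get(drafts)` → `claude_api(synthesis prompt)` → `kbdb_ingest`
+- The third step has to assemble a prompt: `schema content + skill template + drafts array + existing_entities`
+- The `{{var}}` template built into cypher binding is too weak (top-level variables only; no nested paths, no array → string conversion)
+- There is no `string_template` component and no `array_to_markdown` component
+- Writing a dedicated `wiki_prompt_builder` component means going down n8n's old road: every AI workflow would need its own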
+
+### 1.2 Root cause
+
+**The arcrun recipe system only covers two layers, HTTP and auth**:
+
+| Recipe type | Stored in | Container | Status |
+|---|---|---|---|
+| auth_recipe | RECIPES KV (`auth_recipe:{service}`) | auth_static_key / auth_oauth2 / ... | ✅ exists |
+| api_recipe | RECIPES KV (`rec_{hash}`) | http_request | ✅ exists (hard-coded in cypher-executor, pending cleanup in Phase 1-3) |
+| **prompt_recipe** | ❌ does not exist | claude_api (container) | **missing** |
+
+The `claude_api` component currently takes `prompt: string` (an already assembled string); it has no "recipe mode" that would let an AI call it by naming an assembly recipe.
+
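+### 1.3 Impact
+
+- **Fatal**: the first wiki synthesis workflow cannot be written (7-B is blocked)
+- **Undermines the pitch**: arcrun's external value proposition is "container + recipe, the AI writes no code", but the prompt layer cannot deliver that
+- **Every future AI workflow will hit the same problem**: rss-tech-news commentary, the river-feed AI copilot, ai-comment, article summaries, and so on all need to assemble prompts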
+
+---
+
+## 2. Design
+
+### 2.1 Core: prompt_recipe alongside auth_recipe / api_recipe
+
+**Storage**: `RECIPES` KV, key format `prompt_recipe:{name}`
+
+**Structure**:
+```yaml
+id: prompt_recipe:wiki_synthesis
+version: v1
+description: "Mira wiki synthesis (extract triplets + write wiki paragraphs)"
+model: sonnet                          # haiku / sonnet / opus (claude_api keeps its existing routing)
+
+# Fragments fetched from KBDB / other sources (fetched and inserted when the prompt is assembled)
+fragments:
+  - var: schema
+    source: kbdb_block
+    block_id: "7a4e456e-1b0f-406a-8842-5e01d1cf1eef"  # mira-wiki-schema
+    field: content
+  - var: skill_template
+    source: kbdb_block
+    block_id: "85e3b81e-dca8-4131-bcdc-990bd0d3a16f"  # source-skill-wiki-synthesis
+    field: content
+
+# Values taken from the workflow context (input / outputs of earlier nodes)
+inputs:
+  - var: drafts                      # array of drafts
+    from: "ctx.read_drafts.blocks"
+    transform: "json_array"          # convert to a JSON array string
+  - var: existing_entities
+    from: "ctx.read_entities.blocks"
+    transform: "extract_field:page_name"  # pull page_name from each array element and join into a list
+  - var: entity_name
+    from: "ctx.loop.item"            # current element of the foreach loop
+
+# The final prompt is built by substituting fragments + inputs into skill_template
+prompt_assembly:
+  system: "{{schema}}"               # use the schema directly as the system prompt
+  user: "{{skill_template}}"         # the skill template itself contains the {{drafts}} {{existing_entities}} {{entity_name}} variables
+
+# Expected output
+output:
+  format: json                       # parsed into an object automatically (by graph-executor under the §2.2 design)
+  schema:                            # zod-style; a parse failure returns success:false
+    type: object
+    required: [triplets, entities, paragraphs, source_summary]
+```
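The `output` section above could be enforced with a check along these lines. This is a minimal sketch, not arcrun's actual code: `OutputSpec`, `ParseResult`, and `parseRecipeOutput` are hypothetical names, the `"text"` format is an assumed fallback the SDD does not define, and only the `required` keys shown in the recipe are validated.

```typescript
// Hypothetical types mirroring the recipe's `output` section.
// "text" is an assumed passthrough format; the SDD only shows "json".
type OutputSpec = {
  format: "json" | "text";
  schema?: { type: "object"; required?: string[] };
};

type ParseResult =
  | { success: true; data: unknown }
  | { success: false; error: string };

// Parse the raw LLM reply and check it against the recipe's output spec.
function parseRecipeOutput(raw: string, spec: OutputSpec): ParseResult {
  if (spec.format !== "json") {
    return { success: true, data: raw };
  }

  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch (e) {
    return { success: false, error: `invalid JSON: ${(e as Error).message}` };
  }

  if (spec.schema?.type === "object") {
    if (typeof value !== "object" || value === null || Array.isArray(value)) {
      return { success: false, error: "expected a JSON object" };
    }
    // Collect every required key that the reply does not contain.
    const record = value as Record<string, unknown>;
    const missing = (spec.schema.required ?? []).filter((key) => !(key in record));
    if (missing.length > 0) {
      return { success: false, error: `missing required keys: ${missing.join(", ")}` };
    }
  }
  return { success: true, data: value };
}
```

Returning a tagged result instead of throwing matches the `success:false` behaviour the recipe comment describes, so a downstream node can branch on it.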
+
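+### 2.2 Recipe resolution lives in cypher-executor (architecture option B)
+
+**Design decision** (2026-05-07): recipe resolution and prompt assembly happen **in cypher-executor (TypeScript)**; the existing claude_api WASM is not changed.
+
+Reasons:
+1. Recipe resolution is the same kind of work cypher-executor already does for `api_recipe / auth_recipe`
+2. The existing claude_api is already deployed and tested; leaving it alone keeps the blast radius smallest
+3. Transform logic (json_array / extract_field, etc.) is about 10x simpler to write in TS than in TinyGo
+4. It does not violate §1.6: the skill is still a KBDB block, and cypher-executor only assembles; it does not hard-code prompts
+
+**Flow:**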
+
+```
+workflow YAML node config contains `recipe: prompt_recipe:xxx`
+        │
+        ▼
+cypher-executor graph-executor.ts
+  before executing the node → detects the recipe field → hands off to the recipe expander
+        │
+        ▼
+recipe expander (new module)
+  1. Fetch the `prompt_recipe:xxx` definition from RECIPES KV
+  2. Per the fragments rules → fetch block content with the existing KBDB client
+  3. Per the inputs rules → read values from context + run transforms
+  4. Assemble the system prompt + user prompt
+  5. Use {prompt, model, mira_token, ...} as the node's actual input
+        │
+        ▼
+loader calls the claude_api container (unaware that recipes exist; still the old interface)
+        │
+        ▼
+claude_api container → Mira daemon → returns the LLM result
+        │
+        ▼
+graph-executor takes the result → parses JSON / validates schema per recipe.output
+```
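Steps 1 to 4 of the recipe expander could look roughly like the sketch below. Everything named here is an assumption for illustration: `KvLike`, `KbdbLike`, `ExpanderDeps`, and `expandPromptRecipe` are hypothetical, the recipe is assumed to be stored as JSON (the SDD allows YAML/JSON), and the transform step is injected rather than implemented. How the assembled system prompt reaches claude_api is not specified in the SDD, so it is simply returned alongside `prompt`.

```typescript
// Hypothetical minimal interfaces for the KV store and the KBDB client.
interface KvLike { get(key: string): Promise<string | null>; }
interface KbdbLike { getBlockField(blockId: string, field: string): Promise<string>; }

interface PromptRecipe {
  model: string;
  fragments: { var: string; source: string; block_id: string; field: string }[];
  inputs: { var: string; from: string; transform?: string }[];
  prompt_assembly: { system: string; user: string };
}

interface ExpanderDeps {
  kv: KvLike;
  kbdb: KbdbLike;
  applyTransform: (value: unknown, transform: string | undefined) => string;
}

// Walk a dotted path such as "ctx.read_drafts.blocks" through nested objects.
function readPath(root: unknown, path: string): unknown {
  return path.split(".").reduce<unknown>(
    (node, key) =>
      node !== null && typeof node === "object"
        ? (node as Record<string, unknown>)[key]
        : undefined,
    root,
  );
}

// Replace {{name}} placeholders; names without a value are left untouched.
function render(template: string, vars: Map<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) => vars.get(name) ?? match);
}

async function expandPromptRecipe(
  recipeKey: string,
  ctx: Record<string, unknown>,
  deps: ExpanderDeps,
): Promise<{ system: string; prompt: string; model: string }> {
  // 1. Fetch the recipe definition from KV.
  const raw = await deps.kv.get(recipeKey);
  if (raw === null) throw new Error(`recipe not found: ${recipeKey}`);
  const recipe = JSON.parse(raw) as PromptRecipe;

  // 2. Resolve fragments from KBDB blocks.
  const fragmentVars = new Map<string, string>();
  for (const f of recipe.fragments) {
    fragmentVars.set(f.var, await deps.kbdb.getBlockField(f.block_id, f.field));
  }

  // 3. Resolve inputs from the workflow context and run their transforms.
  const inputVars = new Map<string, string>();
  for (const i of recipe.inputs) {
    inputVars.set(i.var, deps.applyTransform(readPath({ ctx }, i.from), i.transform));
  }

  // 4. Two passes: fragments first, then the input variables the fragments contain.
  const assemble = (t: string): string => render(render(t, fragmentVars), inputVars);
  return {
    system: assemble(recipe.prompt_assembly.system),
    prompt: assemble(recipe.prompt_assembly.user),
    model: recipe.model,
  };
}
```

The two-pass render follows from the recipe in §2.1, where `user` is `{{skill_template}}` and the template text itself carries `{{drafts}}` and friends. Because each pass scans its input once, placeholder-looking text inside an input value is not expanded again.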
+
+**Impact on the claude_api container**: none. It still takes `{mira_token, prompt, model}`.
+
+**Experience for workflow authors**:
+```yaml
+config:
+  synthesize:
+    component: claude_api
+    recipe: "prompt_recipe:wiki_synthesis"   # ← cypher-executor detects this field and resolves it automatically
+    mira_token: "{{secret.mira_token}}"
+```
+
+Without a recipe, the old path still applies:
+```yaml
+config:
+  reply:
+    component: claude_api
+    prompt: "{{ctx.user_message}}"           # ← no recipe, so cypher-executor passes it through unchanged
+    mira_token: "{{secret.mira_token}}"
+```
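The two node shapes above suggest a small dispatch step in front of the loader. The sketch below is illustrative only: `NodeConfig`, `Expander`, and `resolveNodeInput` are hypothetical names, and the rule that expanded fields override same-named fields in the node config is an assumption, since the SDD does not say what happens when both `recipe` and `prompt` are present.

```typescript
// Hypothetical shape of one node's config entry in a workflow YAML.
type NodeConfig = { component: string; recipe?: string; [key: string]: unknown };

// Hypothetical expander: turns a recipe key into the fields the container expects.
type Expander = (recipeKey: string) => Promise<Record<string, unknown>>;

async function resolveNodeInput(
  config: NodeConfig,
  expand: Expander,
): Promise<{ component: string; input: Record<string, unknown> }> {
  const { component, recipe, ...rest } = config;

  // Old path: no recipe field, so the remaining config is passed through unchanged.
  if (recipe === undefined) {
    return { component, input: rest };
  }

  // Only the prompt_recipe kind is handled in this sketch.
  if (!recipe.startsWith("prompt_recipe:")) {
    throw new Error(`unsupported recipe kind: ${recipe}`);
  }

  // Recipe path: the container never sees the `recipe` field, only the expanded input.
  // Assumption: expanded fields win over same-named fields in the node config.
  const expanded = await expand(recipe);
  return { component, input: { ...rest, ...expanded } };
}
```

Stripping `recipe` before the loader runs is what keeps the claude_api container unaware of recipes, as the design intends.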
+
+### 2.3 Workflow YAML experience
+
+```yaml
+name: wiki_synthesis
+flow:
+  - "input >> 完成後 >> read_stale"           # 完成後 = "when done"
+  - "read_stale >> 對每個 >> read_drafts"      # 對每個 = "for each"
+  - "read_drafts >> 完成後 >> synthesize"
+  - "synthesize >> 完成後 >> write_wiki"
+config:
+  read_stale:
+    component: kbdb_get
+    page_name: "mira-wiki-index-stale"
+  read_drafts:
+    component: kbdb_get
+    page_name: "{{loop.item}}"           # entity name
+  synthesize:
+    component: claude_api
+    recipe: "prompt_recipe:wiki_synthesis"  # ← the key point: reference a recipe instead of writing a prompt
+    mira_token: "{{secret.mira_token}}"
+  write_wiki:
+    component: kbdb_ingest
+    text: "{{prev.paragraphs}}"
+```
+
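+**What an AI needs in order to write this workflow:**
+1. To know that the three containers `kbdb_get / claude_api / kbdb_ingest` exist (discoverable through MCP search)
+2. To know that the recipe `prompt_recipe:wiki_synthesis` exists (discoverable through MCP search)
+3. No knowledge of how the prompt is assembled, and no need to read the wiki schema text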
+
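+### 2.4 Is a recipe a KBDB block or a KV entry?
+
+**KV is chosen** (`RECIPES` namespace), consistent with the existing auth_recipe / api_recipe:
+- key: `prompt_recipe:{name}`
+- value: YAML/JSON
+- CLI and MCP manage it with the existing `recipe push` / `recipe list` tools (no new tooling needed)
+
+**Why not a KBDB block**:
+- polaris/mira/CLAUDE.md §1.6 does say "source-skill is stored as a KBDB block"
+- But §1.6 is about skill templates of the mira business domain (schema / skill templates)
+- A recipe is an "assembly recipe" (which blocks to point at + how to combine them), and belongs to the platform layer
+- A recipe references KBDB block ids **inside** it (fragments.source: kbdb_block), so the two layers stay cleanly separated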
+
+---
+
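+## 3. Scope
+
+**In scope for this SDD:**
+- ✅ Phase 1: prompt_recipe schema + RECIPES KV conventions
+- ✅ Phase 2: claude_api nodes accept a recipe (backward compatible with the old prompt parameter); per §2.2 the expansion is done in cypher-executor and the claude_api WASM is unchanged
+- ✅ Phase 3: write the first recipe, `prompt_recipe:wiki_synthesis`
+- ✅ Phase 4: use this recipe to finish the mira 7-B workflow
+- ✅ Phase 5: add recipe management tools to MCP (list / get / push / delete prompt_recipe)
+
+**Out of scope:**
+- Reworking HTTP api_recipe / auth_recipe (already exist; left untouched)
+- Multimodal prompts (image input): deferred to P2
+- Sandbox acceptance testing for recipes (a recipe is data, not code, so it is not needed)
+
+**Prerequisites (done):**
+- ✅ kbdb_get component (5.3)
+- ✅ component-registry MCP backfill (component-registry-canon Phase 1)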
+
+---
+
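+## 4. Why this design matters
+
+| n8n | arcrun |
+|---|---|
+| Gmail node, Slack node, OpenAI node, Anthropic node, a node per LLM, ... (one node per API) | `http_request` container + an api_recipe per service |
+| A new node for every LLM usage (chat / completion / embedding) | `claude_api` container + a prompt_recipe per use case |
+| The AI has to learn "how to use the Gmail node", "how to use the Slack node", ... | The AI learns "container + recipe" once |
+| Component count explodes (500+) | Containers stay fixed (< 30); recipes extend without limit |
+| Recipes are buried in code | Recipes live in KV; the AI does CRUD on them directly |
+
+**For promotion to AIs**: a third-party AI finds "30 containers + 100 recipes" far easier to grasp than "500 nodes", and because recipes are textual data rather than code, writing a recipe is easier for an AI than writing a node.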
+
+---
+
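+## 5. Risks and mitigations
+
+| Risk | Mitigation |
+|---|---|
+| The recipe structure is too complex for an AI to write | Phase 3 writes the first recipe (wiki_synthesis) as a reference that future AIs can copy |
+| Backward compatibility splits claude_api into two paths | Use the recipe path uniformly inside; the old prompt parameter is converted automatically into an inline recipe |
+| Recipes hard-code KBDB block ids, so a block changing id breaks them | KBDB blocks are identified more stably by `page_name` than by id; recipes support a `block_page_name` field |
+| Transform logic written into KV recipes (json_array, extract_field:x) keeps growing and turns into a mini DSL | Limit the kinds of transform (at most 10), keep a whitelist, and require a component for anything beyond it |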
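The whitelist mitigation in the last row could be sketched as a fixed lookup table. This is an assumption-laden illustration: `TRANSFORMS` and `applyTransform` are hypothetical names, only the two transforms named in the SDD are registered, and the `", "` join separator for `extract_field` is a guess because the SDD only says "join into a list".

```typescript
// A transform turns a context value into the string that is substituted into the prompt.
type Transform = (value: unknown, arg?: string) => string;

// The whitelist: anything not registered here is rejected rather than interpreted.
const TRANSFORMS: ReadonlyMap<string, Transform> = new Map<string, Transform>([
  // Serialise the value as a JSON array string (a single value is wrapped in an array).
  ["json_array", (value) => JSON.stringify(Array.isArray(value) ? value : [value])],
  // Pull one field from each array element and join the results.
  ["extract_field", (value, field) => {
    if (!Array.isArray(value) || field === undefined) {
      throw new Error("extract_field needs an array value and a field name");
    }
    return value
      .map((item) =>
        item !== null && typeof item === "object"
          ? String((item as Record<string, unknown>)[field])
          : "",
      )
      .join(", "); // assumed separator
  }],
]);

// Parse a spec such as "extract_field:page_name" and run the matching transform.
function applyTransform(value: unknown, spec?: string): string {
  if (spec === undefined) {
    return typeof value === "string" ? value : JSON.stringify(value ?? null);
  }
  const sep = spec.indexOf(":");
  const name = sep === -1 ? spec : spec.slice(0, sep);
  const arg = sep === -1 ? undefined : spec.slice(sep + 1);

  const fn = TRANSFORMS.get(name);
  if (fn === undefined) {
    throw new Error(`transform not in whitelist: ${name}`);
  }
  return fn(value, arg);
}
```

Keeping the table closed means a recipe can only select behaviour, never define it, which is what stops the transform field from growing into a mini DSL.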
+
+---
+
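+## 6. Changelog
+
+| Version | Date | Content |
+|---|---|---|
+| v1.0 | 2026-05-07 | Initial version. While dogfooding the wiki synthesis workflow we hit the "prompt assembly gap"; adds a prompt_recipe layer alongside the existing auth_recipe / api_recipe. |
+| v1.1 | 2026-05-07 | Architecture option B: recipe resolution lives in cypher-executor TS (claude_api WASM unchanged). Smaller change surface, unit-testable, same level as the existing api_recipe. |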