Leo/Arcrun

Leo 497f92a268 feat(arcrun): recipe system + resumable workflow + component registry canon

Three new platform capabilities + one component (kbdb_get) to enable
real AI workflow execution through cypher binding YAML.

## Recipe System (container + recipe pattern)
SDD: .agents/specs/recipe-system/

- prompt_recipe schema (Zod): fragments + inputs + assembly + output
- recipe-expander.ts: expand recipe ref → real prompt by fetching KBDB blocks
  + pulling context fields with transforms (pluck_content / extract_field / etc)
- 7 transform whitelist: json_array / to_string / join / markdown_list /
  extract_field / first / pluck_content
- graph-executor hooks: detect node.data.recipe → expand → inject into ctx
- output JSON parser (with markdown fence stripping for Claude-wrapped JSON)
- Stored in RECIPES KV under prompt_recipe:{name}

## Resumable Workflow (webhook callback resume)
SDD: .agents/specs/resumable-workflow/

- WorkflowPaused class + paused-runs.ts (persist/load/consume in EXEC_CONTEXT KV, 24h TTL)
- graph-executor: detect {pending:true, task_id} → persist state → throw WorkflowPaused
- cypher-handlers: catch → return {success:true, paused:true, task_id, run_id}
- POST /workflows/resume route: consume KV state → resumeFromPaused()
- Auto-inject callback_url for claude_api nodes (PUBLIC_BASE_URL or default cypher.arcrun.dev)
- claude_api/main.go: forward callback_url to Mira daemon, default timeout 25s→120s
- Idempotent (consume = load+delete)

## Component Registry Canon
SDD: .agents/specs/component-registry-canon/

- Add POST /components/index-only endpoint (metadata-only, no wasm/sandbox)
- Backfill script (mjs): scan registry/components/*/contract.yaml → submit to KV
- register-component.sh: SSOT for local + CI hook (deploy.yml change in next commit)
- Drop R2 dead storage from submitComponent + types + wrangler
- Schema relaxed: category enum + auth/ai/platform; cold_start 50→500ms; size 2→8MB

## kbdb_get component
- registry/components/kbdb_get/: TinyGo WASM, two modes (block_id / page_name list)
- .component-builds/kbdb_get/: WASI shim worker (kbdb-get.arcrun.dev)

End-to-end validation: AI uses MCP execute_workflow with recipe ref →
cypher-executor expands prompt from KBDB schema/skill blocks + drafts →
claude_api calls Mira daemon → daemon callback fires resume route →
workflow continues. Verified with real 2KB+ Karpathy LLM Wiki draft.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-07 15:52:19 +08:00

9.4 KiB


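SDD: arcrun Recipe System (Container + Recipe Pattern)

Created 2026-05-07. This platform gap surfaced while dogfooding: writing the wiki synthesis workflow hit a wall. Core principle: one WASM component = one container, and the content (the recipe) lives in the database. n8n writes a separate node for each API; arcrun takes the "container + recipe" route to keep the number of components down.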

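1. Problem

1.1 Where it hit the wall

While writing the mira wiki synthesis workflow (7-B):

Flow: kbdb_get(stale) → foreach → kbdb_get(drafts) → claude_api(synthesis prompt) → kbdb_ingest
The third step (claude_api) has to assemble a prompt: schema content + skill template + drafts array + existing_entities
The {{var}} templating built into cypher binding is too weak (top-level variables only; no nested paths, no array → string conversion)
There is no string_template component and no array_to_markdown component
Writing a dedicated wiki_prompt_builder component would repeat n8n's approach: every AI workflow would need one of its own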

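1.2 Root cause

The arcrun recipe system covers only the HTTP and auth layers:

Recipe type	Stored in	Container	Status
auth_recipe	RECIPES KV (`auth_recipe:{service}`)	auth_static_key / auth_oauth2 / ...	✅ Exists
api_recipe	RECIPES KV (`rec_{hash}`)	http_request	✅ Exists (hard-coded in cypher-executor, pending cleanup in Phase 1-3)
prompt_recipe	❌ Does not exist	claude_api (container)	Missing

The claude_api component currently takes prompt: string (an already-assembled string); it has no "recipe mode" that would let an AI call it by composing a recipe.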

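1.3 Impact

Blocking: the first wiki synthesis workflow cannot be written (7-B is stuck)
Undermines the pitch: arcrun's external value proposition is "container + recipe, so the AI writes no code", but the prompt layer cannot deliver on it
Every future AI workflow will hit the same problem: rss-tech-news commentary, the 河道 ("river") AI copilot, ai-comment, article summarization, and so on all need to assemble prompts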

2. Design

2.1 Core: prompt_recipe as a peer of auth_recipe / api_recipe

Storage: RECIPES KV, key format prompt_recipe:{name}

Structure:

id: prompt_recipe:wiki_synthesis
version: v1
description: "Mira wiki synthesis (extract triplets + write wiki paragraphs)"
model: sonnet                          # haiku / sonnet / opus (claude_api reuses its existing routing)

# Fragments taken from KBDB / other sources (fetched and inserted when the prompt is assembled)
fragments:
  - var: schema
    source: kbdb_block
    block_id: "7a4e456e-1b0f-406a-8842-5e01d1cf1eef"  # mira-wiki-schema
    field: content
  - var: skill_template
    source: kbdb_block
    block_id: "85e3b81e-dca8-4131-bcdc-990bd0d3a16f"  # source-skill-wiki-synthesis
    field: content

# Taken from the workflow context (input / upstream node outputs)
inputs:
  - var: drafts                      # array of drafts
    from: "ctx.read_drafts.blocks"
    transform: "json_array"          # convert to a JSON array string
  - var: existing_entities
    from: "ctx.read_entities.blocks"
    transform: "extract_field:page_name"  # extract page_name from each array item and join into a list
  - var: entity_name
    from: "ctx.loop.item"            # current element of the foreach loop

# The final prompt is built by substituting fragments + inputs into skill_template
prompt_assembly:
  system: "{{schema}}"               # use the schema directly as the system prompt
  user: "{{skill_template}}"         # the skill template contains the {{drafts}} {{existing_entities}} {{entity_name}} variables

# Expected output
output:
  format: json                       # parsed into an object automatically (by graph-executor, see 2.2)
  schema:                            # zod-style; a parse failure returns success:false
    type: object
    required: [triplets, entities, paragraphs, source_summary]
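
The `output` section above implies a parse-and-validate step on the model's reply. A minimal TypeScript sketch of that step follows, assuming the reply arrives as a string that may be wrapped in a Markdown code fence (the commit message mentions fence stripping for Claude-wrapped JSON). The names `parseRecipeOutput`, `stripFence`, and the exact `{success, ...}` result shape are hypothetical, and only the top-level `required` keys are checked rather than a full schema:

```typescript
// Hypothetical result shape; the SDD only states that a parse failure returns success:false.
type ParseResult =
  | { success: true; data: Record<string, unknown> }
  | { success: false; error: string };

// Three backticks, built at runtime so this sketch can sit inside a fenced block.
const FENCE = "`".repeat(3);

// Remove one surrounding Markdown code fence (with or without a language tag), if present.
function stripFence(raw: string): string {
  const text = raw.trim();
  if (!text.startsWith(FENCE) || !text.endsWith(FENCE)) return text;
  const firstNewline = text.indexOf("\n");
  if (firstNewline === -1) return text;
  // Drop the opening fence line and the closing fence.
  return text.slice(firstNewline + 1, text.length - FENCE.length).trim();
}

// Parse the reply as a JSON object and check that every required key is present.
function parseRecipeOutput(raw: string, required: string[]): ParseResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(stripFence(raw));
  } catch (err) {
    return { success: false, error: `invalid JSON: ${(err as Error).message}` };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { success: false, error: "expected a JSON object" };
  }
  const data = parsed as Record<string, unknown>;
  const missing = required.filter((key) => !(key in data));
  if (missing.length > 0) {
    return { success: false, error: `missing required keys: ${missing.join(", ")}` };
  }
  return { success: true, data };
}
```

For the sample recipe, the call would be `parseRecipeOutput(reply, ["triplets", "entities", "paragraphs", "source_summary"])`.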

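2.2 Recipe resolution lives in cypher-executor (architecture option B)

Design decision (2026-05-07): recipe resolution and prompt assembly happen in the cypher-executor TypeScript code; the existing claude_api WASM is not modified.

Rationale:

Recipe resolution is the same kind of work cypher-executor already does for api_recipe / auth_recipe
The existing claude_api is already deployed and tested; leaving it alone keeps the blast radius smallest
Transform logic (json_array / extract_field, etc.) is about 10x simpler to write in TS than in TinyGo
It does not violate §1.6: the skill is still a KBDB block, and cypher-executor is only the assembler and hard-codes no prompt

Flow: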

a workflow YAML node config contains `recipe: prompt_recipe:xxx`
        │
        ▼
cypher-executor graph-executor.ts
  before executing the node → detect the recipe field → run the recipe expander
        │
        ▼
recipe expander (new module)
  1. Fetch the `prompt_recipe:xxx` definition from RECIPES KV
  2. Per the fragments rules → fetch block content with the existing KBDB client
  3. Per the inputs rules → read values from the context + run transforms
  4. Assemble the system prompt + user prompt
  5. Use {prompt, model, mira_token, ...} as the node's actual input
        │
        ▼
loader calls the claude_api container (unaware that recipes exist; still uses the old interface)
        │
        ▼
claude_api container → Mira daemon → returns the LLM result
        │
        ▼
graph-executor takes the result → parses JSON / validates the schema per the recipe.output rules
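
Steps 1 to 4 of the expander can be sketched roughly as below. This is an illustration under stated assumptions, not the actual recipe-expander.ts: in-memory `Map`s stand in for the RECIPES KV and the KBDB client, everything is synchronous, only the two transforms used by the sample recipe are implemented, and `expandRecipe`, `readPath`, `applyTransform`, and `render` are hypothetical names. How the system prompt is folded into the single `prompt` field that claude_api takes is not specified in this document, so the sketch returns both parts separately.

```typescript
interface Fragment { var: string; source: "kbdb_block"; block_id: string; field: string }
interface InputRule { var: string; from: string; transform?: string }
interface PromptRecipe {
  model: string;
  fragments: Fragment[];
  inputs: InputRule[];
  prompt_assembly: { system: string; user: string };
}
type Block = Record<string, unknown>;

// Walk a dotted path such as "ctx.read_drafts.blocks" through the workflow context.
function readPath(ctx: Record<string, unknown>, path: string): unknown {
  let cur: unknown = ctx;
  for (const key of path.replace(/^ctx\./, "").split(".")) {
    if (typeof cur !== "object" || cur === null) return undefined;
    cur = (cur as Record<string, unknown>)[key];
  }
  return cur;
}

// Only json_array and extract_field:<name>; anything else is rejected.
function applyTransform(value: unknown, transform?: string): string {
  if (!transform) {
    if (value === undefined) return "";
    return typeof value === "string" ? value : JSON.stringify(value);
  }
  const [name, arg] = transform.split(":");
  if (name === "json_array") return JSON.stringify(Array.isArray(value) ? value : [value]);
  if (name === "extract_field" && arg && Array.isArray(value)) {
    return value.map((item) => String((item as Block)[arg])).join(", ");
  }
  throw new Error(`unsupported transform: ${transform}`);
}

// Replace {{name}} placeholders; unknown names are left untouched.
function render(template: string, vars: Map<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (whole: string, name: string) => vars.get(name) ?? whole);
}

function expandRecipe(
  recipeKey: string,
  recipes: Map<string, PromptRecipe>, // stand-in for the RECIPES KV
  blocks: Map<string, Block>,         // stand-in for the KBDB client
  ctx: Record<string, unknown>,
): { system: string; user: string; model: string } {
  const recipe = recipes.get(recipeKey);
  if (!recipe) throw new Error(`recipe not found: ${recipeKey}`);

  const fragmentVars = new Map<string, string>();
  for (const f of recipe.fragments) {
    const block = blocks.get(f.block_id);
    if (!block) throw new Error(`block not found: ${f.block_id}`);
    fragmentVars.set(f.var, String(block[f.field] ?? ""));
  }
  const inputVars = new Map<string, string>();
  for (const rule of recipe.inputs) {
    inputVars.set(rule.var, applyTransform(readPath(ctx, rule.from), rule.transform));
  }
  // Fragments first, inputs second: a skill template pulled in as a fragment
  // carries {{drafts}}-style placeholders that only the second pass can fill.
  const fill = (t: string): string => render(render(t, fragmentVars), inputVars);
  return {
    system: fill(recipe.prompt_assembly.system),
    user: fill(recipe.prompt_assembly.user),
    model: recipe.model,
  };
}
```

Substituting inputs last also means that draft text containing `{{...}}` is inserted verbatim and never re-expanded, which seems the safer default for user-supplied content.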

Impact on the claude_api container: none. It still takes {mira_token, prompt, model}.

Experience for the workflow author:

config:
  synthesize:
    component: claude_api
    recipe: "prompt_recipe:wiki_synthesis"   # ← cypher-executor detects this field and expands it automatically
    mira_token: "{{secret.mira_token}}"

Without a recipe, the old path still applies:

config:
  reply:
    component: claude_api
    prompt: "{{ctx.user_message}}"           # ← no recipe, so cypher-executor passes it through as-is
    mira_token: "{{secret.mira_token}}"
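
The two configs above differ only in whether a `recipe` field is present, so the branch in the executor can be quite small. A hedged sketch, in which `resolveNodeInput` and the injected `expand` callback are hypothetical names rather than functions from graph-executor.ts:

```typescript
type NodeConfig = { component: string; recipe?: string; prompt?: string; [key: string]: unknown };
type Expanded = { prompt: string; model: string };

// `expand` is injected so the sketch stays independent of the real recipe expander.
function resolveNodeInput(
  config: NodeConfig,
  expand: (recipeKey: string) => Expanded,
): Record<string, unknown> {
  const input: Record<string, unknown> = { ...config };
  const recipe = config.recipe;
  if (typeof recipe !== "string") {
    // Legacy path: no recipe, so the config is forwarded unchanged.
    return input;
  }
  if (!recipe.startsWith("prompt_recipe:")) {
    throw new Error(`not a prompt recipe reference: ${recipe}`);
  }
  // Recipe path: drop the reference and let the expanded prompt and model
  // override any inline values, while fields such as mira_token are kept.
  delete input["recipe"];
  return { ...input, ...expand(recipe) };
}
```

Keeping the container's input shape identical on both paths is what lets claude_api stay unaware that recipes exist.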

2.3 Workflow YAML experience

name: wiki_synthesis
flow:
  # Edge labels: 完成後 = "on completion", 對每個 = "for each"
  - "input >> 完成後 >> read_stale"
  - "read_stale >> 對每個 >> read_drafts"
  - "read_drafts >> 完成後 >> synthesize"
  - "synthesize >> 完成後 >> write_wiki"
config:
  read_stale:
    component: kbdb_get
    page_name: "mira-wiki-index-stale"
  read_drafts:
    component: kbdb_get
    page_name: "{{loop.item}}"           # entity name
  synthesize:
    component: claude_api
    recipe: "prompt_recipe:wiki_synthesis"  # ← key point: reference the recipe instead of writing a prompt
    mira_token: "{{secret.mira_token}}"
  write_wiki:
    component: kbdb_ingest
    text: "{{prev.paragraphs}}"

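To write this workflow, an AI only needs to:

Know that the three containers kbdb_get / claude_api / kbdb_ingest exist (discoverable through MCP search)
Know that the recipe prompt_recipe:wiki_synthesis exists (discoverable through MCP search)
It does not need to understand how the prompt is assembled or to read the wiki schema text

2.4 Is a recipe a KBDB block or a KV entry?

KV (the RECIPES namespace) is chosen, consistent with the existing auth_recipe / api_recipe: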

key: prompt_recipe:{name}
value: YAML/JSON
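The CLI and MCP manage it with the existing recipe push / recipe list tools (no new tooling needed)

Why not a KBDB block:

polaris/mira/CLAUDE.md §1.6 does say "source-skill is stored as a KBDB block"
But §1.6 is about mira's business-level skill templates (schema / skill templates)
A recipe is a "composition recipe" (which blocks to reference + how to combine them) and belongs to the platform layer
A recipe references KBDB block ids (fragments.source: kbdb_block), so the relationship between the two layers stays clear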

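3. Scope

In scope for this SDD:

✅ Phase 1: prompt_recipe schema + RECIPES KV conventions
✅ Phase 2: claude_api nodes accept a recipe, expanded in cypher-executor per 2.2 (backward compatible with the old prompt parameter)
✅ Phase 3: write the first recipe, prompt_recipe:wiki_synthesis
✅ Phase 4: complete the mira 7-B workflow with this recipe
✅ Phase 5: add recipe management tools to MCP (list / get / push / delete prompt_recipe)

Out of scope:

Reworking the HTTP api_recipe / auth_recipe (they already exist and are left untouched)
Multimodal prompts (image input): deferred to P2
Sandbox acceptance testing for recipes (a recipe is data, not code, so none is needed)

Prerequisites (done):

✅ kbdb_get component (5.3)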
✅ component-registry MCP backfill（component-registry-canon Phase 1）

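4. Why this design matters

n8n	arcrun
Gmail node, Slack node, OpenAI node, Anthropic node, a node per LLM, ... (one node per API)	`http_request` container + an api_recipe per service
A new node for each LLM usage (chat / completion / embedding)	`claude_api` container + a prompt_recipe per use case
The AI has to learn "how to use the Gmail node", "how to use the Slack node", ...	The AI learns "container + recipe" once
Component count explodes (500+)	Fixed set of containers (< 30), unlimited recipes
Recipes buried in code	Recipes live in KV, and the AI can CRUD them directly

For adoption by AIs: a third-party AI finds "30 containers + 100 recipes" far easier to grasp than "500 nodes", and because recipes are text data rather than code, writing a recipe is easier for an AI than writing a node.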

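5. Risks and mitigations

Risk	Mitigation
The recipe structure is too complex for an AI to write	Phase 3 produces the first recipe (wiki_synthesis) as a template for future AIs to copy
Backward compatibility splits claude_api into two paths	Use the recipe path internally throughout; an old-style prompt parameter is converted automatically into an inline recipe
A recipe hard-codes KBDB block ids, so it breaks if a block's id changes	Identifying KBDB blocks by `page_name` is more stable than by id; recipes support a `block_page_name` field
Transform logic written into KV recipes (json_array, extract_field:x) keeps growing into a mini DSL	Cap the number of transforms (10 at most) and whitelist them; anything beyond that should be written as a component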
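
The whitelist mitigation in the last row could be enforced with a fixed lookup table. The sketch below uses the seven transform names listed in the commit message; only the behaviour of `json_array` and `extract_field` is described in this document, so the other five bodies are guesses at the intended semantics, and `runTransform` is a hypothetical name:

```typescript
type Transform = (value: unknown, arg?: string) => string;

const asArray = (value: unknown): unknown[] => (Array.isArray(value) ? value : [value]);

// Render a value as text: strings as-is, null/undefined as "", everything else as JSON.
const text = (value: unknown): string => {
  if (value === undefined || value === null) return "";
  return typeof value === "string" ? value : JSON.stringify(value);
};

// Read one property from an object-like item; non-objects yield undefined.
const field = (item: unknown, name: string): unknown =>
  typeof item === "object" && item !== null
    ? (item as Record<string, unknown>)[name]
    : undefined;

// The whitelist: a transform that is not in this table cannot run.
const TRANSFORMS = new Map<string, Transform>([
  ["json_array", (v) => JSON.stringify(asArray(v))],
  ["extract_field", (v, name = "") => asArray(v).map((item) => text(field(item, name))).join(", ")],
  // Assumed semantics from here on:
  ["to_string", (v) => text(v)],
  ["join", (v, sep = ", ") => asArray(v).map((item) => text(item)).join(sep)],
  ["markdown_list", (v) => asArray(v).map((item) => `- ${text(item)}`).join("\n")],
  ["first", (v) => text(asArray(v)[0])],
  ["pluck_content", (v) => asArray(v).map((item) => text(field(item, "content"))).join("\n\n")],
]);

// Parse a "name" or "name:arg" spec and run the matching whitelisted transform.
function runTransform(spec: string, value: unknown): string {
  const sep = spec.indexOf(":");
  const name = sep === -1 ? spec : spec.slice(0, sep);
  const arg = sep === -1 ? undefined : spec.slice(sep + 1);
  const fn = TRANSFORMS.get(name);
  if (!fn) throw new Error(`transform not in whitelist: ${name}`);
  return fn(value, arg);
}
```

Because the table is closed, a recipe that asks for anything else fails fast, which is the point at which the SDD suggests writing a component instead.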

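6. Change log

Version	Date	Changes
v1.0	2026-05-07	Initial version. Dogfooding the wiki synthesis workflow exposed a "prompt assembly gap"; this adds a prompt_recipe layer as a peer of the existing auth_recipe / api_recipe.
v1.1	2026-05-07	Architecture option B: recipe resolution lives in the cypher-executor TS code (the claude_api WASM is unchanged). Smaller change surface, unit-testable, and at the same level as the existing api_recipe.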
