# Design Document: arcrun MVP

## Overview

The core design principle of the arcrun MVP is **minimal change, fastest path to usable**. Every goal can be met with three operations:

1. **Cherry-pick**: move the designated directories over from `matrix` without rewriting them
2. **Carve-out**: remove the code paths in cypher-executor that are coupled to InkStone
3. **Supplement**: fill in the fields missing from contract.yaml and add a CLI

No new abstraction layers are introduced and existing component logic is left unchanged; only the minimal changes needed to make the project usable as open source are made.

---

## Architecture

### Target Repo Structure

```
arcrun/ (new standalone open-source repo)
├── README.md
├── CONTRIBUTING.md
├── cypher-executor/
│   ├── src/
│   │   ├── index.ts
│   │   ├── types.ts
│   │   ├── graph-executor.ts
│   │   ├── lib/
│   │   │   ├── component-loader.ts        ← changed: reads only from WASM_BUCKET; KBDB/REGISTRY logic removed
│   │   │   ├── component-dispatcher.ts
│   │   │   ├── wasm-executor.ts
│   │   │   ├── wasi-shim.ts
│   │   │   └── constants.ts               ← changed: MINI_ME / KBDB special-component hardcoding removed
│   │   └── actions/
│   │       ├── triplet-parser.ts
│   │       ├── graph-builder.ts
│   │       ├── execution-evaluator.ts
│   │       ├── execution-logger.ts
│   │       ├── webhook-handlers.ts
│   │       ├── webhook-graph-resolver.ts  ← changed: credential injection logic added
│   │       └── (autoPublishMissing.ts removed)
│   └── wrangler.toml                      ← changed: 9 InkStone bindings removed, CREDENTIALS_KV added
├── credentials/                           ← moved as-is, no changes needed
│   └── src/...
├── builtins/                              ← moved as-is, no changes needed
│   └── src/...
├── registry/
│   └── components/                        ← contract.yaml supplemented after the move
│       ├── gmail/
│       ├── google_sheets/
│       ├── telegram/
│       ├── line_notify/
│       └── ... (the remaining 17 components)
└── cli/                                   ← new: arcrun CLI
    ├── package.json
    ├── tsconfig.json
    └── src/
        ├── index.ts
        ├── commands/
        │   ├── init.ts
        │   ├── creds.ts
        │   ├── push.ts
        │   ├── run.ts
        │   ├── validate.ts
        │   ├── parts.ts
        │   ├── list.ts
        │   └── logs.ts
        └── lib/
            ├── config.ts       # reads and writes ~/.arcrun/config.yaml
            ├── cf-api.ts       # Cloudflare KV / R2 HTTP API wrapper
            └── yaml-parser.ts  # workflow.yaml parsing and triplet conversion
```

---

## Component Loader Rework (Key Change)

### Current State (matrix version)

```typescript
// component-loader.ts currently uses a four-level priority order:
// 1. Hardcoded special components → MINI_ME / KBDB Service Binding
// 2. Built-in component Map → local pure transforms
// 3. New style: look up a KBDB record that has component_type → WASM or Service Binding
// 4. Legacy fallback: look up a KBDB record without component_type
```

### Open-Source Version (arcrun)

```typescript
// component-loader.ts is simplified to three levels:
// 1. Built-in component Map → local pure transforms (passthrough / counter, etc.; kept)
// 2. Direct read from the WASM_BUCKET R2 bucket → {componentId}/{componentId}.wasm
// 3. Not found → return a structured error

async function loadComponent(componentId: string, env: Env) {
  // Level 1: built-in components (no R2 access needed)
  if (BUILTIN_COMPONENTS.has(componentId)) {
    return BUILTIN_COMPONENTS.get(componentId)
  }

  // Level 2: read from the WASM_BUCKET R2 bucket
  const wasmKey = `${componentId}/${componentId}.wasm`
  const wasmObj = await env.WASM_BUCKET.get(wasmKey)
  if (wasmObj) {
    return { type: 'wasm', buffer: await wasmObj.arrayBuffer() }
  }

  // Level 3: not found
  throw new Error(`Component not found: ${componentId}. Make sure ${wasmKey} exists in WASM_BUCKET.`)
}
```

Removed: the hardcoded paths for the `MINI_ME` and `KBDB` special components (`comp_claude_chat`, `comp_kbdb_search`, `comp_kbdb_history`).

---

## Credential Injection Flow Design

### Execution Sequence

```
acr run newsletter_subscribe
  ↓
cypher-executor POST /webhook/:id
  ↓
webhook-graph-resolver reads WEBHOOKS KV → workflow definition
  ↓
graph-executor executes node send_thanks
  ↓
Before execution: credential-injector (new)
  Look up the component behind send_thanks: canonical_id = "gmail"
  Read registry/components/gmail/component.contract.yaml
  Find credentials_required: [{key: "gmail_token", inject_as: "access_token"}]
  GET CREDENTIALS_KV["gmail_token"] → AES-GCM decrypt
  input.access_token = decryptedToken
  ↓
wasm-executor runs gmail.wasm (stdin = the full input, including access_token)
  ↓
Return the result
```

### Where credential-injector Lives

It goes in `cypher-executor/src/actions/credential-injector.ts` (new) and is called by `graph-executor.ts` before each node executes.

```typescript
async function injectCredentials(
  componentId: string,
  input: Record<string, unknown>,
  env: Env
): Promise<Record<string, unknown>> {
  // loadContract reads the contract from WASM_BUCKET or from a local copy
  const contract = await loadContract(componentId, env)
  if (!contract.credentials_required) return input

  // Copy the input so the caller's object is not mutated
  const enriched = { ...input }
  for (const cred of contract.credentials_required) {
    const record = await env.CREDENTIALS_KV.get(cred.key)
    if (!record) {
      throw new Error(
        `Missing credential: ${cred.key}\nFix: add ${cred.key} to credentials.yaml, then run acr creds push`
      )
    }
    // Each record stores the AES-GCM ciphertext together with its IV
    const { encrypted, iv } = JSON.parse(record)
    enriched[cred.inject_as] = await decrypt(encrypted, iv, env.ENCRYPTION_KEY)
  }
  return enriched
}
```

---

## workflow.yaml Parsing Design

### CLI push Flow

```
acr push newsletter_subscribe.yaml
  ↓
yaml-parser.ts reads workflow.yaml
  ↓
Parse the flow[] triplets → triplets: [{subject, relation, object}]
Validate the relation keywords (reject PIPE)
  ↓
POST cypher-executor /cypher/search → ExecutionGraph (nodes + edges)
  ↓
Merge config: + ExecutionGraph → WorkflowDefinition
  ↓
PUT WEBHOOKS KV[workflow_name] = JSON.stringify(WorkflowDefinition)
  ↓
Print the webhook URL
```

### workflow.yaml Format (confirmed version)

```yaml
name: newsletter_subscribe
description: Subscribe to the newsletter, send a thank-you email, and log the subscriber to GSheets

# Relation keywords: 完成後 = on completion, 失敗時 = on failure
flow:
  - "input >> 完成後 >> send_thanks"
  - "input >> 完成後 >> save_to_sheet"
  - "send_thanks >> 完成後 >> output"
  - "send_thanks >> 失敗時 >> notify_error"
  - "save_to_sheet >> 完成後 >> output"

config:
  send_thanks:
    to: "{{input.email}}"
    subject: "Thanks for subscribing!"
    body: "Welcome aboard!"
    # access_token is injected automatically from gmail_token in credentials.yaml
  save_to_sheet:
    action: write
    spreadsheet_id: "{{creds.sheet_id}}"
    range: "Subscribers!A:B"
    values: [["{{input.email}}", "{{input.timestamp}}"]]
  notify_error:
    chat_id: "{{creds.telegram_chat_id}}"
    text: "Failed to send email: {{input.email}}"
```

---

## contract.yaml Supplementary Field Format

### credentials_required (gmail example)

```yaml
credentials_required:
  - key: gmail_token
    type: google_oauth
    description: "Google OAuth access token (gmail.send scope)"
    inject_as: access_token
```

### config_example (gmail example)

```yaml
config_example: |
  send_email:        # node name (customizable)
    to: ""           # recipient email (required)
    subject: ""      # subject line (required)
    body: ""         # message body (required)
    # access_token is injected automatically from gmail_token in credentials.yaml
```

### credentials_required by Component

| Component | key | type | inject_as |
|-----------|-----|------|-----------|
| gmail | gmail_token | google_oauth | access_token |
| google_sheets | google_oauth | google_oauth | access_token |
| telegram | telegram_bot_token | telegram_bot_token | bot_token |
| line_notify | line_token | line_token | token |

---

## CLI Technical Design

### Dependencies

```json
{
  "dependencies": {
    "commander": "^12.0.0",
    "js-yaml": "^4.1.0",
    "chalk": "^5.3.0",
    "ora": "^8.0.1"
  }
}
```

### config.yaml Format (~/.arcrun/config.yaml)

```yaml
cloudflare_account_id: abc123
webhooks_kv_id: xyz789
credentials_kv_id: abc456
wasm_bucket: arcrun-wasm
cypher_executor_url: https://cypher-executor.xxx.workers.dev
credentials_worker_url: https://arcrun-credentials.xxx.workers.dev
api_token: "***"  # stored encrypted on the local machine
```

### Cloudflare API Operations

The CLI uses the Cloudflare REST API and does not depend on the Wrangler CLI:

- KV write: `PUT /client/v4/accounts/{id}/storage/kv/namespaces/{ns_id}/values/{key}`
- KV read: `GET /client/v4/accounts/{id}/storage/kv/namespaces/{ns_id}/values/{key}`
- KV list: `GET /client/v4/accounts/{id}/storage/kv/namespaces/{ns_id}/keys`

---

## wrangler.toml Changes

### Removed (InkStone-specific)

```toml
# Remove all of these:
[[services]]
binding = "KBDB"
service = "inkstone-kbdb-api"

[[services]]
binding = "REGISTRY"
service = "inkstone-component-registry"

[[services]]
binding = "CLINIC_GDRIVE"
service = "clinic-gdrive"

# ... CLINIC_EXCEL, CLINIC_ANALYSIS, CLINIC_RENDER, CLINIC_GSHEETS

[[services]]
binding = "AICEO"
service = "inkstone-aiceo-bot"

[[services]]
binding = "MINI_ME"
service = "inkstone-mini-me"
```

### Kept

```toml
[[kv_namespaces]]
binding = "EXEC_CONTEXT"

[[kv_namespaces]]
binding = "WEBHOOKS"

[[r2_buckets]]
binding = "WASM_BUCKET"

[ai]
binding = "AI"
```

### Added

```toml
[[kv_namespaces]]
binding = "CREDENTIALS_KV"
id = ""  # filled in by the user
```

---

## Standard Mode Architecture (the user's own KV, arcrun.dev's engine)

### Storage Responsibility Boundaries

```
arcrun.dev is responsible for:
  WASM_BUCKET      public component library (.wasm binaries)
  ANALYTICS_KV     component execution statistics
  ACCOUNTS_KV      API Key → tenant_id + CF API Token mapping
  SUBMISSIONS_KV   component submission review status

The user is responsible for (a single CF KV; arcrun.dev never accesses plaintext):
  USER_KV
    workflow:{name} → workflow execution graph (JSON)
    cred:{key}      → AES-GCM encrypted credential
```

### Full System Diagram

```
┌────────────────────────────────────────────────────────────────────┐
│  arcrun.dev (your Cloudflare account)                              │
│                                                                    │
│  auth-worker (api.arcrun.dev)                                      │
│    POST /register → { api_key, tenant_id }                         │
│    ACCOUNTS_KV: { tenant_id, cf_api_token, api_key_hash }          │
│    Note: stores no user credentials or workflow content            │
│                                                                    │
│  cypher-executor (cypher.arcrun.dev, shared)                       │
│    X-Arcrun-API-Key → tenant_id → cf_api_token                     │
│    Uses cf_api_token to call the CF KV API → reads the user's      │
│    own USER_KV                                                     │
│    WASM_BUCKET: gmail.wasm / telegram.wasm / ... (shared)          │
│                                                                    │
│  public registry (registry.arcrun.dev)                             │
│    GET /components → component list + stats + author + visibility  │
│    POST /submit → receives a component; after sandbox validation   │
│                   sets visibility to author_only                   │
│    POST /analytics/record → execution statistics (async)           │
└────────────────────────────────────────────────────────────────────┘
        ↕ CF KV API (the user's cf_api_token, KV Edit permission)
┌────────────────────────────────────────────────────────────────────┐
│  The user's own CF account                                         │
│  USER_KV                                                           │
│    workflow:newsletter → { triplets, config }                      │
│    cred:gmail_token    → { encrypted, iv }                         │
│    cred:telegram_bot   → { encrypted, iv }                         │
└────────────────────────────────────────────────────────────────────┘
```

### acr init Interactive Flow

```
$ acr init
? Your Cloudflare Account ID: abc123
? USER_KV Namespace ID (create a KV in the CF Dashboard, then paste its ID): kv_xyz
? CF API Token (only KV Edit permission is needed; arcrun uses it to access your KV): ***
? Email (to obtain an arcrun.dev API Key): you@example.com

→ Calls POST https://api.arcrun.dev/register { email, cf_api_token_hash }
→ Receives api_key: ak_xxxxxxxx

✓ Setup complete → ~/.arcrun/config.yaml
✓ Created credentials.yaml (added to .gitignore)

Your credentials and workflows live in your own CF KV; arcrun does not store them.
```

### API Key Validation and KV Access Middleware

```typescript
// cypher-executor/src/lib/tenant.ts (new)
export async function resolveUserKv(request: Request, env: Env) {
  if (env.MULTI_TENANT === 'false') {
    // Self-hosted: use the local KV binding directly
    return { kv: env.LOCAL_KV, prefix: '' }
  }

  const apiKey = request.headers.get('X-Arcrun-API-Key')
  if (!apiKey) throw new Response('Missing API Key', { status: 401 })

  // Accounts are indexed by the SHA-256 hash of the API key
  const hash = await sha256(apiKey)
  const account = await env.ACCOUNTS_KV.get(`hash:${hash}`)
  if (!account) throw new Response('Invalid API Key', { status: 401 })

  const { cf_api_token, account_id, kv_namespace_id } = JSON.parse(account)

  // Return a CF KV API wrapper that accesses the KV with the user's own token
  return {
    kv: new CfKvClient({ cf_api_token, account_id, kv_namespace_id }),
    prefix: ''
  }
}
```

### USER_KV Key Schema

```
Standard mode (the user's own KV):
  workflow:{name} → WorkflowDefinition JSON
  cred:{key}      → { encrypted (base64), iv (base64) }

Self-hosted (local KV binding):
  Keeps the existing key format, no prefix
```

---

## Execution Statistics Design

### Analytics Record (asynchronous, does not block execution)

```typescript
// cypher-executor/src/actions/analytics.ts (new)
export function recordExecution(
  componentId: string,
  version: string,
  success: boolean,
  durationMs: number
): void {
  // Fire-and-forget: deliberately not awaited
  fetch('https://registry.arcrun.dev/analytics/record', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ canonical_id: componentId, version, success, duration_ms: durationMs })
  }).catch(() => {}) // a failed statistics call must not affect execution
}
```

### Statistics Aggregation (registry Worker)

```
ANALYTICS_KV structure:
  "stats:gmail:v1" → { total_runs: 140382, success_runs: 139444, total_ms: 16845840 }

On every POST /analytics/record:
  Atomic update (KV optimistic locking) → total_runs++, success_runs += success, total_ms += duration_ms

GET /components returns:
  success_rate    = success_runs / total_runs * 100
  avg_duration_ms = total_ms / total_runs
  Sort order: total_runs × success_rate (DESC)
```

---

## Component Contribution Flow Design

### `acr parts publish` Flow

```
$ acr parts publish gmail-v2

1. The CLI reads the registry/components/gmail-v2/ directory
   - component.contract.yaml (must contain an author field)
   - main.go
   - gmail-v2.wasm

2. POST https://registry.arcrun.dev/submit
   multipart form fields: contract, source, wasm (the three files from step 1)
   Header: X-Arcrun-API-Key: ak_xxxxx

3. The registry responds:
   { submission_id: "sub_abc123", status: "pending_review" }

4. CLI output:
   ✓ Submitted (submission_id: sub_abc123)
   Check progress: acr parts publish --status sub_abc123
```

### Registry Sandbox Validation and visibility State Machine

```
POST /submit triggers the following (runs asynchronously):

[Integration components: gmail, telegram, google_sheets, line_notify, http_request]
  Step 1: Size check (< 2048 KB)
  Step 2: Syscall scan (no filesystem syscalls; network syscalls are allowed because these components call external APIs)
  Pass → visibility: author_only (usable by the author immediately, awaiting manual review)

[Functional components: all other components]
  Step 1: Size check (< 2048 KB)
  Step 2: Cold-start time (< 50 ms)
  Step 3: Syscall scan (no network / no filesystem)
  Step 4: Gherkin tests (every scenario in the contract passes, 100%)
  Pass → visibility: author_only (usable by the author immediately, awaiting manual review)

Any step fails → status: rejected
  - Returns { failed_step, reason }, in the same format as Requirement 2

Manual review approved → visibility: public
  - The component appears in everyone's GET /components
  - Public execution statistics start accumulating

Manual review rejected → visibility stays author_only
  - The author can still use the component but receives the rejection reason
  - The author can revise and resubmit

acr parts display rules:
  visibility: author_only → [Pending review] usable only by you (no statistics shown)
  visibility: public      → ★ success rate | N runs | by @author
```

---

## Development Order (phases aligned with requirements)

```
Phase 1: Move and clean up (Requirement 1)
  1.1 Create the arcrun repo and move the four directories
  1.2 Clean up cypher-executor/wrangler.toml
  1.3 Rewrite component-loader (remove the KBDB/REGISTRY/MINI_ME paths)
  1.4 Remove autoPublishMissing.ts (depends on the REGISTRY binding)
  1.5 Test /health locally with wrangler dev

Phase 2: Component completeness (Requirement 2)
  2.1 Review contract.yaml for all 21 components (report as a table)
  2.2 Add credentials_required (4 components)
  2.3 Add config_example (all 21)
  2.4 Verify that the required fields in main.go match the contract

Phase 3: Credential injection (Requirement 3)
  3.1 Add credential-injector.ts
  3.2 Integrate it into graph-executor before node execution
  3.3 Test the gmail component end to end (credentials.yaml → push → run)

Phase 4: CLI (Requirement 4)
  4.1 acr init (--hosted / --self-hosted branches)
  4.2 acr creds push (Hosted goes through the API, Self-hosted goes through KV)
  4.3 acr push
  4.4 acr run
  4.5 acr validate
  4.6 acr parts / acr parts scaffold / acr parts publish
  4.7 acr list / acr logs

Phase 5: Open-source release (Requirement 5)
  5.1 Write README.md (including a --hosted quick start)
  5.2 Write CONTRIBUTING.md
  5.3 Confirm that no InkStone-internal information remains
  5.4 GitHub release + npm publish

Phase 6: Hosted SaaS (Requirement 6)
  6.1 Build auth-worker (api.arcrun.dev)
  6.2 Add the tenant middleware to cypher-executor
  6.3 Add a tenant prefix to the CREDENTIALS_KV key schema
  6.4 Deploy to arcrun.dev

Phase 7: Statistics and contribution (Requirement 7 + 8)
  7.1 analytics.ts (fire-and-forget after execution)
  7.2 registry /analytics/record endpoint
  7.3 ANALYTICS_KV aggregation logic
  7.4 Add statistics-based sorting to GET /components
  7.5 POST /submit sandbox validation + author recording
  7.6 acr parts publish command
```
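
The `acr push` flow in the workflow.yaml parsing section turns each `flow` entry into a `{subject, relation, object}` triplet and rejects the `PIPE` relation. Below is a minimal sketch of that single step, assuming `>>` is the separator as in the example workflow. The names `parseFlowLine`, `parseFlow`, and `REJECTED_RELATIONS`, and the error messages, are hypothetical and not taken from `yaml-parser.ts`.

```typescript
// Hypothetical sketch of the triplet step; not the actual yaml-parser.ts code.
interface Triplet {
  subject: string;
  relation: string;
  object: string;
}

// Relation keywords that must be rejected (the design only names PIPE).
const REJECTED_RELATIONS = new Set<string>(["PIPE"]);

// Parse one flow entry of the form "subject >> relation >> object".
function parseFlowLine(line: string): Triplet {
  const parts = line.split(">>").map((p) => p.trim());
  if (parts.length !== 3 || parts.some((p) => p === "")) {
    throw new Error(`Invalid flow entry (expected "subject >> relation >> object"): ${line}`);
  }
  const [subject, relation, object] = parts as [string, string, string];
  if (REJECTED_RELATIONS.has(relation)) {
    throw new Error(`Relation "${relation}" is not allowed: ${line}`);
  }
  return { subject, relation, object };
}

// Parse the whole flow[] array; the first invalid entry aborts with an error.
function parseFlow(flow: string[]): Triplet[] {
  return flow.map(parseFlowLine);
}

const triplets = parseFlow([
  "input >> 完成後 >> send_thanks",
  "send_thanks >> 失敗時 >> notify_error",
]);
console.log(triplets);
```

Failing on the first bad entry keeps a half-parsed workflow from ever reaching the `PUT WEBHOOKS KV` step.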
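
The read-side formulas in the statistics aggregation section can be checked against the sample `stats:gmail:v1` record. The sketch below is illustrative only: the `ComponentStats` and `ComponentSummary` types and the `summarize` and `rankComponents` names are assumptions. It covers the arithmetic behind `GET /components`, not the KV update performed on each `POST /analytics/record`.

```typescript
// Hypothetical sketch of the GET /components arithmetic; names are illustrative.
interface ComponentStats {
  total_runs: number;
  success_runs: number;
  total_ms: number;
}

interface ComponentSummary {
  id: string;
  success_rate: number;    // percent, 0 to 100
  avg_duration_ms: number;
  score: number;           // total_runs × success_rate, used for sorting
}

function summarize(id: string, s: ComponentStats): ComponentSummary {
  // Components that have never run get zeros instead of NaN from 0 / 0.
  const success_rate = s.total_runs === 0 ? 0 : (s.success_runs / s.total_runs) * 100;
  const avg_duration_ms = s.total_runs === 0 ? 0 : s.total_ms / s.total_runs;
  return { id, success_rate, avg_duration_ms, score: s.total_runs * success_rate };
}

// Sort by total_runs × success_rate, highest first (DESC).
function rankComponents(all: Record<string, ComponentStats>): ComponentSummary[] {
  return Object.entries(all)
    .map(([id, s]) => summarize(id, s))
    .sort((a, b) => b.score - a.score);
}

const gmail = summarize("gmail", {
  total_runs: 140382,
  success_runs: 139444,
  total_ms: 16845840,
});
console.log(gmail.success_rate.toFixed(2), gmail.avg_duration_ms);
```

For the sample record this gives a success rate of about 99.33% and an average duration of 120 ms.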