# Design Document: arcrun MVP
## Overview
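The core design principle of the arcrun MVP is **minimal change, fastest path to usable**. Every goal can be reached with the following three operations:
1. **Cherry-pick**: move the designated directories over from `matrix` without rewriting them
2. **Carve-out**: remove the code paths in cypher-executor that are coupled to InkStone
3. **Supplement**: fill in the fields missing from contract.yaml and add the CLI
No new abstraction layers are introduced and existing component logic is left unchanged; only the minimum changes needed to make the open-source release usable are made.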
---
## Architecture
### Target Repo Structure
```
arcrun/ (new standalone open-source repo)
├── README.md
├── CONTRIBUTING.md
├── cypher-executor/
│   ├── src/
│   │   ├── index.ts
│   │   ├── types.ts
│   │   ├── graph-executor.ts
│   │   ├── lib/
│   │   │   ├── component-loader.ts       ← changed: reads only from WASM_BUCKET; KBDB/REGISTRY logic removed
│   │   │   ├── component-dispatcher.ts
│   │   │   ├── wasm-executor.ts
│   │   │   ├── wasi-shim.ts
│   │   │   └── constants.ts              ← changed: MINI_ME / KBDB special-component hardcoding removed
│   │   └── actions/
│   │       ├── triplet-parser.ts
│   │       ├── graph-builder.ts
│   │       ├── execution-evaluator.ts
│   │       ├── execution-logger.ts
│   │       ├── webhook-handlers.ts
│   │       ├── webhook-graph-resolver.ts ← changed: credential injection logic added
│   │       └── (autoPublishMissing.ts removed)
│   └── wrangler.toml                     ← changed: 9 InkStone bindings removed, CREDENTIALS_KV added
├── credentials/                          ← moved as-is, no changes needed
│   └── src/...
├── builtins/                             ← moved as-is, no changes needed
│   └── src/...
├── registry/
│   └── components/                       ← contract.yaml supplemented after the move
│       ├── gmail/
│       ├── google_sheets/
│       ├── telegram/
│       ├── line_notify/
│       └── ... (the remaining 17 components)
└── cli/                                  ← new: arcrun CLI
    ├── package.json
    ├── tsconfig.json
    └── src/
        ├── index.ts
        ├── commands/
        │   ├── init.ts
        │   ├── creds.ts
        │   ├── push.ts
        │   ├── run.ts
        │   ├── validate.ts
        │   ├── parts.ts
        │   ├── list.ts
        │   └── logs.ts
        └── lib/
            ├── config.ts       # reads and writes ~/.arcrun/config.yaml
            ├── cf-api.ts       # Cloudflare KV / R2 HTTP API wrapper
            └── yaml-parser.ts  # workflow.yaml parsing and triplet conversion
```
---
## Component Loader Rework (Key Change)
### Current State (matrix version)
```typescript
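// component-loader.ts currently resolves components in four priority tiers:
// 1. Hardcoded special components → MINI_ME / KBDB Service Binding
// 2. Built-in component Map → local pure transforms
// 3. New format: look up the KBDB record that has component_type → WASM or Service Binding
// 4. Legacy fallback: look up the KBDB record without component_type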
```
### Open-Source Version (arcrun)
```typescript
// component-loader.ts simplified to three tiers:
// 1. Built-in component Map → local pure transforms (passthrough / counter, etc.; kept)
// 2. Direct read from the WASM_BUCKET R2 bucket → <component-name>.wasm
// 3. Not found → return a structured error
async function loadComponent(componentId: string, env: Env) {
  // Tier 1: built-in components (no R2 needed)
  if (BUILTIN_COMPONENTS.has(componentId)) {
    return BUILTIN_COMPONENTS.get(componentId)
  }
  // Tier 2: read from the WASM_BUCKET R2 bucket
  const wasmKey = `${componentId}/${componentId}.wasm`
  const wasmObj = await env.WASM_BUCKET.get(wasmKey)
  if (wasmObj) {
    return { type: 'wasm', buffer: await wasmObj.arrayBuffer() }
  }
  // Tier 3: not found
  throw new Error(`Component not found: ${componentId}. Make sure ${wasmKey} exists in WASM_BUCKET.`)
}
```
Removed: the hardcoded paths for the `MINI_ME` and `KBDB` special components (`comp_claude_chat`, `comp_kbdb_search`, `comp_kbdb_history`).
---
## Credential Injection Flow Design
### Execution Sequence
```
acr run newsletter_subscribe
cypher-executor POST /webhook/:id
webhook-graph-resolver reads WEBHOOKS KV → workflow definition
graph-executor executes node send_thanks
  Before execution: credential-injector (new)
    Look up the component for send_thanks: canonical_id = "gmail"
    Read registry/components/gmail/component.contract.yaml
    Find credentials_required: [{key: "gmail_token", inject_as: "access_token"}]
    GET CREDENTIALS_KV["gmail_token"] → AES-GCM decrypt
    input.access_token = decryptedToken
  wasm-executor runs gmail.wasm (stdin = the full input, including access_token)
Return the result
```
### credential-injector Implementation Location
It lives in `cypher-executor/src/actions/credential-injector.ts` (new) and is called by `graph-executor.ts` before each node executes.
```typescript
async function injectCredentials(
  componentId: string,
  input: Record<string, unknown>,
  env: Env
): Promise<Record<string, unknown>> {
  const contract = await loadContract(componentId) // read from WASM_BUCKET or locally
  if (!contract.credentials_required) return input
  const enriched = { ...input }
  for (const cred of contract.credentials_required) {
    const record = await env.CREDENTIALS_KV.get(cred.key)
    if (!record) {
      throw new Error(
        `Missing credential: ${cred.key}\nFix: add ${cred.key} to credentials.yaml, then run acr creds push`
      )
    }
    const { encrypted, iv } = JSON.parse(record)
    enriched[cred.inject_as] = await decrypt(encrypted, iv, env.ENCRYPTION_KEY)
  }
  return enriched
}
```
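`injectCredentials` relies on a `decrypt` helper that this document does not show. A minimal sketch using the Web Crypto API might look like the following. It assumes, without confirmation from the source, that `ENCRYPTION_KEY` is a base64-encoded raw AES key and that `encrypted` and `iv` are base64 strings (as the USER_KV key schema suggests); `fromBase64` is a hypothetical helper.

```typescript
// Hypothetical helper: decode a base64 string into bytes.
function fromBase64(b64: string) {
  return Uint8Array.from(atob(b64), (c) => c.charCodeAt(0));
}

// Sketch of AES-GCM decryption with Web Crypto.
// Assumption: encryptionKey is a base64-encoded raw AES key (16 or 32 bytes).
async function decrypt(encrypted: string, iv: string, encryptionKey: string): Promise<string> {
  const key = await crypto.subtle.importKey(
    "raw",
    fromBase64(encryptionKey),
    { name: "AES-GCM" },
    false, // the imported key does not need to be extractable
    ["decrypt"]
  );
  const plaintext = await crypto.subtle.decrypt(
    { name: "AES-GCM", iv: fromBase64(iv) },
    key,
    fromBase64(encrypted)
  );
  return new TextDecoder().decode(plaintext);
}
```

AES-GCM is authenticated, so `crypto.subtle.decrypt` rejects if the ciphertext, IV, or key does not match; the caller can surface that as a credential error rather than passing garbage to the component.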
---
## workflow.yaml Parsing Design
### CLI push Flow
```
acr push newsletter_subscribe.yaml
yaml-parser.ts reads workflow.yaml
Parse flow[] triplets → triplets: [{subject, relation, object}]
Validate relation words (reject PIPE)
POST cypher-executor /cypher/search → ExecutionGraph (nodes + edges)
Merge config: + ExecutionGraph → WorkflowDefinition
PUT WEBHOOKS KV[workflow_name] = JSON.stringify(WorkflowDefinition)
Print the webhook URL
```
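The parsing and validation steps above can be sketched as follows, assuming each `flow` entry is a string of the form `subject >> relation >> object`. The function names and the `ALLOWED_RELATIONS` set are hypothetical; only the two relation words that appear in this document's workflow example are listed, and the real `yaml-parser.ts` may accept more.

```typescript
interface Triplet {
  subject: string;
  relation: string;
  object: string;
}

// Assumption: the accepted relation words are the ones used in the workflow.yaml example.
const ALLOWED_RELATIONS = new Set(["完成後", "失敗時"]);

// Parse one flow entry into a triplet, rejecting malformed lines and the PIPE relation.
function parseTriplet(line: string): Triplet {
  const parts = line.split(">>").map((p) => p.trim());
  if (parts.length !== 3 || parts.some((p) => p.length === 0)) {
    throw new Error(`Invalid flow entry (expected "subject >> relation >> object"): ${line}`);
  }
  const [subject, relation, object] = parts;
  if (relation === "PIPE") {
    throw new Error(`Relation PIPE is rejected: ${line}`);
  }
  if (!ALLOWED_RELATIONS.has(relation)) {
    throw new Error(`Unknown relation "${relation}" in: ${line}`);
  }
  return { subject, relation, object };
}

// Parse the whole flow[] array; the first invalid entry aborts the push.
function parseFlow(flow: string[]): Triplet[] {
  return flow.map(parseTriplet);
}
```

Failing fast on the first invalid entry keeps a partially valid workflow from being written to the WEBHOOKS KV.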
### workflow.yaml Format (confirmed version)
```yaml
name: newsletter_subscribe
description: Subscribe to the newsletter, send a thank-you email, and record the subscriber in GSheets
# Relation words are workflow keywords: 完成後 = on completion, 失敗時 = on failure
flow:
  - "input >> 完成後 >> send_thanks"
  - "input >> 完成後 >> save_to_sheet"
  - "send_thanks >> 完成後 >> output"
  - "send_thanks >> 失敗時 >> notify_error"
  - "save_to_sheet >> 完成後 >> output"
config:
  send_thanks:
    to: "{{input.email}}"
    subject: "Thanks for subscribing!"
    body: "Welcome aboard!"
    # access_token is injected automatically from gmail_token in credentials.yaml
  save_to_sheet:
    action: write
    spreadsheet_id: "{{creds.sheet_id}}"
    range: "Subscribers!A:B"
    values: [["{{input.email}}", "{{input.timestamp}}"]]
  notify_error:
    chat_id: "{{creds.telegram_chat_id}}"
    text: "Failed to send email: {{input.email}}"
```
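The `config` block uses `{{input.*}}` and `{{creds.*}}` placeholders, but this document does not specify how they are resolved. One possible approach, shown purely as an assumption, is a small substitution pass over each string value before the node runs. `renderTemplate` and the `Scopes` type are hypothetical names.

```typescript
// Hypothetical shape: one object per placeholder namespace, e.g. { input: {...}, creds: {...} }.
type Scopes = Record<string, Record<string, unknown>>;

// Replace every {{scope.key}} placeholder with its value.
// Assumption: an unresolved placeholder is an error rather than an empty string.
function renderTemplate(template: string, scopes: Scopes): string {
  return template.replace(
    /\{\{\s*(\w+)\.(\w+)\s*\}\}/g,
    (_match: string, scope: string, key: string) => {
      const value = scopes[scope]?.[key];
      if (value === undefined) {
        throw new Error(`Unresolved placeholder: ${scope}.${key}`);
      }
      return String(value);
    }
  );
}
```

Treating a missing value as an error mirrors how the credential injector reports a missing credential instead of silently continuing.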
---
## contract.yaml Supplementary Field Format
### credentials_required (gmail example)
```yaml
credentials_required:
  - key: gmail_token
    type: google_oauth
    description: "Google OAuth access token (gmail.send scope)"
    inject_as: access_token
```
### config_example (gmail example)
```yaml
config_example: |
  send_email:        # node name (customizable)
    to: ""           # recipient email (required)
    subject: ""      # subject (required)
    body: ""         # body (required)
    # access_token is injected automatically from gmail_token in credentials.yaml
```
### credentials_required Mapping by Component
| Component | key | type | inject_as |
|------|-----|------|-----------|
| gmail | gmail_token | google_oauth | access_token |
| google_sheets | google_oauth | google_oauth | access_token |
| telegram | telegram_bot_token | telegram_bot_token | bot_token |
| line_notify | line_token | line_token | token |
---
## CLI Technical Design
### Dependencies
```json
{
"dependencies": {
"commander": "^12.0.0",
"js-yaml": "^4.1.0",
"chalk": "^5.3.0",
"ora": "^8.0.1"
}
}
```
### config.yaml Format (~/.arcrun/config.yaml)
```yaml
cloudflare_account_id: abc123
webhooks_kv_id: xyz789
credentials_kv_id: abc456
wasm_bucket: arcrun-wasm
cypher_executor_url: https://cypher-executor.xxx.workers.dev
credentials_worker_url: https://arcrun-credentials.xxx.workers.dev
api_token: ***  # stored encrypted on the local machine
```
### Cloudflare API Operations
The CLI uses the Cloudflare REST API (it does not depend on the Wrangler CLI):
- KV write: `PUT /client/v4/accounts/{id}/storage/kv/namespaces/{ns_id}/values/{key}`
- KV read: `GET /client/v4/accounts/{id}/storage/kv/namespaces/{ns_id}/values/{key}`
- KV list keys: `GET /client/v4/accounts/{id}/storage/kv/namespaces/{ns_id}/keys`
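As an illustration of how `cf-api.ts` might wrap these endpoints, the sketch below builds the KV URLs and performs a write with a Bearer token. The names `KvTarget`, `kvValueUrl`, `kvKeysUrl`, and `kvPut` are hypothetical, and the real wrapper may be structured differently.

```typescript
const CF_API_BASE = "https://api.cloudflare.com";

interface KvTarget {
  accountId: string;   // cloudflare_account_id from config.yaml
  namespaceId: string; // e.g. webhooks_kv_id or credentials_kv_id
}

// URL for reading or writing a single key; the key is percent-encoded
// because keys such as "cred:gmail_token" contain reserved characters.
function kvValueUrl(t: KvTarget, key: string): string {
  return `${CF_API_BASE}/client/v4/accounts/${t.accountId}/storage/kv/namespaces/${t.namespaceId}/values/${encodeURIComponent(key)}`;
}

// URL for listing the keys of a namespace.
function kvKeysUrl(t: KvTarget): string {
  return `${CF_API_BASE}/client/v4/accounts/${t.accountId}/storage/kv/namespaces/${t.namespaceId}/keys`;
}

// Write one value; throws on any non-2xx response.
async function kvPut(t: KvTarget, apiToken: string, key: string, value: string): Promise<void> {
  const res = await fetch(kvValueUrl(t, key), {
    method: "PUT",
    headers: { Authorization: `Bearer ${apiToken}` },
    body: value,
  });
  if (!res.ok) {
    throw new Error(`KV write failed for ${key}: HTTP ${res.status}`);
  }
}
```

For example, `acr push` could call `kvPut(target, apiToken, workflowName, JSON.stringify(definition))` to store a workflow definition.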
---
## wrangler.toml Change Summary
### Removed (InkStone-specific)
```toml
# Remove all of the following:
[[services]]
binding = "KBDB"
service = "inkstone-kbdb-api"
[[services]]
binding = "REGISTRY"
service = "inkstone-component-registry"
[[services]]
binding = "CLINIC_GDRIVE"
service = "clinic-gdrive"
# ... CLINIC_EXCEL, CLINIC_ANALYSIS, CLINIC_RENDER, CLINIC_GSHEETS
[[services]]
binding = "AICEO"
service = "inkstone-aiceo-bot"
[[services]]
binding = "MINI_ME"
service = "inkstone-mini-me"
```
### Kept
```toml
[[kv_namespaces]]
binding = "EXEC_CONTEXT"
[[kv_namespaces]]
binding = "WEBHOOKS"
[[r2_buckets]]
binding = "WASM_BUCKET"
[ai]
binding = "AI"
```
### Added
```toml
[[kv_namespaces]]
binding = "CREDENTIALS_KV"
id = "" # filled in by the user
```
---
## Standard Mode Architecture (the user's own KV, arcrun.dev's engine)
### Storage Responsibility Boundary
```
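arcrun.dev is responsible for:
  WASM_BUCKET      public component library (.wasm binaries)
  ANALYTICS_KV     component execution statistics
  ACCOUNTS_KV      API Key → tenant_id + CF API Token mapping
  SUBMISSIONS_KV   component submission review status
The user is responsible for (one CF KV; arcrun.dev does not access plaintext):
  USER_KV
    workflow:{name} → workflow execution graph (JSON)
    cred:{key}      → AES-GCM encrypted credential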
```
### Full System Diagram
```
┌──────────────────────────────────────────────────────────────┐
│  arcrun.dev (the operator's Cloudflare account)              │
│                                                              │
│  auth-worker (api.arcrun.dev)                                │
│    POST /register → { api_key, tenant_id }                   │
│    ACCOUNTS_KV: { tenant_id, cf_api_token, api_key_hash }    │
│    Note: does not store user credentials or workflow content │
│                                                              │
│  cypher-executor (cypher.arcrun.dev, shared)                 │
│    X-Arcrun-API-Key → tenant_id → cf_api_token               │
│    cf_api_token → CF KV API → reads the user's own USER_KV   │
│    WASM_BUCKET: gmail.wasm / telegram.wasm / ... (shared)    │
│                                                              │
│  public registry (registry.arcrun.dev)                       │
│    GET /components → list + stats + author + visibility      │
│    POST /submit → sandbox-verify component, set author_only  │
│    POST /analytics/record → execution stats (async)          │
└──────────────────────────────────────────────────────────────┘
         ↕ CF KV API (the user's cf_api_token, KV Edit permission)
┌──────────────────────────────────────────────────────────────┐
│  The user's own CF account                                   │
│    USER_KV                                                   │
│      workflow:newsletter → { triplets, config }              │
│      cred:gmail_token    → { encrypted, iv }                 │
│      cred:telegram_bot   → { encrypted, iv }                 │
└──────────────────────────────────────────────────────────────┘
```
### acr init Interactive Flow
```
$ acr init
? Your Cloudflare Account ID: abc123
? USER_KV Namespace ID (create a KV in the CF Dashboard, then paste its ID): kv_xyz
? CF API Token (only needs KV Edit permission; arcrun uses it to access your KV): ***
? Email (to obtain an arcrun.dev API Key): you@example.com
→ Calls POST https://api.arcrun.dev/register { email, cf_api_token_hash }
→ Receives api_key: ak_xxxxxxxx
✓ Setup complete → ~/.arcrun/config.yaml
✓ Created credentials.yaml (added to .gitignore)
Your credentials and workflows live in your own CF KV; arcrun does not store them.
```
### API Key Verification and KV Access Middleware
```typescript
// cypher-executor/src/lib/tenant.ts (new)
// sha256 and CfKvClient are helpers assumed to be defined elsewhere in the project.
export async function resolveUserKv(request: Request, env: Env) {
  if (env.MULTI_TENANT === 'false') {
    // Self-hosted: use the local KV binding directly
    return { kv: env.LOCAL_KV, prefix: '' }
  }
  const apiKey = request.headers.get('X-Arcrun-API-Key')
  if (!apiKey) throw new Response('Missing API Key', { status: 401 })
  const hash = await sha256(apiKey)
  const account = await env.ACCOUNTS_KV.get(`hash:${hash}`)
  if (!account) throw new Response('Invalid API Key', { status: 401 })
  const { cf_api_token, account_id, kv_namespace_id } = JSON.parse(account)
  // Return a CF KV API wrapper that accesses the KV with the user's own token
  return {
    kv: new CfKvClient({ cf_api_token, account_id, kv_namespace_id }),
    prefix: ''
  }
}
```
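The middleware calls a `sha256` helper that is not defined in this document. A minimal sketch with the Web Crypto API is shown below; the lowercase hex encoding of the digest is an assumption, and whatever encoding is chosen must match the one used when `hash:{...}` keys are written to `ACCOUNTS_KV`.

```typescript
// Sketch: SHA-256 of a UTF-8 string, returned as lowercase hex (assumed encoding).
async function sha256(input: string): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(input));
  return Array.from(new Uint8Array(digest), (b) => b.toString(16).padStart(2, "0")).join("");
}
```

Storing only the hash of the API key means a leaked `ACCOUNTS_KV` dump does not directly expose usable keys.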
### USER_KV Key Schema
```
Standard mode (the user's own KV):
  workflow:{name} → WorkflowDefinition JSON
  cred:{key}      → { encrypted (base64), iv (base64) }
Self-hosted (local KV binding):
  Keeps the existing key format, no prefix
```
---
## Execution Statistics Design
### Analytics Record (asynchronous, does not block execution)
```typescript
// cypher-executor/src/actions/analytics.ts (new)
export function recordExecution(
  componentId: string,
  version: string,
  success: boolean,
  durationMs: number
): void {
  // fire-and-forget: intentionally not awaited
  fetch('https://registry.arcrun.dev/analytics/record', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ canonical_id: componentId, version, success, duration_ms: durationMs })
  }).catch(() => {}) // a statistics failure must not affect execution
}
```
### Statistics Aggregation (registry Worker)
```
ANALYTICS_KV structure:
  "stats:gmail:v1" → { total_runs: 140382, success_runs: 139444, total_ms: 16845840 }
On every POST /analytics/record:
  Atomic update (KV optimistic locking) → total_runs++, success_runs += success, total_ms += duration_ms
GET /components returns:
  success_rate = success_runs / total_runs * 100
  avg_duration_ms = total_ms / total_runs
  Sort order: total_runs × success_rate (DESC)
```
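The aggregation formulas above can be expressed as a few pure functions. This is a sketch only: the type and function names are hypothetical, the zero-run guard is an added assumption, and the KV read/write around these functions is left out.

```typescript
interface ComponentStats {
  total_runs: number;
  success_runs: number;
  total_ms: number;
}

interface ComponentSummary {
  canonical_id: string;
  success_rate: number;    // percent, 0..100
  avg_duration_ms: number;
  score: number;           // total_runs × success_rate, used for sorting
}

// Fold one execution record into the running totals (returns a new object).
function applyRecord(stats: ComponentStats, success: boolean, durationMs: number): ComponentStats {
  return {
    total_runs: stats.total_runs + 1,
    success_runs: stats.success_runs + (success ? 1 : 0),
    total_ms: stats.total_ms + durationMs,
  };
}

// Derive the values returned by GET /components; components with no runs score 0.
function summarize(canonicalId: string, s: ComponentStats): ComponentSummary {
  const hasRuns = s.total_runs > 0;
  const successRate = hasRuns ? (s.success_runs / s.total_runs) * 100 : 0;
  const avgDurationMs = hasRuns ? s.total_ms / s.total_runs : 0;
  return {
    canonical_id: canonicalId,
    success_rate: successRate,
    avg_duration_ms: avgDurationMs,
    score: s.total_runs * successRate,
  };
}

// Sort by total_runs × success_rate, highest first.
function rankComponents(all: Record<string, ComponentStats>): ComponentSummary[] {
  return Object.keys(all)
    .map((id) => summarize(id, all[id]))
    .sort((a, b) => b.score - a.score);
}
```

With the sample record shown above, `summarize` yields an average duration of 120 ms and a success rate of roughly 99.33%.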
---
## Component Contribution Flow Design
### `acr parts publish` Flow
```
$ acr parts publish gmail-v2
1. The CLI reads the registry/components/gmail-v2/ directory
   - component.contract.yaml (must include the author field)
   - main.go
   - gmail-v2.wasm
2. POST https://registry.arcrun.dev/submit
   multipart form:
     contract: <yaml content>
     source: <main.go content>
     wasm: <binary>
   Header: X-Arcrun-API-Key: ak_xxxxx
3. The registry responds:
   { submission_id: "sub_abc123", status: "pending_review" }
4. CLI output:
   ✓ Submitted successfully (submission_id: sub_abc123)
   Check progress: acr parts publish --status sub_abc123
```
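Step 2 of this flow could be assembled with the standard `FormData` and `Blob` APIs, as in the sketch below. `PublishInput` and `buildSubmitRequest` are hypothetical names, and the `component.wasm` filename is a placeholder; only the URL, the three form fields, and the header come from the flow above.

```typescript
interface PublishInput {
  contractYaml: string; // contents of component.contract.yaml
  sourceGo: string;     // contents of main.go
  wasm: ArrayBuffer;    // compiled component binary
  apiKey: string;       // value for the X-Arcrun-API-Key header
}

// Build the multipart request for POST /submit without sending it.
function buildSubmitRequest(input: PublishInput) {
  const form = new FormData();
  form.append("contract", input.contractYaml);
  form.append("source", input.sourceGo);
  form.append("wasm", new Blob([input.wasm], { type: "application/wasm" }), "component.wasm");
  return {
    url: "https://registry.arcrun.dev/submit",
    init: {
      method: "POST",
      // Content-Type is left unset so that fetch adds the multipart boundary itself.
      headers: { "X-Arcrun-API-Key": input.apiKey },
      body: form,
    },
  };
}
```

The CLI would then call `fetch(req.url, req.init)` and read `submission_id` and `status` from the JSON response.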
### Registry Sandbox Verification and visibility State Machine
```
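Triggered by POST /submit (runs asynchronously):
[Integration components: gmail, telegram, google_sheets, line_notify, http_request]
  Step 1: Size check (< 2048KB)
  Step 2: syscall scan (no filesystem syscalls; network syscalls are allowed because external APIs must be called)
  Pass → visibility: author_only (usable by the author immediately, pending manual review)
[Functional components: all other components]
  Step 1: Size check (< 2048KB)
  Step 2: Cold-start time (< 50ms)
  Step 3: syscall scan (no network / no filesystem)
  Step 4: Gherkin tests (100% of the scenarios in the contract pass)
  Pass → visibility: author_only (usable by the author immediately, pending manual review)
Any step fails → status: rejected (returns failed_step + reason)
Manual review approved → visibility: public
  - The component appears in everyone's GET /components
  - Public execution statistics start accumulating
Manual review rejected → visibility stays author_only
  - The author can still use it, but receives the rejection reason
  - The author can resubmit after making changes
acr parts display rules:
  visibility: author_only → [Pending review] usable only by you (no statistics shown)
  visibility: public → ★ success rate | N runs | by @author
Any failure → status: rejected
  - Returns { failed_step, reason }, in the same format as Requirement 2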
```
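The verification rules above amount to a small decision function. The sketch below encodes them under stated assumptions: the `CheckResults` fields, the `failed_step` identifiers, and the `"passed"` status label are hypothetical, while the thresholds and the resulting `visibility` values come from the rules above.

```typescript
type ComponentKind = "integration" | "functional";

interface CheckResults {
  sizeKb: number;
  coldStartMs: number;
  usesFilesystem: boolean;
  usesNetwork: boolean;
  gherkinPassed: number;
  gherkinTotal: number;
}

type Outcome =
  | { status: "passed"; visibility: "author_only" }
  | { status: "rejected"; failed_step: string; reason: string };

function reject(failed_step: string, reason: string): Outcome {
  return { status: "rejected", failed_step, reason };
}

// Run the checks in the documented order and stop at the first failure.
function evaluateSubmission(kind: ComponentKind, r: CheckResults): Outcome {
  if (r.sizeKb >= 2048) return reject("size", "binary must be smaller than 2048KB");
  if (kind === "functional" && r.coldStartMs >= 50) {
    return reject("cold_start", "cold start must be under 50ms");
  }
  if (r.usesFilesystem) return reject("syscall_scan", "filesystem syscalls are not allowed");
  if (kind === "functional" && r.usesNetwork) {
    return reject("syscall_scan", "network syscalls are not allowed for functional components");
  }
  if (kind === "functional" && r.gherkinPassed !== r.gherkinTotal) {
    return reject("gherkin", `${r.gherkinPassed}/${r.gherkinTotal} scenarios passed`);
  }
  return { status: "passed", visibility: "author_only" };
}

// Manual review: approval publishes the component, rejection leaves it author_only.
function applyReview(approved: boolean): "public" | "author_only" {
  return approved ? "public" : "author_only";
}
```

Returning the first failing step matches the `{ failed_step, reason }` response shape, so the author sees exactly one actionable error per submission.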
---
## Development Order (Phases aligned with requirements)
```
Phase 1: Move and clean up (Requirement 1)
  1.1 Create the arcrun repo and move the four directories
  1.2 Clean up cypher-executor/wrangler.toml
  1.3 Rewrite component-loader (remove the KBDB/REGISTRY/MINI_ME paths)
  1.4 Remove autoPublishMissing.ts (depends on the REGISTRY binding)
  1.5 Test /health locally with wrangler dev
Phase 2: Component completeness (Requirement 2)
  2.1 Review the contract.yaml of all 21 components (report as a table)
  2.2 Add credentials_required (4 components)
  2.3 Add config_example (all 21)
  2.4 Verify that the required fields in main.go match the contract
Phase 3: Credential injection (Requirement 3)
  3.1 Add credential-injector.ts
  3.2 Integrate it into graph-executor before node execution
  3.3 Test the gmail component end to end (credentials.yaml → push → run)
Phase 4: CLI (Requirement 4)
  4.1 acr init (--hosted / --self-hosted branches)
  4.2 acr creds push (Hosted goes through the API, Self-hosted writes to KV)
  4.3 acr push
  4.4 acr run
  4.5 acr validate
  4.6 acr parts / acr parts scaffold / acr parts publish
  4.7 acr list / acr logs
Phase 5: Open-source release (Requirement 5)
  5.1 Write README.md (including a --hosted quick start)
  5.2 Write CONTRIBUTING.md
  5.3 Confirm that no InkStone-internal information remains
  5.4 GitHub release + npm publish
Phase 6: Hosted SaaS (Requirement 6)
  6.1 Build auth-worker (api.arcrun.dev)
  6.2 Add the tenant middleware to cypher-executor
  6.3 Add a tenant prefix to the CREDENTIALS_KV key schema
  6.4 Deploy to arcrun.dev
Phase 7: Statistics and contribution (Requirements 7 + 8)
  7.1 analytics.ts (fire-and-forget after execution)
  7.2 registry /analytics/record endpoint
  7.3 ANALYTICS_KV aggregation logic
  7.4 Add statistics-based sorting to GET /components
  7.5 POST /submit sandbox verification + author recording
  7.6 acr parts publish command
```