docs(registry): seed 10 examples + 5 skills (LI SDD M3.1 + M3.3)

Corresponds to .agents/specs/llm-interface/ Milestones 3.1 and 3.3.

registry/examples/: 10 workflow templates that can be pushed as-is:
  starter:  webhook-to-http
  common:   cron-watcher, llm-classify, rag-search-answer, daily-digest
  external: email-summary (gmail+claude+telegram), pdf-to-blocks, github-issue-bot
  advanced: parallel-fanout (trigger_workflow fan-out), error-retry (try_catch+wait pattern)
Each contains workflow.yaml (pushable as-is) + description.md (what problem it solves / how to make it your own / what you learn) + tags.json (for search).

registry/skills/: 5 AI playbooks (markdown):
  build_watcher_workflow: cron + filter + trigger pattern
  debug_paused_workflow: how to trace a paused claude_api callback
  migrate_http_to_trigger_workflow: switch from self-fetch to trigger_workflow
  rag_with_arcrun: assemble RAG from KBDB + claude_api
  add_new_wasm_component: writing in TinyGo + the full deployment flow

How the two differ:
  examples = YAML you can take and modify directly
  skills = how to think about problem X + which example to use

Next for both: CI syncs them into KBDB automatically (type=workflow-example / type=agent-skill), and MCP arcrun_search_examples / arcrun_list_skills go through KBDB semantic search. (CI sync is M3.4 work.)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 16:33:54 +08:00
parent 989fbeb9ac
commit 388c193ae7
37 changed files with 1324 additions and 0 deletions
@@ -0,0 +1,42 @@
 # Arcrun Examples Library
 > A library of workflow templates for quick reference by AI operators. Every example is YAML that can be deployed directly with `acr push`.
 >
 > Corresponding SDD: `matrix/arcrun/.agents/specs/llm-interface/` Milestone 3.3
 ## Structure
 One folder per example:
 ```
 {slug}/
 ├── workflow.yaml         deployable as-is with push
 ├── description.md        what problem it solves / how to trigger / expected result
 └── tags.json             ["webhook", "llm", "cron", ...] used for search
 ```
 ## Example list
 | Slug | Scenario |
 |---|---|
 | `webhook-to-http` | Simple forwarding: webhook → call another API |
 | `cron-watcher` | Scan the database every 5 minutes → trigger a child workflow (mira pattern) |
 | `llm-classify` | claude_api classifies text → write to KBDB |
 | `rag-search-answer` | Find context in KBDB → claude answers |
 | `email-summary` | Fetch mail from gmail → claude summarizes → push via telegram |
 | `pdf-to-blocks` | Upload a PDF → convert to text → split into chunks → store in KBDB |
 | `github-issue-bot` | Receive a GH webhook → claude analyzes → leave a comment |
 | `daily-digest` | cron → aggregate multiple sources (KBDB / RSS / etc.) → push |
 | `parallel-fanout` | Distribute one input to multiple workflows for parallel processing |
 | `error-retry` | try_catch + wait + retry for retrying an external API |
 ## How to use (from the AI's perspective)
 1. `arcrun_search_examples('rag context answer')` → hits `rag-search-answer`
 2. Take the YAML and adjust the prompt / data source to your needs
 3. `arcrun_validate_yaml` → `arcrun_push_workflow` → done
 ## CI auto-sync
 GH Actions watches this directory for changes → PATCHes each example into KBDB as type=workflow-example
 (including YAML + description + tags). MCP `arcrun_search_examples` goes through KBDB semantic search.
@@ -0,0 +1,31 @@
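 # cron-watcher
 ## What problem it solves
 Periodically scan KBDB (or any data source), find records that are "unprocessed", and trigger one processing workflow per record.
 **The most common pattern**: this is how mira automatically runs wiki_synthesis on feed posts.
 ## How to trigger
 No manual trigger needed: after deployment it runs automatically every 5 minutes.
 At `acr push` time, cron parsing detects that the first node is a `cron` component and stores it in the `WEBHOOKS:cron-idx:` index;
 the `scheduled()` handler ticks every minute and matches against it.
 ## Make it your own
 - `watch_cron.cron_expr`: change the frequency (standard 5-field cron syntax)
 - `list_unprocessed`: change to your KBDB query (type / source / tag, etc.)
 - `filter_new.condition`: change to your definition of "unprocessed"
 - `trigger_processor.workflow_name`: change to your processing workflow
 ## Why trigger_workflow instead of http_request
 CF Workers has self-fetch protection: when cypher-executor calls `cypher.arcrun.dev/*` or its own
 `arcrun-cypher-executor.*.workers.dev`, the request is blocked (CF 1042).
 `trigger_workflow` is an orchestration component built into cypher-executor that makes a direct in-process
 call to `executeWebhookGraph`. It **does not go through external HTTP**, so it sidesteps self-fetch entirely.
 ## What you learn
 - The cron + FOREACH + trigger_workflow trio
 - `{{api_key}}` is carried over automatically from the trigger context (cypher-executor injects it on cron triggers)
 - The Chinese relation keyword `對每個 X >> Y` (`FOREACH X` is also accepted)
 - The filter component uses `condition.op: eq` with `tags_json: "[]"` to detect "no tags"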
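The "no tags" check in the last point can be pictured with a small sketch. It assumes the `filter` component compares the raw `tags_json` string with `value` by plain string equality; that behavior, the `Block` and `Condition` types, and `applyFilter` are illustrative assumptions rather than arcrun's actual implementation.

```typescript
// Hypothetical types: a KBDB block reduced to the fields used here.
type Block = { id: string; tags_json: string };
type Condition = { key: "tags_json"; op: "eq"; value: string };

// Assumption: "eq" is plain string equality on the raw field value.
function applyFilter(items: Block[], cond: Condition): Block[] {
  return items.filter((item) => item[cond.key] === cond.value);
}

const blocks: Block[] = [
  { id: "b1", tags_json: "[]" },
  { id: "b2", tags_json: '["wiki-processed"]' },
];
const untagged = applyFilter(blocks, { key: "tags_json", op: "eq", value: "[]" });
console.log(untagged.map((b) => b.id)); // only b1 carries no tags
```

Under that assumption a block stored with `tags_json` as `"[ ]"` would not match, so it is worth checking how your records are actually serialized before relying on this condition.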
@@ -0,0 +1 @@
 ["cron", "watcher", "kbdb", "foreach", "trigger_workflow", "common-pattern"]
@@ -0,0 +1,38 @@
 name: cron_watcher_example
 description: Scan KBDB every 5 minutes for unprocessed notes → trigger a child workflow for each one
 flow:
  - "watch_cron >> ON_SUCCESS >> list_unprocessed"
  - "list_unprocessed >> ON_SUCCESS >> filter_new"
  - "filter_new >> 對每個 item >> trigger_processor"
 config:
  watch_cron:
    component: cron
    cron_expr: "*/5 * * * *"
    description: "Scan every 5 minutes"
  list_unprocessed:
    component: kbdb_get
    api_key: "{{api_key}}"
    type: "note"
    source: "user-input"
    limit: 20
  filter_new:
    component: filter
    items: "{{list_unprocessed.blocks}}"
    condition:
      key: "tags_json"
      op: "eq"
      value: "[]"
  # trigger_workflow is a built-in orchestration component that calls another workflow in-process
  # **Never use http_request to call cypher-executor's own webhook**: it runs into the CF self-fetch deadlock
  trigger_processor:
    component: trigger_workflow
    workflow_name: "your_processor_workflow"   # ← change to your processing workflow's name
    api_key: "{{api_key}}"
    input:
      api_key: "{{api_key}}"
      block_id: "{{item.id}}"
@@ -0,0 +1,28 @@
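 # daily-digest
 ## What problem it solves
 Information anxiety: there is never enough time to read HN / GitHub trending / your own notes every day.
 Each morning you get one LLM-curated digest and can catch up on today's signal in 3 minutes.
 ## How to trigger
 No trigger needed: the cron schedule runs automatically every day at 00:00 UTC (08:00 in Taiwan).
 ## Make it your own
 - Add / remove sources: dev.to RSS, a Twitter list, Slack, tags in your own KBDB, etc.
 - Adjust the summary prompt to your taste (serious / humorous / brief)
 - Swap the push target: email, a new Notion page, or KBDB to keep history
 ## Why this pattern matters
 **Fan-in** is a signature arcrun feature: the 3 sources are fetched in parallel, and cypher-executor automatically waits for all of them to finish before running compose.
 No need to write promise.all or worry about races; declaring "compose depends on these 3" is enough.
 ## Variants
 - Add priority: if KBDB has notes with `tag:urgent`, pin them to the top
 - Hook up a calendar: include today's meetings in the digest
 - Hook up weather + commute traffic (multi-source API calls)
 ## What you learn
 - **Fan-in / fan-out**: in cypher binding YAML, multiple edges pointing to the same node form a fan-in
 - System variables: `{{_today}}` / `{{_yesterday}}` / `{{_now}}` are built in
 - Multi-step cron scheduling: one cron triggers 3 parallel fetch chains
 - `kbdb_get` filters a specific origin with `source` (here only notes written directly by leo, not AI-generated ones)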
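The fan-in rule described above (a node runs only after every edge pointing at it has completed) can be sketched roughly as follows. `parseEdge` and `isReady` are hypothetical helpers written for this illustration; the real cypher-executor scheduler is not shown in this commit and may work differently.

```typescript
// Hypothetical helpers for illustration; not cypher-executor's real API.
type Edge = { from: string; to: string };

// Parse an edge written as "a >> ON_SUCCESS >> b".
function parseEdge(line: string): Edge {
  const parts = line.split(">>").map((p) => p.trim());
  const from = parts[0];
  const to = parts[2];
  if (!from || !to) {
    throw new Error("malformed edge: " + line);
  }
  return { from: from, to: to };
}

// A node is ready once every upstream node pointing at it has completed.
function isReady(node: string, edges: Edge[], done: string[]): boolean {
  const upstream = edges.filter((e) => e.to === node).map((e) => e.from);
  return upstream.length > 0 && upstream.every((u) => done.indexOf(u) !== -1);
}

const digestEdges: Edge[] = [
  "fetch_kbdb_yesterday >> ON_SUCCESS >> compose_digest",
  "fetch_rss >> ON_SUCCESS >> compose_digest",
  "fetch_github_trending >> ON_SUCCESS >> compose_digest",
].map(parseEdge);

// Only one of three sources done: compose_digest must keep waiting.
console.log(isReady("compose_digest", digestEdges, ["fetch_rss"])); // false
// All three sources done: compose_digest may run.
console.log(
  isReady("compose_digest", digestEdges, [
    "fetch_kbdb_yesterday",
    "fetch_rss",
    "fetch_github_trending",
  ])
); // true
```

The point of the sketch is that the dependency is derived from the edge list alone, which is why the YAML needs no explicit join node.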
@@ -0,0 +1 @@
 ["cron", "digest", "fan-in", "multi-source", "llm", "telegram", "common-pattern"]
@@ -0,0 +1,62 @@
 name: daily_digest
 description: Each morning, aggregate multiple sources (KBDB / RSS / GitHub trending) → claude summary → telegram
 flow:
  - "morning_cron >> ON_SUCCESS >> fetch_kbdb_yesterday"
  - "morning_cron >> ON_SUCCESS >> fetch_rss"
  - "morning_cron >> ON_SUCCESS >> fetch_github_trending"
  - "fetch_kbdb_yesterday >> ON_SUCCESS >> compose_digest"
  - "fetch_rss >> ON_SUCCESS >> compose_digest"
  - "fetch_github_trending >> ON_SUCCESS >> compose_digest"
  - "compose_digest >> ON_SUCCESS >> push_digest"
 config:
  morning_cron:
    component: cron
    cron_expr: "0 0 * * *"   # 00:00 UTC = 08:00 in Taiwan
  fetch_kbdb_yesterday:
    component: kbdb_get
    api_key: "{{api_key}}"
    type: "note"
    source: "km-writer-direct"
    limit: 50
  fetch_rss:
    component: http_request
    url: "https://hnrss.org/frontpage?count=10"
    method: GET
  fetch_github_trending:
    component: http_request
    url: "https://api.github.com/search/repositories?q=created:>{{_yesterday}}+stars:>500&sort=stars&order=desc&per_page=5"
    method: GET
    headers:
      Accept: "application/vnd.github+json"
  # compose receives the three fan-in edges (cypher-executor automatically waits for all three sources to finish before running it)
  compose_digest:
    component: claude_api
    timeout_ms: 60000
    _recipe_output_format: text
    prompt: |
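      Compile leo's "morning briefing" for today. Pick the 5-8 key points from each of the three parts:
      ## What I wrote yesterday (KBDB notes)
      {{fetch_kbdb_yesterday.blocks}}
      ## Hacker News
      {{fetch_rss.data}}
      ## Popular new GitHub repos
      {{fetch_github_trending.data}}
      Format: markdown bullets, each under 30 words, with the source indicated.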
  push_digest:
    component: telegram
    chat_id: "{{secret.LEO_TELEGRAM_CHAT_ID}}"
    text: |
      ☀️ Good morning {{_today}}
      {{compose_digest.data.text}}
@@ -0,0 +1,31 @@
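 # email-summary
 ## What problem it solves
 Inbox overflowing and no desire to read every message one by one? Every morning at 8 you receive an LLM-curated summary of "things to pay attention to today".
 ## Prerequisites
 - Set up the gmail auth credential (`acr creds push gmail`, OAuth2 flow)
 - Set up a telegram bot + chat_id (the push destination)
 ## How to trigger
 No manual trigger needed: as shipped, the cron schedule runs automatically every day at 08:00 UTC (16:00 in Taiwan).
 ## Make it your own
 - `daily_cron.cron_expr`: adjust for your timezone (cypher-executor runs in UTC, so subtract 8 hours for Taiwan: 08:00 Taiwan time is `0 0 * * *`)
 - `fetch_unread.query`: change the gmail search criteria
 - Change the summary prompt to your own priority logic
 - The push target can be swapped for line_notify, slack, a write to KBDB, etc.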
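The timezone adjustment for `cron_expr` is simple modular arithmetic. The helper below is a sketch for working it out, not part of arcrun; `dailyCronUtc` is a hypothetical name, and it assumes a whole-hour UTC offset with no daylight saving time (true for Taiwan, UTC+8).

```typescript
// Illustrative helper (not an arcrun function): build a daily cron_expr in UTC
// from a local wall-clock hour and a whole-hour UTC offset.
function dailyCronUtc(localHour: number, utcOffsetHours: number, minute: number = 0): string {
  // ((x % 24) + 24) % 24 keeps the hour in 0..23 even when the subtraction goes negative.
  const utcHour = (((localHour - utcOffsetHours) % 24) + 24) % 24;
  return minute + " " + utcHour + " * * *";
}

console.log(dailyCronUtc(8, 8)); // 08:00 in Taiwan (UTC+8) -> "0 0 * * *"
console.log(dailyCronUtc(2, 8)); // 02:00 in Taiwan -> "0 18 * * *" (18:00 UTC on the previous day)
```

For a daily schedule the day shift in the second case does not matter; it would matter if the expression also constrained the day of week or day of month.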
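 ## Why this pattern matters
 The most typical "multi-service chain" case: data source (gmail) + LLM processing + notification.
 In arcrun the three pieces are independent and are wired together with cypher binding YAML.
 ## Advanced
 - Add an `if_control` node: if the summary has no new urgent items, skip telegram and do not disturb
 - Add KBDB storage for past summaries (type=daily-digest) so they are easy to look back on
 - Hook up an ai-meka workflow to schedule automatically (urgent item → calendar event)
 ## What you learn
 - The standard skeleton of `cron` scheduling + a multi-step chain
 - `{{secret.X}}` fetches sensitive values through the credential system (nothing hard-coded in YAML)
 - gmail / telegram are both built-in arcrun components (see list_components for the full list)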
@@ -0,0 +1 @@
 ["cron", "gmail", "llm", "telegram", "digest", "automation", "common-pattern"]
@@ -0,0 +1,43 @@
 name: email_summary
 description: Every day at 08:00 UTC, fetch recent unread gmail → claude summary → telegram push
 flow:
  - "daily_cron >> ON_SUCCESS >> fetch_unread"
  - "fetch_unread >> ON_SUCCESS >> summarize"
  - "summarize >> ON_SUCCESS >> push_to_telegram"
 config:
  daily_cron:
    component: cron
    cron_expr: "0 8 * * *"     # every day at 08:00 UTC (adjust for your timezone as needed)
  fetch_unread:
    component: gmail
    action: "list"
    query: "is:unread newer_than:1d"
    max_results: 20
  summarize:
    component: claude_api
    timeout_ms: 60000
    _recipe_output_format: text
    prompt: |
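      You are leo's email assistant. Condense the following {{fetch_unread.count}} emails into
      a summary of "things to pay attention to today":
      **Format**:
      - Urgent (reply needed within 24h): list
      - Bills / important notices: list
      - General information (can wait until the weekend): list
      - Ads / spam: ignore
      Emails: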
      {{fetch_unread.messages}}
  push_to_telegram:
    component: telegram
    chat_id: "{{secret.LEO_TELEGRAM_CHAT_ID}}"
    text: |
      📬 Today's email summary
      {{summarize.data.text}}
@@ -0,0 +1,38 @@
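 # error-retry
 ## What problem it solves
 Occasional 500s / timeouts from external APIs are normal. Hard-coding "call once, then give up" is too fragile.
 This pattern provides a standard retry chain: failure → wait 5 seconds → retry once → notify a human only if it still fails.
 ## How to trigger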
 ```bash
 curl -X POST https://cypher.arcrun.dev/webhooks/named/error_retry/trigger \
  -d '{
    "api_key":"ak_xxx",
    "target_url":"https://flaky-api.example.com/endpoint",
    "payload":{"x":1},
    "workflow_name":"my_workflow"
  }'
 ```
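 ## Make it your own
 - `wait_a_bit.seconds`: change the delay (exponential backoff: 5, 15, 45 seconds)
 - Chain more retry nodes (3-4 attempts is generally enough)
 - Swap `final_fail_notify` for email / pagerduty / slack, etc.
 - Add `if_control` to branch on the error type (do not retry 4xx, do retry 5xx)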
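The 5, 15, 45 second sequence mentioned above is exponential backoff with base 5 and factor 3. A small sketch for computing the `seconds` value of each `wait` node when you chain more retries; `backoffSeconds` is a hypothetical helper, not an arcrun function.

```typescript
// Illustrative only: delay (in seconds) before each retry attempt.
function backoffSeconds(base: number, factor: number, retries: number): number[] {
  const delays: number[] = [];
  let delay = base;
  for (let i = 0; i < retries; i++) {
    delays.push(delay);
    delay *= factor;
  }
  return delays;
}

console.log(backoffSeconds(5, 3, 3)); // [5, 15, 45]
```

Each value would go into its own `wait` node in the chain (for example a second and third `wait_a_bit`), since the workflow expresses the delays declaratively rather than in a loop.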
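 ## Why this pattern matters
 - arcrun's `ON_FAIL` edge is declarative error handling, more intuitive than writing try/catch
 - The `wait` component consumes no CPU (cypher-executor schedules a sleep and resumes afterwards), which is healthier than setTimeout
 - A final failure must notify a human rather than be silently swallowed; the notification itself is also the workflow's responsibility
 ## Variants
 - **Circuit breaker**: 3 consecutive failures → write a `circuit:open` flag to KBDB → later triggers skip straight through
 - **Dead letter queue**: write the failed input to KBDB as type=dlq-input so it can be re-run manually later
 - **Idempotency key**: carry the same request_id on retry to avoid duplicate processing downstream
 ## What you learn
 - The `ON_FAIL` edge: which path to take when a node fails
 - The `wait` component: a declarative delay that does not block the worker (it is pushed to paused-resume)
 - `{{node_id.error}}` gives the failed node's error message
 - Treat the "final failure notification" as part of the workflow instead of relying on monitoring outside the system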
@@ -0,0 +1 @@
 ["error-handling", "retry", "wait", "try-catch", "robustness", "advanced"]
@@ -0,0 +1,43 @@
 name: error_retry
 description: try_catch + wait + retry pattern that retries automatically when an external API fails intermittently
 flow:
  - "input >> ON_SUCCESS >> try_call"
  - "try_call >> ON_SUCCESS >> done"
  - "try_call >> ON_FAIL >> wait_a_bit"
  - "wait_a_bit >> ON_SUCCESS >> retry_call"
  - "retry_call >> ON_SUCCESS >> done"
  - "retry_call >> ON_FAIL >> final_fail_notify"
 config:
  try_call:
    component: http_request
    url: "{{input.target_url}}"
    method: POST
    body_json:
      payload: "{{input.payload}}"
  wait_a_bit:
    component: wait
    seconds: 5
  # Second attempt. In production, 2-3 retries with exponential backoff are typical
  retry_call:
    component: http_request
    url: "{{input.target_url}}"
    method: POST
    body_json:
      payload: "{{input.payload}}"
      _retry: 1
  done:
    component: comp_passthrough
    # Only records success; downstream nodes can continue the chain if needed
  final_fail_notify:
    component: telegram
    chat_id: "{{secret.LEO_TELEGRAM_CHAT_ID}}"
    text: |
      ⚠️ workflow {{input.workflow_name}} failed on both attempts
      target: {{input.target_url}}
      last error: {{retry_call.error}}
@@ -0,0 +1,33 @@
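 # github-issue-bot
 ## What problem it solves
 The pain of maintaining an open-source project: several issues come in every day, and each has to be read → triaged → followed up with a request for more information.
 This bot automates the first pass: classify / assess severity / leave a meaningful comment / add labels.
 ## Prerequisites
 1. In the GitHub repo: settings → Webhooks → add a webhook:
   - URL: `https://cypher.arcrun.dev/webhooks/named/github_issue_bot/trigger`
   - Content type: `application/json`
   - Events: `Issues (opened)`
   - Add the secret header `X-Arcrun-API-Key: ak_xxx` (GitHub's webhook form offers a Secret field but no custom-header field, so check how your setup delivers this key)
 2. Set the credential `GITHUB_BOT_TOKEN` (a PAT or a GitHub App token)
 ## Expected result
 About 30 seconds after a new issue is opened, the bot has commented and added labels.
 ## Make it your own
 - Adapt the prompt to your project's conventions (wording, tone)
 - Change the severity / category enums to your own taxonomy
 - Add a conditional: critical issues automatically notify the maintainer via telegram
 - Add KBDB storage for past issues + claude analyses → use RAG to find duplicate issues
 - Add `if_control`: if the issue body contains `traceback`, reproduce automatically
 ## Why this pattern matters
 - An LLM is better at "structured judgment" than hand-written if-else: it reads natural language, picks up context, and judges fuzzy boundaries
 - GitHub webhook → workflow is the most common "external event → processing" scenario, and every SaaS webhook works similarly
 ## What you learn
 - A multi-step chain (analyze → comment → label) where every step has a next, linked with ON_SUCCESS
 - `{{analyze.X}}` is expanded automatically from the claude_api JSON for downstream nodes
 - Multiple calls to the same API (GitHub) share the `Authorization` header
 - For LLM judgments such as severity / category, use enums + required_fields to keep the structure stable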
@@ -0,0 +1 @@
 ["github", "webhook", "llm", "automation", "triage", "external-api"]
@@ -0,0 +1,51 @@
 name: github_issue_bot
 description: GH webhook receives a new issue → claude analyzes → automatically leave a comment + add labels
 flow:
  - "input >> ON_SUCCESS >> analyze"
  - "analyze >> ON_SUCCESS >> add_comment"
  - "add_comment >> ON_SUCCESS >> add_labels"
 config:
  analyze:
    component: claude_api
    timeout_ms: 30000
    _recipe_output_format: json
    _recipe_output_required_fields:
      - severity
      - category
      - first_response
    prompt: |
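      You are the first-line triager for GitHub issues. For the issue below, provide:
      - severity: "critical" | "high" | "medium" | "low"
      - category: "bug" | "feature" | "doc" | "question" | "other"
      - first_response: a piece of markdown that is polite, useful, and does not pretend to be a human
      If it is a bug, guide the user to provide reproduce steps; if a question, answer it directly;
      if a feature, point them to discussions; if doc, accept it directly.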
      Issue:
      Title: {{input.issue.title}}
      Body: {{input.issue.body}}
  add_comment:
    component: http_request
    url: "https://api.github.com/repos/{{input.repository.full_name}}/issues/{{input.issue.number}}/comments"
    method: POST
    headers:
      Authorization: "Bearer {{secret.GITHUB_BOT_TOKEN}}"
      Accept: "application/vnd.github+json"
    body_json:
      body: "{{analyze.first_response}}"
  add_labels:
    component: http_request
    url: "https://api.github.com/repos/{{input.repository.full_name}}/issues/{{input.issue.number}}/labels"
    method: POST
    headers:
      Authorization: "Bearer {{secret.GITHUB_BOT_TOKEN}}"
    body_json:
      labels:
        - "auto-triaged"
        - "severity:{{analyze.severity}}"
        - "type:{{analyze.category}}"
@@ -0,0 +1,35 @@
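 # llm-classify
 ## What problem it solves
 The most common scenario for structured LLM output: sorting free text into fixed categories.
 With `_recipe_output_format: json`, claude_api automatically parses the reply and validates the required fields.
 ## How to trigger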
 ```bash
 curl -X POST https://cypher.arcrun.dev/webhooks/named/llm_classify_example/trigger \
  -H "X-Arcrun-API-Key: ak_xxx" \
  -d '{"api_key":"ak_xxx","text":"How to deploy Cloudflare Workers in production?"}'
 ```
 ## Expected result
 - claude returns JSON `{category, confidence, reason}`
 - KBDB writes one block whose tags_json contains `category:tech`
 - The response returns `{success: true, data: {id, ...}}`
 ## Why this pattern matters
 - `_recipe_output_format: json` + `_recipe_output_required_fields` is the magic of claude_api:
  after Claude returns JSON, cypher-executor automatically:
  1. strips the `json` code fence
  2. parses it
  3. verifies that the required fields exist
  4. puts each field (category / confidence / reason) at the top level of ctx, so downstream nodes use `{{category}}` directly
 - No parse / validate / shape logic to write, just prompt + schema
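The four steps listed above can be sketched as a single function. This is a minimal illustration under stated assumptions, not cypher-executor's code: `applyJsonRecipe`, `stripFence`, and the `Ctx` type are hypothetical names, and the exact error text is invented apart from the `validation_failed` label.

```typescript
// Hypothetical sketch; names are illustrative, not cypher-executor's API.
type Ctx = { [key: string]: unknown };

// Three backticks, assembled this way so the snippet itself stays fence-safe.
const FENCE = "`" + "`" + "`";

// Step 1: strip a surrounding code fence if the model added one.
function stripFence(raw: string): string {
  let text = raw.trim();
  if (text.indexOf(FENCE) === 0) {
    const firstNewline = text.indexOf("\n");
    text = firstNewline === -1 ? "" : text.slice(firstNewline + 1);
    const close = text.lastIndexOf(FENCE);
    if (close !== -1) {
      text = text.slice(0, close);
    }
  }
  return text.trim();
}

function applyJsonRecipe(raw: string, requiredFields: string[], ctx: Ctx): Ctx {
  // Step 2: parse (JSON.parse throws on invalid JSON).
  const value: unknown = JSON.parse(stripFence(raw));
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    throw new Error("validation_failed: expected a JSON object");
  }
  const parsed = value as Ctx;
  // Step 3: verify that every required field exists.
  const missing = requiredFields.filter((f) => !(f in parsed));
  if (missing.length > 0) {
    throw new Error("validation_failed: missing " + missing.join(", "));
  }
  // Step 4: copy each parsed field to the top level of a new context.
  const next: Ctx = {};
  for (const key of Object.keys(ctx)) {
    next[key] = ctx[key];
  }
  for (const key of Object.keys(parsed)) {
    next[key] = parsed[key];
  }
  return next;
}

const reply = FENCE + 'json\n{"category":"tech","confidence":0.85,"reason":"about deployment"}\n' + FENCE;
const ctxAfter = applyJsonRecipe(reply, ["category", "confidence"], { api_key: "ak_xxx" });
console.log(ctxAfter["category"]); // tech
```

Failing fast in step 3 matches the note in this example that a reply without the required fields ends in validation_failed and nothing is written to KBDB.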
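 ## Make it your own
 - Change the prompt to your classification rules (the category list can be long or short)
 - The downstream save_with_tag can be swapped for a telegram push / gmail / etc.
 - For multi-stage classification (coarse first, then fine), simply chain two claude_api nodes
 ## Notes
 - claude_api goes through the mira daemon (Phase A), so the workflow stays paused for a while until the callback resumes it
 - If the required_fields cannot be extracted from the reply, the run ends with validation_failed and nothing is written to KBDB (safer than a partial save)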
@@ -0,0 +1 @@
 ["llm", "claude", "classify", "structured-output", "kbdb", "common-pattern"]
@@ -0,0 +1,32 @@
 name: llm_classify_example
 description: webhook receives text → claude classifies → write to KBDB with a tag
 flow:
  - "input >> ON_SUCCESS >> classify"
  - "classify >> ON_SUCCESS >> save_with_tag"
 config:
  classify:
    component: claude_api
    timeout_ms: 30000
    _recipe_output_format: json
    _recipe_output_required_fields:
      - category
      - confidence
    prompt: |
      Classify the following text into one of these categories:
      - tech / business / personal / other
      Reply with JSON only:
      {"category": "tech", "confidence": 0.85, "reason": "..."}
      Text: {{input.text}}
  save_with_tag:
    component: kbdb_create_block
    api_key: "{{api_key}}"
    type: "note"
    source: "llm-classified"
    user_id: "ai_classifier"
    content: "{{input.text}}"
    tags_json: '["llm-classified", "category:{{category}}"]'
@@ -0,0 +1,38 @@
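 # parallel-fanout
 ## What problem it solves
 The same input needs several kinds of processing (summary / translation / classification / etc.).
 Instead of waiting for sequential execution (total time = sum of all), run in parallel (total time = the slowest one).
 ## How to trigger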
 ```bash
 curl -X POST https://cypher.arcrun.dev/webhooks/named/parallel_fanout/trigger \
  -H "X-Arcrun-API-Key: ak_xxx" \
  -d '{"api_key":"ak_xxx","text":"...","target_lang":"en"}'
 ```
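 ## Expected behavior
 - 3 child workflows start at the same time and run independently
 - The main workflow returns as soon as all child workflows have been triggered successfully (within milliseconds)
 - Results of completed child workflows are **not returned** to the parent; each writes its own KBDB records / notifications
 ## Make it your own
 - Add / remove dispatch nodes
 - Replace workflow_name with your real processing workflows
 - If you need to wait for all child workflows to finish → have each child write a completion marker to KBDB, and let a later cron on the parent side pick them up
 ## Variant: wait for all child workflows to finish
 By default, arcrun's trigger_workflow is fire-and-await (paused also counts as success).
 For a strict "wait until finished", you need to:
 1. Have the last step of each child workflow write a `done:{request_id}` block to KBDB
 2. Add a polling node + wait retry to the parent
 (A built-in wait_for_workflows component is planned after M2)
 ## Why this pattern matters
 - arcrun is a multi-tenant / multi-tier platform. Fan-out lets you build a "main controller + N workers" architecture
 - More robust than promise.all: each child workflow pauses/resumes independently without polluting the others' state
 ## What you learn
 - **Fan-out**: multiple ON_SUCCESS edges leaving one node, executed in parallel
 - `trigger_workflow` is a built-in orchestration component (an in-process call in cypher-executor that bypasses CF self-fetch)
 - Input variables are copied to every branch on fan-out (branches do not affect one another)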
@@ -0,0 +1 @@
 ["fan-out", "parallel", "trigger_workflow", "multi-step", "advanced"]
@@ -0,0 +1,35 @@
 name: parallel_fanout
 description: Distribute one input to multiple child workflows for parallel processing (trigger_workflow pattern)
 flow:
  - "input >> ON_SUCCESS >> dispatch_to_summary"
  - "input >> ON_SUCCESS >> dispatch_to_translate"
  - "input >> ON_SUCCESS >> dispatch_to_classify"
 config:
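  # Three child workflows triggered in parallel. Each runs independently, without affecting or waiting for the others
  # cypher-executor handles the fan-out: three edges from the same source (input) → three targets that each run on their own
  dispatch_to_summary:
    component: trigger_workflow
    workflow_name: "llm_classify_example"   # change to your summary workflow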
    api_key: "{{api_key}}"
    input:
      api_key: "{{api_key}}"
      text: "{{input.text}}"
  dispatch_to_translate:
    component: trigger_workflow
    workflow_name: "your_translate_workflow"
    api_key: "{{api_key}}"
    input:
      api_key: "{{api_key}}"
      text: "{{input.text}}"
      target_lang: "{{input.target_lang}}"
  dispatch_to_classify:
    component: trigger_workflow
    workflow_name: "llm_classify_example"
    api_key: "{{api_key}}"
    input:
      api_key: "{{api_key}}"
      text: "{{input.text}}"
@@ -0,0 +1,40 @@
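 # pdf-to-blocks
 ## What problem it solves
 Research / study: drop in a PDF and it is automatically converted to text, split into chunks, and stored in KBDB, ready for RAG search afterwards.
 Good for: a paper reading library, contract lookup, RAG over technical documentation.
 ## How to trigger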
 ```bash
 curl -X POST https://cypher.arcrun.dev/webhooks/named/pdf_to_blocks/trigger \
  -H "X-Arcrun-API-Key: ak_xxx" \
  -d '{
    "api_key":"ak_xxx",
    "pdf_url":"https://arxiv.org/pdf/2411.02959.pdf",
    "title":"HtmlRAG",
    "user_id":"inkstone_leo_research"
  }'
 ```
 ## What to do next
 Combine it with the `rag-search-answer` workflow:
 ```bash
 curl ... rag_search_answer/trigger \
  -d '{"question":"What are the advantages of HtmlRAG over Markdown?", "user_id":"inkstone_leo_research"}'
 ```
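 → claude finds context among the PDF chunks you just ingested and answers
 ## Make it your own
 - Replace the convert source (cto.finally.click also offers convert, usable in your own environment)
 - `kbdb_ingest` chunks at ~500 characters by default; to change that, configure it on the KBDB side
 - `source: "pdf:{url}"` is the idempotency key: re-ingesting the same URL is detected
 ## Variants
 - Attach `claude_api` to run an "auto-tag" pass after ingest (extract keyword tags from each chunk)
 - Combine with the `email-summary` pattern: subscribe to the arxiv RSS → PDFs are collected automatically
 - Have the ingest result trigger `wiki_synthesis` (mira uses this chain)
 ## What you learn
 - KBDB has a `/convert` endpoint that takes PDF / DOC directly, so you do not handle OCR yourself
 - `kbdb_ingest` does automatic chunking + embedding in one go
 - `source: "{type}:{key}"` is the KBDB idempotency convention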
@@ -0,0 +1 @@
 ["pdf", "ingest", "kbdb", "rag-prep", "chunking", "knowledge-base"]
@@ -0,0 +1,25 @@
 name: pdf_to_blocks
 description: Receive a PDF URL → convert to text → split into chunks → store in KBDB, one block per chunk
 flow:
  - "input >> ON_SUCCESS >> convert_pdf"
  - "convert_pdf >> ON_SUCCESS >> ingest_to_kbdb"
 config:
  convert_pdf:
    component: http_request
    url: "https://kbdb.finally.click/convert"
    method: POST
    body_json:
      file_url: "{{input.pdf_url}}"
      format: "text"
  # kbdb_ingest chunks automatically + writes blocks (~500 characters each)
  # source uses file_url as the dedup key (re-ingesting the same PDF does not create duplicates)
  ingest_to_kbdb:
    component: kbdb_ingest
    api_key: "{{api_key}}"
    page_name: "pdf-{{input.title}}"
    text: "{{convert_pdf.data.text}}"
    source: "pdf:{{input.pdf_url}}"
    user_id: "{{input.user_id}}"
@@ -0,0 +1,36 @@
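 # rag-search-answer
 ## What problem it solves
 The classic RAG: the user asks a question → KBDB semantic search finds relevant blocks → they are fed to claude to answer.
 Better than asking claude directly: with real context, claude is far less likely to make things up, can cite sources, and answers consistently with your data.
 ## How to trigger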
 ```bash
 curl -X POST https://cypher.arcrun.dev/webhooks/named/rag_search_answer/trigger \
  -H "X-Arcrun-API-Key: ak_xxx" \
  -d '{
    "api_key":"ak_xxx",
    "question":"How do I avoid the CF self-fetch deadlock?",
    "user_id":"inkstone_mira_tools"
  }'
 ```
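 ## Make it your own
 - `search_kbdb.topK`: change N (how much context to fetch; 3-10 is common)
 - `search_kbdb.user_id`: set it to query that user's blocks, or remove it to search the whole store
 - Adapt the prompt to your domain (customer support / legal / medical / technical docs)
 - Advanced: add `_recipe_output_format: json` so claude returns structured {answer, citations[]}
 ## Why this pattern matters
 RAG is where LLMs become genuinely useful. Without RAG, an LLM's answers about your private data are guesses.
 ## Variants
 - **Multi-turn RAG**: claude rewrites the question first → KBDB search → claude answers (query rewriting)
 - **Multi-source**: KBDB + web search + DB query → merge → claude
 - **filter**: claude first decides "is RAG needed?" → if not, answer directly (saves search latency)
 - **followup**: store claude's answer together with the user's question in KBDB, so the same question is a direct cache hit next time
 ## What you learn
 - `kbdb_search` is semantic (embedding-based), not literal matching, so the query need not contain the exact keywords
 - `{{search_kbdb.results}}` is expanded automatically into a markdown list (component contract)
 - Injecting context into the claude prompt is the core of RAG; no components beyond the vector DB are needed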
@@ -0,0 +1 @@
 ["rag", "llm", "claude", "kbdb", "semantic-search", "qa", "common-pattern"]
@@ -0,0 +1,33 @@
 name: rag_search_answer
 description: Receive a question → KBDB semantic search → feed the top context to claude to answer
 flow:
  - "input >> ON_SUCCESS >> search_kbdb"
  - "search_kbdb >> ON_SUCCESS >> answer_with_context"
 config:
  search_kbdb:
    component: kbdb_search
    api_key: "{{api_key}}"
    query: "{{input.question}}"
    topK: 5
    user_id: "{{input.user_id}}"   # optional, restricts the search to one user's namespace
  answer_with_context:
    component: claude_api
    timeout_ms: 45000
    _recipe_output_format: text
    prompt: |
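      You are a knowledge-base assistant. Answer the question based on the context below.
      **Rules**:
      1. Use only information in the context; do not extrapolate
      2. If the context does not cover it, honestly say "not found in the database"; do not make things up
      3. When citing, mark [block_id] so the user can trace the original
      Context:
      {{search_kbdb.results}}
      Question: {{input.question}}
      Answer: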
@@ -0,0 +1,27 @@
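 # webhook-to-http
 ## What problem it solves
 The minimal working example: the user POSTs to an arcrun webhook, and arcrun forwards the whole payload to another HTTP API.
 Good for testing arcrun connectivity, simple API bridging, and event forwarding.
 ## How to trigger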
 ```bash
 curl -X POST https://cypher.arcrun.dev/webhooks/named/webhook_to_http/trigger \
  -H "X-Arcrun-API-Key: ak_xxx" \
  -H "Content-Type: application/json" \
  -d '{"hello": "world"}'
 ```
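 ## Expected result
 - The response contains `success: true` and the echo returned by the downstream httpbin
 - The downstream URL receives `{received: {hello: "world"}, timestamp: "2026-..."}`
 ## Make it your own
 - `forward.url`: change to the API you want to call
 - `body_json`: change to the payload schema you want to send
 - Need an auth header → add it to `forward.headers` (or use the credentials mechanism)
 ## What you learn
 - The simplest flow: input → a single node
 - `{{input}}` gives the entire JSON POSTed at trigger time
 - `body_json` is a structured body (not a string)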
@@ -0,0 +1 @@
 ["webhook", "http", "starter", "forward", "bridge"]
@@ -0,0 +1,16 @@
 name: webhook_to_http
 description: Receive a webhook → forward to another HTTP API
 flow:
  - "input >> ON_SUCCESS >> forward"
 config:
  forward:
    component: http_request
    url: "https://httpbin.org/post"
    method: POST
    headers:
      Content-Type: "application/json"
    body_json:
      received: "{{input}}"
      timestamp: "{{_now}}"
@@ -0,0 +1,29 @@
 # Arcrun Skill Library
 > Playbooks for AI operators (patterns + process guidance).
 > A level above examples: examples are "YAML you can use directly", skills are "how to think about problem X + which example to use".
 >
 > Corresponding SDD: `matrix/arcrun/.agents/specs/llm-interface/` Milestone 3.1
 ## Structure
 Each skill is one markdown file:
 ```
 {skill-name}.md
 ```
 ## Skill list
 | Skill | When to use |
 |---|---|
 | `build_watcher_workflow` | The user wants "scan data every X minutes and process whatever matches" |
 | `debug_paused_workflow` | A workflow is stuck in paused |
 | `migrate_http_to_trigger_workflow` | You see an old workflow calling itself via http_request and hitting the CF self-fetch deadlock |
 | `rag_with_arcrun` | The user wants "ask a question + answer from my data" |
 | `add_new_wasm_component` | A component is missing and a new one has to be written (TinyGo WASM) |
 ## CI auto-sync
 GH Actions watches this directory for changes → PATCHes each skill into KBDB as a type=agent-skill block.
 MCP `arcrun_list_skills(tag?)` / `arcrun_get_skill(slug)` let the AI query them.
@@ -0,0 +1,156 @@
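 # Skill: Add New WASM Component
 ## When to use this skill
 `arcrun_list_components()` does not have the component you need, so a new TinyGo / AssemblyScript WASM component has to be written.
 **Important**: writing a component = **changing the arcrun platform itself**, not changing a user workflow.
 This skill assumes you have write access to the arcrun repo. If not → tell the user "component X is needed, please contact the platform maintainers" and stop.
 ## 7-step process
 ### 1. Confirm a new component is really needed
 First ask: can `http_request` plus composition get it done?
 - Most third-party APIs → `http_request` is enough (paired with `auth_recipe` for auth)
 - Simple transformations → use logic primitives (`set` / `filter` / `array_ops`)
 - Complex orchestration → multiple steps in cypher binding rather than one big component
 Cases that really need a new component:
 - Interacting with cypher-executor host functions (KV, encryption/decryption, JWT signing)
 - Logic too complex to decompose into multiple nodes
 - Performance (one worker call replacing 10 fetches)
 ### 2. Read the rules
 - `matrix/arcrun/.claude/rules/03-component-architecture.md`: deployment conventions
 - `matrix/arcrun/.claude/rules/01-tech-stack.md`: TinyGo limitations
 - Existing similar components as examples: `matrix/arcrun/registry/components/{name}/main.go`
 ### 3. Create a new directory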
 ```
 matrix/arcrun/registry/components/{your_component}/
 ├── main.go                    TinyGo source
 ├── component.contract.yaml    input/output schema + description
 └── (the .wasm is produced by the build)
 ```
 Contract format (component.contract.yaml):
 ```yaml
 canonical_id: your_component
 display_name: Display name shown to users
 category: data | auth | api | logic
 version: 0.1.0
 description: |
  What it does, its limits, and caveats. The AI reads this to decide whether to use your component
 input_schema:
  type: object
  required: [foo, bar]
  properties:
    foo: { type: string, description: "..." }
    bar: { type: number, description: "..." }
 output_schema:
  type: object
  properties:
    result: { type: string }
    success: { type: boolean }
 gherkin_tests:
  - given: "input foo=hello"
    when: "component runs"
    then: "result contains hello"
 ```
 ### 4. Write main.go
 ```go
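 package main
 import (
    "encoding/json"
    "io"
    "os"
    "strconv"
 )
 // Input mirrors input_schema in component.contract.yaml.
 type Input struct {
    Foo string `json:"foo"`
    Bar int    `json:"bar"`
 }
 // Output mirrors output_schema in component.contract.yaml.
 type Output struct {
    Result  string `json:"result"`
    Success bool   `json:"success"`
 }
 // fail reports an error as JSON on stdout instead of silently continuing.
 func fail(err error) {
    json.NewEncoder(os.Stdout).Encode(Output{Result: err.Error(), Success: false})
 }
 func main() {
    // Read the whole JSON input from stdin.
    bytes, err := io.ReadAll(os.Stdin)
    if err != nil {
       fail(err)
       return
    }
    var in Input
    if err := json.Unmarshal(bytes, &in); err != nil {
       fail(err)
       return
    }
    // Your logic. strconv.Itoa renders 42 as "42"; string(rune(42)) would yield "*".
    result := in.Foo + ":" + strconv.Itoa(in.Bar)
    out := Output{Result: result, Success: true}
    json.NewEncoder(os.Stdout).Encode(out)
 }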
 ```
 Restrictions:
 - Import only stdlib packages such as `os`, `io`, `encoding/json`, `encoding/base64`, `strings`, `time`
 - **Forbidden**: `net/http` (use the host function `u6u.http_request`), `crypto/rsa` (use a host function)
 - All logic lives in main(), with JSON I/O over stdin/stdout
 ### 5. Build + test locally
 ```bash
 cd registry/components/your_component
 tinygo build -target=wasi -o your_component.wasm main.go
 echo '{"foo":"hello","bar":42}' | wasmtime your_component.wasm
 ```
 ### 6. Deploy as a standalone Worker
 ```
 .component-builds/your_component/
 ├── wrangler.toml      name = "arcrun-your-component"
 ├── package.json
 ├── component.wasm     copied from the build above
 └── src/index.ts       fixed WASI shim (copy component-worker-template)
 ```
 `wrangler.toml`:
 ```toml
 name = "arcrun-your-component"
 main = "src/index.ts"
 compatibility_date = "2025-02-19"
 workers_dev = true   # must be true: cypher-executor uses the workers.dev internal URL
 [[routes]]
 pattern = "your-component.arcrun.dev/*"
 zone_name = "arcrun.dev"
 ```
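 push → CI deploys automatically.
 ### 7. Register in the cypher-executor allowlist
 ⚠️ **Current architectural debt** (fix planned for M2): every new component must be added manually to `cypher-executor/src/lib/component-loader.ts`: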
 ```ts
 const WASM_HTTP_RUNNER_IDS: ReadonlySet<string> = new Set([
  // ... existing entries
  'your_component',  // ← add this line
 ]);
 ```
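 If you skip this → workflows that use your component fail with a "component not found" error.
 In the future this will become a dynamic lookup from the registry KV (the `cypher-executor-dynamic-component-discovery` SDD is yet to be opened).
 ## Verify it is live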
 ```
 arcrun_list_components() → should list your_component
 arcrun_get_component_contract(canonical_id='your_component') → inspect the schema
 ```
 Write a test workflow that uses your component; once it runs end to end, you are done.
@@ -0,0 +1,86 @@
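 # Skill: Build Watcher Workflow
 ## When to use this skill
 The user says:
 - "Every X minutes / hours, scan Y → process whatever matches"
 - "Watch a data source and process new data automatically as it arrives"
 - "Periodically check X for anything new"
 ## Core pattern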
 ```
 cron → list (fetch candidates) → filter (keep unprocessed) → for each → trigger the processing workflow
 ```
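 ## 5-step process
 ### 1. Identify the data source
 Ask the user (or infer from context):
 - Where does the data live? KBDB / an external API / the file system?
 - Which field distinguishes "processed vs unprocessed"? Common choices:
  - a tag (whether `tags_json` contains `"processed"`)
  - a status field (`status: pending`)
  - a missing piece of metadata (e.g. no `summary`)
 - Do not rely on time: a missed cron run would then miss records permanently
 ### 2. Look at the example + adapt
 `arcrun_search_examples('cron watcher')` → hits the `cron-watcher` example.
 Copy the YAML and change four places:
 - `watch_cron.cron_expr`: change the frequency
 - `list_unprocessed`: change the query
 - `filter_new.condition`: change to your definition of "unprocessed"
 - `trigger_processor.workflow_name`: change to your processing workflow's name
 ### 3. The processing workflow must be idempotent
 The watcher may re-run (a catch-up after a missed cron, a manual catch-up trigger). The processing workflow must:
 - Check as its first step "have I already processed this record?"
 - Or mark the record as processed in its last step (add a tag / change the status)
 - Fail gracefully (record telemetry, do not crash again)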
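The first-step guard and last-step marker described above can be sketched with a tag-based check. The tag name `processed` comes from step 1; `parseTags`, `isProcessed`, and `markProcessed` are hypothetical helpers for illustration, not arcrun components.

```typescript
// Hypothetical tag name and helpers; adapt them to your own "processed" marker.
const PROCESSED_TAG = "processed";

// tags_json is stored as a JSON string such as '["a","b"]'.
function parseTags(tagsJson: string): string[] {
  const value: unknown = JSON.parse(tagsJson);
  const tags: string[] = [];
  if (Array.isArray(value)) {
    for (const item of value) {
      if (typeof item === "string") {
        tags.push(item);
      }
    }
  }
  return tags;
}

// First-step guard: has this record been handled already?
function isProcessed(tagsJson: string): boolean {
  return parseTags(tagsJson).indexOf(PROCESSED_TAG) !== -1;
}

// Last-step marker: adding the tag twice leaves tags_json unchanged.
function markProcessed(tagsJson: string): string {
  const tags = parseTags(tagsJson);
  if (tags.indexOf(PROCESSED_TAG) === -1) {
    tags.push(PROCESSED_TAG);
  }
  return JSON.stringify(tags);
}

const markedOnce = markProcessed("[]");
const markedTwice = markProcessed(markedOnce);
console.log(markedOnce, markedTwice, isProcessed(markedTwice));
```

Because marking is itself idempotent, a re-run of the watcher after a partial failure cannot accumulate duplicate tags.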
 ### 4. Always use `trigger_workflow`, never `http_request` to call yourself
 **This is the #1 pitfall**. When cypher-executor uses `http_request` to call its own `cypher.arcrun.dev` or
 `arcrun-cypher-executor.*.workers.dev`, CF self-fetch protection blocks it (1042 / 522 errors).
 Use the built-in `trigger_workflow` component:
 ```yaml
 trigger_processor:
  component: trigger_workflow
  workflow_name: "your_processor"
  api_key: "{{api_key}}"
  input:
    api_key: "{{api_key}}"
    block_id: "{{item.id}}"
 ```
 ### 5. Deploy + verify
 ```
 arcrun_validate_yaml(yaml) → arcrun_push_workflow(yaml) → wait 5 min → arcrun_list_recent_executions
 ```
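 After the first cron tick, check the executions list to confirm it is working; if nothing shows up, check `arcrun_list_paused_executions` for anything stuck.
 ## Common pitfalls
 | Symptom | Cause | Fix |
 |---|---|---|
 | The watcher runs but processes the same N records every time | Records are never marked as processed | Add a tag / status change as the last step of the processing workflow |
 | The watcher runs but processes nothing | The filter condition is wrong | It passes acr validate but the logic is wrong; trigger it once manually with curl and read the trace |
 | The processing workflow stays paused forever | The claude_api callback never returned | Health-check the mira daemon; a callback normally returns in 30-60 seconds |
 | The volume overwhelms the worker | Too many triggers at once | Add a limit to list_unprocessed and spread the work over several cron runs |
 | cron does not fire | The first node is not a cron component | scheduled() only recognizes a cron first node; confirm the first line of the YAML flow is `cron_node >> X` |
 ## Real-world case
 `mira_feed_watcher.yaml` (polaris/mira/arcrun/) uses this pattern in production:
 - cron `*/5 * * * *` scans leo's feed posts
 - filter `tags_json eq "[]"` catches unprocessed ones
 - trigger_workflow triggers `wiki_synthesis`
 - the last step inside wiki_synthesis adds the `wiki-processed` tag to ensure idempotency
 See the mira repo for the full YAML.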
@@ -0,0 +1,81 @@
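 # Skill: Debug Paused Workflow
 ## When to use this skill
 - You called `arcrun_run_workflow(...)` and got an error containing "workflow paused at node X waiting for task task_XXX"
 - The user says "the workflow ran but produced no result" / "it has been waiting a long time"
 - You see `error_code: paused_awaiting_resume`
 ## Key concept: paused is **not an error**
 Some components are async by design: start a task → return paused immediately → wait for an external callback to POST `/workflows/resume` → cypher-executor continues execution.
 Typical pausing components:
 - `claude_api`: calls the mira daemon, the daemon runs Claude (30-60 seconds) → sends the callback
 - `http_request_async` (planned): sends a webhook and waits for the response
 - Any component that uses the `pending: true, task_id: X` pattern
 A paused workflow **is still running**; cypher-executor simply does not burn CPU waiting for it and persists the state to KV until the callback arrives.
 ## Debug process
 ### Step 1: Confirm it is really paused (not failed)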
 ```
 arcrun_list_paused_executions(api_key=ak_xxx, limit=20)
 ```
 Look at the returned paused array:
 - Find your workflow's name
 - Check `expires_at` (how long until the 24h TTL runs out)
 - Take the `task_id` to the next step
 ### Step 2: Inspect the paused state
 ```
 arcrun_get_execution_trace(api_key=ak_xxx, task_id=task_XXX)
 ```
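 The returned `paused_pending_result` contains the external task id (e.g. the mira daemon's task_id), and `paused_node_id` tells you where it is stuck.
 ### Step 3: Determine why it is stuck
 | Observation | Cause | Fix |
 |---|---|---|
 | `expires_at` has passed | No callback within 24h, state already GC'd | Re-trigger the workflow |
 | paused_node is `claude_api` and the mira daemon returns 503 | The daemon is down | `ssh cto systemctl status cloud-cto` |
 | paused_node is `claude_api` and the daemon is healthy | The callback has not returned yet | Wait 30-90 seconds |
 | `paused_pending_result` has no `task_id` | Bug in the component implementation | Read the component source |
 | `paused_pending_result.callback_url` is wrong | Deployment URL misconfigured | Check the component's env config |
 ### Step 4: Manual resume (emergency use)
 If you already know the callback result (from external logs / by calling the external API directly), you can resume manually: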
 ```bash
 curl -X POST https://cypher.arcrun.dev/workflows/resume \
  -H "Content-Type: application/json" \
  -d '{
    "task_id": "task_XXX",
    "result": { ... what the callback would have returned ... }
  }'
 ```
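 cypher-executor locates the matching paused state and resumes execution.
 ### Step 5: Avoid getting stuck again
 When deploying watcher / async flows:
 - Set a reasonable timeout (claude_api defaults to 30s; raise it to 60-90s for heavy prompts)
 - Health-check the daemon (add an alert to your monitoring)
 - Avoid triggering too many paused workflows at once during high-load periods (KV write volume spikes)
 ## Quick reference: paused vs. failed
 | State | `success` field | error contains | What to do |
 |---|---|---|---|
 | **Completed successfully** | true | (none) | Read the result in `data` |
 | **paused** | false (but effectively a success) | "workflow paused at node X" | Wait for the callback / call get_execution_trace |
 | **Real failure** | false | Any other (non-paused) error message | Look at the first failed node in the trace |
 The built-in `trigger_workflow` component already treats paused as status='paused_awaiting_resume' rather than a failure (commit 5216242).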
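A caller that wraps `arcrun_run_workflow` could apply the quick-reference table with a check like the one below. It is a sketch under assumptions: the result envelope `{"success", "error", "error_code"}` is hypothetical; only the `success` column, the "workflow paused at node" text, and `paused_awaiting_resume` come from this skill.

```python
def classify_result(result: dict) -> str:
    """Return 'success', 'paused', or 'failed' for a workflow run result.

    Assumes a result shaped like
      {"success": bool, "error": str, "error_code": str}
    which is an illustrative envelope, not a documented schema.
    """
    if result.get("success"):
        return "success"

    error = result.get("error") or ""
    # paused reports success=false, so it must be checked before "failed".
    if (result.get("error_code") == "paused_awaiting_resume"
            or "workflow paused at node" in error):
        return "paused"

    return "failed"
```

Checking for paused before treating `success: false` as a failure is the point of the table: a paused run should lead to waiting or to `arcrun_get_execution_trace`, not to a retry.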
@@ -0,0 +1,90 @@
 # Skill: Migrate http_request → trigger_workflow
 ## When to use this skill
 You see an existing workflow YAML that contains:
 ```yaml
 some_node:
  component: http_request
  url: "https://cypher.arcrun.dev/webhooks/named/another_workflow/trigger"
  # or the workers.dev form:
  # url: "https://arcrun-cypher-executor.uncle6-me.workers.dev/webhooks/named/X/trigger"
 ```
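 This is an **anti-pattern**: Cloudflare Workers' self-fetch protection blocks the request and returns 1042 / 522.
 **Always use the built-in `trigger_workflow` component instead.**
 ## Why it gets blocked
 Cloudflare Workers guards against same-zone self-loops:
 - Workers in the same zone (`*.arcrun.dev`) calling each other can easily deadlock
 - workers.dev is blocked too (a Worker fetching its own URL)
 Background: mira_feed_watcher used to call itself through http_request and failed no matter how it was configured; the built-in `trigger_workflow` component was eventually added to bypass the problem (commit b8ecef0, 2026-05-16).
 ## How to migrate (a 3-line change)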
 ### Before
 ```yaml
 trigger_synthesis:
  component: http_request
  url: "https://arcrun-cypher-executor.uncle6-me.workers.dev/webhooks/named/wiki_synthesis/trigger"
  method: POST
  headers:
    X-Arcrun-API-Key: "{{api_key}}"
    Content-Type: "application/json"
  body_json:
    api_key: "{{api_key}}"
    raw_block_id: "{{item.id}}"
 ```
 ### After
 ```yaml
 trigger_synthesis:
  component: trigger_workflow
  workflow_name: "wiki_synthesis"
  api_key: "{{api_key}}"
  input:
    api_key: "{{api_key}}"
    raw_block_id: "{{item.id}}"
 ```
 Key mapping:
 - `url` → extract the `workflow_name`
 - `headers.X-Arcrun-API-Key` → `api_key`
 - `body_json` → `input`
 - method / Content-Type → not needed (in-process call)
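The key mapping above is mechanical enough to script when many nodes need migrating. The following Python sketch works on a node that has already been parsed into a dict; the function name `migrate_node` is hypothetical. It matches only on the `/webhooks/named/<name>/trigger` path and does not check the host, so confirm the URL points at your own Worker before applying it.

```python
import re

# Path shape taken from the Before example; the host is not inspected.
_TRIGGER_URL = re.compile(r"/webhooks/named/([^/]+)/trigger/?$")


def migrate_node(node: dict) -> dict:
    """Convert a self-calling http_request node into a trigger_workflow node.

    Nodes that are not http_request, or whose URL is not a named-webhook
    trigger URL, are returned unchanged.
    """
    if node.get("component") != "http_request":
        return node

    match = _TRIGGER_URL.search(node.get("url", ""))
    if match is None:
        return node  # external target: http_request still applies

    migrated = {
        "component": "trigger_workflow",
        "workflow_name": match.group(1),      # url -> workflow_name
        "input": node.get("body_json", {}),   # body_json -> input
    }
    headers = node.get("headers", {})
    if "X-Arcrun-API-Key" in headers:
        migrated["api_key"] = headers["X-Arcrun-API-Key"]  # header -> api_key
    # method and Content-Type are dropped: the call is in-process.
    return migrated
```

Applied to the Before node, this yields the same fields as the After node.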
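 ## Behavioral differences
 | Dimension | http_request calling itself | trigger_workflow |
 |---|---|---|
 | Path taken | External HTTP (blocked) | In-process call to executeWebhookGraph |
 | Latency | One round trip, 50-200ms | < 1ms |
 | Paused status reporting | http sees a 5xx and treats it as a failure | status='paused_awaiting_resume' counts as success |
 | Auth injection | Hand-written header | Automatic |
 | Cross-zone | Hits self-fetch protection | Bypassed entirely |
 | Metering | Counts against the external fetch quota | Counts as CPU on the same Worker |
 ## Exception: when you really do need http_request
 `trigger_workflow` can only trigger workflows in the **same arcrun account** (same api_key namespace).
 Cross-account, cross-environment, or cross-platform triggers need http_request:
 - Triggering another arcrun user's webhook (rare)
 - Triggering an external API (zapier / n8n / another in-house service)
 - A worker in a different Cloudflare account
 None of these hit the self-fetch problem (the destination is not your own Worker), so http_request still applies.
 ## Pre-deployment verification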
 ```
 arcrun_validate_yaml(yaml)
 arcrun_push_workflow(yaml)
 arcrun_run_workflow(your_watcher_name, {...})
 arcrun_list_recent_executions(workflow_name='your_watcher_name')
 ```
 Confirm that verdict='success' and duration_ms < 500ms (trigger_workflow should be fast).
@@ -0,0 +1,115 @@
 # Skill: RAG with Arcrun
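 ## When to use this skill
 The user says:
 - "I have a pile of X data and want to ask questions and have it answer"
 - "Claude doesn't know my private data; how do I make it aware?"
 - "A support bot that answers from our docs"
 - "Q&A over an internal company knowledge base"
 ## Three-step RAG architecture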
 ```
 User question → search → feed context to the LLM → answer
 ```
 The arcrun equivalent:
 ```yaml
 flow:
  - "input >> ON_SUCCESS >> search"      # KBDB semantic search
  - "search >> ON_SUCCESS >> answer"     # claude_api with the context
 ```
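 For the full template, see `arcrun_search_examples('rag')` → `rag-search-answer`.
 ## 5 key decisions
 ### 1. How does data get into KBDB?
 The source determines quality:
 - **PDF / documents** → use the `pdf-to-blocks` workflow (automatic chunking + embedding)
 - **Logseq / Notion / handwritten notes** → write an ingest script or let the mira platform handle it
 - **Web crawl** → http_request → `kbdb_ingest`
 - **Daily RSS** → cron + kbdb_ingest
 Key points:
 - Use the `source` field to distinguish origins (queries can filter by source later)
 - Use `user_id` to separate namespaces (multi-tenant or multi-domain)
 - Chunk size: 500-1000 characters works best (too small loses context, too large dilutes relevance)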
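For a hand-written ingest script, the chunk-size guidance above could be approximated with a paragraph-aware splitter like this. It is a minimal sketch, not what `pdf-to-blocks` does internally: the function name and the blank-line paragraph heuristic are assumptions, and the 1000-character cap is taken from the upper end of the recommended range.

```python
from typing import List


def chunk_text(text: str, max_chars: int = 1000) -> List[str]:
    """Split text into chunks of at most max_chars, preferring paragraph breaks."""
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        # Hard-split any single paragraph that is itself too long.
        while len(para) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        # Otherwise pack paragraphs together until the cap would be exceeded.
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Packing whole paragraphs keeps related sentences in the same chunk; the hard split is only a fallback for very long paragraphs and may cut mid-sentence.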
 ### 2. How should search be configured?
 ```yaml
 search:
  component: kbdb_search
  api_key: "{{api_key}}"
  query: "{{input.question}}"
  topK: 5                          # anything from 3 to 10 is reasonable
  user_id: "{{input.user_id}}"     # restrict to a namespace (required for multi-tenant)
 ```
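 Advanced parameters:
 - `source`: restrict to a source (query only "pdf:*" or "wiki:*")
 - `tag`: restrict to a tag ("verified", "policy", etc.)
 - Semantic search runs on embeddings, so the query can be natural language; there is no need to hit exact keywords
 ### 3. How should the prompt feed in the context?
 Key: **mark the context boundaries explicitly and give the LLM permission to decline to answer**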
 ```
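 You are a knowledge-base assistant. Answer the question **using only the information in the context**.
 Rules:
 1. If the context does not cover it, say honestly "not found in the knowledge base"
 2. When citing, mark the [block_id] so the user can trace the original
 3. Do not extrapolate; do not make things up
 Context:
 {{search.results}}
 Question: {{input.question}}
 Answer: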
 ```
 Without that "permission to decline", the LLM will guess.
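If the prompt is assembled in code rather than through template substitution, the same two ideas (explicit context boundaries, permission to decline) could look like the sketch below. The result keys `block_id` and `text` and the `<context>` delimiters are assumptions for illustration; the real shape of `search.results` may differ.

```python
from typing import Dict, List


def build_rag_prompt(question: str, results: List[Dict[str, str]]) -> str:
    """Assemble a RAG prompt with explicit context boundaries.

    Each result is assumed to carry `block_id` and `text` keys (hypothetical).
    """
    if results:
        # Prefix every block with its id so the model can cite it.
        context = "\n".join(f"[{r['block_id']}] {r['text']}" for r in results)
    else:
        context = "(no matching blocks)"
    return (
        "You are a knowledge-base assistant. Answer using only the information "
        "inside the <context> tags.\n"
        "Rules:\n"
        '1. If the context does not cover it, say "not found in the knowledge base"\n'
        "2. When citing, mark the [block_id]\n"
        "3. Do not extrapolate; do not make things up\n"
        "<context>\n"
        f"{context}\n"
        "</context>\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Emitting an explicit "(no matching blocks)" marker when the search returns nothing gives the model a clear cue to fall back on rule 1 instead of answering from its own knowledge.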
 ### 4. How should citations be displayed?
 Advanced: use `_recipe_output_format: json` to make claude return structured output:
 ```json
 {
  "answer": "...",
  "citations": [{"block_id": "abc-123", "snippet": "..."}],
  "confidence": "high"
 }
 ```
 The frontend can render these as clickable citation links.
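Before rendering links, it may be worth checking that every cited `block_id` was actually among the retrieved blocks, since a model can cite an id it never saw. A minimal Python sketch, assuming the JSON shape shown above and a hypothetical helper name:

```python
import json
from typing import Dict, List, Set, Tuple


def split_citations(raw: str, retrieved_ids: Set[str]) -> Tuple[str, List[Dict], List[str]]:
    """Parse the structured answer and separate grounded from unknown citations.

    Returns (answer, grounded_citations, unknown_block_ids).
    """
    data = json.loads(raw)
    grounded: List[Dict] = []
    unknown: List[str] = []
    for citation in data.get("citations", []):
        block_id = citation.get("block_id")
        if block_id in retrieved_ids:
            grounded.append(citation)
        else:
            unknown.append(str(block_id))
    return data.get("answer", ""), grounded, unknown
```

Only the grounded citations would then be rendered as links; a non-empty unknown list is a useful signal to log or to lower the displayed confidence.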
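 ### 5. How do you measure accuracy?
 `arcrun_search_examples('rag-eval')` has no example yet. Manually:
 1. Prepare N "golden QA pairs" (question + expected answer)
 2. Run the workflow N times and compare the results
 3. If accuracy < 70%: check KBDB chunk quality first, then tune topK, and tune the prompt last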
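The manual steps above can be scripted as a small harness. In this sketch, `ask` stands in for whatever runs the workflow for one question (for example a wrapper around `arcrun_run_workflow`); it and the substring-based matcher are assumptions, and a stricter or LLM-based matcher can be swapped in.

```python
from typing import Callable, List, Tuple


def contains_expected(answer: str, expected: str) -> bool:
    """Loose matcher: the expected text appears in the answer, ignoring case."""
    return expected.strip().lower() in answer.lower()


def evaluate(pairs: List[Tuple[str, str]],
             ask: Callable[[str], str],
             is_match: Callable[[str, str], bool] = contains_expected) -> float:
    """Run every golden question through `ask` and return the hit ratio."""
    if not pairs:
        raise ValueError("need at least one golden QA pair")
    hits = sum(1 for question, expected in pairs if is_match(ask(question), expected))
    return hits / len(pairs)
```

A result below 0.7 corresponds to the 70% threshold in step 3, at which point chunk quality is the first thing to inspect.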
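 ## Common pitfalls
 | Symptom | Cause | Fix |
 |---|---|---|
 | Inaccurate answers | Chunks too large / too small | Re-ingest with a different chunk size |
 | Fabricated answers | Prompt gives no permission to decline | Add "say you don't know if you don't know" to the prompt |
 | Misses what it should find | Semantic search does not hit | Try query rewriting / raise topK |
 | Answers too long | Prompt has no length limit | Add "answer in < 100 words" to the prompt |
 | Slow | claude_api timeout | Raise timeout_ms or reduce context |
 ## Advanced variants
 - **Multi-round query rewriting**: claude rewrites the question first → search → answer
 - **Mix sources**: KBDB + web search + DB query → merge
 - **Cache**: store answers to identical questions in KBDB so the next lookup hit returns directly (saves an LLM call)
 - **Conversational**: pass chat history into the prompt to support follow-ups
 - **Filter-then-rerank**: semantic search fetches 20 → claude reranks and keeps the top 5 → feed those into the final answer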
@@ -0,0 +1 @@
 ["cron", "watcher", "kbdb", "foreach", "trigger_workflow", "common-pattern"]
@@ -0,0 +1 @@
 ["cron", "digest", "fan-in", "multi-source", "llm", "telegram", "common-pattern"]
@@ -0,0 +1 @@
 ["cron", "gmail", "llm", "telegram", "digest", "automation", "common-pattern"]
@@ -0,0 +1 @@
 ["error-handling", "retry", "wait", "try-catch", "robustness", "advanced"]
@@ -0,0 +1 @@
 ["github", "webhook", "llm", "automation", "triage", "external-api"]
@@ -0,0 +1 @@
 ["llm", "claude", "classify", "structured-output", "kbdb", "common-pattern"]
@@ -0,0 +1 @@
 ["fan-out", "parallel", "trigger_workflow", "multi-step", "advanced"]
@@ -0,0 +1 @@
 ["pdf", "ingest", "kbdb", "rag-prep", "chunking", "knowledge-base"]
@@ -0,0 +1 @@
 ["rag", "llm", "claude", "kbdb", "semantic-search", "qa", "common-pattern"]
@@ -0,0 +1 @@
 ["webhook", "http", "starter", "forward", "bridge"]