Leo/Arcrun

Leo 388c193ae7 docs(registry): seed 10 examples + 5 skills (LI SDD M3.1 + M3.3)

Corresponds to .agents/specs/llm-interface/ Milestone 3.1 + 3.3.

registry/examples/ contains 10 workflow templates that can be pushed as-is:
  starter:    webhook-to-http
  common:     cron-watcher, llm-classify, rag-search-answer, daily-digest
  external:   email-summary (gmail+claude+telegram), pdf-to-blocks,
              github-issue-bot
  advanced:   parallel-fanout (trigger_workflow fan-out),
              error-retry (try_catch+wait pattern)

  Each contains: workflow.yaml (ready to push) + description.md (what problem
  it solves / how to adapt it / what you learn) + tags.json (for search)

registry/skills/ contains 5 AI playbooks (markdown):
  build_watcher_workflow            - cron + filter + trigger pattern
  debug_paused_workflow             - how to trace a paused claude_api callback
  migrate_http_to_trigger_workflow  - switch from self-fetch to trigger_workflow
  rag_with_arcrun                   - assemble RAG from KBDB + claude_api
  add_new_wasm_component            - full flow of writing in TinyGo and deploying

How the two differ:
  examples = YAML you can take and modify directly
  skills = how to think about problem X + which example to use

Next steps for both: CI will sync them automatically into KBDB
(type=workflow-example / type=agent-skill), and MCP arcrun_search_examples /
arcrun_list_skills will go through KBDB semantic search.
(CI sync is M3.4 work.)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-16 16:33:54 +08:00


Skill: Build Watcher Workflow

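When to use this skill

The user says something like:

"Every X minutes / hours, scan Y and process whatever matches the condition"
"Watch a data source and process new data automatically as it arrives"
"Periodically check X for anything new"

Core pattern

cron → list (fetch candidates) → filter (keep unprocessed) → for each → trigger the processing workflow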
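
The pattern above can be sketched in plain Python to make the data flow concrete. This is a minimal simulation, not Arcrun code: `run_watcher_tick`, `list_candidates`, `is_unprocessed`, and `trigger` are hypothetical stand-ins for one cron tick and for the list, filter, and trigger_workflow steps, and the record shape is assumed.

```python
from typing import Callable, Iterable

Record = dict  # hypothetical record shape: {"id": str, "tags": list[str]}


def run_watcher_tick(
    list_candidates: Callable[[], Iterable[Record]],
    is_unprocessed: Callable[[Record], bool],
    trigger: Callable[[Record], None],
) -> int:
    """One cron tick: list -> filter -> trigger once per remaining record."""
    triggered = 0
    for record in list_candidates():
        if is_unprocessed(record):
            trigger(record)
            triggered += 1
    return triggered


# Hypothetical in-memory data source standing in for KBDB or an external API.
records = [
    {"id": "a", "tags": []},
    {"id": "b", "tags": ["processed"]},
    {"id": "c", "tags": []},
]
sent: list[str] = []  # ids handed to the (simulated) processing workflow

count = run_watcher_tick(
    list_candidates=lambda: records,
    is_unprocessed=lambda r: "processed" not in r["tags"],
    trigger=lambda r: sent.append(r["id"]),
)
```

Keeping the three steps as separate callables mirrors how the watcher YAML separates its nodes, so each one can be swapped without touching the others.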

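5-step flow

1. Confirm the data source

Ask the user (or infer from context):

Where does the data live? KBDB / an external API / the file system?
Which field distinguishes processed from unprocessed records? Common choices:
- tag (whether tags_json contains "processed")
- status field (status: pending)
- missing metadata (e.g. no summary)
Do not rely on timestamps: if a cron run is missed, the records from that window are skipped permanently.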
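
To illustrate why a marker is safer than a time window, here is a small sketch under assumed data. The field names `created` and `tags`, the 5-minute interval, and both helper functions are hypothetical; the point is only that a record arriving before a missed tick drops out of every later time window but is still found by a marker check.

```python
from datetime import datetime, timedelta

# Hypothetical records; "created" and "tags" are illustrative field names.
records = [
    {"id": "r1", "created": datetime(2026, 5, 16, 10, 2), "tags": ["processed"]},
    {"id": "r2", "created": datetime(2026, 5, 16, 10, 7), "tags": []},
    {"id": "r3", "created": datetime(2026, 5, 16, 10, 12), "tags": []},
]


def by_time_window(now, interval=timedelta(minutes=5)):
    """Time-based selection: only records created within the last interval."""
    return [r["id"] for r in records if now - interval <= r["created"] < now]


def by_marker():
    """Marker-based selection: every record not yet tagged as processed."""
    return [r["id"] for r in records if "processed" not in r["tags"]]


# Suppose the 10:10 tick never ran and the next tick fires at 10:15.
now = datetime(2026, 5, 16, 10, 15)
time_based = by_time_window(now)  # r2 (10:07) is outside the 10:10-10:15 window
marker_based = by_marker()        # r2 is still untagged, so it is still selected
```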

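2. Find an example and adapt it

arcrun_search_examples('cron watcher') → matches the cron-watcher example. Copy the YAML and change four places:

watch_cron.cron_expr: change the frequency
list_unprocessed: change the query
filter_new.condition: change it to your definition of "unprocessed"
trigger_processor.workflow_name: change it to the name of your processing workflow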

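3. The processing workflow must be idempotent

The watcher may re-run (a catch-up run after a missed cron tick, or a manual trigger). The processing workflow must:

Check in its first step whether this record has already been processed
Or mark the record as processed in its final step (add a tag / change the status)
Fail gracefully (log telemetry instead of crashing)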
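
The three requirements above can be sketched as a single handler. This is an illustrative simulation, not the Arcrun API: `processed_ids` stands in for a tag or status field, `telemetry` for a telemetry sink, and `do_work` is a hypothetical processing step. The sketch combines the first-step check with the final-step mark, although the text allows either one.

```python
processed_ids: set[str] = set()  # stands in for a "processed" tag or status field
telemetry: list[str] = []        # stands in for a telemetry sink
work_log: list[str] = []         # records which ids were actually processed


def do_work(record_id: str) -> None:
    """Hypothetical processing step; fails for one id to show the error path."""
    if record_id == "bad":
        raise ValueError("simulated failure")
    work_log.append(record_id)


def process(record_id: str) -> str:
    # First step: skip records that were already handled.
    if record_id in processed_ids:
        return "skipped"
    try:
        do_work(record_id)
    except Exception as exc:
        # Fail gracefully: log and return. The record stays unmarked,
        # so a later watcher run can pick it up again.
        telemetry.append(f"{record_id}: {exc}")
        return "failed"
    # Final step: mark as processed only after the work succeeded.
    processed_ids.add(record_id)
    return "done"


results = [process("x"), process("x"), process("bad"), process("bad")]
```

Marking only after success means a failed record is retried on the next run, while a successful one is never processed twice.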

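4. Always use `trigger_workflow`, never an `http_request` to yourself

This is the #1 pitfall. When cypher-executor uses http_request to call its own cypher.arcrun.dev or arcrun-cypher-executor.*.workers.dev, the request is blocked by CF self-fetch protection (error 1042 / 522).

Use the built-in trigger_workflow component instead: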

trigger_processor:
  component: trigger_workflow
  workflow_name: "your_processor"
  api_key: "{{api_key}}"
  input:
    api_key: "{{api_key}}"
    block_id: "{{item.id}}"

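5. Deploy + verify

arcrun_validate_yaml(yaml) → arcrun_push_workflow(yaml) → wait 5 min → arcrun_list_recent_executions

After the first cron tick completes, check the executions list to confirm the workflow is running; if nothing shows up, check arcrun_list_paused_executions for stuck executions.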

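Common pitfalls

Symptom	Cause	Fix
Watcher runs but processes the same N records every time	Records are never marked as processed	Add a tag / status change in the final step of the processing workflow
Watcher runs but processes nothing	filter condition is wrong	It passes acr validate but the logic is wrong; trigger it manually once with curl and inspect the trace
Processing workflow stays paused forever	claude_api callback never returned	Run a health check on the mira daemon; a normal callback returns in 30-60 seconds
High volume overwhelms the worker	Too many triggers in one run	Add a limit to list_unprocessed and spread the work across multiple cron runs
cron never fires	First node is not a cron component	scheduled() only recognizes a cron component as the first node; confirm the first line of the YAML flow is `cron_node >> X`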
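
The "add a limit" fix in the table only works together with marking records as processed. The sketch below simulates that combination; `LIMIT`, `backlog`, and `tick` are hypothetical names and values chosen for illustration, not Arcrun settings.

```python
import math

backlog = [f"rec-{i}" for i in range(23)]  # hypothetical unprocessed records
processed: set[str] = set()                # stands in for the "processed" marker
LIMIT = 10                                 # hypothetical per-tick cap on list_unprocessed


def tick() -> int:
    """One cron run: take at most LIMIT unprocessed records and mark them."""
    batch = [r for r in backlog if r not in processed][:LIMIT]
    processed.update(batch)
    return len(batch)


batch_sizes = []
while len(processed) < len(backlog):
    batch_sizes.append(tick())

# With marking in place, the backlog drains in ceil(backlog / LIMIT) ticks.
ticks_needed = math.ceil(len(backlog) / LIMIT)
```

Without the marking step, every tick would select the same first `LIMIT` records, which is the first symptom in the table.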

Real-world case

mira_feed_watcher.yaml (polaris/mira/arcrun/) uses this pattern in production:

cron */5 * * * * scans the posts in leo's feed
filter tags_json eq "[]" selects unprocessed posts
trigger_workflow triggers wiki_synthesis
wiki_synthesis adds the wiki-processed tag in its final step to ensure idempotency

See the mira repo for the full YAML.
