feat(ingest): implement the T0.5–T5 pure-feeder pipeline (issue #2)

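Full ingest pipeline (harvest first, extract as fallback, cross-repo weaving, POST envelope):
- T0.5 skeleton: Hono + zod-openapi, with no D1/Vectorize/AI bindings (hard rule: never touch storage)
- T1 SourceAdapter: pulls from the GitHub runtime API, computes a per-file sha256 content-hash, and serves as the receiving end for /refresh
- T2 harvest (path A, preferred): harvests cards from harvest template 1.8.0+ (gloss/entity/typed-edge)
- T3 extract (path B, fallback): LlmCaller with a selectable model + JSON-fail escalation gate + hard self-check guard for endpoint alignment; the first version does not embed (it only tags)
- T4 cross-repo weaving (primary job): aggregates multiple repos → detects cross-repo bridges and dissenting views; does not compute bridge_score (that belongs to graph)
- T5 output: strict buildEnvelope + explicit self-check for forbidden fields; graph-client is POST-only (cherry-picked from _kbdb_client.py and changed so it does not touch base); thin ops CLI (without the query MCP)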
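
A minimal sketch of how a JSON-fail escalation gate like the one named in T3 might behave. The commit does not show the real LlmCaller interface, so the `LlmCaller` signature, the model names, and `extractWithEscalation` below are all hypothetical, and the caller is synchronous only to keep the example short (a real one would presumably be async).

```typescript
// Hypothetical caller type: takes a model name and a prompt, returns raw text.
type LlmCaller = (model: string, prompt: string) => string;

interface ExtractResult {
  model: string;    // the model whose output parsed as JSON
  attempts: number; // how many models were tried, including the successful one
  value: unknown;   // the parsed JSON payload
}

// Try each model in order; move to the next one only when the output
// cannot be parsed as JSON. Returns null when every model fails.
function extractWithEscalation(
  call: LlmCaller,
  models: string[],
  prompt: string,
): ExtractResult | null {
  let attempts = 0;
  for (const model of models) {
    attempts += 1;
    const raw = call(model, prompt);
    try {
      return { model, attempts, value: JSON.parse(raw) };
    } catch {
      // JSON-fail: fall through to the next model in the list.
    }
  }
  return null;
}

// Stub caller for illustration: "small" returns broken JSON, anything else succeeds.
const stubCaller: LlmCaller = (model) =>
  model === "small" ? "{not json" : '{"triples":[]}';

const result = extractWithEscalation(stubCaller, ["small", "large"], "extract triples");
```

With the stub above, the first model fails to produce parseable JSON and the gate escalates once, so `result` comes from the second model.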

The envelope is aligned with the full contract (embed/id/aliases/predicate_embed); the contract's vectorization fields are promoted in sync.
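
A rough sketch of a strict envelope build with an explicit forbidden-field self-check, in the spirit of T5. Only the four field names (embed, id, aliases, predicate_embed) and bridge_score come from this commit message; treating them as top-level keys, the allowed/forbidden split, and the function name `buildEnvelopeSketch` are assumptions made for illustration, not the actual buildEnvelope.

```typescript
// Assumed allow-list, using the contract field names mentioned in the commit message.
const ALLOWED_FIELDS = new Set<string>(["id", "aliases", "embed", "predicate_embed"]);

// Assumed deny-list: bridge_score is described as graph-owned, so the feeder must not send it.
const FORBIDDEN_FIELDS = new Set<string>(["bridge_score"]);

// Hypothetical strict builder: rejects forbidden fields first, then any unknown field.
function buildEnvelopeSketch(candidate: Record<string, unknown>): Record<string, unknown> {
  const keys = Object.keys(candidate);

  const forbidden = keys.filter((k) => FORBIDDEN_FIELDS.has(k));
  if (forbidden.length > 0) {
    throw new Error("forbidden fields: " + forbidden.join(", "));
  }

  const unknownKeys = keys.filter((k) => !ALLOWED_FIELDS.has(k));
  if (unknownKeys.length > 0) {
    throw new Error("unknown fields: " + unknownKeys.join(", "));
  }

  // Return a shallow copy so later mutation of the input cannot alter the envelope.
  return { ...candidate };
}

const envelope = buildEnvelopeSketch({ id: "n1", aliases: ["alias-1"] });
```

Checking the deny-list separately from the allow-list is redundant in a strict builder, but it gives a more specific error when a graph-owned field leaks into the feeder's output.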

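gate: vitest 28 passed / tsc clean / wrangler dry-run clean (env-var bindings only).
End-to-end ingest→graph: the graph receiver has been aligned → pending ingest deployment + GRAPH_BASE_URL → pending verification after deployment; not falsely reported as green.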

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 20:40:53 +08:00
parent dffefdcdc2
commit 16ad1cb208
24 changed files with 4003 additions and 28 deletions
@@ -0,0 +1,25 @@
{
  "name": "kbdb-ingest-plugin",
  "version": "0.1.0",
  "private": true,
  "description": "KBDB-ingest plugin: a pure feeder. Pulls from GitHub, harvests/extracts triple candidates, weaves across repos, then POSTs an envelope to kbdb-graph-plugin. Never touches storage.",
  "type": "module",
  "scripts": {
    "dev": "wrangler dev",
    "deploy": "wrangler deploy",
    "test": "vitest run",
    "test:watch": "vitest",
    "ingest": "node scripts/ingest-cli.mjs"
  },
  "dependencies": {
    "@hono/zod-openapi": "^1.2.4",
    "hono": "^4.7.0",
    "zod": "^4.3.6"
  },
  "devDependencies": {
    "@cloudflare/workers-types": "^4.20250219.0",
    "typescript": "^5.7.0",
    "vitest": "^3.1.0",
    "wrangler": "^4.0.0"
  }
}