English {#english}
Zuplo AI Metering Policy (RAG Quota + Usage → D1 / Stripe)
Purpose: Check quota before RAG query (Inbound); record token usage after call (Outbound).
References: rag-ai-edge-architecture-plan.md §3, rag-ai-phase1-implementation-plan.md §2.5.
Flow
| Step | Owner | Action |
|---|---|---|
| Inbound | Zuplo Custom Policy | API Key → tenant_id (Consumer metadata). Call Control Plane GET /internal/ai-quota?tenant_id=.... If remaining_daily <= 0 return 429. |
| Backend | RAG Query Worker | Include response header x-ai-usage-tokens. |
| Outbound | Zuplo Custom Policy | After next(), read x-ai-usage-tokens from response. Async (waitUntil etc.) call Control Plane POST /internal/ai-usage → D1 upsert + Stripe usage_records (idempotency_key). |
Control Plane endpoints (already implemented)
- GET /internal/ai-quota?tenant_id=
  - Auth: Authorization: Bearer <INTERNAL_API_KEY>
  - Response: { remaining_daily, remaining_monthly }
  - Zuplo Inbound returns 429 when remaining_daily <= 0.
- POST /internal/ai-usage
  - Auth: same as above.
  - Body: { "tenant_id": "...", "tokens_used": 123, "idempotency_key": "optional" }
  - Effect: updates D1 tenant_ai_usage_daily and, when the tenant has a stripe_subscription_item_id, Stripe subscription_items/.../usage_records.
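The two endpoint contracts above can be written down as types plus small request builders. This is a minimal sketch, not the Control Plane's actual client: `InternalRequest`, `buildQuotaRequest`, `buildUsageRequest` and `isQuotaResponse` are hypothetical names introduced here for illustration, and the numeric type of the quota fields is an assumption.

```typescript
// Payload shapes taken from the endpoint descriptions above.
// Assumption: both quota fields are plain numbers.
interface QuotaResponse {
  remaining_daily: number;
  remaining_monthly: number;
}

interface UsageBody {
  tenant_id: string;
  tokens_used: number;
  idempotency_key?: string; // documented as optional
}

// Hypothetical helper type: a plain description of an HTTP request.
interface InternalRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

// Build the GET /internal/ai-quota request for one tenant.
function buildQuotaRequest(baseUrl: string, apiKey: string, tenantId: string): InternalRequest {
  return {
    url: `${baseUrl}/internal/ai-quota?tenant_id=${encodeURIComponent(tenantId)}`,
    method: "GET",
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}

// Build the POST /internal/ai-usage request with a JSON body.
function buildUsageRequest(baseUrl: string, apiKey: string, usage: UsageBody): InternalRequest {
  return {
    url: `${baseUrl}/internal/ai-usage`,
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" },
    body: JSON.stringify(usage),
  };
}

// Runtime check that an unknown JSON value has the documented quota shape.
function isQuotaResponse(value: unknown): value is QuotaResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v["remaining_daily"] === "number" && typeof v["remaining_monthly"] === "number";
}
```

Keeping the builders free of any `fetch` call makes them easy to check in isolation; the policy can pass the resulting `url`, `method`, `headers` and `body` to whatever HTTP client the runtime provides.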
Zuplo policy example (pseudocode)
Implement the logic below as a Custom Policy in the Zuplo Dashboard, adjusting the syntax to the actual Zuplo Policy API and environment variables.
Inbound — Quota check: Resolve tenant_id from request.user?.data?.tenantId or Consumer metadata; 401 if missing. Fetch Control Plane /internal/ai-quota?tenant_id=...; if remaining_daily <= 0 return 429 JSON; else return next().
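The inbound decision itself can be isolated as a pure function, separate from the HTTP call. The sketch below is illustrative only: `InboundDecision` and `decideInbound` are hypothetical names, and returning the 401 as a JSON body is an assumption (the text above only specifies JSON for the 429).

```typescript
// Outcome of the inbound check: either let the request through or reject it.
type InboundDecision =
  | { action: "pass" }
  | { action: "reject"; status: 401 | 429; body: string };

// Decide what the inbound policy should do, given the resolved tenant
// and the remaining_daily value returned by the Control Plane.
function decideInbound(tenantId: string | undefined, remainingDaily: number): InboundDecision {
  if (!tenantId) {
    // No tenant could be resolved from the API key: reject with 401.
    return { action: "reject", status: 401, body: JSON.stringify({ error: "Missing tenant" }) };
  }
  if (remainingDaily <= 0) {
    // Daily quota exhausted: reject with 429.
    return { action: "reject", status: 429, body: JSON.stringify({ error: "AI quota exceeded" }) };
  }
  return { action: "pass" };
}
```

With this split, the policy only has to translate a `reject` into a response and a `pass` into `next()`.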
Outbound — Usage recording: response = await next(). Read x-ai-usage-tokens from response; if tokensUsed > 0 and tenantId, call POST /internal/ai-usage asynchronously (waitUntil) with tenant_id, tokens_used, idempotency_key. Return response.
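The header parsing and the "record only when tokensUsed > 0 and tenantId" rule can be sketched as follows. `parseUsageTokens`, `buildUsageRecord` and the `requestId` parameter are hypothetical, and the idempotency key format (tenant plus request ID, with no timestamp, so that a retry of the same request reuses the key) is an assumption rather than the Control Plane's requirement.

```typescript
// Parse the x-ai-usage-tokens header value.
// A missing, non-numeric, zero or negative value counts as 0 tokens.
function parseUsageTokens(header: string | null): number {
  if (header === null) return 0;
  const n = Number.parseInt(header, 10);
  return Number.isFinite(n) && n > 0 ? n : 0;
}

interface UsageRecord {
  tenant_id: string;
  tokens_used: number;
  idempotency_key: string;
}

// Return the body to POST to /internal/ai-usage, or null when there is
// nothing to record (no tenant, or no tokens reported by the backend).
function buildUsageRecord(
  tenantId: string | undefined,
  header: string | null,
  requestId: string
): UsageRecord | null {
  const tokensUsed = parseUsageTokens(header);
  if (!tenantId || tokensUsed <= 0) return null;
  return {
    tenant_id: tenantId,
    tokens_used: tokensUsed,
    // Assumed key format: stable for a given tenant and request.
    idempotency_key: `${tenantId}_${requestId}`,
  };
}
```

The outbound policy would then hand a non-null record to the asynchronous POST and return the response unchanged either way.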
Environment variables (Zuplo)
| Name | Description |
|---|---|
| CONTROL_PLANE_URL | Control Plane Worker URL (e.g. https://prego-control-plane.workers.dev) |
| INTERNAL_API_KEY | Bearer token for Control Plane /internal/* (same as Control Plane INTERNAL_API_KEY) |
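A policy may want to fail fast when either variable is missing. The following is a small sketch under the assumption that the variables arrive as a plain string map; `MeteringConfig` and `readConfig` are hypothetical names, and how Zuplo actually exposes environment variables to a policy is not shown here.

```typescript
interface MeteringConfig {
  controlPlaneUrl: string;
  internalApiKey: string;
}

// Read and validate the two variables from a plain key/value map.
function readConfig(env: Record<string, string | undefined>): MeteringConfig {
  const controlPlaneUrl = env["CONTROL_PLANE_URL"];
  const internalApiKey = env["INTERNAL_API_KEY"];
  if (!controlPlaneUrl) throw new Error("CONTROL_PLANE_URL is not set");
  if (!internalApiKey) throw new Error("INTERNAL_API_KEY is not set");
  // Strip trailing slashes so `${controlPlaneUrl}/internal/...` never contains "//".
  return { controlPlaneUrl: controlPlaneUrl.replace(/\/+$/, ""), internalApiKey };
}
```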
Consumer (API Key) metadata must include tenantId for Inbound/Outbound. Zuplo Sync already registers with metadata: { tenantId }.
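Resolving the tenant from the authenticated consumer could look like the sketch below. The `PolicyUser` shape is an assumption based on the `request.user?.data?.tenantId` path mentioned above plus a `metadata` fallback; `resolveTenantId` is a hypothetical helper, not part of the Zuplo API.

```typescript
// Assumed shape of the authenticated user object exposed to the policy.
interface PolicyUser {
  data?: { tenantId?: unknown };
  metadata?: { tenantId?: unknown };
}

// Return the tenant ID as a non-empty string, or undefined when it is
// absent or not a string. The metadata field is consulted only when
// data.tenantId is null or undefined.
function resolveTenantId(user: PolicyUser | undefined): string | undefined {
  const candidate = user?.data?.tenantId ?? user?.metadata?.tenantId;
  return typeof candidate === "string" && candidate.length > 0 ? candidate : undefined;
}
```

Both policies can share this helper, so the Inbound check and the Outbound recording always agree on which tenant a request belongs to.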
한국어 {#korean}
Zuplo AI Metering Policy (RAG Quota + Usage → D1 / Stripe)
Purpose: Check quota before the RAG Query call (Inbound); record token usage after the call (Outbound).
References: rag-ai-edge-architecture-plan.md §3, rag-ai-phase1-implementation-plan.md §2.5.
Flow
| Step | Owner | Action |
|---|---|---|
| Inbound | Zuplo Custom Policy | API Key → tenant_id (Consumer metadata). Call Control Plane GET /internal/ai-quota?tenant_id=.... If remaining_daily <= 0, return 429. |
| Backend | RAG Query Worker | Include the x-ai-usage-tokens response header. |
| Outbound | Zuplo Custom Policy | After next(), read x-ai-usage-tokens from the response. Asynchronously (waitUntil etc.) call Control Plane POST /internal/ai-usage → D1 accumulation + Stripe usage_records (idempotency_key). |
Control Plane endpoints (already implemented)
- GET /internal/ai-quota?tenant_id=
  - Auth: Authorization: Bearer <INTERNAL_API_KEY>
  - Response: { remaining_daily, remaining_monthly }
  - Zuplo Inbound returns 429 when remaining_daily <= 0.
- POST /internal/ai-usage
  - Auth: Authorization: Bearer <INTERNAL_API_KEY>
  - Body: { "tenant_id": "...", "tokens_used": 123, "idempotency_key": "optional" }
  - Effect: upserts D1 tenant_ai_usage_daily and writes Stripe subscription_items/.../usage_records (when the tenant has a stripe_subscription_item_id).
Zuplo policy example (pseudocode)
Implement the logic below as a Custom Policy in the Zuplo Dashboard.
(Adjust the actual syntax to the Zuplo Policy API and environment variables.)
Inbound: quota check
```ts
// Policy body (pseudocode): request, next, CONTROL_PLANE_URL and
// INTERNAL_API_KEY are assumed to be supplied by the policy runtime.

// 1. Resolve tenant_id from request.user?.data?.tenantId or Consumer metadata.
const tenantId = request.user?.data?.tenantId ?? request.user?.metadata?.tenantId;
if (!tenantId) return new Response("Missing tenant", { status: 401 });

// 2. Ask the Control Plane for the remaining quota.
const res = await fetch(
  `${CONTROL_PLANE_URL}/internal/ai-quota?tenant_id=${encodeURIComponent(tenantId)}`,
  { headers: { Authorization: `Bearer ${INTERNAL_API_KEY}` } }
);
const { remaining_daily } = await res.json();

// 3. Return 429 when the daily quota is exhausted.
if (remaining_daily <= 0) {
  return new Response(JSON.stringify({ error: "AI quota exceeded" }), {
    status: 429,
    headers: { "Content-Type": "application/json" },
  });
}

return next();
```
Outbound: usage recording
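```ts
// Policy body (pseudocode): request, next, waitUntil, requestId and the
// two environment values are assumed to be supplied by the policy runtime.
const response = await next();

// Resolve the tenant the same way as in the Inbound policy.
const tenantId = request.user?.data?.tenantId ?? request.user?.metadata?.tenantId;

// Read x-ai-usage-tokens (only present when the backend is the RAG Query Worker).
const tokensHeader = response.headers.get("x-ai-usage-tokens");
const tokensUsed = tokensHeader ? parseInt(tokensHeader, 10) : 0;

if (tokensUsed > 0 && tenantId) {
  // Record asynchronously (waitUntil or the Zuplo equivalent).
  waitUntil(
    fetch(`${CONTROL_PLANE_URL}/internal/ai-usage`, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${INTERNAL_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        tenant_id: tenantId,
        tokens_used: tokensUsed,
        // No timestamp in the key, so a retry of the same request reuses it.
        idempotency_key: `${tenantId}_${requestId}`,
      }),
    })
  );
}

return response;
```
Environment variables (Zuplo)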
| Name | Description |
|---|---|
| CONTROL_PLANE_URL | Control Plane Worker URL (e.g. https://prego-control-plane.workers.dev) |
| INTERNAL_API_KEY | Bearer token for Control Plane /internal/* (same as the Control Plane's INTERNAL_API_KEY) |
The Consumer (API Key) metadata must store tenantId so that the Inbound and Outbound policies can identify the tenant. Zuplo Sync already registers Consumers with metadata: { tenantId }.