English {#english}

PREGO v4.0 Operations Runbook Summary

For full details, see §14 (Runbook) of the implementation plan. The sections below are a quick reference.

Frequently used workflows

| Situation | Action |
| --- | --- |
| Pulumi Up (per region) | Actions → Pulumi Up → Run workflow, set region: sg / us / eu. Or push infra changes to main. |
| Pulumi Destroy | Actions → Pulumi Destroy → enter tenant_id, region, confirm: DESTROY. Production destroy requires approval. |
| Pulumi Preview (PR) | Runs automatically on PRs that change prego-pulumi/**. Matrix: sg, us, eu. |
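
The "Run workflow" button for Pulumi Up can also be driven through GitHub's REST endpoint for workflow_dispatch events. The sketch below only builds the request and does not send it. The workflow file name `pulumi-up.yml`, the `owner`/`repo` values, and the assumption that the workflow accepts an input named `region` are all hypothetical; check the actual workflow definition before relying on any of them.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"
ALLOWED_REGIONS = ("sg", "us", "eu")  # regions listed in the table above


def build_pulumi_up_dispatch(owner, repo, region, token,
                             workflow_file="pulumi-up.yml", ref="main"):
    """Build (but do not send) a workflow_dispatch request for one region.

    workflow_file and the "region" input name are assumptions, not taken
    from the runbook.
    """
    if region not in ALLOWED_REGIONS:
        raise ValueError(f"unknown region: {region!r}")
    url = (f"{GITHUB_API}/repos/{owner}/{repo}"
           f"/actions/workflows/{workflow_file}/dispatches")
    body = json.dumps({"ref": ref, "inputs": {"region": region}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Rejecting unknown regions up front keeps a typo from reaching the workflow.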

When something fails

| Failure | Where to check |
| --- | --- |
| Pulumi Up failed | Actions logs, Pulumi Cloud lock/state. If needed, run pulumi refresh / pulumi up manually (same stack). |
| Provision failed | Control Plane → provision_jobs, trace_id → GET /trace/:trace_id for LogPath. Check Slack alerts. |
| Stripe Webhook duplicate | Ignored idempotently via provider_events. Subscription stays Active during 7-day grace. |
| Manual Hard Purge | Not recommended for tenants < 30 days. If required: workers/purge-job POST /purge or wait for Cron. Order: §5.8 (Hetzner → R2 → D1, keep audit_logs). |
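
To illustrate what "ignored idempotently via provider_events" can look like, here is a minimal sketch that uses an in-memory SQLite table as a stand-in for the real store. The column names (`provider`, `event_id`), the composite primary key, and the helper names are assumptions made for illustration; the actual schema and handler are not described in this summary.

```python
import sqlite3


def open_store():
    """Create an in-memory stand-in for the provider_events table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE provider_events ("
        " provider TEXT NOT NULL,"
        " event_id TEXT NOT NULL,"
        " PRIMARY KEY (provider, event_id))"  # hypothetical schema
    )
    return conn


def record_event_once(conn, provider, event_id):
    """Return True if the event is new, False if it was already recorded."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO provider_events (provider, event_id)"
        " VALUES (?, ?)",
        (provider, event_id),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted, 0 when the key already existed.
    return cur.rowcount == 1


def handle_webhook(conn, provider, event_id, apply_change):
    """Apply the change only the first time a given event is seen."""
    if not record_event_once(conn, provider, event_id):
        return "ignored-duplicate"
    apply_change()
    return "processed"
```

This sketch records the event before applying the change, so a crash between the two steps would leave the event marked as seen. A real handler might wrap both steps in one transaction.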

Secrets and environment

  • List and rotation: see GitHub secrets. Recommended rotation intervals: HCLOUD_* every 90 days, PULUMI every 180 days, CLOUDFLARE every 90 days.
  • After rotating a secret: manually run, once, any workflow that uses it to verify that the new value works.
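
The recommended intervals above can be turned into a simple overdue check. The following is a sketch under stated assumptions: secrets are matched by name prefix, and the concrete secret names in the usage example (`HCLOUD_TOKEN` and so on) as well as the source of the last-rotated dates are hypothetical.

```python
from datetime import date, timedelta

# Recommended intervals from the list above, in days, keyed by name prefix.
ROTATION_DAYS = {"HCLOUD_": 90, "PULUMI": 180, "CLOUDFLARE": 90}


def rotation_interval(secret_name):
    """Return the recommended interval for a secret, or None if unlisted."""
    for prefix, days in ROTATION_DAYS.items():
        if secret_name.startswith(prefix):
            return timedelta(days=days)
    return None


def overdue_secrets(last_rotated, today):
    """List secrets whose age strictly exceeds their recommended interval.

    last_rotated maps secret name -> date of last rotation.
    """
    overdue = []
    for name, rotated_on in sorted(last_rotated.items()):
        interval = rotation_interval(name)
        if interval is not None and today - rotated_on > interval:
            overdue.append(name)
    return overdue


if __name__ == "__main__":
    example = {
        "HCLOUD_TOKEN": date(2024, 1, 1),          # hypothetical secret name
        "PULUMI_ACCESS_TOKEN": date(2024, 1, 1),   # hypothetical secret name
        "CLOUDFLARE_API_TOKEN": date(2024, 3, 1),  # hypothetical secret name
    }
    print(overdue_secrets(example, today=date(2024, 4, 15)))
```

A secret rotated exactly 90 days ago is not yet reported; it becomes overdue on day 91.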

Manual worker triggers

| Worker | Manual run |
| --- | --- |
| Usage Aggregator | POST /aggregate |
| Cycle Close | POST /cycle-close |
| Purge Job | POST /purge |
| Autoscaler | POST /run |

(Look up each Worker's deployed URL in Wrangler or the Cloudflare dashboard.)
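
As a rough sketch of a manual trigger, the helper below maps each worker to its path from the table and builds a POST request against a base URL that you supply. The dictionary keys other than `purge-job` are hypothetical labels, the example base URL is a placeholder, and no authentication header is shown because this summary does not say what the workers expect.

```python
import urllib.request

# Paths come from the table above; the keys are illustrative labels.
WORKER_PATHS = {
    "usage-aggregator": "/aggregate",
    "cycle-close": "/cycle-close",
    "purge-job": "/purge",
    "autoscaler": "/run",
}


def build_trigger(worker, base_url):
    """Build (but do not send) the manual-run POST request for one worker.

    base_url is the worker's deployed URL, looked up separately.
    Raises KeyError for a worker that is not in WORKER_PATHS.
    """
    path = WORKER_PATHS[worker]
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=b"",        # empty body; the expected payload is not documented here
        method="POST",
    )
```

For example, `build_trigger("purge-job", "https://purge-job.example.workers.dev")` yields a request for `https://purge-job.example.workers.dev/purge`, which `urllib.request.urlopen` would then send.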

Reference


flowchart LR
  subgraph Triggers
    A[Pulumi Up]
    B[Pulumi Destroy]
    C[Provision]
    D[Workers]
  end
  subgraph On failure
    E[Actions / Pulumi Cloud]
    F[Control Plane / trace_id]
    G[Slack]
  end
  A --> E
  B --> E
  C --> F
  C --> G
  D --> D

Korean {#korean}

PREGO v4.0 Operations Runbook Summary

For details, see §14 Runbook in the implementation plan. Below is a quick reference.

Frequently used workflows

| Situation | Action |
| --- | --- |
| Pulumi Up (per region) | Actions → Pulumi Up → Run workflow, region: sg / us / eu. Or push infra changes to main. |
| Pulumi Destroy | Actions → Pulumi Destroy → enter tenant_id, region, confirm: DESTROY. production-destroy approval required. |
| Pulumi Preview (PR) | Runs automatically on PRs that change prego-pulumi/**. Matrix: sg, us, eu. |

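What to check on failure

| Failure | Where to check |
| --- | --- |
| Pulumi Up failed | Actions logs, Pulumi Cloud lock/state. If needed, run pulumi refresh / pulumi up manually (same stack). |
| Provision failed | Control Plane → provision_jobs, trace_id → GET /trace/:trace_id for LogPath. Check Slack alerts. |
| Stripe Webhook duplicate | Ignored idempotently via provider_events. Stays Active during the 7-day grace period. |
| Manual Hard Purge | Not recommended under 30 days. In exceptional cases: workers/purge-job POST /purge or wait for Cron. Order: §5.8 (Hetzner → R2 → D1, keep audit_logs). |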

Secrets and environment

  • List and rotation: github-secrets. Recommended: HCLOUD_* 90 days, PULUMI 180 days, CLOUDFLARE 90 days.
  • After rotation: verify by manually running, once, a workflow that uses the secret.

Manual worker triggers

| Worker | Manual run |
| --- | --- |
| Usage Aggregator | POST /aggregate |
| Cycle Close | POST /cycle-close |
| Purge Job | POST /purge |
| Autoscaler | POST /run |

(Check each Worker's deployed URL in Wrangler or the Cloudflare dashboard.)

Reference
