Stripe credits system for usage-based AI SaaS billing

Flat-rate 'unlimited AI' marketing rarely survives finance review — token costs are real, customers churn when bills exceed expectations, and 'unlimited' becomes 'we have to throttle you'. A credit economy turns that operational risk into a product surface: customers buy a known credit balance, the product tells them what each action costs, and overages convert cleanly into upgrade prompts.

Why credits beat seats or flat tiers for AI products

Seat-based billing assumes roughly uniform usage per user, but AI usage is heavily skewed: a single power user can generate the large majority of a workspace's tokens. Flat tiers either underprice power users or overprice casual ones. Credits price the unit of value (a chat turn, a document upload, an image generation) directly, so customers pay in proportion to what they consume.

The implementation cost is non-trivial: you need accurate metering, idempotent debits, balance reset at billing cycle boundaries, and graceful degradation when credits run out. SaaSForge AI ships these primitives so the metering surface doesn't become product debt.

Tokens to credits: the conversion rate

The simplest credit model is a linear conversion from tokens used to credits debited. SaaSForge AI defaults to a rate that covers OpenAI/Anthropic API costs plus a margin, configurable per workspace. Embedding tokens debit at a lower rate than completion tokens because their underlying cost is lower.

Multi-modal actions (image generation, audio transcription) get their own per-unit credit costs. The conversion table lives in config, not hard-coded — so model price changes don't require a code release.
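
A config-driven conversion table could look like the sketch below. The model keys, the rates, and the exact shape of `computeCreditCost` are illustrative assumptions, not SaaSForge AI's shipped defaults; the point is only that prices live in data and the cost function reads them.

```typescript
// Hypothetical conversion table: credits charged per 1,000 tokens.
// Every number here is a placeholder, not a real price.
type TokenRate = { inputPer1k: number; outputPer1k: number };

interface CreditConfig {
  tokenRates: Record<string, TokenRate>;
  // Flat per-unit costs for multi-modal actions.
  unitCosts: Record<string, number>;
}

const creditConfig: CreditConfig = {
  tokenRates: {
    "chat-large": { inputPer1k: 2, outputPer1k: 10 },
    "embedding-small": { inputPer1k: 0.1, outputPer1k: 0 },
  },
  unitCosts: { "image-generation": 25, "audio-minute": 3 },
};

function computeCreditCost(
  args: { model: string; inputTokens: number; outputTokens: number },
  config: CreditConfig = creditConfig,
): number {
  const rate = config.tokenRates[args.model];
  if (!rate) {
    throw new Error(`No credit rate configured for model "${args.model}"`);
  }
  const raw =
    (args.inputTokens / 1000) * rate.inputPer1k +
    (args.outputTokens / 1000) * rate.outputPer1k;
  // Assumption: credits are whole numbers, so any non-zero usage
  // rounds up to at least one credit.
  return Math.ceil(raw);
}
```

Changing a model's price then means editing `creditConfig` (or wherever the table is loaded from), with no change to the function that computes the debit.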

Debiting credits after a chat turn
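// `usage` is the token usage reported by the model API for this turn;
// `workspaceId` and `turnId` come from the surrounding request handler.
const cost = computeCreditCost({
  inputTokens: usage.input_tokens,
  outputTokens: usage.output_tokens,
  model: "claude-opus-4-7",
});
await debitCredits({
  workspaceId,
  amount: cost,
  reason: "chat",
  // Idempotency key prevents double-debits on retries
  idempotencyKey: `chat:${turnId}`,
});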

Stripe subscription as the reset cycle

The credit balance is tied to a Stripe customer's subscription. When Stripe sends the `invoice.payment_succeeded` event for a renewal invoice (monthly or annual), the balance resets to the plan's credit allotment. SaaSForge AI handles the webhook with idempotency tracking: Stripe redelivers an event when the endpoint times out or returns an error, and a naive reset handler will then credit the customer twice.
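
The idempotent reset can be sketched as follows. This is a minimal in-memory illustration, not the shipped handler: the `processedEvents`, `balances`, and `planAllotment` stores stand in for database tables, and filtering on the invoice's `billing_reason` is an assumption about how renewal invoices are told apart from proration invoices. A real handler would also verify the webhook signature before trusting the payload.

```typescript
// Minimal local shape of the Stripe event fields this sketch reads.
interface InvoiceEvent {
  id: string; // Stripe event id
  type: string;
  data: { object: { customer: string; billing_reason: string | null } };
}

// In-memory stand-ins for database tables (hypothetical names).
const processedEvents = new Set<string>();
const balances = new Map<string, number>();
const planAllotment = new Map<string, number>(); // customer id -> credits per cycle

function handleInvoiceEvent(
  event: InvoiceEvent,
): "reset" | "duplicate" | "ignored" {
  if (event.type !== "invoice.payment_succeeded") return "ignored";

  const invoice = event.data.object;
  // Assumption: only the first invoice and cycle renewals grant a fresh
  // allotment; invoices created by mid-cycle plan changes do not.
  const grantsAllotment =
    invoice.billing_reason === "subscription_cycle" ||
    invoice.billing_reason === "subscription_create";
  if (!grantsAllotment) return "ignored";

  // A redelivered event carries the same event id, so it becomes a no-op.
  if (processedEvents.has(event.id)) return "duplicate";

  const allotment = planAllotment.get(invoice.customer);
  if (allotment === undefined) return "ignored";

  // In a database, these two writes belong in one transaction.
  balances.set(invoice.customer, allotment);
  processedEvents.add(event.id);
  return "reset";
}
```

Because the event id is recorded alongside the reset, a redelivery that arrives after the customer has spent credits leaves the spent balance untouched.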

Subscription cancellations, downgrades, and upgrades each have well-defined credit semantics: cancel keeps current balance until period end, downgrade prorates against the next cycle, upgrade tops up immediately. The webhook lifecycle handler ships with these branches documented.
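
The three branches can be written as one pure function. This is a sketch under stated assumptions: the `CreditState` shape and `applyPlanChange` name are hypothetical, the upgrade top-up is taken to be the full difference between the two allotments, and the downgrade branch only defers the smaller allotment to the next reset because the exact proration formula is not spelled out here.

```typescript
type PlanChange = "cancel" | "downgrade" | "upgrade";

interface CreditState {
  balance: number; // credits spendable right now
  nextCycleAllotment: number; // credits the next reset will grant
}

function applyPlanChange(
  state: CreditState,
  change: PlanChange,
  currentAllotment: number,
  newAllotment: number,
): CreditState {
  switch (change) {
    case "cancel":
      // Balance stays usable until period end; no further reset follows.
      return { balance: state.balance, nextCycleAllotment: 0 };
    case "downgrade":
      // Nothing changes mid-cycle; the smaller allotment applies
      // from the next reset onward.
      return { balance: state.balance, nextCycleAllotment: newAllotment };
    case "upgrade":
      // Top up immediately by the difference between the two plans.
      return {
        balance: state.balance + Math.max(0, newAllotment - currentAllotment),
        nextCycleAllotment: newAllotment,
      };
  }
}
```

Keeping the rules in a function with no I/O makes each branch easy to unit test separately from the webhook plumbing that calls it.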

Operational gotchas worth knowing about

The hardest bug in any credit system is the race in which concurrent debits together exceed the balance. SaaSForge AI takes a row-level lock on the balance row (`SELECT ... FOR UPDATE`) inside the same transaction as the debit, so two simultaneous chat turns can't both succeed when only one credit remains. This adds latency to the chat path; the alternative is reconciliation drift, which is much worse.
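
A locked debit might be structured like this. It is a sketch, not the shipped code: the `credit_balances` table, its columns, and the `debitWithRowLock` name are hypothetical, and `SqlClient` is a minimal interface shaped like a PostgreSQL client's `query(text, values)` method so the example does not depend on a specific driver. The caller is assumed to pass a single dedicated connection, since a transaction must run on one connection.

```typescript
// Minimal client interface: query(text, values) resolving to { rows }.
interface SqlClient {
  query(text: string, values?: unknown[]): Promise<{ rows: any[] }>;
}

async function debitWithRowLock(
  client: SqlClient,
  workspaceId: string,
  amount: number,
): Promise<{ ok: boolean; balance: number }> {
  await client.query("BEGIN");
  try {
    // FOR UPDATE makes a concurrent transaction wait here until this
    // one commits or rolls back, so it then reads the updated balance.
    const { rows } = await client.query(
      "SELECT balance FROM credit_balances WHERE workspace_id = $1 FOR UPDATE",
      [workspaceId],
    );
    const balance = rows.length > 0 ? Number(rows[0].balance) : 0;

    if (balance < amount) {
      await client.query("ROLLBACK");
      return { ok: false, balance };
    }

    await client.query(
      "UPDATE credit_balances SET balance = balance - $2 WHERE workspace_id = $1",
      [workspaceId, amount],
    );
    await client.query("COMMIT");
    return { ok: true, balance: balance - amount };
  } catch (err) {
    // Release the lock on any failure before surfacing the error.
    await client.query("ROLLBACK");
    throw err;
  }
}
```

The lock is held only for the read, the check, and the single-row update, which keeps the added latency on the chat path small.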

Refunds are a second gotcha: a Stripe refund should claw back credits in proportion to the amount refunded, not remove the full plan allotment. The webhook handler ships with proration logic so a mid-cycle refund leaves the customer with credits matching what they actually paid for.
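
One way to express the proration: revoke the share of the plan allotment that matches the share of the payment refunded, and never push the balance below zero. The function name and the clamp-at-zero policy are assumptions for illustration; amounts are in the currency's smallest unit, as Stripe reports them.

```typescript
function creditsAfterRefund(args: {
  balance: number; // credits currently available
  planAllotment: number; // credits granted for the paid period
  amountPaid: number; // e.g. cents
  amountRefunded: number; // e.g. cents
}): { revoked: number; balance: number } {
  if (args.amountPaid <= 0) {
    throw new Error("amountPaid must be positive");
  }
  // Share of the payment that was returned, capped at a full refund.
  const refundedShare = Math.min(1, args.amountRefunded / args.amountPaid);
  const owed = Math.floor(args.planAllotment * refundedShare);
  // Assumption: credits already spent are not clawed back past zero.
  const revoked = Math.min(owed, args.balance);
  return { revoked, balance: args.balance - revoked };
}
```

For example, refunding a quarter of the payment on a 1,000-credit plan revokes 250 credits, or the whole remaining balance if fewer than 250 are left.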

Frequently asked

Can I sell credits as one-time purchases on top of subscriptions?
Yes. The credit balance is just a number; subscription resets and one-time top-ups both increment it. SaaSForge AI ships the subscription path; one-time top-ups are an additional Stripe Checkout endpoint with a webhook that calls the same `addCredits` primitive.
What happens when credits run out mid-conversation?
The chat endpoint checks the balance before each turn and returns a 402 Payment Required if insufficient. The UI catches the 402 and surfaces an upgrade prompt with the exact deficit. Streaming responses already in flight are billed for tokens already produced, not the full intended length.
How are usage logs structured?
Each debit creates a row in a `credit_transactions` table (workspace, amount, reason, idempotency key, timestamp, related entity). Customers see this as a usage history; finance sees it as a reconciliation source against Stripe revenue. Both views read the same table; no parallel reporting system to keep in sync.
Why use idempotency keys instead of just trusting the request?
Network retries — at the client, at Vercel's edge, at Stripe's webhook layer — will hit the debit endpoint multiple times for the same logical event. The idempotency key turns repeat calls into no-ops on the database side. Without it, a flaky client connection silently double-charges users.
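
The answers above can be tied together in a small in-memory ledger: every change is a transaction row, a repeated idempotency key is a no-op, and an insufficient balance yields a 402 with the exact deficit. The `InMemoryLedger` class and the `DebitResult` shape are hypothetical stand-ins; in a database, the key check would typically be a unique index on the idempotency key column of `credit_transactions`.

```typescript
interface CreditTransaction {
  workspaceId: string;
  amount: number; // negative for debits, positive for grants
  reason: string;
  idempotencyKey: string;
  createdAt: Date;
}

type LedgerArgs = {
  workspaceId: string;
  amount: number;
  reason: string;
  idempotencyKey: string;
};

type DebitResult =
  | { status: 200; balance: number; replayed: boolean }
  | { status: 402; deficit: number };

class InMemoryLedger {
  readonly transactions: CreditTransaction[] = [];
  private balances = new Map<string, number>();
  private seenKeys = new Set<string>();

  balanceOf(workspaceId: string): number {
    return this.balances.get(workspaceId) ?? 0;
  }

  addCredits(args: LedgerArgs): void {
    if (this.seenKeys.has(args.idempotencyKey)) return; // retry: no-op
    this.record(args.workspaceId, args.amount, args.reason, args.idempotencyKey);
  }

  debitCredits(args: LedgerArgs): DebitResult {
    const balance = this.balanceOf(args.workspaceId);
    if (this.seenKeys.has(args.idempotencyKey)) {
      // Same logical event seen before: report success, charge nothing.
      return { status: 200, balance, replayed: true };
    }
    if (balance < args.amount) {
      return { status: 402, deficit: args.amount - balance };
    }
    this.record(args.workspaceId, -args.amount, args.reason, args.idempotencyKey);
    return { status: 200, balance: balance - args.amount, replayed: false };
  }

  private record(workspaceId: string, amount: number, reason: string, key: string): void {
    this.seenKeys.add(key);
    this.balances.set(workspaceId, this.balanceOf(workspaceId) + amount);
    this.transactions.push({
      workspaceId,
      amount,
      reason,
      idempotencyKey: key,
      createdAt: new Date(),
    });
  }
}
```

A retried call with `idempotencyKey: "chat:turn_1"` returns the same success without adding a second row, so the usage history and the balance stay in agreement.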