Feature deep-dive · SaaSForge AI

ChatGPT clone boilerplate for Next.js (threads, messages, streaming)

Most chat-clone tutorials stop at a single textarea wired to an OpenAI endpoint. A real ChatGPT-style product needs threads (so users can pick up yesterday's conversation), message history (so context survives a refresh), streaming (so the response feels live), and the smaller UI niceties (edit, regenerate, copy, share) that make the surface feel like a product. SaaSForge AI ships the full chat shell so you can focus on the parts that differentiate your product, not the parts every clone needs.

Threads as the unit of memory

A thread is one ongoing conversation: a title (auto-generated from the first turn), a created timestamp, an owner, and a list of message turns ordered by sequence. SaaSForge AI persists threads in Supabase Postgres with workspace scoping enforced by RLS, so a user only ever lists and reads their own threads. The sidebar lists threads in reverse chronological order with the latest message preview, the same shape ChatGPT and Claude.ai both use.
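A minimal sketch of the thread table this description implies, in the same Drizzle style as the message schema shown later. The column names here are illustrative assumptions, not the shipped schema:

```typescript
import { pgTable, uuid, text, boolean, timestamp } from "drizzle-orm/pg-core";

// Hypothetical thread table: auto-generated title, owner, workspace
// scoping (the column RLS policies would check), and pin state.
export const threads = pgTable("threads", {
  id: uuid("id").defaultRandom().primaryKey(),
  workspaceId: uuid("workspace_id").notNull(),
  ownerId: uuid("owner_id").notNull(),
  title: text("title").notNull(),
  pinned: boolean("pinned").default(false).notNull(),
  createdAt: timestamp("created_at").defaultNow().notNull(),
});
```

The sidebar query then becomes a simple `ORDER BY created_at DESC` over this table, filtered by the RLS-enforced workspace.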

Naming threads matters more than it sounds. The boilerplate auto-titles a thread from the first user message via a short model call, so the sidebar stays scannable without asking the user to name every conversation. Renaming and pinning are first-class actions on the thread itself.
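Even with model-generated titles, you want a deterministic fallback when the short titling call fails or times out. A minimal sketch, with an illustrative name and length limit (not the boilerplate's actual helper):

```typescript
// Fallback titling: derive a scannable sidebar title from the first
// user message when the model call is unavailable.
export function fallbackTitle(firstMessage: string, maxLen = 40): string {
  const oneLine = firstMessage.replace(/\s+/g, " ").trim();
  if (oneLine.length <= maxLen) return oneLine || "New conversation";
  // Cut at a word boundary and add an ellipsis.
  const cut = oneLine.slice(0, maxLen);
  const lastSpace = cut.lastIndexOf(" ");
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + "…";
}
```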

Messages, roles, and the data shape

Each message turn is a row keyed to a thread, with a role (user, assistant, system, tool), the content, optional tool calls, and a token-usage record from the model API. The schema keeps tool calls and citations as structured fields so the UI can render them without re-parsing the message string.

Persisting the assistant's response after streaming completes is where most clones get sloppy. SaaSForge AI buffers the streamed tokens on the server, writes the final message row only when the stream closes cleanly, and exposes an idempotency key so a network retry does not double-write the turn.
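The write-once pattern can be sketched in a few lines. Here an in-memory Map stands in for the messages table, and the function and field names are assumptions for illustration:

```typescript
// Idempotent persistence after a stream closes. The client supplies
// `idempotencyKey` with the send request, so a network retry of the
// same turn maps to the same key and cannot double-write.
type MessageRow = { idempotencyKey: string; role: string; content: string };

const store = new Map<string, MessageRow>();

export function persistAssistantTurn(
  idempotencyKey: string,
  streamedChunks: string[],
  streamClosedCleanly: boolean,
): MessageRow | null {
  if (!streamClosedCleanly) return null; // partial streams are not persisted
  const existing = store.get(idempotencyKey);
  if (existing) return existing; // retry: return the already-written row
  const row: MessageRow = {
    idempotencyKey,
    role: "assistant",
    content: streamedChunks.join(""), // buffered tokens, joined once
  };
  store.set(idempotencyKey, row);
  return row;
}
```

In Postgres the same guarantee falls out of a unique index on the idempotency key plus `ON CONFLICT DO NOTHING`.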

Message row schema (Drizzle)
export const messages = pgTable("messages", {
  id: uuid("id").defaultRandom().primaryKey(),
  threadId: uuid("thread_id").notNull().references(() => threads.id, { onDelete: "cascade" }),
  workspaceId: uuid("workspace_id").notNull(),
  role: text("role", { enum: ["user", "assistant", "system", "tool"] }).notNull(),
  content: text("content").notNull(),
  toolCalls: jsonb("tool_calls"),
  inputTokens: integer("input_tokens"),
  outputTokens: integer("output_tokens"),
  createdAt: timestamp("created_at").defaultNow().notNull(),
});

Streaming UI without the React footguns

The streaming surface uses the Vercel AI SDK's `useChat` hook on the client paired with a Next.js Route Handler that returns a streamed response. The hook handles partial-message rendering, in-flight cancellation, and the small concurrency issues (a user sending a new message while the previous response is still streaming) that turn into bug reports on hand-rolled implementations.

Auto-scroll is one of those details that looks trivial and is not: a naive scroll-to-bottom fights the user when they scroll up to read earlier turns. SaaSForge AI's chat surface only auto-scrolls when the user is already near the bottom, mirroring the behaviour ChatGPT users expect.
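The near-bottom check reduces to one pure function over the scroll container's geometry. The 80px threshold is an illustrative default, not the shipped value:

```typescript
// Only auto-scroll when the user is already near the bottom of the
// scroll container; otherwise leave their reading position alone.
export function shouldAutoScroll(
  scrollTop: number,
  clientHeight: number,
  scrollHeight: number,
  thresholdPx = 80,
): boolean {
  const distanceFromBottom = scrollHeight - (scrollTop + clientHeight);
  return distanceFromBottom <= thresholdPx;
}
```

Call it on each streamed chunk before issuing the `scrollTo`, reading the three geometry values off the container element.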

Edit, regenerate, and the small flows that matter

Edit-and-resend lets the user rewrite their last message and re-trigger the model call, dropping the previous assistant turn from the thread tail. Regenerate-from-here re-runs the assistant turn with the same input, useful when the first response was off. Both flows are wired so the message ordering stays consistent and the credit ledger does not double-charge a single intent.
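The tail-drop part of edit-and-resend is simple to state precisely: replace the last user turn and discard everything after it, so the re-run starts from a consistent prefix. A sketch with illustrative shapes and names:

```typescript
// Edit-and-resend: rewrite the user's last message and drop the
// assistant turn(s) that followed it from the thread tail.
type Turn = { role: "user" | "assistant" | "system"; content: string };

export function editAndResend(turns: Turn[], newContent: string): Turn[] {
  const lastUserIdx = turns.map((t) => t.role).lastIndexOf("user");
  if (lastUserIdx === -1) return turns; // nothing to edit
  return [...turns.slice(0, lastUserIdx), { role: "user", content: newContent }];
}
```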

Branching, keeping multiple assistant responses for the same prompt, is intentionally out of scope on the default template because the UX gets busy fast. The data model supports it (messages can carry a `parent_id`) so adding branching later is a UI surface, not a database migration.
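With a `parent_id` on every message, the default linear thread is just the path from a leaf back to the root, which is why branching stays a UI decision. A sketch of that reconstruction, with assumed field names:

```typescript
// Reconstruct the linear conversation by walking parent_id links
// from a leaf message back to the root, then reversing.
type Msg = { id: string; parentId: string | null; content: string };

export function linearPath(byId: Map<string, Msg>, leafId: string): Msg[] {
  const path: Msg[] = [];
  let cur = byId.get(leafId);
  while (cur) {
    path.push(cur);
    cur = cur.parentId ? byId.get(cur.parentId) : undefined;
  }
  return path.reverse(); // root-first order, ready to replay into the model
}
```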

Frequently asked

Is this a ChatGPT clone or a ChatGPT integration?
Neither. SaaSForge AI ships the chat product shape (threads, messages, streaming, edit, regenerate) but the model behind it is your call: Claude, OpenAI, Mistral, or a self-hosted model via the Vercel AI SDK's provider abstraction. The 'ChatGPT clone' framing describes the UX pattern, not the model vendor.
Can I add tool calls and function execution?
Yes. The message schema already has a `tool_calls` column and the streaming layer surfaces tool-call events through the Vercel AI SDK. Wiring a specific tool (a database query, a web fetch, a custom action) is product work; the chat substrate does not need to change to support it.
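The `tool_calls` jsonb column comes back untyped, so the UI needs a narrowing step before rendering. A minimal sketch; the shape below is an assumed example, not the boilerplate's canonical type:

```typescript
// Narrow the untyped jsonb payload into structured tool calls the
// UI can render directly, dropping malformed entries.
type ToolCall = { name: string; args: Record<string, unknown> };

export function parseToolCalls(raw: unknown): ToolCall[] {
  if (!Array.isArray(raw)) return [];
  return raw.filter(
    (c): c is ToolCall =>
      typeof c === "object" &&
      c !== null &&
      typeof (c as ToolCall).name === "string" &&
      typeof (c as ToolCall).args === "object" &&
      (c as ToolCall).args !== null,
  );
}
```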
How is conversation context managed for long threads?
By default the full thread is replayed into the model on each turn, which is fine up to a few thousand tokens. Past that, the boilerplate exposes hooks for summarising older turns or windowing the context. Pairing chat with RAG (also shipped in SaaSForge AI) is the more common solution: ground answers in retrieved chunks rather than relying on the chat history alone.
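The windowing hook can be approximated as: keep the system prompt, then walk newest-to-oldest keeping turns while they fit a token budget. This sketch uses a rough 4-characters-per-token heuristic; a real build would use the model's tokenizer, and the names are illustrative:

```typescript
// Naive context window: always keep system turns, then fill the
// remaining budget with the newest conversation turns that fit.
type ChatTurn = { role: string; content: string };

const approxTokens = (s: string) => Math.ceil(s.length / 4);

export function windowContext(turns: ChatTurn[], budget: number): ChatTurn[] {
  const system = turns.filter((t) => t.role === "system");
  const rest = turns.filter((t) => t.role !== "system");
  let used = system.reduce((n, t) => n + approxTokens(t.content), 0);
  const kept: ChatTurn[] = [];
  // Walk from newest to oldest, keeping whole turns while they fit.
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = approxTokens(rest[i].content);
    if (used + cost > budget) break;
    used += cost;
    kept.unshift(rest[i]);
  }
  return [...system, ...kept];
}
```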
Does it support voice or image input?
The default chat surface is text. Image input is straightforward to add for multimodal models (Claude and GPT-4o both accept images); voice input is a separate integration (browser SpeechRecognition or Whisper). Both are out of scope for the base chat surface and documented as extension points.
Ships in SaaSForge AI

See SaaSForge AI. Skip the deliberation.

Full source code. Lifetime updates. Polar Merchant-of-Record checkout. Private GitHub repo on purchase.