How to Build an AI Chatbot From Scratch: A Step-by-Step Guide
Building an AI chatbot is one of the best ways to understand how modern AI applications work under the hood. In this tutorial, we will build a fully functional chatbot with streaming responses, conversation memory, and a clean UI — then deploy it to production.
By the end, you will have a chatbot that rivals the basic functionality of ChatGPT's interface, running on your own infrastructure with your own API key.
Architecture Overview
Before writing code, let us map out what we are building:
┌─────────────┐ HTTP/SSE ┌──────────────┐ API Call ┌─────────────┐
│ React UI │ ───────────────▶ │ Node.js API │ ──────────────▶ │ LLM API │
│ (Frontend) │ ◀─────────────── │ (Backend) │ ◀────────────── │ (Claude/ │
│ │ Streamed tokens │ │ Streamed tokens │ OpenAI) │
└─────────────┘ └──────────────┘ └─────────────┘
│
▼
┌──────────────┐
│ In-Memory │
│ Conversation│
│ Store │
└──────────────┘
The stack: React frontend, Express.js backend, and either the Anthropic or OpenAI API for the language model. We will use Server-Sent Events (SSE) for streaming.
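The SSE wire format is simple enough to sketch before we build anything: each event is a data: line followed by a blank line. A minimal helper (the name sseFrame is our own, not part of any library) shows the kind of frames the backend will emit:

```javascript
// Builds one Server-Sent Events frame: a "data: <json>" line plus the
// blank line that terminates the event. (Helper name is our own.)
function sseFrame(payload) {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// On the server we would call:
//   res.write(sseFrame({ type: 'token', content: 'Hi' }));
console.log(sseFrame({ type: 'token', content: 'Hi' }));
// → data: {"type":"token","content":"Hi"}
```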
Step 1: Choose Your Model API
You have two primary options for the LLM backend:
Anthropic Claude API — Excellent for nuanced, longer-form responses. Claude's system prompts are powerful for shaping chatbot personality. The API uses a messages-based format that maps cleanly to chat interfaces.
OpenAI GPT API — The most widely documented option. GPT-4o provides fast, capable responses. The Chat Completions API is straightforward.
For this tutorial, we will use the Anthropic Claude API, but the architecture works identically with OpenAI — you only swap out the API call in one function.
Get your API key: Sign up at console.anthropic.com, create a project, and generate an API key. Store it securely — never commit it to version control.
Step 2: Set Up the Backend
Initialize a Node.js project and install dependencies:
mkdir ai-chatbot && cd ai-chatbot
npm init -y
npm install express cors @anthropic-ai/sdk dotenv uuid
Create your environment file:
# .env
ANTHROPIC_API_KEY=sk-ant-your-key-here
PORT=3001
Now build the Express server. Create server.js:
import express from 'express';
import cors from 'cors';
import Anthropic from '@anthropic-ai/sdk';
import { randomUUID } from 'crypto';
import 'dotenv/config';
const app = express();
app.use(cors());
app.use(express.json());
const anthropic = new Anthropic({
apiKey: process.env.ANTHROPIC_API_KEY,
});
// In-memory conversation store
const conversations = new Map();
const SYSTEM_PROMPT = `You are a helpful, knowledgeable assistant.
You give clear, concise answers and ask clarifying questions
when a request is ambiguous. You format responses with markdown
when it improves readability.`;
app.listen(process.env.PORT || 3001, () => {
console.log(`Server running on port ${process.env.PORT || 3001}`);
});
This gives us a running server with the Anthropic client initialized and a Map to store conversation histories.
Step 3: Build the Chat Endpoint with Streaming
The key to a responsive chatbot is streaming. Instead of waiting for the entire response to generate (which can take 10-30 seconds for long answers), we stream tokens to the frontend as they are produced. Add this endpoint to server.js:
app.post('/api/chat', async (req, res) => {
const { message, conversationId } = req.body;
// Get or create conversation
const convId = conversationId || randomUUID();
if (!conversations.has(convId)) {
conversations.set(convId, []);
}
const history = conversations.get(convId);
// Add user message to history
history.push({ role: 'user', content: message });
// Set up SSE headers
res.setHeader('Content-Type', 'text/event-stream');
res.setHeader('Cache-Control', 'no-cache');
res.setHeader('Connection', 'keep-alive');
// Send conversation ID first
res.write(`data: ${JSON.stringify({ type: 'id', conversationId: convId })}\n\n`);
try {
let fullResponse = '';
const stream = anthropic.messages.stream({
model: 'claude-sonnet-4-20250514',
max_tokens: 4096,
system: SYSTEM_PROMPT,
messages: history,
});
stream.on('text', (text) => {
fullResponse += text;
res.write(`data: ${JSON.stringify({ type: 'token', content: text })}\n\n`);
});
stream.on('finalMessage', () => {
// Save assistant response to history
history.push({ role: 'assistant', content: fullResponse });
res.write(`data: ${JSON.stringify({ type: 'done' })}\n\n`);
res.end();
});
stream.on('error', (error) => {
console.error('Stream error:', error);
res.write(`data: ${JSON.stringify({ type: 'error', message: error.message })}\n\n`);
res.end();
});
} catch (error) {
console.error('API error:', error);
res.write(`data: ${JSON.stringify({ type: 'error', message: 'Failed to generate response' })}\n\n`);
res.end();
}
});
Let us break down what this does: the .stream() method returns an event emitter that fires text events as tokens arrive, a finalMessage event when the full response is complete (where we persist the assistant's reply to the history), and an error event if the request fails. Each event is forwarded to the browser as an SSE frame, so the frontend can tell tokens apart from control messages.
Step 4: Add Conversation Management
Users need to start new conversations and retrieve existing ones. Add these endpoints:
// List conversations (returns IDs and first message preview)
app.get('/api/conversations', (req, res) => {
const list = [];
for (const [id, messages] of conversations) {
if (messages.length > 0) {
list.push({
id,
preview: messages[0].content.substring(0, 80),
messageCount: messages.length,
lastUpdated: Date.now(), // placeholder; a real store would track this per conversation
});
}
}
res.json(list);
});
// Get full conversation history
app.get('/api/conversations/:id', (req, res) => {
const history = conversations.get(req.params.id);
if (!history) {
return res.status(404).json({ error: 'Conversation not found' });
}
res.json({ id: req.params.id, messages: history });
});
// Delete a conversation
app.delete('/api/conversations/:id', (req, res) => {
conversations.delete(req.params.id);
res.json({ success: true });
});
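As promised in Step 1, switching the backend to OpenAI only means changing the streaming call in /api/chat. A hedged sketch of the translation (the model name and the exact streaming loop are assumptions to verify against OpenAI's current docs):

```javascript
// Maps our history plus system prompt onto an OpenAI Chat Completions
// request body. OpenAI takes the system prompt as the first message
// rather than a separate `system` field.
function toOpenAIRequest(history, systemPrompt) {
  return {
    model: 'gpt-4o', // assumed model name
    stream: true,
    max_tokens: 4096,
    messages: [{ role: 'system', content: systemPrompt }, ...history],
  };
}

// With the official SDK, the streaming loop looks roughly like:
//   const stream = await openai.chat.completions.create(toOpenAIRequest(history, SYSTEM_PROMPT));
//   for await (const chunk of stream) {
//     const text = chunk.choices[0]?.delta?.content ?? '';
//     if (text) { /* write the token to the SSE response as before */ }
//   }
```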
Step 5: Build the Chat UI
For the frontend, create a React application. We will keep it focused on the chat functionality:
npm create vite@latest client -- --template react
cd client
npm install
Replace src/App.jsx with the chat interface:
import { useState, useRef, useEffect } from 'react';
import './App.css';
function App() {
const [messages, setMessages] = useState([]);
const [input, setInput] = useState('');
const [isStreaming, setIsStreaming] = useState(false);
const [conversationId, setConversationId] = useState(null);
const messagesEndRef = useRef(null);
const scrollToBottom = () => {
messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
};
useEffect(() => { scrollToBottom(); }, [messages]);
const sendMessage = async () => {
if (!input.trim() || isStreaming) return;
const userMessage = input.trim();
setInput('');
setMessages(prev => [...prev, { role: 'user', content: userMessage }]);
setIsStreaming(true);
// Add empty assistant message that we will stream into
setMessages(prev => [...prev, { role: 'assistant', content: '' }]);
try {
const response = await fetch('http://localhost:3001/api/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
message: userMessage,
conversationId,
}),
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
const chunk = decoder.decode(value);
// Note: this assumes each read() delivers whole "data: ..." lines; a more
// robust client would buffer partial frames across chunks.
const lines = chunk.split('\n').filter(line => line.startsWith('data: '));
for (const line of lines) {
const data = JSON.parse(line.slice(6));
if (data.type === 'id') {
setConversationId(data.conversationId);
} else if (data.type === 'token') {
setMessages(prev => {
const updated = [...prev];
const last = updated[updated.length - 1];
last.content += data.content;
return updated;
});
} else if (data.type === 'error') {
console.error('Stream error:', data.message);
}
}
}
} catch (error) {
console.error('Request failed:', error);
setMessages(prev => {
const updated = [...prev];
updated[updated.length - 1].content = 'Sorry, something went wrong. Please try again.';
return updated;
});
} finally {
setIsStreaming(false);
}
};
const handleKeyDown = (e) => {
if (e.key === 'Enter' && !e.shiftKey) {
e.preventDefault();
sendMessage();
}
};
return (
<div className="chat-container">
<header className="chat-header">
<h1>AI Chatbot</h1>
<button onClick={() => { setMessages([]); setConversationId(null); }}>
New Chat
</button>
</header>
<div className="messages">
{messages.map((msg, i) => (
<div key={i} className={`message ${msg.role}`}>
<div className="message-content">{msg.content}</div>
</div>
))}
<div ref={messagesEndRef} />
</div>
<div className="input-area">
<textarea
value={input}
onChange={(e) => setInput(e.target.value)}
onKeyDown={handleKeyDown}
placeholder="Type your message..."
rows={1}
disabled={isStreaming}
/>
<button onClick={sendMessage} disabled={isStreaming || !input.trim()}>
{isStreaming ? '...' : 'Send'}
</button>
</div>
</div>
);
}
export default App;
Step 6: Handle Edge Cases
A production chatbot needs to handle several things that tutorials often skip.
Token Limit Management
Conversation histories grow indefinitely, but the API has a context window limit. Add a function to trim old messages when the conversation gets too long:
function trimHistory(messages, maxTokenEstimate = 150000) {
// Rough estimate: 1 token ≈ 4 characters
const estimateTokens = (msgs) =>
msgs.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);
while (messages.length > 2 && estimateTokens(messages) > maxTokenEstimate) {
// Remove the oldest user-assistant pair, keeping the first message for context
messages.splice(1, 2);
}
return messages;
}
Call trimHistory(history) before passing messages to the API. This preserves the first message (which often sets context) while removing older exchanges from the middle.
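To see the trimming behavior concretely, here is a runnable demo. It repeats trimHistory so the snippet is self-contained, and uses an artificially small token budget to force trimming:

```javascript
function trimHistory(messages, maxTokenEstimate = 150000) {
  // Rough estimate: 1 token ≈ 4 characters
  const estimateTokens = (msgs) =>
    msgs.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);
  while (messages.length > 2 && estimateTokens(messages) > maxTokenEstimate) {
    messages.splice(1, 2); // drop the oldest pair after the first message
  }
  return messages;
}

// Nine messages of ~100 estimated tokens each (~900 total).
const history = [
  { role: 'user', content: 'Set the scene.'.padEnd(400, '.') },
  ...Array.from({ length: 8 }, (_, i) => ({
    role: i % 2 === 0 ? 'assistant' : 'user',
    content: `Message ${i}`.padEnd(400, '.'),
  })),
];

const trimmed = trimHistory(history, 500); // tiny budget to force trimming
console.log(trimmed.length);                              // 5
console.log(trimmed[0].content.startsWith('Set the'));    // true
```

Two pairs are spliced out of the middle until the estimate fits, while the context-setting first message survives.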
Rate Limiting
Protect your API key from abuse with basic rate limiting:
import rateLimit from 'express-rate-limit';
const limiter = rateLimit({
windowMs: 60 * 1000, // 1 minute
max: 20, // 20 requests per minute per IP
message: { error: 'Too many requests. Please wait a moment.' },
});
app.use('/api/chat', limiter);
Graceful Error Recovery
When the API returns errors — rate limits, overloaded servers, invalid requests — your chatbot should not just crash. The streaming error handler we built earlier catches API-level errors, but you should also handle network timeouts:
const stream = anthropic.messages.stream({
model: 'claude-sonnet-4-20250514',
max_tokens: 4096,
system: SYSTEM_PROMPT,
messages: trimHistory(history),
}).on('error', (error) => {
if (error.status === 429) {
res.write(`data: ${JSON.stringify({
type: 'error',
message: 'Rate limited. Please wait 30 seconds and try again.'
})}\n\n`);
} else {
res.write(`data: ${JSON.stringify({
type: 'error',
message: 'An error occurred. Please try again.'
})}\n\n`);
}
res.end();
});
Step 7: Add Markdown Rendering
AI responses frequently contain markdown — code blocks, lists, headers, bold text. Rendering raw markdown in the browser looks terrible. Add a markdown renderer to the frontend:
cd client
npm install react-markdown remark-gfm rehype-highlight
Update the message display component:
import ReactMarkdown from 'react-markdown';
import remarkGfm from 'remark-gfm';
import rehypeHighlight from 'rehype-highlight';
// Inside the messages map:
<div className="message-content">
{msg.role === 'assistant' ? (
<ReactMarkdown remarkPlugins={[remarkGfm]} rehypePlugins={[rehypeHighlight]}>
{msg.content}
</ReactMarkdown>
) : (
msg.content
)}
</div>
This gives you GitHub-flavored markdown with syntax-highlighted code blocks. The visual improvement is dramatic — responses with code snippets, tables, or structured lists become actually readable.
Step 8: Deploy to Production
For deployment, we need to combine the frontend and backend into a single deployable unit.
Build the Frontend
cd client
npm run build
This creates a dist/ folder with static files.
Serve Static Files from Express
Add this to your server.js, after your API routes:
import path from 'path';
import { fileURLToPath } from 'url';
const __dirname = path.dirname(fileURLToPath(import.meta.url));
// Serve the built React app
app.use(express.static(path.join(__dirname, 'client', 'dist')));
// Catch-all: serve index.html for client-side routing
app.get(/.*/, (req, res) => { // regex path; Express 5 no longer accepts a bare '*'
res.sendFile(path.join(__dirname, 'client', 'dist', 'index.html'));
});
Deploy to a Cloud Provider
Railway or Render (simplest): Push your repo to GitHub, connect it to Railway or Render, set the ANTHROPIC_API_KEY environment variable, and deploy. Both platforms detect Node.js automatically and handle the rest.
Docker (most portable):
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
RUN cd client && npm ci && npm run build
EXPOSE 3001
CMD ["node", "server.js"]
Build and run: docker build -t chatbot . && docker run -p 3001:3001 --env-file .env chatbot
Production Checklist
Before going live, verify these items:
- Environment variables are set on the hosting platform, not hardcoded
- CORS is restricted to your actual domain instead of allowing all origins
- Rate limiting is configured appropriately for your expected traffic
- HTTPS is enabled (most platforms handle this automatically)
- Error logging is connected to a service like Sentry or LogTail so you catch issues in production
- Conversation cleanup — add a TTL to your conversation store so old conversations are deleted after 24 hours, or switch to Redis for persistent storage with built-in expiration
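The TTL cleanup from the last checklist item can be sketched without Redis. This assumes we change the store to hold { messages, updatedAt } objects instead of bare message arrays (a shape of our own choosing):

```javascript
const CONVERSATION_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours

// Deletes conversations that have not been touched within the TTL.
// Assumes each store entry carries an `updatedAt` timestamp (our addition).
function sweepExpired(store, now = Date.now(), ttl = CONVERSATION_TTL_MS) {
  for (const [id, conv] of store) {
    if (now - conv.updatedAt > ttl) store.delete(id);
  }
  return store.size;
}

// On the server, run the sweep hourly:
// setInterval(() => sweepExpired(conversations), 60 * 60 * 1000);
```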
Going Further
This chatbot is functional but intentionally minimal. Here are high-impact improvements worth implementing:
Persistent storage. Replace the in-memory Map with PostgreSQL or Redis. This lets conversations survive server restarts and enables multi-server deployments.
Authentication. Add user accounts so conversations are private. A simple JWT-based auth system works well. Libraries like passport.js or lucia-auth handle the heavy lifting.
File uploads. Claude's API supports image inputs. Add a file upload endpoint that converts images to base64 and includes them in the messages array. This enables vision-based conversations.
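A sketch of the message shape for an image turn (the helper name is ours; the base64 source format follows Anthropic's Messages API content blocks, but verify against the current docs):

```javascript
// Wraps a base64-encoded image and a question into one user message
// using the Messages API content-block format.
function imageMessage(base64Data, mediaType, question) {
  return {
    role: 'user',
    content: [
      {
        type: 'image',
        source: { type: 'base64', media_type: mediaType, data: base64Data },
      },
      { type: 'text', text: question },
    ],
  };
}

// Usage: push into the same history array before calling messages.stream():
// history.push(imageMessage(fs.readFileSync('photo.png').toString('base64'),
//                           'image/png', 'What is in this photo?'));
```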
System prompt customization. Let users configure the chatbot's personality. Store system prompts per conversation and let users modify them through a settings panel.
Streaming markdown. Our current implementation re-renders the full markdown on every token. For smoother performance, look into incremental markdown parsing libraries that only process new content.
The core architecture we built — SSE streaming, conversation state management, and a clean separation between frontend and backend — scales cleanly as you add these features. Each improvement is additive rather than requiring a rewrite, which is the sign of a solid foundation.