How to Install Bubblegum: Developer Guide (Vercel Edition)

Introduction

Modern AI tools like ChatGPT, Perplexity, and Claude are becoming key discovery engines — crawling, interpreting, and recommending brand content. To ensure these tools interpret your site accurately, Bubblegum offers a lightweight middleware solution.

This guide covers the setup and deployment of the Bubblegum middleware for Vercel-hosted sites.


What is Middleware?

AI agents are now an important segment of web traffic, discovering, indexing, and interacting with your digital content. This middleware acts as an intelligent router – seamlessly directing recognized AI bot traffic to AI-optimized, relevant pages while serving regular visitors as normal.

What are the key benefits?
  • AI Traffic Middleware: Detects AI bots using advanced user-agent filtering and pattern recognition

  • Seamless Routing: AI or LLM bot requests are routed to special "AI-ready" content (Markdown, structured data endpoints, etc.) set by your team

  • Multi-model Ready: Routes can be adapted or extended for different bot types (OpenAI, GoogleBot-AI, Perplexity, Claude, social/chat bots, etc.)

  • No Impact on UX or SEO: Human visitors and standard search engine crawlers are unaffected.

  • Comprehensive Logging: All routes, bot hits, and errors are logged for analytics


How It Works

1. Interception

The middleware sits between your domain and all incoming traffic.

2. Bot Detection

Each request's user-agent (and optionally other headers) is checked against a curated list of bot patterns, including major LLMs, scrapers, and headless browsers.

3. Dynamic Routing

  • AI bot requests: Redirected to an AI-ready version of the content (e.g., /about becomes /about.md)

  • Non-bot traffic: Served as usual with no disruption or latency
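
The three steps above can be sketched as a single routing decision. This is a minimal illustration, not Bubblegum's implementation: the function names (isAIBot, toMarkdownPath, routeRequest) are hypothetical, and the pattern is a shortened list of user-agent tokens taken from the setup examples in this guide.

```typescript
// Shortened, illustrative pattern list; the real list is maintained by Bubblegum.
const AI_BOT_PATTERN = /GPTBot|ChatGPT-User|ClaudeBot|PerplexityBot|OAI-SearchBot/i;

// Step 2: decide whether a user-agent string looks like a known AI bot.
function isAIBot(userAgent: string | null): boolean {
  return userAgent !== null && AI_BOT_PATTERN.test(userAgent);
}

// Step 3: map a page path to its AI-ready Markdown path.
// Paths whose last segment already has an extension are left unchanged.
function toMarkdownPath(pathname: string): string {
  if (pathname.endsWith("/")) return pathname + "index.md";
  const lastSegment = pathname.slice(pathname.lastIndexOf("/") + 1);
  return lastSegment.includes(".") ? pathname : pathname + ".md";
}

// Steps 1-3 combined: return the path that should be served for a request.
function routeRequest(pathname: string, userAgent: string | null): string {
  return isAIBot(userAgent) ? toMarkdownPath(pathname) : pathname;
}

console.log(routeRequest("/about", "Mozilla/5.0 (compatible; GPTBot/1.0)")); // /about.md
console.log(routeRequest("/about", "Mozilla/5.0 (Windows NT 10.0) Chrome/120")); // /about
```

Keeping detection and path mapping in separate functions makes it easier to test each rule on its own before wiring it into a deployment.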


Setup Instructions (Vercel Projects)

Bubblegum supports two integration approaches for Vercel projects, depending on your framework and routing needs:

1. Using vercel.json (Framework Agnostic)

Works for any site hosted on Vercel, including static sites, non-Next.js frameworks, and custom setups.

Pros:

  • Framework agnostic: You don’t need Next.js or any specific tech stack.

  • Simple setup: One file, predictable rewrite rules.

Cons:

  • No fallback logic: If a page isn’t available in .md AI-optimized form, the bot will not get a fallback version.

  • Requires coverage: You'll need to ensure all intended pages are provided as .md versions.
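
Because this approach has no fallback, it can help to check coverage before deploying. The sketch below assumes you can list your page paths and the .md paths available on your Bubblegum domain. findMissingMarkdown is a hypothetical helper, the sample data is made up, and the mapping follows the /about to /about.md convention described above.

```typescript
// Hypothetical helper: list pages that have no AI-ready .md counterpart.
// Assumes page paths have no trailing slash and map to "<path>.md".
function findMissingMarkdown(pagePaths: string[], markdownPaths: string[]): string[] {
  const available = new Set(markdownPaths);
  return pagePaths.filter((page) => !available.has(`${page}.md`));
}

// Example data (made up for illustration).
const pages = ["/about", "/pricing", "/blog/launch"];
const markdown = ["/about.md", "/blog/launch.md"];

console.log(findMissingMarkdown(pages, markdown)); // [ '/pricing' ]
```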

How to Setup

  1. Add a vercel.json file at the root of your project, or update your existing one:

{
  "$schema": "https://openapi.vercel.sh/vercel.json",
  "rewrites": [
    {
      "source": "/:path*",
      "has": [
        {
          "type": "header",
          "key": "user-agent",
          "value": "(?i).*(GPTBot|ChatGPT-User|Anthropic-AI|ClaudeBot|Claude-Web|PerplexityBot|OAI-SearchBot|openai\\.com/bot).*"
        }
      ],
      "missing": [
        {
          "type": "query",
          "key": "path",
          "value": ".*\\.(css|js|jpg|jpeg|png|gif|svg|ico|woff|woff2|ttf|eot|pdf|zip|mp4|mp3|avi|mov|wmv|flv|webm|ogg|wav|aac|m4a|flac|doc|docx|xls|xlsx|ppt|pptx|txt|xml|json|yaml|yml|md|html|htm)$"
        }
      ],
      "destination": "https://<YOUR_BUBBLEGUM_MARKDOWN_DOMAIN>/:path*.md"
    }
  ]
}
  2. Commit and Deploy:

  • Commit your changes and trigger a redeploy in Vercel.

  • The rules are picked up automatically; no additional code is required.

2. Next.js Middleware Approach (Fine-Grained Control)

Gives you flexible, programmatic control—ideal for Next.js projects.

Pros:

  • Supports fallback logic: If a page doesn’t exist in .md, bots get standard site content as a fallback.

  • Customizable: Define bot detection, logging, IP range controls, etc.

Cons:

  • Requires basic knowledge of Next.js middleware/configuration.

How to Setup

  1. Add this middleware.ts file at the root of your Next.js project:

middleware.ts
import { NextRequest, NextResponse } from "next/server";
import { AI_IP_ADDRESSES } from "./ai-ip-addresses";

// Helper function to check if an IP address is in a CIDR range
function isIPInRange(ip: string, cidr: string): boolean {
  try {
    const [range, bits = "32"] = cidr.split("/");
    const mask = ~(2 ** (32 - parseInt(bits)) - 1);

    const ipNum = ip
      .split(".")
      .reduce((acc, octet) => (acc << 8) + parseInt(octet), 0);
    const rangeNum = range
      .split(".")
      .reduce((acc, octet) => (acc << 8) + parseInt(octet), 0);

    return (ipNum & mask) === (rangeNum & mask);
  } catch {
    return false;
  }
}

// Bot detection function
function isBot(userAgent: string | null, clientIP?: string): boolean {
  if (!userAgent) return false;

  const botPatterns = ["curl"];

  const AI_BOT_PATTERNS = [/ChatGPT|OpenAI|Claude|Perplexity|GoogleBot-AI/i];

  const lowerUserAgent = userAgent.toLowerCase();

  // Check string patterns
  const hasStringPattern = botPatterns.some((pattern) =>
    lowerUserAgent.includes(pattern)
  );

  // Check AI bot regex patterns
  const hasAIPattern = AI_BOT_PATTERNS.some((pattern) =>
    pattern.test(userAgent)
  );

  // Check if client IP is from known AI service ranges
  let isFromAIService = false;
  if (clientIP && AI_IP_ADDRESSES) {
    isFromAIService = AI_IP_ADDRESSES.some((cidr: string) =>
      isIPInRange(clientIP, cidr)
    );
  }

  return hasStringPattern || hasAIPattern || isFromAIService;
}

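// Helper function to get client IP from various headers
function getClientIP(request: NextRequest): string | undefined {
  const ip =
    request.headers.get("cf-connecting-ip") ||
    request.headers.get("x-forwarded-for") ||
    request.headers.get("x-real-ip");

  // x-forwarded-for may hold a comma-separated chain of addresses;
  // the first entry is the original client
  return ip?.split(",")[0].trim() || undefined;
}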

// Helper function to create AI URL with .md extension
function createAIUrl(originalUrl: URL, llmUrl: string): URL {
  const aiUrl = new URL(llmUrl);
  let pathname = originalUrl.pathname;

  // Add .md extension if path has no extension
  if (!pathname.includes(".") || pathname.endsWith("/")) {
    // If no extension or ends with slash, add .md
    pathname = pathname.endsWith("/")
      ? pathname + "index.md"
      : pathname + ".md";
  }

  // Preserve the path and query parameters
  aiUrl.pathname = pathname;
  aiUrl.search = originalUrl.search;

  return aiUrl;
}

export const config = {
  matcher: [
    /*
     * Match all request paths except for the ones starting with:
     * - api (API routes)
     * - _next/static (static files)
     * - _next/image (image optimization files)
     * - favicon.ico (favicon file)
     */
    "/((?!api|_next/static|_next/image|favicon.ico).*)",
  ],
};

export async function middleware(request: NextRequest) {
  const requestId = Math.random().toString(36).substring(7);
  const startTime = Date.now();

  console.log(`[${requestId}] Request started:`, {
    method: request.method,
    url: request.url,
    pathname: request.nextUrl.pathname,
    userAgent: request.headers.get("user-agent"),
    referer: request.headers.get("referer"),
    origin: request.headers.get("origin"),
    timestamp: new Date().toISOString(),
  });

  // Log URL pattern analysis for debugging client-side validation
  const pathname = request.nextUrl.pathname;

  // Check if pathname matches the expected pattern from the error
  const expectedPattern =
    /^.*\/(?:c|co|o|ac|cn|do|md|bin|sh|sim|e|ba)\/(?:c1-)?[a-zA-Z0-9=]+(?:\/(?:[a-z]{2,3}|zh-hans|zh-hant)(?:-[a-zA-Z0-9]+)?)?(\/|$)/;
  const matchesExpectedPattern = expectedPattern.test(pathname);

  console.log(`[${requestId}] URL pattern analysis:`, {
    pathname: pathname,
    matchesExpectedPattern: matchesExpectedPattern,
    isCartPath: pathname === "/cart" || pathname.startsWith("/cart"),
    isRootPath: pathname === "/" || pathname === "",
    hasExtension: pathname.includes("."),
  });

  try {
    // Check if this is bot traffic
    const userAgent = request.headers.get("user-agent") || null;
    const clientIP = getClientIP(request);
    const isBotTraffic = isBot(userAgent, clientIP);

    console.log(`[${requestId}] Bot detection:`, {
      userAgent: userAgent,
      clientIP: clientIP,
      isBot: isBotTraffic,
    });

    if (isBotTraffic) {
      console.log(`[${requestId}] Routing bot traffic to AI subdomain`);

      // Get LLM URL from environment variable
      const llmUrl = process.env.BUBBLEGUM_AI_URL;
      if (!llmUrl) {
        console.error(`[${requestId}] BUBBLEGUM_AI_URL environment variable not set`);
        // Fall through to normal processing
      } else {
        // Route bot traffic to AI subdomain
        const aiUrl = createAIUrl(request.nextUrl, llmUrl);

        console.log(`[${requestId}] AI URL constructed:`, {
          originalPath: pathname,
          aiPath: aiUrl.pathname,
          fullAiUrl: aiUrl.toString(),
        });

        try {
          // Forward the request to the AI subdomain
          const response = await fetch(aiUrl.toString(), {
            method: request.method,
            headers: Object.fromEntries(request.headers.entries()),
            body: request.body,
          });

          console.log(`[${requestId}] AI subdomain response:`, {
            status: response.status,
            statusText: response.statusText,
            duration: Date.now() - startTime,
          });

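          if (response.status === 200) {
            // Get response body and headers
            const responseBody = await response.arrayBuffer();
            const responseHeaders = new Headers(response.headers);

            // fetch() has already decoded the body, so drop the headers that
            // describe the upstream encoding and length
            responseHeaders.delete("content-encoding");
            responseHeaders.delete("content-length");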

            return new NextResponse(responseBody, {
              status: response.status,
              statusText: response.statusText,
              headers: responseHeaders,
            });
          } else {
            console.log(
              `[${requestId}] AI subdomain returned non-200, falling back to origin`
            );
          }
        } catch (error) {
          console.error(`[${requestId}] AI subdomain fetch failed:`, {
            error: error instanceof Error ? error.message : String(error),
            duration: Date.now() - startTime,
          });
          // Fall through to normal processing
        }
      }
    }

    // For non-bot traffic or when AI subdomain fails, continue with normal Next.js processing
    console.log(`[${requestId}] Continuing with normal Next.js processing`);
    return NextResponse.next();
  } catch (error) {
    console.error(`[${requestId}] Unexpected error in middleware:`, {
      error: error instanceof Error ? error.message : String(error),
      stack: error instanceof Error ? error.stack : undefined,
      duration: Date.now() - startTime,
    });

    // Return error response
    return new NextResponse(
      `Internal Server Error: ${
        error instanceof Error ? error.message : "Unknown error"
      }`,
      {
        status: 500,
        statusText: "Internal Server Error",
      }
    );
  }
}

  • Copy the file verbatim, including the helper functions, bot detection logic, config, and AI routing.

  2. Add this auto-generated ai-ip-addresses.ts at the project root:

ai-ip-addresses.ts
// Auto-generated file - do not edit manually
// Last updated: 2025-09-04T16:14:30.557Z
// Source URLs: https://openai.com/gptbot.json, https://openai.com/chatgpt-user.json, https://openai.com/searchbot.json

export const AI_IP_ADDRESSES = [
  "52.230.152.0/24",
  "20.171.206.0/24",
  "20.171.207.0/24",
  "4.227.36.0/25",
  "20.125.66.80/28",
  "172.182.204.0/24",
  "172.182.214.0/24",
  "172.182.215.0/24",
  "23.98.142.176/28",
  "23.102.140.144/28",
  "13.65.138.112/28",
  "23.98.179.16/28",
  "13.65.138.96/28",
  "172.183.222.128/28",
  "20.102.212.144/28",
  "40.116.73.208/28",
  "172.183.143.224/28",
  "52.190.190.16/28",
  "13.83.237.176/28",
  "51.8.155.64/28",
  "74.249.86.176/28",
  "51.8.155.48/28",
  "20.55.229.144/28",
  "135.237.131.208/28",
  "135.237.133.48/28",
  "51.8.155.112/28",
  "135.237.133.112/28",
  "52.159.249.96/28",
  "52.190.137.16/28",
  "52.255.111.112/28",
  "40.84.181.32/28",
  "172.178.141.112/28",
  "52.190.142.64/28",
  "172.178.140.144/28",
  "52.190.137.144/28",
  "172.178.141.128/28",
  "57.154.187.32/28",
  "4.196.118.112/28",
  "20.193.50.32/28",
  "20.215.188.192/28",
  "20.215.214.16/28",
  "4.197.22.112/28",
  "4.197.115.112/28",
  "172.213.21.16/28",
  "172.213.11.144/28",
  "172.213.12.112/28",
  "172.213.21.144/28",
  "20.90.7.144/28",
  "57.154.175.0/28",
  "57.154.174.112/28",
  "52.236.94.144/28",
  "137.135.191.176/28",
  "23.98.186.192/28",
  "23.98.186.96/28",
  "23.98.186.176/28",
  "23.98.186.64/28",
  "68.221.67.192/28",
  "68.221.67.160/28",
  "13.83.167.128/28",
  "20.228.106.176/28",
  "52.159.227.32/28",
  "68.220.57.64/28",
  "172.213.21.112/28",
  "68.221.67.224/28",
  "68.221.75.16/28",
  "20.97.189.96/28",
  "52.252.113.240/28",
  "52.230.163.32/28",
  "172.212.159.64/28",
  "52.255.111.80/28",
  "52.255.111.0/28",
  "4.151.241.240/28",
  "52.255.111.32/28",
  "52.255.111.48/28",
  "52.255.111.16/28",
  "52.230.164.176/28",
  "52.176.139.176/28",
  "52.173.234.16/28",
  "4.151.71.176/28",
  "4.151.119.48/28",
  "52.255.109.112/28",
  "52.255.109.80/28",
  "20.161.75.208/28",
  "68.154.28.96/28",
  "52.255.109.128/28",
  "52.255.109.96/28",
  "52.255.109.144/28",
  "52.173.234.80/28",
  "132.196.82.48/28",
  "20.249.63.208/28",
  "20.63.221.64/28",
  "13.76.116.80/28",
  "20.235.87.224/28",
  "4.205.128.176/28",
  "52.225.75.208/28",
  "52.190.139.48/28",
  "68.221.67.240/28",
  "40.75.14.224/28",
  "135.119.134.192/28",
  "51.8.155.80/28",
  "135.119.134.128/28",
  "52.173.219.112/28",
  "52.242.132.224/28",
  "52.173.219.96/28",
  "52.242.132.240/28",
  "74.7.36.64/28",
  "74.7.36.96/28",
  "74.7.35.48/28",
  "74.7.35.112/28",
  "74.7.36.80/28",
  "52.156.77.144/28",
  "52.148.129.32/28",
  "20.117.22.224/28",
  "20.235.75.208/28",
  "172.204.16.64/28",
  "4.196.198.80/28",
  "20.194.157.176/28",
  "23.102.141.32/28",
  "52.173.235.80/28",
  "52.173.123.0/28",
  "40.84.221.208/28",
  "104.210.139.224/28",
  "20.0.53.96/28",
  "52.154.22.48/28",
  "52.242.245.208/28",
  "191.235.66.16/28",
  "191.233.196.112/28",
  "191.233.194.32/28",
  "23.97.109.224/28",
  "138.91.46.96/28",
  "13.76.32.208/28",
  "52.187.246.128/28",
  "13.70.107.160/28",
  "138.91.30.48/28",
  "20.210.154.128/28",
  "20.194.1.0/28",
  "20.194.0.208/28",
  "20.77.178.240/28",
  "4.234.83.96/28",
  "40.84.221.224/28",
  "104.210.139.192/28",
  "191.239.245.16/28",
  "191.234.167.128/28",
  "191.235.99.80/28",
  "191.235.98.144/28",
  "68.218.30.112/28",
  "4.197.19.176/28",
  "20.42.10.176/28",
  "172.203.190.128/28",
  "104.210.140.128/28",
  "51.8.102.0/24",
  "135.234.64.0/24",
  "172.182.195.48/28",
  "20.25.151.224/28",
  "20.171.53.224/28",
  "20.169.6.224/28",
  "172.182.193.80/28",
  "172.182.193.224/28",
  "172.182.194.32/28",
  "172.182.194.144/28",
  "172.182.213.192/28",
  "172.182.209.208/28",
  "172.182.224.0/28",
  "172.182.211.192/28",
  "20.169.7.48/28",
  "20.168.18.32/28",
  "20.171.123.64/28",
  "20.14.99.96/28",
] as const;

export const AI_IP_ADDRESSES_METADATA = {
  lastUpdated: "2025-09-04T16:14:30.557Z",
  sourceUrls: [
    "https://openai.com/gptbot.json",
    "https://openai.com/chatgpt-user.json",
    "https://openai.com/searchbot.json",
  ],
  totalCount: 168,
} as const;

export type AI_IP_ADDRESS = (typeof AI_IP_ADDRESSES)[number];

  • This ensures known bot IP detection stays current.

  • The file will be updated as new AI services are onboarded.
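
To make the IP check concrete, here is a small worked sketch of CIDR matching against two ranges from the list above. It is an illustrative variant that uses plain arithmetic rather than the bitwise version in middleware.ts. ipToNumber and isInCidr are hypothetical names, and only IPv4 is handled.

```typescript
// Convert a dotted IPv4 string to an unsigned 32-bit number.
function ipToNumber(ip: string): number {
  return ip
    .split(".")
    .reduce((acc, octet) => acc * 256 + parseInt(octet, 10), 0);
}

// Check whether an IPv4 address falls inside a CIDR block such as "20.171.206.0/24".
function isInCidr(ip: string, cidr: string): boolean {
  const [base = "", bits = "32"] = cidr.split("/");
  const size = 2 ** (32 - parseInt(bits, 10)); // number of addresses in the block
  const start = Math.floor(ipToNumber(base) / size) * size; // first address in the block
  const value = ipToNumber(ip);
  return value >= start && value < start + size;
}

console.log(isInCidr("20.171.206.17", "20.171.206.0/24")); // true
console.log(isInCidr("20.125.66.96", "20.125.66.80/28")); // false (block covers .80 to .95)
```

A /24 block covers 256 addresses and a /28 block covers 16, which is why the second address falls just outside its range.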

  3. Set the BUBBLEGUM_AI_URL environment variable:

  • The AI routing URL will be provided via onboarding, or you can request this from [email protected].
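
Since the middleware falls through to normal processing when BUBBLEGUM_AI_URL is missing, it can be useful to validate the value early. A minimal sketch, assuming the value is an absolute http(s) URL: parseAIBaseUrl is a hypothetical helper, and https://ai.example.com is a placeholder, not a real Bubblegum domain.

```typescript
// Hypothetical helper: return a parsed base URL, or null if the value is unusable.
function parseAIBaseUrl(raw: string | undefined): URL | null {
  if (!raw) return null;
  try {
    const url = new URL(raw);
    return url.protocol === "https:" || url.protocol === "http:" ? url : null;
  } catch {
    return null; // not an absolute URL
  }
}

// Placeholder value for illustration; in the middleware the value comes from
// process.env.BUBBLEGUM_AI_URL.
const base = parseAIBaseUrl("https://ai.example.com");
if (base) {
  console.log(new URL("/about.md", base).toString()); // https://ai.example.com/about.md
}
console.log(parseAIBaseUrl("ai.example.com")); // null
```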

  4. Commit and Deploy:

  • Once both files are added and the environment variable is set, commit your changes and redeploy as usual.

  • Next.js picks up middleware.ts automatically; no additional code is required.


How to Test

To verify that bot redirects are working as expected:

  1. Use a tool like curl or Postman

  2. Send a request with one of the bot user-agent headers (e.g. GPTBot)

  3. Confirm the request is rewritten to the .md version hosted on your Bubblegum domain
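
The same check can be scripted. The sketch below is one possible approach, not an official Bubblegum tool: checkBotRouting and summarizeResponse are hypothetical names, and it assumes AI-ready pages are served with a text/markdown or text/plain content type, which you should confirm for your own Bubblegum domain.

```typescript
interface RoutingCheck {
  status: number;
  contentType: string | null;
  looksLikeMarkdown: boolean;
}

// Assumption: AI-ready pages are served as text/markdown or text/plain.
function summarizeResponse(status: number, contentType: string | null): RoutingCheck {
  const type = (contentType ?? "").toLowerCase();
  const looksLikeMarkdown =
    status === 200 && (type.startsWith("text/markdown") || type.startsWith("text/plain"));
  return { status, contentType, looksLikeMarkdown };
}

// Request a page while identifying as a bot (GPTBot by default).
async function checkBotRouting(pageUrl: string, userAgent = "GPTBot"): Promise<RoutingCheck> {
  const response = await fetch(pageUrl, { headers: { "user-agent": userAgent } });
  return summarizeResponse(response.status, response.headers.get("content-type"));
}
```

To use it, call checkBotRouting with one of your page URLs, then repeat the call with a regular browser user-agent and compare the two results; only the bot request should look like Markdown.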

Need help generating a test command? Reach out to us at [email protected]


Decision Table: Which Option Should You Use?

Approach           | Works For       | Fallback | Customization Level                     | Recommended For
vercel.json        | Any Vercel site | No       | Basic rewriting                         | Static sites, non-Next.js
Next.js middleware | Next.js only    | Yes      | Advanced (logging, IPs, custom routing) | Next.js, hybrid, or dynamic sites

Additional Resources


Frequently Asked Questions:

Q: Can I customize bot traffic destinations?

A: Yes. The middleware is fully configurable. You can define which AI-ready content (like Markdown files or structured endpoints) each AI bot should be routed to, allowing for granular control by page or bot type.
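
As a rough illustration of routing by bot type, the sketch below picks a destination from an ordered rule list. Everything here is hypothetical: the BotRoute shape, the rule names, and the idea of serving Perplexity a .json endpoint are assumptions for the example, not Bubblegum defaults.

```typescript
interface BotRoute {
  name: string;
  pattern: RegExp;
  extension: string; // appended to the page path
}

// Hypothetical rules, checked in order; the first match wins.
const BOT_ROUTES: BotRoute[] = [
  { name: "perplexity", pattern: /PerplexityBot/i, extension: ".json" },
  { name: "default-ai", pattern: /GPTBot|ChatGPT-User|ClaudeBot|OAI-SearchBot/i, extension: ".md" },
];

// Return the AI-ready path for a request, or null for non-bot traffic.
function destinationFor(pathname: string, userAgent: string): string | null {
  const route = BOT_ROUTES.find((r) => r.pattern.test(userAgent));
  return route ? pathname + route.extension : null;
}

console.log(destinationFor("/products/widget", "PerplexityBot/1.0")); // /products/widget.json
console.log(destinationFor("/products/widget", "ClaudeBot/1.0")); // /products/widget.md
console.log(destinationFor("/products/widget", "Mozilla/5.0 Firefox/125")); // null
```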

Q: How does the new markdown content work with our existing website and code?

A: Your existing pages and application code stay as they are. Bubblegum adds a routing layer on Vercel (a vercel.json rewrite or Next.js middleware) that detects whether a request comes from an AI bot (like ChatGPT or Perplexity) and, if so, serves your AI-optimized content. Regular visitors still see your usual site.

Q: Can I see which AI bots are visiting my storefront and what pages they access?

A: Yes. All bot hits and route matches are logged by the middleware. You can use this data to analyze AI bot traffic patterns, page-level access, and even export this for analytics dashboards.

Q: How do I test the routing for different types of AI bots before going live?

A: You can simulate bot traffic by using tools like curl or Postman and setting the User-Agent header to match known AI bots (e.g., ChatGPT-User, ClaudeBot, PerplexityBot). Running these requests against a preview deployment lets you verify detection and routing before going to production.

Q: Can I segment analytics to see where AI-driven purchases or conversions originate?

A: Yes, if you integrate the middleware logs with downstream analytics tools (e.g., Segment, GA4, or custom attribution systems), you can isolate AI-sourced traffic and measure downstream conversions or revenue impact.

Q: What happens if a bot is misidentified or a legitimate customer is mistaken for an AI bot?

A: The bot detection logic is carefully curated and updated to minimize false positives. In the rare case of misidentification, fallback behavior is non-disruptive—most misrouted users would still see product content, just via the AI-ready page.

Q: How does this work with my existing CDN, headless commerce platform, or CMS?

A: The routing runs on Vercel's edge network, before a request reaches your application or CMS. This means it is compatible with any platform (e.g. Shopify, Contentful, Sanity, BigCommerce) without needing backend changes.

Q: Are there risks to privacy or customer data when directing bot traffic?

A: No. The middleware only affects bots—it does not expose any private customer data or session-specific content. AI-ready pages are typically static or structured summaries, ensuring privacy is maintained.
