AI-Generated Sessions in GA4: Do AI Tools Have a Crush on Your Website?

The Case File

Your GA4 dashboard shows thousands of sessions. Your traffic acquisition report lists Google, social media, and direct traffic. But there's a blind spot: AI-generated sessions from platforms like ChatGPT, Perplexity, Gemini, Claude, and DeepSeek are either invisible or misattributed as direct traffic.

This check measures whether your GA4 property is properly identifying and tracking sessions originating from AI platforms. When users click through from ChatGPT's search interface, ask Perplexity for recommendations, or follow a link from Gemini, those visits should appear as distinct referral sources—not vanish into the "direct" bucket.

The benchmark: AI-generated sessions should represent at least 0.1% of your total sessions. If you're seeing zero or near-zero AI traffic, either that traffic isn't reaching your site or your property isn't identifying it, and in both cases you have no view of a rapidly growing channel.
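As a rough illustration of the benchmark, the check boils down to one division and one comparison. The function names and the sample numbers below are hypothetical; only the 0.1% threshold comes from the text above.

```python
AI_SHARE_BENCHMARK = 0.001  # 0.1% of total sessions, the benchmark stated above


def ai_session_share(ai_sessions: int, total_sessions: int) -> float:
    """Return AI-generated sessions as a fraction of total sessions."""
    if total_sessions <= 0:
        return 0.0
    return ai_sessions / total_sessions


def meets_benchmark(ai_sessions: int, total_sessions: int) -> bool:
    """True when the AI share is at or above the 0.1% benchmark."""
    return ai_session_share(ai_sessions, total_sessions) >= AI_SHARE_BENCHMARK


# Hypothetical property: 42 AI sessions out of 50,000 total sessions
print(f"{ai_session_share(42, 50_000):.3%}", meets_benchmark(42, 50_000))
```

With these made-up numbers the share comes out just under the benchmark, so the check would flag the property.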

The Root Causes

1. GA4's Default Configuration Doesn't Isolate AI Traffic

GA4 tracks referral traffic automatically when a referrer header is passed. Most AI platforms—including ChatGPT, Perplexity, Gemini, and Claude—do send referrer information when users click links. However, GA4's default channel grouping lumps AI referrals into the generic "Referral" channel alongside hundreds of other domains.

Without a custom channel group or exploration report, AI traffic is technically tracked but practically invisible. You won't see "AI" as a line item in your Traffic Acquisition report unless you explicitly configure it.

2. Some AI Traffic Appears as Direct

Not all AI interactions generate trackable sessions. Consider these scenarios:

  • Copy-paste behavior: Users copy URLs from AI responses and paste them into browsers. No referrer header = direct traffic.

  • In-app browsers: Some AI platforms use embedded browsers that strip referrer data.

  • Mobile apps: ChatGPT's mobile app may not consistently pass referrer headers depending on the operating system and version.

According to industry research, while platforms like Perplexity and ChatGPT's web interface do pass referrers, a significant portion of AI-driven visits still appear as direct traffic due to user behavior patterns.
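The scenarios above come down to one question: did the request carry a referrer, and if so, which host sent it? A minimal sketch of that decision follows. The bucket labels and the function name are hypothetical, and GA4's real attribution logic is more involved than this.

```python
from typing import Optional
from urllib.parse import urlparse

# Hostnames named in this article; extend the set as platforms change.
AI_HOSTS = {
    "chatgpt.com", "chat.openai.com", "perplexity.ai",
    "gemini.google.com", "claude.ai", "deepseek.com",
}


def classify_visit(referrer: Optional[str]) -> str:
    """Bucket a visit using only its referrer value (simplified model)."""
    if not referrer:
        # Copy-paste visits and stripped referrers end up here
        return "direct"
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host in AI_HOSTS or any(host.endswith("." + h) for h in AI_HOSTS):
        return "ai_referral"
    return "referral"


for ref in [None, "https://chatgpt.com/", "https://www.perplexity.ai/search?q=x",
            "https://news.example.com/article"]:
    print(ref, "->", classify_visit(ref))
```

A pasted URL and a stripped referrer look identical to this function, which is exactly why that share of AI-driven visits cannot be recovered from referrer data alone.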

3. Blocked AI Crawlers vs. Browsing Agents

It's easy to confuse AI crawlers (like GPTBot, which collects content for model training) with AI browsing agents (like ChatGPT-User, which fetches content in response to user queries).

Your robots.txt file might block crawlers:

User-agent: GPTBot
Disallow: /

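But this doesn't affect referral traffic. When a human clicks a link in ChatGPT, the request comes from that person's own browser, which robots.txt does not govern. ChatGPT-User is the agent that fetches pages on a user's behalf while an answer is being prepared. Blocking GPTBot keeps OpenAI from collecting your content for training; it doesn't prevent visits or their tracking.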

However, if you've blocked the agents that fetch or index pages for answers (ChatGPT-User, OAI-SearchBot, PerplexityBot), AI platforms can't retrieve your content to display in responses, which indirectly reduces your chances of being cited and receiving referral traffic.

4. Missing Custom Dimensions or Segments

GA4 doesn't automatically create an "AI" traffic segment. Without custom configuration, you must:

  • Manually filter the Session source dimension for domains like chatgpt.com, perplexity.ai, gemini.google.com, claude.ai, and deepseek.com

  • Create exploration reports with regex filters

  • Build custom channel groups

If you haven't done this, AI traffic exists in your data but remains hidden in aggregate reports.

5. Recent Platform Changes

AI platforms are evolving rapidly:

  • ChatGPT Search (launched late 2024) sends traffic with the referrer chatgpt.com

  • Perplexity Comet (launched 2025) may use different referrer patterns

  • Google AI Mode traffic appears with google.com as the referrer, making it indistinguishable from organic search without query parameter analysis

If your tracking setup predates these launches, you may be missing newer AI traffic sources.

The "So What?" (Business Impact)

1. You're Flying Blind on a High-Growth Channel

AI traffic grew 527% in 2025 according to recent studies. ChatGPT alone holds over 80% market share among AI chatbots, with Perplexity at 8-12%. Industry data shows AI traffic now represents 0.12% to 0.5% of total web sessions on average, with some sites seeing 5% or more.

If you're not tracking this channel, you can't:

  • Optimize content for AI visibility (known as AEO, or Answer Engine Optimization)

  • Measure ROI from AI platform citations

  • Understand user intent differences between AI-referred and search-referred visitors

2. Misattribution Breaks Your Marketing Analysis

When AI traffic is misattributed as direct, your reports show:

  • Inflated direct traffic percentages (making your brand appear stronger than it is)

  • Undervalued content marketing (blog posts cited by AI platforms get no credit)

  • Incorrect conversion attribution (AI-driven conversions appear as direct)

This distorts budget allocation decisions. If AI referrals convert well but appear as direct traffic, you might underinvest in AI optimization strategies.

3. Competitive Disadvantage

Your competitors who track AI traffic can:

  • Identify which content gets cited by AI platforms

  • Measure engagement quality (session duration, pages per session, conversion rate from AI vs. other sources)

  • Optimize for AI discoverability (structured data, clear answers, authoritative citations)

Research shows AI-referred traffic has different engagement patterns than search traffic. One case study found ChatGPT traffic had higher bounce rates but longer session durations compared to Google organic. Without tracking, you can't optimize for these behavioral differences.

4. Missed SEO and Content Strategy Signals

AI platforms are becoming answer engines, not just search engines. If you're getting zero AI traffic, it might indicate:

  • Your content isn't being indexed or cited by AI platforms

  • Your robots.txt blocks browsing agents

  • Your content format doesn't match AI citation patterns (lack of clear, concise answers)

This is a leading indicator for future organic search performance, as Google increasingly integrates AI Overviews into search results.

The Investigation (How to Debug)

Method 1: Check Traffic Acquisition Report with Filters

  1. Navigate to Reports > Acquisition > Traffic Acquisition

  2. Click the primary dimension dropdown above the table (Session primary channel group by default) and change it to Session source / medium

  3. Use the search box to filter for AI domains:

    • Search for chatgpt

    • Search for perplexity

    • Search for gemini

    • Search for claude

    • Search for deepseek

If you see zero results for all these searches, AI traffic isn't being tracked or isn't reaching your site.

Method 2: Create a GA4 Exploration Report

  1. Go to Explore in the left navigation

  2. Click + (Blank) to create a new exploration

  3. Under Dimensions, click the + icon and add:

    • Page referrer

    • Session source

    • Landing page

  4. Under Metrics, add:

    • Sessions

    • Engaged sessions

    • Average engagement time

  5. Drag Page referrer to the Rows section

  6. Drag Sessions to the Values section

  7. Click Add filter and create a filter:

    • Dimension: Page referrer

    • Match type: matches regex

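    • Expression: .*(chatgpt\.com|perplexity\.ai|gemini\.google\.com|claude\.ai|deepseek\.com|chat\.openai\.com).* (Page referrer holds a full URL, so the leading and trailing .* let the pattern match anywhere inside it)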

This will show all sessions where the referrer matches known AI platforms.
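If you export referrer values (for example, as a CSV from the exploration), the same domain alternation can be applied offline to double-check the filter. This is a sketch with made-up sample data. Python's `re.search` looks for the pattern anywhere in the string, which may differ from how GA4 applies a given match type.

```python
import re
from collections import Counter

# Same domain alternation as the exploration filter
AI_REFERRER = re.compile(
    r"chatgpt\.com|perplexity\.ai|gemini\.google\.com|"
    r"claude\.ai|deepseek\.com|chat\.openai\.com"
)


def count_ai_referrers(page_referrers):
    """Count how often each AI domain appears in a list of referrer URLs."""
    counts = Counter()
    for ref in page_referrers:
        match = AI_REFERRER.search(ref or "")
        if match:
            counts[match.group(0)] += 1
    return counts


# Hypothetical exported referrer values
sample = [
    "https://chatgpt.com/",
    "https://www.google.com/",
    "https://www.perplexity.ai/search",
    "",
    "https://chatgpt.com/c/abc",
]
print(count_ai_referrers(sample))
```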

Method 3: Check for Direct Traffic Anomalies

  1. In Traffic Acquisition, filter to Direct traffic

  2. Add Landing page as a secondary dimension

  3. Look for:

    • High-value landing pages (blog posts, product pages) with unusually high direct traffic

    • Landing pages that rank well in search but show high direct traffic percentages

    • New content that immediately gets direct traffic (unlikely without brand awareness)

These patterns suggest misattributed AI traffic.
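The same review can be scripted against an export of landing pages. The sketch below flags pages whose direct share looks unusually high. The field names, the 60% threshold, and the 50-session minimum are all assumptions to tune for your own site, not GA4 defaults.

```python
def flag_direct_anomalies(rows, min_sessions=50, direct_share_threshold=0.6):
    """Return (landing_page, direct_share) pairs worth a closer look.

    rows: iterable of dicts with landing_page, direct_sessions, total_sessions.
    """
    flagged = []
    for row in rows:
        total = row["total_sessions"]
        # Skip low-volume pages, where the share is mostly noise
        if total < max(min_sessions, 1):
            continue
        share = row["direct_sessions"] / total
        if share >= direct_share_threshold:
            flagged.append((row["landing_page"], round(share, 2)))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical export rows
rows = [
    {"landing_page": "/blog/deep-guide", "direct_sessions": 180, "total_sessions": 200},
    {"landing_page": "/", "direct_sessions": 300, "total_sessions": 1000},
    {"landing_page": "/tiny", "direct_sessions": 9, "total_sessions": 10},
]
print(flag_direct_anomalies(rows))
```

A flagged page is only a lead to investigate: a high direct share can also come from email clients, messaging apps, or bookmarks.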

Method 4: Verify robots.txt Configuration

  1. Navigate to yoursite.com/robots.txt

  2. Check for these user agents:

    • ChatGPT-User (OpenAI browsing agent)

    • GPTBot (OpenAI crawler for training)

    • PerplexityBot (Perplexity crawler)

    • GoogleOther (a generic Google crawler used by various product teams, not a dedicated Gemini agent)

    • ClaudeBot (Anthropic's crawler)

If you see Disallow: / for the agents that fetch content for answers (ChatGPT-User, PerplexityBot), you're blocking AI platforms from accessing your content, which makes citations and the referral traffic they bring less likely.
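This check can be automated with Python's standard-library robots.txt parser. The sketch below parses a robots.txt body and reports which of the listed agents may fetch the site root. The sample file is made up, and the parser's matching rules are an approximation of how each vendor reads the file.

```python
from urllib.robotparser import RobotFileParser

AGENTS = ["ChatGPT-User", "GPTBot", "PerplexityBot", "GoogleOther", "ClaudeBot"]


def audit_robots(robots_txt: str, agents=AGENTS, path="/"):
    """Return {agent: bool} telling whether each agent may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in agents}


# Hypothetical robots.txt that blocks both the crawler and the browsing agent
sample = """
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /
"""
for agent, allowed in audit_robots(sample).items():
    print(f"{agent}: {'allowed' if allowed else 'BLOCKED'}")
```

To audit a live site instead of a string, `RobotFileParser` also offers `set_url()` and `read()`.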

The Solution (How to Fix)

Solution 1: Create a Custom AI Channel Group (Recommended)

This makes AI traffic visible in all standard GA4 reports.

Step 1: Access Channel Groups

  1. Click Admin (gear icon, bottom left)

  2. Under Data display, click Channel groups

  3. Click Create new channel group

  4. Name it: Custom channel group with AI

Step 2: Copy Default Channel Group

  1. GA4 will prompt you to start from the default. Click Continue

  2. This preserves your existing channel definitions

Step 3: Add AI Channel

  1. Click Add new channel at the top

  2. Name the channel: AI Referral

  3. Under Rules, configure:

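    • Rule 1: Session source matches regex .*chatgpt.*|.*perplexity.*|.*claude.*|.*deepseek.*

    • For stricter matching, use exact hostnames instead of Rule 1, or add them as an OR condition (an AND condition would exclude sources such as gemini.google.com and chat.openai.com, which Rule 1 does not match):

    • Rule 2: Session source matches regex chat\.openai\.com|chatgpt\.com|perplexity\.ai|claude\.ai|deepseek\.com|gemini\.google\.com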

Step 4: Position the Channel

Use the drag handle to move AI Referral above the generic Referral channel. Channel groups are evaluated top-to-bottom, so AI traffic must be caught before falling into the broader Referral category.
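Why the order matters is easiest to see in a toy model of first-match-wins evaluation. The rule functions and the channel lists below are hypothetical simplifications, not GA4's actual channel definitions.

```python
import re

AI_SOURCE = re.compile(r".*(chatgpt|perplexity|claude|deepseek|gemini).*")


def is_ai(source: str, medium: str) -> bool:
    return AI_SOURCE.fullmatch(source) is not None


def is_referral(source: str, medium: str) -> bool:
    return medium == "referral"


def assign_channel(source: str, medium: str, channels) -> str:
    """Return the first channel whose rule matches (top-to-bottom evaluation)."""
    for name, rule in channels:
        if rule(source, medium):
            return name
    return "Unassigned"


ai_first = [("AI Referral", is_ai), ("Referral", is_referral)]
referral_first = [("Referral", is_referral), ("AI Referral", is_ai)]

print(assign_channel("chatgpt.com", "referral", ai_first))
print(assign_channel("chatgpt.com", "referral", referral_first))
```

With AI Referral listed first the session lands in the AI channel. With the order reversed, the generic Referral rule claims it before the AI rule is ever evaluated.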

Step 5: Save and Apply

  1. Click Save

  2. Back in the Channel groups list, set this group as the property's primary channel group (the predefined Default channel group itself cannot be edited)

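  3. Check your reports. Custom channel groups are applied when a report is queried, so past sessions should be reclassified too; allow 24-48 hours for newly collected data to finish processing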

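Regex Pattern for Comprehensive Coverage:

^(.*\.)?(openai\.com|chatgpt\.com|perplexity\.ai|claude\.ai|gemini\.google\.com|deepseek\.com|copilot\.microsoft\.com|you\.com)$

Each alternative is a full hostname with an optional subdomain prefix. A loose alternative such as ^.*ai would also match any source that merely contains "ai" (mail, email), so it is left out.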

This pattern catches:

  • ChatGPT (chatgpt.com, chat.openai.com)

  • Perplexity (perplexity.ai)

  • Claude (claude.ai)

  • Gemini (gemini.google.com)

  • DeepSeek (deepseek.com)

  • Microsoft Copilot (copilot.microsoft.com)

  • You.com (AI search engine)
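Any pattern of this kind is worth testing against sample session sources before it goes into GA4, because a loose alternative can match far more than intended. The sketch below compares a deliberately loose pattern with an anchored hostname pattern. Both patterns and the sample sources are illustrative, and Python's `re` module stands in for GA4's regex engine.

```python
import re

# Deliberately loose: anything containing "ai" matches
LOOSE = re.compile(r"^.*ai")

# Anchored: whole hostname, with an optional subdomain prefix
STRICT = re.compile(
    r"^(.*\.)?(openai\.com|chatgpt\.com|perplexity\.ai|claude\.ai|"
    r"gemini\.google\.com|deepseek\.com|copilot\.microsoft\.com|you\.com)$"
)


def matches(pattern, source: str) -> bool:
    return pattern.search(source) is not None


sources = ["chatgpt.com", "chat.openai.com", "gemini.google.com",
           "mail.example.com", "thankyou.com"]
for source in sources:
    print(f"{source:20} loose={matches(LOOSE, source)!s:5} strict={matches(STRICT, source)}")
```

The loose pattern flags mail.example.com as AI traffic. The anchored pattern accepts only the listed hosts and their subdomains.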

Solution 2: Create a Reusable Exploration Report

For immediate visibility without waiting for channel group data:

  1. Go to Explore > + Blank

  2. Name it: AI Traffic Analysis

  3. Add dimensions:

    • Session source

    • Session medium

    • Landing page

    • Device category

  4. Add metrics:

    • Sessions

    • Engaged sessions

    • Average engagement time

    • Key events (labeled Conversions in older GA4 interfaces)

    • Event count

  5. Create a Segment:

    • Click + (Segments) > Create custom segment

    • Name: AI Traffic

    • Condition: Session > Session source matches regex (use pattern above)

  6. Apply the segment to your exploration

This report shows AI traffic immediately, including historical data.

Solution 3: Configure robots.txt for AI Visibility

If you want AI platforms to cite your content (which drives referral traffic), ensure browsing agents are allowed:

For OpenAI (ChatGPT):

User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /

For Perplexity:

User-agent: PerplexityBot
Allow: /

For Google (GoogleOther is a generic crawler used by several Google teams, not a dedicated Gemini agent):

User-agent: GoogleOther
Allow: /

For Anthropic (Claude):

User-agent: ClaudeBot
Allow: /

For DeepSeek:

User-agent: DeepSeekBot
Allow: /

Important distinction: If you want to prevent AI training on your content but still allow citations and traffic, block training crawlers but allow browsing agents:

# Block training crawlers
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# Allow browsing agents (for citations and referral traffic)
User-agent: ChatGPT-User
Allow: /

User-agent: GoogleOther
Allow: /
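If the agent lists change often, the split policy can be generated rather than hand-edited. This is a small sketch: the `build_robots` helper is hypothetical, and the agent names are simply the ones used in this article.

```python
def build_robots(block, allow) -> str:
    """Render robots.txt groups: Disallow for `block`, Allow for `allow`."""
    lines = ["# Block training crawlers"]
    for agent in block:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines.append("# Allow browsing agents (for citations and referral traffic)")
    for agent in allow:
        lines += [f"User-agent: {agent}", "Allow: /", ""]
    return "\n".join(lines).rstrip() + "\n"


robots_txt = build_robots(
    block=["GPTBot", "Google-Extended"],
    allow=["ChatGPT-User", "GoogleOther"],
)
print(robots_txt)
```

Keeping the two lists in one place makes it harder to block a browsing agent by accident when adding a new training crawler.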

Solution 4: Implement UTM Tracking for Controlled Tests

For campaigns or content specifically targeting AI platforms:

  1. Create UTM-tagged URLs: yoursite.com/page?utm_source=ai_test&utm_medium=ai_referral&utm_campaign=chatgpt_visibility

  2. Submit these URLs in AI platform interfaces

  3. Track performance in Campaigns reports

This provides definitive attribution but only works for content you can control.
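Tagged URLs are easy to get wrong by hand, especially when the page already has query parameters. A minimal sketch using only the standard library follows. The `add_utm` helper is hypothetical, and the parameter values are the ones from the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_utm(url: str, source: str, medium: str, campaign: str) -> str:
    """Return `url` with UTM parameters added, keeping existing query params."""
    parts = urlparse(url)
    query = dict(parse_qsl(parts.query))
    query.update({
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
    })
    return urlunparse(parts._replace(query=urlencode(query)))


print(add_utm("https://yoursite.com/page", "ai_test", "ai_referral", "chatgpt_visibility"))
```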

Solution 5: Monitor with BigQuery (Advanced)

If you export GA4 data to BigQuery, query for AI referrals:

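-- traffic_source is the source that first acquired the user (first touch),
-- so this reports sessions and revenue from users first acquired via an AI platform.
-- For session-scoped attribution, use session_traffic_source_last_click
-- if your export includes that record.
SELECT
  traffic_source.source AS first_user_source,
  traffic_source.medium AS first_user_medium,
  -- A session is identified by user_pseudo_id plus the ga_session_id parameter
  COUNT(DISTINCT CONCAT(user_pseudo_id, '-', CAST(
    (SELECT value.int_value FROM UNNEST(event_params)
     WHERE key = 'ga_session_id') AS STRING))) AS sessions,
  -- purchase_revenue is set only on purchase events, so events are not
  -- restricted to session_start
  SUM(ecommerce.purchase_revenue) AS revenue
FROM
  `your_project.analytics_XXXXXX.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20250101' AND '20251231'
  AND REGEXP_CONTAINS(traffic_source.source,
    r'chatgpt|perplexity|claude|gemini|deepseek')
GROUP BY
  first_user_source, first_user_medium
ORDER BY
  sessions DESC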

This provides historical analysis and custom attribution models.
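Once exported rows are pulled into Python, the same aggregation can be cross-checked locally. The sketch below assumes each event has already been flattened into a dict with `source`, `user_pseudo_id`, `ga_session_id`, and `purchase_revenue` keys, which is a hypothetical shape and not the raw export schema. It identifies a session by the user and session ID pair, a common convention for GA4 export data.

```python
import re
from collections import defaultdict

AI_SOURCE = re.compile(r"chatgpt|perplexity|claude|gemini|deepseek")


def summarize_ai_sessions(events):
    """Return {source: (distinct_sessions, revenue)} for AI-matching sources."""
    sessions = defaultdict(set)
    revenue = defaultdict(float)
    for event in events:
        source = event.get("source") or ""
        if not AI_SOURCE.search(source):
            continue
        # Many events share one session, so dedupe on (user, session id)
        sessions[source].add((event["user_pseudo_id"], event["ga_session_id"]))
        revenue[source] += event.get("purchase_revenue") or 0.0
    return {source: (len(ids), revenue[source]) for source, ids in sessions.items()}


# Hypothetical flattened events: one ChatGPT session with a purchase, one Google session
events = [
    {"source": "chatgpt.com", "user_pseudo_id": "u1", "ga_session_id": 111, "purchase_revenue": None},
    {"source": "chatgpt.com", "user_pseudo_id": "u1", "ga_session_id": 111, "purchase_revenue": 30.0},
    {"source": "google", "user_pseudo_id": "u2", "ga_session_id": 222, "purchase_revenue": 12.0},
]
print(summarize_ai_sessions(events))
```

Counting distinct session keys rather than events keeps a session with many events from being counted more than once.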

Case Closed: Watson's Advantage

Manually checking for AI-generated sessions requires creating custom channel groups, building exploration reports with regex filters, and continuously updating patterns as new AI platforms emerge. You must remember to check for ChatGPT, Perplexity, Gemini, Claude, DeepSeek, and future platforms—each with potentially different referrer patterns.

The Watson Analytics Detective dashboard spots this Advice-level check instantly, displaying your AI-generated sessions as a percentage of total traffic alongside 60+ other data quality issues. Watson automatically monitors for all major AI platforms, alerts you when AI traffic drops below the 0.1% benchmark, and verifies that your robots.txt configuration allows AI browsing agents.

While this check is categorized as "Advice" (not Critical), it represents a strategic blind spot that grows more significant each quarter. In an era where AI platforms drive increasing traffic and influence content discovery, visibility into this channel is essential for competitive content strategy.

Discover what Watson sees in your data: www.analyticsdetectives.com/watson
