AI-Generated Sessions in GA4: Do AI Tools Have a Crush on Your Website?
The Case File
Your GA4 dashboard shows thousands of sessions. Your traffic acquisition report lists Google, social media, and direct traffic. But there's a blind spot: AI-generated sessions from platforms like ChatGPT, Perplexity, Gemini, Claude, and DeepSeek are either invisible or misattributed as direct traffic.
This check measures whether your GA4 property is properly identifying and tracking sessions originating from AI platforms. When users click through from ChatGPT's search interface, ask Perplexity for recommendations, or follow a link from Gemini, those visits should appear as distinct referral sources—not vanish into the "direct" bucket.
The benchmark: AI-generated sessions should represent at least 0.1% of your total sessions. If you're seeing zero or near-zero AI traffic, either your property isn't identifying these sessions or AI platforms aren't sending you visitors. In both cases you're missing a rapidly growing channel.
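The benchmark is simple arithmetic, and a minimal sketch makes the threshold concrete. The function names and the session counts below are hypothetical; only the 0.1% figure comes from the check itself.

```python
BENCHMARK = 0.001  # 0.1% expressed as a fraction


def ai_session_share(ai_sessions: int, total_sessions: int) -> float:
    """Return AI sessions as a fraction of all sessions (0.0 when there is no traffic)."""
    if total_sessions <= 0:
        return 0.0
    return ai_sessions / total_sessions


def meets_benchmark(ai_sessions: int, total_sessions: int) -> bool:
    """True when the AI share reaches the 0.1% benchmark."""
    return ai_session_share(ai_sessions, total_sessions) >= BENCHMARK


# Hypothetical property: 42 AI sessions out of 30,000 total sessions
share = ai_session_share(42, 30_000)
print(f"AI share: {share:.2%}, meets benchmark: {meets_benchmark(42, 30_000)}")
```

With these made-up numbers the share is 0.14%, which clears the benchmark; 10 AI sessions on the same total would not.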
The Root Causes
1. GA4's Default Configuration Doesn't Isolate AI Traffic
GA4 tracks referral traffic automatically when a referrer header is passed. Most AI platforms—including ChatGPT, Perplexity, Gemini, and Claude—do send referrer information when users click links. However, GA4's default channel grouping lumps AI referrals into the generic "Referral" channel alongside hundreds of other domains.
Without a custom channel group or exploration report, AI traffic is technically tracked but practically invisible. You won't see "AI" as a line item in your Traffic Acquisition report unless you explicitly configure it.
2. Some AI Traffic Appears as Direct
Not all AI interactions generate trackable sessions. Consider these scenarios:
Copy-paste behavior: Users copy URLs from AI responses and paste them into browsers. No referrer header = direct traffic.
In-app browsers: Some AI platforms use embedded browsers that strip referrer data.
Mobile apps: ChatGPT's mobile app may not consistently pass referrer headers depending on the operating system and version.
According to industry research, while platforms like Perplexity and ChatGPT's web interface do pass referrers, a significant portion of AI-driven visits still appear as direct traffic due to user behavior patterns.
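The difference between a tracked AI referral and a "direct" visit comes down to whether a referrer header arrives. The sketch below is not how GA4 is implemented; it is a simplified illustration, and the lookup table and the `classify_visit` helper are assumptions built from the domains named in this article.

```python
from urllib.parse import urlparse

# Hypothetical lookup table based on the AI domains listed in this article.
AI_REFERRER_DOMAINS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "claude.ai": "Claude",
    "deepseek.com": "DeepSeek",
}


def classify_visit(referrer: str) -> str:
    """Classify a visit from its referrer header (empty string = no header)."""
    if not referrer:
        # Copy-paste visits and stripped referrers end up here.
        return "direct"
    host = (urlparse(referrer).hostname or "").lower()
    for domain, platform in AI_REFERRER_DOMAINS.items():
        # Match the domain itself or any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return f"ai:{platform}"
    return "referral"


print(classify_visit("https://chatgpt.com/"))           # clicked link in ChatGPT
print(classify_visit(""))                               # URL pasted into the browser
print(classify_visit("https://www.perplexity.ai/search"))
```

The same landing page can therefore show up under two different sources depending only on how the visitor arrived from the AI tool.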
3. Blocked AI Crawlers vs. Browsing Agents
AI crawlers (like GPTBot, which collects content for model training) are often confused with AI browsing agents (like ChatGPT-User, which fetches content for user queries).
Your robots.txt file might block crawlers:
User-agent: GPTBot
Disallow: /
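But this doesn't affect referral traffic. When a human clicks a link in ChatGPT, the request comes from that person's own browser, not from GPTBot or ChatGPT-User, so robots.txt rules never apply to the visit itself. Blocking GPTBot keeps OpenAI from collecting your content for training; it doesn't prevent visits or the tracking of those visits.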
However, if you've blocked browsing agents (ChatGPT-User, PerplexityBot, GoogleOther), AI platforms can't fetch your content to display in responses, which indirectly reduces the likelihood of getting cited and receiving referral traffic.
4. Missing Custom Dimensions or Segments
GA4 doesn't automatically create an "AI" traffic segment. Without custom configuration, you must:
Manually filter the Session source dimension for domains like chatgpt.com, perplexity.ai, gemini.google.com, claude.ai, and deepseek.com
Create exploration reports with regex filters
Build custom channel groups
If you haven't done this, AI traffic exists in your data but remains hidden in aggregate reports.
5. Recent Platform Changes
AI platforms are evolving rapidly:
ChatGPT Search (launched late 2024) sends traffic with the referrer chatgpt.com
Perplexity Comet (launched 2025) may use different referrer patterns
Google AI Mode traffic appears with google.com as the referrer, making it indistinguishable from organic search without query parameter analysis
If your tracking setup predates these launches, you may be missing newer AI traffic sources.
The "So What?" (Business Impact)
1. You're Flying Blind on a High-Growth Channel
AI traffic grew 527% in 2025 according to recent studies. ChatGPT alone holds over 80% market share among AI chatbots, with Perplexity at 8-12%. Industry data shows AI traffic now represents 0.12% to 0.5% of total web sessions on average, with some sites seeing 5% or more.
If you're not tracking this channel, you can't:
Optimize content for AI visibility (known as AEO, or Answer Engine Optimization)
Measure ROI from AI platform citations
Understand user intent differences between AI-referred and search-referred visitors
2. Misattribution Breaks Your Marketing Analysis
When AI traffic is misattributed as direct, your reports show:
Inflated direct traffic percentages (making your brand appear stronger than it is)
Undervalued content marketing (blog posts cited by AI platforms get no credit)
Incorrect conversion attribution (AI-driven conversions appear as direct)
This distorts budget allocation decisions. If AI referrals convert well but appear as direct traffic, you might underinvest in AI optimization strategies.
3. Competitive Disadvantage
Your competitors who track AI traffic can:
Identify which content gets cited by AI platforms
Measure engagement quality (session duration, pages per session, conversion rate from AI vs. other sources)
Optimize for AI discoverability (structured data, clear answers, authoritative citations)
Research shows AI-referred traffic has different engagement patterns than search traffic. One case study found ChatGPT traffic had higher bounce rates but longer session durations compared to Google organic. Without tracking, you can't optimize for these behavioral differences.
4. Missed SEO and Content Strategy Signals
AI platforms are becoming answer engines, not just search engines. If you're getting zero AI traffic, it might indicate:
Your content isn't being indexed or cited by AI platforms
Your robots.txt blocks browsing agents
Your content format doesn't match AI citation patterns (lack of clear, concise answers)
This is a leading indicator for future organic search performance, as Google increasingly integrates AI Overviews into search results.
The Investigation (How to Debug)
Method 1: Check Traffic Acquisition Report with Filters
Navigate to Reports > Acquisition > Traffic Acquisition
Click the primary dimension dropdown above the table (Session primary channel group by default) and change it to Session source / medium
Use the search box to filter for AI domains:
Search for chatgpt
Search for perplexity
Search for gemini
Search for claude
Search for deepseek
If you see zero results for all these searches, AI traffic isn't being tracked or isn't reaching your site.
Method 2: Create a GA4 Exploration Report
Go to Explore in the left navigation
Click + (Blank) to create a new exploration
Under Dimensions, click the + icon and add:
Page referrer
Session source
Landing page
Under Metrics, add:
Sessions
Engaged sessions
Average engagement time
Drag Page referrer to the Rows section
Drag Sessions to the Values section
Click Add filter and create a filter:
Dimension: Page referrer
Match type: matches regex
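Expression: .*(chatgpt\.com|perplexity\.ai|gemini\.google\.com|claude\.ai|deepseek\.com|chat\.openai\.com).* (the leading and trailing .* let the pattern match inside a full referrer URL, since "matches regex" is evaluated against the whole value)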
This will show all sessions where the referrer matches known AI platforms.
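Page referrer holds a full URL rather than a bare domain, so it matters whether a regex filter is evaluated as a full match or a partial match. The sketch below assumes that a "matches regex" filter must cover the entire value; treat that as an assumption to verify in your own property. It uses Python's `re` module rather than the RE2 engine GA4 uses, which behaves the same for simple patterns like these.

```python
import re

# Bare alternation of the AI domains listed in this article.
AI_PATTERN = (
    r"chatgpt\.com|perplexity\.ai|gemini\.google\.com"
    r"|claude\.ai|deepseek\.com|chat\.openai\.com"
)

referrer = "https://chatgpt.com/"  # example Page referrer value

# Partial match: the pattern may appear anywhere in the value.
partial = re.search(AI_PATTERN, referrer) is not None

# Full match: the pattern must account for the entire value.
full = re.fullmatch(AI_PATTERN, referrer) is not None

# Wrapping the alternation in .*( ... ).* makes it pass under full-match rules too.
wrapped = re.fullmatch(rf".*({AI_PATTERN}).*", referrer) is not None

print(partial, full, wrapped)
```

If an exploration unexpectedly returns no rows, the full-match behaviour is worth ruling out before concluding there is no AI traffic.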
Method 3: Check for Direct Traffic Anomalies
In Traffic Acquisition, filter to Direct traffic
Add Landing page as a secondary dimension
Look for:
High-value landing pages (blog posts, product pages) with unusually high direct traffic
Landing pages that rank well in search but show high direct traffic percentages
New content that immediately gets direct traffic (unlikely without brand awareness)
These patterns suggest misattributed AI traffic.
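One way to make this check repeatable is to compare each landing page's direct share against the site-wide direct share. The following is only a rough heuristic sketch: the `LandingPageStats` type, the 2x multiplier, the 50-session minimum, and the sample numbers are all assumptions, and a flagged page is a lead to investigate rather than proof of AI traffic.

```python
from dataclasses import dataclass


@dataclass
class LandingPageStats:
    path: str
    direct_sessions: int
    total_sessions: int


def flag_direct_anomalies(pages, site_direct_share, multiplier=2.0, min_sessions=50):
    """Return (path, direct share) for pages far above the site-wide direct share."""
    flagged = []
    for page in pages:
        if page.total_sessions < min_sessions:
            continue  # too little data to judge
        share = page.direct_sessions / page.total_sessions
        if share > multiplier * site_direct_share:
            flagged.append((page.path, round(share, 2)))
    # Highest direct share first
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical export from the Traffic Acquisition report
pages = [
    LandingPageStats("/blog/ga4-guide", direct_sessions=420, total_sessions=600),
    LandingPageStats("/", direct_sessions=300, total_sessions=1500),
    LandingPageStats("/blog/new-post", direct_sessions=5, total_sessions=8),
]
print(flag_direct_anomalies(pages, site_direct_share=0.25))
```

Deep content pages tend to be the interesting ones here, because few people type or bookmark a long blog URL.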
Method 4: Verify robots.txt Configuration
Navigate to yoursite.com/robots.txt
Check for these user agents:
ChatGPT-User (OpenAI browsing agent)
GPTBot (OpenAI crawler for training)
PerplexityBot (Perplexity crawler)
GoogleOther (Google's generic crawler, used by various product teams)
ClaudeBot (Anthropic's crawler)
If you see Disallow: / for browsing agents (ChatGPT-User, GoogleOther), you're blocking AI platforms from accessing your content, which prevents citations and referral traffic.
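The manual check above can be approximated with Python's standard `urllib.robotparser`. This is a sketch under assumptions: the agent list is the one from this article, `yoursite.com` is a placeholder, and the standard-library parser is a simplified interpretation of robots.txt, so individual crawlers may apply the rules slightly differently.

```python
from urllib.robotparser import RobotFileParser

# User agents named in this article.
AGENTS = ["ChatGPT-User", "GPTBot", "PerplexityBot", "GoogleOther", "ClaudeBot"]


def audit_robots(robots_txt: str, url: str = "https://yoursite.com/") -> dict:
    """Return {agent: True/False} for whether each agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AGENTS}


# Sample file: training crawler blocked, browsing agent allowed.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Allow: /
"""

for agent, allowed in audit_robots(SAMPLE).items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Agents with no matching rule and no wildcard entry are reported as allowed, which mirrors how an absent rule is normally treated.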
The Solution (How to Fix)
Solution 1: Create a Custom AI Channel Group (Recommended)
This makes AI traffic visible in all standard GA4 reports.
Step 1: Access Channel Groups
Click Admin (gear icon, bottom left)
Under Data display, click Channel groups
Click Create new channel group
Name it: Custom channel group with AI
Step 2: Copy Default Channel Group
GA4 will prompt you to start from the default. Click Continue
This preserves your existing channel definitions
Step 3: Add AI Channel
Click Add new channel at the top
Name the channel: AI Referral
Under Rules, configure:
Rule 1: Session source matches regex .*chatgpt.*|.*perplexity.*|.*claude.*|.*deepseek.*|.*gemini\.google.*
Click Add condition for additional precision:
Rule 2: Session source matches regex chat\.openai\.com|chatgpt\.com|perplexity\.ai|claude\.ai|deepseek\.com|gemini\.google\.com
Step 4: Position the Channel
Use the drag handle to move AI Referral above the generic Referral channel. Channel groups are evaluated top-to-bottom, so AI traffic must be caught before falling into the broader Referral category.
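The effect of ordering can be illustrated with a first-match-wins loop. This is a simplified model, not GA4's actual evaluator: the two rules below are illustrative stand-ins, and `assign_channel` is a hypothetical helper.

```python
import re

# Ordered (channel name, regex on session source). Both rules are illustrative.
CHANNEL_RULES = [
    ("AI Referral", r".*(chatgpt|perplexity|claude|deepseek).*"),
    ("Referral", r".+"),  # stand-in for the generic catch-all referral rule
]


def assign_channel(session_source: str, rules=CHANNEL_RULES) -> str:
    """Return the first channel whose rule matches the whole source value."""
    for name, pattern in rules:
        if re.fullmatch(pattern, session_source):
            return name
    return "Unassigned"


print(assign_channel("chatgpt.com"))                                  # AI rule listed first
print(assign_channel("chatgpt.com", rules=CHANNEL_RULES[::-1]))       # generic rule listed first
```

With the generic rule first, the AI source is swallowed by Referral and the AI Referral channel never receives any sessions.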
Step 5: Save and Apply
Click Save
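In the Channel groups list, set this group as the primary channel group so that standard reports use it (the built-in Default channel group itself cannot be edited)
Allow 24-48 hours for reports to reflect the new group. Custom channel groups are generally applied to historical data as well, so earlier AI sessions should be regrouped once the change has processed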
Regex Pattern for Comprehensive Coverage:
^(.*\.)?(chatgpt\.com|chat\.openai\.com|perplexity\.ai|claude\.ai|gemini\.google\.com|deepseek\.com|copilot\.microsoft\.com|you\.com)$
This pattern catches:
ChatGPT (chatgpt.com, chat.openai.com)
Perplexity (perplexity.ai)
Claude (claude.ai)
Gemini (gemini.google.com)
DeepSeek (deepseek.com)
Microsoft Copilot (copilot.microsoft.com)
You.com (AI search engine)
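Before saving a pattern like this, it helps to test it against sources that should not match. The sketch below contrasts a deliberately loose pattern with a domain-anchored one; both patterns and the sample sources are assumptions for illustration, and Python's `re` stands in for GA4's RE2 engine.

```python
import re

# Loose: any source containing the letters "ai" anywhere.
LOOSE = re.compile(r".*ai.*")

# Strict: the listed domains, optionally preceded by a subdomain.
STRICT = re.compile(
    r"(.*\.)?(chatgpt\.com|chat\.openai\.com|perplexity\.ai|claude\.ai|"
    r"gemini\.google\.com|deepseek\.com|copilot\.microsoft\.com|you\.com)"
)


def is_ai_source(source: str, pattern=STRICT) -> bool:
    """True when the whole session source matches the given pattern."""
    return pattern.fullmatch(source) is not None


sources = ["chatgpt.com", "www.perplexity.ai", "mail.google.com", "baidu.com", "thankyou.com"]
for source in sources:
    print(f"{source:20} loose={is_ai_source(source, LOOSE)!s:5} strict={is_ai_source(source)}")
```

The loose pattern sweeps in mail.google.com and baidu.com simply because they contain "ai", while the anchored version also avoids treating thankyou.com as you.com.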
Solution 2: Create a Reusable Exploration Report
For immediate visibility without waiting for channel group data:
Go to Explore > + Blank
Name it: AI Traffic Analysis
Add dimensions:
Session source
Session medium
Landing page
Device category
Add metrics:
Sessions
Engaged sessions
Average engagement time
Key events (formerly Conversions)
Event count
Create a Segment:
Click + (Segments) > Create custom segment
Name: AI Traffic
Condition: Session > Session source matches regex (use pattern above)
Apply the segment to your exploration
This report shows AI traffic immediately, including historical data.
Solution 3: Configure robots.txt for AI Visibility
If you want AI platforms to cite your content (which drives referral traffic), ensure browsing agents are allowed:
For OpenAI (ChatGPT):
User-agent: ChatGPT-User
Allow: /
User-agent: OAI-SearchBot
Allow: /
For Perplexity:
User-agent: PerplexityBot
Allow: /
For Google (Gemini):
User-agent: GoogleOther
Allow: /
For Anthropic (Claude):
Copy code
User-agent: ClaudeBot
Allow: /
For DeepSeek:
User-agent: DeepSeekBot
Allow: /
Important distinction: If you want to prevent AI training on your content but still allow citations and traffic, block training crawlers but allow browsing agents:
# Block training crawlers
User-agent: GPTBot
Disallow: /
User-agent: Google-Extended
Disallow: /
# Allow browsing agents (for citations and referral traffic)
User-agent: ChatGPT-User
Allow: /
User-agent: GoogleOther
Allow: /
Solution 4: Implement UTM Tracking for Controlled Tests
For campaigns or content specifically targeting AI platforms:
Create UTM-tagged URLs: yoursite.com/page?utm_source=ai_test&utm_medium=ai_referral&utm_campaign=chatgpt_visibility
Submit these URLs in AI platform interfaces
Track performance in Campaigns reports
This provides definitive attribution but only works for content you can control.
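Tagged URLs are easy to get wrong by hand, especially when the page already has query parameters. A small helper along these lines can build them consistently; `add_utm` is a hypothetical function, and the parameter values are the ones from the example URL above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(url: str, source: str, medium: str, campaign: str) -> str:
    """Return the URL with UTM parameters added, keeping any existing query string."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(
        {"utm_source": source, "utm_medium": medium, "utm_campaign": campaign}
    )
    return urlunsplit(parts._replace(query=urlencode(query)))


tagged = add_utm(
    "https://yoursite.com/page", "ai_test", "ai_referral", "chatgpt_visibility"
)
print(tagged)
```

Existing UTM values on the input URL are overwritten rather than duplicated, which keeps test links unambiguous.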
Solution 5: Monitor with BigQuery (Advanced)
If you export GA4 data to BigQuery, query for AI referrals:
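-- Sessions and purchase revenue per AI referrer host.
-- traffic_source.* stores the user's first-touch source, so this version reads
-- each session's own page_referrer and keys sessions by ga_session_id.
WITH session_rows AS (
  SELECT
    user_pseudo_id,
    (SELECT value.int_value FROM UNNEST(event_params)
      WHERE key = 'ga_session_id') AS ga_session_id,
    -- Host of the earliest non-empty referrer seen in the session
    ARRAY_AGG(
      NET.HOST((SELECT value.string_value FROM UNNEST(event_params)
        WHERE key = 'page_referrer'))
      IGNORE NULLS ORDER BY event_timestamp LIMIT 1)[SAFE_OFFSET(0)] AS referrer_host,
    -- Revenue is summed over all events in the session, not only session_start
    SUM(ecommerce.purchase_revenue) AS revenue
  FROM `your_project.analytics_XXXXXX.events_*`
  WHERE _TABLE_SUFFIX BETWEEN '20250101' AND '20251231'
  GROUP BY user_pseudo_id, ga_session_id
)
SELECT
  referrer_host AS session_source,
  COUNT(*) AS sessions,
  SUM(revenue) AS revenue
FROM session_rows
WHERE REGEXP_CONTAINS(referrer_host, r'chatgpt|perplexity|claude|gemini|deepseek')
GROUP BY session_source
ORDER BY sessions DESC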
This provides historical analysis and custom attribution models.
Case Closed: Watson's Advantage
Manually checking for AI-generated sessions requires creating custom channel groups, building exploration reports with regex filters, and continuously updating patterns as new AI platforms emerge. You must remember to check for ChatGPT, Perplexity, Gemini, Claude, DeepSeek, and future platforms—each with potentially different referrer patterns.
The Watson Analytics Detective dashboard spots this Advice-level check instantly, displaying your AI-generated sessions as a percentage of total traffic alongside 60+ other data quality issues. Watson automatically monitors for all major AI platforms, alerts you when AI traffic drops below the 0.1% benchmark, and verifies that your robots.txt configuration allows AI browsing agents.
While this check is categorized as "Advice" (not Critical), it represents a strategic blind spot that grows more significant each quarter. In an era where AI platforms drive increasing traffic and influence content discovery, visibility into this channel is essential for competitive content strategy.
Discover what Watson sees in your data: www.analyticsdetectives.com/watson