Fix (not set) Hostname in GA4

The Case File

When you open your GA4 reports and see traffic attributed to (not set) as the hostname, you're looking at events that didn't originate from your website's tracking tag. The hostname dimension in GA4 identifies which domain sent the analytics hit—it's extracted automatically from the page URL when gtag.js or the GA4 configuration tag fires on your website.

When this dimension shows (not set), GA4 received event data without usable URL context. Most of the time this isn't web traffic in the traditional sense: it is either server-side events sent via the Measurement Protocol (intentional backend tracking) or bot spam polluting your data (malicious or accidental). Less often, a genuine web hit loses its URL in transit, as covered under root causes 2 and 4 below.

A high volume of (not set) hostname events—especially if they're marked as key events (conversions)—can distort your user metrics, inflate engagement rates, and make attribution analysis unreliable.

The Root Causes

1. Measurement Protocol Events Without page_location

GA4's Measurement Protocol allows you to send events directly to Google's servers from backend systems, CRMs, or server-side applications. These events bypass the browser entirely.

The hostname dimension is populated by parsing the page_location event parameter. If your Measurement Protocol implementation doesn't include page_location in the event payload, GA4 has no URL to extract a hostname from—resulting in (not set).

Example scenario: You're tracking offline conversions (e.g., phone call bookings, in-store purchases) via a CRM integration. The system sends a purchase event to GA4 but omits the page_location parameter because there was no webpage involved.

According to the Google Analytics API documentation, the hostname dimension is "populated by the event parameter page_location." Without it, the dimension cannot be populated.
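To make the dependency concrete, here is a minimal Python sketch of how a hostname could be derived from page_location. It only illustrates the relationship described above and is not GA4's actual processing code; the function name hostname_from_event and the NOT_SET constant are made up for this example.

```python
from urllib.parse import urlparse

NOT_SET = "(not set)"  # label GA4 reports show when a dimension has no value


def hostname_from_event(params: dict) -> str:
    """Illustrative only: derive a hostname from an event's page_location."""
    page_location = params.get("page_location")
    if not isinstance(page_location, str) or not page_location:
        # No URL was sent with the event, so there is nothing to parse.
        return NOT_SET
    host = urlparse(page_location).hostname
    return host if host else NOT_SET


print(hostname_from_event({"page_location": "https://www.yoursite.com/checkout"}))
print(hostname_from_event({"currency": "USD", "value": 99.99}))
```

The second call mirrors the offline-conversion scenario: the event carries business parameters but no URL, so no hostname can be extracted.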

2. Server-Side Google Tag Manager (sGTM) Misconfiguration

If you're using server-side GTM, events are proxied through your own server container before reaching GA4. If the server container doesn't forward the page_location parameter correctly—or if the GA4 client in sGTM is misconfigured—hostname data can be lost in transit.

This is especially common when:

  • Events are transformed or enriched server-side, and the page_location parameter is accidentally stripped

  • Custom server-side tags fire events without inheriting client-side context

3. Ghost Spam and Bot Traffic

Ghost spam is a legacy issue from Universal Analytics that still occasionally affects GA4. Malicious bots send fake hits directly to the Measurement Protocol endpoint without ever visiting your website. Because these hits never load your actual pages, they contain no hostname data.

While GA4's Measurement Protocol requires an API secret (making ghost spam harder than in UA), leaked or exposed API secrets can still allow spam through. Additionally, some crawler bots may trigger tags in unusual ways that fail to capture the page URL.

As noted in research on bot detection in GA4, GA4's default bot filtering isn't comprehensive, and ghost spam can still report fake or missing hostnames.

4. Race Conditions and Data Layer Timing Issues

In rare cases, if a GTM tag fires before the page URL is available in the data layer (e.g., during a redirect or in a single-page application), the page_location parameter might be undefined when the event is sent. This is a timing issue, not a configuration error per se, but it produces the same symptom.

The "So What?" (Business Impact)

Polluted Conversion Data

If (not set) hostname events include key events, your conversion metrics are inflated with non-web activity. This breaks funnel analysis and makes it impossible to attribute conversions to specific pages or campaigns.

Skewed User Metrics

Bot spam with (not set) hostnames artificially inflates session counts, engagement rates, and active user metrics. This makes your data unreliable for executive reporting or A/B testing.

Broken Attribution

Without hostname context, you can't determine which subdomain, landing page, or website property drove the event. For multi-domain setups, this is catastrophic—you lose visibility into which site or campaign is performing.

Wasted Ad Spend

If your GA4 data feeds into Google Ads for conversion tracking or audience building, polluted data can trigger incorrect bidding strategies or audience exclusions, wasting budget on low-quality traffic.

The Investigation (How to Debug)

To confirm whether (not set) hostname is affecting your data, follow these steps:

Step 1: Create a Custom Exploration Report

  1. In GA4, go to Explore in the left sidebar

  2. Create a new Free Form exploration

  3. Add Hostname as a Row dimension

  4. Add Event count, Key events, and Users as Metrics

  5. Look for the (not set) row

Step 2: Analyze Event Composition

  1. Add Event name as a secondary dimension

  2. Check which specific events are showing (not set) hostname

  3. If you see events like purchase, generate_lead, or custom backend events, these are likely Measurement Protocol hits

  4. If you see browser events like page_view or session_start with (not set), this suggests bot spam or, less often, a tagging problem such as root causes 2 and 4 above
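If you export the (not set) rows from the exploration, the triage in Step 2 can be sketched in a few lines of Python. The event sets and labels below are assumptions taken from the examples in this step; adjust them to the events your own backend sends.

```python
# Hypothetical event groupings based on the examples in Step 2.
BACKEND_EVENTS = {"purchase", "generate_lead"}
BROWSER_EVENTS = {"page_view", "session_start"}


def triage(event_name: str) -> str:
    """Give a first-pass label to an event seen with a (not set) hostname."""
    if event_name in BACKEND_EVENTS:
        return "likely Measurement Protocol"
    if event_name in BROWSER_EVENTS:
        return "suspicious: browser event without a URL"
    return "review manually"


def triage_rows(rows):
    """Sum event counts per label. rows: iterable of (event_name, event_count)."""
    summary = {}
    for event_name, count in rows:
        label = triage(event_name)
        summary[label] = summary.get(label, 0) + count
    return summary


rows = [("purchase", 40), ("page_view", 300), ("crm_sync", 5), ("generate_lead", 10)]
print(triage_rows(rows))
```

This is only a starting point for the investigation. Step 3 adds the traffic source as a second signal.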

Step 3: Check Traffic Source

  1. Add Session source/medium as another dimension

  2. (not set) hostname combined with suspicious referrers (e.g., random domains, (direct)/(none) with zero engagement) points to spam

  3. Legitimate Measurement Protocol events will often show (direct)/(none) or match your CRM/backend source

Step 4: Review Key Events

Calculate the percentage of total key events attributed to (not set). As a rule of thumb, if this exceeds 5%, treat it as a data quality problem that needs prompt attention.
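The calculation itself is simple. Here is a small Python sketch, assuming you have exported the exploration rows as dictionaries with hostname and key_events fields (both field names are illustrative, not a GA4 export format):

```python
def not_set_key_event_share(rows) -> float:
    """Return the percentage of key events whose hostname is (not set)."""
    total = sum(row["key_events"] for row in rows)
    if total == 0:
        return 0.0  # no key events at all, so nothing is affected
    not_set = sum(row["key_events"] for row in rows if row["hostname"] == "(not set)")
    return 100.0 * not_set / total


rows = [
    {"hostname": "www.yoursite.com", "key_events": 180},
    {"hostname": "(not set)", "key_events": 20},
]
share = not_set_key_event_share(rows)
print(f"{share:.1f}% of key events have no hostname")
```

With these sample numbers, 20 of 200 key events lack a hostname, which is 10% and well above the threshold discussed above.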

The Solution (How to Fix)

Fix 1: Add page_location to Measurement Protocol Events

If you're intentionally using Measurement Protocol for server-side tracking, you must include the page_location parameter in your event payload.

Example (JSON payload):

{
  "client_id": "12345.67890",
  "events": [{
    "name": "purchase",
    "params": {
      "page_location": "https://www.yoursite.com/checkout/confirmation",
      "currency": "USD",
      "value": 99.99
    }
  }]
}

Even if the event didn't occur on a webpage, you can pass a logical page_location (e.g., https://www.yoursite.com/crm/offline-conversion) to ensure hostname is populated correctly.

According to Stack Overflow discussions, once page_location is passed, the hostname dimension is automatically populated by GA4.
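On the backend, one way to make sure page_location is never forgotten is to add it in the helper that builds every request. The sketch below only builds the URL and JSON body and does not send anything. The function name build_mp_request, the fallback URL, and the placeholder IDs are assumptions for illustration.

```python
import json
from urllib.parse import urlencode

MP_ENDPOINT = "https://www.google-analytics.com/mp/collect"


def build_mp_request(measurement_id, api_secret, client_id, name, params,
                     fallback_page_location):
    """Build (url, body) for a Measurement Protocol event that always has page_location."""
    event_params = dict(params)  # copy so the caller's dict is not modified
    # Use a logical URL when the event did not happen on a real page.
    event_params.setdefault("page_location", fallback_page_location)
    query = urlencode({"measurement_id": measurement_id, "api_secret": api_secret})
    body = json.dumps({
        "client_id": client_id,
        "events": [{"name": name, "params": event_params}],
    })
    return f"{MP_ENDPOINT}?{query}", body


url, body = build_mp_request(
    "G-XXXXXXX",            # placeholder measurement ID
    "your-api-secret",      # placeholder; load the real one from a secure store
    "12345.67890",
    "purchase",
    {"currency": "USD", "value": 99.99},
    "https://www.yoursite.com/crm/offline-conversion",
)
print(body)
```

Keeping the fallback in one place means a new event type added later cannot silently ship without a URL.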

Fix 2: Validate Server-Side GTM Configuration

If using server-side GTM:

  1. Go to your sGTM container

  2. Check your GA4 client configuration

  3. Ensure the client is correctly parsing and forwarding the page_location parameter from incoming requests

  4. Use Preview mode in sGTM to inspect event payloads and confirm page_location is present

Fix 3: Filter Out Bot Spam in GTM

To prevent bot spam from reaching GA4:

  1. In Google Tag Manager (web container), create a new Variable:

    • Type: Custom JavaScript

    • Variable Name: Hostname

    • Code: function() { return window.location.hostname; }

  2. Create a Trigger Exception on your GA4 Configuration Tag:

    • Trigger Type: Page View

    • Condition: Hostname does not match RegEx ^(www\.)?yoursite\.com$

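This ensures the GA4 tag only fires when your container loads on one of your legitimate domains, which stops hits from scraped or mirrored copies of your pages. It cannot block ghost hits sent straight to the Measurement Protocol endpoint, because those never run GTM at all; for those, see Fix 5.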
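Before publishing the trigger exception, it helps to check which hostnames the pattern actually accepts. GTM evaluates the pattern as a JavaScript regular expression; for a pattern this simple, Python's re module behaves the same way, so a quick offline check could look like the sketch below. The helper name tag_should_fire is made up for this example.

```python
import re

# Same pattern as in the trigger condition above.
ALLOWED_HOSTNAME = re.compile(r"^(www\.)?yoursite\.com$")


def tag_should_fire(hostname: str) -> bool:
    """True when the hostname is one of the allowed domains."""
    return ALLOWED_HOSTNAME.fullmatch(hostname) is not None


for host in ["www.yoursite.com", "yoursite.com", "shop.yoursite.com",
             "yoursite.com.example.net"]:
    print(host, tag_should_fire(host))
```

Note that the pattern as written rejects other subdomains such as shop.yoursite.com. If you track several subdomains, extend the pattern before adding the exception, or you will block your own traffic.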

Fix 4: Filter (not set) Hostname After Collection

While prevention at the source is ideal, you can also exclude (not set) hostname data at analysis time:

  1. In Explore, open your exploration and add a filter:

    • Dimension: Hostname

    • Match type: does not exactly match

    • Value: (not set)

  2. For recurring reporting, apply the same condition in BigQuery or your BI tool

Note: GA4's built-in data filters (Admin > Data Settings > Data Filters) only support Internal Traffic and Developer Traffic, so they cannot exclude events by hostname. For anything beyond ad hoc exploration, filter the BigQuery export at the analysis layer, or stop the unwanted hits at the source (recommended).
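For the BigQuery route, the filtering logic is a simple split on whether a hostname is present. The Python sketch below assumes rows shaped like the GA4 BigQuery export, where the hostname sits under device.web_info.hostname, and assumes a missing hostname arrives as an empty or null value rather than the literal (not set) label used in reports. Verify both against your own export before relying on it.

```python
def hostname_of(row: dict):
    """Read device.web_info.hostname from an export-style row, tolerating gaps."""
    web_info = (row.get("device") or {}).get("web_info") or {}
    return web_info.get("hostname")


def split_by_hostname(rows):
    """Return (rows_with_hostname, rows_without_hostname)."""
    with_host, without_host = [], []
    for row in rows:
        target = with_host if hostname_of(row) else without_host
        target.append(row)
    return with_host, without_host


rows = [
    {"event_name": "page_view", "device": {"web_info": {"hostname": "www.yoursite.com"}}},
    {"event_name": "purchase", "device": {"web_info": {"hostname": None}}},
    {"event_name": "generate_lead", "device": {}},
]
clean, missing = split_by_hostname(rows)
print(len(clean), "rows kept,", len(missing), "rows without hostname")
```

Keeping the rows without a hostname in a separate list, rather than discarding them, lets you still audit legitimate backend events.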

Fix 5: Secure Your Measurement Protocol API Secret

If you suspect spam via Measurement Protocol:

  1. Go to Admin > Data Streams > Select your web stream

  2. Click Measurement Protocol API secrets

  3. Delete any exposed or old secrets

  4. Generate a new secret and update your backend integrations

  5. Never expose API secrets in client-side code or public repositories
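One common way to keep the secret out of source code is to read it from the environment at runtime. This is a minimal sketch; the variable name GA4_MP_API_SECRET is an assumption, so use whatever your deployment or secret manager provides.

```python
import os


def load_api_secret(env_var: str = "GA4_MP_API_SECRET") -> str:
    """Read the Measurement Protocol API secret from the environment."""
    secret = os.environ.get(env_var, "").strip()
    if not secret:
        # Fail loudly instead of sending events with an empty secret.
        raise RuntimeError(f"{env_var} is not set; refusing to send events")
    return secret
```

Because the secret never appears in the repository, rotating it in step 4 only requires updating the environment, not redeploying code.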

Case Closed

Finding (not set) hostname by hand means building custom exploration reports, cross-referencing event names, and auditing Measurement Protocol implementations, a process that can take 30+ minutes per property.

The Watson Analytics Detective dashboard spots this Warning-level error instantly, alongside 60+ other data quality checks. It automatically calculates the percentage of key events affected, flags suspicious event patterns, and provides a visual breakdown of which events are missing hostname data—saving you hours of manual GA4 auditing.

