PII in GA4: Diagnosis and Solution
The Case File
Your Google Analytics 4 property is collecting Personally Identifiable Information (PII) in page URLs. This is a critical violation of Google's Terms of Service that can result in immediate account suspension.
PII includes email addresses, phone numbers, mailing addresses, social security numbers, full names, and precise geolocation data. When these data points appear in your page_location dimension—the full URL GA4 tracks for every pageview—you're in breach of contract. The goal is zero PII. Any detection is a red flag.
This check scans your GA4 data for patterns that match common PII formats: email addresses containing "@" symbols, numeric sequences resembling phone numbers or SSNs, and address-like strings in URL parameters or paths.
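As a rough illustration of this kind of pattern matching, here is a minimal sketch. The `PII_PATTERNS` table and the `detectPii` helper are hypothetical names, and the regexes are simplified assumptions rather than the rules any particular scanner uses.

```javascript
// Illustrative patterns only: simplified assumptions, not a complete PII ruleset.
const PII_PATTERNS = {
  // Email, with either a literal "@" or its URL-encoded form "%40"
  email: /[A-Za-z0-9._+-]+(?:@|%40)[A-Za-z0-9.-]+\.[A-Za-z]{2,}/i,
  // US SSN written as 123-45-6789
  ssn: /\b\d{3}-\d{2}-\d{4}\b/,
  // Ten-digit phone number with optional separators
  phone: /\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b/,
};

// Returns the names of every pattern that matches somewhere in the URL.
function detectPii(url) {
  return Object.keys(PII_PATTERNS).filter((type) => PII_PATTERNS[type].test(url));
}

console.log(detectPii("https://example.com/login?email=user@domain.com"));
console.log(detectPii("https://example.com/thanks?phone=555-123-4567"));
console.log(detectPii("https://example.com/pricing?utm_source=newsletter"));
```

Patterns this loose will also flag harmless values such as ten-digit order numbers, so treat a match as a prompt for review rather than proof of PII.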
The Root Causes (Why This Happens)
PII doesn't magically appear in your analytics. It leaks through specific technical failures:
1. GET-Based Form Submissions
When forms use the GET method instead of POST, form field values append to the URL as query parameters. A login form with method="GET" creates URLs like example.com/login?email=user@domain.com&password=12345. GA4 automatically captures the full URL, including these parameters.
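To see the leak concretely, the sketch below mimics how a GET submission serializes form fields into the query string. `buildGetSubmissionUrl` is a hypothetical helper written for this illustration, not a browser API.

```javascript
// Mimics GET form serialization: every field ends up in the URL.
function buildGetSubmissionUrl(action, fields) {
  const url = new URL(action);
  for (const [name, value] of Object.entries(fields)) {
    url.searchParams.append(name, value);
  }
  return url.toString();
}

const leaked = buildGetSubmissionUrl("https://example.com/login", {
  email: "user@domain.com",
  password: "12345",
});

console.log(leaked);
// https://example.com/login?email=user%40domain.com&password=12345
```

Note that the "@" arrives URL-encoded as %40, so any audit or redaction pattern should look for both forms.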
2. Client-Side URL Manipulation
Developers sometimes pass user data through URL parameters for routing or state management. Single-page applications (SPAs) built with React, Vue, or Angular may construct URLs like /profile?user_email=john.doe@company.com without sanitizing the data layer.
3. Third-Party Tools and Plugins
Marketing automation platforms, CRM integrations, and email service providers often append tracking parameters that include PII. A campaign link might look like ?email={{contact.email}}&phone={{contact.phone}} if merge tags aren't properly configured.
4. Search Functionality
On-site search bars can capture PII when users type personal information into search fields. If your search uses GET parameters (/search?q=john.smith@email.com), that query becomes part of the page URL GA4 tracks.
5. Referrer Leakage
External sites can pass PII through referrer strings. If a user clicks a link from a third-party platform that includes their email in the URL, GA4 records that referring URL in the page_referrer parameter alongside the pageview.
6. GTM Variable Misconfiguration
Custom GTM variables that pull data from the data layer, cookies, or JavaScript variables may inadvertently include PII. A poorly configured Page URL or Page Path variable can pass unfiltered data to GA4 tags.
The "So What?" (Business Impact)
This isn't a minor technical hiccup. The consequences are severe:
Legal Exposure: Collecting PII without proper consent and data processing agreements can violate GDPR, CCPA, and similar privacy regulations. GDPR fines alone can reach €20 million or 4% of annual global turnover, whichever is higher. Google's Terms of Service explicitly prohibit PII collection, making violations a contractual breach.
Account Suspension: Google can suspend your GA4 property without warning if automated systems or manual reviews detect PII. You lose all historical data, real-time reporting, and the ability to track conversions. There's no guaranteed appeal process.
Data Integrity Collapse: PII in URLs creates data fragmentation. Each unique email or phone number generates a separate row in any report keyed on the full URL or on the path plus query string, inflating your page count and making analysis impossible. Your Pages and Screens report becomes unusable noise.
Compliance Audit Failures: If you're subject to SOC 2, ISO 27001, or HIPAA audits, PII in analytics systems is an automatic finding. Remediation is costly and damages stakeholder trust.
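The data fragmentation point above can be shown with a few lines of arithmetic. The URLs here are made-up samples, and `stripParam` is a hypothetical helper.

```javascript
// Made-up sample data: three visits to one page, each carrying a different email.
const pageLocations = [
  "https://example.com/welcome?email=ann%40example.com",
  "https://example.com/welcome?email=bob%40example.com",
  "https://example.com/welcome?email=cy%40example.com",
  "https://example.com/pricing",
];

// Removes one query parameter and returns the remaining URL.
function stripParam(rawUrl, name) {
  const url = new URL(rawUrl);
  url.searchParams.delete(name);
  return url.toString();
}

const distinctRaw = new Set(pageLocations).size;
const distinctClean = new Set(pageLocations.map((u) => stripParam(u, "email"))).size;

console.log(distinctRaw, distinctClean); // 4 2
```

Two real pages show up as four distinct URLs. On a site with thousands of users, the same effect buries every affected page under one row per visitor.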
The Investigation (How to Debug)
You can manually audit for PII without Watson using GA4's native interface:
Navigate to Reports > Engagement > Pages and screens
In the data table, click the search icon (magnifying glass) above the Page path and screen class dimension
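Search for common PII patterns. This table lists page paths, so query-parameter patterns such as email= only show up if the selected dimension includes the query string: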
@ (email addresses)
email= (query parameter)
phone= or tel=
Sequences of 9-10 digits (phone numbers, SSNs)
For deeper analysis, go to Explore > Free form
Add Page location as a dimension (not Page path—you need the full URL with query parameters)
Add a filter: Page location > contains > @
Review results for email patterns, then repeat with filters like phone= or ssn=. For pattern searches such as digit runs, switch the filter's match type from contains to matches regex
This manual process is time-intensive and prone to false negatives. You're searching for needles in a haystack of thousands of URLs.
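If you export the Page location values from an Exploration, a short script can do part of the sifting. The sketch below is one possible approach under that assumption: `looksLikePii` and `findSuspectParams` are hypothetical helpers, and the heuristics are deliberately simple.

```javascript
// Heuristic check on a single decoded parameter value.
function looksLikePii(value) {
  if (/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value)) return true; // email-shaped
  const digits = value.replace(/\D/g, "");
  // Phone- or SSN-shaped: at least 9 digits and nothing but digits and separators
  return digits.length >= 9 && /^[\d\s().+-]+$/.test(value);
}

// Counts, per query parameter name, how many URLs carry a PII-shaped value.
function findSuspectParams(urls) {
  const counts = {};
  for (const raw of urls) {
    let url;
    try {
      url = new URL(raw);
    } catch {
      continue; // skip rows that are not valid absolute URLs
    }
    for (const [name, value] of url.searchParams) {
      if (looksLikePii(value)) counts[name] = (counts[name] || 0) + 1;
    }
  }
  return counts;
}

const exported = [
  "https://example.com/a?email=ann%40example.com&utm_source=news",
  "https://example.com/b?email=bob@example.com",
  "https://example.com/c?tel=555-123-4567&page=2",
  "not a url",
];

console.log(findSuspectParams(exported)); // { email: 2, tel: 1 }
```

The resulting parameter names are a useful starting list for the redaction settings described in the next section. The script only inspects query parameters, so PII embedded in the path still needs a separate check.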
The Solution (How to Fix)
Fix PII leakage at three levels: prevention, redaction, and remediation.
Level 1: Enable GA4 Data Redaction (Immediate)
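Google introduced client-side data redaction in late 2023. When enabled, it redacts email addresses and the values of query parameters you specify before the data leaves the browser. It does not catch other kinds of PII, such as names or phone numbers in the URL path.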
Step-by-step:
Go to Admin (gear icon, bottom left)
Under Data collection and modification, click Data streams
Select your web data stream
On the Web stream details page, click Redact data (listed under Events)
Toggle on Email redaction (automatically detects and masks email addresses)
Toggle on Redact URL query parameters
In the text field, list query parameter names to redact (comma-separated): email,phone,user_email,customer_email,tel,ssn,userid
Click Test and enter sample URLs to verify redaction works
Click Save
Important: This only affects new data. Historical PII remains in your reports.
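To reason about which sample URLs are worth testing, it can help to approximate query-parameter redaction locally. This sketch is not Google's implementation: `redactQueryParams`, the case-insensitive name matching, and the "REDACTED" placeholder are all assumptions made for illustration.

```javascript
// Local approximation only. GA4's actual matching rules and placeholder may differ.
function redactQueryParams(rawUrl, paramNames) {
  const blocked = new Set(paramNames.map((n) => n.toLowerCase()));
  const url = new URL(rawUrl);
  // Copy the keys first so the parameter list is not modified while iterating it.
  for (const name of Array.from(url.searchParams.keys())) {
    if (blocked.has(name.toLowerCase())) {
      url.searchParams.set(name, "REDACTED");
    }
  }
  return url.toString();
}

const names = ["email", "phone", "user_email", "customer_email", "tel", "ssn", "userid"];

console.log(redactQueryParams("https://example.com/thanks?email=user%40domain.com&utm_source=news", names));
// https://example.com/thanks?email=REDACTED&utm_source=news
```

A parameter that is missing from the list passes through untouched, which is why the list should come from an audit of your real URLs rather than guesswork.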
Level 2: GTM-Based URL Sanitization (Advanced)
For granular control, use Google Tag Manager to strip PII before GA4 tags fire.
Create a Custom JavaScript Variable:
In GTM, go to Variables > New > Custom JavaScript
Name it JS - Clean Page URL
Paste this code (adapted from Simo Ahava's PII removal script):
function() {
  var url = {{Page URL}}; // Built-in GTM variable
  if (!url) { return url; }
  // Remove email addresses, including a URL-encoded "@" (%40)
  url = url.replace(/[a-z0-9._+-]+(?:@|%40)[a-z0-9.-]+\.[a-z]{2,}/gi, '[email-redacted]');
  // Redact values of common PII query parameters (stop at & or # so the fragment survives)
  url = url.replace(/([?&])(email|phone|tel|ssn|user_email|customer_email)=[^&#]*/gi, '$1$2=[redacted]');
  // Redact digit runs of 9+ (phone/SSN patterns); this also hits long numeric IDs
  url = url.replace(/\b\d{9,}\b/g, '[number-redacted]');
  return url;
}
In your Google tag (formerly the GA4 Configuration tag), add a configuration parameter (Fields to Set in older containers):
Field Name: page_location
Value: {{JS - Clean Page URL}}
This overrides the default page URL with your sanitized version.
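Before publishing the container, it may be worth exercising the same kind of replacements outside GTM. The sketch below swaps the GTM variable lookup for a plain argument. `sanitizeUrl` is a hypothetical name, and its patterns are one reasonable variant rather than a canonical set.

```javascript
// Standalone variant for local testing: takes the URL as an argument.
function sanitizeUrl(url) {
  return url
    // Emails, with a literal "@" or the URL-encoded "%40"
    .replace(/[A-Za-z0-9._+-]+(?:@|%40)[A-Za-z0-9.-]+\.[A-Za-z]{2,}/gi, "[email-redacted]")
    // Values of known PII parameters, stopping at "&" or "#"
    .replace(/([?&])(email|phone|tel|ssn|user_email|customer_email)=[^&#]*/gi, "$1$2=[redacted]")
    // Digit runs of nine or more
    .replace(/\b\d{9,}\b/g, "[number-redacted]");
}

const samples = [
  "https://example.com/login?email=user%40domain.com&plan=pro",
  "https://example.com/account/555123456/orders",
  "https://example.com/u/jane.doe@company.com#top",
];

for (const sample of samples) {
  console.log(sanitizeUrl(sample));
}
```

The digit rule is the bluntest of the three: it also rewrites order numbers, timestamps, and other long numeric IDs, so drop or narrow it if your URLs rely on them.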
Level 3: Prevent PII at the Source (Best Practice)
For Developers:
Change every form that collects personal data to method="POST" instead of GET
Never pass PII in URL parameters—use session storage, cookies (with HttpOnly flags), or server-side session management
Sanitize data layer pushes: dataLayer.push({'user_id': hashedId}) instead of raw emails
Implement URL rewriting rules to strip sensitive parameters at the server level
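For the first two points, the sketch below shows the same login fields sent in a POST body, which keeps them out of the URL that GA4 records. `buildPostSubmission` is a hypothetical helper, and the endpoint is a placeholder.

```javascript
// Builds the arguments for fetch(): the fields travel in the body, not the URL.
function buildPostSubmission(action, fields) {
  return {
    url: action,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      body: new URLSearchParams(fields).toString(),
    },
  };
}

const { url, init } = buildPostSubmission("https://example.com/login", {
  email: "user@domain.com",
});

console.log(url);       // https://example.com/login
console.log(init.body); // email=user%40domain.com

// In a browser this would be sent with: fetch(url, init)
```

The URL stays free of field values, so there is nothing for page_location to pick up.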
For Marketers:
Audit email campaign links and UTM parameters for merge tags that might insert PII
Configure CRM integrations to hash or tokenize user identifiers before appending to URLs
Review third-party tracking pixels and ensure they don't pass PII in their callbacks
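A campaign link audit can be partly automated. This sketch assumes the double-curly-brace merge tag syntax used in the example earlier in this article. Other platforms use different delimiters, and both `findRiskyMergeTags` and the field-name list are hypothetical.

```javascript
// Field names that suggest personal data. Illustrative, not exhaustive.
const SENSITIVE_FIELD = /email|phone|tel|ssn|name|address/i;

// Returns the merge tags in a link template whose field names look sensitive.
function findRiskyMergeTags(linkTemplate) {
  const risky = [];
  const mergeTag = /\{\{\s*([^}]+?)\s*\}\}/g;
  let match;
  while ((match = mergeTag.exec(linkTemplate)) !== null) {
    if (SENSITIVE_FIELD.test(match[1])) risky.push(match[1]);
  }
  return risky;
}

const template =
  "https://example.com/offer?email={{contact.email}}&phone={{contact.phone}}&utm_campaign={{campaign.id}}";

console.log(findRiskyMergeTags(template)); // [ 'contact.email', 'contact.phone' ]
```

Substring matching is crude (a field called hotel_id would be flagged for containing "tel"), so read the output as a review list rather than a verdict.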
Remediation: Delete Historical PII
If PII already exists in GA4 reports, you must request deletion:
Go to Admin > Data deletion requests (under Data collection and modification)
Create a new request specifying the date range and affected data streams
Note: Deletion is irreversible and can take 45+ days to process
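For immediate mitigation, add a filter to the affected reports to exclude pages matching PII patterns. This only hides the rows from view; the underlying data stays in GA4 until the deletion request is processed.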
Case Closed
Manually hunting for PII across thousands of page URLs is tedious and error-prone. You need regex expertise, deep GA4 knowledge, and hours of exploration time.
The Watson Analytics Detective dashboard spots this Critical error instantly. It scans your entire GA4 property, flags suspected PII by type (email, phone, address), and shows you exactly which pages are affected—alongside 60+ other data quality checks. No manual digging. No missed violations.
Investigate faster: www.analyticsdetectives.com/watson