You collect IP addresses in your logs, your analytics, your audit trail. Under GDPR and other privacy regimes, IPs are personal data. The default of “keep everything indefinitely” doesn’t fly anymore. The standard response is IP anonymization — transforming IPs so they retain analytical value without identifying individuals.
This post covers the major anonymization techniques (truncation, hashing, salting, dropping), their trade-offs, when each is appropriate, and the patterns used in practice.
Why Anonymize
A few reasons:
Legal compliance
GDPR, CCPA, and similar laws require data minimization: don’t collect or retain what you don’t need. For most use cases (analytics, fraud detection), the full IP isn’t required after the initial processing.
Risk reduction
A breach of anonymized data is less damaging than a breach of raw IPs. Even if attackers get the data, they can’t trivially identify users.
Trust signaling
“We anonymize IPs after 24 hours” is a credible privacy commitment to display in policies.
Cost
Less data to store, fewer compliance reviews, simpler subject access requests.
Anonymization Techniques
Truncation
Zero the last N bits of the IP. The most common approach.
Original IPv4: 192.168.42.123
Truncated /24: 192.168.42.0 ← last octet zeroed
Truncated /16: 192.168.0.0 ← last two octets zeroed
Original IPv6: 2001:db8:1234:5678:90ab:cdef:1234:5678
Truncated /48: 2001:db8:1234:: ← last 80 bits zeroed
Pros:
- Trivial to implement.
- Preserves geographic and network-level information (you can still geolocate to country/region; ASN may still resolve).
- One-way: the zeroed bits are gone, so the original can’t be recovered from the stored value (though a /24 still narrows it to 256 candidates).
Cons:
- Multiple users on the same /24 collapse to the same anonymized value. Useful or not depending on context.
This is what Google Analytics did with the “IP Anonymization” feature in Universal Analytics: truncate to /24 for IPv4, /48 for IPv6.
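The /24 and /16 examples above generalize to any prefix length. Here is a minimal sketch for IPv4 only, assuming dotted-quad input; `truncateIpv4` is an illustrative name, not a function from any library:

```typescript
// Illustrative helper (hypothetical name): zero every bit after the first `prefixBits`.
function truncateIpv4(ip: string, prefixBits: number): string {
  const parts = ip.split('.')
  const valid =
    parts.length === 4 && parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255)
  if (!valid) throw new Error(`not an IPv4 address: ${ip}`)
  if (!Number.isInteger(prefixBits) || prefixBits < 0 || prefixBits > 32) {
    throw new Error(`prefix length out of range: ${prefixBits}`)
  }
  const [a, b, c, d] = parts.map(Number) as [number, number, number, number]
  // Pack the four octets into one unsigned 32-bit integer
  const value = ((a << 24) | (b << 16) | (c << 8) | d) >>> 0
  // JavaScript shift counts wrap modulo 32, so a /0 mask needs its own case
  const mask = prefixBits === 0 ? 0 : (0xffffffff << (32 - prefixBits)) >>> 0
  const masked = (value & mask) >>> 0
  return [masked >>> 24, (masked >>> 16) & 255, (masked >>> 8) & 255, masked & 255].join('.')
}

console.log(truncateIpv4('192.168.42.123', 24)) // 192.168.42.0
console.log(truncateIpv4('192.168.42.123', 16)) // 192.168.0.0
```

Working on the 32-bit integer rather than on octet strings means prefix lengths that are not multiples of 8 (a /20, say) work as well.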
Hashing
Apply a one-way hash to the IP:
import crypto from 'node:crypto'
function hashIp(ip: string): string {
  return crypto.createHash('sha256').update(ip).digest('hex')
}
hashIp('192.168.42.123')
// → a 64-character hex digest
Pros:
- Distinct IPs map to distinct hashes (you can still count distinct addresses, a rough proxy for users).
- Reversing a hash requires a lookup table or brute force.
Cons:
- The IPv4 space is small (about 4.3 billion addresses), so an attacker can hash every address and build a complete lookup table. Plain hashing is not anonymization for an input space this small.
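To make the brute-force point concrete, the sketch below recovers an address from its unsalted SHA-256 hash by searching a single /16 (65,536 candidates). The function names are illustrative, and restricting the search to one /16 is an assumption to keep the demo fast; covering all of IPv4 is the same loop run over more prefixes.

```typescript
import { createHash } from 'node:crypto'

function sha256Hex(input: string): string {
  return createHash('sha256').update(input).digest('hex')
}

// Hypothetical attacker routine: try every address in a.b.0.0/16 until the hash matches.
function crackWithinSlash16(targetHash: string, a: number, b: number): string | undefined {
  for (let c = 0; c < 256; c++) {
    for (let d = 0; d < 256; d++) {
      const candidate = `${a}.${b}.${c}.${d}`
      if (sha256Hex(candidate) === targetHash) return candidate
    }
  }
  return undefined // not in this /16
}

const leakedHash = sha256Hex('192.168.42.123')
console.log(crackWithinSlash16(leakedHash, 192, 168)) // 192.168.42.123
```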
Hashing with salt
Better — add a secret to the hash:
function hashIp(ip: string, salt: string): string {
  return crypto.createHash('sha256').update(salt + ip).digest('hex')
}
If the salt is secret, attackers can’t precompute the hash table.
Pros:
- Hashes are unique per IP (with the same salt).
- A secret salt makes building a lookup table infeasible.
Cons:
- If the salt leaks (via breach), all hashes can be reverse-engineered.
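- Rotating the salt breaks linkage with existing data: the same IP hashes to a different value afterwards.
For real anonymization, use a long random salt, store it separately from the data, and rotate it periodically, destroying the old salt once it is retired.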
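Concatenating salt and IP works, but a keyed hash (HMAC) is the more conventional way to mix a secret into a hash. Here is a minimal sketch using Node’s `createHmac`; the `hmacIp` name and the key handling are assumptions for illustration, not a prescribed setup:

```typescript
import { createHmac, randomBytes } from 'node:crypto'

// Hypothetical helper: keyed hash of an IP. The key plays the role of the secret salt.
function hmacIp(ip: string, key: Buffer): string {
  return createHmac('sha256', key).update(ip).digest('hex')
}

// 32 random bytes per period; in practice the key would live in a secrets store,
// not next to the logs it protects
const currentKey = randomBytes(32)
const nextKey = randomBytes(32)

// Same key: the same IP always maps to the same value, so counting still works
console.log(hmacIp('192.168.42.123', currentKey) === hmacIp('192.168.42.123', currentKey)) // true
// After rotation the same IP maps to an unrelated value
console.log(hmacIp('192.168.42.123', currentKey) === hmacIp('192.168.42.123', nextKey)) // false
```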
Dropping
Just don’t keep the IP. Process it inline (e.g., look up country, then discard):
const country = await getCountry(ip)
// The IP is never written anywhere; only derived fields reach the log
log.info({ country, path: req.url }) // only what's useful
Pros:
- Strongest privacy. Nothing to leak.
- Simplest from a compliance perspective.
Cons:
- Lose all per-user / per-IP analysis after the initial transaction.
Dropping is often combined with the other techniques: truncate or hash the IP in raw logs, and keep only the derived fields (country, ASN) for analytics.
Pseudonymization with rotating ID
Map each IP to a random pseudonym; rotate periodically:
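const ipToPseudonym = new Map<string, string>() // clear this map on each rotation
function pseudonymize(ip: string): string {
  if (!ipToPseudonym.has(ip)) ipToPseudonym.set(ip, crypto.randomUUID())
  return ipToPseudonym.get(ip)!
}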
With a daily rotation, you can count distinct users per day but can’t link across days.
Pros:
- Strong privacy.
- Useful for cross-session analytics within a short window.
Cons:
- More complex to implement.
- The pseudonymization mapping is itself sensitive data.
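The daily rotation described above can be sketched as a mapping that is cleared whenever the UTC date changes. The `createDailyPseudonymizer` name and the injectable clock are assumptions made for illustration; a real deployment would also need to think about where the mapping lives when there are several processes.

```typescript
import { randomUUID } from 'node:crypto'

// Hypothetical factory: returns a pseudonymize() function whose mapping resets each UTC day.
// `now` is injectable so the rotation can be exercised without waiting for midnight.
function createDailyPseudonymizer(now: () => Date = () => new Date()) {
  let currentDay = ''
  const ids = new Map<string, string>()
  return function pseudonymize(ip: string): string {
    const day = now().toISOString().slice(0, 10) // UTC date, e.g. '2024-05-01'
    if (day !== currentDay) {
      // New day: forget every old mapping so pseudonyms cannot be linked across days
      ids.clear()
      currentDay = day
    }
    let id = ids.get(ip)
    if (id === undefined) {
      id = randomUUID()
      ids.set(ip, id)
    }
    return id
  }
}

let clock = new Date('2024-05-01T10:00:00Z')
const pseudonymizeToday = createDailyPseudonymizer(() => clock)
const first = pseudonymizeToday('192.168.42.123')
console.log(first === pseudonymizeToday('192.168.42.123')) // true: stable within the day
clock = new Date('2024-05-02T10:00:00Z')
console.log(first === pseudonymizeToday('192.168.42.123')) // false: new day, new pseudonym
```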
What Each Technique Allows
Choose by what analyses you need:
| Need | Drop | Truncate /24 | Hash | Salted hash | Pseudonymize |
|---|---|---|---|---|---|
| Count distinct users (lifetime) | No | Approximate | Yes | Only if the salt never rotates | No |
| Count distinct users (per day) | No | Approximate | Yes | Yes | Yes |
| Geolocate to country | If kept derivatively | Yes | No | No | No |
| Identify a specific user | No | No | Yes (with table) | No | No |
| Detect specific abuser | No | Partially | Yes | Yes (within salt life) | Within day |
| Compliance friendliness | Highest | High | Lower | Higher | High |
For most analytics, truncation is the sweet spot. For things like rate-limiting and fraud detection where you need to identify the same source over time, salted hashing works.
When to Anonymize
A few common patterns:
Immediate
Don’t even log the full IP. Derive fields you need (country, ASN, hash), log those, drop the IP.
Retention-based
Log full IP for 24-72 hours for fraud / abuse investigation. After that, truncate or drop.
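A retention sweep along these lines might look like the sketch below. The `LogRecord` shape, the 72-hour constant, and the function names are all assumptions for illustration; the truncation is IPv4-only for brevity and simply drops anything else.

```typescript
interface LogRecord {
  ip: string | null
  timestamp: Date
  path: string
}

const RETENTION_MS = 72 * 60 * 60 * 1000 // upper end of the 24-72 hour window

// IPv4 only, for brevity: zero the last octet; anything else is dropped entirely
function truncateOrDrop(ip: string): string | null {
  const m = /^(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}$/.exec(ip)
  return m ? `${m[1]}.0` : null
}

// Returns a copy in which records older than the retention window lose their full IP
function applyRetention(records: LogRecord[], now: Date): LogRecord[] {
  return records.map((r) => {
    const expired = now.getTime() - r.timestamp.getTime() > RETENTION_MS
    if (!expired || r.ip === null) return r
    return { ...r, ip: truncateOrDrop(r.ip) }
  })
}
```

Running the sweep more than once is harmless: an already truncated address truncates to itself.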
Aggregation
For analytics dashboards, aggregate by /24 (or country) and discard individual logs.
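As a rough sketch of the aggregation step, the helper below counts hits per /24 so that only the totals need to be kept. `countBySlash24` is a hypothetical name, and the IPv4-only regex is a simplification.

```typescript
// Hypothetical helper: count hits per /24 (IPv4) so individual log lines can be discarded.
function countBySlash24(ips: string[]): Map<string, number> {
  const counts = new Map<string, number>()
  for (const ip of ips) {
    const m = /^(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}$/.exec(ip)
    if (!m) continue // non-IPv4 input is skipped in this sketch
    const key = `${m[1]}.0/24`
    counts.set(key, (counts.get(key) ?? 0) + 1)
  }
  return counts
}

const hits = countBySlash24(['192.168.42.123', '192.168.42.7', '10.0.0.1'])
console.log(hits.get('192.168.42.0/24')) // 2
console.log(hits.get('10.0.0.0/24')) // 1
```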
The right pattern depends on what you actually need the IPs for.
Anonymization in Common Tools
Google Analytics
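Universal Analytics provided the aip parameter to anonymize the IP at collection (truncating to /24 for IPv4, /48 for IPv6). In Google Analytics 4, IP addresses are not logged or stored, so there is no setting to enable.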
Cloudflare
Their analytics aggregate by default; raw IPs aren’t exposed except in logs you explicitly enable.
Matomo (formerly Piwik)
Has built-in IP anonymization at configurable byte counts.
CDN logs (Cloudflare, AWS CloudFront)
Log raw IPs by default. Configure to mask or drop fields you don’t need.
Implementation Examples
Express middleware that truncates
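import express, { Request, Response, NextFunction } from 'express'
import { isIPv4, isIPv6 } from 'node:net'
const app = express()
function truncateIp(ip: string): string {
  // Express often reports IPv4 clients as IPv4-mapped IPv6 (::ffff:a.b.c.d)
  const v4 = ip.replace(/^::ffff:/i, '')
  if (isIPv4(v4)) {
    // IPv4: zero last octet
    return v4.split('.').slice(0, 3).concat('0').join('.')
  }
  if (!isIPv6(ip)) return 'unknown'
  // IPv6: drop any zone ID, expand '::', then keep the first three groups (zero last 80 bits)
  const [head, tail]: (string | undefined)[] = ip.split('%')[0].split('::')
  const headGroups = head ? head.split(':') : []
  const tailGroups = tail ? tail.split(':') : []
  // A dotted IPv4 suffix occupies two 16-bit groups
  const tailLen = tailGroups.length + (tail?.includes('.') ? 1 : 0)
  const zeros = tail === undefined ? [] : Array(8 - headGroups.length - tailLen).fill('0')
  const first3 = [...headGroups, ...zeros, ...tailGroups].slice(0, 3)
  return first3.map((g) => parseInt(g, 16).toString(16)).join(':') + '::'
}
app.use((req: Request, _res: Response, next: NextFunction) => {
  ;(req as any).anonIp = truncateIp(req.ip ?? '')
  next()
})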
Python with logging integration
import ipaddress
def truncate_ip(ip_str: str) -> str:
    ip = ipaddress.ip_address(ip_str)
    if ip.version == 4:
        network = ipaddress.ip_network(f'{ip_str}/24', strict=False)
    else:
        network = ipaddress.ip_network(f'{ip_str}/48', strict=False)
    return str(network.network_address)
Storing Derived Fields Instead
A better pattern than raw or truncated IPs for many use cases:
const result = await convertIP(req.ip)
const logEntry = {
  country: result.data?.continent?.country?.code,
  asn: result.data?.asn?.number,
  asnType: result.data?.asn?.type,
  truncatedIp: truncateIp(req.ip),
  // Don't store raw IP
  timestamp: new Date(),
}
The country and ASN are what most analytics actually want. Keep those; discard the raw IP.
Geolocation Accuracy After Truncation
A subtle question: if you’ve truncated, can you still geolocate?
For /24 truncation, mostly yes. The /24 generally belongs to one ISP in one geographic area. You lose the specific IP but keep the region.
For /16 truncation, less. A /16 might span multiple cities or even countries.
If you need geolocation, do it before truncation. Store the country/region as a separate field; truncate the IP for storage.
Combining with Rate Limiting
For rate limiting, you need a stable identifier. Anonymized values still work:
- Truncated IP rate-limits by /24, so one heavy client throttles every address in the same /24, not only users sharing a single NAT address.
- Hashed IP rate-limits per unique IP (with secret salt).
- Per-ASN (see rate limiting by ASN) works on the ASN field, not the IP.
Choose based on the granularity you want.
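As an illustration of the hashed-IP option, here is a minimal fixed-window limiter keyed by an HMAC of the address, so the raw IP never enters the limiter state. `createRateLimiter`, its parameters, and the in-memory `Map` are assumptions for this sketch; a production limiter would typically use shared storage and evict stale entries.

```typescript
import { createHmac } from 'node:crypto'

// Hypothetical fixed-window limiter. State is keyed by a keyed hash of the IP,
// so the raw address is never stored.
function createRateLimiter(key: Buffer, limit: number, windowMs: number) {
  const windows = new Map<string, { start: number; count: number }>()
  return function allow(ip: string, nowMs: number = Date.now()): boolean {
    const id = createHmac('sha256', key).update(ip).digest('hex')
    const w = windows.get(id)
    if (w === undefined || nowMs - w.start >= windowMs) {
      // First request from this source, or the previous window has expired
      windows.set(id, { start: nowMs, count: 1 })
      return limit >= 1
    }
    w.count += 1
    return w.count <= limit
  }
}

const allow = createRateLimiter(Buffer.from('replace-with-a-secret-key'), 2, 60_000)
console.log(allow('192.168.42.123', 0)) // true
console.log(allow('192.168.42.123', 1_000)) // true
console.log(allow('192.168.42.123', 2_000)) // false: third request in the window
console.log(allow('192.168.42.124', 2_000)) // true: different source
```

Swapping the HMAC input for a truncated IP or an ASN changes the granularity without touching the rest of the limiter.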
Audit Logs and Retention
For security-sensitive logs (auth events, admin actions), the regulatory bar is different:
- Some regulations require keeping logs with original IPs for 6 months to 7 years.
- Many regulations explicitly allow IP storage for legitimate security purposes.
Distinguish your “general analytics” logs from “security audit” logs. Anonymize aggressively in the first; retain longer with controlled access in the second.
TL;DR
- IPs are personal data under GDPR; anonymization reduces compliance risk.
- Truncation (zero last octets) preserves geographic info; the simplest approach.
- Hashing without salt is reversible for IPs; not anonymization.
- Salted hashing is real anonymization if the salt is secret.
- Dropping is the strongest privacy.
- Store derived fields (country, ASN) instead of raw IPs where possible.
- Different retention for analytics vs security audit logs.
- Geolocate before truncating if you need country/region data.
IP anonymization is one of those “do it right once, save compliance time forever” investments. The tools are simple; the patterns are well-established. For the legal context, see GDPR and IP addresses; for related signal-based decisions that work on anonymized data, IP reputation scores.