You Can’t Hunt What You Can’t See: Building a Telemetry Coverage Map

Great hunts die in blind spots. Whether you’re a government or law-enforcement (LE) team building a case or an enterprise SOC trying to catch early-stage intrusions, you can only investigate what your data can reveal. A telemetry coverage map makes gaps explicit, prioritizes fixes, and keeps hunts grounded in reality.

What a Coverage Map Is (and Isn’t)

A coverage map is a living inventory of what you collect, at what fidelity, with what retention, and where signals join. It’s not a logging spreadsheet. It ties data lanes to attack stages and to your priority intelligence requirements (PIRs), so people know which hunts are feasible today and which require new data.

The Five Lanes to Get Right

Endpoint/Workload

Process lineage, command lines, module loads, persistence, script interpreters, archival tools.
Signals: living-off-the-land, staging before exfil, credential access.

Identity/Access

IdP/SSO logs, MFA events, device posture, consent grants, admin actions.
Signals: valid-account abuse, token theft, OAuth consent abuse, privilege escalation.

Network/DNS/TLS/Egress

Resolver logs, SNI/JA3, proxy/firewall egress, NetFlow.
Signals: C2 beacons, DNS exfil, callbacks to first-seen infrastructure.

Cloud/SaaS Control Plane

API calls, service principal activity, resource creation/deletion, cross-tenant access.
Signals: control-plane abuse, persistence without malware, data access anomalies.

Application/Runtime

AuthZ failures, high-volume exports, file/object reads/writes, service-to-service calls.
Signals: data staging, lateral movement inside the app, insider misuse.

Augment with threat intelligence (infrastructure reuse, actor TTPs, exploitation-in-the-wild) to add context and prioritization.

How to Build the Map (Fast)

Step 1 — Define your questions.
List 5–10 PIR-style questions you need to answer (e.g., “Can we see and stop token-free sessions?” “Can we spot DNS exfil?”).

Step 2 — Score each lane per question (0–3).

0: No coverage

1: Partial/low-fidelity (events exist but miss key fields)

2: Sufficient for hunts with manual pivots

3: High-fidelity + correlated + retained long enough
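
One way to hold the scores is a small question-by-lane matrix. The sketch below is illustrative only: the lane keys, the two example questions, and every score in it are made-up placeholders, not values from a real environment.

```python
from statistics import mean

LANES = ["endpoint", "identity", "network", "cloud", "application"]

# Hypothetical scores (0-3) for two example questions; replace with your own.
coverage = {
    "Can we spot DNS exfil?": {
        "endpoint": 2, "identity": 1, "network": 3, "cloud": 0, "application": 1,
    },
    "Can we detect OAuth consent abuse?": {
        "endpoint": 1, "identity": 3, "network": 1, "cloud": 2, "application": 0,
    },
}

def validate(matrix):
    """Raise ValueError if a lane is missing or a score is outside 0-3."""
    for question, scores in matrix.items():
        for lane in LANES:
            if scores.get(lane) not in (0, 1, 2, 3):
                raise ValueError(f"{question!r}: bad or missing score for {lane}")

def lane_averages(matrix):
    """Average score per lane across all questions."""
    return {lane: mean(q[lane] for q in matrix.values()) for lane in LANES}

def gaps(matrix, threshold=1):
    """(question, lane, score) triples scoring at or below the threshold."""
    return [(q, lane, s) for q, scores in matrix.items()
            for lane, s in scores.items() if s <= threshold]

validate(coverage)
print(lane_averages(coverage))
for question, lane, score in gaps(coverage):
    print(f"gap: {lane}={score} for {question}")
```

Keeping the matrix as plain data makes it easy to store in version control and compare between reviews.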

Step 3 — Capture four properties per datasource.

Fidelity: fields present (e.g., process tree, user agent, TLS SNI)

Continuity: loss rate, timestamp quality (NTP), ordering

Retention: hot/cold days vs. typical dwell time

Joinability: stable IDs to link user ↔ device ↔ IP ↔ app
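
The four properties can be recorded per datasource and checked against what a question needs. This is a minimal sketch under stated assumptions: the `DataSource` fields, the 1% loss threshold, the example resolver record, and the 30-day dwell figure are all hypothetical choices for illustration.

```python
from dataclasses import dataclass

@dataclass
class DataSource:
    name: str
    fields_present: set   # fidelity: fields the events actually carry
    loss_rate: float      # continuity: fraction of events dropped (0.0 to 1.0)
    ntp_synced: bool      # continuity: timestamps come from synced clocks
    hot_days: int         # retention: searchable storage
    cold_days: int        # retention: archived storage
    join_keys: set        # joinability: stable IDs present on each event

def review(source, required_fields, required_keys, dwell_days, max_loss=0.01):
    """Return a list of gap descriptions for one datasource (empty if none)."""
    problems = []
    missing_fields = required_fields - source.fields_present
    if missing_fields:
        problems.append(f"fidelity: missing {sorted(missing_fields)}")
    if source.loss_rate > max_loss:
        problems.append(f"continuity: loss rate {source.loss_rate:.1%}")
    if not source.ntp_synced:
        problems.append("continuity: clocks not NTP-synced")
    retained = source.hot_days + source.cold_days
    if retained < dwell_days:
        problems.append(f"retention: {retained}d < {dwell_days}d dwell")
    missing_keys = required_keys - source.join_keys
    if missing_keys:
        problems.append(f"joinability: missing {sorted(missing_keys)}")
    return problems

# Hypothetical resolver log source, reviewed against a DNS exfil question.
resolver = DataSource(
    name="dns-resolver", fields_present={"query", "qtype", "client_ip"},
    loss_rate=0.0, ntp_synced=True, hot_days=14, cold_days=0,
    join_keys={"client_ip"},
)
for problem in review(resolver, {"query", "qtype", "response"},
                      {"client_ip", "device_id"}, dwell_days=30):
    print(problem)
```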

Step 4 — Color the gaps; pick three to fix.
Prioritize by risk and feasibility: internet-facing assets, identity, egress, and crown-jewel apps usually come first.
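
Picking three gaps can be as simple as ranking by risk and feasibility. The sketch below assumes a 1 to 5 scale for each and multiplies them; that scoring rule, like the example gaps, is an illustrative assumption rather than a prescribed method.

```python
from dataclasses import dataclass

@dataclass
class Gap:
    description: str
    risk: int         # 1 (low) to 5 (high)
    feasibility: int  # 1 (hard to fix) to 5 (easy to fix)

def pick_fixes(gaps, limit=3):
    """Rank gaps by risk x feasibility, breaking ties in favor of higher risk."""
    ranked = sorted(gaps, key=lambda g: (g.risk * g.feasibility, g.risk), reverse=True)
    return ranked[:limit]

# Hypothetical backlog of gaps found while coloring the map.
backlog = [
    Gap("IdP logs lack device ID", risk=5, feasibility=4),
    Gap("No resolver logs for branch offices", risk=4, feasibility=3),
    Gap("Proxy logs retained only 7 days", risk=4, feasibility=5),
    Gap("No module-load telemetry on build servers", risk=2, feasibility=2),
    Gap("Crown-jewel app lacks export audit events", risk=5, feasibility=2),
]
for gap in pick_fixes(backlog):
    print(gap.risk * gap.feasibility, gap.description)
```

A multiplicative score favors gaps that are both risky and cheap to close. If a high-risk, hard-to-fix gap (like the export audit example) keeps falling off the list, that is a signal to weight risk more heavily.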

What “Good” Looks Like

Join keys everywhere: device ID + user principal + account/tenant + IP/ASN.
Enough retention for reality: 30–90 days hot for identity and egress; longer cold storage for major cases.
Behavioral fields present: not just “allowed/blocked” but how (process parent, scope, API method, Graph call).
Backpressure aware: sampling where safe (NetFlow), lossless where critical (IdP, endpoint lineage).
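
To check the "join keys everywhere" property, you could measure how often each key is actually populated in a sample of events. The field names below (`device_id`, `user_principal`, `tenant_id`, `src_ip`) are assumed for illustration; map them to whatever your schema uses.

```python
JOIN_KEYS = ("device_id", "user_principal", "tenant_id", "src_ip")

def join_key_coverage(events):
    """Fraction of events carrying a non-empty value for each join key."""
    total = len(events)
    if total == 0:
        return {key: 0.0 for key in JOIN_KEYS}
    return {
        key: sum(1 for event in events if event.get(key)) / total
        for key in JOIN_KEYS
    }

# Hypothetical sample; empty strings and None count as missing.
sample = [
    {"device_id": "d1", "user_principal": "a@example.com", "tenant_id": "t1", "src_ip": "192.0.2.10"},
    {"device_id": "", "user_principal": "b@example.com", "tenant_id": "t1", "src_ip": "192.0.2.11"},
    {"user_principal": "c@example.com", "src_ip": "192.0.2.12"},
    {"device_id": "d4", "user_principal": None, "tenant_id": "t1", "src_ip": "192.0.2.13"},
]
for key, fraction in join_key_coverage(sample).items():
    print(f"{key}: {fraction:.0%}")
```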

Turn the Map into Hunts

For each question, define expected artifacts and corroborators across lanes:

DNS Exfil Hypothesis
Primary: high-entropy subdomains + periodic TXT responses (DNS)
Corroborators: archiving process before spikes (endpoint); first-seen egress to rare ASN (network); TI overlap with a known cluster
OAuth Consent Abuse Hypothesis
Primary: new app consent with risky scopes (identity)
Corroborators: Graph API burst (cloud); mailbox rule creation (app); domain/issuer patterns from TI
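
As an illustration of the DNS primary artifact, high-entropy subdomains can be approximated with Shannon entropy over the leftmost label. This is a rough sketch: the length and entropy thresholds are assumptions that need tuning against your own resolver logs, and legitimate services (CDNs, for example) can also produce long random-looking labels.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def suspicious_queries(queries, min_length=20, min_entropy=3.5):
    """Yield (query, entropy) where the leftmost label is long and high-entropy."""
    for query in queries:
        label = query.split(".", 1)[0].lower()
        entropy = shannon_entropy(label)
        if len(label) >= min_length and entropy >= min_entropy:
            yield query, round(entropy, 2)

# Hypothetical query names, using reserved example domains.
queries = [
    "www.example.com",
    "mzxw6ytboi4tqnbvgy3tqojqgezdgnbv.example.net",
    "aaaaaaaaaaaaaaaaaaaaaaaa.example.net",
]
for query, entropy in suspicious_queries(queries):
    print(entropy, query)
```

Entropy alone is a weak signal, which is why the corroborators listed above matter.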

If any corroborator lane is “0/1,” the hunt is fragile—either adjust scope or fix the gap.
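
That rule can be encoded directly against the lane scores. In the sketch below, "fragile" follows the rule as stated; treating a weak primary lane as "blocked" is an added assumption, and the scores are hypothetical.

```python
def hunt_status(lane_scores, primary, corroborators, minimum=2):
    """Return (status, weak_lanes) for one hunt hypothesis.

    'fragile' means at least one corroborator lane scores 0 or 1.
    'blocked' (an assumption of this sketch) means the primary lane does.
    """
    if lane_scores.get(primary, 0) < minimum:
        return "blocked", [primary]
    weak = [lane for lane in corroborators if lane_scores.get(lane, 0) < minimum]
    return ("fragile" if weak else "ready"), weak

# Hypothetical lane scores for one environment.
scores = {"network": 3, "endpoint": 2, "identity": 3, "cloud": 1, "application": 0}

print(hunt_status(scores, "network", ["endpoint", "network"]))     # DNS exfil
print(hunt_status(scores, "identity", ["cloud", "application"]))   # OAuth consent abuse
```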

Metrics That Matter

Coverage score trend: average per lane across your top 10 questions (aim for +0.5 per quarter).
Detection uplift: hunts promoted to durable detections once corroborators exist.
Time-to-first-signal: from hypothesis to first high-confidence lead.
Gap burn-down: number of “0” scores reduced to “2+” per quarter.
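
The coverage score trend and gap burn-down fall out of comparing two snapshots of the score matrix. A minimal sketch, assuming both snapshots use the same questions and lanes; the quarter data here is invented for illustration.

```python
from statistics import mean

def lane_trend(previous, current):
    """Change in average score per lane between two snapshots."""
    lanes = sorted({lane for scores in current.values() for lane in scores})
    return {
        lane: mean(q[lane] for q in current.values())
              - mean(q[lane] for q in previous.values())
        for lane in lanes
    }

def gap_burn_down(previous, current):
    """Count (question, lane) cells that were 0 before and are 2 or higher now."""
    return sum(
        1
        for question, scores in previous.items()
        for lane, old in scores.items()
        if old == 0 and current[question][lane] >= 2
    )

# Hypothetical snapshots from two consecutive quarters.
last_quarter = {
    "q1": {"identity": 0, "network": 2},
    "q2": {"identity": 1, "network": 0},
}
this_quarter = {
    "q1": {"identity": 2, "network": 2},
    "q2": {"identity": 1, "network": 1},
}
print(lane_trend(last_quarter, this_quarter))
print(gap_burn_down(last_quarter, this_quarter))
```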

Common Failure Modes (and Fixes)

Collecting but not joining: rich logs with no common IDs → add entity resolution at ingest.
Short retention: can’t reconstruct dwell time → increase hot storage for identity/egress; archive compressed cold tiers.
Binary-only thinking: “Is it blocked?” → capture behavior fields; you hunt behaviors, not verdicts.
One-and-done maps: coverage drifts → review monthly; update after major incidents.
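
For the first fix, entity resolution at ingest can start as a time-bounded lookup that stamps a device ID onto events that only carry an IP. The sketch below assumes DHCP lease records are available as `(ip, device_id, start, end)` tuples; that shape, the field names, and the sample leases are all hypothetical.

```python
from datetime import datetime

# Hypothetical DHCP leases: (ip, device_id, lease_start, lease_end)
LEASES = [
    ("10.0.0.5", "laptop-17", datetime(2024, 5, 1, 8, 0), datetime(2024, 5, 1, 12, 0)),
    ("10.0.0.5", "laptop-42", datetime(2024, 5, 1, 12, 0), datetime(2024, 5, 1, 18, 0)),
]

def resolve_device(leases, ip, when):
    """Return the device that held `ip` at time `when`, or None if unknown."""
    if ip is None or when is None:
        return None
    for lease_ip, device_id, start, end in leases:
        if lease_ip == ip and start <= when < end:
            return device_id
    return None

def enrich(event, leases):
    """Return a copy of the event with device_id filled in when it is missing."""
    enriched = dict(event)
    if not enriched.get("device_id"):
        enriched["device_id"] = resolve_device(
            leases, event.get("src_ip"), event.get("timestamp")
        )
    return enriched

event = {"src_ip": "10.0.0.5", "timestamp": datetime(2024, 5, 1, 13, 0), "action": "egress"}
print(enrich(event, LEASES))
```

Resolving at ingest matters because the same IP maps to different devices over time; a lookup done weeks later may no longer have the lease history to answer correctly.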

Takeaways

A telemetry coverage map turns “we think we log it” into “we can prove we see it.” By scoring fidelity, continuity, retention, and joinability against the questions you actually need to answer, both enterprises and gov/LE teams can aim hunts where they’ll work today and invest to make tomorrow’s hunts possible.

Do you have the tools it takes to understand who is attacking your organization and why? Ultimately, it’s the only way to know how to stop attacks. Platform Blue offers government-grade threat intelligence to the world’s most elite threat-hunting organizations. Get a demo today!
