Great hunts die in blind spots. Whether you’re a government or law enforcement (LE) team building a case or an enterprise SOC trying to catch early-stage intrusions, you can only investigate what your data can reveal. A telemetry coverage map makes gaps explicit, prioritizes fixes, and keeps hunts grounded in reality.
What a Coverage Map Is (and Isn’t)
A coverage map is a living inventory of what you collect, at what fidelity, with what retention, and where signals join. It’s not a logging spreadsheet. It ties data lanes to attack stages and priority intelligence requirements (PIRs), so people know which hunts are feasible today and which require new data.
The Five Lanes to Get Right
- Endpoint/Workload
Process lineage, command lines, module loads, persistence, script interpreters, archival tools.
Signals: living-off-the-land, staging before exfil, credential access.
- Identity/Access
IdP/SSO logs, MFA events, device posture, consent grants, admin actions.
Signals: valid-account abuse, token theft, OAuth consent abuse, privilege escalation.
- Network/DNS/TLS/Egress
Resolver logs, SNI/JA3, proxy/firewall egress, NetFlow.
Signals: C2 beacons, DNS exfil, callbacks to first-seen infrastructure.
- Cloud/SaaS Control Plane
API calls, service principal activity, resource creation/deletion, cross-tenant access.
Signals: control-plane abuse, persistence without malware, data access anomalies.
- Application/Runtime
AuthZ failures, high-volume exports, file/object reads/writes, service-to-service calls.
Signals: data staging, lateral movement inside the app, insider misuse.
Augment with threat intelligence (infrastructure reuse, actor TTPs, exploitation-in-the-wild) to add context and prioritization.
How to Build the Map (Fast)
Step 1 — Define your questions.
List 5–10 PIR-style questions you need to answer (e.g., “Can we see and stop token-free sessions?” “Can we spot DNS exfil?”).
Step 2 — Score each lane per question (0–3).
- 0: No coverage
- 1: Partial/low-fidelity (events exist but miss key fields)
- 2: Sufficient for hunts with manual pivots
- 3: High-fidelity + correlated + retained long enough
Step 3 — Capture four properties per datasource.
- Fidelity: fields present (e.g., process tree, user agent, TLS SNI)
- Continuity: loss rate, timestamp quality (NTP), ordering
- Retention: hot/cold days vs. typical dwell time
- Joinability: stable IDs to link user ↔ device ↔ IP ↔ app
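One way to record the four properties is a small per-datasource record. The sketch below is illustrative only: the class name, field names, and the example resolver source are assumptions, not part of any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    """Hypothetical record of the four properties for one datasource."""
    name: str
    lane: str
    fields: set[str] = field(default_factory=set)     # fidelity: fields present
    loss_rate: float = 0.0                             # continuity: fraction of events dropped
    ntp_synced: bool = True                            # continuity: timestamp quality
    hot_days: int = 0                                  # retention: max event age in hot storage
    cold_days: int = 0                                 # retention: max event age in cold storage
    join_keys: set[str] = field(default_factory=set)  # joinability: stable IDs


def missing_fields(src: DataSource, required: set[str]) -> set[str]:
    """Fields a hunt needs that this source does not carry."""
    return required - src.fields


def retention_tier(src: DataSource, dwell_days: int) -> str:
    """Which tier still holds events as old as the dwell time being reconstructed."""
    if dwell_days <= src.hot_days:
        return "hot"
    if dwell_days <= src.cold_days:
        return "cold"
    return "none"


# Example values are made up for illustration.
resolver = DataSource(
    name="dns-resolver",
    lane="network",
    fields={"query", "qtype", "client_ip"},
    hot_days=30,
    cold_days=180,
    join_keys={"client_ip"},
)
```

With this record, `missing_fields(resolver, {"query", "response"})` shows the fidelity gap for a hunt that needs DNS responses, and `retention_tier(resolver, 60)` shows whether a 60-day dwell time can still be reconstructed.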
Step 4 — Color the gaps; pick three to fix.
Prioritize by risk and feasibility: internet-facing assets, identity, egress, and crown-jewel apps usually come first.
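Steps 2 and 4 can be sketched as a small matrix plus a ranking function. The scores, the second question, and the lane weights below are invented for illustration, and the priority formula (shortfall times weight) is one possible choice, not a standard.

```python
# scores[question][lane] holds the 0-3 score from Step 2 (illustrative values).
scores = {
    "Can we spot DNS exfil?": {
        "endpoint": 2, "identity": 3, "network": 1, "cloud": 2, "application": 0,
    },
    "Can we detect OAuth consent abuse?": {  # hypothetical question
        "endpoint": 3, "identity": 1, "network": 2, "cloud": 0, "application": 2,
    },
}

# Assumed risk weights: identity and egress count for more, per the guidance above.
lane_weight = {
    "identity": 2.0, "network": 1.5, "cloud": 1.0, "application": 1.0, "endpoint": 1.0,
}


def top_gaps(scores, lane_weight, n=3, threshold=2):
    """Return the n highest-priority (question, lane) cells scoring below threshold."""
    gaps = []
    for question, lanes in scores.items():
        for lane, score in lanes.items():
            if score < threshold:
                priority = (threshold - score) * lane_weight.get(lane, 1.0)
                gaps.append((priority, question, lane, score))
    # Highest priority first; ties broken alphabetically so output is stable.
    gaps.sort(key=lambda g: (-g[0], g[1], g[2]))
    return gaps[:n]


for priority, question, lane, score in top_gaps(scores, lane_weight):
    print(f"{priority:.1f}  {lane:<12} score={score}  {question}")
```

Feasibility is not modeled here; in practice it would be a second factor applied before committing to the three fixes.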
What “Good” Looks Like
- Join keys everywhere: device ID + user principal + account/tenant + IP/ASN.
- Enough retention for reality: 30–90 days hot for identity and egress; longer cold storage for major cases.
- Behavioral fields present: not just “allowed/blocked” but how (process parent, scope, API method, Graph call).
- Backpressure aware: sampling where safe (NetFlow), lossless where critical (IdP, endpoint lineage).
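To show why join keys matter, here is a minimal sketch that attributes an egress event to a user through a shared device ID. The event shapes (`ts`, `device_id`, `user`) are assumptions for the example; real schemas will differ.

```python
from bisect import bisect_right


def build_index(identity_events):
    """Index sign-in events by device_id, ordered by timestamp."""
    index = {}
    for ev in sorted(identity_events, key=lambda e: e["ts"]):
        times, users = index.setdefault(ev["device_id"], ([], []))
        times.append(ev["ts"])
        users.append(ev["user"])
    return index


def user_at(index, device_id, ts):
    """Return the user from the latest sign-in on the device at or before ts, else None."""
    times, users = index.get(device_id, ([], []))
    pos = bisect_right(times, ts)
    return users[pos - 1] if pos else None


# Illustrative events; timestamps are plain integers for brevity.
identity_events = [
    {"ts": 100, "device_id": "d1", "user": "alice"},
    {"ts": 200, "device_id": "d1", "user": "bob"},
]
index = build_index(identity_events)
egress = {"ts": 150, "device_id": "d1", "dst_ip": "203.0.113.7"}
enriched = {**egress, "user": user_at(index, egress["device_id"], egress["ts"])}
```

Without a stable `device_id` in both lanes, the same pivot falls back to IP address and time-window guesses, which is where attribution tends to break.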
Turn the Map into Hunts
For each question, define expected artifacts and corroborators across lanes:
- DNS Exfil Hypothesis
Primary: high-entropy subdomains + periodic TXT responses (DNS)
Corroborators: archiving process before spikes (endpoint); first-seen egress to rare ASN (network); TI overlap with a known cluster
- OAuth Consent Abuse Hypothesis
Primary: new app consent with risky scopes (identity)
Corroborators: Graph API burst (cloud); mailbox rule creation (app); domain/issuer patterns from TI
If any corroborator lane is “0/1,” the hunt is fragile—either adjust scope or fix the gap.
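The fragility rule can be checked mechanically once hunts declare their lanes. This is a sketch under assumptions: the `Hunt` class, lane names, and scores are hypothetical, and threat intelligence is left out because it is treated above as an augmentation, not a scored lane.

```python
from dataclasses import dataclass


@dataclass
class Hunt:
    """Hypothetical hunt definition: one primary lane plus corroborating lanes."""
    name: str
    primary_lane: str
    corroborator_lanes: tuple[str, ...]


def weak_corroborators(hunt, lane_scores, minimum=2):
    """Corroborator lanes scoring 0 or 1 (missing lanes count as 0)."""
    return [lane for lane in hunt.corroborator_lanes
            if lane_scores.get(lane, 0) < minimum]


def is_fragile(hunt, lane_scores):
    """A hunt is fragile if any corroborator lane is below the minimum."""
    return bool(weak_corroborators(hunt, lane_scores))


dns_exfil = Hunt("DNS exfil", "network", ("endpoint", "network"))
oauth_abuse = Hunt("OAuth consent abuse", "identity", ("cloud", "application"))

# Illustrative lane scores for each hunt's question.
dns_scores = {"endpoint": 1, "network": 3}
oauth_scores = {"identity": 3, "cloud": 2, "application": 2}
```

Here `is_fragile(dns_exfil, dns_scores)` is true because the endpoint lane scores 1, so the options are to narrow the hunt to network evidence or to fix endpoint coverage first.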
Metrics That Matter
- Coverage score trend: average per lane across your top 10 questions (aim for +0.5 per quarter).
- Detection uplift: hunts promoted to durable detections once corroborators exist.
- Time-to-first-signal: from hypothesis to first high-confidence lead.
- Gap burn-down: number of “0” scores reduced to “2+” per quarter.
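Two of these metrics fall straight out of quarterly snapshots of the score matrix. The sketch below assumes each snapshot is a dict of question to lane scores; the question labels and values are placeholders.

```python
def lane_averages(scores):
    """Average score per lane across all questions in one snapshot."""
    by_lane = {}
    for lanes in scores.values():
        for lane, score in lanes.items():
            by_lane.setdefault(lane, []).append(score)
    return {lane: sum(vals) / len(vals) for lane, vals in by_lane.items()}


def coverage_trend(prev, curr):
    """Change in per-lane average between two snapshots."""
    before, after = lane_averages(prev), lane_averages(curr)
    return {lane: after[lane] - before[lane] for lane in after if lane in before}


def gaps_burned_down(prev, curr):
    """Count cells that scored 0 in prev and 2 or higher in curr."""
    return sum(
        1
        for question, lanes in prev.items()
        for lane, score in lanes.items()
        if score == 0 and curr.get(question, {}).get(lane, 0) >= 2
    )


# Placeholder snapshots for two consecutive quarters.
last_quarter = {"Q1": {"network": 0, "identity": 2}, "Q2": {"network": 1, "identity": 3}}
this_quarter = {"Q1": {"network": 2, "identity": 2}, "Q2": {"network": 1, "identity": 3}}

trend = coverage_trend(last_quarter, this_quarter)
burned = gaps_burned_down(last_quarter, this_quarter)
```

Detection uplift and time-to-first-signal depend on case-tracking data, so they are not derived from the matrix here.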
Common Failure Modes (and Fixes)
- Collecting but not joining: rich logs with no common IDs → add entity resolution at ingest.
- Short retention: can’t reconstruct dwell time → increase hot storage for identity/egress; archive compressed cold tiers.
- Binary-only thinking: “Is it blocked?” → capture behavior fields; you hunt behaviors, not verdicts.
- One-and-done maps: coverage drifts → review monthly; update after major incidents.
Takeaways
A telemetry coverage map turns “we think we log it” into “we can prove we see it.” By scoring fidelity, continuity, retention, and joinability against the questions you actually need to answer, both enterprises and gov/LE teams can aim hunts where they’ll work today, and invest to make tomorrow’s hunts possible.
Do you have the tools it takes to understand who is attacking your organization and why? Ultimately, it’s the only way to know how to stop attacks. Platform Blue offers government-grade threat intelligence to the world’s most elite threat hunting organizations. Get a demo today!