Exploring Detection Tests in MITRE Round 4: It's More Than What You Detect

MITRE Round 4 involved detailed testing of endpoint security (XDR and EDR) technologies against simulated attack scenarios based on the Wizard Spider and Sandworm threat groups. The evaluations covered both the detection and the protection (endpoint prevention) capabilities of each participating vendor.

For MITRE Round 4, thirty security solution vendors were evaluated in the detection tests. This article explores the nature of those tests in more depth and explains the key points that enable security teams to distinguish between the different results.

Many vendors are highlighting very high rates of detection visibility based on the results of the 2022 MITRE ATT&CK Round 4 evaluation, and the differences in performance between these vendors may appear narrow. When individual results are explored in greater detail, a clearer picture emerges, highlighting factors such as detection quality and the impact of missed data. These factors are vital differentiators in determining the overall efficacy of a prospective security solution.

The differences in data quality between the detection categories used by MITRE are significant, and the manner in which those detections are reached is also critically important. Understanding these factors enables organizations to determine the relative strength of each prospective security solution more accurately.

MITRE Detection Evaluation

The detection portion of the evaluation included a total of nineteen major attack steps, each of which involved a number of attack substeps, with 109 distinct substeps performed in total. Each substep represented an individually detectable action, and MITRE recognizes a range of detection quality for each one. This range reflects the detail (or completeness) of the detection, with each successive category providing greater context and more complete insight into the threat event that took place.

The range of detection outcomes recognized by MITRE is as follows:

Detection Type | Detection Category | Description
None | No Detection | No detection event was raised for the action. This action was missed entirely. The analyst remains unaware that the action took place.
Telemetry | Telemetry | The activity was recorded as raw data. No additional context was provided beyond what took place and when it happened. The analyst must know to search for the data in order to surface it, and must then determine its importance and relevance. It is simply data without context.
General | Analytics | The event was recorded and raised as an alert. The analyst is made aware that what took place is potentially malicious, but is not given any detail describing “what” was malicious about the action or “why” it is considered malicious. The analyst knows that this event might warrant further attention and must conduct additional research to determine the nature of the event and what it means for the organization.
Tactic | Analytics | The event was recorded and raised as an alert, with additional detail outlining the tactic employed in that step. This level of detail informs the analyst that something is potentially malicious and provides context describing “what” is malicious about the recorded event. Details concerning “why” it is regarded as malicious are not included. The analyst is aware that this event is malicious and can determine from the data provided what has taken place, but may need to research further in order to respond accurately.
Technique | Analytics | The event was recorded and raised as an alert, with additional context that explains “what” about the event is considered malicious AND “why” this is considered malicious behavior. This level of detail maps the event directly to one of the known and published attack techniques defined and cataloged within the MITRE ATT&CK Framework, and provides the analyst with the information needed to proceed decisively. The analyst is aware of the malicious event and has the information necessary to respond appropriately.
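The ordering of these outcomes can be sketched as a small ranked enumeration. This is an illustrative Python model only: the names `DetectionCategory` and `needs_analyst_research` are hypothetical and not part of any MITRE tooling, and the numeric ranks are an assumption that simply mirrors the order of the table above.

```python
from enum import IntEnum


class DetectionCategory(IntEnum):
    """Detection outcomes from the table above, ranked lowest to highest.

    The numeric values are an assumption made for illustration; they only
    encode the ordering and are not scores published by MITRE.
    """

    NONE = 0       # action missed entirely
    TELEMETRY = 1  # raw data, no context
    GENERAL = 2    # alert raised, no "what" or "why"
    TACTIC = 3     # alert explains "what" was malicious
    TECHNIQUE = 4  # alert explains "what" and "why"


def needs_analyst_research(category: DetectionCategory) -> bool:
    """Return True when the outcome leaves context for the analyst to fill in."""
    return category < DetectionCategory.TECHNIQUE


if __name__ == "__main__":
    for category in DetectionCategory:
        print(f"{category.name:<10} needs research: {needs_analyst_research(category)}")
```

Under this model only a Technique detection arrives with enough context to act on directly; every lower category leaves some research to the analyst.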


Taking a Closer Look: Day 1 (Wizard Spider) - Attack Step #10

The first day of the detection evaluation simulated an attack conducted by the Wizard Spider ransomware threat group. In this scenario, the simulated organization was initially breached via attacker-supplied ingress utilities, after which an escalating sequence of discovery, lateral movement and additional attack tool transfers took place, culminating in the encryption of sensitive data by the attackers. Step #10 represents the accomplishment of the attacker objectives and is where this encryption takes place. During this stage, the attackers conducted the following activities:

  • Attack tools are transferred to the target host
  • The attackers take steps to hide their activities behind compromised access tokens and via injection of code into legitimate processes
  • The target data is located
  • The target data is encrypted

This sequence represents the last opportunity to interrupt the attack before the data is encrypted and potentially rendered unusable, so completeness and quality of detail are especially vital here. There are seven substeps in this sequence.

In this example, Palo Alto Networks Cortex XDR raised detections for every substep in the sequence. Additionally, every detection raised provided fully detailed information on what took place, including the technique the attackers employed. Analysts were fully informed as to exactly what took place, why the action is critical, and how it was performed. This level of detail enables analysts to respond immediately, accurately and rapidly, based on the data presented. They do not need to spend time triaging, gathering additional data and investigating in order to gain sufficient context to take action.

When we compare this result to that of a different vendor, such as CrowdStrike, gaps in detail become apparent. While CrowdStrike raised detections for each substep, 43% of those detections were basic telemetry, informing the analyst that something took place but providing no detail to help determine the nature and importance of those events. Such events are indistinguishable from noise and require the analyst to investigate further to determine their context and criticality.

Additionally, 14% of the total events (one of the seven substeps) yielded a high-quality Technique detection only AFTER specific configuration changes were issued to the CrowdStrike agent, meaning this detail was not available to the analyst when the incident first occurred. More than half of the total events in this sequence provided incomplete data, which required additional effort on the part of the analyst to fully understand before they could act with certainty.

This gap is present throughout the entire evaluation: overall, CrowdStrike provided complete information (Technique level) in 71% of its total detections, leaving the remaining 29% in need of further curation by analysts.

Incomplete information is challenging for analysts, but when information is missed entirely, the analyst's job becomes harder still. Microsoft provides an example of this in the same attack sequence. While CrowdStrike did detect each substep (though much of what it reported was incomplete), Microsoft missed four of the seven substeps (57%) entirely, leaving the analyst to piece together what happened from a partial set of events, with no indication of the additional attacker activities that took place but went undetected.

In this case, the analyst must piece together what led to the encryption without any information on the nature and location of the attacker tool that was introduced, the access token that was compromised, or the information the attacker managed to discover before encryption was attempted. Consequently, even in the ideal case where the analyst manages to disrupt the attack chain before encryption completes, the tools, the compromised credential, and the discovered information remain available and useful to the attacker, with no indication to the security team that these elements exist. The attacker can leverage these resources in a follow-up action, and the analyst has no means, via these results, of discovering them preemptively. The analyst is effectively blind until the attacker chooses to take further action using these (or other) resources.

Technique/Substep | CrowdStrike Result | Microsoft Result | Palo Alto Networks Result
10.A.1 - Ingress Tool Transfer | Detected - Technique | Missed | Detected - Technique
10.A.2 - Access Token Manipulation | Detected - Telemetry | Missed | Detected - Technique
10.A.3 - Process Discovery | Detected - Telemetry | Detected - Technique | Detected - Technique
10.A.4 - Process Injection | Detected - Technique | Detected - Technique | Detected - Technique
10.A.5 - System Information Discovery | Detected - Telemetry | Missed | Detected - Technique
10.A.6 - File and Directory Discovery | Detected - Technique | Missed | Detected - Technique
10.A.7 - Encryption for Impact | CONFIGURATION CHANGE - Technique | Detected - Technique | Detected - Technique

Technique: most complete and highest quality detection
Missed: no detection of attacker activity
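The percentages quoted for this sequence can be reproduced with a few lines of Python. This is a minimal sketch rather than an official scoring method: the `STEP_10` data is transcribed from the table above, the Palo Alto Networks column is filled in on the assumption (taken from the article's statement) that every substep was detected at Technique level, and the `breakdown` helper and the "Config change" label are hypothetical names introduced for illustration.

```python
from collections import Counter

# Step 10 outcomes per vendor, in substep order 10.A.1 to 10.A.7.
# "Config change" marks a Technique detection that was only obtained
# after a configuration change (label chosen for this sketch).
STEP_10 = {
    "CrowdStrike": ["Technique", "Telemetry", "Telemetry", "Technique",
                    "Telemetry", "Technique", "Config change"],
    "Microsoft": ["Missed", "Missed", "Technique", "Technique",
                  "Missed", "Missed", "Technique"],
    # Assumption: all seven at Technique level, per the article's text.
    "Palo Alto Networks": ["Technique"] * 7,
}


def breakdown(results: list[str]) -> dict[str, int]:
    """Return each outcome's share of the substeps as a rounded percentage."""
    counts = Counter(results)
    return {outcome: round(100 * n / len(results)) for outcome, n in counts.items()}


if __name__ == "__main__":
    for vendor, results in STEP_10.items():
        print(vendor, breakdown(results))
```

With seven substeps, each substep is worth roughly 14 percentage points, so three telemetry-only detections come to 43% and four missed substeps come to 57%.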

What is the Impact?

Time and effort spent supplementing incomplete information places analysts at a disadvantage against attackers. It enlarges the window in which attackers can continue to act (dwell time) without an effective response from the security team, and it gives the attack an opportunity to spread further and become more damaging before analysts have the information they need. Every data point that must be curated before it can be used reduces the efficacy of the analysts, who must spend additional time curating and analyzing it.

Missed data makes it impossible for the analyst to immediately understand the full scope and sequence of the attack, and creates significant opportunities for the attackers to remain resident within an environment and conduct further activities while the full extent of those activities goes unnoticed. Responding to and evicting an attacker in this case becomes much more difficult, since key elements of the attacker’s footprint are not known to the analyst and are unlikely to be addressed. Essentially, it is like trying to identify a suspect when the only information available is that they are wearing a red shirt.

In Summary

Both scenarios (incomplete data and missed data) increase risk, cost and overhead, and can undermine confidence in the response that is enacted. The result is that already limited analyst resources are tied up in attempting to determine whether the attack is resolved, without the information needed to reach that conclusion with certainty. This is why it is important to look closely at the quality of the detections being raised, AND to be aware of how likely a prospective solution is to leave potentially serious gaps in the data that is collected. Both aspects can have a significant impact on the outcome of breach investigations and can determine the completeness and accuracy of the organization’s response. Detection efficacy is about more than what you detect. The quality of those detections is critical, as is the number of missed events that the analyst must contend with. Neither low-quality detections nor missed events benefit the analyst.

Chart: MITRE Round 4 - Technique Detection % - Minus Configuration Changes
Note: The chart tracks out-of-the-box Technique detections (configuration changes removed).
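The metric behind this chart, Technique detections with configuration changes removed, can be sketched as follows. This is an illustrative calculation only: the `technique_rate` function is a hypothetical helper, and the sample rows are CrowdStrike's seven Step 10 results from the table earlier in this article, not the full 109-substep data set the chart is based on.

```python
def technique_rate(results: list[tuple[str, bool]], include_config_changes: bool) -> float:
    """Return the percentage of substeps detected at Technique level.

    Each result is a (category, required_config_change) pair. When
    include_config_changes is False, Technique detections that were only
    obtained after a configuration change are not counted.
    """
    if not results:
        raise ValueError("results must not be empty")
    hits = sum(
        1
        for category, config_change in results
        if category == "Technique" and (include_config_changes or not config_change)
    )
    return 100 * hits / len(results)


# CrowdStrike's Step 10 rows (10.A.1 to 10.A.7) from the table in this article.
CROWDSTRIKE_STEP_10 = [
    ("Technique", False), ("Telemetry", False), ("Telemetry", False),
    ("Technique", False), ("Telemetry", False), ("Technique", False),
    ("Technique", True),  # 10.A.7: obtained only after a configuration change
]

if __name__ == "__main__":
    with_changes = technique_rate(CROWDSTRIKE_STEP_10, include_config_changes=True)
    out_of_box = technique_rate(CROWDSTRIKE_STEP_10, include_config_changes=False)
    print(f"With configuration changes: {with_changes:.0f}%")
    print(f"Out of the box:             {out_of_box:.0f}%")
```

On these seven rows, removing the single configuration-change detection lowers the Technique rate from four of seven substeps to three of seven, which is the kind of gap the out-of-the-box view is meant to expose.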

What This Means for Buyers

These examples highlight the importance of both completeness and quality of data when responding to attacks; deficiencies in either attribute can have significant consequences. The less data the analyst is given, the greater the potential for key elements of the attack to be missed. Likewise, the less context the data carries, the more effort and time analysts must spend determining its meaning and relevance before it can be acted upon with certainty.

Solutions that give analysts complete data with full context provide the SOC with significant advantages during an investigation, since analyst time and incident time, both of which are critical measures, are not wasted compensating for deficiencies in the data.

Learn more about how to unpack the MITRE results by joining our ‘Dissecting the 2022 MITRE ATT&CK Evaluations’ webinar and visiting our MITRE webpage.