An Introduction to Data Detection and Response (DDR)

Oct 19, 2023

How long would it take you to respond to a cloud data breach?

For most organizations, the answer is far too long. When it takes less than a minute for an authorized user to download an entire database’s worth of customer data, it’s clear that something is amiss.

The move to cloud infrastructure requires new approaches to data loss prevention (DLP). Monitoring data in real-time becomes more complicated in the cloud. You can't install an agent on a database hosted by Amazon or Google or place a proxy in front of thousands of datastores. The emerging solution is data detection and response (DDR), which is often seen as the new cloud DLP — a means to introduce dynamic monitoring to cloud environments.

In this post, we'll cover the basics of DDR — why it's needed and how you can get started.

How Data Detection and Response (DDR) Solutions Work

DDR describes a set of technology-enabled solutions used to secure cloud data from exfiltration. DDR focuses on real-time monitoring, detection and response. It provides dynamic monitoring on top of the static defense layers provided by CSPM and DSPM tools.

In a nutshell: DDR solutions use real-time log analytics to monitor cloud environments that store data and detect data risks as soon as they occur.

In more detail: Today's organizations store data across a wide variety of cloud environments. For a midsize or larger enterprise, data will be found in multiple PaaS (e.g., Amazon RDS), IaaS (virtual machines running datastores), and DBaaS (e.g., Snowflake) tools.

Monitoring every data action, however, isn’t feasible at this scale. This is where DDR solutions come in. Many include DSPM capabilities that lay the groundwork by discovering and classifying data assets. The DSPM process highlights risks associated with those assets, such as unencrypted sensitive data or a programmatic data flow that violates data sovereignty rules. DSPM prioritizes these risks, and the relevant data owners or IT teams remediate them.

Once sensitive data assets have been mapped, the DDR solution starts monitoring activity related to these assets. The solution accomplishes this via the cloud-native logging available in the public cloud. With the relevant logging enabled, the cloud provider records an event for each query or read request.

The DDR tool will parse the log in near real time and apply a threat model to identify suspicious activity, such as data flowing to an external account. If this is a new risk, the DDR tool will issue an alert and suggest the best response. Alerted risks are generally urgent and need to be acted upon immediately.
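To make that flow concrete, here's a minimal sketch of a single rule: alert the first time a sensitive asset flows to an account outside a trusted set. The `DataEvent` shape, the account names and the asset names are all hypothetical. Real cloud logs carry far more fields, and a production threat model would combine many such rules.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class DataEvent:
    """Hypothetical, simplified log record (not a real cloud log schema)."""
    actor: str
    action: str
    asset: str
    target_account: str


@dataclass(frozen=True)
class Alert:
    event: DataEvent
    reason: str


def detect_external_flows(
    events: Iterable[DataEvent],
    sensitive_assets: set[str],
    trusted_accounts: set[str],
) -> Iterator[Alert]:
    """Yield one alert per new (asset, external account) pair."""
    seen: set[tuple[str, str]] = set()
    for event in events:
        if event.asset not in sensitive_assets:
            continue  # only mapped sensitive assets are monitored
        if event.target_account in trusted_accounts:
            continue  # data stays inside the organization
        key = (event.asset, event.target_account)
        if key in seen:
            continue  # this risk was already alerted on
        seen.add(key)
        yield Alert(event, f"{event.asset} flowing to external account {event.target_account}")


# Assumed sample data, for illustration only.
events = [
    DataEvent("etl-job", "copy", "orders-db", "acct-internal"),
    DataEvent("alice", "copy", "orders-db", "acct-unknown"),
    DataEvent("alice", "copy", "orders-db", "acct-unknown"),  # repeat of a known risk
    DataEvent("bob", "copy", "public-docs", "acct-unknown"),  # asset is not sensitive
]
alerts = list(detect_external_flows(events, {"orders-db"}, {"acct-internal"}))
for alert in alerts:
    print(alert.reason)
```

Tracking the pairs already seen is one simple way to alert only on new risks, so repeated copies to the same destination don't produce repeated alerts.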

Policy Examples: How DDR Is Used in Practice

To better understand the types of incidents a DDR solution would detect, let’s look at a few examples that we’ve seen in the wild.

Data Sovereignty Issues

DDR can detect when data is flowing to an unauthorized physical location, helping organizations to comply with regulations to store data in specific geographical areas (such as the EU or California).
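As a rough illustration, a sovereignty policy can be thought of as a mapping from a data classification to the regions where that data may live. The sketch below assumes a hypothetical `eu_customer_pii` classification and uses AWS region names as examples. It isn't how any particular DDR product models residency rules.

```python
# Hypothetical policy: data classification -> regions where it may be stored.
ALLOWED_REGIONS: dict[str, set[str]] = {
    "eu_customer_pii": {"eu-west-1", "eu-central-1"},
}


def violates_sovereignty(classification: str, destination_region: str) -> bool:
    """Return True if data of this classification may not land in the destination region."""
    allowed = ALLOWED_REGIONS.get(classification)
    if allowed is None:
        return False  # no residency rule is defined for this classification
    return destination_region not in allowed


print(violates_sovereignty("eu_customer_pii", "us-east-1"))  # outside the allowed set
print(violates_sovereignty("eu_customer_pii", "eu-west-1"))  # inside the allowed set
```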

Assets Moved to Unencrypted or Unsecured Storage

As data flows between databases and cloud storage, it can easily land in a datastore lacking optimal security (often resulting from a temporary but forgotten workaround). DDR raises an alert on this type of movement as it happens.
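One way to picture this check is to compare the security attributes of the source and destination whenever a copy event is observed. The `Datastore` fields below are assumptions chosen for the example. A real tool would pull these attributes from the cloud provider's configuration data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datastore:
    """Hypothetical summary of a datastore's security attributes."""
    name: str
    encrypted: bool
    public: bool


def security_downgrades(source: Datastore, destination: Datastore) -> list[str]:
    """List the ways the destination is less protected than the source."""
    findings = []
    if source.encrypted and not destination.encrypted:
        findings.append(f"{destination.name} is not encrypted")
    if destination.public and not source.public:
        findings.append(f"{destination.name} is publicly accessible")
    return findings


prod = Datastore("customers-db", encrypted=True, public=False)
scratch = Datastore("tmp-export-bucket", encrypted=False, public=True)
print(security_downgrades(prod, scratch))
```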

Snapshots and Shadow Backups

Teams are under increasing pressure to do more with data, which can lead to shadow analytics happening outside of regular, authorized workflows. DDR helps find copies of data stored or shared in ways that can lead to a breach.

DDR and DSPM

Data security posture management (DSPM) protects from the data outwards by adding a layer of data awareness. The DSPM tool scans the stored data — detecting assets that contain sensitive data (such as PII or access codes) — classifies the data and assesses the risk associated with it. This gives security teams a clearer picture of data risk and data flow, allowing them to prioritize the cloud assets where a breach would be most damaging.

While DSPM offers more granular and fine-tuned cloud data protection than cloud security posture management (CSPM), both of these solutions are static and focused on posture. They allow organizations to understand where the risk lies, but they offer little in terms of real-time incident response.

On the other hand, DDR is dynamic. It focuses on data events happening in real time, sending alerts and giving security teams a chance to intervene before the damage is done (or while it's still minimal). It works at the level of individual events, whereas the other solutions look at configurations and data at rest.

An Example Scenario

An employee has access to a database containing customer data. This access is legitimately authorized due to the nature of the employee’s role at the company. But, when the employee decides to leave the company, she copies the database to her personal laptop with plans to take it to the next company.

In this scenario, everything was fine permissions-wise, but the result is a major exfiltration event. A well-calibrated threat model could detect that this export contained an unusual batch of data, or other irregularities in this event that should raise a red flag. The DDR tool would send an alert and provide full forensics — pinpointing the exact asset and actor involved in the exfiltration. This would save precious time and allow security teams to intervene before any real damage is done.
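To give a feel for what "an unusual batch of data" might mean, here's a deliberately simple sketch that compares an export against the same user's past exports with a z-score. This is a stand-in technique chosen for illustration, not the threat model of any DDR product. The row counts and the threshold of 3 are assumptions.

```python
from statistics import mean, pstdev


def is_unusual_export(history_rows: list[int], export_rows: int, threshold: float = 3.0) -> bool:
    """Flag an export whose size is far above this actor's historical exports."""
    if len(history_rows) < 2:
        return False  # too little history to judge
    mu = mean(history_rows)
    sigma = pstdev(history_rows)
    if sigma == 0:
        return export_rows > mu  # every past export had the same size
    return (export_rows - mu) / sigma > threshold


# Assumed history: the employee usually exports about a hundred rows at a time.
history = [100, 120, 90, 110, 105]
print(is_unusual_export(history, 115))        # a typical export
print(is_unusual_export(history, 2_000_000))  # the whole customer table
```

A check like this only looks at volume. It says nothing about who the actor is or where the data goes, which is why real threat models weigh several signals together.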

The Importance of the Threat Model in DDR

As the example above demonstrates, the threat model is a crucial component of a DDR solution.

It's not enough to be able to access and parse cloud logs in real time. Every day, a massive volume of data events (new data services onboarded, new datastores, backups and snapshots created) takes place in an enterprise's cloud account. If the DDR tool can't identify which of these pose an actual risk, it will either miss critical incidents or bury security teams, who already suffer from notification overload, in false positives.

Threat models are developed by cybersecurity researchers, taking into account:

  • Attack patterns revealed in previous data breaches
  • Specific weaknesses in every data service and how these can be exploited
  • The unique footprint of a security incident in the cloud logs

Figure 1: DDR with DSPM use cases

While many DDR solutions are likely to appear in the next few years, the true differentiator between them will be the quality and accuracy of the threat model that powers their real-time detection.

Why DDR? Technical and Business Benefits

There’s no shortage of cybersecurity tools on the current CISO agenda. Is another type of tool needed or will it contribute to further tooling bloat?

Figure 2: Benefits of data detection and response

DDR provides mission-critical functionality that is missing from the existing cloud security stack. When agents aren’t feasible, you need to monitor every activity that concerns your data. DDR protects your data from being exfiltrated or misused, as well as from violating a regulation. DDR helps reduce operational overhead by integrating with SIEM/SOAR solutions, so teams can consume all alerts in one place.

When Static Defense Layers Aren’t Enough: Lessons from a Breach

The 2018 Imperva breach started with an attacker getting access to a snapshot of an Amazon RDS database containing sensitive data. The attacker used an AWS API key stolen from a misconfigured compute instance that was publicly accessible. Could a similar event happen today?

While CSPM tools might have been able to identify the misconfiguration, and DSPM might have been able to detect that sensitive data was stored on the misconfigured instance, this example also highlights the limitations of these approaches. Once the attacker has access that appears legitimate, neither tool would be able to identify the unusual behavior: exporting a snapshot of the database to an unknown device.

Indeed, the Imperva breach was discovered 10 months after the fact, via a third party. During this period, the company wasn’t aware and couldn’t notify its users that sensitive data had been leaked.

A DDR solution, which monitors the AWS account at the event log level, could have potentially identified such an attack in real time and alerted internal security teams — allowing them to respond immediately, rather than many months later.

Challenges Exacerbated by Data Tooling Sprawl, Microservices and Multicloud Environments

Rather than a single data warehouse or data platform, today's sensitive data can be found in dozens, if not hundreds, of datastores.

Business and data teams remain on the lookout for new ways to drive profitability through analytics. This often means adopting different tools and technologies — databases, serverless query engines, BI and data science platforms. On the dev side, microservice architecture splinters a codebase into dozens or hundreds of smaller services, each with an attached data asset.

The result of these processes is that data can move more freely than ever, making it harder to keep track of who’s accessing, altering or exfiltrating data. Policies should prevent sensitive data from being copied without good reason, but the reality is inevitably messier. This is before mentioning hybrid and multicloud environments, which are growing increasingly popular and add significant complexity.

With data distributed across so many potential targets, the attack surface becomes difficult to monitor. The cloud or service providers will offer some native solutions, but these won’t cover tools offered by different vendors — especially when they're hosted in a different cloud.

A Single Threat Model Across Environments

DDR tools address this challenge by monitoring activity across cloud environments, as it's recorded in each cloud provider's centralized logging systems (for example, AWS CloudTrail or Azure Monitor).

Here’s an example of such a log:

Figure 3: Sample CloudTrail event

In Figure 3, we see that a malicious actor has made a snapshot of a sensitive RDS database public. This means that the snapshot, along with any sensitive data it contains, can now be restored in any AWS account, potentially exposing that data to unauthorized access.

DDR allows organizations to monitor all cloud activity from one place, rather than building ad hoc solutions per data service or relying on a patchwork of security tools. Once a threat model is deployed, it can be applied to every environment where sensitive data is stored.
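As one illustration of such a rule, here's a minimal sketch that flags the kind of event described for Figure 3: a manual RDS snapshot shared with all AWS accounts. The `sample` record is hand-written for this example and is not the event from the figure. The field names reflect our understanding of how CloudTrail records the RDS `ModifyDBSnapshotAttribute` call, so treat them as assumptions and verify them against your own logs.

```python
def is_public_rds_snapshot_share(record: dict) -> bool:
    """True if the record makes an RDS DB snapshot restorable by any AWS account."""
    if record.get("eventSource") != "rds.amazonaws.com":
        return False
    if record.get("eventName") != "ModifyDBSnapshotAttribute":
        return False
    params = record.get("requestParameters") or {}
    added = params.get("valuesToAdd") or []
    # Adding "all" to the "restore" attribute shares the snapshot publicly.
    return params.get("attributeName") == "restore" and "all" in added


# Hand-written, abbreviated record (assumed shape, illustration only).
sample = {
    "eventSource": "rds.amazonaws.com",
    "eventName": "ModifyDBSnapshotAttribute",
    "userIdentity": {"arn": "arn:aws:iam::111122223333:user/example"},
    "requestParameters": {
        "dBSnapshotIdentifier": "customers-snapshot",
        "attributeName": "restore",
        "valuesToAdd": ["all"],
    },
}

if is_public_rds_snapshot_share(sample):
    print("ALERT: snapshot made public by", sample["userIdentity"]["arn"])
```

Sharing a snapshot with one specific account ID would not trip this rule. Whether that case deserves its own alert depends on which accounts your organization trusts.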

DDR replaces the labor- and compute-intensive processes of collecting, parsing and analyzing data from each database or VM. This helps security teams reduce their overhead and focus their efforts on managing strategic risk, rather than playing whack-a-mole with dozens or hundreds of potential vulnerabilities.

Getting Started with DDR

DDR is relatively new. As with any technology decision, it's important to understand how it fits into your organization's way of doing things, and to map out the prerequisites for a successful implementation beforehand. Here are a few pointers to look out for if you're thinking of investing in a DDR solution.

1. Know where your sensitive data is. A strong DSPM foundation, which can identify and classify sensitive data assets, is essential. Monitoring every single query on every single datastore isn't feasible (unless you have infinite resources to spend). You want to narrow down the monitoring surface to the places where sensitive data is actually stored. This should be part of your DDR solution, rather than something that requires additional tooling.

2. Have a clear policy for dealing with incidents. DDR solutions will surface potential issues, but if you don't have a well-defined process for responding to them, you’ll only further overwhelm your security and engineering teams.

3. Decide on an owner. Every enterprise has its own ways of dealing with data in general, and data security in particular. Make sure you know where DDR falls within your org chart. If it's with SOC teams, they need to understand data context and how to avoid production-breaking changes. If it's DataOps or DevOps, they need to know how to respond to a security incident.
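The first pointer, narrowing the monitoring surface, can be sketched in a few lines. The inventory format, labels and event shape below are hypothetical stand-ins for whatever your DSPM tooling actually exports.

```python
def monitored_assets(inventory: list[dict], sensitive_labels: set[str]) -> set[str]:
    """Pick the assets whose classification labels include a sensitive one."""
    return {
        item["asset"]
        for item in inventory
        if sensitive_labels & set(item["labels"])
    }


def in_scope(events: list[dict], assets: set[str]) -> list[dict]:
    """Keep only the events that touch a monitored asset."""
    return [event for event in events if event["asset"] in assets]


# Assumed DSPM output and log events, for illustration only.
inventory = [
    {"asset": "customers-db", "labels": ["pii"]},
    {"asset": "build-cache", "labels": []},
    {"asset": "billing-db", "labels": ["financial", "pii"]},
]
events = [
    {"asset": "build-cache", "action": "read"},
    {"asset": "billing-db", "action": "export"},
]

scope = monitored_assets(inventory, {"pii", "financial"})
print(in_scope(events, scope))
```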

Supporting Innovation Without Sacrificing Security

The cloud is here to stay, as are microservices and containers. As cybersecurity professionals, we can't prevent the organization from adopting technologies that accelerate innovation and give developers more flexibility. But we need to do everything we can to prevent the potential calamity of a data breach or ransomware attack.

DDR offers a critical aspect that was previously missing in the cloud security landscape: dynamic monitoring of complex and multicloud environments. Monitoring real-time data activity, in addition to static security posture, can help security teams catch incidents earlier, averting disastrous data loss or minimizing the damage it causes.

Learn More

If you haven’t tried Prisma Cloud and would like to test drive best-in-class Code to Cloud™ security, we’d love for you to experience a free 30-day trial.

 

