Cloud Data Access: From Chaos to Governance

Oct 25, 2023

Controlling access to sensitive data is foundational to any cybersecurity strategy. But the cloud adds complications on the road to least privilege. In this post, we delve into the realities of data access governance in today’s multicloud architectures and how a DSPM-based solution can help you focus on protecting the data that matters.

Data Access Governance Defined

Data access governance (DAG) is the process of implementing policies, procedures, and controls to manage access to organizational data. When correctly implemented, DAG ensures that only authorized users and systems can access, manipulate, and share sensitive information, in accordance with data security and compliance requirements.

How Cloud Architecture Challenges Centralized Access Governance Models

The principle of least privilege states that any process, program or user must only be able to access the resources and information necessary for their legitimate purpose. In the context of data, entities should be granted the minimum permissions to view or modify records, as required to perform their jobs. An analyst looking at customer purchase patterns doesn't need access to user email addresses if they can rely on anonymized user IDs, for example.
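As a rough illustration of the analyst example above, the sketch below builds a view of purchase records in which the email address is replaced by a stable user ID. It assumes a keyed hash (HMAC-SHA256) as the ID scheme, which is strictly pseudonymization rather than full anonymization. The key, field names and sample rows are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key for illustration; a real one would live in a secrets manager.
PSEUDONYM_KEY = b"example-key-not-for-production"


def pseudonymize(email: str) -> str:
    """Return a stable ID derived from a normalized email address."""
    normalized = email.strip().lower()
    digest = hmac.new(PSEUDONYM_KEY, normalized.encode("utf-8"), hashlib.sha256)
    return "user_" + digest.hexdigest()[:16]


def analyst_view(purchases: list) -> list:
    """Copy each row without the email field, adding a pseudonymous user_id."""
    view = []
    for row in purchases:
        safe = {k: v for k, v in row.items() if k != "email"}
        safe["user_id"] = pseudonymize(row["email"])
        view.append(safe)
    return view


# Hypothetical sample data.
purchases = [
    {"email": "Ana@example.com", "sku": "A-100", "amount": 25.0},
    {"email": "ana@example.com ", "sku": "B-200", "amount": 12.5},
    {"email": "bo@example.com", "sku": "A-100", "amount": 25.0},
]
view = analyst_view(purchases)
```

The analyst can still group purchases by `user_id`, because the same address always maps to the same ID, but never needs read access to the email column itself.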

When enterprises stored their data in monolithic, on-premises data warehouse architectures, implementing least privilege was easier to achieve. The data platforms were tightly ruled by sysadmins. To prevent unauthorized access, they could rely on a combination of network security solutions (firewall and enterprise VPN), role-based access control (RBAC) at the database level, and agent-based monitoring of user activity and audit trails.

Figure 1: Traditional on-premises data warehouse architecture vs. modern data stack

The cloud removed the hard coupling between data systems and IT and made business teams much more agile. But the drive to become more data-driven means more individuals and systems need access to data. And while the basic approach of least privilege hasn’t changed, it gets difficult to implement when you have hundreds of users and dozens of roles, all of which legitimately need some level of access to something.

The Challenge of Decentralization

Cloud infrastructure is elastic and easy to provision. Different business units manage their own sets of resources, which may not share a common governance layer, so consistent access control policies are difficult to enforce.

Multiple Access Pathways and Complex Dependencies

Cloud services can be exposed via multiple interfaces and APIs and often form an intricate web of dependencies. Deciphering which permissions are legitimately needed requires significant technical effort.

IAM Complexity Across Multiple Vendors

Organizations that rely on multiple CSP IAM tools, each with its own set of user identities, authentication mechanisms and access policies, struggle to establish the net-effective permissions of their users.
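To make the idea of net-effective permissions concrete, here is a minimal sketch that merges grants from several providers into one view per person. It assumes the grants have already been exported and normalized into simple tuples, and that a mapping from provider-specific identities to a single person exists. Both are hypothetical simplifications: real IAM evaluation also involves groups, roles, deny rules and resource policies, none of which are modeled here.

```python
from collections import defaultdict

# Hypothetical mapping from (provider, identity) to one person.
IDENTITY_MAP = {
    ("aws", "arn:aws:iam::111111111111:user/jdoe"): "jdoe",
    ("azure", "jdoe@contoso.example"): "jdoe",
    ("gcp", "asmith@example.com"): "asmith",
}

# Hypothetical normalized grants: (provider, identity, datastore, action).
GRANTS = [
    ("aws", "arn:aws:iam::111111111111:user/jdoe", "s3://orders", "read"),
    ("azure", "jdoe@contoso.example", "blob://customers", "read"),
    ("azure", "jdoe@contoso.example", "blob://customers", "write"),
    ("gcp", "asmith@example.com", "bq://analytics", "read"),
]


def effective_permissions(grants, identity_map):
    """Union allow-grants per person and datastore across all providers."""
    merged = defaultdict(lambda: defaultdict(set))
    for provider, identity, datastore, action in grants:
        # Identities with no known owner are kept visible rather than dropped.
        person = identity_map.get(
            (provider, identity), f"unmapped:{provider}:{identity}"
        )
        merged[person][datastore].add(action)
    return {
        person: {store: sorted(actions) for store, actions in stores.items()}
        for person, stores in merged.items()
    }


perms = effective_permissions(GRANTS, IDENTITY_MAP)
```

Even in this toy form, the hard part is visible: without a reliable identity mapping across vendors, the same person shows up as several unrelated principals.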

Fragmented, API-Based Architectures

Businesses are gravitating toward data lakes and the modern data stack, which requires building an entire ecosystem of API-connected tools for data ingestion, integration, storage and compute. Many of these tools can access sensitive data, which adds further requirements for permissions management.

Source: LakeFS: The State of Data Engineering 2023

Let’s look at a few examples to understand how these challenges can quickly become major security issues.

3 Common Risk Scenarios Related to Data Access Governance

Lack of Accountability

Lack of proper oversight can result in unauthorized or overprivileged access going unnoticed. Without a clear audit trail of user activity and permissions granted, security incidents become much more difficult to triage and investigate.

An example: A large enterprise discovers its customer details have been leaked to the dark web. Due to complicated cross-account permissions, it takes weeks to determine that a sales engineer’s compromised account was responsible for the breach. During that time, the company is unable to identify other compromised resources.

Compliance Violations

Regulatory and industry frameworks such as GDPR and PCI DSS explicitly require certain controls related to data access. Having a clear picture of who has access to which dataset can also play a part in evidence collection ahead of an audit.

An example: A financial services company is subject to PCI DSS compliance, which requires strict control over access to payment card information. Due to mishandled permissions, a developer copies the cardholder data to a noncompliant staging environment. The violation is discovered during a routine audit.

Data Exfiltration

Uncontrolled access to sensitive data significantly increases the risk of data loss, especially through insider threats or compromised credentials. This can lead to severe outcomes such as ransomware attacks or customer data leaks.

An example: A contractor needs temporary access to a database containing customer details, but their access isn't revoked after the project is done. This goes unnoticed until 6 months later, when the contractor downloads the database and sells it to a competitor.
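The contractor scenario is the kind of issue a simple periodic check can surface. The sketch below assumes every temporary grant is recorded with an expiry date, which is a hypothetical convention and not something cloud providers enforce by default. The principal and datastore names are made up.

```python
from datetime import date

# Hypothetical grant records; "expires" is None for standing access.
GRANTS = [
    {"principal": "contractor-42", "datastore": "customers_db", "expires": date(2023, 3, 31)},
    {"principal": "analyst-7", "datastore": "customers_db", "expires": None},
    {"principal": "vendor-etl", "datastore": "orders_db", "expires": date(2024, 1, 15)},
]


def expired_grants(grants, today):
    """Return grants whose expiry date has passed but which still exist."""
    return [
        g for g in grants
        if g["expires"] is not None and g["expires"] < today
    ]


stale = expired_grants(GRANTS, today=date(2023, 10, 25))
for g in stale:
    print(f"Revoke {g['principal']} on {g['datastore']} (expired {g['expires']})")
```

Run on a schedule, a check like this would flag the contractor's access shortly after the project ends rather than six months later.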

Solving Your Cloud Data Access Problems

At this point you might be thinking, “This sounds like a permissions issue. Wouldn’t you solve it with an IAM tool?”

All too often, trying to prevent unauthorized access on the configuration level turns into a game of whack-a-mole. Ultimately, security teams become bogged down with daily alerts and no means to prioritize incidents.

Identity and access management (IAM) tools play a critical role in managing users, permissions and access controls within an organization. But IAM is complex and not specifically designed to address data access issues. It treats datastores like any other cloud resource and has no awareness of the data they hold or of how data flows between services.

Cloud infrastructure entitlement management (CIEM) simplifies and consolidates IAM for cloud environments, reducing permission sprawl. While CIEM helps monitor access to datastores, not all standalone CIEM tools are content-aware, meaning that many can’t see who has access to sensitive data.

While data security posture management (DSPM) tools classify sensitive data, not all DSPM tools offer data access governance (DAG). The lack of access-awareness can limit standalone DSPM tools.

Understand that data access challenges aren’t typical permissions issues. That distinction is paramount.

Datastores have unique access patterns. They usually require broader and more fluid access than other cloud resources. In contrast to typical cloud resources, more people have a legitimate purpose for wanting access to datastores.

What’s more, not all data is created equal. While all data exposure is undesirable, the risks associated with sensitive data (PII, PCI, etc.) are far greater. Businesses benefit most from a targeted approach that focuses on preventing unauthorized access to these high-risk data assets. After all, sensitive data is a prime target for modern cyberattacks. When it's compromised, the financial and reputational consequences can be dire.

Modern enterprises are moving toward a data-centric approach to cloud data access — one that seamlessly integrates CIEM and DSPM as only a CNAPP can.

The DSPM Solution to DAG

What does a data-centric solution to data access governance look like? The answer starts with data security posture management (DSPM).

DSPM is a set of practices and technologies used to discover, classify and prioritize sensitive data stored in cloud environments. By building on the insights provided by DSPM, organizations can gain immediate visibility into who has access to which cloud datastore and prioritize the issues that put sensitive data at risk.

Start with an inventory of your sensitive data. Before you can prioritize and implement effective data access governance, you need to know what data you hold and where it resides. DSPM tools provide a comprehensive inventory of your sensitive data, which enables you to identify the most critical assets that require a higher level of attention.
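To give a feel for what building such an inventory involves, here is a deliberately small sketch that labels datastores by scanning sample text for email addresses and card numbers. The regular expressions, label names and sample data are assumptions for illustration only. Commercial DSPM classifiers use far richer detection than two patterns and a Luhn checksum.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum, used to filter out random digit runs."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(text: str) -> set:
    """Return the sensitivity labels found in one piece of text."""
    labels = set()
    if EMAIL_RE.search(text):
        labels.add("PII:email")
    for match in CARD_RE.finditer(text):
        if luhn_valid(re.sub(r"\D", "", match.group())):
            labels.add("PCI:card_number")
    return labels


# Hypothetical samples drawn from each datastore.
SAMPLES = {
    "s3://crm-exports": ["ana@example.com bought A-100"],
    "s3://payments-raw": ["card 4111 1111 1111 1111 approved"],
    "s3://web-logs": ["GET /index.html 200", "order 1234 5678 9012 3456 shipped"],
}

inventory = {
    store: sorted(set().union(*(classify(t) for t in texts)))
    for store, texts in SAMPLES.items()
}
```

The resulting `inventory` maps each datastore to the labels found in it, which is the input later steps need in order to focus on the stores that actually hold sensitive data.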

Visualize and Map Data Access

Prisma Cloud's DSPM shows a visual mapping of data access in your organization, which helps you understand the relationships between users, roles, resources and datastores. Within this visual interface, you can identify and review access permissions, detect irregularities and enforce least-privileged access. You can also identify the specific role and account used to gain access to the resource.

Prioritize High-Risk Issues

Once you have a clear picture of your sensitive data assets and understand who has access to what, you can identify and prioritize issues that need your immediate attention.

Cross-Account Access to Sensitive Data Assets

You’ll have occasions where an external account needs access to a cloud resource, such as an ETL tool reading data from Azure Blob Storage. Many exfiltration and compliance issues, though, start with permissions that remain active after they’re no longer needed. DSPM helps you keep track of cross-account permissions and identify whether a specific permission remains necessary.

Non-SSO Users with Access to Sensitive Data

Users who aren’t authenticated via single sign-on (SSO) pose a risk, as their access may be more difficult to manage and secure. Prisma Cloud lets you identify non-SSO users whose credentials allow them to access sensitive data. You can then verify that the access is justified and monitored.
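The two checks described above, cross-account access and non-SSO access to sensitive data, can be sketched as a filter over access records. This is not Prisma Cloud's API or data model. The `Access` record, account names and the set of sensitive datastores are hypothetical, and the ranking rule (more reasons first) is an arbitrary choice for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    principal: str
    principal_account: str
    datastore: str
    datastore_account: str
    sso: bool


# Hypothetical output of a sensitive-data inventory.
SENSITIVE = {"blob://customers", "s3://payments-raw"}

# Hypothetical access records.
ACCESS = [
    Access("etl-tool", "acct-vendor", "blob://customers", "acct-prod", sso=False),
    Access("jdoe", "acct-prod", "s3://payments-raw", "acct-prod", sso=True),
    Access("legacy-svc", "acct-prod", "s3://payments-raw", "acct-prod", sso=False),
    Access("asmith", "acct-dev", "s3://web-logs", "acct-prod", sso=False),
]


def findings(access, sensitive):
    """Flag risky access to sensitive datastores, most reasons first."""
    out = []
    for a in access:
        if a.datastore not in sensitive:
            continue  # only sensitive data is in scope
        reasons = []
        if a.principal_account != a.datastore_account:
            reasons.append("cross-account")
        if not a.sso:
            reasons.append("non-SSO")
        if reasons:
            out.append((a.principal, a.datastore, reasons))
    out.sort(key=lambda f: len(f[2]), reverse=True)
    return out


results = findings(ACCESS, SENSITIVE)
```

Note that `asmith` is never flagged despite being both cross-account and non-SSO, because the datastore involved holds no sensitive data. That filtering is what keeps the list of findings short enough to act on.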

Learn More

To effectively govern data access, it's essential to adopt a data-centric approach using DSPM alongside CIEM for fine-grained access control. By leveraging their combined power within a CNAPP, organizations can effectively balance data security, compliance, and operational agility in today's complex cloud environments.

