Container Escape: New Vulnerabilities Affecting Docker and RunC

Feb 05, 2024

A recent discovery identifies four critical vulnerabilities affecting Docker and other container engines. Collectively called "Leaky Vessels", the vulnerabilities threaten the isolation that containers are expected to provide from their host operating systems. The new CVEs expose weaknesses in core components that container platforms rely on to launch containers and build images.

In this blog post, we discuss the Leaky Vessels vulnerabilities and analyze each one in depth: its attack vectors, its impact, and suggested mitigation strategies.

  • CVE-2024-21626 involves a file descriptor leak in runc, potentially enabling attackers to access the host system.
  • CVE-2024-23651 involves a race condition in Docker and BuildKit that could lead to container breakouts and host access.
  • CVE-2024-23652 affects BuildKit and allows attackers to potentially delete arbitrary files on the host during image building.
  • CVE-2024-23653 exists in BuildKit and could enable attackers to break out of containers during the image-building process.

While timely patching remains crucial, a proactive approach is paramount to thwarting these threats and safeguarding the containerized landscape.

CVE-2024-21626: Runc File Descriptor Leak and Container Breakout

CVE-2024-21626 resides in runc, a critical tool responsible for spawning containers. Due to an internal file descriptor leak in versions up to and including 1.1.11, attackers can manipulate the working directory (process.cwd) of a newly spawned container process. Like an unlocked door, the leak leaves the file descriptor open, providing access. This manipulation allows the process to access the host filesystem, granting unauthorized access and potential container breakout. The risk is significantly reduced, however, when using prebuilt images from reputable registries that maintain patched images.

CVE-2024-21626 Technical Breakdown

The Leak

The vulnerability arises from how runc handles file descriptors during container spawning. When runc sets the container's working directory with chdir(2), a file descriptor that runc opened internally is still open, even though it is flagged O_CLOEXEC. Close-on-exec descriptors are closed only when the process later calls execve(2), so the descriptor is still available at the moment the working directory is set.

The open file descriptor becomes accessible to the container process if its path resolves to a directory on the host filesystem.
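The close-on-exec behavior is easy to observe in a small experiment. The sketch below is not runc code. It is a Python illustration, assuming a POSIX system, of the general rule that a descriptor flagged O_CLOEXEC stays open and usable until the process executes a new program. The helper name chdir_via_fd is hypothetical.

```python
import os
import tempfile


def chdir_via_fd(directory: str) -> tuple[str, bool]:
    """Open `directory` with O_CLOEXEC, then change into it through the descriptor.

    Returns the resulting working directory and whether the descriptor would be
    inherited across exec (False means close-on-exec is set).
    """
    previous = os.getcwd()
    fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY | os.O_CLOEXEC)
    try:
        inheritable = os.get_inheritable(fd)
        # The descriptor is close-on-exec, yet it is still open at this point,
        # so changing directory through it succeeds before any exec happens.
        os.fchdir(fd)
        return os.getcwd(), inheritable
    finally:
        # Restore the original working directory and release the descriptor.
        os.chdir(previous)
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(chdir_via_fd(tmp))
```

The point of the sketch is the ordering: anything that happens between opening the descriptor and the eventual exec can still reach whatever the descriptor refers to.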

Exploitation and Possible Attack Vectors

Attackers leverage CVE-2024-21626 by manipulating the process.cwd value. In a malicious image attack scenario, the image script sets process.cwd to the leaked descriptor path. In the "runc run" attack scenario, the attacker sets environment variables or command-line arguments that influence process.cwd during container creation.
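As a rough illustration of what defenders can look for, the following sketch flags a working directory that points into a process's file descriptor table, such as /proc/self/fd/7. It assumes the container configuration is available as OCI runtime config JSON, where the working directory is stored under process.cwd. The function names and the regular expression are hypothetical, and the match is on the raw string only, so it will not catch every obfuscated path.

```python
import json
import re

# Paths that resolve through a process's file descriptor table,
# for example /proc/self/fd/7 or /dev/fd/3, optionally followed by more path.
_FD_PATH = re.compile(
    r"^/(proc/(self|thread-self|\d+)/(task/\d+/)?fd|dev/fd)/\d+(/|$)"
)


def cwd_is_suspicious(cwd: str) -> bool:
    """Return True if a working directory points into a file descriptor path."""
    return bool(_FD_PATH.match(cwd))


def check_oci_config(config_json: str) -> bool:
    """Flag an OCI runtime config whose process.cwd targets a descriptor path."""
    config = json.loads(config_json)
    cwd = config.get("process", {}).get("cwd", "")
    return cwd_is_suspicious(cwd)
```

A check like this is a detection aid, not a fix. Upgrading runc remains the actual remedy.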

Malicious Image

An attacker can craft a container image containing a script that sets process.cwd to a path on the host filesystem that is reachable through a leaked file descriptor. When the image is executed, the container process gains access to the host, potentially leading to privilege escalation and compromise.

An attacker needs to embed malicious code within container images to exploit the file descriptor leak and potentially gain host access. This can occur through various means, including:

  • Compromised Supply Chain: Injection of malicious code during the image build or distribution, even within trusted repositories.
  • Social Engineering: Deception of users into running malicious images, such as through phishing or misleading image names.
  • Various Attack Vectors: Exploitation of vulnerabilities in container orchestration platforms or management tools to manipulate container processes or execute malicious images.

Runc "Run" Command

Attackers with some level of access to the host system (e.g., indirect access through a vulnerable service or application running on the host) can exploit the vulnerability while running a container using the runc run command. By manipulating arguments and environment variables, they can trick the container process into setting its working directory to a leaked file descriptor on the host, achieving container breakout.

Runc "Exec" Command

Another scenario has a broader impact than the one above but is more difficult to exploit. Here the attacker knows that an administrative process calls runc exec with the working directory flag (--cwd) and knows the number of the leaked file descriptor. The attacker can then replace that path with a symbolic link to the leaked file descriptor. Successful exploitation gives the attacker access to the host filesystem, bypassing the PR_SET_DUMPABLE protection.

Impact of CVE-2024-21626

Once the container process has its working directory set to the leaked descriptor path, it can access the host filesystem at that location. This can grant read, write and even execute privileges depending on the permissions of the leaked file descriptor.

Mitigation

Upgrade runc to version 1.1.12 or later, which addresses the vulnerability by properly closing leaked file descriptors. Additionally, because a prebuilt image from a reputable registry has already been built, using one reduces the risk. Even so, compare the build date of the image with the runc patch date.

CVE-2024-23651: Docker Symlink Race Condition in Build

CVE-2024-23651 stems from a symlink race condition in the cache mount mechanism implemented in BuildKit versions up to and including 0.12.4. Docker is affected because Docker Engine bundles BuildKit for image builds; Docker Engine 24.0.9 and 25.0.2 are the first releases that ship the fixed BuildKit. Exploiting this symlink race condition during an image build allows attackers to access files from the host system, but the likelihood of successful exploitation is low because the race is difficult to win.

Technical Breakdown of CVE-2024-23651

The Race Condition

When you use the RUN --mount=type=cache directive in a Dockerfile, you can specify a source for the cache mount. This source is a directory on the host system that the Docker daemon uses as the cache directory. Docker then mounts this directory at a specified location in the Docker image being built.

The vulnerability arises when the validation of this cache mount source path is exploited through a race condition. The attacker may replace the source path with a symbolic link to an arbitrary directory, which is then mounted into the build container if the exploit succeeds.
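This is a classic check-then-use (TOCTOU) pattern: a path is validated, and then opened later, after it may have changed. One general way to narrow that window is to let the kernel perform the check and the open in a single call. The sketch below shows the idea in Python, assuming a POSIX system. It is not BuildKit's actual fix, and the function names are hypothetical.

```python
import os
import tempfile


def open_dir_nofollow(path: str) -> int:
    """Open `path` as a directory, refusing a symlink in the final component.

    The "is it a real directory?" check and the open happen in one system call,
    so the final component cannot be swapped between validation and use.
    """
    return os.open(path, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW)


def is_real_directory(path: str) -> bool:
    """Return True if `path` can be opened as a non-symlink directory."""
    try:
        fd = open_dir_nofollow(path)
    except OSError:
        return False
    os.close(fd)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        real = os.path.join(tmp, "cache")
        link = os.path.join(tmp, "link")
        os.mkdir(real)
        os.symlink(real, link)
        print(is_real_directory(real), is_real_directory(link))
```

O_NOFOLLOW only guards the last path component. A symlink planted in a parent directory still has to be handled separately, for example by walking the path one component at a time relative to an already opened directory descriptor.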

Exploitation of CVE-2024-23651

Attackers could try to exploit this issue by getting the user to build two malicious images at the same time, for example by poisoning a registry or typosquatting. The first build mounts a random cache called X and creates a directory called Y.

Meanwhile, the second build attempts to mount the cache in path X, located within Y. After the second build confirms that Y is a directory, the first build replaces Y with a symlink to a sensitive location. The second build then follows the symlink and mounts the sensitive directory into the container filesystem. It’s important to mention that winning the race is almost impossible, as the window of opportunity is fleeting (a few milliseconds at most), and the attacker has no control over the timing of the attack.

Vulnerability Impact

If successful, the attacker can access and potentially manipulate files on the host system, which could lead to privilege escalation, data exfiltration or other malicious activities.

Mitigation of CVE-2024-23651

Upgrade BuildKit to version 0.12.5 or later, which addresses the vulnerability by fixing the race condition in cache mount source validation. For Docker Engine, upgrade to a release that bundles the fixed BuildKit: 24.0.9 on the 24.0 line, or 25.0.2 or later.

Building and using images from trusted sources as a best practice can limit the possibility of exploitation.

CVE-2024-23652: BuildKit Vulnerability and Arbitrary File Deletion on Host

CVE-2024-23652 resides in BuildKit versions up to and including 0.12.4 and allows attackers to manipulate the container's temporary directories used during image building to delete arbitrary files on the host system. While primarily exploited during malicious Dockerfile builds, it can also potentially impact other BuildKit-based build systems.

Technical Breakdown

The Vulnerability

BuildKit mounts directories from the host into the container filesystem for various phases of image building. When these directories become empty, BuildKit attempts to remove them automatically during cleanup. The vulnerability lies in how BuildKit determines whether a directory is empty. It only checks for files within the directory itself and does not account for mount points inside it, so the cleanup can end up deleting the mounted content on the host.
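To make the flawed assumption concrete, here is a minimal Python sketch of a more defensive "is this safe to remove?" check. It is not BuildKit's logic, and the function name is hypothetical. It only illustrates that emptiness alone is not enough: the directory should also be a real directory rather than a symlink, and should not be a mount point.

```python
import os
import tempfile


def safe_to_remove(path: str) -> bool:
    """Return True only for a real, empty directory that is not a mount point."""
    # A symlink could redirect the removal to somewhere else entirely.
    if os.path.islink(path) or not os.path.isdir(path):
        return False
    # A mount point can look empty while standing in for host content.
    if os.path.ismount(path):
        return False
    with os.scandir(path) as entries:
        return next(entries, None) is None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        empty = os.path.join(tmp, "empty")
        os.mkdir(empty)
        print(safe_to_remove(empty))
```

A check like this is still a check-then-act sequence, so on its own it does not rule out a race between the check and the removal.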

Exploitation and Possible Attack Vectors

Attackers leverage this logic by crafting a malicious Dockerfile. The file mounts a specific directory on the host system within the container's filesystem. Next, it creates a strategically placed empty directory within the container, positioned above the mount point in the container's hierarchy. The attacker then manipulates files within the mounted directory (which are actually on the host), tricking BuildKit into thinking the empty container directory is safe to delete. During cleanup, BuildKit mistakenly deletes the corresponding directory on the host as well, even if it contains files accessible through the mount point.

This Leaky Vessels vulnerability allows attackers to potentially delete critical system files, corrupt image builds or gain unauthorized access to sensitive data on the host. Remember, the attack hinges on manipulating the mounted host directory, not an empty container directory. Mitigate this risk by updating BuildKit to version v0.12.5 or later.

Malicious Dockerfile

Attackers exploit this vulnerability by creating malicious Dockerfiles with specific steps:

Step 1: Mount a Host Directory

Attackers use the RUN --mount directive to mount a targeted directory on the host system within a specific location inside the container's filesystem.

Step 2: Create an Empty Target Directory

Within the container, attackers create an empty directory strategically placed above the mounted directory's location.

Step 3: Manipulate Files Through the Mount

Commands within the Dockerfile manipulate files on the host system accessible through the mounted directory. Doing so tricks BuildKit into believing the directory is empty.

Step 4: Trigger Unintended Deletion

During cleanup, BuildKit mistakenly deletes the corresponding host directory as well, even if it contains files accessible through the mount point.
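The steps above work because the cleanup trusts a path without confirming where it really points. A common defensive pattern, sketched below in Python, is to resolve symlinks first and confirm that the target still lies under the directory the tool is supposed to manage. This is an illustration of the general technique, not BuildKit's patch, and resolves_inside is a hypothetical helper.

```python
import os
import tempfile


def resolves_inside(root: str, candidate: str) -> bool:
    """Return True if `candidate`, after resolving symlinks, stays under `root`."""
    real_root = os.path.realpath(root)
    real_candidate = os.path.realpath(candidate)
    # commonpath compares whole path components, so /build2 is not inside /build.
    return os.path.commonpath([real_root, real_candidate]) == real_root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build = os.path.join(tmp, "build")
        host = os.path.join(tmp, "host")
        os.mkdir(build)
        os.mkdir(host)
        os.symlink(host, os.path.join(build, "escape"))
        print(resolves_inside(build, os.path.join(build, "escape", "secret")))
```

A cleanup routine that calls a check like this before deleting would refuse a path that escapes the build root through a symlink, although the check and the deletion would still need to be made atomic to be robust.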

Vulnerability Impact

The impact of this vulnerability goes beyond the loss of individual files. Even without reading any data, an attacker who can delete arbitrary host files can cause denial of service by removing critical system files, weaken security by removing security-related configuration, or delete encryption keys and other material that protects sensitive data.

The destructive reach of this vulnerability extends to corrupting entire image builds, potentially disrupting software delivery pipelines.

Mitigation of CVE-2024-23652

Upgrade BuildKit to version 0.12.5 or later, which patches the vulnerability by correcting the empty directory checking logic.

Additionally, if possible, avoid using RUN --mount with untrusted Dockerfiles or frontend configurations.
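Before building a Dockerfile you did not write, it can help to list its RUN --mount flags so that a reviewer can look at them. The sketch below is one simple way to do that in Python. The function name is hypothetical, and the parsing is deliberately naive: it joins backslash-continued lines and looks for --mount= anywhere in a RUN instruction, so it may also report a --mount= that is really an argument to a shell command.

```python
import re

_MOUNT_FLAG = re.compile(r"--mount=(\S+)")


def find_run_mounts(dockerfile_text: str) -> list[tuple[int, str]]:
    """Return (line_number, mount_spec) for each --mount flag in a RUN instruction."""
    findings = []
    logical = ""  # the instruction being assembled across continuation lines
    start = 0     # line number where the current instruction began
    for number, raw in enumerate(dockerfile_text.splitlines(), start=1):
        line = raw.strip()
        if not logical:
            start = number
        if line.endswith("\\"):
            # Continuation: keep collecting until the instruction is complete.
            logical += line[:-1] + " "
            continue
        logical += line
        if logical.upper().startswith("RUN "):
            for spec in _MOUNT_FLAG.findall(logical):
                findings.append((start, spec))
        logical = ""
    return findings
```

For example, a Dockerfile whose second line is RUN --mount=type=cache,target=/root/.cache would yield one finding for line 2, which a reviewer can then judge in context.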

CVE-2024-23653: BuildKit Vulnerability and Potential Container Escape

This vulnerability resides in BuildKit versions before v0.12.5 and arises from improper entitlement checks within its Interactive Containers API. This API allows running containers based on built images for interaction and customization. The vulnerability enables attackers to use a specially crafted Dockerfile to bypass these entitlement checks and potentially achieve container escape.

Technical Breakdown of CVE-2024-23653

The Vulnerability

Attackers create a Dockerfile that utilizes container configuration commands like RUN and USER. This configuration triggers a specific code path within BuildKit's Interactive Containers API where missing or inadequate entitlement checks could allow the container to have elevated privileges.

Exploitation and Possible Attack Vectors

If attackers exploit these elevated privileges within the container, they could use existing vulnerabilities or misconfigurations to break out of the container and access the host system.

Malicious Dockerfile

Attackers can craft a malicious Dockerfile that utilizes container configuration commands like RUN, USER and others to trigger a vulnerable code path within BuildKit's Interactive Containers API. Exploiting this path allows the container to bypass intended entitlement checks, potentially allowing attackers to gain elevated privileges within the container.

Impact of Vulnerability

Container escape: The primary concern is potential container escape, granting attackers unauthorized access to the host system's resources and data.

Escalated privileges: The exploited vulnerability could grant elevated privileges within the container, potentially facilitating further malicious activities.

Disrupted builds: Malicious use of this vulnerability could corrupt image builds or disrupt build processes.

Mitigation of CVE-2024-23653

Update your BuildKit installation to version v0.12.5 or later. The newer versions include the necessary patch to address the vulnerability.

As a best practice, review existing Dockerfiles and be cautious with new ones, especially those obtained from untrusted sources. Scrutinize how they use commands such as RUN and USER, and look for configuration settings that might grant unintended privileges.
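Part of such a review can be automated. The Python sketch below looks for two signals that deserve a closer look: a # syntax= directive that pulls a build frontend from outside an allow-list, and a RUN instruction that uses --security=insecure. The allow-list, the function name and the choice of signals are assumptions made for illustration. They are not an official detection rule for CVE-2024-23653.

```python
import re

_SYNTAX = re.compile(r"^#\s*syntax\s*=\s*(\S+)", re.IGNORECASE)
_INSECURE_RUN = re.compile(r"^\s*RUN\b.*--security=insecure", re.IGNORECASE)

# Assumption: an example allow-list. Replace it with the frontends you trust.
TRUSTED_FRONTEND_PREFIXES = ("docker/dockerfile:", "docker.io/docker/dockerfile:")


def review_dockerfile(text: str) -> list[str]:
    """Return human-readable findings for a Dockerfile given as text."""
    findings = []
    lines = text.splitlines()
    # Parser directives such as "# syntax=" only appear in leading comment lines.
    for line in lines:
        stripped = line.strip()
        if not stripped.startswith("#"):
            break
        match = _SYNTAX.match(stripped)
        if match and not match.group(1).startswith(TRUSTED_FRONTEND_PREFIXES):
            findings.append(f"custom frontend: {match.group(1)}")
    for number, line in enumerate(lines, start=1):
        if _INSECURE_RUN.match(line):
            findings.append(f"line {number}: RUN --security=insecure")
    return findings
```

An empty result does not mean a Dockerfile is safe. It only means these two particular signals were not found.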

Leaky Vessels Vulnerabilities At-a-Glance

Vulnerability | Affected Version | Impact | Mitigation
CVE-2024-21626 (runc) | <= 1.1.11 | Container breakout, potential privilege escalation | Update to runc 1.1.12+
CVE-2024-23651 (Docker Engine, BuildKit) | BuildKit <= 0.12.4; Docker Engine releases before 24.0.9 and 25.0.2 | Container breakout, access to host files | Update to BuildKit 0.12.5+; for Docker Engine, 24.0.9 (24.0 line) or 25.0.2+
CVE-2024-23652 (BuildKit) | <= 0.12.4 | Arbitrary file deletion on host, data corruption, privilege escalation | Update to BuildKit 0.12.5+
CVE-2024-23653 (BuildKit) | <= 0.12.4 | Container breakout, privilege escalation | Update to BuildKit 0.12.5+
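The version thresholds for runc and BuildKit lend themselves to a simple automated check. The Python sketch below compares a reported version string against the minimum patched versions listed above. The function names are hypothetical, the sample strings in the usage note are illustrative, and the parser ignores pre-release suffixes, so treat it as a starting point rather than a complete version comparator.

```python
import re

# Minimum patched versions for runc and BuildKit, as listed in the table above.
MINIMUM_PATCHED = {"runc": (1, 1, 12), "buildkit": (0, 12, 5)}


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first MAJOR.MINOR.PATCH number found in `text`."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(component: str, version_text: str) -> bool:
    """Return True if the reported version meets the minimum patched version."""
    return parse_version(version_text) >= MINIMUM_PATCHED[component]
```

For instance, is_patched("runc", "runc version 1.1.11") is False, while is_patched("buildkit", "v0.12.5") is True.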

Leaky Vessels: Patching Isn't Enough

The Leaky Vessels vulnerabilities have exposed critical flaws in the foundation of container security. While patching is essential, it's only the first step. Building secure container environments requires a layered approach that combines timely updates with strong security practices.

Containers aren’t islands. They’re interconnected parts of a larger ecosystem, and the security of one depends on the security of all.

Security Best Practices

  • Least privileges: Run container processes with the minimum necessary permissions.
  • Build from trusted sources: Only build and use images from trusted sources. Use Prisma Cloud’s Trusted Images feature to manage your trusted sources.
  • Restrict capabilities: Limit container capabilities to prevent unauthorized actions.
  • Minimize user permissions: Give users minimal permissions required for building images.
  • Image scanning: Scan container images for vulnerabilities before deployment.
  • Monitoring: Monitor container activity for suspicious behavior.
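Some of these practices can be checked mechanically. As a rough sketch, the Python function below inspects a container description shaped like one element of docker inspect output and reports settings that conflict with least privilege. The function name and the wording of the findings are hypothetical, and the sample data in the usage note is made up for illustration.

```python
def audit_container(inspect: dict) -> list[str]:
    """Report privilege-related findings for one docker-inspect-style dict."""
    findings = []
    host_config = inspect.get("HostConfig", {})
    config = inspect.get("Config", {})

    if host_config.get("Privileged"):
        findings.append("runs in privileged mode")

    # CapAdd is a list of capability names, or null when nothing was added.
    added = host_config.get("CapAdd") or []
    if added:
        findings.append("adds capabilities: " + ", ".join(sorted(added)))

    # An empty User means the image default applies, which is often root.
    user = (config.get("User") or "").split(":")[0]
    if user in ("", "0", "root"):
        findings.append("runs as root")

    return findings
```

Given {"HostConfig": {"Privileged": True, "CapAdd": ["SYS_ADMIN"]}, "Config": {"User": ""}}, the function reports all three findings. A container that runs as user 1000 with no added capabilities produces an empty list.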

Key Takeaways

  • Each vulnerability resides in a critical component of the container ecosystem: runc (container spawning) or BuildKit (image building), both of which ship with Docker Engine (Moby).
  • The attack vectors involve manipulating aspects of container operations, such as file descriptors, cache mounts, temporary directories and entitlement checks.
  • The potential impact of successful exploitation ranges from unauthorized file deletion to complete host compromise.

By taking proactive steps and adopting a layered security strategy, you can ensure your containerized applications remain resilient and protected from future threats. Remember, container security is an ongoing journey, not a one-time fix. Stay vigilant and adapt your defenses as the threat landscape evolves.

Detecting the Vulnerabilities with Prisma Cloud

Prisma Cloud users can identify the affected workloads by searching for the relevant CVE in the Vulnerability Explorer.

Figure 1: CVE-2024-21626 in Prisma Cloud’s Vulnerability Explorer

You can also check all the affected artifacts by searching for the relevant CVE in the CVE Viewer.

Figure 2: Results for CVE-2024-23651 in Prisma Cloud’s CVE Viewer

Figure 3: Results for CVE-2024-23653 in Prisma Cloud’s CVE Viewer

Figure 4: Results for CVE-2024-23652 in Prisma Cloud’s CVE Viewer

Figure 5: Results for CVE-2024-21626 in Prisma Cloud’s CVE Viewer

The implications of these vulnerabilities, as discussed, are far-reaching, challenging the security assumptions behind containerized environments. They highlight the need for rigorous security practices, including the scrutiny of container images and the adoption of tools and strategies to detect and mitigate such vulnerabilities. As the container technology landscape continues to evolve, so too must the approaches to securing it, ensuring that the benefits of containerization don’t come at the expense of security.

Learn More

Prisma Cloud secures applications from Code to Cloud enabling security and DevOps teams to effectively collaborate to accelerate secure cloud-native application development and deployment. If you haven’t experienced the advantage, take Prisma Cloud for a test drive with a free 30-day Prisma Cloud trial.

