What Is Improper Artifact Integrity Validation?


Improper artifact integrity validation is a CI/CD security oversight that allows attackers to inject malicious code into the software delivery pipeline by tampering with the artifacts that move through it. Tampering opportunities arise from the blend of internal and third-party resources within CI/CD systems. Failing to verify the integrity of artifacts permits undetected tampering, which can lead to harmful code execution in the pipeline or in production. The oversight results from several factors: weak validation processes, inadequate security controls, and a lack of awareness about the importance of artifact integrity.

CICD-SEC-9: Improper Artifact Integrity Validation Explained

CICD-SEC-9, identified in the OWASP Top 10 CI/CD Security Risks, stems from the possibility that an attacker with access to a system in the CI/CD pipeline can push malicious code or artifacts down the pipeline. The risk is exacerbated by insufficient mechanisms for validating the authenticity and integrity of code and artifacts.

As CI/CD processes combine internal resources with third-party packages fetched from assorted locations, the resulting mix creates multiple entry points susceptible to tampering. If a compromised resource infiltrates the delivery process undetected, it can flow through the pipeline, masquerading as a legitimate resource, and potentially reach production environments. Such a breach can lead to the execution of malicious code within CI/CD systems or, more concerning, in live production environments.

Artifact Integrity Validation Defined

An integral part of CI/CD security, artifact integrity validation provides assurance that digital artifacts, such as software packages, containers, and configuration files, remain authentic and unaltered from their original state. The process uses cryptographic methods, including digital signatures and checksums, to confirm each artifact's origin and to ensure the artifact hasn't been tampered with in transit or storage.

By properly validating the integrity of artifacts, teams can trust what they deploy, assured that artifacts are free from unauthorized modifications.
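
The checksum half of this validation can be sketched in a few lines of stdlib-only Python. The artifact contents and the "published" digest below are illustrative; in practice the expected digest would come from the publisher or a lockfile:

```python
import hashlib
import hmac

def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()

def verify_artifact(data: bytes, expected_digest: str) -> bool:
    """Compare the computed digest to the published one in constant time."""
    return hmac.compare_digest(sha256_digest(data), expected_digest.lower())

artifact = b"example artifact contents"
published = sha256_digest(artifact)          # what the publisher would list
assert verify_artifact(artifact, published)              # untampered: passes
assert not verify_artifact(artifact + b"!", published)   # tampered: fails
```

Note the use of `hmac.compare_digest` rather than `==`, which avoids leaking information through comparison timing.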

Components of Artifact Integrity Validation in the Delivery Pipeline

Key components of effective artifact integrity validation — in addition to cryptographic checksums and digital signatures — include secure artifact storage, secure transport protocols, and secure key management practices. Each component plays a role in safeguarding the integrity and authenticity of artifacts at different stages of the CI/CD pipeline, from artifact creation and artifact transfer between stages to artifact deployment.

How CICD-SEC-9 Happens

To understand how improper artifact integrity validation exposes organizations to risk, let's look at a hypothetical attack scenario.

Initial Entry

A seasoned attacker discerns vulnerabilities in a prominent software company's CI/CD pipeline. Recognizing the potential to exploit lapses in artifact integrity validation, the attacker devises a plan to introduce a tampered artifact.

Reconnaissance

The attacker meticulously studies the company's CI/CD process. Noting the expected blend of internal resources with third-party packages, the attacker identifies potential weak points where the integrity of artifacts might lack rigorous validation.

Exploitation

Crafting a malicious library that mimics a widely used third-party package, the attacker infiltrates a mirror repository. By replacing the legitimate library with the tampered version, the attacker sets the stage for the company's CI/CD pipeline to inadvertently pull in the malicious code.

Bypassing Security Gates

The company's CI/CD system fetches the latest version of all dependencies. Due to lapses in secure artifact storage practices, the system unknowingly retrieves the tampered library from the compromised mirror repository.

Although the company employs checksum validation, the attacker, having manipulated the mirror repository, updates the checksum file to match the tampered library's hash. The absence of a multisource validation mechanism allows the malicious library to pass unchecked.

Deployment and Execution

Once the tampered library is fetched and linked during the build process, the resulting application, now tainted with the malicious code from the library, progresses through the pipeline. Upon deployment in the production environment, the concealed malicious code activates, leading to system compromise.

Importance of Artifact Integrity Validation in CI/CD

The trustworthiness of artifacts is critical to cloud-native application development. By ensuring that only trustworthy artifacts are deployed, proper artifact integrity validation reduces the possibility of malicious code making it into production environments.

Risks Associated with Improper Artifact Integrity Validation

Without proper validation measures, organizations open themselves to security breaches, data leaks, and operational disruptions caused by tampered artifacts that could otherwise have been detected before deployment.

Case Study 1: Webmin Falls Victim to Stealthy Server Exploit

Attackers exploited Webmin's development build server in April 2018, introducing a vulnerability into the password_change.cgi script. To conceal the malicious modification, they altered the file's timestamp, and the compromised file shipped in Webmin version 1.890. Although developers reverted the file to the version on GitHub, attackers altered it again by July 2018, impacting versions 1.900 to 1.920. In those versions, the exploit was active only when a specific feature was enabled. After receiving a zero-day exploit report in August 2019, Webmin promptly removed the exploit and released version 1.930.

Case Study 2: PHP's Internal Security Breach

In early 2021, PHP's git.php.net server suffered a malicious attack in which two malicious commits were pushed under the names of prominent PHP contributors. Initially considered an individual account compromise, the incident was reassessed after a deeper investigation revealed that the commits bypassed the standard gitolite infrastructure, pointing to a server compromise. The commits were pushed over HTTPS with password-based authentication, raising suspicions of a leak in the master.php.net user database. The attacker's ability to authenticate after only a few username guesses further intensified these concerns.

Preventing Improper Artifact Integrity Validation

Understanding the risks associated with artifacts highlights the importance of implementing robust checks to ensure their integrity. To mitigate risks, consider the following strategies:

Integrity Validation from Development to Production

Implement processes and technologies that validate resource integrity throughout the software delivery chain. As developers generate a resource, they should sign it using an external resource signing infrastructure. Before consuming a resource in subsequent pipeline stages, cross-check its integrity against the signing authority. Key measures include:

Code Signing

Source code management (SCM) solutions support signing commits with a unique key for each contributor and can prevent unsigned commits from progressing through the pipeline.

Artifact Verification Software

Tools designed for signing and verifying code and artifacts, such as the Linux Foundation's Sigstore, can thwart unverified software from advancing down the pipeline.

Configuration Drift Detection

Implement measures to detect configuration drifts, such as resources in cloud environments not managed using a signed infrastructure as code (IAC) template. Such drifts could indicate deployments from untrusted sources or processes.
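
The idea can be sketched in a few lines of Python. The resource names and attributes below are invented for illustration; a real drift detector would pull observed state from the cloud provider's APIs and desired state from the signed IaC template:

```python
# Desired state, as declared in a signed IaC template (illustrative).
desired = {
    "web-sg": {"ingress": [443]},
    "app-bucket": {"versioning": True},
}
# Observed state of deployed resources (illustrative).
observed = {
    "web-sg": {"ingress": [443, 22]},   # port 22 opened out of band
    "app-bucket": {"versioning": True},
    "debug-vm": {"ingress": [80]},      # resource not in any template
}

def detect_drift(desired: dict, observed: dict) -> list[str]:
    """Flag resources that differ from, or don't appear in, the template."""
    findings = []
    for name, state in observed.items():
        if name not in desired:
            findings.append(f"unmanaged resource: {name}")
        elif state != desired[name]:
            findings.append(f"drifted resource: {name}")
    return findings

print(detect_drift(desired, observed))
# ['drifted resource: web-sg', 'unmanaged resource: debug-vm']
```

Either finding, a drifted resource or an unmanaged one, is a signal that something was deployed outside the signed, validated pipeline.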

Third-Party Resource Validation

Third-party resources incorporated into build and deploy pipelines, like scripts executed during the build process, should undergo rigorous validation. Before utilizing these resources, compute their hash and compare it against the official hash provided by the resource provider.

Industry Practices to Promote Artifact Integrity in CI/CD

The industry has established standards and guidelines for artifact integrity validation. Examples include the use of cryptographic algorithms like SHA-256 for checksums, X.509 certificates for digital signatures, and secure transport protocols such as HTTPS for artifact transfer. Organizations should align their practices with these standards to maintain a reliable and secure software delivery pipeline.

Establish Artifact Integrity Validation Policies and Audit Schedule

To ensure proper artifact integrity validation, organizations should establish clear policies that define validation processes. Once policies are in place, regularly audit compliance with them to identify and address weaknesses and areas of noncompliance. Continuous monitoring and analysis will help detect anomalies or unauthorized activities.

Employ Cryptographic Signing

Use public key infrastructure (PKI) to cryptographically sign artifacts at each stage of the CI/CD pipeline, and validate signatures against a trusted certificate authority before consumption. Configure your CI/CD pipeline to reject artifacts with invalid or missing signatures to reduce the risk of deploying tampered resources or unauthorized changes.

Implement Secure Storage

Establish a secure, tamper-evident repository to store artifacts and enforce strict access controls, preventing unauthorized modifications. Enable versioning to maintain a historical record of artifact changes, and implement real-time monitoring to track and alert on suspicious activity. In case of compromised artifacts, configure the system to facilitate rollbacks to previous, known-good versions.
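
A content-addressed store provides tamper evidence and easy rollback almost for free, since each version is retrievable only under its own digest. Below is a minimal in-memory sketch; a real repository would persist to disk or object storage and enforce the access controls described above:

```python
import hashlib

class ArtifactStore:
    """Content-addressed store: artifacts are keyed by their SHA-256
    digest, so any modification changes the key and is detectable."""

    def __init__(self):
        self._blobs: dict[str, bytes] = {}
        self._versions: dict[str, list[str]] = {}   # name -> digest history

    def put(self, name: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        self._versions.setdefault(name, []).append(digest)
        return digest

    def get(self, digest: str) -> bytes:
        data = self._blobs[digest]
        if hashlib.sha256(data).hexdigest() != digest:  # verify on retrieval
            raise ValueError("stored artifact failed integrity check")
        return data

    def rollback(self, name: str) -> str:
        """Drop the latest version; return the previous known-good digest."""
        history = self._versions[name]
        history.pop()
        return history[-1]

store = ArtifactStore()
v1 = store.put("app", b"release 1.0")
v2 = store.put("app", b"release 1.1 (later found compromised)")
assert store.rollback("app") == v1       # roll back to the known-good build
assert store.get(v1) == b"release 1.0"   # integrity re-verified on retrieval
```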

Enforce Multisource Validation

Adopt a multisource validation strategy that verifies the integrity of artifacts using various sources, such as checksums, digital signatures, and secure hash algorithms, as well as trusted repositories. Keep the cryptographic algorithms and keys up to date to maintain their effectiveness.
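
A minimal sketch of the checksum side of this strategy follows. The source names are illustrative; in practice each digest would be fetched over an independent channel, such as the mirror's checksum file, the upstream project's release page, and a digest pinned in a lockfile in source control, and signature checks would run alongside:

```python
import hashlib

def multisource_validate(artifact: bytes, digests_by_source: dict[str, str]) -> bool:
    """Accept the artifact only if every independent source reports the
    same digest and it matches what we compute locally."""
    local = hashlib.sha256(artifact).hexdigest()
    return all(d == local for d in digests_by_source.values())

artifact = b"libexample-2.0 contents"
good = hashlib.sha256(artifact).hexdigest()

# All sources agree with the local computation: accept.
assert multisource_validate(
    artifact, {"mirror": good, "upstream": good, "lockfile": good}
)

# A compromised mirror serves a tampered artifact and an updated checksum
# file, but the digest pinned in the lockfile still holds the original
# value, so validation fails.
tampered = artifact + b"\x00backdoor"
mirror_digest = hashlib.sha256(tampered).hexdigest()
assert not multisource_validate(
    tampered, {"mirror": mirror_digest, "lockfile": good}
)
```

This is exactly the control missing in the attack scenario earlier in the article: a checksum fetched from the same compromised mirror proves nothing, while a second, independent source exposes the swap.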

Integrate Security Scanning

Incorporate vulnerability scanning tools and static application security testing (SAST) into the CI/CD pipeline to identify potential security issues in artifacts — including third-party dependencies — before deployment. Taking a proactive approach allows DevOps teams to address vulnerabilities early in the development process, reducing the risk of security incidents and maintaining a high level of code quality.

Foster a Security-Aware Culture

Educate and train development teams about the importance of artifact integrity validation and the potential risks associated with improper validation. Encourage adherence to secure coding practices and emphasize the role each individual plays in maintaining a secure CI/CD environment.

Improper Artifact Integrity Validation FAQs

What Is a Digital Signature?

A digital signature is a cryptographic technique used to validate the authenticity and integrity of a message, software, or digital document.

What Is a Hash Function?

Hash functions are cryptographic algorithms that take inputs of any length and generate fixed-size outputs called hashes or digests. Designed to be deterministic, hash functions ensure the same input consistently produces the same hash value. Additionally, their one-way nature makes it computationally infeasible to deduce the input from the hash value. Common applications for hash functions include data integrity verification, digital signature creation, and secure password storage.

Well-known hash functions include SHA-256, MD5, and SHA-1, though MD5 and SHA-1 are considered cryptographically broken and shouldn't be used for new integrity checks.
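
The determinism and fixed output size described above are easy to see with Python's hashlib:

```python
import hashlib

h1 = hashlib.sha256(b"hello").hexdigest()
h2 = hashlib.sha256(b"hello").hexdigest()
h3 = hashlib.sha256(b"hello!").hexdigest()

assert h1 == h2       # deterministic: same input, same digest
assert h1 != h3       # any change to the input changes the digest
assert len(h1) == 64  # fixed size: 256 bits = 64 hex characters
print(h1)  # 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```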

What Is a Library?

In the context of CI/CD, a library is a collection of precompiled routines that a program can use. These routines, sometimes called modules or functions, are stored in object format. Libraries are particularly useful for storing frequently used routines because developers don’t need to explicitly link them to every program that uses them. The linker automatically looks for them when linking modules together. Libraries can be static (linked at compile time) or dynamic (linked at runtime).

What Is a Repository?

A repository, in the context of software development, refers to a centralized file storage location. It’s used by version control systems to store multiple versions of files. While a repository can contain libraries, it can also contain individual code files, images, scripts, documentation, etc.

What Is In-toto?

In-toto is an open-source framework for securing software supply chains. It cryptographically ensures that the entire development process has been conducted as planned and the final artifact hasn't been tampered with.

What Is SLSA?

Supply-chain levels for software artifacts, or SLSA, refers to an end-to-end framework for ensuring the integrity of software artifacts throughout the software supply chain.

What Is Sigstore?

Sigstore is a Linux Foundation project that provides a nonrepudiable software supply chain. It offers services to software developers for signing software artifacts, storing the signatures, and verifying them.

What Is Configuration Drift Detection?

Configuration drift detection is a process that ensures the current state of the system configuration hasn't deviated or 'drifted' from its intended state. Configuration drift can be a sign of tampering or misconfiguration.

What Is an SCM Solution?

A software configuration management (SCM) solution is a tool or system that manages and tracks changes made to software projects throughout their development lifecycle. By controlling modifications to source code, files, and documentation, it aids in maintaining consistency, traceability, and accountability across the development process.

SCM solutions enable developers to collaborate efficiently, prevent conflicting changes, and easily revert changes to previous versions. They also facilitate branching and merging, allowing simultaneous development of multiple features or bug fixes in isolated environments. SCM solutions streamline the build and deployment processes, ensuring the right versions of software components are combined and released.

Popular SCM tools include Git, Subversion, and Mercurial.

What Are Infrastructure as Code Templates?

Infrastructure as code (IAC) templates describe the desired state of system infrastructure. Signing these templates helps ensure their integrity.

What Is Artifact Verification Software?

Artifact verification software refers to tools for signing and verifying the integrity of code and artifacts, such as those provided by in-toto, SLSA, and Sigstore.

What Is Checksum Validation?

Checksum validation is a method used to ensure the integrity of data, especially during transmission or storage. It involves generating a checksum from the data and then regenerating and comparing the checksum at the point of use.