License scanning of CycloneDX files

Version history
  • Introduced in GitLab 15.9 for GitLab SaaS with two flags named license_scanning_sbom_scanner and package_metadata_synchronization. Both flags disabled by default.
  • Generally available in GitLab 16.4. Feature flags license_scanning_sbom_scanner and package_metadata_synchronization removed.
Note
The legacy License Compliance analyzer was deprecated in GitLab 15.9 and removed in GitLab 16.3. To continue using GitLab for License Compliance, remove the License Compliance template from your CI/CD pipeline and add the Dependency Scanning template. The Dependency Scanning template now gathers the required license information, so a separate License Compliance job is no longer necessary. Before you remove the License Compliance CI/CD template, verify that the instance has been upgraded to a version that supports the new method of license scanning.

To adopt Dependency Scanning quickly at scale, you can set up a scan execution policy at the group level to enforce the SBOM-based license scan for all projects in the group. You can then remove the Jobs/License-Scanning.gitlab-ci.yml template from your CI/CD configuration.

To continue using the legacy License Compliance feature, set the LICENSE_MANAGEMENT_VERSION CI/CD variable to 4. You can set this variable at the project, group, or instance level. With this setting, pipelines continue to use the existing version of License Compliance to generate license scanning report artifacts. However, this approach is not recommended: legacy license scanning support is being removed from the codebase, so switching back to the legacy analyzer prevents other License Compliance features from working as expected, and bugs and vulnerabilities in the legacy analyzer will no longer be fixed.
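
As a minimal sketch of the group-level scan execution policy mentioned in the note above, the following policy runs Dependency Scanning in every pipeline. The policy name, description, and branch pattern are illustrative assumptions, and the schema should be checked against the scan execution policy documentation for your GitLab version before you rely on it.

```yaml
# Sketch of a policy.yml in the group's security policy project.
# The name, description, and branch pattern are assumptions.
scan_execution_policy:
  - name: Enforce SBOM-based license scanning
    description: Run Dependency Scanning so that a CycloneDX SBOM is produced for every project.
    enabled: true
    rules:
      # Trigger on every pipeline, for any branch.
      - type: pipeline
        branches:
          - "*"
    actions:
      # Dependency Scanning produces the CycloneDX SBOM used for license scanning.
      - scan: dependency_scanning
```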

To detect the licenses in use, License Compliance relies on running the Dependency Scanning CI/CD jobs and analyzing the CycloneDX Software Bill of Materials (SBOM) generated by those jobs. Other third-party scanners can also be used, as long as they produce a CycloneDX file with a list of dependencies for one of the supported languages. This method of scanning can parse and identify over 500 different types of licenses, as defined in the SPDX list. Licenses not in the SPDX list are reported as “Unknown”. License information can also be extracted from packages that are dual-licensed or that have multiple licenses that apply.
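
For a third-party scanner, the job has to upload its CycloneDX file as a report artifact. The following is a sketch only: the job name, image, scanner command (my-sbom-tool), its flag, and the output file name are hypothetical placeholders. The artifacts:reports:cyclonedx keyword is the part that comes from GitLab CI/CD.

```yaml
# Hypothetical job: the job name, image, and scanner command are placeholders.
third-party-sbom:
  stage: test
  image: registry.example.com/my-sbom-tool:latest
  script:
    # Replace with the command your scanner uses to write a CycloneDX JSON file.
    - my-sbom-tool --output gl-sbom-custom.cdx.json
  artifacts:
    reports:
      # Upload the file as a CycloneDX report so that GitLab can ingest it.
      cyclonedx:
        - gl-sbom-custom.cdx.json
```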

Configuration

Enable Dependency Scanning and ensure that its prerequisites are met.

From the .gitlab-ci.yml file, remove the deprecated line Jobs/License-Scanning.gitlab-ci.yml, if it’s present.

On GitLab self-managed only, you can choose package registry metadata to sync in the Admin Area for the GitLab instance.
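
As a rough illustration of the first two steps above, a .gitlab-ci.yml file might end up looking like the following. This sketch assumes the project previously included the License Scanning template and has no other includes.

```yaml
include:
  # Dependency Scanning generates the CycloneDX SBOM that license scanning analyzes.
  - template: Jobs/Dependency-Scanning.gitlab-ci.yml
  # Removed: the deprecated License Scanning template.
  # - template: Jobs/License-Scanning.gitlab-ci.yml
```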

Supported languages and package managers

License scanning is supported for the following languages and package managers:

Language                     Package manager
.NET, C#                     NuGet
C, C++                       Conan
Go                           Go
Java                         Gradle, Maven
JavaScript and TypeScript    npm, pnpm, yarn
PHP                          Composer
Python                       setuptools, pip, Pipenv, Poetry
Ruby                         Bundler
Scala                        sbt

The supported files and versions are the ones supported by Dependency Scanning.

License expressions

GitLab has limited support for composite licenses. License Compliance can read multiple licenses, but always treats them as combined with the AND operator. For example, if a dependency has two licenses, one allowed and the other denied by the project policy, GitLab evaluates the composite license as denied because this is the safer option. Support for other license expression operators (such as OR and WITH) is tracked in this epic.

Blocking merge requests based on detected licenses

Users can require approval for merge requests based on the licenses that are detected by configuring a license approval policy.

Running in an offline environment

For self-managed GitLab instances in an environment with limited, restricted, or intermittent access to external resources through the internet, some adjustments are required to successfully scan CycloneDX reports for licenses. For more information, see the offline quick start guide.

Troubleshooting

A CycloneDX file is not being scanned and appears to provide no results

Ensure that the CycloneDX file adheres to the CycloneDX JSON specification. This specification does not permit duplicate entries. Projects that contain multiple SBOM files should either report each SBOM file as an individual CI report artifact or remove duplicates if the SBOMs are merged as part of the CI pipeline.
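
One way to report several SBOM files individually is to list each file under the job's CycloneDX report artifacts instead of merging them. The job name, script, and file paths below are hypothetical and only illustrate the layout.

```yaml
# Hypothetical job and file paths, shown only to illustrate the layout.
generate-sboms:
  stage: test
  script:
    # Placeholder for whatever step produces the SBOM files.
    - ./generate-sboms.sh
  artifacts:
    reports:
      # Each SBOM is reported as its own CycloneDX report, unmerged.
      cyclonedx:
        - backend/gl-sbom-backend.cdx.json
        - frontend/gl-sbom-frontend.cdx.json
```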

You can validate CycloneDX SBOM files against the CycloneDX JSON specification as follows:

$ docker run -it --rm -v "$PWD:/my-cyclonedx-sboms" -w /my-cyclonedx-sboms cyclonedx/cyclonedx-cli:latest cyclonedx validate --input-version v1_4 --input-file gl-sbom-all.cdx.json

Validating JSON BOM...
BOM validated successfully.

If the JSON BOM fails validation, for example because it contains duplicate components, the output resembles the following:

Validation failed: Found duplicates at the following index pairs: "(A, B), (C, D)"
#/properties/components/uniqueItems

To fix this issue, override the job definition that produces the duplicate components and use jq to remove them from the gl-sbom-*.cdx.json report. For example, the following removes duplicate components from the gl-sbom-gem-bundler.cdx.json report file produced by the gemnasium-dependency_scanning job:

include:
  - template: Jobs/Dependency-Scanning.gitlab-ci.yml

gemnasium-dependency_scanning:
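  after_script:
    # Install jq in the job's container.
    - apk update && apk add jq
    # Deduplicate (and sort) the components array, then replace the report in place.
    - jq '.components |= unique' gl-sbom-gem-bundler.cdx.json > tmp.json && mv tmp.json gl-sbom-gem-bundler.cdx.json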

Remove unused license data

The license scanning changes released in GitLab 15.9 required a significant amount of additional disk space on instances. This issue was resolved in GitLab 16.3 by the Reduce package metadata table on-disk footprint epic. However, if your instance ran license scanning between GitLab 15.9 and 16.3, you may want to remove the unneeded data.

To remove the unneeded data:

  1. Check whether the package_metadata_synchronization feature flag is enabled, or was previously enabled, and if so, disable it. Use the Rails console to run the following command:

    Feature.enabled?(:package_metadata_synchronization) && Feature.disable(:package_metadata_synchronization)
    
  2. Check if there is deprecated data in the database:

    PackageMetadata::PackageVersionLicense.count
    PackageMetadata::PackageVersion.count
    
  3. If there is deprecated data in the database, remove it by running the following commands in order:

    PackageMetadata::PackageVersionLicense.delete_all
    PackageMetadata::PackageVersion.delete_all