Virtual repositories overview

This document provides an overview of virtual repositories. For instructions on how to create a virtual repository, see Create virtual repositories.

Artifact Registry Quotas and limits apply to virtual repositories.

How virtual repositories work

Virtual repositories act as a single access point to download, install, or deploy artifacts in the same format from one or more upstream repositories. An upstream repository can be an Artifact Registry standard or remote repository.

The other repository modes are:

  • Standard: The default repository mode. You upload or publish artifacts such as private packages directly to standard repositories. Although you can download directly from individual standard repositories, accessing groups of repositories with a virtual repository simplifies tool configuration.
  • Remote (language package repositories only): A pull-through cache for artifacts in public repositories such as Maven Central or PyPI. It acts as a proxy for the public repositories so that you have more control over your external dependencies.

Use cases and benefits

Simpler client configuration

For tasks that only require read access to repositories, you only need to configure a single Artifact Registry repository to access artifacts stored in multiple upstream repositories.

For example:

  • A virtual repository for Maven packages can serve private Java packages from an Artifact Registry standard repository and public Java packages from a remote repository that caches public packages from Maven Central.
  • A virtual repository can serve private Python packages from multiple upstream standard repositories owned by different teams. Each team has write access to their upstream repository, but downloads packages from other teams using the virtual repository.

Safer dependency resolution

You can assign a priority to upstream repositories to have more control over which repository Artifact Registry chooses when a requested artifact is in more than one upstream repository.

Some tools, such as the Python pip tool, do not provide a way to control search order when a mix of private and public repositories is configured in the client. This type of configuration is vulnerable to a dependency confusion attack, where someone uploads a new version of a package containing malicious code to a public repository to trick clients into choosing that version.

You can use remote and virtual repositories together to mitigate this risk:

  1. Create a remote repository as a proxy for the public repository.
  2. Create a standard repository for your private packages.
  3. Create a virtual repository that is configured to prioritize your standard repository if a version of the same package exists in both repositories.
  4. Configure package managers and other tools to read from the virtual repository only, so that the client logic is not involved in repository selection.
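
Step 3 can be illustrated with a short sketch that builds a list of upstream entries and finds the one that would be prioritized. The field names (id, repository, priority) and the project, location, and repository names are placeholders assumed for illustration. See Create virtual repositories for the actual configuration format.

```python
import json

# Hypothetical upstream entries; every name and value here is a placeholder.
upstream_policies = [
    {
        "id": "private-packages",
        "repository": "projects/my-project/locations/us-central1/repositories/private-python",
        "priority": 100,  # standard repository: highest priority value
    },
    {
        "id": "pypi-proxy",
        "repository": "projects/my-project/locations/us-central1/repositories/pypi-remote",
        "priority": 10,  # remote repository that proxies the public repository
    },
]


def highest_priority(policies):
    """Return the entry with the largest priority value."""
    return max(policies, key=lambda p: p["priority"])


print(highest_priority(upstream_policies)["id"])
print(json.dumps(upstream_policies, indent=2))
```

In this sketch the standard repository carries the larger priority value, so it is preferred whenever both upstreams contain the same package.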

To learn about other dependency management best practices, see Dependency management.

How virtual repositories select an upstream repository

Each upstream repository must have a configured priority. The priority is an integer that acts as a weight, not a ranking. This means that repositories with a higher priority value are prioritized over repositories with lower priority values.

When you request an artifact that is in multiple upstream repositories, Artifact Registry uses the following prioritization logic:

  • The repository with the highest value is prioritized. For example, a value of 10 is treated as higher priority than a value of 1.
  • If multiple upstream repositories have the same priority, the artifact can be served from any of those repositories.
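
The prioritization logic above can be sketched as a small model. This is an illustration only, not the Artifact Registry implementation: the Upstream class, the candidates_for function, and the repository names are hypothetical. Because an artifact can be served from any of the upstreams that share the highest priority, the function returns every tied candidate rather than picking one.

```python
from dataclasses import dataclass, field


@dataclass
class Upstream:
    """Hypothetical model of one upstream of a virtual repository."""
    name: str
    priority: int  # a higher value means a higher priority
    artifacts: set = field(default_factory=set)


def candidates_for(artifact, upstreams):
    """Return the upstreams that could serve `artifact`.

    Only upstreams that contain the artifact are considered. Among those,
    the ones with the highest priority value are returned. More than one
    result means a tie, and any of the tied upstreams may serve the artifact.
    """
    holders = [u for u in upstreams if artifact in u.artifacts]
    if not holders:
        return []
    top = max(u.priority for u in holders)
    return [u for u in holders if u.priority == top]


upstreams = [
    Upstream("private-standard", 10, {"mylib-1.0.0"}),
    Upstream("pypi-remote", 1, {"mylib-1.0.0", "requests-2.31.0"}),
    Upstream("team-b-standard", 10, {"mylib-1.0.0"}),
]

# Two upstreams tie at priority 10, so either may serve mylib-1.0.0.
print([u.name for u in candidates_for("mylib-1.0.0", upstreams)])
# Only the remote upstream has requests-2.31.0, so its low priority does not matter.
print([u.name for u in candidates_for("requests-2.31.0", upstreams)])
```

To make the result deterministic, give each upstream a distinct priority value.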

When you directly configure a client to search a virtual repository and additional repositories, the client might still download artifacts from repositories outside of Artifact Registry.

For example, if you configure the Python pip tool to search PyPI and a virtual repository, your package might be downloaded directly from PyPI because pip will always choose the latest version of a package, regardless of which repository it comes from. If pip is configured to only search the virtual repository, you can then control the priority of all upstream repositories, including an upstream remote repository that acts as a proxy for PyPI.
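
The difference between the two configurations can be sketched with a simplified model. Everything here is hypothetical: the function names, the package name acme-utils, and the version numbers are invented for illustration. The model also makes a simplifying assumption that a virtual repository serves a package from the highest-priority upstream that contains it. It is not a description of how pip or Artifact Registry is implemented.

```python
def version_key(version):
    """Turn a simple dotted version such as '1.2.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def client_side_choice(package, indexes):
    """Model a client that searches several indexes and takes the latest version.

    `indexes` maps an index name to {package name: [versions]}.
    Returns (index name, version) for the highest version found anywhere.
    """
    found = [
        (version_key(v), name, v)
        for name, packages in indexes.items()
        for v in packages.get(package, [])
    ]
    if not found:
        return None
    _, name, version = max(found)
    return name, version


def virtual_repo_choice(package, upstreams):
    """Model a client that reads only from a virtual repository.

    `upstreams` is a list of (priority, name, packages) tuples. The package
    comes from the highest-priority upstream that has it (an assumption of
    this sketch), and the latest version available there is chosen.
    """
    holders = [u for u in upstreams if package in u[2]]
    if not holders:
        return None
    _, name, packages = max(holders, key=lambda u: u[0])
    return name, max(packages[package], key=version_key)


private = {"acme-utils": ["1.2.0"]}
public = {"acme-utils": ["99.0.0"], "requests": ["2.31.0"]}  # 99.0.0 is the attacker's upload

# Client searches both indexes directly: the public 99.0.0 wins.
print(client_side_choice("acme-utils", {"private": private, "pypi": public}))

# Client reads only from the virtual repository: the private upstream wins.
upstreams = [(100, "private-standard", private), (1, "pypi-remote", public)]
print(virtual_repo_choice("acme-utils", upstreams))
print(virtual_repo_choice("requests", upstreams))
```

In the second configuration, public packages such as requests are still reachable through the remote upstream, but a public package cannot shadow a private one that has the same name.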

Supported repository formats

You can create virtual repositories for the following Artifact Registry repository formats:

Language packages:

OS packages:

If you are new to Artifact Registry, you can use the quickstarts to learn about setting up standard repositories for these formats.

Limitations

In addition to Artifact Registry quotas and limitations, virtual repositories have the following limitations:

  • Standard Artifact Registry upstream repositories must be in the same region or multi-region as the virtual repository, but can be in different Google Cloud projects.
  • Maven virtual repositories don't permit setting the version policy to snapshot or release.
  • Apt and Yum upstreams must be Artifact Registry standard repositories.
  • Apt and Yum standard repositories update the package index asynchronously after a package is imported, uploaded, or deleted. For small repositories, regenerating the index can take several seconds. For larger repositories, reindexing might take several minutes or longer. After reindexing is complete, the change to the repository is visible to Apt and Yum clients.

What's next