Defining data repositories

As part of performing a migration, Migrate to Containers writes information to different data repositories:

- Docker image files representing a migrated Linux VM are written to a Docker registry.

  These Docker image files represent the files and directories of the migrated Linux VM. This repository is not required when migrating Windows workloads.
- Migration artifacts that represent the migrated workload are written to a second repository.

  Artifacts include the configuration YAML files that you can use to deploy the migrated workloads, and other files. The exact artifacts depend on whether you are migrating Linux or Windows workloads.

Platform	Docker image files registry*	Migration artifacts repository
GKE Enterprise clusters on Google Cloud	Default is Container Registry (GCR). Optionally specify any Docker registry that supports basic authentication.	Default is Cloud Storage. Optionally specify S3 as the artifacts repo for Linux migrations. S3 is not supported for migrating Windows workloads.
Google Distributed Cloud Virtual for Bare Metal clusters	Default is Container Registry (GCR). Optionally specify any Docker registry that supports basic authentication.	Default is Cloud Storage. Optionally specify S3 as the artifacts repo.
* The Docker image files registry is not required for Windows migrations. It is only required for migrating Linux VMs.

Viewing the repository status

After you install Migrate to Containers, you validate the Migrate to Containers installation by running the migctl doctor command. As part of this validation, the migctl doctor command checks the status of the repos:

migctl doctor

In the following example output of the migctl doctor command, the check mark indicates that Migrate to Containers has been successfully deployed, and the exclamation marks indicate that you have not yet configured the necessary data repositories:

  [✓] Deployment
  [!] Docker Registry
  [!] Artifacts Repo
  [!] Source Status

If there are problems with your repositories, migctl doctor flags them when you run the command. The command queries all artifact repositories and warns about each one that is unhealthy.

In the following example output of the migctl doctor command, the exclamation mark indicates that Migrate to Containers has found errors in the artifact repositories. Per the error, the repository has failed to initialize and the configuration should be fixed before running the command again. For additional repositories beyond your default, this is not necessarily a blocker to your migration.

If your default repository is unhealthy, an X indicates that Migrate to Containers has found an error and the migration may fail.

[✓] Deployment
[✓] Docker Registry
[!] Artifacts Repository
    [✓] example-healthy-repository [default]
    [!] example-failed-repository
        Error: Failed to initialize repository client: Retryable M4A_RepositoryFactoryMissingSecret: artifacts repository secret is configured, but not found at the designated location '/example-failed-repository'
[!] Source Status
[✓] Default storage class

After configuring the repositories, you can run the migctl doctor command again to ensure that the repositories are configured correctly:

  [✓] Deployment
  [✓] Docker registry
  [✓] Artifacts repo
  [!] Source Status
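
As a rough sketch, the status markers in the output above could be checked from a script, for example before starting a migration from automation. The helper name count_doctor_warnings is hypothetical and not part of migctl. The sketch assumes that flagged checks are printed with a leading [!] marker, exactly as in the examples above; the actual output format may differ between versions. Indented entries, such as an individual unhealthy repository, are counted as well.

```shell
# Hypothetical helper (not part of migctl): reads `migctl doctor` output on
# stdin and prints the number of lines that start with the [!] marker.
count_doctor_warnings() {
  # grep -c exits non-zero when nothing matches, so `|| true` keeps the
  # function usable under `set -e`; the count (possibly 0) is still printed.
  grep -cE '^[[:space:]]*\[!\]' || true
}

# Example usage (assumes migctl is installed and configured):
#   warnings=$(migctl doctor | count_doctor_warnings)
#   [ "$warnings" -eq 0 ] || echo "doctor reported $warnings warning(s)" >&2
```

A non-zero count is not necessarily fatal: as described above, a warning on a non-default repository does not always block a migration.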

Google Cloud console support

The Google Cloud console displays the URLs to items in the repos based on the repository implementation. For example, if the repo is implemented using S3, the Google Cloud console shows URLs for a bucket in S3.

Options for repository location

The location of the data repositories can have an effect on migration performance and cost.

For example, the Docker image files representing a migrated VM can be large. If you have an on-premises processing cluster but write the Docker image files to GCR on Google Cloud, then you incur the latency of uploading the data and the cost of storing it.

For an on-premises processing cluster, you might find it more efficient to define a Docker registry local to the cluster. By having the registry local, you minimize the upload latency and minimize storage costs.

For a GKE cluster deployed on Google Cloud, using the default GCR repos provides the highest level of performance, but you are charged for that storage. However, you are not required to use GCR with a Cloud-based cluster and can choose to use your own Docker registry instead.

Repository naming requirements

You assign a name to a repository when you add it to Migrate to Containers. The name must meet the following requirements:

Contain at most 63 characters.
Contain only lowercase alphanumeric characters or "-" (hyphen).
Start with an alphanumeric character.
End with an alphanumeric character.
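
The rules above can be checked before calling migctl. The following is a minimal sketch in POSIX shell. The function name is_valid_repo_name is hypothetical and not part of migctl, and the sketch assumes that "alphanumeric" means the ASCII characters a-z and 0-9.

```shell
# Hypothetical helper (not part of migctl): returns 0 if the name given as
# the first argument satisfies the naming requirements, 1 otherwise.
is_valid_repo_name() {
  # At most 63 characters (an empty name is rejected as well).
  [ "${#1}" -ge 1 ] && [ "${#1}" -le 63 ] || return 1
  case $1 in
    # Any character other than a lowercase letter, a digit, or a hyphen.
    # The characters are spelled out so the check does not depend on locale.
    *[!abcdefghijklmnopqrstuvwxyz0123456789-]*) return 1 ;;
    # Must start and end with an alphanumeric character, so no edge hyphens.
    -*|*-) return 1 ;;
  esac
  return 0
}

# Example usage:
#   is_valid_repo_name "my-registry-1" && echo "name is acceptable"
```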

Repository authentication

All repositories used by Migrate to Containers require authentication. The authentication mechanism depends on the repository type, as shown in the following table:

Repository	Implementation	Authentication
Docker image files registry	GCR	JSON key for a Google Cloud service account. See Creating a service account for accessing Container Registry and Cloud Storage for more.
Docker image files registry	Docker registry	Username and password for basic authentication.
Migration artifacts repository	Cloud Storage	JSON key for a Google Cloud service account. See Creating a service account for accessing Container Registry and Cloud Storage for more.
Migration artifacts repository	S3	Access key and secret or credentials file. See Overview of managing access for more.

Supporting TLS

Some repositories are accessible using TLS/SSL over HTTPS. If the HTTPS connection to the repository uses a self-signed cert, then you must pass a PEM file containing either of the following when configuring the repository:

The public key of the self-signed cert
A concatenation from the root certificate and all intermediate certificates up to the actual server certificate

Configuring a Docker registry

Use the migctl command to configure a Docker registry. The migctl command lets you perform the following actions on a registry configuration:

Create
Update
Delete
List
Set default

You can define multiple configurations. Migrate to Containers uses the configuration currently defined as the default. Use the migctl docker-registry list command to view the current configurations, including the default. Use the migctl docker-registry set-default command to set the default configuration.

The following example shows how to configure a Docker registry:

GCR
```
migctl docker-registry create gcr registry-name --project project-id --json-key=m4a-install.json
```
where:
- registry-name is the user-defined name of the Docker registry configuration.
- project-id is your Google project ID.
- m4a-install.json is the name of the JSON key file for the service account for accessing Container Registry and Cloud Storage as described in Configuring a service account.
Docker registry
```
migctl docker-registry create basic-auth registry-name --registry-path url --username username --ca-pem-file ca-pem-filename
```
where:
- registry-name is the user-defined name of the Docker registry configuration.
- url specifies the URL of the registry without the http:// or https:// prefix. For example, localhost:8080/myregistry.
- username is the user name for the basic authentication credentials of the registry. You are prompted to enter the password.
- If the registry uses a self-signed cert, ca-pem-filename specifies a PEM file containing either the public key or the complete CA chain, meaning a concatenation from the intermediate CA certificates up to the root certificate. For example:
```
cat int1.pem int2.pem ... root.pem
```
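
As an illustrative sketch, the registry path and the PEM file for the basic-auth command can be prepared with two small shell helpers. The function names strip_scheme and build_ca_bundle and the file name ca-chain.pem are hypothetical and not part of migctl. The certificate order follows the cat example above: intermediate certificates first, root certificate last.

```shell
# Hypothetical helper: prints a registry URL without its http:// or https://
# prefix, which is the form the --registry-path flag expects.
strip_scheme() {
  stripped=${1#http://}
  stripped=${stripped#https://}
  printf '%s\n' "$stripped"
}

# Hypothetical helper: concatenates the given certificate files, in order,
# into the bundle file named by the first argument.
build_ca_bundle() {
  bundle=$1
  shift
  : > "$bundle" || return 1      # create or truncate the output file
  for cert in "$@"; do
    if [ ! -r "$cert" ]; then
      echo "cannot read certificate: $cert" >&2
      return 1
    fi
    cat "$cert" >> "$bundle"
  done
}

# Example usage (file names are placeholders):
#   build_ca_bundle ca-chain.pem int1.pem int2.pem root.pem
#   migctl docker-registry create basic-auth registry-name \
#     --registry-path "$(strip_scheme https://localhost:8080/myregistry)" \
#     --username username --ca-pem-file ca-chain.pem
```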

To later update the registry configuration, run the migctl docker-registry update command with the same arguments as you used to create it:

migctl docker-registry update gcr registry-name same-flags-as-create

When you configure a Docker registry, it becomes the default registry. However, you might have multiple registries defined. To see the current list of registries:

migctl docker-registry list

To set the default registry configuration, meaning the one currently used for migrations, use the following command:

migctl docker-registry set-default registry-name

To delete a registry configuration:

migctl docker-registry delete registry-name

Configuring an artifacts repository

Use the migctl command to configure an artifacts repository. The migctl command lets you perform the following actions on a repository configuration:

Create
Update
Delete
List
Set default

The migctl create, update, and list commands all include continuous health checks for artifact repositories. When run, they provide messages indicating whether the repository is ready, along with any associated error message. To skip the health check on create or update, run these commands with the --async flag.

You can define multiple configurations. Migrate to Containers uses the configuration currently defined as the default. Use the migctl artifacts-repo list command to view the current configurations, including the default. Use the migctl artifacts-repo set-default command to set the default configuration.

The following example shows how to configure an artifacts repository:

Cloud Storage
```
migctl artifacts-repo create gcs repository-name --bucket-name bucket-name --json-key=m4a-install.json
```
where:
- repository-name is the user-defined name of the artifacts repository configuration.
- bucket-name specifies an existing bucket in the Cloud Storage repository. If you do not have an existing bucket, create one using the instructions at Create buckets.
  
  Note: When installing Migrate to Containers on clusters on Google Cloud, the Migrate to Containers installer automatically creates a default bucket named:
  
  GCP_PROJECT-migration-artifacts
  
  Where GCP_PROJECT is your Google project ID.
- project-id is your Google project ID.
- m4a-install.json is the name of the JSON key file for the service account for accessing Container Registry and Cloud Storage as described in Configuring a service account.
S3
```
migctl artifacts-repo create s3 repository-name --bucket-name bucket-name --region aws-region --access-key-id=key-id
```
You are prompted to enter the secret key for key-id.

Alternatively, specify the path to a credentials file:
```
migctl artifacts-repo create s3 repository-name --bucket-name bucket-name --region aws-region --credentials-file-path file-path
```
where:
- repository-name is the user-defined name of the artifacts repository configuration.
- bucket-name specifies an existing bucket in the S3 repository. If you do not have an existing bucket, create one using the instructions at Working with Amazon S3 Buckets.
- aws-region specifies the AWS region for the repository. The processing cluster and the repository can be in separate regions as long as the cluster has permissions to access the repository.
- key-id specifies the access key. See Overview of managing access for more.
- file-path specifies the path to a CSV file, downloaded from the AWS console, containing the credentials.

To later update the repository configuration, run the migctl artifacts-repo update command with the same arguments as you used to create it:

migctl artifacts-repo update gcs repository-name same-flags-as-create

When you configure an artifacts repository, it becomes the default repository. However, you might have multiple repositories defined. To see the current list of repositories:

migctl artifacts-repo list

To set the default repository configuration, meaning the one currently used for migrations, use the following command:

migctl artifacts-repo set-default repository-name

To delete a repository configuration:

migctl artifacts-repo delete repository-name

What's next

Learn how to configure an HTTPS proxy for a remote installation.