Train Llama2 with Megatron-LM on A3 Mega virtual machines

Standard

Overview

In this quickstart, you learn how to run a container-based Megatron-LM PyTorch workload on A3 Mega. The code is available in this GitHub repository: megatron-gke.

Before you begin

Take the following steps to enable the Google Kubernetes Engine (GKE) API:

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Roles required to select or create a project

Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

Go to project selector

Verify that billing is enabled for your Google Cloud project.

Enable the GKE API.

Roles required to enable APIs

To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

Enable the API
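
The console steps above can also be scripted. The following is a minimal sketch, assuming the gcloud CLI is installed and authenticated. The helper name enable_gke_api_command and the project ID my-project-id are hypothetical placeholders, not part of the original guide. The sketch only builds and prints the command; it does not run it.

```python
import shlex

# Service name of the Google Kubernetes Engine API.
GKE_API = "container.googleapis.com"


def enable_gke_api_command(project_id: str) -> list[str]:
    """Build the gcloud command that enables the GKE API for a project."""
    return ["gcloud", "services", "enable", GKE_API, f"--project={project_id}"]


# "my-project-id" is a hypothetical placeholder; substitute your own project ID.
command = enable_gke_api_command("my-project-id")
print(shlex.join(command))
# To execute it for real: import subprocess; subprocess.run(command, check=True)
```

Running the printed command requires the serviceusage.services.enable permission mentioned above.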

Make sure that you have the following role or roles on the project: roles/container.admin, roles/compute.networkAdmin, roles/iam.serviceAccountUser
Check for the roles
1. In the Google Cloud console, go to the IAM page.
  Go to IAM
2. Select the project.
3. In the Principal column, find all rows that identify you or a group that you're included in. To learn which groups you're included in, contact your administrator.
4. For all rows that specify or include you, check the Role column to see whether the list of roles includes the required roles.
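
The role check above can be approximated from the command line. The following is a sketch under stated assumptions: the gcloud CLI is installed, and the project ID and email address are hypothetical placeholders. The helper names list_roles_command and missing_roles are illustrative. The gcloud command lists only roles bound directly to the account on the project, so roles inherited through a group, folder, or organization will not appear.

```python
import shlex

# Roles this guide asks for on the project.
REQUIRED_ROLES = (
    "roles/container.admin",
    "roles/compute.networkAdmin",
    "roles/iam.serviceAccountUser",
)


def list_roles_command(project_id: str, email: str) -> list[str]:
    """Build a gcloud command that prints the roles bound directly to one account."""
    return [
        "gcloud", "projects", "get-iam-policy", project_id,
        "--flatten=bindings[].members",
        f"--filter=bindings.members:{email}",
        "--format=value(bindings.role)",
    ]


def missing_roles(granted: set[str]) -> list[str]:
    """Return the required roles that are absent from the granted set."""
    return [role for role in REQUIRED_ROLES if role not in granted]


# Hypothetical placeholders: substitute your own project ID and account.
print(shlex.join(list_roles_command("my-project-id", "you@example.com")))
# Example: suppose the command above printed only roles/container.admin.
print(missing_roles({"roles/container.admin"}))
```

A broader role can carry the same permissions as a listed role, so treat a non-empty result as a prompt to look closer rather than as proof that access is missing.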
Grant the roles
1. In the Google Cloud console, go to the IAM page.
  Go to IAM
2. Select the project.
3. Click Grant access.
4. In the New principals field, enter your user identifier. This is typically the email address for a Google Account.
5. In the Select a role list, select a role.
6. To grant additional roles, click Add another role and add each additional role.
7. Click Save.
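
As an alternative to the console steps above, the three roles can be granted with the gcloud CLI. This is a minimal sketch, assuming gcloud is installed and that you are allowed to change the project's IAM policy. The helper name grant_role_commands, the project ID, and the email address are hypothetical placeholders. The sketch prints one command per role rather than running them.

```python
import shlex

# Roles this guide asks for on the project.
REQUIRED_ROLES = (
    "roles/container.admin",
    "roles/compute.networkAdmin",
    "roles/iam.serviceAccountUser",
)


def grant_role_commands(project_id: str, user_email: str) -> list[list[str]]:
    """Build one add-iam-policy-binding command per required role."""
    return [
        [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member=user:{user_email}",
            f"--role={role}",
        ]
        for role in REQUIRED_ROLES
    ]


# Hypothetical placeholders: substitute your own project ID and account.
for command in grant_role_commands("my-project-id", "you@example.com"):
    print(shlex.join(command))
```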

Create an A3 Mega cluster

Set up your environment

Use the topology-aware scheduler to deploy your Pods

Run the workload

Build the Dockerfile and push it to Google Cloud Artifact Registry

Launch the Megatron-LM Llama2 benchmark

Clean up

Delete the GKE cluster:
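
The original command for this step is not reproduced here. The following is a hedged sketch of what cluster deletion can look like with the gcloud CLI, assuming a regional cluster. The cluster name, region, and project ID are hypothetical placeholders, and the helper name delete_cluster_command is illustrative. For a zonal cluster, the --zone flag would replace --region.

```python
import shlex


def delete_cluster_command(cluster_name: str, region: str, project_id: str) -> list[str]:
    """Build the gcloud command that deletes a regional GKE cluster."""
    return [
        "gcloud", "container", "clusters", "delete", cluster_name,
        f"--region={region}",
        f"--project={project_id}",
    ]


# Hypothetical placeholders: substitute the values you used when creating the cluster.
print(shlex.join(delete_cluster_command("my-cluster", "my-region", "my-project-id")))
```

When run, gcloud asks for confirmation before deleting the cluster.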

Delete the Cloud Storage bucket
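
The original command for this step is not reproduced here. As a hedged sketch, a bucket and everything in it can be removed with gcloud storage rm. The bucket name my-bucket and the helper name delete_bucket_command are hypothetical placeholders. The sketch only prints the command, because the deletion cannot be undone.

```python
import shlex


def delete_bucket_command(bucket_name: str) -> list[str]:
    """Build the gcloud command that deletes a bucket and all of its objects."""
    return ["gcloud", "storage", "rm", "--recursive", f"gs://{bucket_name}"]


# "my-bucket" is a hypothetical placeholder; substitute your own bucket name.
print(shlex.join(delete_bucket_command("my-bucket")))
```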

What's next