This document helps you optimize *Goodput*, the rate of useful data
transferred, for your workloads. To achieve this optimization, we have curated
reproducible Goodput recipes that use common machine learning (ML) frameworks
and models. To review these recipes, see the
[AI Hypercomputer GitHub organization](https://github.com/orgs/AI-Hypercomputer/repositories).
The Goodput recipes were tested on clusters created using
Cluster Toolkit.
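As an illustrative sketch (not part of the recipes themselves), Goodput can be expressed as the fraction of a job's total wall-clock time that is spent doing useful work; the function name and the example numbers below are assumptions for illustration only:

```python
# Illustrative sketch: Goodput as the share of total run time that was
# productive. The helper name and example figures are hypothetical.

def goodput_fraction(productive_seconds: float, total_seconds: float) -> float:
    """Return the productive share of total wall-clock time (0.0 to 1.0)."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return productive_seconds / total_seconds

# Example: a 10-hour training run that lost 30 minutes to restarts and
# checkpoint reloads was productive for 9.5 hours.
print(goodput_fraction(9.5 * 3600, 10 * 3600))  # 0.95
```

Under this framing, the recipes improve Goodput by reducing the unproductive time lost to failures and recovery.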
**Important:** Goodput recipes are only supported for VMs that use [future reservations](/ai-hypercomputer/docs/consumption-models#future-reservations).

Before you begin

Before you use the Goodput recipes in this document, complete the following
steps if you haven't already:

1. Choose an accelerator that best suits your workload. See [Choose a deployment strategy](/ai-hypercomputer/docs/choose-strategy).
2. Select a consumption method based on your accelerator of choice. See [Consumption options](/ai-hypercomputer/docs/consumption-model).
3. Create your cluster based on the type of accelerator selected. See [Cluster deployment guides](/ai-hypercomputer/docs/choose-strategy#accelerators-pre-train).

Recipes

The following reproducible Goodput recipes are available for pre-training
on GKE clusters:

| **Recipe name** | **Accelerator** | **Model** | **Framework** | **Workload type** |
|---|---|---|---|---|
| [Llama3.1 70B - A3 Mega](https://github.com/AI-Hypercomputer/gpu-recipes/blob/main/training/a3mega/llama3-1-70b/nemo-pretraining-gke-resiliency/README.md) | A3 Mega | Llama3.1 70B | NeMo | Pre-training on GKE |

Last updated 2025-09-03 UTC.