To support you with running your workloads, we have curated a set of
reproducible benchmark recipes that use some of the most common machine learning
(ML) frameworks and models. These recipes are stored in GitHub repositories. To
access them, see the
[AI Hypercomputer GitHub organization](https://github.com/orgs/AI-Hypercomputer/repositories).
These benchmark recipes were tested on clusters created using
Cluster Toolkit.

**Important:** Benchmark recipes are only supported for VMs that use [future reservations](/ai-hypercomputer/docs/consumption-models#future-reservations).
Overview
Before you get started with these recipes, ensure that you have completed the
following steps:
1. Choose an accelerator that best suits your workload. See [Choose a deployment strategy](/ai-hypercomputer/docs/choose-strategy).
2. Select a consumption method based on your accelerator of choice. See [Consumption options](/ai-hypercomputer/docs/consumption-models).
3. Create your cluster based on the type of accelerator selected. See [Cluster deployment guides](/ai-hypercomputer/docs/choose-strategy#accelerators-pre-train).

Recipes

The following reproducible benchmark recipes are available for pre-training
and inference on GKE clusters.

To search the catalog, you can filter by a combination of your framework,
model, and accelerator.

| **Recipe name** | **Accelerator** | **Model** | **Framework** | **Workload type** |
|---|---|---|---|---|
| [Llama3.1 70B - A3 Ultra](https://github.com/AI-Hypercomputer/gpu-recipes/blob/main/training/a3ultra/llama3-1-70b/maxtext-pretraining-gke/README.md) | A3 Ultra | Llama3.1 70B | MaxText | Pre-training on GKE |
| [Llama3.1 70B - A3 Ultra](https://github.com/AI-Hypercomputer/gpu-recipes/blob/main/training/a3ultra/llama3-1-70b/nemo-pretraining-gke/README.md) | A3 Ultra | Llama3.1 70B | NeMo | Pre-training on GKE |
| [Mixtral-8-7B - A3 Ultra](https://github.com/AI-Hypercomputer/gpu-recipes/tree/main/training/a3ultra/mixtral-8x7b/nemo-pretraining-gke/README.md) | A3 Ultra | Mixtral-8-7B | NeMo | Pre-training on GKE |
| [GPT3-175B - A3 Mega](https://github.com/AI-Hypercomputer/gpu-recipes/blob/main/training/a3mega/gpt3-175b/nemo-pretraining-gke/README.md) | A3 Mega | GPT3-175B | NeMo | Pre-training on GKE |
| [Mixtral 8x7B - A3 Mega](https://github.com/AI-Hypercomputer/gpu-recipes/blob/main/training/a3mega/mixtral-8x7b/nemo-pretraining-gke/README.md) | A3 Mega | Mixtral 8x7B | NeMo | Pre-training on GKE |
| [Llama3 70B - A3 Mega](https://github.com/AI-Hypercomputer/gpu-recipes/blob/main/training/a3mega/llama3-70b/nemo-pretraining-gke/README.md) | A3 Mega | Llama3 70B | NeMo | Pre-training on GKE |
| [Llama3.1 70B - A3 Mega](https://github.com/AI-Hypercomputer/gpu-recipes/blob/main/training/a3mega/llama3-1-70b/nemo-pretraining-gke/README.md) | A3 Mega | Llama3.1 70B | NeMo | Pre-training on GKE |
| [DeepSeek R1 671B](https://github.com/AI-Hypercomputer/gpu-recipes/blob/main/inference/a3mega/deepseek-r1-671b/sglang-serving-gke/README.md) | A3 Mega | DeepSeek R1 671B | SGLang | Inference on GKE |
| [DeepSeek R1 671B](https://github.com/AI-Hypercomputer/gpu-recipes/blob/main/inference/a3mega/deepseek-r1-671b/vllm-serving-gke/README.md) | A3 Mega | DeepSeek R1 671B | vLLM | Inference on GKE |
| [Llama-3.1-405B](https://github.com/AI-Hypercomputer/gpu-recipes/blob/main/inference/a3ultra/llama-3.1-405b/trtllm-inference-gke/single-node/README.md) | A3 Ultra | Llama-3.1-405B | TensorRT-LLM | Inference on GKE |
| [DeepSeek R1 671B](https://github.com/AI-Hypercomputer/gpu-recipes/blob/main/inference/a3ultra/deepseek-r1-671b/sglang-serving-gke/README.md) | A3 Ultra | DeepSeek R1 671B | SGLang | Inference on GKE |
| [DeepSeek R1 671B](https://github.com/AI-Hypercomputer/gpu-recipes/blob/main/inference/a3ultra/deepseek-r1-671b/vllm-serving-gke/README.md) | A3 Ultra | DeepSeek R1 671B | vLLM | Inference on GKE |

Last updated 2025-09-04 UTC.
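The recipe links above follow a consistent repository layout. As a minimal sketch, a pre-training recipe's README URL can be assembled from its workload type, machine type, model, and framework; note that the path segments and the `-pretraining-gke` suffix are assumptions read off the table links, not a documented naming scheme:

```shell
# Sketch: assemble a recipe README URL from the layout implied by the table
# links above. Segment names are assumptions inferred from those links.
REPO="https://github.com/AI-Hypercomputer/gpu-recipes"
WORKLOAD="training"      # pre-training recipes appear under training/
MACHINE="a3ultra"        # or a3mega
MODEL="llama3-1-70b"
FRAMEWORK="maxtext"      # or nemo
RECIPE_URL="${REPO}/blob/main/${WORKLOAD}/${MACHINE}/${MODEL}/${FRAMEWORK}-pretraining-gke/README.md"
echo "${RECIPE_URL}"
```

Inference recipes in the table use a similar shape under `inference/`, but with directory suffixes such as `sglang-serving-gke`, `vllm-serving-gke`, or `trtllm-inference-gke` instead of `maxtext-pretraining-gke`.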