Generate an NVIDIA bug report for Blackwell GPUs
This document explains how to create an NVIDIA bug report for machine types that use [NVIDIA Blackwell GPUs](https://www.nvidia.com/en-us/data-center/technologies/blackwell-architecture/).
To identify which of your machine types use the NVIDIA Blackwell GPU, see [GPU models](/compute/docs/gpus#gpu-models).
If your machine type uses the NVIDIA Blackwell GPU architecture, the report that the [nvidia-bug-report.sh](https://docs.nvidia.com/deploy/rma-process/index.html#topic_3_1) script generates doesn't include critical low-level hardware data unless NVIDIA Firmware Tools (MFT) is installed on the instance. This data includes the physical layer status of NVLink connections, internal GPU register values, and raw diagnostic segments from the firmware. It is essential for diagnosing issues, especially NVLink-related issues that can lead to [GPU Xid errors](https://docs.nvidia.com/deploy/xid-errors/index.html#xid-error-listing) or unresolved performance degradation.
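Before collecting a report, it can help to see which Xid codes the kernel has already logged. The following is a minimal sketch, not an NVIDIA tool: it assumes the driver writes Xid events to the kernel log in the common `NVRM: Xid (PCI:...): CODE, ...` form, and `list_xid_codes` is a hypothetical helper name introduced here for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read kernel log text on stdin and print each distinct
# Xid code once, in ascending numeric order.
# Assumption: Xid lines look like "NVRM: Xid (PCI:0000:3b:00): 79, ...".
list_xid_codes() {
  sed -n 's/.*NVRM: Xid ([^)]*): \([0-9][0-9]*\).*/\1/p' | sort -n -u
}

# Example invocation (reading the kernel log usually requires root):
#   sudo dmesg | list_xid_codes
```

If the helper prints nothing, no line in the input matched the assumed format. That alone does not prove the GPU is healthy, because the log buffer may have rotated.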
Generate an NVIDIA bug report
To generate a bug report, complete the following steps:
1. Connect to your GPU instance. Choose one of the following options:

   - [Connect to Linux instances](/compute/docs/connect/standard-ssh)
   - [Connect to Windows instances](/compute/docs/instances/connecting-to-windows)

2. Download and install the MFT package by selecting one of the following options:

### Container-Optimized OS

If your instance uses Container-Optimized OS (COS) as the guest operating system, use the open source [GCE COS NVIDIA Bug Report Collector](https://github.com/GoogleCloudPlatform/cluster-toolkit/tree/main/community/gce-cos-nvidia-bug-report) tool to generate the bug report with MFT. This tool automatically injects supported MST kernel modules that match the COS kernel, installs the userspace tool, generates the bug report, and optionally uploads the result to a Cloud Storage bucket.

### Other OS

For other Linux operating systems, complete the following steps:

1. Download NVIDIA Firmware Tools (MFT) version 4.32.0 or later from the [NVIDIA website](https://network.nvidia.com/products/adapter-software/firmware-tools/).
2. Install the tool. For more information, see [Compilation and installation](https://docs.nvidia.com/networking/display/mftv4320/compilation+and+installation) in the NVIDIA Firmware Tools (MFT) documentation. After you install MFT, the [nvidia-bug-report.sh](https://docs.nvidia.com/deploy/rma-process/index.html#topic_3_1) script automatically uses the MFT tools to generate the report. You don't need to interact with the MFT tools directly.
3. Run the `nvidia-bug-report.sh` script to generate a bug report. This process takes about two minutes.
4. Extract the report.
5. Verify that the report includes MFT data by running the following command on the extracted bug report file. Replace `PATH_TO_UNZIPPED_BUG_REPORT` with the path to that file:

   ```
   grep -m 1 -A 30 "Starting GPU MST dump.." PATH_TO_UNZIPPED_BUG_REPORT
   ```

   The output is similar to the following example:

   ```text
   Starting GPU MST dump..
   ... (additional MFT data) ...
   ```

Last updated: 2025-09-03 (UTC)
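For Linux instances that don't run COS, the run, extract, and verify steps can be sketched as a small script. This is an illustration under stated assumptions rather than an official procedure: it assumes MFT is already installed, that `nvidia-bug-report.sh` is on the `PATH` and writes its default `nvidia-bug-report.log.gz` archive to the current directory, and that `sudo` is available. The function names `report_has_mst_dump` and `collect_and_verify` are hypothetical.

```shell
#!/usr/bin/env bash

# Marker string taken from the verification step in this document.
MST_MARKER='Starting GPU MST dump..'

# Return 0 if the extracted report file at $1 contains the MST dump marker.
# -F matches the marker as a fixed string, so the dots are taken literally.
report_has_mst_dump() {
  grep -q -F -- "$MST_MARKER" "$1"
}

# Run the collector in a scratch directory, extract the archive, and check it.
# Assumption: the script writes nvidia-bug-report.log.gz to the working directory.
collect_and_verify() {
  local workdir
  workdir=$(mktemp -d) || return 1
  ( cd "$workdir" && sudo nvidia-bug-report.sh ) || return 1
  gunzip "$workdir/nvidia-bug-report.log.gz" || return 1
  if report_has_mst_dump "$workdir/nvidia-bug-report.log"; then
    echo "MFT data found in $workdir/nvidia-bug-report.log"
  else
    echo "No MST dump section in $workdir/nvidia-bug-report.log" >&2
    return 1
  fi
}

# Example invocation on the GPU instance:
#   collect_and_verify
```

Keeping the marker check in its own function means it can also be pointed at a report that was collected earlier, for example `report_has_mst_dump /path/to/nvidia-bug-report.log`.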