AI/ML orchestration on GKE documentation
Run optimized AI/ML workloads with the platform orchestration capabilities of Google Kubernetes Engine (GKE). With GKE, you can implement a robust, production-ready AI/ML platform with all the benefits of managed Kubernetes and the following capabilities:
- Infrastructure orchestration that supports GPUs and TPUs for large-scale training and serving workloads
- Flexible integration with data processing and distributed computing frameworks
- Support for multiple teams on the same infrastructure to maximize resource utilization
Start your proof of concept with $300 in free credit
- Get access to Gemini 2.0 Flash Thinking
- Free monthly usage of popular products, including AI APIs and BigQuery
- No automatic charges, no commitment
Documentation resources
Serve open models on GKE
- New: Serve LLMs such as DeepSeek-R1 671B or Llama 3.1 405B on GKE
- New: Serve an LLM using TPUs on GKE with KubeRay
- Tutorial: Serve an LLM using TPU Trillium on GKE with vLLM
- Tutorial: Quickstart: Serve an LLM using a single GPU on GKE
- Tutorial: Serve Gemma using GPUs on GKE with Hugging Face TGI
- Tutorial: Serve Gemma using GPUs on GKE with vLLM
Orchestrate TPUs and GPUs at large scale
- New: Optimize GKE resource utilization for mixed AI/ML training and inference workloads
- Video: Introduction to Cloud TPU for machine learning
- Video: Build large-scale machine learning models on Cloud TPU with GKE
- Video: Serve large language models with KubeRay on TPUs
- Blog: Machine learning with JAX on Kubernetes with NVIDIA GPUs
Cost optimization and job orchestration
- New: Reference architecture for a batch processing platform on GKE
- Blog: High-performance AI/ML model storage through local SSD support on GKE
- Blog: Simplify MLOps using Weights & Biases with Google Kubernetes Engine
- Best practice: Best practices for running batch workloads on GKE
- Best practice: Run cost-optimized Kubernetes applications on GKE
- Best practice: Improve the launch time of Stable Diffusion on GKE by four times
Related resources
Related videos
New Way Now: Vertiv cuts threat investigation time in half with Google SecOps
Summary: Mike Orosz, CISO at Vertiv, shares how the critical infrastructure solutions provider enhanced its security operations workflows with Google Cloud. Using Google Security Operations, Vertiv has accelerated and automated threat detection,
Protecting sensitive data in AI apps
Discover essential strategies for safeguarding sensitive data in AI applications on this episode of Serverless Expeditions. Join Martin Omander and Aron Eidelman as they explore key concepts, best practices, and tools for protecting data throughout
Introducing pipe syntax in BigQuery and Cloud Logging
BigQuery pipe syntax documentation → https://goo.gle/4dFaZrB Pipe syntax examples and reference guide → https://goo.gle/3U1wcF3 Blog: Introducing pipe syntax in BigQuery and Cloud Logging → https://goo.gle/3ZUr75l Writing complex SQL queries can be
Dataflow for Real-time Log Replication and Analytics
Deploy sample code → https://goo.gle/3TA3omR Streamline your log replication and analysis with Dataflow! Learn to build real-time pipelines that capture, process, and analyze logs from any source. See examples like detecting IoT sensor anomalies,
Why Google Kubernetes Engine uniquely supports the reliability-first approach
The State of Kubernetes Cost Optimization report (https://goo.gle/state-of-kubernetes-cost-optimization) found that more than expected Kubernetes Pods aren’t setting accurate resource requests and limits. This can lead to workloads being abruptly
Next-generation logging: Deep dive with Wells Fargo
Learn about Wells Fargo’s journey to next-generation logging, with focus on the architecture that handles logging at scale for the third-largest U.S. Bank. We'll explore topics such as: architecture supporting infrastructure serving over 70 million
Google Cloud Logging 101 - How to manage log routing at scale
Cloud logging’s log router is a power tool that gives you the flexibility to choose which logs are stored in Cloud Logging, sent to other Google Cloud products like Cloud Storage, or even sent to your favorite third-party product. In this video,
Troubleshoot faster, stress less with Google Cloud logging, monitoring, and observability
Logging and Monitoring in Google Cloud → https://goo.gle/3PFvcEp Observability in Google Cloud → https://goo.gle/3TAPXSR Tired of troubleshooting in the dark? Logging, monitoring, and observability in Google Cloud are key steps on your roadmap to
New Way Now: Backcountry is creating new retail landscapes with Google Cloud
Summary: Igor Cherny, CTO of Backcountry, shares how Google Cloud is helping the outdoor retailer turbocharge its AI innovation. With Google Cloud, Backcountry can now spin up new infrastructure in seconds and tap into the full power of its data to
Extend your Cloud Run containers’ capabilities using sidecars
Sidecars in Cloud Run allow users to run additional workloads alongside the main container in Cloud Run. Sidecar containers can provide capabilities like custom tracing, monitoring, and logging when using OpenTelemetry or Google Cloud Managed Service
Secure your data from ransomware and outages with Google Cloud backup
Attend this session to learn how Google Cloud’s backup services secure and protect your data from a variety of threats, such as ransomware, outages, and user errors. Our backup services protect VMs, databases, and Google Kubernetes Engine
CloudSQL: Configure database flags
Would you like to customize the behavior of your Cloud SQL instances? Are you looking to configure and tune your Cloud SQL instances? In this video, we introduce database flags that can customize the behavior of your Cloud SQL instances. We will look
Security Command Center and, more!
What’s new with Google Cloud? Welcome to our weekly series where we serve you the lowest latency news. This week, we’re talking about Security Command Center Premium pricing, and the date/time selector in Log Analytics in Cloud Logging Find more
Google Cloud Backup and DR Alert Notifications setup
In this video we detail how you can configure notifications to be sent to specified channels using events sent to Cloud Logging by the Backup and DR Service. More details can be found here → https://goo.gle/3FqIgZc
Sending custom log messages to Cloud Logging from Apigee
In this video, we'll be demonstrating how to create and send custom log messages from an Apigee API Proxy to Google Cloud logging. To try this sample → https://goo.gle/412tUHt For other samples → https://goo.gle/3I6sCTl For other Apigee Accelerator
Troubleshoot permission errors accessing datasets and tables
Are you having issues with querying BigQuery tables? Would you like to learn how to troubleshoot and resolve permission errors related to querying BigQuery tables? Check out this video where we will be discussing the various permissions needed to
Troubleshoot throttled jobs in Google Cloud Dataproc
Is your Cloud Dataproc job stuck in RUNNING state and not doing anything? Would you like to troubleshoot and resolve such issues with your job? Check out this video to learn the concepts like what is a Dataproc job, how to submit the job and life of
How to use monitoring and dashboards with Google Cloud Armor
Manage custom dashboards → https://goo.gle/3iVh0JV Getting started with Cloud Armor Adaptive Protection → https://goo.gle/3OoYq8y Google Cloud Armor Adaptive Protection overview → https://goo.gle/3Ybznup Cloud Armor allows you to easily monitor your
Troubleshoot Dataproc Cluster Creation Errors
Have you experienced any failures while creating Dataproc clusters? Are you interested to learn how to troubleshoot Dataproc creation cluster errors? Check out this video where we provide a quick overview of the common issues that can lead to
Securing APIs and Implementing multi-region failover with PSC and Apigee
When you access managed services that run on Google Cloud, you might want to channel requests to those services through a policy enforcement point. Using a policy enforcement point lets you configure logging and cryptography, and lets you specify
Troubleshoot Slow or Stuck Jobs in Google Cloud Dataflow
Are you experiencing slowness with your jobs or your jobs getting stuck in Cloud Dataflow? Slow/Stuck Dataflow jobs can be caused by a number of factors, but they can be investigated and solved with a handful of steps! Check out this video to learn
Get started with Looker SSO Embedding
SSO embedding getting started docs → https://goo.gle/3SpYbLK SSO embedding with Embed SDK getting started docs → https://goo.gle/3StQe8u Embed SDK Repository → https://goo.gle/3E7Bdoz SSO embedding with the Embed SDK allows users secure access to
Troubleshoot Cloud Firewall Rules
Is your application not working as expected despite having configured proper firewall rules? Would you like to know how to use Firewall rules log to troubleshoot issues with the Firewall rules? In this video, we introduce you to Firewall rules logs
Audit Logs: Querying Logs, Pricing and Retention
Would you like to have a better control on the information that you find through Audit logs? Do you want to know the pricing and retention information for Audit logs? Then check out this video to learn how to view the Audit logs using logging queries
Analyze Pacemaker logs in Cloud Logging
As an SAP system administrator, you've probably asked yourself: why did my Compute Instance restart? Why did Pacemaker restart my instance? Why did/didn’t my SAP system failover? By streaming Pacemaker logs into Cloud Logging, you can now find the
Google Data Studio, Startup Stories podcast, & more!
Here to bring you the latest news in the Cloud is Kelci Mensah. • Startup Stories podcast → https://goo.gle/3Qntc28 • Data Studio → https://goo.gle/3dmvH65 • Top 5 Logging tips → https://goo.gle/3Qx4SKI Chapters: 0:00 - Intro 0:11 - Startup Stories
Peering Google Cloud VMware Engine to GCP VPC
Google Cloud VMware Engine → https://goo.gle/3zV53di In this video, we walkthrough how to peer your Google Cloud VMware Engine environment and to your virtual private network. Watch for an overview of peering! Chapters: 0:00 - Intro 0:12 - Peering
Stream application logs into Cloud Logging
Do you have workloads that generate logs inside your Google Compute Engine (GCE) instances? Would you like to troubleshoot your application directly from Google Cloud Platform? Then check out this video to learn how to install and configure the Ops
Google Cloud Media CDN walkthrough
Cloud Media CDN → https://goo.gle/3NbPcdy The cloud is growing to meet the needs of modern video streaming! Introducing Cloud Media CDN to efficiently scale and deliver media content to anyone, anywhere globally. It uses the same infrastructure that
Cloud Storage Data Security and Sovereignty
Learn more → https://goo.gle/3H4mKZI Join us today and learn about a number of new Google Cloud storage capabilities which empower our customers with Data security and Data Sovereignty. In this session we'll discuss how we think about Data Security