AI & Machine Learning

Meta’s Llama 3.1 is now available on Google Cloud

July 23, 2024

Warren Barkley

Senior Director, Product Management, Google Cloud

Today, we’re excited to announce the addition of the Llama 3.1 family of models, including a new 405B model – Meta's most powerful and versatile model to date — to Vertex AI Model Garden. These additions continue Google Cloud’s commitment to open and flexible AI ecosystems that help you build solutions best-suited to your needs.

Vertex AI provides a curated collection of first-party, open-source, and third-party models, many of which — including the new Llama models — can be delivered as fully-managed Model-as-a-service (MaaS) offerings. With MaaS, you can choose the foundation model that fits your requirements, access it simply via an API, tailor it with robust development tools, and deploy on our fully-managed infrastructure — all with the simplicity of a single bill and hassle-free infrastructure.

Meta's Llama 3.1 represents a paradigm shift in open-weight models, boasting unparalleled performance and versatility in its class. This release features a family of models tailored for diverse applications:

Llama 3.1 405B: The largest openly available foundation model to date, Llama 3.1 405B sets a new standard among open models for flexibility, control, and innovation. This model opens an array of new possibilities, from generating synthetic data and powering complex reasoning tasks to effortlessly handling direct inference scenarios with minimal fine-tuning.
Llama 3.1 8B and 70B: These new versions of Llama 3 models excel at understanding language nuances, grasping context, and performing complex tasks such as translation and dialogue generation.

You can access the new 405B model in just a few clicks using Model-as-a-Service in preview here, without any setup or infrastructure hassles. General availability begins in the coming weeks. The 8B and 70B models will also be available as MaaS in the coming weeks. All three models are available for self-service in Vertex AI Model Garden starting today, giving you the flexibility to choose your preferred infrastructure.

These models are available as pre-trained and instruction-tuned versions to support your specific needs, and they include an expanded context of 128,000 tokens, offering deeper comprehension of longer, more complex text than earlier generations. Llama 3.1 models also include multilingual support across eight languages, further broadening their reach and applicability.

Using Llama 3.1 in Google Cloud

Google Cloud’s Vertex AI is a comprehensive AI platform for experimenting with, customizing, and deploying, and monitoring foundation models like Llama 3.1. Llama 3.1 joins over 150 curated, enterprise-ready models already available on Vertex AI Model Garden, expanding your choice and flexibility to choose the best models for your needs and budget, and to keep pace with leap-frogging innovations.

https://storage.googleapis.com/gweb-cloudblog-publish/images/Metas_Llama_3.1.max-900x900.png

Model card of Llama 3.1 on Vertex AI

Using Llama 3.1 on Vertex AI, you can:

Experiment with confidence: Explore Llama 3.1's capabilities through simple API calls and comprehensive side-by-side evaluations within our intuitive environment, without worrying about complex deployment processes.
Tailor Llama 3.1 to your exact needs: Fine-tune the model using your own data to build bespoke solutions tailored to your unique needs. If you're accessing the 8B and 70B models via self-service in Vertex AI Model Garden, you can start fine-tuning today. The ability to fine-tune the 405B model will be available in the coming weeks.
Ground your AI in truth: Make sure your AI outputs are reliable, relevant, and trustworthy with Vertex AI’s multiple options for grounding and RAG. For example, you can connect your models to enterprise systems, use Vertex AI Search for enterprise information retrieval, leverage Llama3 for generation, and more.
Craft intelligent agents: Create and orchestrate agents powered by Llama 3.1, using Vertex AI's comprehensive set of tools, including LangChain on Vertex AI. Integrate Llama 3.1 into your AI experiences with Firebase Genkit’s Vertex AI plugin.
Deploy without overheads: Eliminate the complexities of deploying and scaling even the 405B model, thanks to flexible auto-scaling and pay-as-you-go pricing. And of course, leverage world-class infrastructure, purpose-built for AI workloads.
Make Llama 3.1 work within your guardrails: Deploy with confidence with not only support for Meta’s Llama Guard, but also Google Cloud's built-in security, privacy, and compliance measures.

Get started with Llama 3.1 on Google Cloud

With every new innovation in AI models, enterprise AI ecosystems become more diverse. Our partnership with Meta testifies to both organizations’ commitment to providing world-class innovation supported by an open and accessible AI ecosystem. We'll continue to work closely with Meta and other partners to keep our customers at the forefront of AI capabilities.

To access Llama 3.1, visit Model Garden, and to learn more about Llama 3.1, check out Meta’s announcement.

Posted in

AI & Machine Learning

Data Analytics

BigQuery AI supports Gemini 3.0, simplified embedding generation and new similarity function

By Tianxiang Gao • 5-minute read

Management Tools

Monitoring Google ADK agentic applications with Datadog LLM Observability

By Abhi Das • 4-minute read

Databases

How Fastweb + Vodafone reimagined data workflows with Spanner & BigQuery

By Vincenzo Forciniti • 5-minute read

Compute

Scaling WideEP Mixture-of-Experts inference with Google Cloud A4X (GB200) and NVIDIA Dynamo

By Sean Horgan • 9-minute read