Vertex AI adds Mistral AI model for powerful and flexible AI solutions
AI Account Executive
Thomas Le Moullec
AI Infrastructure Customer Engineer
One of Europe’s leading providers of artificial intelligence (AI) solutions, Mistral AI, is on a mission to design highly performant and efficient open-source (OSS) foundation models.
Mistral AI is teaming up with Google Cloud to natively integrate their cutting-edge AI model within Vertex AI. This integration can accelerate AI adoption by making it easy for businesses of all sizes to launch AI products or services.
Mistral-7B is Mistral AI’s foundational model that is based on customized training, tuning, and data processing methods. This optimized model allows for compression of knowledge and deep reasoning capacities despite having a small number of parameters. These optimized foundational models can lead to benefits in sustainability and efficiency by reducing training time, cost, energy consumption, and the environmental impact of AI.
Mistral’s model uses Grouped-Query Attention (GQA), which balances speed and accuracy during model inference, and Sliding Window Attention (SWA), which handles longer sequences at lower cost and improves the accuracy of the resulting large language model (LLM).
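To make these two attention ideas more concrete, here is a minimal Python sketch. It is illustrative only and not Mistral AI's implementation: the function names are hypothetical, the default head counts (32 query heads sharing 8 key/value heads) reflect the published Mistral-7B configuration as we understand it, and the tiny window in the usage note is chosen purely for readability.

```python
from typing import List


def kv_head_for_query_head(q_head: int, n_q_heads: int = 32, n_kv_heads: int = 8) -> int:
    """Grouped-Query Attention: several query heads share one key/value head.

    Returns the index of the key/value head that a given query head reads from.
    The default head counts are assumptions for illustration.
    """
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    if not 0 <= q_head < n_q_heads:
        raise ValueError("q_head out of range")
    group_size = n_q_heads // n_kv_heads  # query heads per key/value head
    return q_head // group_size


def sliding_window_mask(seq_len: int, window: int) -> List[List[bool]]:
    """Sliding Window Attention: each token attends only to recent tokens.

    mask[i][j] is True when position i may attend to position j, that is,
    when j is not in the future (j <= i) and lies within the last `window`
    positions (i - j < window).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        [j <= i and i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]
```

For example, `sliding_window_mask(4, 2)` lets each token see itself and one predecessor, so the cost per token depends on the window size rather than on the full sequence length. Likewise, with the assumed head counts, query heads 0 through 3 all read from key/value head 0, which shrinks the key/value cache that must be kept in memory during inference.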
A consistent approach in AI
At Google, we believe anyone should be able to quickly and easily turn their AI dreams into reality. OSS has become increasingly important to this goal, heavily influencing the pace of innovation in AI and machine learning (ML) ecosystems. These OSS efforts are aimed at enabling a broader spectrum of developers and researchers to contribute to the improvement of these AI models and make AI explainable, ethical, and equitable.
Google Cloud seeks to become the best platform for the OSS AI community and ecosystem. Bringing Mistral AI’s model to Google Cloud furthers this mission.
Freedom to innovate anywhere
Mistral AI users will benefit from Google Cloud’s commitment to multi-cloud and hybrid cloud, and to high standards of data security and privacy. Concretely, they can keep their data in accordance with their privacy rules and fine-tune and run their models in the environment of their choice — whether on-premises, in Google Cloud, on another cloud provider, or across geographic regions. Through Google Cloud and open source technologies, users enjoy freedom of choice.
Organizations need AI ecosystems with data sharing and open infrastructure. Google Cloud customers can run and manage their AI infrastructure on open source technologies such as Google Kubernetes Engine, Ray on GKE, or Ray on Vertex AI. They can leverage BigQuery Omni to access data in external data sources and cloud providers, and use BigLake to unify data lakes and data warehouses across clouds.
AI/ML privacy commitments for Google Cloud
At Google Cloud, we are committed to providing customers with increased visibility and controls over their data.
Customers own and control their data, and it stays within their Google Cloud environment. We recognize that customers want their data to be private, and not be shared with the broader Google or LLM training corpus. Customers maintain control over where their data is stored and how or where it is used, helping them to safely pursue data-rich use cases without fear of data privacy breaches. Google does not store, read, or use customer data outside of the customer’s cloud environment. Customers' fine-tuned data is their data. We are able to provide Cloud AI offerings such as Vertex AI and Mistral AI models with enterprise-grade safety, security, and privacy baked in from the beginning.
Mistral-7B now available in Vertex AI
Today we are pleased to announce that Mistral AI’s first open source model, Mistral-7B, is integrated with Vertex AI Notebooks.
This public notebook allows Google Cloud customers to deploy an end-to-end workflow to experiment (i.e., test, fine-tune) with Mistral-7B and Mistral-7B-Instruct on Vertex AI Notebooks. Vertex AI Notebooks enable data scientists to collaboratively develop models by sharing, connecting to Google Cloud data services, analyzing datasets, experimenting with different modeling techniques, deploying trained models into production, and managing MLOps through the model lifecycle.
Mistral AI’s model integration in Vertex AI leverages vLLM, a highly optimized LLM serving framework that can increase serving throughput. By running the notebook, users will be able to automatically deploy a vLLM image (maintained by Model Garden) on a Vertex AI endpoint for inference. When defining the endpoint, users can choose from many accelerators to optimize model inference performance.
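As a rough idea of what such a deployment can look like outside the notebook, here is a hedged sketch using the Vertex AI SDK for Python (google-cloud-aiplatform). It is not the notebook's code. The image URI is a placeholder you must fill in yourself, the function names are hypothetical, the predict and health routes are assumptions that depend on the serving image, and the machine and accelerator defaults are just one plausible choice.

```python
from typing import List

# Placeholder, not a real URI: use the vLLM serving image referenced in the notebook.
VLLM_IMAGE_URI = "REPLACE_WITH_VLLM_IMAGE_URI"


def build_vllm_args(model_id: str, accelerator_count: int, port: int = 7080) -> List[str]:
    """Build command-line arguments for a vLLM server (hypothetical helper).

    Tensor parallelism is set to the number of accelerators so the model
    is sharded across all of them.
    """
    if accelerator_count < 1:
        raise ValueError("accelerator_count must be at least 1")
    return [
        "--host=0.0.0.0",
        f"--port={port}",
        f"--model={model_id}",
        f"--tensor-parallel-size={accelerator_count}",
    ]


def deploy_mistral(
    project: str,
    location: str,
    model_id: str = "mistralai/Mistral-7B-v0.1",
    machine_type: str = "g2-standard-8",   # assumed default
    accelerator_type: str = "NVIDIA_L4",   # assumed default
    accelerator_count: int = 1,
    predict_route: str = "/generate",      # assumption: depends on the image
    health_route: str = "/ping",           # assumption: depends on the image
):
    """Upload the serving container as a model and deploy it to a new endpoint."""
    # Imported here so the helper above can be used without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    model = aiplatform.Model.upload(
        display_name="mistral-7b-vllm",
        serving_container_image_uri=VLLM_IMAGE_URI,
        serving_container_args=build_vllm_args(model_id, accelerator_count),
        serving_container_ports=[7080],
        serving_container_predict_route=predict_route,
        serving_container_health_route=health_route,
    )
    # deploy() creates an endpoint when none is passed and returns it.
    return model.deploy(
        machine_type=machine_type,
        accelerator_type=accelerator_type,
        accelerator_count=accelerator_count,
    )
```

Changing `machine_type`, `accelerator_type`, and `accelerator_count` is where the choice of accelerator comes in. Valid combinations vary by region, so check availability before deploying.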
Leveraging Vertex AI model deployment, users can benefit from Vertex AI Model Registry, a central repository where they can manage the lifecycle of Mistral AI models and their own fine-tuned models. From the Model Registry, users will have an overview of their models so they can better organize, track, and train new versions. When there’s a model version they would like to deploy, they can assign it to an endpoint directly from the registry or, using aliases, deploy models to an endpoint.
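Deploying a registered model version by alias might look like the following sketch, again using the Vertex AI SDK for Python. The function names, the alias value in the usage note, and the machine settings are hypothetical, and the sketch assumes the model and endpoint already exist in your project.

```python
def versioned_model_name(model_id: str, version_or_alias: str) -> str:
    """Return a Model Registry reference of the form '<model_id>@<version or alias>'."""
    if not model_id or not version_or_alias:
        raise ValueError("model_id and version_or_alias must be non-empty")
    return f"{model_id}@{version_or_alias}"


def deploy_by_alias(
    project: str,
    location: str,
    model_id: str,
    alias: str,
    endpoint_name: str,
    machine_type: str = "g2-standard-8",   # assumed default
    accelerator_type: str = "NVIDIA_L4",   # assumed default
    accelerator_count: int = 1,
):
    """Deploy the model version behind `alias` to an existing endpoint."""
    # Imported here so the helper above can be used without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    # Look up the specific version that the alias currently points to.
    model = aiplatform.Model(model_name=versioned_model_name(model_id, alias))
    endpoint = aiplatform.Endpoint(endpoint_name=endpoint_name)
    model.deploy(
        endpoint=endpoint,
        machine_type=machine_type,
        accelerator_type=accelerator_type,
        accelerator_count=accelerator_count,
    )
    return endpoint
```

With an alias such as `production` (a hypothetical name), the calling code stays the same when a newer fine-tuned version is promoted. Only the alias assignment in the registry changes.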
Learn more about Mistral AI performance and features in their blog post. You can also see how other partners are leveraging generative AI on Google Cloud.