Processes 14 trillion+ tokens efficiently, enabling the training of the latest Falcon models
Supercharges LLM training with Cloud GPUs, enabling AI models capable of processing 1,000+ tokens per second
Enables granular control for maximum efficiency with Kueue, Cluster Director, and Dynamic Workload Scheduler
Delivers reliability at scale, ensuring uninterrupted training runs for even the largest LLMs
Accelerates go-to-market times by streamlining the environment to deliver robust, mature products ready for users
TII trains its open source large language models on Google Cloud GPUs and infrastructure for maximum efficiency at scale.
Google Cloud had the compute power and availability to cope with the demands of training large language models, but I think I've been most impressed with the technical expertise we found there. That was really important for building a healthy working relationship.
Dr. Hakim Hacid
Chief Researcher, Artificial Intelligence and Digital Science Research Center, Technology Innovation Institute
In just a few years, artificial intelligence has gone from being a conceit of science fiction to a fact of everyday life, spreading through every level of industry and society.
The Artificial Intelligence and Digital Science Research Center (AIRC) is one of nine research centers at Abu Dhabi's Technology Innovation Institute (TII) employing teams of scientists, researchers, and engineers to push the boundaries of AI and machine learning.
Although the AIRC works at the bleeding edge of technology, its focus is on solutions that are accessible and useful to a broad community. The center has made its mark in recent years with the highly praised Falcon family of large language models (LLMs), which it has open-sourced for use across the world.

"We are a government entity, but we're not just targeting users in the UAE government," says Dr. Hakim Hacid, Chief Researcher at the AIRC. "The Falcon LLMs have given us a global presence with a global impact."
In 2023, TII released the first iteration of its Falcon LLM. Since then, the Falcon family has expanded to cover a wide range of use cases, from text and image to audio and video, with models adaptable to hardware ranging from powerful supercomputers to personal laptops.
As work began on the next generation of Falcon models, TII was looking for a new cloud provider that could meet the massive scale of LLM training while also delivering efficiency and reliability. In 2024, the institute began training those models on Google Cloud.
From the start of the Falcon program, TII has pursued a multi-cloud strategy to ensure resilience and independence. But training one of the world's most popular, feature-rich LLMs requires the very best in modern infrastructure.
"Google Cloud had the compute power and availability to cope with the demands of training large language models," says Dr. Hacid, "but I think I've been most impressed with the technical expertise we found there. That was really important for building a healthy working relationship."

Working closely with experts from Google Cloud, Dr. Hacid and his team designed an infrastructure purpose-built for training TII's AI models. Using AI Hypercomputer, the institute customized infrastructure and hardware to match the specific architecture of its models.
Traditionally, TII has used Graphics Processing Units (GPUs) to train its models. For these, the institute deploys Cloud GPUs on A3 Mega Google Kubernetes Engine clusters designed to handle large-scale training. Managed Lustre delivers high-bandwidth, low-latency storage for seamless performance.
Accessing high-powered tools is one thing; managing them is quite another. With Kueue, the open source job-queuing system for Kubernetes, and Cluster Director, TII has granular control over its clusters, ensuring that it uses as much of its available GPU processing power as possible. GPU utilization rates are double what they would have been with a managed service.
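As a rough illustration of how that queuing works, Kueue routes a Kubernetes Job to a queue via a label and holds it suspended until GPU quota is free. Below is a minimal sketch in Python that builds such a manifest; the queue name, image, and GPU count are hypothetical placeholders, not TII's actual configuration.

```python
def make_training_job(name: str, queue: str, image: str, gpus: int) -> dict:
    """Build a Kubernetes Job manifest that Kueue will hold until quota frees up.

    All names and counts here are illustrative, not TII's real setup.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "name": name,
            # Kueue watches this label and admits the Job only when the
            # target LocalQueue has capacity, keeping GPUs fully packed.
            "labels": {"kueue.x-k8s.io/queue-name": queue},
        },
        "spec": {
            "suspend": True,  # jobs start suspended; Kueue unsuspends them
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "trainer",
                        "image": image,
                        "resources": {"limits": {"nvidia.com/gpu": str(gpus)}},
                    }],
                }
            },
        },
    }

job = make_training_job("falcon-pretrain", "gpu-queue", "trainer:latest", 8)
print(job["metadata"]["labels"]["kueue.x-k8s.io/queue-name"])  # → gpu-queue
```

In practice such a manifest would be applied with `kubectl` or a Kubernetes client library; the key point is that the queue label, not the submitter, decides when GPUs are allocated.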
I very much appreciate the way Google Cloud meets our support needs. Whenever we run into an issue with the training environment or have to change the codebase, they are there to help at any time. Google Cloud has helped us to unlock some serious efficiency gains.
Dr. Hakim Hacid
Chief Researcher, Artificial Intelligence and Digital Science Research Center, Technology Innovation Institute

Meanwhile, Dynamic Workload Scheduler enables TII to be as cost-efficient as possible by planning its heaviest workloads for short, specific periods of time, without over-committing nodes that would otherwise sit idle for months on end.
TII has also started leveraging A3 Ultra VMs, powered by NVIDIA H200 GPUs, to accelerate its AI infrastructure workloads.
Adopting Google Cloud meant adapting to a new architecture and codebase, but with close support from Google Cloud, the team transitioned quickly. "I very much appreciate the way Google Cloud meets our support needs," says Dr. Hacid. "Whenever we run into an issue with the training environment or have to change the codebase, they are there to help at any time. Google Cloud has helped us to unlock some serious efficiency gains."

We work in a challenging field that is moving extremely fast, but Google Cloud has been a capable and flexible partner to us. We've had a very positive and productive relationship and we're excited to continue that collaboration.
Dr. Hakim Hacid
Chief Researcher, Artificial Intelligence and Digital Science Research Center, Technology Innovation Institute
Working with Google Cloud has given TII the tools to take the Falcon program to the next level. Dr. Hacid and his team have been able to speed up training runs dramatically, and the AI models are showing excellent results. One way of gauging a model's performance is to measure its throughput: the number of tokens it can process, as input and output, per second. With Cloud GPUs, TII's models have been measured processing more than a thousand tokens per second.
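The throughput metric above is simply tokens processed divided by wall-clock time. A minimal sketch, with illustrative numbers rather than TII's benchmarks:

```python
def tokens_per_second(total_tokens: int, elapsed_seconds: float) -> float:
    """Throughput: tokens processed (input + output) per wall-clock second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return total_tokens / elapsed_seconds

# Illustrative figures only: 120,000 tokens handled in 100 seconds.
print(tokens_per_second(120_000, 100.0))  # → 1200.0
```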
The setup also brings reliability at scale, so even the biggest training jobs can run without interruption. By fine-tuning its environment on Google Cloud, the institute has been able to cut the time it takes to bring new Falcon models to market, releasing products that are stable, mature, and ready for real-world use.
Equally important is the reach that this infrastructure makes possible.
Thanks to the ubiquity of Google Cloud and its expertise in AI, TII can share its models with the wider world, releasing open source tools that are robust, reliable, and accessible to anyone who wants to use them.
Looking ahead, TII's ambitions go beyond AI alone. Building on the experience of running Falcon on Google Cloud, the institute is preparing to expand its use of the infrastructure into other research areas, including security, quantum computing, and robotics.
"We work in a challenging field that is moving extremely fast, but Google Cloud has been a capable and flexible partner to us," says Dr. Hacid. "We've had a very positive and productive relationship and we're excited to continue that collaboration."

The Technology Innovation Institute is a leading research organization that focuses on developing cutting-edge technologies to solve real-world problems.
Industry: Government and Public Sector
Location: United Arab Emirates
Products: AI Hypercomputer, Cloud GPUs, Cluster Director, Google Kubernetes Engine, Managed Lustre