Vionlabs

Vionlabs strengthens its multimodal AI with Llama text models on Vertex AI

Google Cloud results
  • Accelerates deployments with Vertex AI
  • Maintains high feature velocity using Llama on Vertex AI
  • Achieves global scale with a lean team
  • Speeds integration of new tools and services thanks to the Google Cloud ecosystem
  • Optimizes costs with hosted APIs on Vertex AI

With Llama 3.1 on Vertex AI, Vionlabs uses multimodal AI to transform content discovery on streaming platforms.

Vionlabs workflow

Advancing audience engagement using multimodal AI

Vionlabs, a content intelligence company founded in Stockholm, Sweden, wants to transform how audiences engage with content. The company offers solutions to the global media and entertainment industry, helping streaming services, studios, and broadcasters understand and enrich their video libraries. Vionlabs uses AI to extract deep metadata (including moods, emotions, environments, plot details, and textual nuances) from audio, video, and text content. This intelligence then powers improved content discovery, recommendations, and editorial workflows.

Initially, Vionlabs focused on proprietary audio and video analysis. However, they soon identified a blind spot: understanding plot and textual nuances. Audio-visual models alone couldn't detect crucial plot details that were only revealed through dialogue, like when a character reveals in a voiceover that they were always the villain. To bridge this gap, Vionlabs recognized they needed to integrate text as a third modality into their content analysis toolset.

Rather than building their own text-based large language model, Vionlabs opted to use the Llama 3.1 405B and Llama 3.1 70B models hosted on Vertex AI.

Audio-visual models alone couldn’t detect crucial plot details—for instance, that a character is a ghost, if that fact is only revealed in dialogue within the transcript.

Marcus Bergström

Chief Executive Officer, Vionlabs

This decision rested on several factors: the models maintain high quality over a long context window; the model and its temperature parameter deliver consistent output; and Vertex AI’s hosted API is easy to integrate and cost-effective. Vionlabs also used Vertex AI to simplify launching and tracking new training jobs while integrating seamlessly with BigQuery, its database of choice.

Vionlabs’ strategic choice allowed for rapid implementation: it took only a few weeks, instead of the six to nine months typically required to train their own embedding models. (Embeddings are the numerical representations, or vectors, that enable AI systems to perform semantic search, natural language processing, and cross-modal applications, and that power recommendation systems.)
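
To make the role of embeddings concrete, here is a minimal Python sketch of semantic search using cosine similarity. It is an illustration only, not Vionlabs’ implementation: the three-dimensional vectors, the title names, and the `most_similar` helper are all hypothetical, and production embeddings typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings; real ones come from a trained model.
catalog = {
    "space_thriller": [0.9, 0.1, 0.2],
    "romantic_comedy": [0.1, 0.9, 0.3],
    "heist_drama": [0.8, 0.2, 0.4],
}


def most_similar(query, catalog, top_k=2):
    """Rank catalog titles by similarity to the query vector, best first."""
    scored = [
        (title, cosine_similarity(query, vector))
        for title, vector in catalog.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# A made-up query vector that sits close to the two suspense titles.
query = [1.0, 0.0, 0.1]
results = most_similar(query, catalog)
for title, score in results:
    print(f"{title}: {score:.3f}")
```

Titles whose vectors point in nearly the same direction as the query score close to 1, which is how an embedding-based system can surface related content without matching keywords.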

Editorial Lab dashboard

Llama on Vertex AI enables scalable, efficient services for Vionlabs’ clients

Using Llama on Vertex AI, we’re able to innovate rapidly and provide our clients with scalable new content-discovery services to power their businesses.

Marcus Bergström

Chief Executive Officer, Vionlabs

Today, Vionlabs provides its clients with several new services powered by Llama 3.1 on Vertex AI, including:

  • AI-generated, multilingual synopses: By fusing text data processed by Llama models with audio and video analysis, Vionlabs creates a multimodal embedding. This deep, standardized content intelligence is then fed back into Llama to automatically write both short- and long-form versions of narrative-style synopses in four languages (English, German, Spanish, and French). With this service, every title in a client’s library has a standardized taxonomy, which is critical for optimal search and discovery across their platforms.
  • Fully automated, editorial smart lists: Vionlabs uses the multimodal embedding for initial content clustering. Llama then refines these groups and generates list names, automating the previously manual and time-consuming process of curating content lists. This service helps clients organize massive libraries (e.g., 100,000 titles into 700 different lists) with compelling, auto-generated list names and descriptions. Clients benefit from more efficient content curation and the ability to instantly push these lists to their user interfaces for consumption.
  • Frame-level, automated trailer creation: Leveraging its deep frame-level understanding of content, Vionlabs automates the creation of preview clips and short trailers. This service performs scene analysis, tracks key characters and story arcs, and identifies different moods. Llama contributes short synopses and tags for these automated clips, enabling clients to create highly customized assets, such as a stitched montage of all the high-octane action scenes from a cluster of action movies. In this way, clients can quickly create highly engaging promotional content.
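
The fuse-then-cluster pattern behind the first two services can be sketched in a few lines of Python. This is a sketch under stated assumptions, not Vionlabs’ method: the article does not say how the modalities are fused or clustered, so the code assumes simple concatenation of normalized per-modality vectors and a greedy similarity threshold. The titles, vectors, threshold, and the `fuse`, `cluster`, and `naming_prompt` helpers are all hypothetical, and no model is actually called.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def fuse(text_vec, audio_vec, video_vec):
    """Assumed fusion: normalize each modality, concatenate, renormalize."""
    joined = (
        l2_normalize(text_vec) + l2_normalize(audio_vec) + l2_normalize(video_vec)
    )
    return l2_normalize(joined)


def cosine(a, b):
    """Cosine similarity for vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def cluster(embeddings, threshold=0.8):
    """Greedy grouping: join the first group whose seed title is similar enough."""
    groups = []
    for title, vec in embeddings.items():
        for group in groups:
            if cosine(vec, embeddings[group[0]]) >= threshold:
                group.append(title)
                break
        else:
            groups.append([title])  # no match, so this title seeds a new group
    return groups


def naming_prompt(titles):
    """Build a hypothetical prompt a text model could use to name a list."""
    return "Suggest a short list name for these titles: " + ", ".join(titles)


# Made-up (text, audio, video) vectors for three made-up titles.
raw = {
    "Fast Pursuit": ([1.0, 0.0], [1.0, 0.0], [1.0, 0.0]),
    "Night Chase": ([0.9, 0.1], [1.0, 0.0], [0.9, 0.1]),
    "Quiet Garden": ([0.0, 1.0], [0.0, 1.0], [0.0, 1.0]),
}
fused = {title: fuse(*vectors) for title, vectors in raw.items()}
groups = cluster(fused)
print(groups)  # [['Fast Pursuit', 'Night Chase'], ['Quiet Garden']]
print(naming_prompt(groups[0]))
```

In this arrangement the embedding does the coarse grouping, and the language model only sees a small group of titles at a time, which mirrors the division of labor the smart-lists service describes.
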

Creative Lab dashboard

Plotting an ambitious next chapter

We envision indexing the entire world of content down to a frame level, which we believe will be crucial for the next generation of high-quality, gen AI-produced content.

Marcus Bergström

Chief Executive Officer, Vionlabs

The Vionlabs team is now setting out to index the entire world of content down to a frame level. They see this as critical to empowering the next generation of video content creation.

Bergström and his team believe that by focusing on Vionlabs’ core strengths and leveraging open source AI models like Llama on Vertex AI, they can maintain extreme feature velocity while delivering high-quality output. This approach has enabled Vionlabs to scale their revenue without significantly increasing costs.

Says Bergström, "Our philosophy is to focus on the strategic elements of our business, do those things really well, and use available state-of-the-art models like Llama to complement our offering."

Vionlabs library

Founded in the heart of Stockholm in 2016, Vionlabs is now at the forefront of leveraging AI and data analytics to revolutionize content discovery and media optimization worldwide.

Industry: Technology

Location: Sweden

Products: Google Cloud, Vertex AI, BigQuery, Cloud Run, Kubernetes Engine, Dataflow, TensorFlow, Llama 3.1 405B and Llama 3.1 70B on Vertex AI


About Google Cloud partner — Llama

Llama, from technology company Meta, is a collection of base models, research tools, and programs that enable the next wave of innovation.


Google Cloud Partners
  • Meta