Pinecone: Powering the AI revolution with fast, accurate vector search
About Pinecone
With its market-leading vector database, Pinecone is a critical component of the generative AI market. The fast-growing startup has offices in New York and Tel Aviv, and recently raised $100m in Series B funding, bringing its valuation to $750m.
Tell us your challenge. We're here to help.
Contact us
About DoiT
DoiT provides businesses with technology and cloud expertise to reduce cloud costs and boost engineer productivity.
Pinecone turned to Google Cloud to help it expand, and gained a responsive, scalable architecture capable of providing its rapidly growing customer base with fast and scalable vector search.
Google Cloud results
- Stores hundreds of billions of vectors, allowing customers to scale up databases to billions of items
- Serves millions of queries per second at ultra-low latency with Google Kubernetes Engine and Google Cloud data solutions
- Scales from hundreds to hundreds of thousands of users in months, with Google Cloud technical support
- No compromise between high availability and high consistency with Cloud Spanner
GenAI market leader scales rapidly with Google Cloud
The world is in the grip of an AI revolution. Generative AI, semantic search, and large language models are transforming entire industries, with companies hurrying to put their own AI solutions into production, for fear of being left behind.
At the heart of this technological arms race is perhaps the most important innovation you've never heard of: vector embeddings. Put simply, vector embeddings are a way to transform natural language, from single words to whole paragraphs of descriptive text, into strings of numbers that a computer can understand. The computer maps these strings of numbers (or vector embeddings) into a high-dimensional vector space, where the semantic similarity of two or more real-world concepts can be quantified by how close their vector embeddings are to each other. That ability for computers to measure the similarity between real-world objects and concepts underpins the Retrieval-Augmented Generation (RAG) architecture of generative AI applications, from chatbots to AI assistants to autonomous agents. Without the ability to search through vector space for clusters of similar objects and concepts, none of these AI solutions would be possible.
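The idea can be illustrated with a minimal sketch. The tiny four-dimensional vectors below are made up for illustration; production embedding models output hundreds or thousands of dimensions, but the similarity measure works the same way:

```python
import math

def cosine_similarity(a, b):
    """Similarity of two embedding vectors: 1.0 means same direction, near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up toy embeddings for three concepts.
king = [0.9, 0.8, 0.1, 0.2]
queen = [0.85, 0.82, 0.15, 0.25]
banana = [0.1, 0.05, 0.9, 0.8]

# Concepts that are semantically related sit closer together in vector space.
print(cosine_similarity(king, queen) > cosine_similarity(king, banana))  # True
```

Vector search is then the problem of finding, among billions of stored embeddings, the ones closest to a query embedding, and doing so in milliseconds.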
Pinecone is a market leader in vector search. Built on Google Cloud, its fully managed, developer-friendly, and easily scalable vector database enables any company to build accurate, secure, and reliable generative AI applications. By storing their vector embeddings in the Pinecone database, customers get fast, highly relevant vector search to retrieve the most relevant context from private data and securely include it in prompts to their LLMs, while Pinecone takes care of all the scaling, monitoring, troubleshooting, and security.
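The retrieval step of a RAG application follows an upsert-then-query pattern. The sketch below does not show Pinecone's actual client API; it is a hypothetical in-memory stand-in, with made-up document IDs and toy three-dimensional vectors, that illustrates the pattern:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

class ToyVectorIndex:
    """In-memory stand-in for a managed vector database, for illustration only."""

    def __init__(self):
        self._items = {}  # id -> (vector, metadata)

    def upsert(self, items):
        # Insert or overwrite each (id, vector, metadata) record.
        for item_id, vector, metadata in items:
            self._items[item_id] = (vector, metadata)

    def query(self, vector, top_k=1):
        # Score every stored vector against the query and return the closest matches.
        scored = [
            (cosine(vector, v), item_id, meta)
            for item_id, (v, meta) in self._items.items()
        ]
        scored.sort(key=lambda t: t[0], reverse=True)
        return scored[:top_k]

index = ToyVectorIndex()
index.upsert([
    ("doc-1", [0.9, 0.1, 0.0], {"text": "Refund policy: 30 days."}),
    ("doc-2", [0.1, 0.9, 0.0], {"text": "Shipping takes 3-5 days."}),
])

# A query embedding close to doc-1 retrieves the refund document, whose text
# would then be included as context in the prompt sent to the LLM.
matches = index.query([0.88, 0.12, 0.0], top_k=1)
print(matches[0][1])  # doc-1
```

A managed service replaces the linear scan above with approximate nearest-neighbor indexes so the same query stays fast across billions of vectors.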
Searching for a cloud provider to scale at speed
Pinecone's growth curve has been steep. Having launched its first commercial product in 2021, the company saw customer usage grow beyond expectations in 2022, then exponentially in 2023 with the rise of generative AI and soaring demand for vector databases. During its early development, Pinecone ran on a third-party cloud provider, but with each customer potentially bringing billions of data entries to the database, the company needed to be sure its cloud platform could support such rapid growth.
"As we started launching the product and going into production, the ability to work at scale was super important for us. That, combined with the responsiveness of Google Cloud compared to other options, was enough to persuade us to make the move."
—Lior Ehrenfeld, VP of Finance and Ops, Pinecone

"As we started launching the product and going into production, the ability to work at scale was super important for us," explains Lior Ehrenfeld, VP of Finance and Ops at Pinecone. "That, combined with the responsiveness of Google Cloud compared to other options, was enough to persuade us to make the move."
Gaining specialist knowledge from local product teams
Pinecone migrated its infrastructure to Google Cloud in 2022 and worked closely with product and engineering teams within Google Cloud to ensure the implementation was as effective as possible.
"The work with the Google Cloud team has been fantastic," says Ehrenfeld. "We got really close support from the local team here in Israel. The availability and knowledge they brought to the development process gave us a highly technical understanding of the specific solutions we were implementing, and how to embed them into Pinecone for the best results."
A responsive infrastructure for fast and accurate vector search
With its vector database running on Google Kubernetes Engine, Pinecone can now store hundreds of billions of vectors and serve millions of queries per second at ultra-low latency, allowing its customers to scale their databases up to billions of items while paying only for what they use. As a result, Pinecone provides an accessible, fully managed service to engineering teams of all sizes, while offering its customers the fast and accurate vector search needed to power state-of-the-art AI solutions.
When it comes to storing and managing its data, Pinecone uses a combination of Google Cloud data solutions, giving its vector database the consistency that its customers depend on to deliver accurate, reliable vector search results for its AI products.
"Google Cloud solutions around data in general are top notch," explains Roy Miara, Engineering Manager, Generative AI at Pinecone. "Google makes better databases and better data warehouses as an end-to-end solution than any other provider. So we're using databases like BigQuery, Cloud SQL, and Firebase both to operate our product and for analytics and management of data."
No need to compromise on performance or consistency with Cloud Spanner
As the company's rapid growth continues, Pinecone is continually improving its architecture to ensure it is able to efficiently store and process an ever-growing volume of data. To that end, Pinecone has begun using Cloud Spanner to separate its cloud storage and compute, giving it increased availability, and allowing it to scale its product more quickly. This enables its customers to work on a larger scale, with reduced costs and improved latency.
By guaranteeing the external consistency of data within its databases, Cloud Spanner ensures that all data is correct, regardless of how it is accessed or updated. And because its automated data management techniques uphold that consistency at any scale, database operators no longer need to choose between performance and consistency as they grow. Pinecone's customers can therefore operate at virtually any scale, safe in the knowledge that their data will remain consistent and synchronized, no matter how many times entries are edited.
"Spanner has a unique way to ensure high consistency and high availability, meaning we can operate in more regions, scale out our product, and still have many customers reading and writing at the same time. All while retaining consistent data within Pinecone."
—Roy Miara, Engineering Manager, Generative AI, Pinecone

What's more, with its automated maintenance, Cloud Spanner liberates Pinecone's engineering teams from menial maintenance tasks and infrastructure concerns, allowing them to focus on building their products instead.
"Spanner has a unique way to ensure high consistency and high availability," says Miara, "meaning we can operate in more regions, scale out our product, and still have many customers reading and writing at the same time — all while retaining consistent data within Pinecone. Having those pieces together allows us to build and scale products more quickly, reach more users, and focus on what we do best, which is vector search."
Powering the future of AI with the Google Cloud team
Along with the specific Google Cloud solutions, Pinecone credits the Google Cloud team with playing a critical role in helping the company scale up at such a rapid rate. In 2023, Pinecone saw a spike in the number of users of its free tier from a few hundred to a few hundred thousand, meaning it needed to quickly spin up a lot of new environments in different regions to ensure there was no interruption to its service.
"The availability of the Google Cloud team, and the advice they gave us around where to spin up environments to give us the machines to handle this huge spike in demand, was paramount," explains Ehrenfeld, "not only for Pinecone but for generative AI developers across the world. With their advice, the Google Cloud team supported hundreds of thousands of Pinecone customers simultaneously building their generative AI applications."
"Pinecone has already contributed a great deal to the AI space. We've seen that from how quickly we have grown so far. Having Google Cloud at our side, helping us to scale up our product, makes us confident that we will continue to play a central role in AI's future."
—Roy Miara, Engineering Manager, Generative AI, Pinecone

For Pinecone, that relationship will prove critical as the company continues to scale, providing fast and accurate vector search to a rapidly expanding user base and helping to drive the AI revolution.
"Pinecone has already contributed a great deal to the AI space. We've seen that from how quickly we have grown so far," says Ehrenfeld. "Having Google Cloud at our side, helping us to scale up our product, makes us confident that we will continue to play a central role in AI's future."