Unlock Web3 data with BigQuery and Subsquid

February 14, 2024
Marcel Fohrmann

Co-Founder Subsquid

Benjamin Richter

Customer Engineer, Google Cloud

Editor’s note: The post is part of a series showcasing partner solutions that are Built with BigQuery.

Blockchains generate a lot of data with every transaction. The beauty of Web3 is that all of that data is publicly available. But the multichain and modular expansion of the space has made that data harder to access: any project looking to build cross-chain decentralized apps (DApps) has to figure out how to tap into on-chain data that is stored in varying locations and formats.

Meanwhile, running indexers to extract the data and make it readable is a time-consuming, resource-intensive endeavor often beyond small Web3 teams’ capabilities, since coding smart contracts and building indexers are entirely different skills.

Recognizing how hard it is for builders to leverage one of the most valuable pieces of Web3 (its data!), the Subsquid team set out to build a fully decentralized solution that drastically increases permissionless access to data.

Subsquid explained

One way to think about the Subsquid Network is as Web3’s largest decentralized data lake, built to ingest, normalize, and structure data from over 100 Ethereum Virtual Machine (EVM) and non-EVM chains. It lets developers access (‘query’) data more granularly, and vastly more efficiently, than via legacy RPC node infrastructure.

Subsquid Network is horizontally scalable, meaning it can grow alongside archival blockchain data storage. Its query engine is optimized to extract large amounts of data, which suits both DApp development (indexing) and analytics. In fact, over $11 billion in decentralized application and L1/L2 value depends on Subsquid indexing.

Since September, Subsquid has been shifting from its initial architecture to a permissionless and decentralized format. So far during the testnet, 30,000 participants — including tens of thousands of developers — have built and deployed over 40,000 indexers. Now, the Subsquid team is determined to bring this user base and its data to Google BigQuery.

BigQuery and blockchain

BigQuery is a powerful enterprise data warehouse solution that allows companies and individuals to store and analyze petabytes of data. Designed for large-scale data analytics, BigQuery supports multi-cloud deployments and offers built-in machine learning capabilities, enabling data scientists to create ML models with simple SQL.

BigQuery is also fully integrated with Google's own suite of business intelligence tools as well as external tools, so users can run their own code against BigQuery using Jupyter Notebooks or Apache Zeppelin.

Since 2018, Google has added support for blockchains like Ethereum and Bitcoin to BigQuery. Then, earlier this year, the on-chain data of 11 additional layer-1 blockchain architectures was integrated into BigQuery, including Avalanche, Fantom, NEAR, Polkadot, and Tron.
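As a rough illustration of what querying these public datasets can look like, here is a minimal Python sketch that builds a daily transaction count query. The table path `bigquery-public-data.crypto_ethereum.transactions` and its `block_timestamp` column are assumptions about the public Ethereum dataset, and the helper names are made up for this example, so check the schema in the BigQuery console before relying on it.

```python
from datetime import date

# Assumption: the public Ethereum dataset is exposed at this path and has a
# `block_timestamp` column. Verify both before use.
ETHEREUM_TRANSACTIONS = "bigquery-public-data.crypto_ethereum.transactions"


def daily_transaction_count_sql(start: date, end: date,
                                table: str = ETHEREUM_TRANSACTIONS) -> str:
    """Build a query that counts transactions per day in [start, end)."""
    if start >= end:
        raise ValueError("start must be before end")
    return (
        "SELECT DATE(block_timestamp) AS day, COUNT(*) AS tx_count\n"
        f"FROM `{table}`\n"
        f"WHERE block_timestamp >= TIMESTAMP('{start.isoformat()}')\n"
        f"  AND block_timestamp < TIMESTAMP('{end.isoformat()}')\n"
        "GROUP BY day\n"
        "ORDER BY day"
    )


def run_query(sql: str):
    """Run the query. Requires the google-cloud-bigquery package and credentials."""
    # Imported lazily so the query builder above works without the dependency.
    from google.cloud import bigquery

    client = bigquery.Client()
    return list(client.query(sql).result())
```

Calling `daily_transaction_count_sql(date(2024, 1, 1), date(2024, 1, 8))` returns the SQL text, which can be pasted into the BigQuery console or passed to `run_query`. Restricting the time range keeps the amount of scanned data, and therefore the cost, down.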

But while it's great to be able to run analytics on public blockchain data, this might not always offer exactly the data a particular developer needs for their app. This is where Subsquid comes in.

Data superpowers for Web3 devs and analysts

Saving custom-curated data to BigQuery lets developers leverage Google's analytics tools to gain insights into how their product is used, beyond the context of one chain or platform.

Multi-chain projects can leverage Subsquid in combination with BigQuery to quickly analyze their usage on different chains and gain insights into fees, operating costs, and trends. With BigQuery, they aren't limited to on-chain data either. After all, Google is the company behind Google Analytics, an advanced analytics suite for website traffic.
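To make the multi-chain comparison concrete, the following Python sketch rolls up fees per chain from rows that an indexer might have written. The `IndexedTransaction` record and its `chain` and `fee_usd` fields are hypothetical and are not part of any Subsquid or BigQuery schema; in practice the same rollup would more likely be a `GROUP BY` in BigQuery itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedTransaction:
    chain: str      # hypothetical field: network the transaction ran on
    fee_usd: float  # hypothetical field: fee already converted to USD


def fee_summary(rows):
    """Return {chain: (tx_count, total_fee_usd, average_fee_usd)}."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for row in rows:
        counts[row.chain] += 1
        totals[row.chain] += row.fee_usd
    return {
        chain: (counts[chain], totals[chain], totals[chain] / counts[chain])
        for chain in counts
    }


if __name__ == "__main__":
    sample = [
        IndexedTransaction("ethereum", 1.5),
        IndexedTransaction("ethereum", 0.5),
        IndexedTransaction("polygon", 0.25),
    ]
    for chain, (count, total, average) in fee_summary(sample).items():
        print(f"{chain}: {count} txs, total ${total:.2f}, avg ${average:.2f}")
```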


Subsquid developer relations engineer Daria A. demonstrates how to store data indexed with Subsquid in BigQuery and other tools

Analyzing across domains by combining on-chain activity with social media data and website traffic can help projects understand retention and conversion, and identify the points where users drop off, so they can further improve their workflows.
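One simple way to look for drop-off points is a funnel calculation over the combined data. The Python sketch below assumes the user counts for each stage have already been computed elsewhere (for example, with queries over web analytics and indexed on-chain tables). The stage names and numbers are invented for illustration.

```python
def funnel_report(stages):
    """Given ordered (stage_name, user_count) pairs, return a list of
    (from_stage, to_stage, conversion_rate) tuples for each adjacent pair."""
    report = []
    for (prev_name, prev_count), (name, count) in zip(stages, stages[1:]):
        # Guard against division by zero when a stage has no users.
        rate = count / prev_count if prev_count else 0.0
        report.append((prev_name, name, rate))
    return report


def biggest_drop_off(stages):
    """Return the step with the lowest conversion rate."""
    return min(funnel_report(stages), key=lambda step: step[2])


if __name__ == "__main__":
    # Illustrative numbers only: website traffic followed by on-chain activity.
    stages = [
        ("site_visit", 1000),
        ("wallet_connected", 250),
        ("first_transaction", 200),
    ]
    for src, dst, rate in funnel_report(stages):
        print(f"{src} -> {dst}: {rate:.0%}")
    print("Largest drop-off:", biggest_drop_off(stages)[:2])
```

The step with the lowest conversion rate is a reasonable first place to investigate, although a low rate early in a funnel is often expected.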

“BigQuery is quickly becoming an essential product in Web3, as it enables builders to query and analyze one’s own data, as well as to leverage a rich collection of datasets already compiled by others. Since it's SQL based, it's extremely easy to explore any data and then run more and more complex queries. With a rich API and complete developer toolkit, it can be connected to virtually anything,” writes Dmitry Zhelezov, Subsquid CEO and co-founder.

“Now, with the addition of Subsquid indexing, Web3 developers literally have data superpowers. They can build a squid indexer from scratch or use an existing one to get exactly the data they need extremely efficiently. We can’t wait to see what this unlocks for builders.”

Get started with Subsquid on BigQuery today

Subsquid’s support for BigQuery is already feature-complete. Are you interested in incorporating this tool into your Web3 projects? Find out more in the documentation. You can also view an example project demoed on YouTube and open-sourced on GitHub.

The Built with BigQuery advantage for Data Providers and ISVs

Built with BigQuery helps companies like Subsquid build innovative applications with Google Data and AI Cloud. Participating companies can:

  • Accelerate product design and architecture through access to designated experts who can provide insight into key use cases, architectural patterns, and best practices.
  • Amplify success with joint marketing programs to drive awareness, generate demand, and increase adoption.

BigQuery gives Data Providers and ISVs the advantage of a powerful, highly scalable unified AI lakehouse that’s integrated with Google Cloud’s open, secure, sustainable platform. Click here to learn more about Built with BigQuery.
