Enhancing Google Cloud’s blockchain data offering with 11 new chains in BigQuery
James Tromans
Head of Web3
Alberto Martin
Director, Web3 Product Management
Early in 2018, Google Cloud worked with the community to democratize blockchain data via our BigQuery public datasets; in 2019, we expanded with six more datasets. Today, we’ve added eleven more of the most in-demand blockchains to the BigQuery public datasets, in preview. And we’re making improvements to existing datasets in the program, too.
We’re doing this because blockchain foundations, Web3 analytics firms, partners, developers, and customers tell us they want a more comprehensive view across the crypto landscape, and to be able to query more chains. They want to answer complex questions and verify subjective claims such as “How many NFTs were minted today across three specific chains?” “How do transaction fees compare across chains?” and “How many active wallets are on the top EVM chains?”
Having a more robust list of chains accessible via BigQuery and new ways to access data will help the Web3 community better answer these questions and others, without the overhead of operating nodes or maintaining an indexer. Customers can now query full on-chain transaction history off-chain to understand the flow of assets from one wallet to another, which tokens are most popular, and how users are interacting with smart contracts.
Chain expansion
Here are the 11 in-demand chains we’re adding into the BigQuery public datasets:
Avalanche
Arbitrum
Cronos
Ethereum (Görli)
Fantom (Opera)
Near
Optimism
Polkadot
Polygon Mainnet
Polygon Mumbai
Tron
We’re also improving the current Bitcoin BigQuery dataset by adding Satoshis (sats) / Ordinals to the open-source blockchain-ETL datasets for developers to query. Ordinals, in their simplest state, are a numbering scheme for sats.
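To make the numbering idea concrete, here is a minimal sketch of how ordinal theory assigns numbers to sats: they are counted in the order they are mined, so the first sat of a block is the sum of all earlier block subsidies. The function names are illustrative and are not part of the blockchain-ETL datasets. The halving interval and initial subsidy are Bitcoin's well-known consensus parameters.

```python
# Sketch of the ordinal numbering scheme: sats are numbered in the order mined.
# Function names are hypothetical; constants are Bitcoin consensus parameters.

SATS_PER_BTC = 100_000_000
HALVING_INTERVAL = 210_000              # blocks between subsidy halvings
INITIAL_SUBSIDY = 50 * SATS_PER_BTC     # sats created by each early block


def block_subsidy(height: int) -> int:
    """Sats newly created by the block at the given height."""
    return INITIAL_SUBSIDY >> (height // HALVING_INTERVAL)


def first_ordinal(height: int) -> int:
    """Ordinal number of the first sat mined in the block at `height`."""
    if height < 0:
        raise ValueError("height must be non-negative")
    total = 0
    full_epochs, remainder = divmod(height, HALVING_INTERVAL)
    # Add up every complete halving epoch before this block...
    for epoch in range(full_epochs):
        total += HALVING_INTERVAL * (INITIAL_SUBSIDY >> epoch)
    # ...then the blocks already mined in the current epoch.
    total += remainder * (INITIAL_SUBSIDY >> full_epochs)
    return total
```

Under this scheme the genesis block holds sats 0 through 4,999,999,999, and block 1 starts at 5,000,000,000.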
Google Cloud managed datasets
We want to provide users with a range of data options. In addition to community-managed datasets on BigQuery, we are creating first-party Google Cloud managed datasets that offer additional capabilities. For example, in addition to the existing Ethereum community dataset (crypto_ethereum), we created a Google Cloud managed Ethereum dataset (goog_blockchain_ethereum_mainnet.us) which offers a full representation of the data model native to Ethereum, with curated tables for events. Customers who are looking for richer analysis on Ethereum will be able to access derived data to easily query wallet balances, transactions related to specific tokens (ERC20, ERC721, ERC1155), or interactions with smart contracts.
We want to provide fast and reliable enterprise-grade results for our customers and the Web3 community. Here’s an example of a query against the goog_blockchain_ethereum_mainnet.us dataset:
Let’s say we want to know “How many ETH transactions are executed daily (last 7 days)?”
Running this query against both Ethereum datasets shows that the goog_ dataset is faster and consumes less slot time, while also remaining competitive in terms of bytes processed.
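The daily-transaction question could be expressed roughly as follows. This is a sketch under assumptions, not the exact query behind the comparison: the fully qualified table name and the block_timestamp column are assumed and should be checked against the dataset schema in the BigQuery console. The helper only builds the SQL text, so it runs without credentials.

```python
# Sketch: build a BigQuery SQL string counting transactions per day.
# The table name and the `block_timestamp` column are assumptions.

ASSUMED_TABLE = (
    "bigquery-public-data.goog_blockchain_ethereum_mainnet_us.transactions"
)


def daily_transaction_count_sql(table: str = ASSUMED_TABLE, days: int = 7) -> str:
    """Return SQL that counts transactions per day over the last `days` days."""
    if not isinstance(days, int) or days <= 0:
        raise ValueError("days must be a positive integer")
    return (
        "SELECT\n"
        "  DATE(block_timestamp) AS day,\n"
        "  COUNT(*) AS transactions\n"
        f"FROM `{table}`\n"
        "WHERE block_timestamp >= "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {days} DAY)\n"
        "GROUP BY day\n"
        "ORDER BY day"
    )


if __name__ == "__main__":
    print(daily_transaction_count_sql())
```

The printed SQL can be pasted into the BigQuery console or submitted through a BigQuery client library. Pointing the same helper at the community crypto_ethereum dataset would allow a side-by-side comparison, assuming its transactions table exposes a comparable timestamp column.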
More precise data
We gathered feedback from customers and developers to understand the community's pain points, and heard loud and clear that features such as numerical precision are important for accurately calculating the pricing of certain coins. We are improving the precision of the blockchain datasets by launching user-defined functions (UDFs) for better UINT256 integration and BIGNUMERIC¹ support. This will give customers access to more decimal digits for their blockchain data and reduce rounding errors in computation.
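To illustrate why precision matters, the sketch below converts a token balance from wei to ETH using double-precision floats and again using exact decimal arithmetic. It is a plain Python illustration of the rounding problem, not the BigQuery UDF itself. The balance is a made-up value and the function names are hypothetical.

```python
# Illustration of rounding error with on-chain integers (not the BigQuery UDF).
from decimal import Decimal, localcontext

WEI_PER_ETH = 10 ** 18


def wei_to_eth_float(wei: int) -> float:
    """Lossy conversion: a float keeps only about 15 to 17 significant digits."""
    return wei / WEI_PER_ETH


def wei_to_eth_exact(wei: int) -> Decimal:
    """Exact conversion using arbitrary-precision decimal arithmetic."""
    with localcontext() as ctx:
        ctx.prec = 80  # wide enough for a 78-digit uint256 value
        return Decimal(wei) / Decimal(WEI_PER_ETH)


# Hypothetical 28-digit balance in wei.
balance_wei = 1234567890123456789012345678

# Round-tripping through a float does not recover the original integer.
lossy_round_trip = int(wei_to_eth_float(balance_wei) * WEI_PER_ETH)

# The decimal conversion keeps every digit.
exact_eth = wei_to_eth_exact(balance_wei)
```

Here lossy_round_trip differs from balance_wei, while exact_eth multiplied back by 10^18 returns the original value. Wider numeric types address the same class of error on the warehouse side.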
Making on-chain data more accessible off-chain
Today, customers interested in blockchain data must first get access to the right nodes, then develop and maintain an indexer that transforms the data into a queryable data model. They then repeat this process for every protocol they’re interested in.
By leveraging our deep expertise in scalable data processing, we’re making on-chain data accessible off-chain for easier consumption and composability, enabling developers to access blockchain data without nodes. This means that customers can access blockchain data as easily as they would their own data. By joining chain data with application data, customers can get a complete picture of their users and their business.
Lastly, we have seen this data used in other end-user applications such as Looker and Google Sheets.
Building together
For the past five years, we have supported the community through our public blockchain dataset offering, and we will continue to build on these efforts with a range of data options and user choice — from community-owned to Google managed high-quality alternatives and real-time data. We’re excited to work with partners who want to distribute public data for developers or monetize datasets for curated feeds and insights. We’re also here to partner with startups and data providers who want to build cloud-native distribution and syndication channels unique to Web3.
You can get started with our free tier and usage-based pricing. To gain early access to the new chains available on BigQuery, contact web3-trustedtester-data@google.com.
1. 32 logical bytes