ShareChat: Building a scalable data-driven social network for non-English speakers globally

About ShareChat

ShareChat is the leading Indian social media platform that allows users to share their opinions, document their lives, and make new friends in their native language. On a mission to spearhead India's internet revolution, ShareChat is changing the way in which the next billion users will interact on the internet.

Industries: Media & Entertainment
Location: India

About CloudCover

Founded in 2015, CloudCover is an award-winning cloud service provider specializing in infrastructure and data migration. CloudCover is one of the first Google Cloud partners in Southeast Asia and was the 2017 Google Cloud APAC Services Partner of the Year. Headquartered in India, it has satellite sales and project management offices in Singapore and LA.

Since migrating to Google Cloud, ShareChat has improved performance, app development, and analytics for serving regional content to millions of users.

Google Cloud results

  • Automatically scales to nearly 7 billion web requests per day for a consistent user experience
  • Analyzes 70 terabytes of data to enrich customer experience and business performance with deeper insights
  • Improves CDN cache hit ratio to 98.5% to deliver a faster load time for users anywhere, anytime
  • Reduces cost by 30% with the robustness of Cloud Spanner to seamlessly replicate data in real time

Migrated 60 million users in five hours with no downtime

India is a multilingual country, with 22 major official languages and many more regional dialects spoken in rural areas. According to the Internet and Mobile Association of India, there are more than 227 million internet users in rural areas, and this number is expected to grow exponentially as internet access becomes more affordable.

Conceptualized, designed, and built in India, ShareChat brings Indians in rural areas, and all over the world, together on one platform. More than 160 million monthly active users share and view videos, images, GIFs, songs, and much more in 15 different Indian languages. In July 2020, ShareChat also launched a short video platform, Moj, which has already emerged as the leading short video platform in India, with over 80 million monthly active users.

Knowing that many new internet users don't know how to use search terms to find content, ShareChat simplifies content and people discovery by using a personalized content feed as its mobile app homepage. The company’s data science team uses machine learning models that detect the content's language and track user engagement to surface the right content to the right user.

Like many startups, ShareChat over-provisioned compute and storage to accommodate unpredictable traffic and avoid running out of storage. After raising $100 million in Series D funding, and most recently another $40 million in pre-Series E funding, ShareChat set out to grow its user base exponentially between 2019 and 2020. To achieve this, it knew that it needed a more efficient way to dynamically scale and allocate resources, and it turned to Google Cloud for a solution.

By moving to Google Cloud, today ShareChat uses just half the total core consumption of its legacy environment to run its existing workloads smoothly.

To support millions of users, ShareChat deploys and scales its app on Google Kubernetes Engine. To analyze 70 terabytes of data, ShareChat uses a combination of managed data analytics and database services: Pub/Sub for data pipelines, BigQuery for analytics of read-only data, Cloud Spanner for real-time app serving workloads, and Cloud Bigtable for less-indexed databases. ShareChat also uses Cloud CDN to distribute high-quality, low-latency content to users anytime, anywhere.

According to Bhanu Singh, co-founder and Chief Technology Officer at ShareChat, the top two differentiators are the Google Cloud ecosystem and its people. "Google is at the forefront of technology innovation. We trust Google Cloud to figure out challenges and solve them so that we don't have to switch between technologies."

"Google Cloud makes things very simple. It offers an efficient DevOps pipeline for releasing changes on Google Cloud,” says Venkatesh Ramaswamy, Vice-President of Engineering at ShareChat. “We can deploy bug fixes and new features such as our Moj app for short videos, without spending hours writing scripts for testing and deployment."

ShareChat credits its project's success to a smooth migration, working with Google Cloud partner CloudCover and Google Cloud Professional Services (PSO).

“Hats off to Google PSO and CloudCover for being available for us at any time,” says Venkatesh. “Our collaboration and synergy helped us drive towards a shared goal.”

Scaling apps to meet unexpected demand with a few clicks

At the heart of ShareChat's application is data. ShareChat relies on real-time data to drive its social network platform, tracking all user activity in the app, from chat messages and new groups created, to what people like and who they follow.

ShareChat users create more than a million posts every day, so its systems must process terabytes of data very efficiently. ShareChat migrated from a NoSQL database to Cloud Spanner, a relational database, for its global consistency and secondary indexing. Out of 220 database tables, ShareChat moved 120 tables with 17 indexes to Cloud Spanner, which supports transactional workloads, reducing cost by 30%. The ability of Cloud Spanner to seamlessly replicate data in multiple locations in real time also enables ShareChat to quickly retrieve a copy of any documents it requires, even if one region fails.

When traffic grew to 500% over the span of just a few days, the company scaled horizontally without changing a single line of code. This coincided with the launch of its Moj app, which ShareChat moved to another region within a week without disruption. With Cloud Spanner, the app handled all the extra load without manual intervention.

Venkatesh adds that the team faced some limitations when planning to scale with a NoSQL database. It would have had to rethink its existing table and schema definitions at a huge cost in developer time, something that Cloud Spanner did not require.

Cost was another factor: at ShareChat's scale, it would have become untenable. "We wanted a solution that would support us while we continue to grow and tackle use cases that may come up in the future. It was also important that we partnered with a technology-first company instead of a sales-driven one."

“ShareChat is a Made in India app that serves a global audience,” says Vishal Parpia, CEO and co-founder at CloudCover. “Depending on data requirements in different countries, the company may be required to store and process data locally. Cloud Spanner keeps the data systems in sync across multiple locations. That's something ShareChat could not do with its legacy system.”

"We worked closely with ShareChat to understand data usage and needs and then propose Google Cloud services to maximize performance for different workloads,” says Vishal. “Since the migration is not apples to apples, we had to refactor databases and build custom tools to audit and monitor database activity."

The company runs custom machine learning algorithms on top of the database to predict what each user wants to see in their feed. That analysis now happens on Cloud Bigtable and Cloud Spanner.

“Big-bang” migration with zero downtime for users

ShareChat assembled a dream team with the right skills and resources to deliver a seamless migration. Partnering with CloudCover, which has a proven track record for large-scale cloud migrations, ShareChat's engineers planned, tested, and executed the migration process together.

At the time of migration, ShareChat had already accumulated more than 70 terabytes of data, consisting of 220 tables. Some of these tables are 14 terabytes in size, with almost 50 billion rows.

ShareChat decided on a big-bang migration instead of a phased rollout. Because of data interdependencies, if services were moved one at a time, users might experience latency spikes from out-of-sync data, which would affect timely delivery. For example, if the data isn’t in sync, users might receive late notifications such as two-week-old event invitations. Errors like that could lead users to stop using ShareChat.

"Even though we’re migrating terabytes of data, our move shouldn’t impact any customers,” says Venkatesh. “Google Cloud gave us a lot of confidence that this could be achieved and that it would act as a partner working alongside us. This is something we didn’t feel with our previous cloud partners.”

ShareChat ran a proof-of-concept cluster over four months to confirm database performance in a real-world scenario, because the platform needs to handle more than one million queries per second. It used an open source API gateway to replicate all data from the legacy environment to Google Cloud for intensive performance tests and capacity analysis.

“Once we were confident that Google Cloud could handle the same, if not more traffic than our previous cloud environment, we moved ahead from planning to execution mode,” says Venkatesh.

“We didn’t want a quick-fix solution for migration, where the team would have to deal with code changes at a later stage. Google Cloud created a seamless route to reduce the technical debt of legacy code. Using wrappers, we could migrate to Google Cloud without changing anything in the existing application code,” he adds.

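The wrapper idea Venkatesh describes can be illustrated with a minimal sketch. This is not ShareChat's code: the `KeyValueStore` interface, both store classes, and the two application functions are hypothetical, in-memory stand-ins chosen only to show how application code written against a wrapper interface can run unchanged when the backend behind it is swapped.

```python
from typing import Dict, List, Optional, Protocol, Tuple


class KeyValueStore(Protocol):
    """Hypothetical wrapper interface that application code depends on."""

    def get(self, key: str) -> Optional[str]: ...

    def put(self, key: str, value: str) -> None: ...


class LegacyNoSqlStore:
    """In-memory stand-in for a legacy database client (assumption)."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._items.get(key)

    def put(self, key: str, value: str) -> None:
        self._items[key] = value


class NewRelationalStore:
    """In-memory stand-in for a new database client (assumption).

    Rows are kept as (key, value) tuples to mimic a different storage shape.
    """

    def __init__(self) -> None:
        self._rows: List[Tuple[str, str]] = []

    def get(self, key: str) -> Optional[str]:
        # The most recently written row for a key wins.
        for row_key, value in reversed(self._rows):
            if row_key == key:
                return value
        return None

    def put(self, key: str, value: str) -> None:
        self._rows.append((key, value))


def record_follow(store: KeyValueStore, follower: str, followee: str) -> None:
    """Application code: it only knows the wrapper interface."""
    store.put(f"follow:{follower}", followee)


def who_is_followed(store: KeyValueStore, follower: str) -> Optional[str]:
    """Application code: reads back through the same interface."""
    return store.get(f"follow:{follower}")


# The same application functions run unchanged against either backend.
results = []
for backend in (LegacyNoSqlStore(), NewRelationalStore()):
    record_follow(backend, "asha", "ravi")
    results.append(who_is_followed(backend, "asha"))

print(results)
```

The design choice is that only the classes behind the interface change during a migration, while functions such as `record_follow` stay as they are.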
“We successfully migrated more than 60 million users to Google Cloud in five hours without any data loss or downtime,” says Bhanu. “It's quite an achievement, considering all 30 of us on the migration team from ShareChat, Google, and CloudCover were working remotely due to the COVID-19 lockdown.” The number of users has since grown to 160 million, and Google Cloud continues to provide the support it needs.

Improving ShareChat’s ad offering for businesses

ShareChat is currently free to download, but the next phase of its journey is refining how it monetizes its service to deliver value to users as well as its investors. The app currently generates ad-driven revenue to support further app development. The company connects brands with content creators to build regional language campaigns.

Partners want to increase their brand experience with rural consumers on ShareChat through hyper-local content. The Business and Strategy team visualizes campaign performance metrics with BigQuery and Data Studio to help advertisers optimize marketing spend.

ShareChat also uses BigQuery and Data Studio to visualize resource usage captured in GKE usage metering, a feature of Google Kubernetes Engine. By merely tagging the services in BigQuery, ShareChat can share a cost breakdown for different departments to track over-provisioned resources and reduce waste.
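A per-department cost breakdown of this kind amounts to grouping usage records by a label and summing. The sketch below shows only that grouping step in plain Python. The record shape, the team labels, and the price per core-hour are all hypothetical; it does not reproduce the actual schema of the GKE usage metering export or any ShareChat query.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class UsageRecord(NamedTuple):
    """Hypothetical usage row: a team label plus CPU core-hours consumed."""

    team_label: str
    cpu_core_hours: float


# Hypothetical flat price, used only to turn usage into a cost figure.
PRICE_PER_CORE_HOUR = 0.5


def cost_by_team(records: Iterable[UsageRecord]) -> Dict[str, float]:
    """Sum the cost of all records that share the same team label."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        totals[record.team_label] += record.cpu_core_hours * PRICE_PER_CORE_HOUR
    return dict(totals)


sample_records = [
    UsageRecord("feed", 120.0),
    UsageRecord("notifications", 40.0),
    UsageRecord("feed", 80.0),
    UsageRecord("ads", 60.0),
]

breakdown = cost_by_team(sample_records)
for team, cost in sorted(breakdown.items()):
    print(f"{team}: {cost:.2f}")
```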

Simplifying DevOps with automation

On average, ShareChat handles a throughput of 80,000 requests per second (RPS), which is nearly 7 billion requests a day. Given the size of its subscriber base, push notifications such as the daily trending topic sent to users at 5 PM can result in a spike of 130,000 RPS in a matter of seconds.
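The daily total follows directly from the per-second rate. A short calculation, using only the two figures quoted above, makes the conversion and the size of the spike explicit:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def requests_per_day(requests_per_second: int) -> int:
    """Convert a sustained per-second rate into a daily request count."""
    return requests_per_second * SECONDS_PER_DAY


average_rps = 80_000   # average throughput quoted in the text
peak_rps = 130_000     # push-notification spike quoted in the text

daily_requests = requests_per_day(average_rps)
spike_factor = peak_rps / average_rps

print(f"{daily_requests:,} requests per day")       # 6,912,000,000
print(f"spike is {spike_factor:.3f}x the average")  # 1.625x
```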

“Instead of over-provisioning servers, we can pre-scale Google Kubernetes Engine for traffic spikes around scheduled events such as Diwali when millions of Indians send greetings,” says Venkatesh. “Migrating to a native Kubernetes environment dramatically enhances our ability to adopt agile ways of work such as automated deployment and to save time writing scripts.”
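Pre-scaling for a known event comes down to estimating how many replicas the expected peak needs before the traffic arrives. The following is a rough sizing sketch, not ShareChat's method: the per-pod capacity of 500 RPS and the 20% headroom are hypothetical values, and only the 80,000 and 130,000 RPS figures come from the article.

```python
def replicas_needed(expected_rps: int, rps_per_pod: int, headroom_pct: int = 20) -> int:
    """Return the replica count for an expected load plus a safety margin.

    Integer arithmetic with ceiling division avoids floating-point rounding.
    """
    if expected_rps < 0 or rps_per_pod <= 0 or headroom_pct < 0:
        raise ValueError("load and headroom must be non-negative, capacity positive")
    required = expected_rps * (100 + headroom_pct)
    capacity = rps_per_pod * 100
    return -(-required // capacity)  # ceiling division


RPS_PER_POD = 500  # hypothetical capacity of a single pod

baseline_replicas = replicas_needed(80_000, RPS_PER_POD)   # average load
spike_replicas = replicas_needed(130_000, RPS_PER_POD)     # scheduled spike

print(baseline_replicas, spike_replicas)
```

Under these assumed values, the result could be applied ahead of the event, for example by raising a workload's minimum replica count shortly before a scheduled notification.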

Kubernetes features such as sidecar proxies enable ShareChat to attach peripheral tasks such as logging to the application without making code changes. Unlike with the previous cloud provider, Google Kubernetes Engine takes care of Kubernetes upgrades by default. Clusters and nodes automatically upgrade to run the latest Kubernetes version to minimize security risks and use new features.

Millisecond latency for content delivery and real-time ML predictions

Latency is critical to ShareChat, as app performance outside of metro areas can be affected by low bandwidth and high latency. Users may get frustrated if the app loads slowly or messages are late.

Using Cloud CDN allows ShareChat to address questions such as: How can we be closer to our users to reduce latency in Tier 2 and Tier 3 cities? How can we make the app load faster and deliver messages on time?

ShareChat uses Cloud CDN to cache data in five Google Cloud points of presence (PoPs) at the edge in India, bringing the content as close to the user as possible to speed up load time. Since moving to Cloud CDN, ShareChat's cache hit ratio has improved from 90% to 98.5%, which means the cache can effectively handle 98.5% of content requests from users.
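The improvement is easier to appreciate from the origin's side, since every request the cache cannot serve falls through to it. The sketch below uses only the two hit ratios quoted above. The batch of 1,000 requests is an arbitrary illustrative number, not a ShareChat figure.

```python
def cache_misses(total_requests: int, hit_ratio_pct: float) -> float:
    """Return how many requests miss the cache and reach the origin."""
    return total_requests * (100.0 - hit_ratio_pct) / 100.0


SAMPLE_REQUESTS = 1_000  # arbitrary batch size for illustration

misses_before = cache_misses(SAMPLE_REQUESTS, 90.0)  # 10% miss rate
misses_after = cache_misses(SAMPLE_REQUESTS, 98.5)   # 1.5% miss rate
origin_reduction = 1.0 - misses_after / misses_before

print(f"misses per {SAMPLE_REQUESTS} requests: {misses_before:.0f} -> {misses_after:.0f}")
print(f"origin traffic reduced by about {origin_reduction:.0%}")
```

An 8.5 percentage point gain in hit ratio therefore corresponds to roughly 85% fewer requests reaching the origin.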

To make its global expansion a reality, ShareChat plans to use machine learning to reach new internet users with content in different languages.

“As we expand, we'll build new algorithms that process real-time datasets in regional languages and accurately predict what content users want to see. Google Cloud gives us an infrastructure optimized to handle such compute-intensive workloads for current and future growth,” says Bhanu.
