How to optimize your network for live video on Google Cloud
Like so many industries impacted by the global pandemic, the media and entertainment industry was forced to quickly create ad-hoc solutions to help broadcasters “keep the show on the air.” This caused seismic shifts in media production, distribution, and consumption, which accelerated trends like virtual work that were already underway and are now likely permanent. Google Cloud can be a key enabler in the long-term evolution of live TV supply chains.
In 2020, the internet also faced unprecedented demand. Internet Exchanges (IXPs) recorded net increases of up to 60% in total bandwidth handled per country during Q1 2020. Google’s unique global fiber optic network and approach to cloud provide highly differentiated capabilities for media supply chains that can isolate broadcasters from potential bandwidth bottlenecks.
In line with Google’s philosophy of creating an open platform and making it easy for our partners to work with us, Google has also created a comprehensive partnership ecosystem with some of the best-known media technology companies.
This blog post is one of the first in a series from the Google Cloud teams that work closely with media customers and partners every day. In this installment, we share best practices for network setup and configuration, which is crucial for high-quality video broadcasts.
1. Understanding and Calibrating your Network and VMs
Broadcast distribution of live video requires highly consistent network performance. The following considerations are important factors in the production and distribution of a video stream:
Latency (the time taken to transmit a packet from Point A → Point B)
Jitter (the latency’s variance over time)
Packet drops (the number of packets lost from Point A → Point B)
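As a rough illustration of the three metrics above, here is a minimal Python sketch that summarizes one run of probe packets. The function name, the sample values, and the choice to report jitter as the mean absolute change between consecutive latency samples are assumptions made for this example; measurement tools may define jitter differently.

```python
from statistics import mean


def summarize_probes(latencies_ms, probes_sent):
    """Summarize latency, jitter, and packet loss for one probe run.

    latencies_ms: latency (ms) of each probe that arrived, in arrival order.
    probes_sent:  total number of probes transmitted.
    """
    if probes_sent <= 0:
        raise ValueError("probes_sent must be positive")
    if len(latencies_ms) > probes_sent:
        raise ValueError("more replies than probes sent")

    loss_ratio = (probes_sent - len(latencies_ms)) / probes_sent
    if not latencies_ms:
        return {"latency_ms": None, "jitter_ms": None, "loss_ratio": loss_ratio}

    # Jitter as the mean absolute change between consecutive samples
    # (an assumed definition; other tools report standard deviation).
    deltas = [abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])]
    return {
        "latency_ms": mean(latencies_ms),
        "jitter_ms": mean(deltas) if deltas else 0.0,
        "loss_ratio": loss_ratio,
    }


# Hypothetical sample: five probes sent, four answered.
baseline = summarize_probes([20.0, 22.0, 21.0, 25.0], probes_sent=5)
print(baseline)
```

Running the same summary at different times of day and between different endpoints gives a baseline to compare against after each tuning change.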
Here is how to understand and calibrate your network and VMs:
Network Baseline: Understand your current network’s performance level of latency, jitter and packet drops.
Calibrate: Adjust your cloud transmission endpoints to compensate for these artifacts by:
Adjusting your lossless overlay protocol with the correct amount of error correction and redundancy to manage packet drops, especially over large distances. Opt for a suitable media overlay transport protocol such as Secure Reliable Transport (SRT), Zixi, or Reliable Internet Stream Transport (RIST)/SMPTE 2022-7.
Measuring latency and jitter, which vary with distance and with the number of intermediate processing and transit steps, and adjusting the receiving app/VM network buffers as needed.
Benchmark VMs: Optimizing VM sizes and tuning OS settings (in a Linux environment) have a direct impact on video transport performance. These include:
Changing the size of the guest OS receive buffer.
Changing to a higher performance (CPU/RAM) machine type if your VMs used for media transport (responsible for ingress/egress traffic) run at greater than 50% sustained CPU utilization. It’s best to leave the extra headroom to account for the inevitable temporary spikes in CPU utilization due to workload/network jitter inherent in any network.
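The buffer and CPU guidance above can be turned into simple arithmetic. The sketch below is an assumption-laden illustration, not a tuning recipe: the function names, the safety factor of 2, and the use of an average over a sampling window as the meaning of "sustained" are all choices made for this example. On Linux, the ceiling for a socket receive buffer is governed by the net.core.rmem_max sysctl, so the computed value is a candidate for that setting.

```python
import math


def receive_buffer_bytes(bitrate_bps, buffer_window_ms, safety_factor=2.0):
    """Bytes needed to hold `buffer_window_ms` of a stream at `bitrate_bps`.

    safety_factor is a hypothetical margin on top of the raw requirement.
    """
    if bitrate_bps <= 0 or buffer_window_ms <= 0:
        raise ValueError("bitrate and window must be positive")
    if safety_factor < 1:
        raise ValueError("safety_factor must be at least 1")
    # 8000 = 8 bits per byte * 1000 milliseconds per second.
    return math.ceil(bitrate_bps * buffer_window_ms * safety_factor / 8000)


def needs_larger_machine(cpu_samples, threshold=0.5):
    """True when average CPU utilization over the window exceeds `threshold`.

    cpu_samples: utilization readings in the range 0.0 to 1.0.
    """
    if not cpu_samples:
        raise ValueError("cpu_samples must not be empty")
    return sum(cpu_samples) / len(cpu_samples) > threshold


# Hypothetical stream: 20 Mbps with a 120 ms buffering window.
print(receive_buffer_bytes(20_000_000, 120))
print(needs_larger_machine([0.55, 0.62, 0.58]))
```

The buffering window itself should come from your measured latency and jitter rather than from a fixed default.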
A blanket high level of error correction, buffering, and redundancy in your transport protocol is wasteful and can significantly increase network traffic and CPU overhead. Google Cloud’s network allows you to create systems with lower latency and jitter in two ways:
Google’s global fiber optic network directly connects different continents and regions over a dedicated backbone. Therefore, all regions are within a single network hop of each other, not encumbered by extraneous network hops or third-party transit agreements.
We published PerfKit Benchmarker to provide you with visibility and further understanding of jitter and latency in your architecture.
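To see why a blanket level of redundancy is wasteful, it can help to compare the extra traffic of two protection styles. The sketch below is a simplified model under stated assumptions: packet losses are independent, retransmissions are retried until delivered, and the function names are made up for this example. It does not describe how SRT, Zixi, or RIST actually schedule retransmissions.

```python
def arq_overhead(loss_rate):
    """Extra traffic, as a fraction of the stream, for retransmit-on-loss.

    With independent loss probability p and unlimited retries, each packet
    needs 1 / (1 - p) transmissions on average, so the overhead is p / (1 - p).
    """
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be in [0, 1)")
    return loss_rate / (1 - loss_rate)


def duplication_overhead(copies):
    """Extra traffic when every packet is always sent `copies` times."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return float(copies - 1)


def extra_mbps(bitrate_mbps, overhead):
    """Convert an overhead fraction into additional megabits per second."""
    return bitrate_mbps * overhead


# Hypothetical 10 Mbps stream on a path with 1% measured packet loss.
print(extra_mbps(10, arq_overhead(0.01)))      # retransmit only what is lost
print(extra_mbps(10, duplication_overhead(2)))  # always send two copies
```

Under these assumptions, always sending a second copy doubles the traffic regardless of how clean the path is, while loss-driven recovery scales with the loss you actually measure.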
Details of setting up and executing a thorough test will be provided in an upcoming blog post. In the meantime, you can refer to this prior blog post about general network measurement and instrumentation on Google Cloud.
2. Ingest into Cloud
You can get your raw media stream into Google Cloud over the public internet or over interconnect, with your business requirements determining the most appropriate ingest method. In either case, the use of a lossless protocol like SRT is recommended.
When using the public internet, you’ll likely use either TCP or UDP with a lossless protocol overlay. Generally, UDP with a lossless overlay (such as SRT) is recommended; alternatively, you can transmit your signal over a VPN from on-premises to Google Cloud. If you use a secure transport like SRT, the need for a VPN is reduced, but other protocols without built-in security might still require a VPN.
The Google Cloud VPN is not a VM-based single point of failure. Instead, it’s a regional scale-out service that provides up to 3 Gbps of bandwidth per tunnel. Additional tunnels can be set up for greater bandwidth, and the VPN is available in HA configurations that offer 99.99% service availability. The Google Cloud VPN uses the premium network tier. You can also be notified proactively of over-utilized VPN tunnels before they become a bottleneck, which helps prevent packet loss and increases the resiliency of your system.
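The idea behind a lossless overlay on UDP is that the receiver notices gaps in the packet sequence and asks the sender to resend only what is missing. The sketch below shows just the gap-detection step. It is a conceptual illustration with a hypothetical function name, it ignores sequence-number wraparound and timing, and it is not SRT's actual implementation.

```python
def missing_sequences(received, first, last):
    """Return sequence numbers in [first, last] that never arrived.

    received: sequence numbers seen so far, in any order, possibly with
              duplicates (UDP may reorder or duplicate packets).
    """
    if last < first:
        raise ValueError("last must not be smaller than first")
    seen = set(received)
    return [seq for seq in range(first, last + 1) if seq not in seen]


# Hypothetical arrival order with reordering, a duplicate, and three gaps.
arrived = [5, 1, 2, 2, 4, 8]
print(missing_sequences(arrived, first=1, last=8))
```

A real protocol would send the resulting list back to the sender as a retransmission request and drop packets that cannot be recovered within the configured latency window.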
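For capacity planning, the per-tunnel figure above translates into a quick estimate of how many tunnels a set of streams needs. This is a back-of-the-envelope sketch: the function name and the 50% headroom default are assumptions for this example, and the result does not account for the extra tunnels an HA configuration requires.

```python
import math

# Per-tunnel ceiling quoted in the text above (up to 3 Gbps), in Mbps.
TUNNEL_CAPACITY_MBPS = 3000


def tunnels_needed(stream_bitrates_mbps, headroom=0.5):
    """Estimate the number of VPN tunnels for a set of streams.

    headroom: fraction of each tunnel deliberately left unused
              (a hypothetical planning margin, not a Google recommendation).
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    total_mbps = sum(stream_bitrates_mbps)
    usable_per_tunnel = TUNNEL_CAPACITY_MBPS * (1 - headroom)
    return max(1, math.ceil(total_mbps / usable_per_tunnel))


# Hypothetical workload: forty contribution feeds at 100 Mbps each.
print(tunnels_needed([100] * 40))
```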
When not using a VPN, we recommend using Google Premium Network tier for public internet ingestion so that traffic from your source enters Google’s network from the closest point of presence to that source.
For higher throughput streaming, especially for UDP/RTP based ingest methods, a dedicated connection (Dedicated Interconnect or Partner Interconnect via a service provider) is more common. When choosing an interconnect type, consider your connection requirements, such as the connection location and required capacity. Both types of interconnect can be configured with redundancy to achieve a 99.99% SLA. Visit Google’s peering site to get started, and read more about Google Cloud’s interconnect best practices.
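When weighing a 99.99% target against your broadcast requirements, it can be useful to convert the percentage into a downtime budget. The sketch below does only that arithmetic. The function name and the 30-day default period are assumptions for this example; consult the SLA terms for how availability is actually measured.

```python
def allowed_downtime_minutes(availability_pct, period_days=30):
    """Minutes of downtime permitted by an availability target over a period."""
    if not 0 < availability_pct <= 100:
        raise ValueError("availability_pct must be in (0, 100]")
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    minutes_in_period = period_days * 24 * 60
    return (1 - availability_pct / 100) * minutes_in_period


# Budget for a 99.99% target over 30 days and over a 365-day year.
print(round(allowed_downtime_minutes(99.99), 2))
print(round(allowed_downtime_minutes(99.99, period_days=365), 2))
```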
3. Distribution from Cloud
Today’s broadcasters and media companies have two main distribution needs: first, sending linear channels/streams to traditional MVPDs, partners, and operators; and second, sending VOD/live traffic directly to end consumers for viewing via applications and smart TVs.
Distribution to traditional MVPDs, Partners, and Operators
Google’s global network is a differentiated offering that provides distributors a quantum leap in cloud-based media transmission capability across three key areas: reach, reliability, and performance.
Reach: the single global network, with 91 direct interconnect locations worldwide, allows feeds originating from any region in the world to be transmitted to any other region after the appropriate in-cloud processing and transformation. This allows you to confidently meet your business requirements to supply media to your distribution partners.
Reliability: the global network has been designed to self-heal in the event of various failures or congestion by intelligently finding alternate optimal paths for your data. These operations are handled automatically, with minimal effort on your part. We’ve also devised mechanisms to defend against advanced attacks, including DDoS threats. Our infrastructure absorbed a 2.5 Tbps DDoS attack in September 2017, the highest-bandwidth attack reported to date. By deploying Google Cloud Armor integrated into our Cloud Load Balancing service (which can scale to absorb massive DDoS attacks), you can protect services deployed in Google Cloud, other clouds, or on-premises.
Performance: Google’s innovation in its network stack gives you the benefit of extremely high network performance within and between regions. That means you can transmit media to partners with high throughput and low latency, packet loss, and jitter.
OTT and Direct-to-Consumer (DTC) Distribution
The mass adoption of streaming media has necessitated petabyte-scale global delivery of content to end customers. These customers vary widely in their location, connectivity, equipment, and last-mile ISPs.
Google Cloud CDN has been purpose-built to deliver content with speed, efficiency, and reliability to all corners of the world. Cloud CDN caches your content in more than 100 locations around the world and hands it off to 144 network edge locations, placing your content close to your users, usually within one network hop through their ISP, to give your viewers the best possible content experience. Additionally, by using Cloud CDN, you get the benefit of over a decade of edge innovation, such as fast SSL handshakes through QUIC, advanced congestion control through BBR, simplified DNS management through global anycast IPs, and DDoS absorption at scale.
While Cloud CDN can serve content from any origin, Google Cloud Storage (GCS), with advanced capabilities like multi-regional buckets, allows you to further leverage Google’s innovations to delight your customers.
4. Measuring, Monitoring, and Improving
PerfKit Benchmarker is an open-source tool created at Google that allows you to measure and understand performance across multiple clouds and hybrid deployments. Use PerfKit Benchmarker to get visibility into and benchmark performance metrics like latency, throughput, and jitter. You can access the tutorial here.
Google Cloud offers Network Intelligence Center for comprehensive and proactive monitoring, troubleshooting, and optimization capabilities across hybrid deployments. Four products are available within Network Intelligence Center today: Connectivity Tests, Network Topology, Performance Dashboard, and Firewall Insights. Learn more about how to fix your top network issues using these products.
Proper network setup and configuration are crucial to achieving high-quality video broadcasts in the cloud. Google’s global network provides customers with a highly capable system, and with proper tuning for media use cases, customers can achieve high reliability and performance in their broadcast system.
No network system is static and unchanging. The Google Cloud network provides out-of-the-box tools for monitoring and insights. This allows you to continuously measure and improve your aggregate performance in an ever-changing environment where the needs of your broadcast partners and customers are continuously evolving.