Ultra-Low-Latency Streaming: HESP vs WebRTC
In gaming, betting, auctions, live commerce, interactive TV, and in-stadium streaming, ultra-low latency is a must. It not only heightens viewer engagement but also unlocks interactivity, which in turn can bolster revenues. Two leading technologies, HESP and WebRTC, stand out for delivering ultra-low-latency streaming. In this article we compare both technologies on latency, scalability, device support, network resilience, content protection, viewer quality of experience, timed metadata support, and backwards compatibility.
- Latency
HESP, as an HTTP-over-TCP protocol, needs a player buffer to absorb variable network conditions, which improves overall quality at the expense of slightly higher latency. Complete end-to-end latencies for HESP, covering (low-latency) encoding, packaging, CDN, and player buffer, typically stand at 700-800 ms. WebRTC, designed for real-time web communication, runs over UDP and achieves latencies as low as 500 ms. Real-world latencies for both protocols vary with factors such as ingest and viewer location.
- Scalability
The mechanisms for scaling to different audience sizes differ significantly between HESP and WebRTC. HESP, as an HTTP-based streaming protocol, taps into commodity CDNs designed for website delivery, just like HLS and DASH. This allows for seamless scaling to thousands, even millions, of viewers. Furthermore, the globally distributed CDN points of presence absorb flash crowds effectively, keeping the stream stable during surges in viewer numbers. In contrast, WebRTC requires each client to maintain a direct connection with the backend. Scaling in this context means deploying additional media server instances, a complex and resource-intensive process when reaching larger audiences. On average, most WebRTC solutions start a new server for every 500 to 1,000 new viewers.
Figure 1: HESP vs WebRTC-based scaling
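To make the scaling arithmetic concrete, the following is a minimal sketch that estimates how many media server instances a WebRTC deployment would need, assuming the 500 to 1,000 viewers per server quoted above. The function name and the audience size used in the example are illustrative assumptions, not figures from any particular vendor.

```python
def media_servers_needed(viewers: int, viewers_per_server: int) -> int:
    """Return the number of server instances needed for `viewers`,
    assuming each instance serves at most `viewers_per_server` clients."""
    if viewers < 0 or viewers_per_server <= 0:
        raise ValueError("viewers must be >= 0 and viewers_per_server > 0")
    # Integer ceiling division: a partially filled server still counts.
    return -(-viewers // viewers_per_server)


# Hypothetical audience of 100,000 concurrent viewers.
audience = 100_000
best_case = media_servers_needed(audience, 1_000)   # 1,000 viewers per server
worst_case = media_servers_needed(audience, 500)    # 500 viewers per server
print(f"{audience} viewers: between {best_case} and {worst_case} servers")
```

Under these assumptions the server count grows linearly with the audience, whereas an HTTP-based protocol hands that growth to the CDN.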
- Device support
Tailoring your streaming approach to specific use cases involves considering browsers, mobile apps, smart TV apps, and set-top boxes. While both streaming protocols share similar device support, a critical distinction arises in web browsers. WebRTC does not permit a 'bring your own player' approach; instead, it mandates the use of the browser's built-in player implementation. This restriction limits the codecs that can be used. Audio codecs are constrained to Opus, and video codecs are practically limited to H.264 and VP8, with H.264 solely in the constrained baseline profile, which restricts the achievable quality. Development for AV1 is underway but far from standardized.
- Network resilience
Ensuring last-mile delivery of your ultra-low-latency live stream can be a challenge, given the reliance on existing networks around the world. Regions like Southern Europe, Latin America, Africa, or Asia demand heightened network resilience. While proprietary WebRTC implementations sometimes feature server-side Adaptive Bitrate (ABR), UDP delivery has been shown to suffer from significant packet loss, leading to frame drops. This is especially the case on mobile networks, where operators often restrict UDP traffic during peak times and where traffic is regularly lost when a device migrates between network cells. In contrast, HESP, with TCP delivery and enhanced ABR, enables rapid quality adjustments based on evolving network conditions, avoiding head-of-line blocking to achieve smooth video without frame drops. This makes it more resilient in scenarios like crowded city centers and major events like the World Cup.
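The idea behind client-side ABR can be sketched in a few lines. What follows is a generic throughput-based selection rule, not HESP's actual algorithm: it picks the highest rendition that fits within a safety margin of the measured throughput. The bitrate ladder, the `safety` factor, and the function name are all assumptions made for illustration.

```python
from typing import Sequence


def pick_rendition(bitrates_kbps: Sequence[int],
                   throughput_kbps: float,
                   safety: float = 0.8) -> int:
    """Return the highest bitrate that fits within `safety` times the
    measured throughput, falling back to the lowest rendition."""
    if not bitrates_kbps:
        raise ValueError("bitrate ladder must not be empty")
    budget = throughput_kbps * safety
    fitting = [b for b in bitrates_kbps if b <= budget]
    # If nothing fits, the lowest rendition is the least bad option.
    return max(fitting) if fitting else min(bitrates_kbps)


# Hypothetical ladder and throughput samples, in kbit/s.
ladder = [400, 1200, 2500, 5000]
for measured in (10_000, 2_000, 300):
    print(measured, "->", pick_rendition(ladder, measured))
```

The faster a player can act on such a decision, the less a throughput dip shows up as a stall or a dropped frame.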
- Content protection
Securing content, for use cases such as live sports (betting) and board meetings, demands robust content protection measures. Both WebRTC and HESP support essential techniques like token-based security; however, this does not necessarily protect the content itself. The divergence between WebRTC and HESP arises in DRM implementation. While some WebRTC solutions provide proprietary DRM implementations, they lack cross-platform support, for example not supporting iOS Safari. In contrast, HESP leverages CMAF compatibility. This critical advantage allows HESP to apply studio-approved DRM seamlessly across browsers, mobile devices, smart TVs, and set-top boxes, with no impact on latency.
- Quality of Experience
Several factors influence the viewer Quality of Experience. In addition to potential frame drops, as discussed earlier, considerations such as codecs, profile, resolution, bitrate, frame rate, stalling, and start-up time are pivotal. Notably, the key distinctions between WebRTC and HESP lie in start-up time and in support for high bitrates and resolutions. WebRTC grapples with a trade-off between live latency and channel change time due to a GOP (“Group of Pictures”) based streaming approach, resulting in higher channel change times for ultra-low-latency streaming. In contrast, HESP adopts a frame-based streaming approach, eliminating the need for such trade-offs. This enables HESP to perform fast channel changes, even at ultra-low latencies. Both technologies support high bitrates and resolutions, although these are economically more viable for HESP: it scales cost-effectively over standard CDNs, it does not require highly frequent (and expensive) key frames, and it can leverage more advanced codecs and encoding profiles, providing higher perceptual quality for the same bitrate.
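The GOP trade-off can be illustrated with a simplified model. Assuming a viewer joins at a uniformly random moment and playback can only begin at the next key frame, the wait is on average half the GOP duration and at worst a full GOP. This is a back-of-the-envelope sketch under those assumptions, not a description of any specific WebRTC implementation; the function name and example values are hypothetical.

```python
from typing import Tuple


def keyframe_wait_ms(gop_frames: int, fps: float) -> Tuple[float, float]:
    """Return (average, worst-case) wait in ms for the next key frame,
    assuming a uniformly random join time within the GOP."""
    if gop_frames <= 0 or fps <= 0:
        raise ValueError("gop_frames and fps must be positive")
    gop_ms = 1000 * gop_frames / fps
    return gop_ms / 2, gop_ms


# Hypothetical settings: a 2-second GOP versus a 0.5-second GOP at 30 fps.
for frames in (60, 15):
    avg, worst = keyframe_wait_ms(frames, 30)
    print(f"GOP of {frames} frames: avg {avg:.0f} ms, worst {worst:.0f} ms")
```

Shortening the GOP reduces the wait, but key frames are much larger than predicted frames, so the bitrate cost rises; that is the trade-off a frame-based approach is meant to avoid.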
- Captions, subtitles and timed metadata
WebRTC lacks specifications for the delivery of captions, subtitles, ad insertion markers, and other timed data, relying instead on proprietary messages over WebRTC data channels. In contrast, HESP aligns with other HTTP-based streaming protocols, ensuring full compatibility with CMAF. This compatibility extends to the features CMAF supports, enabling HESP to handle subtitles through standards such as TTML or WebVTT, in line with current practice in HLS and DASH. Furthermore, CMAF compatibility makes it possible to use standardized approaches to insert timed metadata, for example carrying advertisement information in ID3 tags or in emsg (“event message”) boxes.
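As an illustration of what standardized timed metadata looks like on the wire, here is a minimal sketch of a parser for an emsg box, following the version 0 and version 1 field layouts defined for DASH event messages. It assumes the input is exactly one complete box with a 32-bit size field; the class name, the example scheme URI, and the payload are hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import Tuple


@dataclass
class EventMessage:
    version: int
    scheme_id_uri: str
    value: str
    timescale: int
    presentation_time: int  # delta (v0) or absolute (v1), in timescale units
    event_duration: int
    event_id: int
    message_data: bytes


def _read_cstring(buf: bytes, pos: int) -> Tuple[str, int]:
    """Read a null-terminated UTF-8 string; return it and the next offset."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end].decode("utf-8"), end + 1


def parse_emsg(box: bytes) -> EventMessage:
    size, box_type = struct.unpack_from(">I4s", box, 0)
    if box_type != b"emsg":
        raise ValueError("not an emsg box")
    end = size if size else len(box)  # size 0 means "to end of data"
    version = box[8]
    pos = 12  # skip size, type, version, and the 24-bit flags
    if version == 0:
        scheme, pos = _read_cstring(box, pos)
        value, pos = _read_cstring(box, pos)
        timescale, ptime, duration, event_id = struct.unpack_from(">IIII", box, pos)
        pos += 16
    elif version == 1:
        timescale, ptime, duration, event_id = struct.unpack_from(">IQII", box, pos)
        pos += 20
        scheme, pos = _read_cstring(box, pos)
        value, pos = _read_cstring(box, pos)
    else:
        raise ValueError(f"unsupported emsg version {version}")
    return EventMessage(version, scheme, value, timescale, ptime,
                        duration, event_id, box[pos:end])


# Build a hypothetical version 0 box: a 2-second event at a 90 kHz timescale.
payload = (b"\x00\x00\x00\x00" + b"urn:example:ads\x00" + b"1\x00"
           + struct.pack(">IIII", 90_000, 0, 180_000, 7) + b"ad-break")
example_box = struct.pack(">I4s", 8 + len(payload), b"emsg") + payload
print(parse_emsg(example_box))
```

Because the box format is standardized, any CMAF-aware player can read such events without a proprietary side channel.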
- Backwards compatibility
HESP is fully backwards compatible with HLS and DASH. This enables the reuse of HESP output for live-to-VOD, and makes it possible to generate an additional HLS or DASH stream. Furthermore, it facilitates the reuse of existing encoding and distribution (CDN) infrastructure. In contrast, WebRTC lacks backwards compatibility with HLS and DASH, which prevents the reuse of existing encoding and distribution infrastructure.
- Summary
The choice between HESP and WebRTC for ultra-low-latency streaming has far-reaching implications. While WebRTC excels at achieving impressively low latencies, HESP delivers higher quality. Scalability diverges significantly, with HESP taking the lead thanks to the economics of scaling over existing, commodity CDNs. While both protocols are nearly on par in terms of device support, HESP's network resilience, robust content protection, superior Quality of Experience, out-of-the-box support for captions, subtitles and timed metadata, and backwards compatibility with HLS and DASH make it a compelling choice for ultra-low-latency streaming.
Figure 2: Comparison between HESP and WebRTC
This article is Sponsored Content