System Design · Hard · 12 min read

Design a Live Video Portal (Streaming)

Live streaming on the client: HLS/DASH adaptive bitrate vs WebRTC, the buffering state machine, a high-volume live chat sidebar, syncing chat to stream latency, quality switching, and reconnect after a drop.

Published 20 May 2026 · by Frontend Masters India

A live video portal is two hard problems sharing a screen: getting video onto the page without it stalling, and running a chat next to it that can take thousands of messages a minute without melting the browser. Think Twitch or a sports stream. Interviewers like this prompt because the video half pulls you into delivery protocols and a buffering state machine, while the chat half is a brutal rendering-performance question. You get tested on both.

1. Scope it first

How live is live? A sports broadcast can tolerate 10-30 seconds of delay. An auction or a co-host video call cannot. This single answer decides HLS/DASH vs WebRTC.
One broadcaster to many viewers, or interactive? One-to-many is streaming; two-way is conferencing.
Is there chat, and how busy? A big stream means tens of thousands of concurrent viewers and a chat that scrolls faster than anyone can read.
Quality controls? Auto bitrate plus a manual quality picker, captions, fullscreen.

Assume: one broadcaster to many viewers, a few seconds to tens of seconds of acceptable latency, a very busy live chat, adaptive bitrate with a manual override.

2. Delivery: HLS/DASH vs WebRTC

The core trade-off. Lead with it.

HLS and DASH chop the stream into short segments (2-6 seconds each) listed in a manifest the player polls. The browser downloads segments over plain HTTP, which means it rides CDN caching and scales to millions of viewers cheaply. The cost is latency: the player always sits several segments behind live, so a delay of 6-30 seconds is normal. This is the right default for one-to-many at scale.

manifest.m3u8 → [seg0.ts][seg1.ts][seg2.ts]...   // player buffers ahead
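
To see where the delay comes from, here is a rough back-of-the-envelope sketch. The function name, the number of segments the player holds behind live, and the overhead parameter are all assumptions for illustration; real players and encoders vary.

```javascript
// Rough estimate of how far behind live a segmented stream plays.
// segmentsBehindLive and pipelineOverheadSeconds are illustrative assumptions,
// not values taken from any particular player.
function estimateLiveDelaySeconds(segmentSeconds, segmentsBehindLive, pipelineOverheadSeconds = 0) {
  return segmentSeconds * segmentsBehindLive + pipelineOverheadSeconds;
}

// Short segments, player holding 3 segments back:
const lowEnd = estimateLiveDelaySeconds(2, 3);
// Long segments, player holding 5 segments back:
const highEnd = estimateLiveDelaySeconds(6, 5);
```

Under those assumed inputs the two cases bracket the 6-30 second range quoted above, which is why shrinking segment duration is the first lever low-latency setups pull.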

Outside Safari, most desktop browsers don't play HLS natively, so you use Media Source Extensions through a library like hls.js or dash.js, which fetches segments and appends them to a SourceBuffer on the MediaSource attached to the <video> element.
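
A minimal sketch of that wiring, assuming hls.js. The function name attachHlsStream and the returned { mode, hls } shape are made up for illustration; the hls.js constructor is passed in as a parameter so the function can be exercised without a browser.

```javascript
// Sketch: choose a playback path for an HLS manifest URL.
// HlsCtor is assumed to be the hls.js constructor (for example the default
// export of the "hls.js" package). attachHlsStream itself is a hypothetical helper.
function attachHlsStream(video, manifestUrl, HlsCtor) {
  // MSE path: hls.js fetches segments and feeds them to the video element.
  if (HlsCtor && HlsCtor.isSupported()) {
    const hls = new HlsCtor();
    hls.loadSource(manifestUrl);
    hls.attachMedia(video);
    return { mode: "mse", hls };
  }
  // Native path: the browser itself says it can play the HLS MIME type.
  if (video.canPlayType("application/vnd.apple.mpegurl")) {
    video.src = manifestUrl;
    return { mode: "native", hls: null };
  }
  // Neither path works: the caller should show an "unsupported browser" message.
  return { mode: "unsupported", hls: null };
}
```

In a page you would call it with the real <video> element and the imported Hls class, then branch on mode to decide which events to listen to.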

WebRTC is a peer connection built for sub-second latency: video calls, auctions, anything interactive. It's UDP-based and real-time, but it doesn't ride CDN caching the same way, so scaling one-to-many needs a media server (an SFU) fanning the stream out, which is more expensive and more complex.

The decision rule: use HLS/DASH when scale and reach matter and a few seconds of delay is fine; use WebRTC when latency below a second is a product requirement. For a portal showing a broadcast to a huge audience, HLS/DASH with low-latency extensions is usually the answer.

3. The buffering state machine

A live player is a state machine, and naming the states is what makes your answer sound like you've shipped one.

idle → loading → playing ⇄ buffering → ended
                    ↑                ↓
                    └──── error ─────┘

loading: fetching the manifest and the first segments. Show a spinner.
playing: the buffer ahead of the playhead is healthy, video runs.
buffering: the buffer ran dry (playhead caught up to downloaded data). Pause playback, show a spinner, keep fetching. Don't flip back to playing until you have enough buffered to survive, or you'll yo-yo between states.
error: a segment failed or the manifest 404'd. Retry with backoff, then surface a real error.

You drive this off <video> events: waiting and stalled push you toward buffering, canplay and playing pull you back, error is its own branch. The judgment call interviewers look for is the hysteresis: don't resume on one buffered frame, resume on a threshold (say a few seconds buffered) so the experience doesn't stutter.
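
One way to sketch the hysteresis is a pure transition function fed by <video> events. The 3-second resume threshold, the state names as strings, and the helper names bufferedAhead and nextState are assumptions for illustration, not a standard API.

```javascript
// Assumed threshold: only leave "buffering" once this much media is queued.
const RESUME_THRESHOLD_SECONDS = 3;

// Seconds of media buffered ahead of the playhead, read from video.buffered.
function bufferedAhead(video) {
  const { buffered, currentTime } = video;
  for (let i = 0; i < buffered.length; i++) {
    if (buffered.start(i) <= currentTime && currentTime <= buffered.end(i)) {
      return buffered.end(i) - currentTime;
    }
  }
  return 0; // playhead is not inside any buffered range
}

// Pure transition function: (current state, video event name, buffer) -> next state.
function nextState(state, event, aheadSeconds) {
  if (event === "error") return "error";
  if (event === "ended") return "ended";
  switch (state) {
    case "idle":
    case "error": // a retry starts a fresh load
      return event === "loadstart" ? "loading" : state;
    case "loading":
      return event === "playing" ? "playing" : state;
    case "playing":
      return event === "waiting" || event === "stalled" ? "buffering" : state;
    case "buffering": {
      const refill = event === "canplay" || event === "progress" || event === "playing";
      // Hysteresis: one buffered frame is not enough to resume.
      return refill && aheadSeconds >= RESUME_THRESHOLD_SECONDS ? "playing" : state;
    }
    default:
      return state;
  }
}
```

Wire it up by registering one listener per event name that calls nextState(state, event.type, bufferedAhead(video)) and re-renders the spinner only when the state actually changes.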

4. Quality switching (adaptive bitrate)

The manifest lists the same content at several bitrates (240p up to 1080p). Adaptive bitrate measures recent download throughput and the current buffer level, then picks the rendition that won't starve the buffer. The library does this automatically; your job is to expose a manual override and to switch cleanly.

Switching down should be aggressive (drop quality the instant the buffer shrinks, because a stall is worse than a soft frame). Switching up should be cautious (only climb when you've got headroom). When the user picks a quality manually, pin it and stop the auto logic until they choose Auto again.

hls.currentLevel = -1;   // -1 = auto/adaptive
hls.currentLevel = 3;    // pin to a specific rendition
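
If the interviewer asks what the auto logic might look like, a toy version of "down fast, up cautiously" is enough. This is a simplified sketch, not the algorithm hls.js or dash.js actually ships; the safety factors, the buffer thresholds, and the name pickRendition are all assumptions.

```javascript
// Assumed tuning knobs for illustration only.
const DOWN_SAFETY = 0.8;        // trust 80% of measured throughput when checking the current level
const UP_SAFETY = 0.6;          // demand more headroom before climbing
const LOW_BUFFER_SECONDS = 4;   // below this, step down regardless of throughput
const HIGH_BUFFER_SECONDS = 10; // only climb with this much buffered

// bitrates: rendition bitrates in bits per second, sorted ascending.
// current: index of the rendition playing now. Returns the index to use next.
function pickRendition(bitrates, current, throughputBps, bufferSeconds) {
  // Highest rendition the measured throughput can sustain (at least the lowest).
  let sustainable = 0;
  for (let i = 0; i < bitrates.length; i++) {
    if (bitrates[i] <= throughputBps * DOWN_SAFETY) sustainable = i;
  }
  // Aggressive down-switch: drop straight to what the network can carry.
  if (sustainable < current) return sustainable;
  // Buffer is thin: step down one level even if throughput looks fine.
  if (bufferSeconds < LOW_BUFFER_SECONDS) return Math.max(0, current - 1);
  // Cautious up-switch: one level at a time, only with a full buffer and headroom.
  const next = current + 1;
  if (
    next < bitrates.length &&
    bufferSeconds >= HIGH_BUFFER_SECONDS &&
    bitrates[next] <= throughputBps * UP_SAFETY
  ) {
    return next;
  }
  return current;
}
```

The asymmetry is the point: a down-switch can skip several levels in one decision, while an up-switch moves one level and needs both a healthy buffer and spare throughput.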

5. The live chat sidebar at high volume

This is where the frontend interview gets real. A busy stream might push thousands of chat messages a minute. The naive approach (append a DOM node per message, let the list grow) destroys the page in under a minute.

Three things keep it alive:

Cap the list. Live chat is ephemeral. Keep only the last ~200-500 messages in the DOM and drop the oldest as new ones arrive. Nobody scrolls a fast chat's full history, so there's no reason to retain it. This bounds memory and DOM size permanently.

Batch DOM updates. Don't render per message. Buffer incoming messages and flush them on a requestAnimationFrame tick, so a hundred messages arriving in one frame become one render, not a hundred.

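// appendMessages, trimToCap and scrollToBottom are app-level helpers;
// atBottom is a flag kept up to date by a scroll listener on the chat list.
let pending = [];

function onChatMessage(msg) {
  pending.push(msg);
  // First message of a batch schedules exactly one flush for the next frame.
  if (pending.length === 1) requestAnimationFrame(flush);
}

function flush() {
  const batch = pending;
  pending = [];              // reset first, so messages arriving mid-render start a new batch
  appendMessages(batch);     // one batched render
  trimToCap();               // drop oldest beyond the cap
  if (atBottom) scrollToBottom();
}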

Pause autoscroll on interaction. If the user scrolls up to read something, stop auto-following and show a "more messages" jump button, exactly like a chat list. Snapping them to the bottom mid-read is infuriating at this message rate.

Under extreme load you can also sample or coalesce: if messages outpace what a human can read, you're allowed to drop some, because no one is reading every line of a 10,000-viewer chat anyway.
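
The cap, the sampling, and the "is the user at the bottom?" check can each be sketched as a small pure function. The constants and the names isAtBottom, sampleBatch and applyBatch are assumptions for illustration; the cap of 300 simply sits inside the 200-500 range suggested above.

```javascript
const MAX_RENDERED = 300;  // assumed cap on messages kept for rendering
const MAX_PER_FLUSH = 20;  // assumed ceiling on messages shown per frame

// True when the scroll container is within tolerancePx of its bottom edge.
function isAtBottom(el, tolerancePx = 24) {
  return el.scrollHeight - el.scrollTop - el.clientHeight <= tolerancePx;
}

// Under load, keep an evenly spaced sample of the batch instead of all of it.
function sampleBatch(batch, maxPerFlush = MAX_PER_FLUSH) {
  if (batch.length <= maxPerFlush) return batch;
  const step = batch.length / maxPerFlush;
  const kept = [];
  for (let i = 0; i < maxPerFlush; i++) {
    kept.push(batch[Math.floor(i * step)]);
  }
  return kept;
}

// Append the (possibly sampled) batch and keep only the newest `cap` messages.
function applyBatch(rendered, batch, cap = MAX_RENDERED) {
  const next = rendered.concat(sampleBatch(batch));
  return next.length > cap ? next.slice(next.length - cap) : next;
}
```

Call isAtBottom from a scroll listener to decide whether to keep auto-following or show the jump button. One caveat worth saying out loud: requestAnimationFrame is paused or throttled in background tabs, so it is reasonable to cap the pending queue as well, not just the rendered list.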

6. Syncing chat to stream latency

Here's a subtle one worth raising. Because HLS leaves viewers seconds behind live, the chat (which is real-time) is ahead of the video. People react to a goal in chat before the laggy viewer sees it, spoiling the moment. Two options: delay rendering chat messages to roughly match the viewer's playback delay using each message's server timestamp against the current playhead's wall-clock position, or just accept the skew for casual streams. Mentioning the problem at all is a strong signal; most candidates never notice it.
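
A minimal sketch of the delayed-render option, assuming each message carries a serverTs field in epoch milliseconds and the queue is kept sorted by it. How you obtain the playhead's wall-clock position depends on the player; with HLS it typically comes from EXT-X-PROGRAM-DATE-TIME tags in the playlist. The function names and the fallback estimate are hypothetical.

```javascript
// Fallback estimate when the player exposes only a latency figure:
// the playhead shows what happened latencySeconds ago.
function playheadWallClockMs(nowMs, latencySeconds) {
  return nowMs - latencySeconds * 1000;
}

// Split a timestamp-sorted queue into messages the viewer has "caught up to"
// and messages that are still ahead of what the video is showing.
function releaseDueMessages(queue, playheadMs) {
  let i = 0;
  while (i < queue.length && queue[i].serverTs <= playheadMs) i++;
  return { due: queue.slice(0, i), waiting: queue.slice(i) };
}
```

Run releaseDueMessages on each flush tick, render due, and keep waiting as the new queue. The viewer's chat then trails real time by roughly the same amount as their video.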

7. Reconnect after a drop

Networks fail mid-stream. The player should detect a stall that isn't recovering, then reload the manifest and resume from the live edge, not from where it stalled: this is live, so you want to jump back to now. The chat socket reconnects with backoff and resubscribes; you don't try to replay the missed messages because live chat is disposable. Show a clear "reconnecting" state rather than a frozen frame.

ws.onclose = () => {
  setChatState("reconnecting");
  retryWithBackoff(connectChat);
};
video.addEventListener("stalled", () => {
  if (!recoveringSoon()) reloadManifestAtLiveEdge();
});
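
The snippet above leaves retryWithBackoff undefined. One possible shape is sketched here, using exponential backoff with jitter. The callback-style connect signature ({ onOpen, onFail }), the base and maximum delays, and the attempt limit are all assumptions; your connectChat would need a thin adapter to match.

```javascript
// Exponential backoff with jitter: the delay lands between half the ceiling
// and the full ceiling, where the ceiling doubles per attempt up to maxMs.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 15000, random = Math.random) {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.round(ceiling / 2 + random() * (ceiling / 2));
}

// connect is a hypothetical function that receives { onOpen, onFail } callbacks.
// schedule and random are injectable so the logic can be tested without timers.
function retryWithBackoff(connect, options = {}) {
  const { maxAttempts = 8, schedule = setTimeout, random = Math.random } = options;
  let attempt = 0;
  const tryOnce = () => {
    connect({
      onOpen: () => {
        attempt = 0; // a successful connection resets the backoff
      },
      onFail: () => {
        if (attempt >= maxAttempts) return; // give up; surface a real error elsewhere
        const delay = backoffDelayMs(attempt, 500, 15000, random);
        attempt += 1;
        schedule(tryOnce, delay);
      },
    });
  };
  tryOnce();
}
```

The jitter matters at this scale: when a stream with tens of thousands of viewers blips, you don't want every client hitting the chat server again at the same instant.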

8. Performance and accessibility

Keep video and chat re-renders independent so a chat flood never touches the player. Captions and a real keyboard-operable control bar (play, mute, quality, fullscreen) are part of the answer, not extras. Lazy-load the player so the page paints before the heavy media code loads.

What the interviewer will push on

"When WebRTC over HLS?" WebRTC for sub-second interactive latency at smaller scale; HLS/DASH for huge one-to-many audiences where a few seconds of delay is acceptable and CDN caching matters.
"The chat is 5,000 messages a minute. How does the tab survive?" Cap the rendered list, batch appends on requestAnimationFrame, trim the oldest, and drop messages under extreme load.
"Buffer ran dry. What does the user see and what do you do?" Enter buffering, show a spinner, keep fetching, and resume only past a buffered-seconds threshold to avoid stutter.
"Chat spoils the live moment. Fix it?" Delay chat rendering to match playback latency using message timestamps versus the playhead's wall-clock position.
"Connection dropped for 10 seconds." Reload the manifest and jump to the live edge rather than resuming where it stalled; reconnect the chat socket with backoff and don't replay missed messages.

The one-paragraph recap

A live video portal picks HLS/DASH for scalable one-to-many delivery and WebRTC only when sub-second latency is the product requirement, drives the player as an explicit buffering state machine with hysteresis so it doesn't stutter, and switches bitrate down fast and up cautiously with a manual override. The chat sidebar survives high volume by capping the rendered list, batching appends per animation frame, trimming the oldest, and pausing autoscroll on interaction. The senior touches are syncing chat rendering to the viewer's playback delay so it doesn't spoil the moment, and reconnecting to the live edge after a drop. Lead with the HLS-vs-WebRTC trade-off and the chat batching; those are the two pillars of the answer.
