What is a media server and why is it needed?

WebRTC media servers handle the processing and distribution of audio, video, and data streams between clients. They are essential for scenarios that require advanced media handling, such as group video calls, recording, transcoding, or broadcasting. Media servers are responsible for mixing, recording, and scaling, ensuring a high-quality, consistent experience for all participants.

There are different types of WebRTC servers. One of them is the WebRTC media server. When will you need one, and what exactly does it do?

The role of a WebRTC media server

At its conception, WebRTC was meant to work “between” browsers. Only recently did the good people at the W3C see fit to change that to something that can also work outside of browsers. We’ve known that to be the case all along 😎

What does a WebRTC media server do exactly? It processes and routes media packets through the backend infrastructure – either in the cloud or on premise.

Let’s say you are building a group calling service and you want 10 people to be able to join in and talk to each other. For simplicity’s sake, assume we want to get 1Mbps of encoded video from each participant and show the other 9 participants on the screen of each of the users. If every client connected directly to every other client, each one would have to send its video out 9 times, once per other participant.
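To put rough numbers on this scenario, here is a minimal Python sketch of the per-client load if no server is involved and every participant connects directly to every other participant. The function name `mesh_load_mbps` is made up for illustration, and the calculation ignores audio and protocol overhead.

```python
def mesh_load_mbps(participants: int, bitrate_mbps: float) -> dict:
    """Per-client load when each client sends its video directly to every other client."""
    peers = participants - 1  # everyone except yourself
    return {
        "upstream_mbps": peers * bitrate_mbps,    # one copy of your video per peer
        "downstream_mbps": peers * bitrate_mbps,  # one incoming video per peer
    }


load = mesh_load_mbps(participants=10, bitrate_mbps=1.0)
print(load)  # {'upstream_mbps': 9.0, 'downstream_mbps': 9.0}
```

The 9Mbps upstream is typically the painful part, since the uplink on home and mobile connections tends to be the narrower direction.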

This is where a WebRTC media server comes in. We will add it here to be able to do the following tasks for us:

Reduce the stress on the upstream connection of clients
- Now clients will send out fewer media streams to the server
- The server will be distributing the media it receives to other clients
Handle bandwidth estimation
- Each client handles bandwidth estimation only for its own connection to the server
- The server takes care of the whole “operation”, tracking the available bandwidth and constraints of all clients
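The two tasks above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how any particular media server works: the function names, the client names, the bandwidth estimates, and the even-split policy are all hypothetical.

```python
def upstream_with_server_mbps(bitrate_mbps: float) -> float:
    """With a server in the middle, each client uploads its video once, to the server."""
    return bitrate_mbps


def per_stream_budget_mbps(downlink_estimates_mbps: dict, sender_bitrate_mbps: float) -> dict:
    """Hypothetical policy: split each receiver's estimated downlink evenly across
    the streams it receives, never exceeding what senders actually produce."""
    if len(downlink_estimates_mbps) < 2:
        raise ValueError("need at least two clients")
    incoming_streams = len(downlink_estimates_mbps) - 1  # everyone but the receiver
    return {
        client: min(sender_bitrate_mbps, downlink / incoming_streams)
        for client, downlink in downlink_estimates_mbps.items()
    }


# Assumed estimates: nine clients on good connections, one on a constrained link
estimates = {f"client{i}": 20.0 for i in range(1, 10)}
estimates["client10"] = 4.5

budgets = per_stream_budget_mbps(estimates, sender_bitrate_mbps=1.0)
print(upstream_with_server_mbps(1.0))  # 1.0
print(budgets["client1"])              # 1.0
print(budgets["client10"])             # 0.5
```

In this example the constrained client would get each of its 9 incoming videos at about half the bitrate, while the others are unaffected. How the server actually produces the lower-bitrate streams is a separate question that this sketch does not address.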

How is a WebRTC media server different from a TURN server?

WebRTC media server != TURN server

I’ve seen people try to use a TURN server to do what media servers do, usually for things like recording the data stream.

This doesn’t work.

TURN servers relay media through firewalls and NAT devices. They aren’t privy to the data being sent through them. WebRTC privacy is maintained by keeping the data encrypted end to end as it passes through TURN servers – the TURN servers don’t know the encryption key, so they can’t do anything with the media.

A WebRTC media server is privy to all data passing through it, and acts as a WebRTC client toward each of the WebRTC devices it works with. This is also why it isn’t so well defined in WebRTC, yet at the same time so versatile.
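The difference can be illustrated with a toy Python model. Everything here is an assumption made for illustration: `toy_cipher` is a throwaway XOR keystream standing in for WebRTC's real media encryption and is not secure, and the keys, function names, and `MediaServer` class are hypothetical.

```python
import hashlib


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (XOR with a SHA-256 keystream). Illustration only, not secure.
    Applying it twice with the same key returns the original data."""
    keystream = bytearray()
    counter = 0
    while len(keystream) < len(data):
        keystream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(d ^ k for d, k in zip(data, keystream))


def turn_relay(packet: bytes) -> bytes:
    """A TURN-style relay has no key: all it can do is pass the bytes along."""
    return packet


class MediaServer:
    """Terminates encryption on each leg, so it sees the media in the clear."""

    def __init__(self, key_with_sender: bytes, key_with_receiver: bytes):
        self.key_with_sender = key_with_sender
        self.key_with_receiver = key_with_receiver
        self.recording = []

    def forward(self, packet: bytes) -> bytes:
        media = toy_cipher(self.key_with_sender, packet)  # decrypt the sender's leg
        self.recording.append(media)                      # e.g. record it
        return toy_cipher(self.key_with_receiver, media)  # re-encrypt for the receiver


frame = b"video frame 1"

# Via the relay: only the two endpoints share the key
key_alice_bob = b"known only to alice and bob"
relayed = turn_relay(toy_cipher(key_alice_bob, frame))
print(relayed != frame)                             # True: the relay only sees ciphertext
print(toy_cipher(key_alice_bob, relayed) == frame)  # True: Bob can decrypt

# Via the media server: each client shares a key with the server instead
server = MediaServer(b"alice<->server", b"server<->bob")
delivered = server.forward(toy_cipher(b"alice<->server", frame))
print(server.recording == [frame])                      # True: the server saw the media
print(toy_cipher(b"server<->bob", delivered) == frame)  # True: Bob can decrypt
```

The relay never holds a key, so recording there would only capture ciphertext. The media server holds a key for each leg, which is what lets it record, mix, or otherwise process the media.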