Signaling

Signaling is the process of exchanging Session Description Protocol (SDP) messages between two clients so that they can establish a WebRTC peer-to-peer connection.

For metadata signaling, WebRTC apps use an intermediary server, but for actual media and data streaming once a session is established, RTCPeerConnection attempts to connect clients directly: peer-to-peer.

Session Description Protocol (SDP)

The SDP contains the following information:

  • ICE candidates - network data, such as a host’s IP address and port as seen by the outside world
  • security metadata, such as the certificate fingerprint used to secure the connection
  • media metadata, such as codecs and codec settings, bandwidth and media types
  • etc.
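
To make the list above concrete, here is a minimal sketch that pulls those pieces out of an SDP string. The `sampleSdp` fragment is hand-written for illustration (real browser output is much longer), the addresses come from documentation ranges, the fingerprint is a placeholder, and `summarizeSdp` is a hypothetical helper, not part of any WebRTC API.

```javascript
// Hand-written, heavily shortened SDP for illustration only.
const sampleSdp = [
  'v=0',
  'o=- 4611731400430051336 2 IN IP4 127.0.0.1',
  's=-',
  't=0 0',
  // Security metadata: placeholder fingerprint (32 fake bytes).
  'a=fingerprint:sha-256 ' + Array(32).fill('AB').join(':'),
  // Media metadata: one audio section offering Opus.
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'a=rtpmap:111 opus/48000/2',
  // Network data: one ICE candidate.
  'a=candidate:1 1 udp 2130706431 192.0.2.10 54321 typ host',
  // Media metadata: one video section offering VP8.
  'm=video 9 UDP/TLS/RTP/SAVPF 96',
  'a=rtpmap:96 VP8/90000',
].join('\r\n');

// Hypothetical helper: collects media sections, codecs, candidates,
// and whether a fingerprint is present.
function summarizeSdp(sdp) {
  const summary = { media: [], candidates: [], hasFingerprint: false };
  let current = null;
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith('m=')) {
      // "m=<media type> <port> <protocol> <payload types...>"
      const [type, , protocol] = line.slice(2).split(' ');
      current = { type, protocol, codecs: [] };
      summary.media.push(current);
    } else if (line.startsWith('a=rtpmap:') && current) {
      // "a=rtpmap:<payload type> <codec>/<clock rate>[/<channels>]"
      current.codecs.push(line.split(' ')[1].split('/')[0]);
    } else if (line.startsWith('a=candidate:')) {
      summary.candidates.push(line.slice(2));
    } else if (line.startsWith('a=fingerprint:')) {
      summary.hasFingerprint = true;
    }
  }
  return summary;
}

console.log(JSON.stringify(summarizeSdp(sampleSdp), null, 2));
```

In a browser, the same helper could be pointed at the `sdp` string of a description returned by `createOffer()`.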

Using ICE Framework to Cope with NATs and Firewalls

In a simpler world, every WebRTC endpoint would have a unique address that it could exchange with other clients in order to communicate directly.

A World without NATs and Firewalls

In reality most clients live behind NATs and firewalls.

The Real World

WebRTC apps can use the ICE Framework to overcome the complexities of real-world networking.

The ICE Framework uses the following servers:

  • STUN Server - tells a client its public-facing address so it can be offered as an ICE candidate (inexpensive to maintain)
  • TURN Server - relays traffic between two clients, but is only used when a direct peer-to-peer connection is impossible (expensive to maintain)
    • TURN Servers usually also act as STUN Servers

ICE tries to find the best path to connect clients. It tries all possibilities in parallel and chooses the most efficient option that works, in this order of preference:

  • ICE first tries to make a connection using the host address obtained from a device’s operating system and network card
  • if that fails (which it will for devices behind NATs), ICE obtains an external address using a STUN Server
  • if that also fails, traffic is routed via a TURN Server
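
The preference order above shows up in the priority number each candidate carries. The sketch below assumes the type preferences and priority formula recommended by RFC 8445; the candidate lines use documentation addresses, and `candidatePriority` and `parseCandidate` are hypothetical helper names. It only illustrates ordering: a real ICE agent pairs local and remote candidates and runs connectivity checks before choosing one.

```javascript
// Type preferences recommended by RFC 8445 (higher is preferred).
const TYPE_PREFERENCE = { host: 126, prflx: 110, srflx: 100, relay: 0 };

// priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)
function candidatePriority(type, localPreference = 65535, componentId = 1) {
  return 2 ** 24 * TYPE_PREFERENCE[type] + 2 ** 8 * localPreference + (256 - componentId);
}

// Parses "candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ..."
function parseCandidate(line) {
  const fields = line.replace(/^a=/, '').replace(/^candidate:/, '').split(' ');
  const [foundation, component, transport, priority, address, port] = fields;
  const typIndex = fields.indexOf('typ');
  return {
    foundation,
    component: Number(component),
    transport,
    priority: Number(priority),
    address,
    port: Number(port),
    type: typIndex === -1 ? undefined : fields[typIndex + 1],
  };
}

// Illustrative candidates: relayed (TURN), direct (host), server reflexive (STUN).
const candidates = [
  `candidate:3 1 udp ${candidatePriority('relay')} 203.0.113.7 3478 typ relay raddr 198.51.100.4 rport 61000`,
  `candidate:1 1 udp ${candidatePriority('host')} 192.0.2.10 54321 typ host`,
  `candidate:2 1 udp ${candidatePriority('srflx')} 198.51.100.4 61000 typ srflx raddr 192.0.2.10 rport 54321`,
].map(parseCandidate);

// Highest priority first: host, then srflx (via STUN), then relay (via TURN).
const ranked = [...candidates].sort((a, b) => b.priority - a.priority);
console.log(ranked.map((c) => `${c.type} ${c.address}:${c.port}`));
```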

URLs for STUN and/or TURN servers are (optionally) specified by a WebRTC app in the iceServers configuration object that is the first argument to the RTCPeerConnection constructor. For appr.tc that value looks like this:

{
  'iceServers': [
    {
      'urls': 'stun:stun.l.google.com:19302'
    },
    {
      'urls': 'turn:192.158.29.39:3478?transport=udp',
      'credential': 'JZEOEt2V3Qb0y27GRntt2u2PAYA=',
      'username': '28224511:1379330808'
    },
    {
      'urls': 'turn:192.158.29.39:3478?transport=tcp',
      'credential': 'JZEOEt2V3Qb0y27GRntt2u2PAYA=',
      'username': '28224511:1379330808'
    }
  ]
}

Once RTCPeerConnection has that information, the ICE magic happens automatically: RTCPeerConnection uses the ICE framework to work out the best path between clients, working with STUN and TURN servers as necessary.

STUN (Session Traversal Utilities for NAT)

TURN (Traversal Using Relays around NAT)

WebRTC Flow

  1. WebRTC can be thought of as just a “protocol” for establishing “P2P” connections
  2. The Session Description Protocol (SDP) is a structured format that defines the media capabilities of a peer: codecs, video availability, audio availability, resolution, etc.
  3. An Interactive Connectivity Establishment (ICE) candidate defines one of the available “routes” a data packet can take to get from Peer A to Peer B. For example, an ICE candidate may describe a simple, direct P2P connection (just like in the “old days”). However, because of NAT, a direct route is often not reachable, so other ICE candidates may work around the issue, such as one that goes through an intermediary TURN server
  4. In the initial handshake of every WebRTC connection, Peer A creates an “offer” for Peer B. The WebRTC standard itself does not define how this “offer” is brought to Peer B, but people usually opt to use WebSockets, at least for this “signaling” portion of the handshake
  5. Once Peer B receives the “offer” (by any means necessary, even by carrier pigeon), Peer B responds with its own SDP (the “answer”) to inform Peer A about Peer B’s media capabilities
  6. Peer A receives Peer B’s “answer”. By then, both peers know each other’s media capabilities. This is the end of the initial offer/answer handshake, loosely analogous to the “three-way” handshake of TCP (`SYN`, `SYN-ACK`, and `ACK`), although only two messages are exchanged here
  7. However, the connection has not been established yet. Only the optimal media capabilities have been agreed upon. During this time, each peer receives ICE candidates from its browser. As soon as a peer receives a candidate, it should immediately forward it to the other side. This is the part where both peers just throw ICE candidates back and forth (via any “signaling” method). Rinse and repeat until a pair of candidates that works for both sides is found
  8. Once both sides have settled on a working pair of ICE candidates, a connection can finally be established. Data can be streamed back and forth between Peer A and Peer B via the route defined by that pair. By this time, there is no longer a need for the intermediary “signaling” server since the P2P connection has already been established
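
Step 7 can be sketched as follows. `pc` is expected to behave like an `RTCPeerConnection`; `signaling` is a hypothetical object with a `send(message)` method and an `onmessage` callback (WebRTC does not define it), and the `{ type: 'candidate', ... }` message shape is likewise an assumption. In a real app the same `onmessage` handler would also dispatch offers and answers, and `addIceCandidate` should only be called after the remote description has been set.

```javascript
// Hypothetical helper: wires candidate exchange between a peer connection
// and an app-defined signaling channel.
function wireIceExchange(pc, signaling) {
  // Forward each locally gathered candidate to the other side as soon as it appears.
  pc.onicecandidate = (event) => {
    if (event.candidate) {
      signaling.send({ type: 'candidate', candidate: event.candidate });
    }
    // A null candidate signals that gathering has finished; nothing to forward.
  };

  // Hand candidates received from the other side to the local ICE agent.
  signaling.onmessage = async (message) => {
    if (message.type === 'candidate') {
      await pc.addIceCandidate(message.candidate);
    }
  };
}
```

With a WebSocket as the transport, `send` could serialize the message with `JSON.stringify` and the socket's message handler could parse it before calling `signaling.onmessage`.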

Signaling SDP

  • to connect two clients, A and B:
  • A creates an offer (SDP), sets it as its local description, and signals the offer to B
  • B receives the offer and sets it as its remote description
  • B creates an answer (SDP), sets it as its local description, and signals the answer to A
  • A sets the answer as its remote description
  • once ICE has also found a working route, the connection is established and data can flow, for example over a data channel
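
The exchange above can be sketched as one function. `pcA` and `pcB` are expected to behave like `RTCPeerConnection` instances, and `deliver` is a hypothetical stand-in for the signaling channel (by default it simply hands the description over). Both sides run in one place here purely for readability; in a real app A and B are on different machines, so each half lives in its own client and `deliver` becomes a message over something like a WebSocket.

```javascript
// Hypothetical helper: runs the offer/answer exchange between two peer connections.
async function negotiate(pcA, pcB, deliver = async (description) => description) {
  // A creates an offer and sets it as its local description.
  const offer = await pcA.createOffer();
  await pcA.setLocalDescription(offer);

  // The offer travels over signaling; B sets it as its remote description.
  await pcB.setRemoteDescription(await deliver(offer));

  // B creates an answer and sets it as its local description.
  const answer = await pcB.createAnswer();
  await pcB.setLocalDescription(answer);

  // The answer travels back over signaling; A sets it as its remote description.
  await pcA.setRemoteDescription(await deliver(answer));
}
```

For a same-page experiment in a browser, both arguments could be created with `new RTCPeerConnection(config)`. Adding a track or calling `pcA.createDataChannel(...)` before `negotiate` matters, because an offer created with nothing to send describes no media or data sections. ICE candidates still have to be exchanged separately before data flows.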