AI and VoIP Blog

How WebSocket Works in WebRTC (With Trace Analysis and FAQs)

WebSocket is a widely used transport for signaling in WebRTC. In this post we take a close look at how a WebSocket connection operates in a WebRTC session, from the TCP handshake to closing the connection. To follow along, you can download the trace from my GitHub repository here.

  • TCP Handshake
    WebSocket communication begins with a TCP handshake, establishing a reliable connection between the client and the server.
TCP handshake
The client (port 51724) initiates a TCP connection to the server (port 7880) with a SYN packet. A SYN-ACK and an ACK follow to complete the three-way handshake (packets 15-17).
Note: The server in this trace listens on port 7880, but WebSocket servers generally use port 80 (ws://) or 443 (wss://).
  • HTTP Upgrade Request
    After the TCP handshake, the client sends an HTTP Upgrade request to transition to the WebSocket protocol. The server responds with a status code 101 Switching Protocols, confirming the upgrade.
HTTP Upgrade Request
HTTP Upgrade exchange, packets 19 and 21.
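To make the upgrade step concrete, here is a minimal Python sketch of the two pieces of the handshake: the request the client sends and the Sec-WebSocket-Accept value the server returns alongside 101 Switching Protocols. The GUID and the sample key come from RFC 6455; the host and path in the usage are hypothetical and are not taken from the trace.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_request(host: str, path: str, key: str) -> str:
    """Build the HTTP/1.1 request that asks the server to switch to WebSocket."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines)


# Sample key from RFC 6455; host and path are hypothetical.
sample_key = "dGhlIHNhbXBsZSBub25jZQ=="
request = build_upgrade_request("localhost:7880", "/", sample_key)
accept = compute_accept(sample_key)
print(request)
print("Expected Sec-WebSocket-Accept:", accept)
```

A client compares the Sec-WebSocket-Accept header in the 101 response against this computed value and aborts the connection if they differ.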
  • Persistent WebSocket Connection
    Once the protocol switches, the WebSocket connection becomes a persistent, full-duplex channel for signaling. This is used to exchange Session Description Protocol (SDP) messages and Interactive Connectivity Establishment (ICE) candidates.
WebSocket binary packets
WebSocket binary packets from server to client.
STUN requests and responses
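The WebSocket binary packets in the trace all share the same frame header layout. As a rough sketch (not the parser any particular tool uses), the following Python function decodes the header fields defined in RFC 6455: the FIN bit, the opcode (0x2 means binary), the MASK bit, and the payload length. The sample bytes are made up for illustration and do not come from the trace.

```python
import struct
from typing import NamedTuple


class FrameHeader(NamedTuple):
    fin: bool          # True if this is the final fragment of a message
    opcode: int        # 0x1 text, 0x2 binary, 0x8 close, 0x9 ping, 0xA pong
    masked: bool       # True for client-to-server frames
    payload_len: int   # Payload length in bytes
    header_len: int    # Offset at which the payload starts


def parse_frame_header(data: bytes) -> FrameHeader:
    """Decode the header of a single WebSocket frame (RFC 6455, section 5.2)."""
    if len(data) < 2:
        raise ValueError("need at least 2 bytes for a frame header")
    b0, b1 = data[0], data[1]
    fin = bool(b0 & 0x80)
    opcode = b0 & 0x0F
    masked = bool(b1 & 0x80)
    length = b1 & 0x7F
    offset = 2
    if length == 126:    # 16-bit extended payload length follows
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif length == 127:  # 64-bit extended payload length follows
        (length,) = struct.unpack_from("!Q", data, offset)
        offset += 8
    if masked:           # 4-byte masking key sits before the payload
        offset += 4
    return FrameHeader(fin, opcode, masked, length, offset)


# Illustrative frame: FIN set, binary opcode, unmasked, 5-byte payload.
sample = bytes([0x82, 0x05]) + b"hello"
header = parse_frame_header(sample)
print(header)
print(sample[header.header_len:header.header_len + header.payload_len])
```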
  • Masked Messages
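    WebSocket clients (such as browsers) must mask every frame they send to the server, while frames from the server to the client are never masked. Masking XORs the payload with a random 4-byte key carried in the frame header. Its purpose is to stop intermediaries such as caching proxies from misreading client-controlled payloads as HTTP traffic; it does not provide confidentiality. This is seen in the trace, where messages from the client are marked as [MASKED].
WebSocket client masked messages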
WebSocket messages sent by the client are labeled [MASKED].
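Masking itself is a simple XOR, which a few lines of Python can illustrate. This is a sketch under stated assumptions: the helper names are my own, only short payloads (under 126 bytes) are handled, and the fixed key in the usage is the one from the RFC 6455 example rather than anything taken from the trace. A real client picks a fresh random key for every frame.

```python
import os
from typing import Optional


def apply_mask(payload: bytes, masking_key: bytes) -> bytes:
    """XOR each payload byte with the masking key; the same call also unmasks."""
    if len(masking_key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    return bytes(b ^ masking_key[i % 4] for i, b in enumerate(payload))


def build_masked_text_frame(text: str, masking_key: Optional[bytes] = None) -> bytes:
    """Build a single-frame, masked text message with a payload under 126 bytes."""
    payload = text.encode("utf-8")
    if len(payload) > 125:
        raise ValueError("this sketch only handles short payloads")
    key = masking_key if masking_key is not None else os.urandom(4)
    # 0x81 = FIN + text opcode; 0x80 in the second byte sets the MASK bit.
    header = bytes([0x81, 0x80 | len(payload)])
    return header + key + apply_mask(payload, key)


# Masking key from the RFC 6455 "Hello" example, fixed here so the output is repeatable.
frame = build_masked_text_frame("Hello", bytes.fromhex("37fa213d"))
print(frame.hex())
print(apply_mask(frame[6:], frame[2:6]))  # the server unmasks with the same key
```

Because XOR is its own inverse, the server recovers the payload by applying the same key again.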
  • Media Transport
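    After signaling, media no longer touches the WebSocket. Audio and video travel as SRTP, with keys negotiated over DTLS, and data channels run over SCTP on top of DTLS. These packets usually flow over UDP, either directly between peers or through a media server or TURN relay. WebSocket is not involved in the media path.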
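Since STUN, DTLS, and SRTP packets typically share a single UDP port, a receiver has to tell them apart. Here is a small Python sketch of that demultiplexing, based on the first-byte ranges given in RFC 7983. The function name is my own and the sample packets are truncated illustrations, not packets from the trace.

```python
def classify_packet(packet: bytes) -> str:
    """Classify a UDP payload by its first byte, following the RFC 7983 ranges."""
    if not packet:
        raise ValueError("empty packet")
    first = packet[0]
    if first <= 3:
        return "STUN"
    if 20 <= first <= 63:
        return "DTLS"
    if 64 <= first <= 79:
        return "TURN ChannelData"
    if 128 <= first <= 191:
        return "RTP/RTCP"  # SRTP and SRTCP keep the same first byte as RTP/RTCP
    return "unknown"


# Illustrative first bytes only; real packets are much longer.
samples = {
    "STUN Binding Request": bytes([0x00, 0x01]),
    "DTLS handshake record": bytes([0x16, 0xFE, 0xFD]),
    "RTP version 2": bytes([0x80, 0x6F]),
}
for name, data in samples.items():
    print(f"{name}: {classify_packet(data)}")
```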
  • Termination Process
    A WebSocket connection terminates through a process called the close handshake, which either the client or the server can initiate. Here’s how it works:
WebSocket close handshake
WebSocket connection close, packets 1275-1281
  • Step 1: Close Frame Sent
    • Either the client or the server sends a Close frame (Opcode: 0x8) to indicate that it wants to close the connection.
    • The Close frame may include a status code (2 bytes) and an optional reason (text string).
  • Step 2: Acknowledgment
    • The receiving party acknowledges the Close frame by sending its own Close frame back.
    • This confirms that both sides agree to terminate the connection.
  • Step 3: TCP Connection Closure
    • Once both Close frames are exchanged, the underlying TCP connection is closed.
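The Close frame described in the steps above can be sketched in Python as follows. The helper names are hypothetical, and the frame builder produces the unmasked form a server would send (a client would additionally mask it). Status code 1000 is the normal-closure code from RFC 6455.

```python
import struct
from typing import Optional, Tuple

CLOSE_OPCODE = 0x8
NORMAL_CLOSURE = 1000


def build_close_payload(code: int = NORMAL_CLOSURE, reason: str = "") -> bytes:
    """Encode the 2-byte big-endian status code followed by a UTF-8 reason."""
    payload = struct.pack("!H", code) + reason.encode("utf-8")
    if len(payload) > 125:
        raise ValueError("control frame payloads are limited to 125 bytes")
    return payload


def parse_close_payload(payload: bytes) -> Tuple[Optional[int], str]:
    """Decode a Close frame payload into (status code, reason)."""
    if not payload:
        return None, ""  # a Close frame may carry no payload at all
    if len(payload) < 2:
        raise ValueError("a non-empty close payload needs a 2-byte status code")
    (code,) = struct.unpack("!H", payload[:2])
    return code, payload[2:].decode("utf-8")


def build_unmasked_close_frame(code: int = NORMAL_CLOSURE, reason: str = "") -> bytes:
    """Build a server-side Close frame: FIN + opcode 0x8, then length, then payload."""
    payload = build_close_payload(code, reason)
    return bytes([0x80 | CLOSE_OPCODE, len(payload)]) + payload


frame = build_unmasked_close_frame(NORMAL_CLOSURE, "bye")
print(frame.hex())
print(parse_close_payload(frame[2:]))
```

The peer that receives this frame replies with its own Close frame, typically echoing the same status code, and the TCP connection is then torn down.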

FAQs

  1. Why is WebSocket used for signaling in WebRTC?
    WebSocket provides a persistent, real-time, full-duplex connection, perfect for exchanging SDP and ICE candidates required for WebRTC setup.
  2. Does WebRTC specify a signaling protocol?
    No, WebRTC doesn’t mandate a signaling protocol. WebSocket, SIP, or custom protocols can be used.
  3. Can SIP work over WebSocket in WebRTC?
    Yes, SIP over WebSocket (as per RFC 7118) is a common setup for WebRTC.
  4. Can SIP be used without WebSocket in WebRTC?
    Yes, SIP itself can run over other transports such as TCP or UDP, but browsers cannot open raw TCP or UDP sockets from a web page, so a browser-based client needs WebSocket (or a gateway in between) to speak SIP.
  5. Is WebSocket supported by browsers because it uses HTTP/HTTPS?
    Yes, WebSocket starts with an HTTP/HTTPS handshake, leveraging standard web infrastructure and ports (80/443), which are browser-friendly.
  6. What are the advantages of using WebSocket without SIP?
    Simplicity: You can create lightweight, custom protocols tailored to your application’s needs.
    Flexibility: Not tied to SIP’s structure or semantics, which might be overkill for some applications.
    Reduced Overhead: SIP includes many headers and mechanisms that may be unnecessary for certain applications.
    Broader Use Cases: WebSocket is not limited to VoIP or RTC but can be used for any real-time data exchange.
  7. What are the different types of WebSocket connections?
    Unencrypted WebSocket (ws://): Runs over plain TCP and transmits data in clear text, so it is not secure. Best for non-sensitive use on trusted networks (e.g., ws://example.com/chat).
    Encrypted WebSocket or WebSocket Secure (wss://): Runs over TLS, which encrypts data for confidentiality and integrity. Ideal for internet use (e.g., wss://example.com/chat).
  8. TCP is a Layer 4 protocol; which layer does WebSocket belong to?
    WebSocket is an application-layer protocol (Layer 7 of the OSI model), like SIP.
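To illustrate the ws:// versus wss:// distinction from question 7, here is a short Python sketch that works out the host, port, and TLS requirement from a WebSocket URL. The function name and the localhost URL are hypothetical; the example.com URLs are the ones used above, and the default ports are 80 for ws:// and 443 for wss://.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"ws": 80, "wss": 443}


def describe_ws_url(url: str) -> dict:
    """Return the connection parameters implied by a ws:// or wss:// URL."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a WebSocket URL: {url}")
    return {
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS[parts.scheme],  # fall back to the scheme default
        "tls": parts.scheme == "wss",
        "path": parts.path or "/",
    }


for url in ("ws://example.com/chat", "wss://example.com/chat", "ws://localhost:7880/"):
    print(url, "->", describe_ws_url(url))
```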

Key Takeaways
• WebSocket is a common choice for WebRTC signaling and connection setup.
• The process starts with a TCP handshake, followed by an HTTP upgrade.
• Persistent WebSocket connections enable the exchange of signaling data, as seen in the provided trace.
• After signaling, WebRTC carries media over its own transport (DTLS and SRTP), outside the WebSocket connection.
• WebSocket connections close gracefully, through an exchange of WebSocket Close frames followed by the underlying TCP FIN/ACK exchange.

Additional Resources
SIP: https://de.wikipedia.org/wiki/Session_Initiation_Protocol
SDP: https://en.wikipedia.org/wiki/Session_Description_Protocol
ICE: https://en.wikipedia.org/wiki/Interactive_Connectivity_Establishment

Akash Gupta
Senior VoIP Engineer and AI Enthusiast


