Real-time voice and video communication on the Internet is mainstream today, with several popular instant messengers (IMs) supporting VoIP calls. A big hurdle in the initial adoption of VoIP was that most PCs and other devices sit behind firewalls and use private IP addresses. A firewall maps multiple private addresses (IP address and port) in the network to a single public IP address using a technique called Network Address Translation (NAT). The end device is not aware of its public address, however, and so cannot receive voice traffic from the remote party on the private address it advertises in its VoIP signaling. One solution to this NAT traversal problem is a tool called Session Traversal Utilities for NAT (STUN), devised by the IETF to allow applications to discover their public address and port mappings for use in communication with a peer.
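To make the address-hiding problem concrete, here is a toy model of a port-translating NAT. It is only a sketch: the class name `ToyNat`, the public address 203.0.113.10 (a documentation address) and the starting port 20000 are all made up for illustration, and real NATs differ in how they allocate and expire mappings.

```python
class ToyNat:
    """Toy port-translating NAT: many private (ip, port) pairs share one public IP."""

    def __init__(self, public_ip, first_port=20000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.table = {}  # (private_ip, private_port) -> public_port

    def outbound(self, private_ip, private_port):
        """Return the (public_ip, public_port) the outside world sees for this sender."""
        key = (private_ip, private_port)
        if key not in self.table:
            self.table[key] = self.next_port
            self.next_port += 1
        return self.public_ip, self.table[key]

    def inbound(self, public_port):
        """Return the private (ip, port) behind a public port, or None if unmapped."""
        for key, port in self.table.items():
            if port == public_port:
                return key
        return None


# Hypothetical public address; the private addresses are the two hosts in this post.
nat = ToyNat("203.0.113.10")
seen_a = nat.outbound("192.168.1.3", 23880)
seen_b = nat.outbound("192.168.1.5", 19256)
print(seen_a, seen_b)
```

Neither host can read this table, which is why a client needs an outside observer such as a STUN server to tell it what its mapping looks like.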
Below, I’ve tried to deconstruct a Yahoo Messenger voice call with the hope of understanding how STUN is used in NAT traversal.
Using Wireshark, I captured the traffic for a call between me (private IP address 192.168.1.3) and a remote user in my own network (private IP address 192.168.1.5).
We first see my IM client do a DNS resolution for Yahoo’s STUN service at ‘beta.stun.voice.yahoo.com’, yielding two IP addresses 220.127.116.11 and 74.
We then see my IM client send STUN requests to both of these Yahoo STUN servers on the standard STUN port 3478. The STUN response in the picture below shows my public IP address/port (called the server reflexive candidate) in the MAPPED-ADDRESS attribute as 18.104.22.168/23885.
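The request/response exchange above can be sketched in a few lines of Python. This is a minimal illustration rather than what the Yahoo client actually does: it assumes the RFC 5389 message layout (a 20-byte header with a magic cookie and a 12-byte transaction ID), whereas an older client may use the earlier RFC 3489 framing with a 16-byte transaction ID. The function names are my own, and `query_stun` is defined but not run here.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
ATTR_MAPPED_ADDRESS = 0x0001


def build_binding_request(transaction_id=None):
    """Return (packet, transaction_id) for a Binding Request with no attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    # Header: message type, message length (0, no attributes), magic cookie.
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id, transaction_id


def parse_mapped_address(packet, transaction_id):
    """Return (ip, port) from an IPv4 MAPPED-ADDRESS attribute, or None if absent."""
    if len(packet) < 20:
        raise ValueError("packet shorter than a STUN header")
    msg_type, msg_len, _cookie = struct.unpack("!HHI", packet[:8])
    if msg_type != BINDING_SUCCESS or packet[8:20] != transaction_id:
        raise ValueError("not a success response to our request")
    pos, end = 20, 20 + msg_len
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", packet[pos:pos + 4])
        value = packet[pos + 4:pos + 4 + attr_len]
        if attr_type == ATTR_MAPPED_ADDRESS and attr_len >= 8:
            # Value layout: 1 reserved byte, 1 family byte, 2-byte port, address.
            family, port = struct.unpack("!xBH", value[:4])
            if family == 0x01:  # IPv4
                return socket.inet_ntoa(value[4:8]), port
        # Attributes are padded to a multiple of 4 bytes.
        pos += 4 + attr_len + (-attr_len % 4)
    return None


def query_stun(host, port=3478, timeout=2.0):
    """Send one Binding Request over UDP and return the mapped (ip, port)."""
    request, tid = build_binding_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (host, port))
        response, _ = sock.recvfrom(2048)
    return parse_mapped_address(response, tid)
```

Calling `query_stun` with the host name of a reachable STUN server would return the address and port that server saw the request arrive from, which is the server reflexive candidate described above. Servers that follow RFC 5389 may also (or only) return XOR-MAPPED-ADDRESS, which this sketch does not decode.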
The public address thus discovered via STUN is then communicated in the SIP (Session Initiation Protocol) session between my IM client and Yahoo’s SIP server (sip120-p3.voice.sp2.yahoo.com at 22.214.171.124) over TCP. Following this TCP stream in Wireshark, in the picture below, we see a SIP INVITE from me to my remote party whose payload carries a list of all possible IP addresses/ports (candidates) where I can receive the media flows. The list includes both my private IP address 192.168.1.3/23880 and my public addresses discovered using STUN.
The remote party (192.168.1.5) replies with a SIP OK message carrying its own candidate list, ordered by priority.
The two endpoints then exchange a series of STUN connectivity checks against each candidate on the list and arrive at a candidate pair to send and receive media. In this case, the candidate pair selected is (192.168.1.3/23880, 192.168.1.5/19256) - the private addresses of the two endpoints.
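The selection step can be sketched as follows. I am assuming the priority formulas from the IETF's ICE specification (RFC 5245); the capture does not tell me how Yahoo's client actually ranks candidates. The remote party's public address 198.51.100.7/40000 is invented for the example, and the `same_lan` function is a stand-in for real STUN connectivity checks.

```python
from dataclasses import dataclass
from itertools import product

# Type preferences recommended by RFC 5245 (assumed here, not taken from the capture).
TYPE_PREFERENCE = {"host": 126, "srflx": 100}


@dataclass(frozen=True)
class Candidate:
    kind: str   # "host" (private address) or "srflx" (server reflexive)
    ip: str
    port: int

    def priority(self, local_pref=65535, component_id=1):
        """Candidate priority: 2^24 * type pref + 2^8 * local pref + (256 - component)."""
        return (TYPE_PREFERENCE[self.kind] << 24) + (local_pref << 8) + (256 - component_id)


def pair_priority(controlling, controlled):
    """Pair priority: 2^32 * min(G, D) + 2 * max(G, D) + (1 if G > D else 0)."""
    g, d = controlling.priority(), controlled.priority()
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def select_pair(local, remote, check):
    """Try pairs from highest to lowest priority; return the first that passes check."""
    pairs = sorted(product(local, remote), key=lambda p: pair_priority(*p), reverse=True)
    for local_cand, remote_cand in pairs:
        if check(local_cand, remote_cand):
            return local_cand, remote_cand
    return None


local = [Candidate("host", "192.168.1.3", 23880),
         Candidate("srflx", "18.104.22.168", 23885)]
remote = [Candidate("host", "192.168.1.5", 19256),
          Candidate("srflx", "198.51.100.7", 40000)]  # hypothetical remote public address


def same_lan(a, b):
    """Stand-in for a STUN connectivity check: only the two LAN hosts reach each other."""
    return a.ip.startswith("192.168.1.") and b.ip.startswith("192.168.1.")


selected = select_pair(local, remote, same_lan)
print(selected)
```

Because host candidates get the highest type preference, the private-to-private pair is tried first, which is consistent with the pair selected in this capture, where both clients sit on the same network.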
This is how the IETF's STUN-based approach works around address hiding to keep endpoints reachable. Clients for the proprietary VoIP application Skype and the peer-to-peer application BitTorrent are believed to use variations of this technique to traverse NAT as well.