For as long as the Internet has been on people's desktops, the promise of Internet telephony has been along for the ride. But while some of that promise has been realised -- voice over IP (VOIP) is now a standard service included in many mainstream routers and office telephony switches -- both telephony and its more glamorous cousin videoconferencing remain minority pastimes. Of the exciting new applications promised, such as unified messaging, web-based call centres and so on, almost nothing exists.
The problem lies with standards -- there are simultaneously too many to make interoperability easy, and too few to be sure that what you want to work will work at all. In part, this is because of the genuine technical difficulties in putting high bandwidth, low latency services over a network as heterogeneous as the Internet; in part because Internet telephony lives at the volcano-ridden fault line between the two tectonic plates of computer networking and traditional telephone services.
The first standard for videoconferencing, telephony and other multimedia use of computer networks was H.323, created under the auspices of the ITU in 1996 and augmented many times since. The last major revision was version 4 in 2000. The standard has also acquired H.248 to control gateway functions and H.235 for its security framework; future additions may include H.460, for adding extensions such as number portability across locations, and the H.5xx series for mobile users. Although much in the protocol family leans towards a world of 64k pipes through ATM networks, it was originally designed to run over LANs and has since grown to encompass WANs. Calls go between endpoints -- phones, computers, video-conferencing systems -- while gateways handle the media and signalling where a call crosses into another kind of network, such as the public telephone system. Optionally, a gatekeeper will do call management, routing and address resolution.
SIP -- the Session Initiation Protocol -- is relatively new, only coming to prominence over the past couple of years. Strictly speaking, it isn't a telephony protocol; instead, it does the call set-up, error handling and inter-process signalling that goes along with any point-to-point connection. When used for telephony it most often uses the same underlying streaming protocol, RTP, as H.323. SIP is -- or was, when it was born -- as simple as H.323 is complex: a text-based protocol that finds the recipient of a call, checks that it has capabilities congruent with the caller's, and then lets other protocols take care of the details of data transfer, security and so on. It was created by the IETF, and like H.323 has grown a number of related protocols. Among others, SIP-CPL, the Call Processing Language, is a scripting language based on XML tags, and SIP-CGI defines how server-resident scripts can communicate with applications. Unlike H.323, SIP moves a lot of the work of call management and routing out among different parts of the network.
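To give a flavour of what "text-based" means in practice, here is a minimal sketch in Python that assembles a bare-bones SIP INVITE and reads it back. The header names follow the general shape of a SIP request, but the addresses, tag, branch value and the helper names `build_invite` and `parse_message` are all made up for illustration; a real implementation has to handle far more than this.

```python
# Illustrative only: the users, hosts, tag and branch values are invented.
EXAMPLE_SDP = (
    "v=0\r\n"
    "o=alice 1 1 IN IP4 192.168.1.20\r\n"
    "s=call\r\n"
    "c=IN IP4 192.168.1.20\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
)


def build_invite(caller, callee, caller_host, call_id, sdp_body):
    """Assemble a simplified SIP INVITE as plain text."""
    body_length = len(sdp_body.encode("utf-8"))
    user = caller.split("@")[0]
    lines = [
        f"INVITE sip:{callee} SIP/2.0",           # request line
        f"Via: SIP/2.0/UDP {caller_host};branch=z9hG4bK-example",
        "Max-Forwards: 70",
        f"To: <sip:{callee}>",
        f"From: <sip:{caller}>;tag=1234",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{user}@{caller_host}>",
        "Content-Type: application/sdp",          # the body describes the media
        f"Content-Length: {body_length}",
    ]
    # A blank line separates the headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n" + sdp_body


def parse_message(text):
    """Split a message into method, URI, version, headers and body."""
    head, _, body = text.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, uri, version, headers, body


message = build_invite(
    "alice@example.com", "bob@example.org", "192.168.1.20", "abc123", EXAMPLE_SDP
)
method, uri, version, headers, body = parse_message(message)
```

The whole exchange can be read, logged and debugged with nothing more than a text editor, which is a large part of SIP's appeal to those raised on HTTP and SMTP. The body here is a session description listing the media the caller can handle; the streaming itself is left to RTP.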
Both H.323 and SIP have problems with aspects of today's networks. Network address translation -- NAT -- is particularly difficult for peer-to-peer systems, as the network address of a machine behind a NAT gateway makes no sense outside the LAN on which that machine lives. The gateway itself must understand how to translate all packets intended for the machine, and with a complex protocol such as H.323 that can consume considerable resources and require additional management. Firewalls too need to understand either protocol: there is a fundamental tension between exposing every individual device to the wider network for data transfer, and shielding information from the network for security reasons.
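The NAT problem is easiest to see with a session description in hand. The sketch below, again in Python and again with invented addresses and hypothetical helper names, checks whether the address advertised inside the message body is a private one, and then shows the sort of rewriting a protocol-aware gateway would have to perform. It is a simplified illustration of the idea, not how any particular product works.

```python
import ipaddress

# Private address ranges that make no sense outside the LAN.
PRIVATE_NETWORKS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# Illustrative session description; all values are invented.
EXAMPLE_SDP = (
    "v=0\r\n"
    "o=alice 1 1 IN IP4 192.168.1.20\r\n"
    "s=call\r\n"
    "c=IN IP4 192.168.1.20\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
)


def sdp_connection_address(sdp):
    """Return the IPv4 address on the connection (c=) line, or None."""
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            return line.split()[2]
    return None


def unreachable_from_outside(sdp):
    """True if the advertised media address is a private LAN address."""
    addr = sdp_connection_address(sdp)
    if addr is None:
        return False
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in PRIVATE_NETWORKS)


def rewrite_for_nat(sdp, public_addr):
    """Swap the private address for a public one wherever it appears."""
    private = sdp_connection_address(sdp)
    if private is None:
        return sdp
    lines = []
    for line in sdp.splitlines():
        tokens = [public_addr if t == private else t for t in line.split(" ")]
        lines.append(" ".join(tokens))
    return "\r\n".join(lines) + "\r\n"


rewritten = rewrite_for_nat(EXAMPLE_SDP, "203.0.113.5")
```

Note that the rewritten body is a different length from the original, so any length field in the enclosing message has to be recalculated as well. An ordinary NAT gateway only rewrites packet headers; it is this need to reach into the payload, and understand what it finds there, that makes telephony protocols awkward guests on today's networks.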
Many of the arguments between the SIP and the H.323 camps will be familiar to anyone who remembers when OSI standards first met TCP/IP: H.323 is a widely deployed and mature standard, says one group; it's a legacy of the first generation and will be quietly worked around, says the other. H.323 is flexible, capable and scalable -- or it's cumbersome, over-engineered and over-complex. SIP is simple, lightweight and open -- or it needs complex interactions with other standards, isn't supported widely and has potential for intellectual property disputes.
Fortunately, nobody needs either H.323 or SIP today or tomorrow. Both can potentially save money and give a better ROI on current systems by coupling voice with data more effectively. Both require awareness from existing network components. No killer application has yet been discovered that makes Internet telephony absolutely essential. Treating the claims and counterclaims of both camps with a degree of scepticism while demanding that advantages are demonstrated, not merely promised, will serve network planners and implementers well while things settle down. Building in volcano zones may be exciting, but it's not for the risk-averse.