It is important to understand what WebRTC can do for you, but it is equally important to understand what WebRTC may do to you.
Introduction
WebRTC is an emerging standard that enables real-time voice, video and data sharing in a Web browser without the need for browser plugins. Potentially billions of devices supporting a browser--PCs, laptops, smartphones, tablets and a host of new devices--from a variety of manufacturers will be real-time communications-enabled. Whereas browsers have typically interacted only with one or more Web servers, WebRTC allows browsers to exchange media and data with one another directly and in a secure manner.
Although third-party programs like Skype have been around for a long time, and some browser-based plugins have been available for limited communications interactions, the implications WebRTC brings to organizations of all types and sizes are enormous. Ubiquitous voice, video, and data for gaming, customer service, communications, and personal and group engagement open a new world of possibilities for innovation and disruption.
The transformative power behind WebRTC is that ordinary Web developers using just JavaScript Application Programming Interfaces (APIs) can craft fully functioning voice, video and data collaboration applications or embed these capabilities within other applications with just a few lines of code.
A WebRTC Primer
WebRTC (Web real-time communications) is an effort to create an open framework for embedding real-time communications capabilities into Web browsers. WebRTC allows HTML5 Web programmers, with no telecommunications skills and using simple JavaScript APIs, to surface real-time audio and video functionality in Web servers and in browser-based applications running on computers, laptops, tablets and smartphones without the need for browser plugins or third-party applications.
Two standards bodies are involved in creating the WebRTC standards: the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF). The W3C is tasked with creating the Web APIs used in WebRTC, while the IETF focuses on the underlying communications and data transfer protocols. The two groups collaborate on the WebRTC specifications.
Powered by a Triangular P2P Architecture
The WebRTC architecture involves Web servers and browser clients. The Web server "serves up" Web applications with embedded JavaScript, and the browser clients (PCs, tablets, smartphones) run the JavaScript application. Traditionally, Web browsers have communicated only with Web servers. What is unique about WebRTC is that the Web application can now enable peer-to-peer (P2P) communications between two browser clients (see Figure 1).
Figure 1. WebRTC's Triangle Architecture (Adapted from "WebRTC: APIs and RTCWeb Protocols of the HTML5 Real-Time Web", Johnston, Alan B. and Daniel C. Burnett, First Edition, September 2012, Digital Codex LLC)
While the control data flows between the browser client and the Web server, the audio and video streams flow directly between the browsers. Sending media directly between browsers matters because voice and video are very sensitive to network latency and jitter; a direct path avoids extra hops through a server, each of which could introduce additional impairments.
WebRTC enables point-to-point browser communications as well as multipoint communications sessions. In a multipoint session, each browser sends and receives audio, video and data streams to and from every other browser in the session in a fully meshed configuration (see Figure 2).
Figure 2. Fully Meshed Peer Connections in WebRTC Multi-Point Communications Sessions
Keep in mind, WebRTC will not scale particularly well in many-to-many situations due to the processing power and network bandwidth required for all of the individual peer-to-peer connections that must be established. Consequently, audio and video bridging infrastructure may be required for large meetings with numerous endpoints.
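The scaling concern can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `meshLoad` and the figure of 500 kbps per stream are assumptions, not values from any WebRTC specification. It counts the peer connections in a full mesh and the upstream bandwidth each browser would need if it sent one stream to every other participant.

```typescript
// Load estimate for a fully meshed session (illustrative assumptions only).
interface MeshLoad {
  peerConnections: number;       // distinct browser-to-browser links in the session
  connectionsPerBrowser: number; // links each browser must maintain
  uplinkKbpsPerBrowser: number;  // upstream bandwidth each browser needs
}

function meshLoad(participants: number, kbpsPerStream: number): MeshLoad {
  if (!Number.isInteger(participants) || participants < 2) {
    throw new RangeError("a session needs at least two participants");
  }
  // Every browser connects to every other browser.
  const connectionsPerBrowser = participants - 1;
  return {
    // Each link is shared by two browsers, so divide by two.
    peerConnections: (participants * connectionsPerBrowser) / 2,
    connectionsPerBrowser,
    uplinkKbpsPerBrowser: connectionsPerBrowser * kbpsPerStream,
  };
}

// 500 kbps per stream is an assumed figure chosen for illustration.
[2, 4, 10].forEach((n) => console.log(n, meshLoad(n, 500)));
```

Under these assumptions, a four-party meeting needs 6 peer connections and 1,500 kbps of uplink per browser, while a ten-party meeting needs 45 connections and 4,500 kbps per browser, which is why bridging infrastructure becomes attractive as meetings grow.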
The good news is that the majority of multipoint audio or video meetings typically involve only three or four endpoints. But these have typically been room or group endpoints. WebRTC will enable individuals to meet in multipoint video conferences, and recent data indicates that the number of endpoints participating in such conferences is increasing because people no longer congregate in three to four conference rooms for video meetings.
WebRTC Requires Directory Services
One of the elements WebRTC does not supply is a directory service. A directory is necessary so that WebRTC users can find one another. This capability could be termed a "rendezvous service".
WebRTC directory services must be supplied by the application developer. In many cases, directory services will be provided by interfacing with a website's authentication mechanism or with an existing enterprise directory.
When a browser connects to a website, the application can ask the user for login credentials. As the user is authenticated, the Web server creates a directory that maps authenticated users to active Web browsing sessions. Directory information can then be pushed down to the browser interface, allowing people to communicate with one another.
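As a rough illustration of the mapping just described, the following sketch keeps an in-memory directory on the Web server. Everything in it is hypothetical: the class name `RendezvousDirectory`, its methods, and the user and session identifiers are assumptions made for the example, and a production service would also need persistence, access control and presence updates.

```typescript
// Hypothetical in-memory directory mapping authenticated users to their
// active browsing sessions. All names are illustrative assumptions.
class RendezvousDirectory {
  private readonly sessionsByUser = new Map<string, Set<string>>();

  // Record a session once the website has authenticated the user.
  register(userId: string, sessionId: string): void {
    const sessions = this.sessionsByUser.get(userId) ?? new Set<string>();
    sessions.add(sessionId);
    this.sessionsByUser.set(userId, sessions);
  }

  // Drop a session when the browser disconnects or the user logs out.
  unregister(userId: string, sessionId: string): void {
    const sessions = this.sessionsByUser.get(userId);
    if (!sessions) return;
    sessions.delete(sessionId);
    if (sessions.size === 0) this.sessionsByUser.delete(userId);
  }

  // Sessions through which a user can currently be reached.
  lookup(userId: string): string[] {
    const sessions = this.sessionsByUser.get(userId);
    return sessions ? Array.from(sessions) : [];
  }

  // Roster that the server could push down to each browser.
  reachableUsers(): string[] {
    return Array.from(this.sessionsByUser.keys()).sort();
  }
}

const demoDirectory = new RendezvousDirectory();
demoDirectory.register("alice", "session-1");
demoDirectory.register("bob", "session-2");
console.log(demoDirectory.reachableUsers());
```

Keeping a set of sessions per user, rather than a single session, allows for someone who is logged in from both a laptop and a phone at the same time.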
An alternative scenario would be a customer service website that interfaces with a contact center. In this scenario, the user browsing the website does not authenticate; only the contact center agent requires authentication. The Web server can automatically create the linkage between the customer and a contact center agent through the contact center's routing software.
Directories can be simple or complex, but they will be required in order for users to establish communications sessions using WebRTC.
WebRTC Federates Using a Trapezoid Approach
Although WebRTC capabilities may soon be ubiquitous in the browsers most people use, the ability to reach out and connect to others who may not be connected to the same Web server is an essential capability. Consequently, Web servers running WebRTC may ultimately need to be able to federate with one another. Federating between WebRTC domains results in the trapezoid architecture (see Figure 3).
One issue Web developers must pay attention to is how the control data will be exchanged. WebRTC uses the Session Description Protocol (SDP) to describe communications parameters, but it does not specify the signaling protocol or message format used to carry those descriptions and to establish and control the communications session. These details are left up to each individual WebRTC application developer. Thus, developers wishing to federate with other WebRTC domains will need to ensure that they use common session initiation and control mechanisms.
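To show what agreeing on a common mechanism might involve, here is a minimal sketch of a signaling envelope that two domains could share. The message shape (`kind`, `from`, `to`, `sdp`), the function names and the JSON encoding are all assumptions made for illustration; WebRTC does not define them, and real deployments may well choose an existing protocol such as SIP or XMPP instead.

```typescript
// Hypothetical signaling envelope; WebRTC itself does not define this format.
type SignalKind = "offer" | "answer" | "bye";

interface SignalMessage {
  kind: SignalKind;
  from: string; // sender address, e.g. "alice@a.example" (assumed format)
  to: string;   // recipient address
  sdp?: string; // session description; required for offer and answer
}

function isSignalKind(value: unknown): value is SignalKind {
  return value === "offer" || value === "answer" || value === "bye";
}

function encodeSignal(msg: SignalMessage): string {
  return JSON.stringify(msg);
}

// Parse and validate a message received from another domain.
function decodeSignal(raw: string): SignalMessage {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("signal must be a JSON object");
  }
  const fields = data as Record<string, unknown>;
  const kind = fields.kind;
  const sender = fields.from;
  const recipient = fields.to;
  const sdp = fields.sdp;
  if (!isSignalKind(kind)) throw new Error("unknown signal kind");
  if (typeof sender !== "string" || typeof recipient !== "string") {
    throw new Error("signal needs string 'from' and 'to' fields");
  }
  if (typeof sdp === "string") {
    return { kind, from: sender, to: recipient, sdp };
  }
  if (kind !== "bye") throw new Error("offer and answer must carry SDP");
  return { kind, from: sender, to: recipient };
}

const demoOffer = encodeSignal({
  kind: "offer",
  from: "alice@a.example",
  to: "bob@b.example",
  sdp: "v=0", // placeholder, not a complete session description
});
console.log(decodeSignal(demoOffer).kind);
```

Validating on receipt matters more in a federated setting than inside a single application, because the sending server is written and operated by someone else.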
Figure 3. The WebRTC Trapezoid for Federation between Server Domains
WebRTC Voice and Video Protocols
The IETF has standardized on the wideband Opus and the narrowband G.711 codecs for audio in WebRTC. If Opus is used in a WebRTC application, then any interoperability with SIP or the PSTN would require a transcoding border element. If G.711 is used, then audio transcoding between WebRTC and SIP would not be required because almost all SIP systems have G.711 as an available codec.
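The transcoding decision comes down to whether the two sides share a codec. The sketch below is a simplified illustration, not how any particular browser or gateway negotiates: the function name `planAudio` and the result shape are assumptions, and real negotiation takes place through SDP offers and answers. The codec labels `PCMU` and `PCMA` are the RTP names for the mu-law and A-law variants of G.711.

```typescript
// Simplified view of audio interoperability between a WebRTC application
// and a SIP system. Names and result shape are illustrative assumptions.
interface AudioPlan {
  codec: string | null;      // codec both sides can use directly, if any
  needsTranscoding: boolean; // true when a border element must convert audio
}

// Pick the first codec, in the caller's preference order, that the far side
// also supports. Comparison ignores letter case.
function planAudio(preferred: string[], farSide: string[]): AudioPlan {
  const available = new Set(farSide.map((c) => c.toLowerCase()));
  const shared = preferred.find((c) => available.has(c.toLowerCase()));
  return { codec: shared ?? null, needsTranscoding: shared === undefined };
}

// An Opus-only application calling a G.711-only SIP system needs transcoding.
console.log(planAudio(["opus"], ["PCMU", "PCMA"]));
// Offering G.711 as a fallback lets the call proceed without transcoding.
console.log(planAudio(["opus", "PCMU"], ["PCMU", "PCMA"]));
```

Listing Opus first and G.711 second keeps wideband audio for browser-to-browser calls while still reaching SIP endpoints directly.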
Video in WebRTC is far from finalized. Google has been pushing the VP8 video codec, and it has spent at least $125 million to make it available royalty-free to any WebRTC implementation. However, most of the existing video infrastructure in the world does not use VP8. Existing infrastructure often supports H.264. Mobile devices also have H.264 capability embedded into their hardware chipsets. Consequently, H.264 is the codec preferred by many IETF members; however, it is not royalty-free.
There has been no vote on which video codecs would be mandatory to implement. Google has made VP8 available to developers, and WebRTC developers using Google Chrome and Mozilla Firefox can have video interoperability today using VP8.
A straw poll taken at a recent IETF meeting showed 70 members could live with H.264 as a mandatory-to-implement codec. In the same meeting, 50 members could live with VP8 as a mandatory-to-implement codec (people could raise their hands more than once).
Some are suggesting that the WebRTC standard should push forward without specifying a mandatory video codec, leaving it up to the market to decide which, if any, video codecs would be included. There are huge implications for both browser and hardware manufacturers, regardless of where this issue ultimately falls. Use of H.264 may not be such a huge licensing issue because most of the current browser and mobile device vendors have already paid the maximum licensing fee; the issue is that future codecs based on H.264 may have higher licensing costs, and choosing H.264 today as a mandatory codec may require much higher licensing fees in the future as H.265/HEVC becomes available.
Conclusion
WebRTC is already making headway into our everyday lives. Anyone running the latest version of Google Chrome has WebRTC capability already enabled. The automatic update for Mozilla Firefox (Firefox 22) will soon have WebRTC capabilities as well.
I am personally aware of over 70 companies that either are developing WebRTC browsers, toolkits, or service platforms or have already created solutions in use by end users. New entrants are appearing nearly every week. WebRTC-based video bridging capability is already available in the services offered by VidTel and Blue Jeans Network. Expect to see WebRTC-enabled customer engagement solutions from some of the big contact center companies later this year and in early 2014.
WebRTC has the potential to turn the communications and collaboration industry on its head over the next few years. Executives and product managers would do well to learn what WebRTC is all about. It is important to understand what WebRTC can do for you, but it is equally important to understand what WebRTC may do to you.
This article is an excerpt from Dr. Kelly's recently published report titled, "Ten Things CIOs Should Know About WebRTC."