A guide to WebRTC

Real-time communications standards aim to help more users speak with ease over the internet. Computer Weekly looks at how the systems work

This article can also be found in the Premium Editorial Download: Computer Weekly: Software defined networking explodes

Talking to someone over the internet is not as easy as it could be. Fundamentally, you need to find out what software your contact is using and make sure you are using the same. That’s all well and good if you’re both using a common platform such as Skype, but what if you’re using a PC and they are using FaceTime on a Mac?

Real-time communications is already in use across the web, with tools such as Google Hangouts, Microsoft’s Lync and Outlook.com, and Facebook all offering web-based voice and video communications. But, as with desktop tools, there is a catch: they all need plugins, and you also need to be sure that you have the right plugin version and that your browser’s security settings are not blocking it.

With most IT departments blocking browser plugins, it is clear that there needs to be another way.

Proprietary software and plugins are the bane of internet communications, be it voice or video. They get in the way of actually communicating, as you spend more time negotiating over what tool to use than on the eventual conversation.

That is where WebRTC comes in, as it is intended to be an open standard for video and voice communication, embedded in the software that is on every desktop – the browser. As it is both a browser technology and a protocol, standardisation is being handled by two different organisations – the W3C and the IETF – with much of the technology coming from Google.

In a WebRTC world you will be able to use a web-based communications service to start a conversation with a friend or colleague (or even a one-to-many or many-to-many web conference), with voice and video, as well as sharing files over a peer-to-peer connection. 

While a web service will mediate the call, the browser you are using will not matter, as the technology to work with your computer’s camera and microphone will be built-in, with no plugins needed. All that will be necessary is a few lines of JavaScript to set up the call.

WebRTC APIs

At the heart of WebRTC are three JavaScript application programming interfaces (APIs): MediaStream, RTCPeerConnection, and RTCDataChannel. 

MediaStream uses the getUserMedia JavaScript API to access device cameras and microphones, RTCPeerConnection handles audio and video calls, and RTCDataChannel is for peer-to-peer data transfers. Put together, the three APIs are the building blocks of a video chat application.

One interesting feature of the MediaStream API is how it works with audio and video, taking device output and delivering it as a stream – with separate streams for each connected device. It is easy to imagine applications that work with both front and rear cameras on a tablet, use multiple microphones, or even treat the screen display as a video stream – an approach that seems tailor-made for screen sharing or presentations. MediaStreams do not need to be sent over the internet; they can be used inside local applications, giving you the ability to quickly capture images and video that can be used elsewhere.
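As a rough illustration of local capture, the sketch below requests a camera and microphone stream and attaches it to a video element without sending anything over the network. It assumes the promise-based navigator.mediaDevices.getUserMedia form that browsers later settled on, rather than the prefixed callback versions in the experimental builds discussed in this article. The function name startLocalPreview and the injected mediaDevices parameter are illustrative choices, not part of any standard.

```javascript
// Sketch only: ask for camera and microphone, then show the result locally.
// `mediaDevices` is injected (defaulting to the browser's own object) so the
// function can also be exercised with a stand-in outside a browser.
async function startLocalPreview(
  videoElement,
  mediaDevices = globalThis.navigator?.mediaDevices
) {
  if (!mediaDevices || typeof mediaDevices.getUserMedia !== "function") {
    throw new Error("getUserMedia is not available in this environment");
  }
  // The browser prompts the user for permission before resolving.
  const stream = await mediaDevices.getUserMedia({ audio: true, video: true });
  // Attaching the stream to a <video> element keeps it entirely local.
  videoElement.srcObject = stream;
  return stream;
}
```

In a page, this might be called as startLocalPreview(document.querySelector("video")), where the video element is assumed to exist and to have autoplay set.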

Getting access to devices is only part of the story. The real complexity in building any real-time communications service is connecting users, and sensibly WebRTC leaves the choice of the underlying communications channel to web developers. That means you can use proven technologies such as SIP and XMPP in your applications, or choose any other communications protocol.

While there are libraries you can take advantage of to simplify development, the process of setting up a connection can be complex, using the Session Description Protocol (SDP) and the JavaScript Session Establishment Protocol. With SDP you can send information about camera resolution, the codecs in use, and the type of MediaStreams that can be used in a conversation. It will also deliver the information needed to traverse firewalls and handle peer-to-peer file exchanges.
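To give a feel for what an SDP description carries, here is a minimal sketch that lists the media sections in an SDP blob and the codec names advertised in each. The sample SDP is a hand-written, heavily trimmed assumption for illustration; real browser offers are far longer and also carry connection and security details. The function name summariseSdp is hypothetical.

```javascript
// Summarise an SDP blob: one entry per "m=" media section, with the codec
// names found in that section's "a=rtpmap" attribute lines.
function summariseSdp(sdp) {
  const sections = [];
  let current = null;
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith("m=")) {
      // Format: m=<media> <port> <proto> <payload types...>
      const [kind] = line.slice(2).split(" ");
      current = { kind, codecs: [] };
      sections.push(current);
    } else if (current && line.startsWith("a=rtpmap:")) {
      // Format: a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
      const match = /^a=rtpmap:\d+ ([^/]+)\//.exec(line);
      if (match) current.codecs.push(match[1]);
    }
  }
  return sections;
}

// Hand-written sample, not output captured from a real browser.
const exampleSdp = [
  "v=0",
  "o=- 0 0 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");

console.log(summariseSdp(exampleSdp));
```

For this sample the summary contains an audio section offering opus and a video section offering VP8.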

The PeerConnection API handles the actual work of passing MediaStreams between browsers. That means there is a lot under the hood – and a lot that needs to be built into your web browser. 

Browsers that support WebRTC will need to include much of the tooling that goes into an app like Skype – just without the user interface. Apps that use WebRTC will also need server infrastructure to handle the initial session negotiations and as a backup if there is no peer-to-peer path between browsers.
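The split between browser and server can be sketched from the caller's side as follows. The RTCPeerConnection methods used here (addTrack, createOffer, setLocalDescription, setRemoteDescription, addIceCandidate) are the promise-based forms in current browsers, which differ from the experimental APIs of the time. The signaling object, its send method and the message shapes are entirely hypothetical: WebRTC leaves that channel to the developer, so it could sit on top of SIP, XMPP, a WebSocket or anything else.

```javascript
// Caller side of an offer/answer exchange (sketch).
// `pc` is an RTCPeerConnection, `stream` a MediaStream, and `signaling` a
// HYPOTHETICAL object with a send(message) method backed by your own server.
async function placeCall(pc, stream, signaling) {
  // Hand the local audio and video tracks to the connection.
  for (const track of stream.getTracks()) {
    pc.addTrack(track, stream);
  }
  // Relay network candidates to the other side as the browser finds them.
  pc.onicecandidate = (event) => {
    if (event.candidate) {
      signaling.send({ type: "candidate", candidate: event.candidate });
    }
  };
  // Create the SDP offer, apply it locally, then send it via the server.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  signaling.send({ type: "offer", sdp: pc.localDescription.sdp });
}

// Apply messages that arrive from the remote side over the same channel.
async function handleSignal(pc, message) {
  if (message.type === "answer") {
    await pc.setRemoteDescription({ type: "answer", sdp: message.sdp });
  } else if (message.type === "candidate") {
    await pc.addIceCandidate(message.candidate);
  }
}
```

The server only relays these small messages; once both sides have exchanged descriptions and candidates, media flows directly between the browsers where a peer-to-peer path exists.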

Real-time communication is not just about voice and video though. It is a tool that is at the heart of many of the things we do online, such as business conferencing, instant messaging, virtual desktop infrastructures and even file transfers. 

WebRTC’s RTCDataChannel API uses browser-to-browser connections to work with any content type, with sample apps using it for gaming and peer-to-peer file transfer.
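A file transfer over a data channel usually means splitting the file into messages of modest size. The sketch below shows one way that might look; the 16 KiB chunk size is an assumption chosen as a conservative value, and the function name sendInChunks is hypothetical. The channel argument is anything with a send method that accepts binary data, which is what RTCDataChannel provides.

```javascript
// Assumed chunk size: 16 KiB is a conservative choice, not a value mandated
// by the standard.
const CHUNK_SIZE = 16 * 1024;

// Send a Uint8Array over a data channel in fixed-size chunks.
// Returns the number of messages sent.
function sendInChunks(channel, bytes, chunkSize = CHUNK_SIZE) {
  let messages = 0;
  for (let offset = 0; offset < bytes.byteLength; offset += chunkSize) {
    // subarray() creates a view without copying and clamps at the end of the data.
    channel.send(bytes.subarray(offset, offset + chunkSize));
    messages += 1;
  }
  return messages;
}
```

In a browser the channel would typically come from pc.createDataChannel("file") on an existing RTCPeerConnection, where the label "file" is arbitrary. A production version would also watch the channel's bufferedAmount so that large files do not flood the send buffer.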

WebRTC complexities

Any web-based communications tool built using WebRTC will need to use all these technologies to build the full-featured application users expect, along with an assortment of HTML5 technologies to display controls and video.

There is a lot that needs to be done, and while WebRTC is attempting to make the complex easy, it is not a complete success. For one thing, the structure of SDP and the process of setting up a communications channel make it harder to build a communications web app than most developers would like.

Luckily there are already libraries and services that simplify building an app, but they are as experimental as WebRTC. You can try them out, but be prepared for things to change – and to change your code. You can use the WebRTC internals tooling built into Chrome to see how your application works, and how it’s using your network – letting you tune video and audio for optimal network performance.

Part of agreeing on a standard is settling on the codecs that will be used. That is a more complex task than it might first appear, as there are two distinct blocs in the W3C – one focused on standardising open source video and audio codecs such as Google’s WebM, and one wedded to industry standards like H.264.

That aside, the WebRTC community seems to be coalescing around WebM and VP8 for video, with a range of different audio standards including the royalty-free Opus codec (based in part on Skype’s SILK).

Experimenting with WebRTC and CU-RTC-Web

WebRTC is not the only proposal for a browser-based communications platform. Microsoft’s Open Technologies subsidiary is working on the alternative CU-RTC-Web, which is being designed to support more than just video chat as it is a much lower-level protocol. 

One advantage of CU-RTC-Web over the current WebRTC is better support for mobile use cases, including on smartphones and between clients on different types of networks. A prototype CU-RTC-Web plugin is available on Microsoft’s HTML5Labs site.

But with WebRTC still under development, it is not surprising that implementations are currently experimental. Even so, there are services that you can use to see just how well your browsers support its core technologies – as well as support for WebRTC in the client software of telecoms-as-a-service provider Twilio.

Outside of the obvious business communications use cases, there have also been demonstrations of WebRTC being used as part of an emergency wireless network for quick deployment of telecoms infrastructure after a disaster. With much of the work on WebRTC coming from Google, and with the company moving to open standards in its Hangouts service, Google is likely to adopt WebRTC alongside its VP8 video codec – with the aim of delivering plugin-free web conferencing that it can build into its Google+ social network.

There is still a lot of work to be done before we get to that point, as WebRTC has to be agreed on in two different standards bodies. There is currently only experimental support in Opera, Chrome and Firefox – both on the desktop and on Android – with some interoperability issues, including Firefox and Chrome using different vendor prefixes for the same APIs. While that is a problem for anyone wanting to experiment with WebRTC, it is one that should go away once WebRTC is standardised.
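Until the prefixes disappear, experimental code tends to paper over the difference with a small lookup like the sketch below. The names webkitRTCPeerConnection and mozRTCPeerConnection are the prefixed constructors that Chrome and Firefox shipped during this experimental period; the helper name resolvePeerConnection and its scope parameter are illustrative assumptions.

```javascript
// Pick whichever peer connection constructor the browser exposes, preferring
// the unprefixed standard name. Returns null if none is present.
// `scope` defaults to the global object so it can be swapped for a stand-in.
function resolvePeerConnection(scope = globalThis) {
  return (
    scope.RTCPeerConnection ||
    scope.webkitRTCPeerConnection || // prefixed name used by Chrome
    scope.mozRTCPeerConnection || // prefixed name used by Firefox
    null
  );
}
```

Code written this way keeps working as browsers drop their prefixes, because the standard name is always tried first.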

When that milestone is reached, we can expect quick support from most browsers, and the rapid roll-out of commercial services, including gateways to commercial communications platforms and services.
