Concept

Latency

To understand the difficulties associated with VoIP over the Internet, one must first understand latency. Latency is the time delay between the two ends of a VoIP link. It can be measured either one-way or round trip. A one-way latency of 120 milliseconds is acceptable for remote radio operation. A round-trip latency of over 300 milliseconds is considered poor. Many factors and components add latency to the audio path. Some can be reduced or optimised, others cannot. Surprisingly, most of the latency can be generated on your own computer rather than on the Internet connection itself. One of the main causes is buffering.
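To make the two thresholds concrete, here is a minimal sketch of a one-way latency budget. The stage names and their millisecond values are purely illustrative assumptions, not measurements; only the 120 ms and 300 ms limits come from the text above.

```python
# Illustrative one-way latency budget. All component values are assumed.
ONE_WAY_ACCEPTABLE_MS = 120   # from the text: acceptable one-way latency
ROUND_TRIP_POOR_MS = 300      # from the text: round trips above this are poor


def one_way_latency_ms(components):
    """Sum the per-stage delays (ms) along one direction of the audio path."""
    return sum(components.values())


def rate_round_trip(round_trip_ms):
    """Classify a round-trip latency using the 300 ms threshold."""
    return "poor" if round_trip_ms > ROUND_TRIP_POOR_MS else "usable"


# Hypothetical stages of the audio path (values in milliseconds).
budget = {
    "capture_buffer": 40,    # soundcard input buffer on the sending side
    "encode_and_send": 5,    # codec and socket overhead
    "network": 30,           # Internet transit
    "playback_buffer": 40,   # soundcard output buffer on the receiving side
}

one_way = one_way_latency_ms(budget)
print(f"one-way: {one_way} ms, acceptable: {one_way <= ONE_WAY_ACCEPTABLE_MS}")
print(f"round trip: {2 * one_way} ms, rating: {rate_round_trip(2 * one_way)}")
```

Note how, in this made-up budget, the two soundcard buffers account for far more delay than the network does.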

Buffers

Let's start by briefly recapping why software buffers are needed. Playing back digitised radio audio requires a continuous stream of data to be fed from your remote station to the D-A (digital to analogue) converter of the soundcard in your home computer before you can listen to it on speakers or headphones. Transmitting voice also requires a continuous stream of data, this time converted by the soundcard's A-D (analogue to digital) converter from an analogue waveform to digital data and then sent to your remote radio station.

No computer operating system can do everything at once, so a multitasking operating system such as Windows works by running lots of separate programs or tasks in turns, each one consuming a share of the available CPU (processor) and I/O (Input/Output) cycles. To maintain a continuous audio stream, small amounts of system RAM (buffers) are used to temporarily store a chunk of audio at a time.

For playback, the soundcard continues accessing the data within these buffers while Windows goes off to perform its other tasks, and hopefully Windows will get back soon enough to drop the next chunk of audio data into the buffers before the existing data has been used up. Similarly, during audio recording the incoming data slowly fills up a second set of buffers, and Windows comes back every so often to grab a chunk of this and send it to the server.

If the buffers are too small and the data runs out before Windows can get back to top them up (playback) or empty them (recording) you'll get a gap in the audio stream that sounds like a click or pop in the waveform and is often referred to as a 'glitch'. If the buffers are far too small, these glitches occur more often, firstly giving rise to occasional crackles and eventually to almost continuous interruptions that sound like distortion as the audio starts to break up regularly.
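The relationship between buffer size and glitches can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the buffer is refilled completely on every visit, and the list of scheduling gaps is invented for the example.

```python
def count_glitches(buffer_ms, refill_gaps_ms):
    """Count playback underruns.

    buffer_ms      -- how much audio (in ms) the buffer holds when full
    refill_gaps_ms -- time elapsed (in ms) between successive refills
    """
    glitches = 0
    for gap in refill_gaps_ms:
        # The soundcard drains the buffer in real time. If the operating
        # system stays away longer than the buffer lasts, the audio runs
        # dry and you hear a click.
        if gap > buffer_ms:
            glitches += 1
    return glitches


# Hypothetical gaps (ms) between visits by the operating system.
gaps = [8, 12, 9, 35, 10, 22, 11, 60, 9, 14]

for size in (10, 20, 50, 100):
    print(f"{size:>3} ms buffer: {count_glitches(size, gaps)} glitches")
```

With the same scheduling gaps, each increase in buffer size removes some of the underruns, until the buffer is longer than the worst gap and the glitches disappear.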

Making the buffers a lot bigger immediately solves the vast majority of problems with clicks and pops, but has an unfortunate side effect: every sample has to wait its turn in the buffer, so any change that you make to the audio in your audio software is not heard until the data already queued has been played. The bigger the buffers, the greater the latency.
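The delay added by a buffer is simply its length divided by the sample rate. The helper below is a minimal sketch of that arithmetic; the function name and the example buffer sizes are assumptions chosen for illustration.

```python
def buffer_latency_ms(frames_per_buffer, sample_rate_hz, num_buffers=1):
    """Delay (ms) contributed by num_buffers buffers of the given length.

    A frame is one sample per channel, so the result does not depend on
    the bit depth or the number of channels.
    """
    return 1000.0 * frames_per_buffer * num_buffers / sample_rate_hz


# Example buffer sizes at the 8000 Hz sampling rate used for radio audio.
for frames in (256, 1024, 4096):
    print(f"{frames:>4} frames: {buffer_latency_ms(frames, 8000):.0f} ms")
```

At 8000 Hz a 1024-frame buffer alone already exceeds the 120 ms one-way figure, which is why buffer size is the first thing to tune.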

Bandwidth and Codecs

RemAud currently supports two codecs ("coder-decoder"): the uncompressed PCM codec and the compressed GSM 6.10 codec.

Pulse-code modulation (PCM) is a method used to digitally represent sampled analog signals. A PCM stream is a digital representation of an analog signal, in which the magnitude of the analog signal is sampled regularly at uniform intervals, with each sample being quantized to the nearest value within a range of digital steps.

PCM streams have two basic properties that determine their fidelity to the original analog signal: the sampling rate, which is the number of times per second that samples are taken; and the bit depth, which determines the number of possible digital values that each sample can take.
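Both properties can be seen in a few lines of code. The sketch below samples a sine tone at regular intervals and quantises each sample to the nearest step. It is a simplified illustration: the function names are made up for this example, and it uses signed integer codes for every bit depth, which is not necessarily how a given soundcard or file format stores its samples.

```python
import math


def quantize(sample, bit_depth):
    """Map an analogue value in [-1.0, 1.0] to the nearest signed integer step."""
    max_code = 2 ** (bit_depth - 1) - 1       # 127 for 8 bit, 32767 for 16 bit
    clipped = max(-1.0, min(1.0, sample))     # values outside the range are clipped
    return round(clipped * max_code)


def sample_tone(freq_hz, sample_rate_hz, bit_depth, count):
    """Sample a sine tone at uniform intervals and quantise each sample."""
    return [
        quantize(math.sin(2 * math.pi * freq_hz * n / sample_rate_hz), bit_depth)
        for n in range(count)
    ]


# One cycle of a 1000 Hz tone sampled at 8000 Hz with 8-bit resolution.
print(sample_tone(1000, 8000, 8, 8))
```

A higher sampling rate gives more samples per cycle, and a higher bit depth gives finer steps for each sample.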

Since we are using a rather limited audio spectrum of 300 to 4000 Hz in amateur radio communications (SSB), a sampling rate of 8000 Hz, which can represent frequencies up to 4000 Hz, and a bit depth of 8 or 16 bits are sufficient to reach an estimated audio quality of 95% at a low bandwidth. PCM at 8 kHz needs a bandwidth of 16 kB/s (stereo, two channels) with 8-bit samples and 32 kB/s with 16-bit samples.
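The bandwidth figures follow directly from sampling rate, sample size and channel count. A minimal sketch of the calculation (the function name is hypothetical):

```python
def pcm_bytes_per_second(sample_rate_hz, bit_depth, channels):
    """Raw PCM data rate in bytes per second, ignoring any network overhead."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels


for depth in (8, 16):
    rate = pcm_bytes_per_second(8000, depth, channels=2)
    print(f"8 kHz stereo, {depth}-bit: {rate / 1000:.0f} kB/s")
```

The same function shows that a single (mono) channel would need half of each figure.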

GSM 6.10 is a compressed single-channel (mono) codec with a very low transfer rate of about 1.6 kB/s, but the audio quality is much lower.
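The figure of about 1.6 kB/s follows from the GSM 6.10 frame structure: every 20 ms block of 160 samples is encoded into 260 bits. The sketch below assumes the common packing of one frame into 33 bytes; whether RemAud packs its frames exactly this way is an assumption, and other packings give a slightly different figure.

```python
SAMPLE_RATE_HZ = 8000
SAMPLES_PER_FRAME = 160    # one GSM 6.10 frame covers 20 ms of audio
BYTES_PER_FRAME = 33       # 260 bits of codec data padded to 33 bytes (assumed packing)

frames_per_second = SAMPLE_RATE_HZ // SAMPLES_PER_FRAME
gsm_bytes_per_second = frames_per_second * BYTES_PER_FRAME

# Uncompressed 16-bit mono PCM at the same sampling rate, for comparison.
pcm16_mono_bytes_per_second = SAMPLE_RATE_HZ * 2

print(f"GSM 6.10: {gsm_bytes_per_second} bytes/s")
print(f"PCM 16-bit mono: {pcm16_mono_bytes_per_second} bytes/s")
print(f"compression ratio: {pcm16_mono_bytes_per_second / gsm_bytes_per_second:.1f}:1")
```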

IP Transport Protocol - TCP or UDP

Figure: IP-Sound screenshot

In my own experience, IP-Sound has turned out to be one of the best audio solutions. I have been using it for a couple of years. IP-Sound was developed by SM5VXC, but development has been discontinued. You can still find a setup program on the web. IP-Sound offers several codecs (PCM, GSM, G.711 µ-law and Speex), but they are limited to a bit depth of 16 bits per sample, so the bandwidth is higher than with 8-bit samples. It is also a single-user solution.

IP-Sound uses the UDP protocol instead of TCP. Unlike TCP, UDP uses a simple transmission model without implicit handshaking dialogues for providing reliability, ordering, or data integrity. UDP therefore provides an unreliable service: datagrams may arrive out of order, appear duplicated, or go missing without notice.
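Because UDP itself gives no such guarantees, an audio application that cares has to detect these cases on its own, typically by numbering its packets. The sketch below shows the idea with a 4-byte sequence number in front of each audio chunk. The packet layout and function names are hypothetical and are not IP-Sound's actual wire format.

```python
import struct

# Hypothetical header: one 4-byte big-endian sequence number per packet.
HEADER = struct.Struct("!I")


def make_packet(seq, audio_bytes):
    """Prefix a chunk of audio with its sequence number."""
    return HEADER.pack(seq) + audio_bytes


def classify(packets):
    """Label each received packet as ok, duplicate, out of order, or after a gap."""
    highest = -1      # highest sequence number seen so far
    seen = set()      # a real receiver would only remember a recent window
    report = []
    for packet in packets:
        (seq,) = HEADER.unpack_from(packet)
        if seq in seen:
            report.append((seq, "duplicate"))
        elif seq < highest:
            report.append((seq, "out of order"))
        elif seq > highest + 1:
            report.append((seq, f"gap: {seq - highest - 1} missing"))
        else:
            report.append((seq, "ok"))
        seen.add(seq)
        highest = max(highest, seq)
    return report


# Simulated arrival order: packet 2 is late, 3 arrives twice, 4 never arrives.
arrived = [make_packet(n, b"\x00" * 160) for n in (0, 1, 3, 2, 3, 5)]
for seq, status in classify(arrived):
    print(seq, status)
```

For live audio, a late or missing packet is usually just skipped rather than requested again, since a retransmission would arrive too late to be played.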

In addition, UDP is blocked by most private and company firewalls. You will therefore need a VPN (Virtual Private Network), for example Hamachi VPN, to "tunnel" your audio through the firewall. Alternatively, if you have access to the router and firewall settings on both sides, client and server, you can open port 4444 for UDP and will not need a VPN. IP-Sound uses port 4444 by default.
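To show what sending audio over a UDP port looks like at the socket level, here is a minimal sketch that sends a single datagram to a UDP socket on the local machine. It is a generic illustration, not IP-Sound code; the function name is made up, and by default it lets the operating system pick a free port rather than using 4444.

```python
import socket


def udp_loopback_roundtrip(payload, port=0):
    """Send one datagram to a local UDP socket and return what arrives.

    port=0 lets the operating system pick a free port; pass 4444 to use
    the same port number as the IP-Sound default.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", port))
        receiver.settimeout(2.0)            # give up instead of waiting forever
        # No connection is set up first: the datagram is simply sent.
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(2048)
        return data


print(udp_loopback_roundtrip(b"audio chunk"))
```

Between two real stations the receiver would bind to an address reachable from outside, and the firewall on that side is what has to let the chosen UDP port through.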

Skype is another alternative, but not a good one for amateur radio communication. Skype can fall back to TCP when UDP is blocked and uses a proprietary codec, so you will not have any firewall problems. The audio quality is great for ordinary VoIP speech, but it is really bad for narrow-band radio audio: the estimated audio quality is only 50% for SSB and CW signals. I do not recommend it.