Packet Loss: How to Measure And How to Fix
Networks are the backbone of most modern organizations. They allow data to be accessed by whoever needs it, wherever they may be. But as important and critical as they are, networks can also suffer from a few issues. In fact, three main problems cause the vast majority of network issues: latency, jitter, and packet loss. Today, we’re talking about the last of these. We’ll look at the why and the how of packet loss, at the tools that can be used to measure it and locate its source, and briefly at what can be done to reduce it, if not eliminate it.
We’ll start off by trying to define packet loss as it is important that we all start on the same page. Then, we’ll explore the various causes of packet loss. The actual causes are countless but we’ve chosen the five most common and we’ll have a deeper look at each one. Following that, we’ll have a look at some tools one can use to measure and locate packet loss. After all, before we can fix anything, we need to know that it’s there and we need to know where it is. And talking about fixing packet loss, that will be our last order of business.
What Is Packet Loss
In simple terms, packet loss is the failure of data packets sent from a source to reach their destination. Data packets, in case you don’t know, are small chunks of data that are transmitted on computer networks. Every piece of data, regardless of its size, is divided into packets which are sent sequentially over the network before being reassembled into meaningful data by the receiving party. For various reasons that we’re about to explore, it does happen that some packets are lost in transit. As an analogy, imagine a multi-page letter mailed with each page in a separate envelope. This is essentially how data is transferred over a network. In this example, packet loss happens if one of the envelopes gets lost in transit.
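To make the envelope analogy concrete, here is a minimal Python sketch. The function names, the 100-byte payload size, and the choice of which packet goes missing are all made up for illustration; real protocols add headers, checksums, and much more.

```python
def split_into_packets(data: bytes, payload_size: int) -> list[tuple[int, bytes]]:
    """Split data into (sequence_number, payload) pairs, like pages in envelopes."""
    return [
        (seq, data[offset:offset + payload_size])
        for seq, offset in enumerate(range(0, len(data), payload_size))
    ]


def loss_percent(sent: int, received: int) -> float:
    """Packet loss expressed as a percentage of the packets sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * (sent - received) / sent


# A 1000-byte message becomes ten 100-byte packets.
packets = split_into_packets(b"x" * 1000, payload_size=100)

# Pretend packet number 3 never arrives (hypothetical loss).
delivered = [packet for packet in packets if packet[0] != 3]

print(f"{loss_percent(len(packets), len(delivered)):.1f}% packet loss")
```

Packet loss figures quoted by monitoring tools are usually this same ratio, computed over some measurement interval.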
The Causes Of Packet Loss
While there are numerous causes of packet loss, far too many to cover them all, we’ve identified five of the most common ones.
1. Network Congestion
Network congestion is possibly the primary cause of packet loss. It is similar to traffic congestion and it typically happens when there is more data than the network can handle. When that happens, networking devices such as routers could eventually drop packets that have been queued for too long, resulting in packet loss.
Some WAN links and Internet circuits are bandwidth limited by the provider. For example, they could supply 2 Mbps of bandwidth on a 10 Mbps physical circuit. If you try to send out more than 2 Mbps of data on such a circuit, the WAN or Internet router will often drop the extra traffic, resulting in packet loss.
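One common way to enforce such a limit is a token bucket policer. We don’t know what any particular provider actually runs, so treat the following Python sketch as an illustration only: the class name, the 1250-byte packet size, and the burst allowance are assumptions chosen to keep the arithmetic simple.

```python
class TokenBucketPolicer:
    """Drops packets that exceed a contracted rate (illustrative sketch)."""

    def __init__(self, rate_bps: float, burst_bytes: int):
        self.rate_bytes_per_s = rate_bps / 8
        self.capacity = float(burst_bytes)
        self.tokens = float(burst_bytes)  # the bucket starts full
        self.last = 0.0

    def allow(self, now: float, size_bytes: int) -> bool:
        """Return True if the packet conforms, False if it is dropped."""
        elapsed = now - self.last
        self.last = now
        # Refill tokens for the elapsed time, capped at the bucket size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_bytes_per_s)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False


def offered_vs_dropped(offered_bps, rate_bps, duration_s=1.0, packet_bytes=1250):
    """Send evenly spaced packets at offered_bps through a policer set to rate_bps."""
    policer = TokenBucketPolicer(rate_bps, burst_bytes=2 * packet_bytes)
    packets_per_second = offered_bps / (packet_bytes * 8)
    count = int(packets_per_second * duration_s)
    dropped = 0
    for i in range(count):
        if not policer.allow(i / packets_per_second, packet_bytes):
            dropped += 1
    return count, dropped


# Offering 4 Mbps on a circuit policed at 2 Mbps loses roughly half the packets.
sent, dropped = offered_vs_dropped(offered_bps=4_000_000, rate_bps=2_000_000)
print(f"sent={sent} dropped={dropped}")
```

Staying under the contracted rate, for instance offering 1 Mbps on the same circuit, results in no drops at all in this model.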
Network congestion can also happen when service providers intentionally oversubscribe a link. They do that based on the rationale that not all subscribers of the service will be using the bandwidth simultaneously. However, in peak periods, when more people are using the service and the demand exceeds its capacity, there will likely be packet loss resulting from congestion.
2. Device Overutilization
Another common cause of packet loss is device overutilization. It happens when a device is operating beyond the capacity it was designed for. In a network, packets may arrive at a router faster than they can be processed and sent out. To handle these situations, devices have buffers where they hold packets temporarily until they are processed and sent out. These buffers are not infinite, though, and can eventually fill up, resulting in packets being dropped.
In many instances, a device will perform in an acceptable manner during normal (off-peak) operating times and properly route all packets. During peak periods, however, there could be a noticeable increase in packet drops.
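The buffer behavior described above can be sketched with a very small simulation. This is a simplified tail-drop model with invented numbers (a 20-packet buffer and a device that forwards 8 packets per tick), not a description of how any specific router works.

```python
def simulate_tail_drop(arrivals_per_tick, service_per_tick, buffer_size):
    """Simulate a finite buffer: packets that find it full are dropped.

    Returns (forwarded, dropped, still_queued).
    """
    queued = forwarded = dropped = 0
    for arrivals in arrivals_per_tick:
        # Enqueue what fits, drop the rest (tail drop).
        accepted = min(arrivals, buffer_size - queued)
        queued += accepted
        dropped += arrivals - accepted
        # The device can only process a fixed number of packets per tick.
        served = min(service_per_tick, queued)
        queued -= served
        forwarded += served
    return forwarded, dropped, queued


# Off-peak: 5 packets per tick arrive, the device can forward 8 per tick.
print(simulate_tail_drop([5] * 10, service_per_tick=8, buffer_size=20))
# Peak: 12 packets per tick arrive; the buffer fills, then drops begin.
print(simulate_tail_drop([12] * 10, service_per_tick=8, buffer_size=20))
```

In the peak run, the buffer absorbs the excess for the first three ticks, which is why short bursts often go unnoticed while sustained overload causes loss.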
3. Hardware And Software Issues
Faulty hardware is yet another cause of packet loss. We’ve seen instances, for example, of WAN routers with 100 Mbps interfaces that were not able to transmit more than 30 Mbps of data. When traffic was low, the issue was not noticeable but as soon as it exceeded 30 Mbps, packets started being dropped. Moving the circuit to a different interface of the same device with the exact same interface configuration fixed the problem, confirming that it was a router hardware issue.
A closely related cause of packet loss is buggy software running on a network device. A network device’s firmware is a computer program and, as such, it is exposed to programming bugs. Firmware is increasingly complicated software, and it is often impossible for a development team to catch every bug in it.
4. Malicious Action
Malicious action—mainly in the form of Denial of Service (DoS) attacks—is another common cause of packet loss. It is, however, one that we often have little control over. It happens when malicious users flood a networking device with enough traffic that it can no longer perform its duties and starts dropping packets.
Since these attacks are the work of a third party, we cannot control when they happen; the best approach is to guard against them in advance. Several services exist that claim to protect your network from DoS attacks. Some of them do a rather good job, although they tend to be a bit expensive. But if they can protect you from an attack against which you’d otherwise be defenseless, they might be worth the investment.
5. Configuration Errors
Human error is to blame in many situations and packet loss is no different. Device configuration errors are another of the most common causes of packet loss. For instance, a speed or duplex mismatch between interfaces can lead to packet loss. This happens, for example, when one end of a link is set to full-duplex while the other is set to half-duplex. Collisions then occur, resulting in packet loss. Networking equipment is increasingly complex and it is easy to make mistakes. Configuration management tools can help you keep your configurations error-free by implementing standardized configuration elements.
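A check for this kind of mismatch is easy to automate once you have the settings of both ends of a link. The Python sketch below is only an illustration: the `InterfaceConfig` structure and the interface names are hypothetical, and in practice the values would come from your devices or your configuration management tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceConfig:
    """Hypothetical summary of one end of a link."""
    name: str
    speed_mbps: int
    duplex: str  # "full" or "half"


def link_mismatches(a: InterfaceConfig, b: InterfaceConfig) -> list[str]:
    """Return a description of every speed or duplex mismatch on a link."""
    problems = []
    if a.speed_mbps != b.speed_mbps:
        problems.append(
            f"speed mismatch: {a.name}={a.speed_mbps} Mbps, {b.name}={b.speed_mbps} Mbps"
        )
    if a.duplex != b.duplex:
        problems.append(f"duplex mismatch: {a.name}={a.duplex}, {b.name}={b.duplex}")
    return problems


router_port = InterfaceConfig("router:Gi0/1", speed_mbps=100, duplex="full")
switch_port = InterfaceConfig("switch:Fa0/24", speed_mbps=100, duplex="half")

for problem in link_mismatches(router_port, switch_port):
    print(problem)
```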
The Effects of Packet Loss
Packet loss is common and it happens on most networks. It rarely has a noticeable effect until it reaches critical levels. When it does, it can cause various errors. File transfers, which use the connection-oriented TCP protocol, are relatively unaffected because TCP detects missing packets and retransmits them, at the cost of some throughput. It is more problematic with real-time or near-real-time traffic, which typically uses the connectionless UDP protocol instead, such as streaming video or audio or Voice over IP (VoIP), where it can cause skips and gaps, image freezes, or unintelligible speech.
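Because UDP itself does not retransmit anything, real-time applications generally number their packets so that the receiver can at least notice what is missing. Here is a minimal Python sketch of that idea. The function name is ours, and it assumes sequence numbers that never wrap around, which real protocols with fixed-size counters have to handle.

```python
def count_lost(sequence_numbers: list[int]) -> int:
    """Count packets missing between the lowest and highest sequence number seen.

    Reordered and duplicated packets are tolerated; wraparound is not handled.
    """
    if not sequence_numbers:
        return 0
    unique = set(sequence_numbers)
    expected = max(unique) - min(unique) + 1
    return expected - len(unique)


# Packets 3, 6 and 7 never arrived.
print(count_lost([1, 2, 4, 5, 8]))
```

A receiver can only detect the gap; what it does about it (conceal it, skip ahead, or ask for a resend at the application level) depends on the application.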
Some Tools To Measure And Locate Packet Loss
If you want to reduce or eliminate packet loss, the first thing you need to do is measure if your network is experiencing any and, if it is, where it is happening. As we said earlier, packet loss is normal and will be present on most networks. It must, however, remain under a certain threshold to ensure that no ill-effects are observed. Cisco Systems recommends, for example, that packet loss on VoIP traffic—possibly the type of traffic most affected by it—should be kept below 1%. For video streaming, it should remain between 0.05% and 5% depending on the type of video.
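Before reaching for a dedicated tool, you can get a rough first measurement from the ordinary ping command. The Python sketch below parses the loss figure from ping’s summary line and compares it with the 1% VoIP guideline quoted above. It assumes the Linux/macOS `-c` option and a summary containing “N% packet loss”; Windows ping uses different options and wording. The function names are our own.

```python
import re
import subprocess

VOIP_LOSS_THRESHOLD_PERCENT = 1.0  # the VoIP guideline quoted in the text

_LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def parse_loss_percent(ping_output: str) -> float:
    """Extract the loss percentage from ping's summary line."""
    match = _LOSS_RE.search(ping_output)
    if match is None:
        raise ValueError("no packet loss summary found in ping output")
    return float(match.group(1))


def measure_loss(host: str, count: int = 20) -> float:
    """Run the system ping (Linux/macOS '-c' syntax assumed) and return loss in %."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True, text=True, check=False,
    )
    return parse_loss_percent(result.stdout)


def too_lossy_for_voip(loss_percent: float) -> bool:
    """True when loss is not below the VoIP threshold."""
    return loss_percent >= VOIP_LOSS_THRESHOLD_PERCENT


sample = "20 packets transmitted, 19 received, 5% packet loss, time 19028ms"
print(parse_loss_percent(sample), too_lossy_for_voip(parse_loss_percent(sample)))
```

Keep in mind that ping measures loss for ICMP traffic between two points at one moment; it tells you that something is wrong but not where, which is what the tools below are for.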
Since VoIP traffic is the type of traffic most affected by packet loss, it won’t come as a surprise that most of the tools we’ve found for measuring and locating it are primarily VoIP network monitoring tools.
1. SolarWinds VoIP And Network Quality Manager (Free Trial)
SolarWinds has been making some of the best network administration tools for the past 20 years or so. Its flagship product, the Network Performance Monitor, consistently scores among the best SNMP network monitoring tools. The company is also famous for its free tools, made to address specific needs of network administrators. These free tools include products such as the TFTP Server or the Advanced Subnet Calculator.
The SolarWinds VoIP and Network Quality Manager is a dedicated VoIP monitoring tool that is packed with great features. This tool can be used to monitor VoIP call quality metrics including packet loss, but also latency, jitter, and MOS. It can help troubleshoot VoIP call performance by correlating call issues and network performance. The tool also includes real-time WAN monitoring using Cisco IP SLA technology. The tool’s VoIP call path trace feature lets you see and pinpoint call problems along the entire network path.
This tool can perform real-time monitoring of site-to-site WAN performance and it also has alerting features to notify you of any abnormal situation. It can help ensure that WAN circuits are performing as expected by utilizing Cisco IP SLA metrics, synthetic traffic testing, and custom performance thresholds and alerts.
But the SolarWinds VoIP and Network Quality Manager does more than monitor your WAN circuits; it can also display the utilization and performance metrics of your VoIP gateways and PRI trunks. It can help with capacity planning by allowing you to evaluate voice quality when planning new VoIP deployments.
Prices for the SolarWinds VoIP and Network Quality Manager start at $1 615 for up to 5 IP SLA source devices and 300 IP phones. Other licensing levels, including a device-unlimited license, are also available. A free 30-day trial is available should you want to take the product for a test run.
2. PRTG Network Monitor
The PRTG Network Monitor from Paessler is a multi-purpose network monitoring system. Through the use of sensors, which can be compared to add-ons although they are included with the product, PRTG can be used to monitor many different parameters of networks and systems. The tool can monitor virtually any system, device, traffic, and application in your IT infrastructure.
For the purpose of measuring and locating packet loss, PRTG offers no fewer than three different sensors. You can use the Ping Sensor to measure the availability of your devices and to calculate packet loss as a percentage. The Quality of Service Sensor lets you monitor entire network paths, and thereby measure and locate packet loss. Finally, the Cisco IP SLA Sensor can be used to measure the packet loss rate on your Cisco devices. You can choose to be notified via email, SMS or push notifications on a mobile device whenever a threshold is exceeded so you can take appropriate measures.
The PRTG Network Monitor is super easy and quick to install. The tool’s auto-discovery system will scan network segments and automatically recognize a wide range of devices and systems. It will then create sensors from predefined device templates. Specific VoIP sensors sometimes need to be manually set up afterwards, making the installation a bit longer but this is still one of the fastest tools to set up.
The PRTG Network Monitor is available in a free, full-featured version limited to 100 sensors. Note that any single monitored parameter counts as one sensor. To monitor more than 100 sensors, you’ll need a license. Prices vary with the number of sensors and start at $600 for 500 sensors up to $14 500 for unlimited sensors. A free device-unlimited 30-day trial version is available.
3. ManageEngine OpManager With VoIP Monitor
The ManageEngine OpManager is another excellent network monitoring tool. It will monitor the vital signs of your equipment and alert you as soon as something is out of specs. The tool features an intuitive user interface that will let you easily find the information you need. It also features an excellent reporting engine along with some pre-built and custom reports. To complete the package, the product’s alerting features are also very comprehensive.
When it comes to monitoring for packet loss, the ManageEngine OpManager’s VoIP monitor option can proactively monitor and report on your infrastructure’s capacity to handle VoIP calls. The tool uses Cisco IP SLA to continuously monitor critical Quality of Service parameters of VoIP networks. The monitored VoIP parameters include packet loss, delay, jitter, the Mean Opinion Score (MOS) and Round Trip Time (RTT).
The ManageEngine OpManager is priced based on the number of monitored devices. Prices range from $715 for 25 devices to $14 995 for 1 000 devices. The VoIP monitor option adds $125 per device that requires it. A free 30-day trial is available so you can try the product and see how it fits your specific needs.
4. VoIPmonitor
VoIPmonitor is an open source network packet sniffer with a commercial front end for monitoring most VoIP protocols. It runs on Linux and is designed to analyze the quality of ongoing VoIP calls based on network parameters such as packet loss and jitter according to the ITU-T G.107 E-model. Call information, along with their metrics, is saved to a database. Each call can be saved to a pcap file for further analysis with external tools such as Wireshark.
VoIPmonitor can also decode speech and play it over its web-based GUI as well as save it to disk as a .WAV file. Out of the box, the product supports the G.711 alaw and ulaw codecs and commercial plugins add support for G.722, G.729a, G.723, iLBC, Speex, GSM, Silk, iSAC, and OPUS. VoIPmonitor is also able to convert T.38 FAX to PDF.
The VoIPmonitor GUI front end is available either as a locally hosted server at prices ranging from $42/month for 10 channels to $917/month for 6 000 channels or as a cloud-based service with prices varying from $20/month for 3 channels to $200/month for 200 channels. Both versions are available in a free and unlimited 30-day trial.
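To give an idea of how packet loss feeds into a call quality score under the E-model mentioned above, here is a heavily simplified Python sketch. It is not VoIPmonitor’s implementation, which we have not inspected. It assumes the E-model’s default rating of 93.2, random (non-bursty) loss, and the G.711 values commonly cited from the ITU-T recommendations (Ie = 0, Bpl = 4.3 without packet loss concealment); check the recommendations themselves before relying on any of these numbers.

```python
def effective_impairment(ie: float, bpl: float, loss_percent: float,
                         burst_ratio: float = 1.0) -> float:
    """Equipment impairment including packet loss (Ie-eff in E-model terms)."""
    return ie + (95.0 - ie) * loss_percent / (loss_percent / burst_ratio + bpl)


def r_to_mos(r: float) -> float:
    """Convert an E-model R rating to an estimated MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def estimate_mos(loss_percent: float, delay_impairment: float = 0.0,
                 ie: float = 0.0, bpl: float = 4.3) -> float:
    """Simplified estimate: default rating minus delay and loss impairments."""
    r = 93.2 - delay_impairment - effective_impairment(ie, bpl, loss_percent)
    return r_to_mos(r)


for loss in (0.0, 1.0, 5.0):
    print(f"{loss:.0f}% loss -> estimated MOS {estimate_mos(loss):.2f}")
```

Even with these rough assumptions, the sketch shows why small amounts of loss matter so much for voice: the estimated score falls quickly as loss increases.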
Fixing Packet Loss
Measuring and locating packet loss is the first step in fixing it. Any of the tools reviewed above will help you with that. Usually, the cause of packet loss will be obvious once you locate where it’s happening and fixing it is a simple matter of addressing the cause.
If the network is congested, increasing its bandwidth so that you can push more traffic through seems to be the obvious answer. You could also consider applying Quality of Service (QoS) features. It could enable certain types of traffic—VoIP, for example—to be given priority over other traffic that is not so sensitive to packet loss or critical to operations.
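The simplest form of such prioritization is strict priority queuing, where the sensitive traffic is always sent first. The Python sketch below illustrates the principle with two queues and a link that can send a limited number of packets. The names and numbers are made up, and real QoS implementations offer many more queuing strategies than this one.

```python
from collections import deque


def strict_priority_schedule(voice, data, slots):
    """Send up to `slots` packets, always serving the voice queue first.

    Returns (sent, unsent) as lists of packets in order.
    """
    voice_q, data_q = deque(voice), deque(data)
    sent = []
    for _ in range(slots):
        if voice_q:
            sent.append(voice_q.popleft())
        elif data_q:
            sent.append(data_q.popleft())
        else:
            break  # nothing left to send
    return sent, list(voice_q) + list(data_q)


# The link can only carry 3 of the 5 waiting packets in this interval.
sent, unsent = strict_priority_schedule(["v1", "v2"], ["d1", "d2", "d3"], slots=3)
print("sent:", sent)
print("waiting or dropped:", unsent)
```

Note that QoS does not create bandwidth. It only chooses which traffic suffers when there is not enough of it, so the lower-priority traffic absorbs the delay or loss.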
If packet loss is caused by device overutilization, then the only solution may be to upgrade to a higher performance device. In some cases, it may be only a component of the device that needs to be upgraded. For instance, you can sometimes replace a 100 Mbps router interface with a 1 Gbps one.
Faulty hardware can be addressed by replacing it or, if it is more convenient, by using another non-faulty component of the same device. For example, if a router interface is defective, perhaps you can simply use a different interface on the same device. While this is by no means a good solution, it is adequate for testing or for providing a temporary fix until the unit can be replaced.
Wireless networks are often more prone to packet loss due to radio interference. Switching to a wired connection could be a way to address this kind of issue, although it is not always possible. For instance, if the affected device is a handheld portable IP phone, it might only support a wireless connection. In these situations, switching to a different channel or using a different frequency might improve the situation or even solve the issue altogether.
If packet loss is caused by malicious activities, you need to mitigate the attack as quickly as possible. This can be as simple as using an Access Control List to block the IP address of the attacker (if static and known). In more complex cases, you can use features like Remotely Triggered Black Hole Routing.
You should also check that your configuration is not causing packet loss. Make sure that duplex settings match at both ends of a connection. I personally tend to stay away from the auto speed and duplex settings as they have gotten me into trouble more than once. I much prefer to force the speed of each interface and set it to full-duplex. There is, nowadays, no compelling reason to use half-duplex anyway. And if you have configured QoS on your networking devices, make sure that your buffers are large enough. Otherwise, you run the risk of buffer overflow, which leads to packet loss.