Acceptable levels of packet loss
Been running SmokePing for a little while now, building up a statistical sample, and wondering what's an acceptable level of loss to be seeing through the provider's core, both to targets foreign and domestic and also from the premises to the first hop in the provider's network.
Obviously I'm seeing a non-zero level (averages are in the zero-point-something range, but some of the maximums are above 0), so want to know what people consider to be the point where I go from "unreasonable customer" to "customer with a justified concern"?
-- Matthew Poole "The difference between theory and practice is that practice is easier in theory than theory is in practice"
What does your SLA say? If it doesn't specify a level of packet loss, or you
don't have an SLA but you're still spending lots of money, then is it
impacting on your performance?
-Scott
________________________________ The content of this message and any attachments may be privileged, confidential or sensitive. Any unauthorised used is prohibited. Views expressed in this message are those of the individual sender, except where stated otherwise with appropriate authority. All pricing provided is valid at the time of writing only and due to factors such as the exchange rate, may change without notice. Sales are made subject to our Terms & Conditions, available on our website or on request. ________________________________
No SLA, hence the question. Subjectively I expect that a fibre circuit to a major provider will be delivering very low levels of loss, and very low levels of jitter. I just don't have enough knowledge of what's actually normal for a standard business fibre connection in central Auckland to quantify my "low levels".
As an indication, for this month I've got peak loss of 3.5-4.5% and average ~0.08% to two different DNS servers on the provider's network. Off to an Australian host it's a similar level, but at 26ms median rtt instead of a mere 1.5ms.
-- Matthew Poole "The difference between theory and practice is that practice is easier in theory than theory is in practice"
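A small sketch of how a monthly average near 0.08% and a peak of a few percent can describe the same data: probes are sent in rounds, so a single lost probe is a large percentage of one round but a tiny fraction of the month. The function name and the round size of 20 probes are assumptions for illustration, not values taken from the thread.

```python
def summarise_loss(lost_per_round, pings_per_round=20):
    """Return (average_percent, peak_percent) over all probe rounds.

    lost_per_round: number of lost probes in each measurement round.
    pings_per_round: probes sent per round (assumed value, adjust to your setup).
    """
    if not lost_per_round:
        return 0.0, 0.0
    total_sent = pings_per_round * len(lost_per_round)
    average = 100.0 * sum(lost_per_round) / total_sent
    peak = 100.0 * max(lost_per_round) / pings_per_round
    return average, peak


if __name__ == "__main__":
    # 99 clean rounds and one round that lost a single probe.
    rounds = [0] * 99 + [1]
    average, peak = summarise_loss(rounds)
    print(f"average {average:.2f}%  peak {peak:.1f}%")
```

Read this way, the peak figure says more about how short the worst round was than about how bad the path is, which is why the replies ask how long and how often the loss lasts.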
I think the issue, touched upon by others, is that you don't know
whether the loss is path specific (although loss to Aussie mirroring
local is a small data point to add) and mostly whether it's protocol
specific.
Find a remote host (close to start with) and run the low bandwidth UDP
iperf tests, looking for loss there. If you see loss there, then it's
likely that there is congestion somewhere along the path. If you start
seeing loss as UDP test speeds increase but before line rate then it
might be congestion and it might be some schedulers not being set
appropriately for your access type.
But pinging DNS servers doesn't give a good enough set of metrics to
chase this down.
Cheers - N
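The stepped UDP test described above could be scripted roughly as follows. This is only a sketch that builds the command lines, lowest rate first, without running them. The host name `iperf.example.net`, the rate steps, and the choice of `iperf3` as the binary are assumptions; the `-c`, `-u`, `-b` and `-t` options are the client, UDP, target bit rate and duration options shared by iperf and iperf3.

```python
import shlex


def udp_test_commands(host, rates_mbit, seconds=60, binary="iperf3"):
    """Build one UDP client command per target bit rate, lowest rate first."""
    commands = []
    for rate in sorted(rates_mbit):
        commands.append(
            [binary, "-c", host, "-u", "-b", f"{rate}M", "-t", str(seconds)]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical test host and rate steps; substitute a server you control.
    for cmd in udp_test_commands("iperf.example.net", [1, 5, 10, 50]):
        print(shlex.join(cmd))
```

Loss at the lowest rate points towards congestion somewhere on the path; loss that only appears as the rate climbs, but before line rate, is the scheduler case mentioned above.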
I think 0.08% is "ok" but 3.5-4.5% is terrible. If that's regular for more
than a few seconds, then make sure your end is okay first: particularly
CPU load on your router, whether your interfaces are maxing out momentarily,
and whether you have errors on the interfaces at all. Then if that's all ok,
definitely log a support case. If we're talking momentary impact once a
month then I think that falls in the realm of "normal" if you don't have an
SLA (which implies you're on some kind of shared service).
We had a customer who kept opening faults about packet loss for months. In
the end we established that the router at their head office had heavy CPU
load as it was undersized for the circuit they had.
-Scott
The only time we had continuous low-level packet loss was when our fiber provider configured 100M-half-duplex on their side and our end "auto-negotiated" 100M-Full-duplex.
Apart from that everything looked to be working perfectly :-)
-- Jean-Francois Pirus | Technical Manager francois(a)clearfield.com | Mob +64 21 640 779 | DDI +64 9 282 3401 Clearfield Software Ltd | Ph +64 9 358 2081 | www.clearfield.com
Yes. Often the clue for that is CRC errors on the interface, if of course the duplex mismatch is occurring on something you can get at.
Jamie
_______________________________________________
NZNOG mailing list
NZNOG(a)list.waikato.ac.nz
http://list.waikato.ac.nz/mailman/listinfo/nznog
On Tue, Jan 21, 2014 at 03:35:59PM +1300, Matthew Poole wrote:
Been running SmokePing for a little while now, building up a statistical sample, and wondering what's an acceptable level of loss to be seeing through the provider's core, both to targets foreign and domestic and also from the premises to the first hop in the provider's network.
Obviously I'm seeing a non-zero level (averages are in the zero-point-something range, but some of the maximums are above 0), so want to know what people consider to be the point where I go from "unreasonable customer" to "customer with a justified concern"?
There can be considerable impact with 0.05% average packet loss. If it's happening to all destinations then maybe there's a problem with the link?
I would trust a known well connected host rather than your next hop, as lots of routers forward better than they answer pings.
I'd also look at using iperf in UDP mode, with a long test, at low bit rates, to see if it's bursty, what direction it's in, whether doing more or less traffic impacts it etc.
Ben.
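To judge whether loss is bursty rather than evenly spread, one option is to number the test packets and look at runs of consecutive losses. A minimal sketch, assuming you already have the list of sequence numbers that arrived and know how many were sent; the function name and input format are hypothetical and not tied to any particular tool's output.

```python
def loss_bursts(received_seqs, sent_count):
    """Return (loss_percent, burst_lengths) for sequence numbers 0..sent_count-1.

    burst_lengths lists the length of each run of consecutive lost packets,
    in the order the runs occurred.
    """
    received = set(received_seqs)
    bursts = []
    run = 0
    for seq in range(sent_count):
        if seq in received:
            if run:
                bursts.append(run)  # a run of losses just ended
            run = 0
        else:
            run += 1
    if run:
        bursts.append(run)  # losses continued to the end of the test
    lost = sum(bursts)
    loss_percent = 100.0 * lost / sent_count if sent_count else 0.0
    return loss_percent, bursts


if __name__ == "__main__":
    # Ten packets sent; packets 3, 4 and 7 never arrived.
    print(loss_bursts([0, 1, 2, 5, 6, 8, 9], 10))
```

Many short runs of length 1 suggest scattered drops, while a few long runs suggest brief outages or queue overflows, even when the overall percentage is the same.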
On 2014-01-21 16:01 , Ben wrote:
I would trust a known well connected host rather than your next hop, as lots of routers forward better than they answer pings.
For clarity: on bigger routers that's "forward much much much better than they answer pings". On bigger routers the forwarding is in the hardware ASIC, and the "answering pings" is a low priority task on the supervisor CPU reached via a slow path. (Which is unfortunate for monitoring, but one can understand why, eg, BGP keep alive or route convergence or... might be prioritised over ping.)
I too would trust monitoring to a host (where the main CPU(s) are answering both ping and real services) over monitoring to a router in terms of latency/jitter/loss figures. (Even on smaller routers, typically ping handling is still much lower priority than the packet forwarding thread.)
There are some recent studies suggesting TCP performance (especially on short-lived connections like HTTP) plummets with even 1% (one percent) packet loss on the path. Personally my rule of thumb for reporting loss on "best effort" connections is over 10% (ten percent) sustained over periods of time -- that typically indicates a congested path, and maybe there's some traffic engineering upstream can do to help reduce it. (Besides, over 10%, interactive TCP -- eg, ssh -- starts getting laggy, which is the point where I usually start wondering what's going on.)
Ewen
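For a rough feel of how loss and round-trip time together limit a single TCP flow, the commonly cited Mathis et al. approximation can be used: rate ≈ (MSS / RTT) × (C / sqrt(p)). This formula is not from the thread, it models a long-lived flow in steady state rather than the short-lived HTTP connections mentioned above, and the MSS of 1460 bytes and C = sqrt(3/2) are assumptions. The 26 ms and 1.5 ms round-trip times are the ones reported earlier in the thread.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_seconds, loss_fraction):
    """Approximate upper bound on steady-state TCP throughput in bits/s.

    Uses rate = (MSS / RTT) * (C / sqrt(p)) with C = sqrt(3/2).
    loss_fraction is a fraction, so 1% loss is 0.01.
    """
    if rtt_seconds <= 0 or loss_fraction <= 0:
        raise ValueError("rtt_seconds and loss_fraction must be positive")
    c = math.sqrt(1.5)
    return (mss_bytes * 8 / rtt_seconds) * (c / math.sqrt(loss_fraction))


if __name__ == "__main__":
    for rtt in (0.0015, 0.026):
        for loss in (0.0008, 0.01):
            rate = mathis_throughput_bps(1460, rtt, loss)
            print(f"rtt {rtt * 1000:.1f} ms, loss {loss * 100:.2f}%: "
                  f"{rate / 1e6:.1f} Mbit/s")
```

Under this model throughput falls with the square root of the loss rate and linearly with RTT, so the same loss percentage hurts the Australian path far more than the local one.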
When it's worse than most SLAs for international transit providers? For
example:
http://www.sprint.com/business/resources/dedicated_internet_access.pdf
http://www.verizonenterprise.com/terms/latam/co/sla/
http://www.cogentco.com/files/docs/network/performance/global_sla.pdf
etc etc.
As others have said, intermediate routers are designed to forward traffic
not respond to pings. So really the only thing that matters is the result
for your end systems under test. Beyond that there's confusion between
identifying the problem and solving the problem. But you know that.
Cheers
jamie
That's really useful, thanks.
-- Matthew Poole "The difference between theory and practice is that practice is easier in theory than theory is in practice"
participants (8)
- Ben
- Ewen McNeill
- jamie baddeley
- Jamie Baddeley
- Jean-Francois Pirus
- Matthew Poole
- Neil Gardner
- Scott Pettit