On 2014-01-21 16:01, Ben wrote:
> I would trust a known well connected host rather than your next hop, as lots of routers forward better than they answer pings.
For clarity: on bigger routers that's "forward much much much better than they answer pings". On bigger routers the forwarding is done in the hardware ASIC, and "answering pings" is a low priority task on the supervisor CPU, reached via a slow path. (Which is unfortunate for monitoring, but one can understand why, e.g., BGP keepalives or route convergence or... might be prioritised over ping.)

I too would trust monitoring to a host (where the main CPU(s) are answering both ping and real services) over monitoring to a router in terms of latency/jitter/loss figures. (Even on smaller routers, ping handling is typically still much lower priority than the packet forwarding thread.)

There are some recent studies suggesting TCP performance (especially on short-lived connections like HTTP) plummets with even 1% (one percent) packet loss on the path. Personally, my rule of thumb for reporting loss on "best effort" connections is over 10% (ten percent) sustained over periods of time -- that typically indicates a congested path, and maybe there's some traffic engineering your upstream can do to help reduce it. (Besides, at over 10% loss interactive TCP -- e.g., ssh -- starts getting laggy, which is the point where I usually start wondering what's going on.)

Ewen
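
PS: a rough sketch of the "over 10% sustained" rule of thumb, in Python, for anyone who wants to script it. The function name, the window size (100 probes) and the definition of "sustained" as three consecutive windows over the threshold are my own assumptions for illustration, not fixed values. It only does the arithmetic; collecting the ping results (ideally against a well connected host rather than a router) is left out.

```python
def sustained_loss(samples, threshold=0.10, window=100, min_windows=3):
    """Return True if loss stays above `threshold` for `min_windows`
    consecutive windows of `window` probes each.

    samples: iterable of booleans, True = reply received, False = lost.
    All parameter defaults are illustrative assumptions.
    """
    consecutive = 0  # windows in a row that exceeded the threshold
    lost = 0         # probes lost in the current window
    count = 0        # probes seen in the current window
    for ok in samples:
        count += 1
        if not ok:
            lost += 1
        if count == window:
            if lost / window > threshold:
                consecutive += 1
                if consecutive >= min_windows:
                    return True
            else:
                # A clean window resets the streak: a short burst of
                # loss is not "sustained".
                consecutive = 0
            lost = 0
            count = 0
    return False
```

With these defaults a single bad window followed by clean ones is ignored, which matches the intent: one burst is noise, a streak suggests congestion worth reporting.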