Thanks to all for the suggestions.
The winmtr traces show a lot of packet loss (assuming ICMP is low priority or blocked), so another tool that was suggested was mturoute. That tool showed that MTU size was not the problem, but it did show that the end server stopped responding when I broke the connection, while all other hops were fine.
That doesn’t explain why the issue can be replicated against other, completely unrelated sites. I have checked the blacklist sites and our IP appears on none of them, so the mystery is … still a mystery.
Anyway, I added an outbound NAT rule to our firewall to translate outbound HTTP traffic to source address 210.48.17.18, and that has gotten around the problem.
Cheers
Julian
From: nznog-bounces@list.waikato.ac.nz
[mailto:nznog-bounces@list.waikato.ac.nz] On Behalf Of Julian Maxwell
Sent: Friday, 16 December 2011 4:54 p.m.
To: nznog@list.waikato.ac.nz
Subject: [nznog] Something out there hates our IP address
Hi Guys,
Bit of a lurker – not a poster. Hai!
All of this week, our office network here has been experiencing an odd issue of sorts. It goes sort of like this:
The NAT’d IP address for our office network is 210.48.17.17. At random times, international destinations stop working, usually for a period of a few minutes.
For example, when I first came across the problem I went to download winmtr to help diagnose the issue. I then discovered that the winmtr site is itself affected by this problem, so it became my guinea pig site.
So using winmtr.net as an example site, the following happens:
Pings to winmtr.net work fine.
The website works fine.
I go to download the winmtr file… however the download gets interrupted (TCP RST) somewhere between 600kb and 1100kb. This is guaranteed to happen and I have replicated it time and time again.
When the download gets interrupted, the pings stop working and I can’t access the website anymore.
After a few minutes I can access the website again, but the pings remain unanswered. They seem to stay unanswered until I stop the ping; a few minutes after that I can resume them. It’s as if they are blocked indefinitely until I stop sending requests, and then allowed again after a while. Some sort of flood control?
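To pin down exactly where the transfer dies, the download step can be scripted. The following is a minimal Python sketch, not what was actually run here: it reads a response in chunks and reports how many bytes arrived before the connection failed. The function name `bytes_before_failure`, the chunk size, the timeout and the command-line URL argument are all assumptions made up for illustration.

```python
import http.client
import sys
import urllib.request


def bytes_before_failure(stream, chunk_size=16384):
    """Read `stream` until EOF or an error.

    Returns (bytes_received, error); error is None on a clean EOF.
    `stream` is any object with a read(n) method, e.g. a urllib response.
    """
    total = 0
    try:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:  # clean end of file
                return total, None
            total += len(chunk)
    except (OSError, http.client.HTTPException) as exc:
        # A TCP RST surfaces as ConnectionResetError, a subclass of OSError.
        return total, exc


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: python reset_offset.py <download-url>
    url = sys.argv[1]
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            received, error = bytes_before_failure(response)
    except OSError as exc:
        # urlopen raises URLError (an OSError subclass) if it cannot connect.
        received, error = 0, exc
    status = "completed" if error is None else f"failed: {error!r}"
    print(f"{received} bytes received, {status}")
```

Running this a few times from the affected address and again from a working one would show whether the failure offset is consistent and whether it follows the source IP.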
Here’s a screenshot of the TCP dump showing the interrupted download and the pings. Note that the ICMP echoes have no corresponding reply packets:
I have ruled out our office firewall as the cause: the same thing happens when I plug in a laptop on the same address.
If I change the firewall WAN IP to anything but 17.17 (e.g. 17.18), everything works fine!
Any other device on this subnet has no issues – it is ONLY 17.17.
We do have a NetEnforcer shaper in the path – however I have ruled that out, as we bypassed it for a while and the issue persisted.
We don’t want to change the WAN IP to something else, as there are a whole bunch of inbound connections going to that IP, but at this stage I can’t think of anything else we can do.
I have contacted our upstream peer and they say there is no DoS/flood control or anything of the sort on the circuit.
It’s not just winmtr, but about 70% of international sites we try to access.
I have run winmtr (I downloaded it via a VPN, in case you’re wondering) and, interestingly, the path doesn’t change much when I break the connection to winmtr.net. However, packet loss is about 100% after hop 7 anyway, so it’s not very revealing.
So, what say you NZNOGgers that are waiting for 5pm to tick over and drink those Christmas beers?
Julian Maxwell