Hi Guys,

Bit of a lurker, not a poster. Hai!

All of this week, our office network has been experiencing an odd issue. It goes like this:

The NAT'd IP address for our office network is 210.48.17.17. At random times, international destinations stop working, usually for a few minutes.

For example, when I first came across the problem I went to download winmtr to help diagnose the issue. I then discovered that winmtr.net is one of the sites affected by this problem, so it became my guinea pig site.

Using winmtr.net as an example, the following happens:

Pings to winmtr.net work fine.
The website works fine.
I go to download the winmtr file... however the download gets interrupted (TCP RST) at between 600kb and 1100kb. This is guaranteed to happen and I have replicated it time and time again.
When the download gets interrupted, the pings stop working and I can't access the website anymore.
After some minutes, I can access the website again - however the pings remain unanswered. They seem to stay unanswered until I stop the ping, and then after a few minutes I can resume them... as if they are blocked indefinitely until I stop the requests, and then after a period they are allowed again. Some sort of flood control?

Here's a screenshot of the TCP dump showing the above. Note that the ICMP echo requests have no corresponding reply packets:

[Screenshot: cid:image001.png(a)01CCBA7A.B1A181F0]

I have ruled out our office firewall as the cause, as the same thing happens when I plug in a laptop on the same address.

If I change the firewall WAN IP to anything but 17.17 (e.g. 17.18) everything works fine! Any other device on this subnet has no issues - it is ONLY 17.17.

We do have a NetEnforcer shaper in the path - however I have ruled that out, as we bypassed it for a while and the issue persisted.

We don't want to change the WAN IP, as there are a whole bunch of inbound connections going to that IP. But at this stage I can't think of anything else we can do.

I have contacted our upstream peer and they say there is no DoS/flood control or anything of the sort on the circuit.

It's not just winmtr, but about 70% of the international sites we try to access.

I have run winmtr (I downloaded it via a VPN, in case you're wondering) and interestingly the path doesn't change much when I break the connection to winmtr.net - however the packet loss is about 100% anyway after hop 7, so it's not very revealing.

So, what say you NZNOGgers who are waiting for 5pm to tick over so you can drink those Christmas beers?

Julian Maxwell
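The "interrupted at between 600kb and 1100kb" observation could be pinned down by counting the bytes received before the reset on each attempt. The following is only a minimal Python sketch of that idea, not something from the thread: `bytes_before_reset` and `ResettingStream` are hypothetical names, and the stream here is a simulated one so that the example runs without a network.

```python
import io


def bytes_before_reset(stream, chunk_size=8192):
    """Read `stream` until EOF or a connection reset.

    Returns a tuple (bytes_received, was_reset).
    """
    received = 0
    while True:
        try:
            chunk = stream.read(chunk_size)
        except ConnectionResetError:
            # The peer (or something claiming to be the peer) sent a TCP RST.
            return received, True
        if not chunk:
            # Clean end of stream: the transfer completed normally.
            return received, False
        received += len(chunk)


class ResettingStream:
    """Hypothetical stand-in for a download that is reset after `limit` bytes."""

    def __init__(self, limit):
        self._data = io.BytesIO(b"x" * limit)

    def read(self, n):
        chunk = self._data.read(n)
        if not chunk:
            raise ConnectionResetError("simulated TCP RST")
        return chunk


if __name__ == "__main__":
    # Simulated transfer that dies 700,000 bytes in.
    print(bytes_before_reset(ResettingStream(700_000)))
```

Against a real site, the stream could be the response object returned by `urllib.request.urlopen`, run in a loop to log where each attempt dies. Depending on the library, a reset may surface as a different exception, so the `except` clause is an assumption to adjust.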
Hi Julian,

As a workaround, most firewalls should be able to NAT your office traffic to .18 while keeping .17 available for your existing inbound connections. You should also be able to configure a workstation to still be NATed to .17, so you can keep testing the issue at the same time.

It would be interesting to compare the TTL of the resets with that of other packets from 64.18.203.96, as you may be able to determine whether a device in the path is in fact sending the resets and spoofing the source.

The screenshot is a bit limited, as information like port numbers, sequence numbers and traffic/replies to 64.18.203.96 is missing or not available for all packets.

A traceroute when things are working and when they are not would be useful too, especially if you can get a TCP traceroute on port 80.

Cheers
Ivan

On 16/Dec/2011 4:53 p.m., Julian Maxwell wrote:
_______________________________________________ NZNOG mailing list NZNOG(a)list.waikato.ac.nz http://list.waikato.ac.nz/mailman/listinfo/nznog
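Ivan's TTL comparison can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a tool used in the thread: the function names are hypothetical, the packets are assumed to have already been extracted from a capture as `(ttl, is_rst)` pairs all claiming the same source address, and the hop estimate assumes the sender started from one of the common initial TTLs (64, 128 or 255).

```python
from collections import Counter

# Initial TTL values commonly used by operating systems and network devices.
COMMON_INITIAL_TTLS = (64, 128, 255)


def estimated_hops(observed_ttl):
    """Estimate the hop count, assuming the smallest common initial TTL
    that is not below the observed value."""
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial - observed_ttl
    raise ValueError("TTL must be in the range 0-255")


def suspicious_resets(packets, tolerance=1):
    """Return the TTLs of RST packets that differ from the usual TTL.

    `packets` is a sequence of (ttl, is_rst) tuples that all claim the
    same source IP. The baseline is the most common TTL among non-RST
    packets; an RST further than `tolerance` from it may have been sent
    by a different device spoofing the source address.
    """
    packets = list(packets)
    normal = Counter(ttl for ttl, is_rst in packets if not is_rst)
    if not normal:
        return []  # nothing to compare against
    baseline = normal.most_common(1)[0][0]
    return [ttl for ttl, is_rst in packets
            if is_rst and abs(ttl - baseline) > tolerance]


if __name__ == "__main__":
    # Made-up sample values, not taken from the capture in this thread.
    sample = [(49, False), (49, False), (48, False), (250, True)]
    for ttl in suspicious_resets(sample):
        print(f"RST with TTL {ttl}, roughly {estimated_hops(ttl)} hops away")
```

A mismatch is only a hint: asymmetric routing or load balancing can shift the TTL by a hop or two, which is why the comparison allows a small tolerance.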
Thanks to all for the suggestions.

While the winmtr traces show a lot of packet loss (which I assume is ICMP being deprioritised or blocked), another tool suggested was mturoute. Whilst this tool showed that MTU size was not the problem, it did show that the end server stopped responding when I broke the connection, while all other hops were fine. That doesn't explain why the issue can be replicated with other, completely unrelated sites.

I have checked blacklist sites and our IP appears on none of them, so the mystery is ... still a mystery.

Anyway, I added an outbound NAT rule to our FW to translate HTTP traffic to 210.48.17.18, and that has gotten around the problem.

Cheers
Julian

From: nznog-bounces(a)list.waikato.ac.nz [mailto:nznog-bounces(a)list.waikato.ac.nz] On Behalf Of Julian Maxwell
Sent: Friday, 16 December 2011 4:54 p.m.
To: nznog(a)list.waikato.ac.nz
Subject: [nznog] Something out there hates our IP address
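The workaround amounts to a per-flow choice of source address: outbound HTTP leaves from .18, and everything else, including the existing inbound services, stays on .17. The Python below only models that decision and is not firewall configuration. The firewall's actual rule syntax is not given in the thread, the names are hypothetical, and treating "http traffic" as TCP destination port 80 is an assumption.

```python
from dataclasses import dataclass

PRIMARY_WAN = "210.48.17.17"    # existing address, still used for everything else
ALTERNATE_WAN = "210.48.17.18"  # address that outbound HTTP is translated to


@dataclass(frozen=True)
class Flow:
    """An outbound flow as the NAT rule would see it (illustrative only)."""
    protocol: str   # "tcp", "udp" or "icmp"
    dst_port: int   # 0 where ports do not apply, e.g. ICMP


def outbound_source_ip(flow, http_ports=(80,)):
    """Pick the translated source address for an outbound flow.

    Assumption: "HTTP traffic" means TCP to one of `http_ports`.
    """
    if flow.protocol == "tcp" and flow.dst_port in http_ports:
        return ALTERNATE_WAN
    return PRIMARY_WAN


if __name__ == "__main__":
    for flow in (Flow("tcp", 80), Flow("tcp", 25), Flow("icmp", 0)):
        print(flow, "->", outbound_source_ip(flow))
```

Leaving a test workstation mapped to .17 for HTTP as well, as Ivan suggested, would be one more condition ahead of the port check.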