Just a factoid: as far as I can see from Wiresharking, Cloudflare at APE is configured with an outbound IPv6 MTU of 1280.

Not a complaint, just an observation. It's safer that way.

Regards
Brian Carpenter
On Mon, Feb 2, 2015 at 5:25 PM, Brian E Carpenter <brian.e.carpenter(a)gmail.com> wrote:
Not a complaint, just an observation. It's safer that way.
Unfortunately, yes. You'll see some other very large web properties doing the same. Too many tunnels, ICMP isn't reliable (nor does it work happily with some load balancing techniques or anycast).

-Tom
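For anyone following along, the arithmetic behind the numbers in this thread is simple: the TCP MSS that fits in a given MTU is just the MTU minus the fixed IP and TCP header sizes. A minimal sketch (illustrative only, not Cloudflare's actual configuration), assuming no IP or TCP options:

```python
# Illustrative only: MSS = MTU minus the fixed IP and TCP headers.
IPV4_HEADER = 20  # bytes, no options
IPV6_HEADER = 40  # bytes, fixed header
TCP_HEADER = 20   # bytes, no options

def mss_for(mtu: int, ipv6: bool = True) -> int:
    """Largest TCP payload that fits in one packet of size `mtu`."""
    return mtu - (IPV6_HEADER if ipv6 else IPV4_HEADER) - TCP_HEADER

print(mss_for(1500, ipv6=False))  # 1460 -- the familiar Ethernet IPv4 default
print(mss_for(1500))              # 1440
print(mss_for(1280))              # 1220 -- what a 1280-byte IPv6 MTU implies
```

So a server capped at an outbound IPv6 MTU of 1280 effectively advertises an MSS of at most 1220, and since 1280 is the IPv6 minimum link MTU, it should never need a PTB from the path.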
On Tue, Feb 03, 2015 at 02:25:00PM +1300, Brian E Carpenter wrote:
Just a factoid: as far as I can see from Wiresharking, Cloudflare at APE is configured with an outbound IPv6 MTU of 1280.
How many networks at APE have IPv6 customers that are carried in a tunnel that reduces the MTU below 1500?
A fair number if you include customers who are tunnelled in PPPoE and have their MTU set to 1492.

Simon Allard
Senior Network Architect
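An aside on where Simon's 1492 comes from, sketched in the same back-of-envelope style (standard PPPoE overhead, nothing provider-specific):

```python
# PPPoE spends 8 bytes of the 1500-byte Ethernet payload
# (6-byte PPPoE session header + 2-byte PPP protocol field).
pppoe_mtu = 1500 - (6 + 2)
print(pppoe_mtu)            # 1492
print(pppoe_mtu - 20 - 20)  # 1452 -- max IPv4 TCP MSS over PPPoE
print(pppoe_mtu - 40 - 20)  # 1432 -- max IPv6 TCP MSS over PPPoE
```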
On Tue, Feb 3, 2015 at 3:40 PM, Simon Allard wrote:
A fair number if you include customers who are tunnelled in PPPoE and have their MTU set to 1492.

I'm a bit of a newbie on this side of things so apologies if this is a stupid question.

So, with a situation like what Simon describes above, what tends to fix the issue of the (generally) default TCP MSS of 1460 not fitting? Does path MTU discovery tend to fix this, or does changing the TCP MSS on a CPE (or BNG) tend to fix it instead?

Cheers
Dave
On 03/02/2015 16:40, Dave Mill wrote:
Does path MTU discovery tend to fix this, or does changing the TCP MSS on a CPE (or BNG) tend to fix it instead?
They both tend to break sometimes. Some people filter ICMPv6 too aggressively, and MSS negotiation seems to fail sometimes (I've never figured out why). And as Tom said, some load balancers screw up on PMTUD.

Brian
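For the newbie half of the question, this is what "changing the TCP MSS on a CPE (or BNG)" amounts to: the box rewrites the MSS option in TCP SYNs passing through it, so neither endpoint ever tries to send a segment that won't fit. A minimal sketch of the rewrite (illustrative, not any vendor's implementation):

```python
import struct

def clamp_mss(tcp_header: bytes, new_mss: int) -> bytes:
    """Rewrite the MSS option in a TCP SYN, as a clamping CPE/BNG would.

    Toy version: a real device does this in the forwarding path and must
    also recompute the TCP checksum, which is omitted here.
    """
    data = bytearray(tcp_header)
    if not data[13] & 0x02:            # MSS option only appears on SYNs
        return bytes(data)
    header_len = (data[12] >> 4) * 4   # data offset field, in bytes
    i = 20                             # options start after the fixed header
    while i < header_len:
        kind = data[i]
        if kind == 0:                  # end-of-options list
            break
        if kind == 1:                  # NOP padding byte
            i += 1
            continue
        length = data[i + 1]
        if length < 2:                 # malformed option; give up
            break
        if kind == 2 and length == 4:  # MSS: kind=2, len=4, 16-bit value
            if struct.unpack_from("!H", data, i + 2)[0] > new_mss:
                struct.pack_into("!H", data, i + 2, new_mss)
            break
        i += length
    return bytes(data)
```

On real gear this is a one-line feature, e.g. iptables' TCPMSS target with --clamp-mss-to-pmtu; the sketch just shows which bytes it touches.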
On 2015-02-03 14:40, Dave Mill wrote:
Does path MTU discovery tend to fix this, or does changing the TCP MSS on a CPE (or BNG) tend to fix it instead?
Two reasons for a content provider to force a small TCP MSS:

(1) Working around broken networks that filter ICMPv6 Packet Too Big messages
(2) One less RTT than waiting on an ICMPv6 PTB

I don't condone any of the above reasons.
On 3/02/2015 8:32 p.m., Jeremy Visser wrote:
Two reasons for a content provider to force a small TCP MSS:
(1) Working around broken networks that filter ICMPv6 Packet Too Big messages
(2) One less RTT than waiting on an ICMPv6 PTB
I don't condone any of the above reasons.
I realise there is this ideal world where we don't have to transit networks beyond our control, but why is (1) not condoned in your view?
On 3/02/2015 18:46, Mark Foster wrote:
I realise there is this ideal world where we don't have to transit networks beyond our control, but why is (1) not condoned in your view?
Because history has shown that the "be liberal in what you accept, conservative in what you emit" mantra is actually really bad at motivating others to fix their broken stuff.
On 4/02/2015, at 10:33 am, Jeremy Visser wrote:
Because history has shown that the "be liberal in what you accept, conservative in what you emit" mantra is actually really bad at motivating others to fix their broken stuff.
OTOH, it's pretty good at keeping networks working.

Jasper
On 04/02/2015 10:55, Jasper Bryant-Greene wrote:
OTOH, it's pretty good at keeping networks working.
And, if we're getting philosophical, it's morally neutral: it doesn't matter whether the broken stuff is intentional or human error, and it doesn't needlessly punish an innocent user. But there is one case where the mantra is dangerous: it tells you not to implement BCP 38.

In terms of IPv6 MTU size, a server site that limits its outbound MTU to 1280, but accepts larger inbound packets, is just playing safe.

Brian
On 3/02/2015 8:32 p.m., Jeremy Visser wrote:
Two reasons for a content provider to force a small TCP MSS:
(1) Working around broken networks that filter ICMPv6 Packet Too Big messages
How many networks on the path back to a content provider filter PTB messages, apart from the content provider itself? I'd have thought a transit network filtering a PTB would be extremely rare.
On 04/02/2015 05:18, Matthew Luckie wrote:
How many networks on the path back to a content provider filter PTB messages, apart from the content provider itself? I'd have thought a transit network filtering a PTB would be extremely rare.
But throttling ICMP(v6) to avoid primitive DoS attacks might not be so rare, and that would lead to random loss of PTBs. Anyway, it's pretty easy to determine that PTBs are getting dropped, but pretty hard to find out exactly where it happens.

Brian
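On the "pretty easy to determine" part: one way to check from a Linux host is to force strict PMTU discovery on a socket, fire an oversized probe, and ask the kernel what path MTU it has cached. A sketch under those assumptions (the socket option constants are Linux-specific, and probe(), the port, and the sizes are all illustrative):

```python
# Linux-only sketch: detect whether PTBs are making it back to us.
# Constants are Linux values from <linux/in6.h>; the stdlib socket module
# exposes them on Linux builds, hard-coded here as a fallback.
import errno
import socket
import time

IPV6_MTU_DISCOVER = getattr(socket, "IPV6_MTU_DISCOVER", 23)
IPV6_MTU = getattr(socket, "IPV6_MTU", 24)
IPV6_PMTUDISC_DO = 2  # do strict PMTU discovery; never fragment locally

def probe(host: str, port: int = 33434, size: int = 1452) -> int:
    """Send one oversized UDP probe, then ask the kernel for the path MTU.

    1452 bytes of UDP payload makes a 1500-byte IPv6 packet. If probes
    above 1280 keep vanishing while this keeps reporting the link MTU,
    PTBs are probably being dropped somewhere on the path.
    """
    s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IPV6, IPV6_MTU_DISCOVER, IPV6_PMTUDISC_DO)
    s.connect((host, port))
    try:
        s.send(b"\x00" * size)
    except OSError as e:
        if e.errno != errno.EMSGSIZE:  # EMSGSIZE: a smaller PMTU is already cached
            raise
    time.sleep(1)                      # give any PTB a moment to arrive
    mtu = s.getsockopt(socket.IPPROTO_IPV6, IPV6_MTU)
    s.close()
    return mtu
```

Finding out *where* the PTBs die is the hard part; this only tells you that they do.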
On 04/02/15 08:25, Brian E Carpenter wrote:
But throttling ICMP(v6) to avoid primitive DoS attacks might not be so rare, and that would lead to random loss of PTBs. Anyway, it's pretty easy to determine that PTBs are getting dropped, but pretty hard to find out exactly where it happens.

tom+lists(a)cloudflare.com wrote:
Unfortunately, yes. You'll see some other very large web properties doing the same. Too many tunnels, ICMP isn't reliable (nor does it work happily with some load balancing techniques or anycast).
Surely these would also be a problem in IPv4-land where TCP with PMTU discovery is the norm? A quick TCP test to Cloudflare shows that TCP/IPv4 packets are coming from them with the DF bit set in the IP header (per normal PMTU discovery), which elicits pretty much exactly the same MTU behaviour in IPv4 as in IPv6.

I don't think I've seen significant PMTU problems in IPv4, despite the prevalence of PPP-over-whatever with not-quite-1500 MTUs (ugh!), for a decade or so. These days, firewall folk mostly know not to filter PTBs, or more accurately, have NATs and stateful firewalls that know how to pass ICMP unreachables based on connection state, and do so as a matter of course. Filtering aside, PMTU discovery is pretty robust for most TCP applications, since every oversize retry can generate a PTB, the PMTU is highly cache-able, and the effect of a stale or over-aggregated PMTU cache entry is benign.

Yet this isn't the first example I've seen of minimum MTUs in use in IPv6 to avoid PTBs. Granted, DNS applications, where a PMTU re-send effectively doubles the length of the transaction (and requires a bunch more state to be kept by the server), are a slightly different case to TCP, but I can't see why this should be an issue in TCP over IPv6 when it isn't in IPv4.

Do we have yet another example of IPv6 implementers making all the same mistakes they made with IPv4 (and a few more besides)?

-- don
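Don's "quick TCP test" is easy to reproduce on a Linux box: sniff inbound TCP on a raw socket and look at the DF bit in the IPv4 header. A rough sketch (illustrative only; needs root, and a real test would filter on the peer's address and port):

```python
import socket

def df_is_set(ip_packet: bytes) -> bool:
    """True if the IPv4 Don't Fragment bit is set in a raw packet."""
    return bool(ip_packet[6] & 0x40)  # flags live in the top bits of byte 6

# Needs root: receive one raw TCP/IPv4 packet and report its DF bit.
s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_TCP)
pkt = s.recv(65535)  # on Linux this includes the IPv4 header
print(socket.inet_ntoa(pkt[12:16]), "DF set" if df_is_set(pkt) else "DF clear")
```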
On 04/02/2015 16:40, Don Stokes wrote:
Do we have yet another example of IPv6 implementers making all the same mistakes they made with IPv4 (and a few more besides)?
Basically, I think the answer is yes. There's RFC 4890 on ICMPv6 filtering, but I'm not sure that all implementations and/or deployments are following it yet. There may still be bad configuration advice floating around for some products. The situation is also bad for IPv6 extension headers, and RFC 7045 is too new to have had much effect yet.

Brian
looping this all back, we've just posted: https://blog.cloudflare.com/path-mtu-discovery-in-pracrice/

Path MTU discovery still remains an issue in the wild however, with random ICMP messages being dropped, routers not sending the ICMP messages at all (or randomly, when they decide to), inconsistent routing and anycast (an intermediary router replying to a different anycast node).

Cheers
-Tom
On 04/02/2015 18:16, Tom Paseka wrote:
looping this all back, we've just posted: https://blog.cloudflare.com/path-mtu-discovery-in-pracrice/
I have tried accessing the URL, but it clearly states "The page you are looking for cannot be found."

Eliezer
https://blog.cloudflare.com/path-mtu-discovery-in-practice/

Change r to t
participants (11)

- Bill Walker
- Brian E Carpenter
- Dave Mill
- Don Stokes
- Eliezer Croitoru
- Jasper Bryant-Greene
- Jeremy Visser
- Mark Foster
- Matthew Luckie
- Simon Allard
- Tom Paseka