DNS format error: their fault or mine?
Sorry if this is off topic.

I'm seeing a bunch of this (and have been for ages) in my bind logs: http://pastebin.com/ydL0RhG4

(sample line for archives, when paste goes away:
Nov 28 23:14:58 jet named[28427]: DNS format error from 120.204.202.200#53 resolving ns-os1.qq.com/AAAA: invalid response
)

My understanding is qq.com is a Chinese IM service (which my flatmate uses). Dig doesn't tell me about any error; it merely doesn't return any AAAA records.

Any suggestions on:
- whether I should care, or just filter it out of my logcheck email
- whether there's something I can fix
- whether there's something I can do to figure out in more detail what's happening

Note that this is my caching NS, on my home network (behind NAT).

Thanks,
Richard
AFAIK this is because the "primary nameserver" field in the SOA record doesn't match one of the NS records in the delegation from the parent zone.

Mark Andrews at ISC: https://lists.isc.org/pipermail/bind-users/2010-April/079804.html

In theory this indicates a potential MITM attack, but more likely just ignorance of the people running the nameservers, so I'd just filter it.

-Martin

On Thu, 29 Nov 2012, Richard Hector wrote:
Sorry if this is off topic.
I'm seeing a bunch of this (and have been for ages) in my bind logs:
Nov 28 23:14:58 jet named[28427]: DNS format error from 120.204.202.200#53 resolving ns-os1.qq.com/AAAA: invalid response
On 29/11/12 10:17, Martin D Kealey wrote:
AFAIK this is because the "primary nameserver" field in the SOA record doesn't match one of the NS records in the delegation from the parent zone.
Mark Andrews at ISC: https://lists.isc.org/pipermail/bind-users/2010-April/079804.html
Actually that's a secondary symptom, not the reason.
In theory this indicates a potential MITM attack, but more likely just ignorance of the people running the nameservers, so I'd just filter it.
What I have seen is some kind of device in the path doing packet inspection and not being able to understand the type you are requesting. When that happens, in some cases the reply is dropped; in other cases it is mangled. You can't rule out the presence of the Great Firewall of China; there are a few documented cases:

http://arstechnica.com/tech-policy/2010/03/china-censorship-leaks-outside-gr...
http://www.renesys.com/blog/2010/06/two-strikes-i-root.shtml

Cheers,
-Martin
On Thu, 29 Nov 2012, Richard Hector wrote:
Sorry if this is off topic.
I'm seeing a bunch of this (and have been for ages) in my bind logs:
Nov 28 23:14:58 jet named[28427]: DNS format error from 120.204.202.200#53 resolving ns-os1.qq.com/AAAA: invalid response
_______________________________________________ NZNOG mailing list NZNOG(a)list.waikato.ac.nz http://list.waikato.ac.nz/mailman/listinfo/nznog
-- Sebastian Castro DNS Specialist .nz Registry Services (New Zealand Domain Name Registry Limited) desk: +64 4 495 2337 mobile: +64 21 400535
On 29/11/12 10:17, Martin D Kealey wrote:
AFAIK this is because the "primary nameserver" field in the SOA record doesn't match one of the NS records in the delegation from the parent zone.
Absolutely not. The SOA primary name server not being one of the delegated name servers is not an error; in fact it's a common configuration. Case in point, nz and its 2LDs are configured:

    @   SOA   loopback.dns.net.nz. soa.nzrs.net.nz. 2012112944 900 300 604800 3600
        NS    ns1.dns.net.nz.
        NS    ns2.dns.net.nz.
        NS    ns3.dns.net.nz.
        NS    ns4.dns.net.nz.
        NS    ns5.dns.net.nz.
        NS    ns6.dns.net.nz.
        NS    ns7.dns.net.nz.

"loopback.dns.net.nz" points to 127.0.0.1, and you'll notice that it's not in the NS set or the parent delegation.

This was originally done to deal with broken/misconfigured Microsoft Windows hosts sending DNS updates to the NZ nameservers as part of their network initialisation; the update process would look up the primary name server in the SOA, and send the update there. The updates were ignored, of course, but produced a steady stream of unnecessary traffic. Setting the primary nameserver to "loopback.dns.net.nz", address 127.0.0.1, meant that the update would get sent up the broken host's own backside.

In reality, none of the listed name servers are primary. The actual primaries (two) are not reachable from the Internet at large, so the "primary nameserver" is not useful information to anyone. (Such "hidden primaries" are pretty much standard equipment on significant DNS infrastructures.)

No part of normal DNS name resolution uses the SOA primary nameserver field. (You do get a copy of the SOA record in negative responses, but in that case the primary name server field is not used.)
Mark Andrews at ISC: https://lists.isc.org/pipermail/bind-users/2010-April/079804.html
... talks about bogus SOA records being returned for domains the device isn't authoritative for by broken DNS code, mostly on nasty load balancers. Which is actually what is happening here.

"ns-os1.qq.com" is delegated to:

    $ dig ns-os1.qq.com. NS @ns1.qq.com.
    ns-os1.qq.com.    86400  IN  NS  ns-os1.qq.com.
    ns-os1.qq.com.    86400  IN  NS  ns-cmn1.qq.com.
    ns-os1.qq.com.    86400  IN  NS  ns-cnc1.qq.com.

If you do an A query to these you get sense:

    $ dig ns-os1.qq.com. A @ns-os1.qq.com.
    ns-os1.qq.com.    600  IN  A  202.55.2.230
    ns-os1.qq.com.    600  IN  A  114.134.85.106

but if you do an AAAA query ... not so much:

    $ dig ns-os1.qq.com. AAAA @ns-os1.qq.com.
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49063
    ;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
    ...
    ;; AUTHORITY SECTION:
    qq.com.    300  IN  SOA  ns1.qq.com. webmaster.qq.com. 1350444472 300 600 86400 300

Since ns-os1.qq.com is supposed to be authoritative for the domain ns-os1.qq.com, it should be returning the SOA for ns-os1.qq.com, not qq.com (that's the left hand field in the response). I suspect, as per Mark Andrews' post, it's a dodgy DNS load balancer. In fact any query other than an A query (including SOA and NS queries) to it gets exactly that response. It's just plain broken.

So to answer Richard's question: it's their fault, you don't need to care, just filter the message out of logcheck if it's bothering you.

-- don
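For anyone who wants to filter only this message rather than a whole logging category, here is a minimal sketch of a pattern that matches the sample line from the original post. The pattern and the function name `is_format_error_noise` are illustrative assumptions, not something taken from logcheck's shipped rules; it deliberately avoids POSIX character classes so the same expression should also be usable as an extended regex.

```python
import re

# Hypothetical pattern, written to match the syslog line quoted in this thread.
# Assumes the traditional "Mon DD HH:MM:SS host named[pid]:" syslog prefix.
FORMAT_ERROR = re.compile(
    r"^[A-Za-z]{3} [ 0-9]{2} [0-9:]{8} [A-Za-z0-9._-]+ named\[[0-9]+\]: "
    r"DNS format error from [0-9A-Fa-f.:]+#[0-9]+ "
    r"resolving [^ ]+/[A-Z0-9]+: invalid response$"
)


def is_format_error_noise(line: str) -> bool:
    """Return True if the log line is a 'DNS format error ... invalid response'."""
    return FORMAT_ERROR.match(line.rstrip("\n")) is not None


sample = (
    "Nov 28 23:14:58 jet named[28427]: DNS format error from "
    "120.204.202.200#53 resolving ns-os1.qq.com/AAAA: invalid response"
)
print(is_format_error_noise(sample))
```

Matching the full message text, rather than anything from named, keeps other resolver messages visible. Check the expression against logcheck's own rule format before relying on it.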
On 29/11/12 16:21, Don Stokes wrote: [loads of useful info]
So to answer Richard's question: it's their fault, you don't need to care, just filter the message out of logcheck if it's bothering you.
Thanks heaps.

Actually I followed up an off-list reply I got, and added

    logging { category resolver { null; }; };

to my named.conf.options - based on the suggestion that most/all of what would be logged in that category would be stuff I can't do anything about. Is that reasonable?

Incidentally, on the subject of SOA records - either I'm confused about the last entry, or People are Wrong on the Internet.

Most references call it 'Minimum TTL', where 'DNS & BIND' 4th ed calls it 'Negative Caching TTL'. RFC 2308 (from 1998) says the minimum ttl usage is deprecated.

Sample reference: http://support.microsoft.com/kb/163971 (updated in 2007).

Am I right in thinking this is one of these things that is wrong, but people just can't get out of their heads, like classful addressing?

Richard
On 29/11/12 19:58, Richard Hector wrote:
Actually I followed up an off-list reply I got, and added
logging { category resolver { null; }; };
to my named.conf.options - based on the suggestion that most/all of what would be logged in that category would be stuff I can't do anything about. Is that reasonable?
Yep. Take that out if you need to debug a really weird problem. But yeah, most errors of that class just tell you that there is broken stuff on the Internet.
Incidentally, on the subject of SOA records - either I'm confused about the last entry, or People are Wrong on the Internet.
Most references call it 'Minimum TTL', where 'DNS & BIND' 4th ed calls it 'Negative Caching TTL'. RFC 2308 (from 1998) says the minimum ttl usage is deprecated.
It was defined as the "minimum TTL" in RFC 1035. To quote the scriptures, it is "the unsigned 32 bit minimum TTL field that should be exported with any RR from this zone."

In practice, BIND did not implement this. Nobody noticed. What they did notice was that hosts would ask for the same non-existent data over and over without caching the fact that it wasn't there, but because it wasn't there, there was no way to communicate how long it should retain that fact. TTLs follow RRs; no RR, no TTL.

So RFC 2308 re-purposed the "minimum TTL" field to be the "negative TTL". What that means is that an authoritative server, on being asked for an RRset that doesn't exist, will return a response that contains no data (and an NXDOMAIN code if the name doesn't exist at all, or NOERROR if the name exists but not the class/type requested), with an SOA record in the authority section.

Normally, the server will manufacture the TTL of the SOA from the negative TTL field of the SOA. For example, given:

    domain.  3600  SOA  primary.domain. contact.domain. 1354100330 3600 600 3600000 300

the response to a query for does-not-exist.domain. will be:

    domain.  300  SOA  primary.domain. contact.domain. 1354100330 3600 600 3600000 300

the TTL coming from the last field of the SOA. On the other hand, an actual query for the SOA will give the former record in response, because that's a real record and has a real, not manufactured, TTL. That is, the TTL for the SOA record is an hour, but the negative TTL is 5 minutes.

RFC 2308 states that the smaller of the SOA record's own TTL and the negative TTL value should be sent as the TTL for a negative answer, and it is this TTL that applies to caching of that answer.

So now you can ask for "does-not-exist.domain" as many times as you like, and your DNS forwarder will only bother the authoritative nameservers for "domain" about it once every five minutes.

On 29/11/12 17:40, Martin D Kealey wrote:
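The RFC 2308 rule described above comes down to taking a minimum. A minimal sketch, using the values from the example SOA record; the function name `negative_cache_ttl` is made up for illustration and is not part of any DNS library:

```python
def negative_cache_ttl(soa_record_ttl: int, soa_minimum: int) -> int:
    """TTL to attach to the SOA in a negative answer, per the rule quoted above.

    soa_record_ttl: the SOA record's own TTL (3600 in the example).
    soa_minimum:    the last SOA field, the negative TTL (300 in the example).
    """
    return min(soa_record_ttl, soa_minimum)


# Example record from the message: SOA TTL of an hour, negative TTL of 5 minutes.
print(negative_cache_ttl(3600, 300))  # → 300
```

If the SOA record's own TTL were shorter than the last field, say 120 against 300, the negative answer would be cached for 120 seconds instead.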
So the problem is essentially that the response indicated a lame delegation to itself?
No, the problem is that the response was "out of bailiwick", i.e. we expected an answer that referred to something in the zone we asked for ("ns-os1.qq.com", and we know that from the NS records we got from the "qq.com" nameservers), and got an answer about its parent zone ("qq.com"), which wasn't expected.

Imagine if I could answer a query to my domain with a response that (if processed) affected further processing of a parent zone, or even the root zone? That would be a bad thing, right? Well, that type of behaviour is what BIND is detecting, reporting and discarding in this case.

-- don
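The bailiwick test can be illustrated with a simple label comparison. This is a simplified sketch of the idea only, not BIND's actual implementation; the helper names `labels` and `is_within_zone` are assumptions made for the example.

```python
def labels(name: str) -> list:
    """Split a domain name into lower-cased labels, ignoring the trailing dot."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def is_within_zone(owner: str, zone: str) -> bool:
    """True if `owner` is the zone apex or a name below it."""
    owner_labels, zone_labels = labels(owner), labels(zone)
    if not zone_labels:
        return True  # the root zone contains every name
    # The owner must end with exactly the zone's labels.
    return owner_labels[-len(zone_labels):] == zone_labels


# The resolver was working in the delegated zone ns-os1.qq.com,
# but the SOA in the authority section was owned by qq.com.
print(is_within_zone("qq.com.", "ns-os1.qq.com."))         # → False
print(is_within_zone("ns-os1.qq.com.", "ns-os1.qq.com."))  # → True
```

Comparing whole labels rather than string suffixes matters: a plain `endswith("qq.com")` would wrongly accept a name such as "notqq.com".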
On 2012-11-28, at 22:21, Don Stokes wrote:
On 29/11/12 10:17, Martin D Kealey wrote:
AFAIK this is because the "primary nameserver" field in the SOA record doesn't match one of the NS records in the delegation from the parent zone.
Absolutely not. The SOA primary name server not being one of the delegated name servers is not an error, in fact it's a common configuration.
The MNAME field has a single use these days, which is to identify the server responsible for accepting DNS UPDATE (RFC2136) requests. Zones that don't support dynamic updates can either not care about any junk update requests they receive at the nominated server, or set the MNAME to something that will cause such junk requests to be sent elsewhere.

Don, your example of loopback.dns.net.nz is a good one. I tend to be more crude in my zones, e.g. see below.

[I once tried to persuade the dnsop working group at the IETF that enshrining an empty field to mean "not applicable" in MNAME and also as an MX target was a good idea. Nobody agreed, really. I then attempted to reserve SINK.ARPA as a name that would definitively never exist, so that you could use that as an MNAME or an MX target for zones that don't want dynamic updates or mail, and that failed in the IESG. A procedural document requiring IAB review which defined and required the non-existence of something that already didn't exist made people curiously angry.]

Fully agree that there's no requirement for the MNAME to be populated with anything that appears in the apex NS set RDATA.

    [krill:~]% dig hopcount.ca soa +noall +answer

    ; <<>> DiG 9.8.3-P1 <<>> hopcount.ca soa +noall +answer
    ;; global options: +cmd
    hopcount.ca.    86377  IN  SOA  . jabley.hopcount.ca. 1304604691 28800 3600 604800 3600
    [krill:~]%

Joe
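To make the field names in this discussion concrete, here is a small sketch that splits SOA RDATA (as printed by dig) into its seven fields and checks whether the MNAME appears in a given NS set. The names `SOA`, `parse_soa_rdata` and `mname_in_ns_set` are hypothetical helpers for illustration; a `False` result from the last one is, as argued above, perfectly legitimate.

```python
from typing import NamedTuple


class SOA(NamedTuple):
    mname: str    # "primary nameserver" field; target for DNS UPDATE
    rname: str    # contact mailbox, with the "@" written as "."
    serial: int
    refresh: int
    retry: int
    expire: int
    minimum: int  # negative caching TTL since RFC 2308


def parse_soa_rdata(rdata: str) -> SOA:
    """Parse the RDATA part of an SOA record in presentation format."""
    mname, rname, *numbers = rdata.split()
    if len(numbers) != 5:
        raise ValueError("expected 5 numeric SOA fields, got %d" % len(numbers))
    serial, refresh, retry, expire, minimum = (int(n) for n in numbers)
    return SOA(mname, rname, serial, refresh, retry, expire, minimum)


def mname_in_ns_set(soa: SOA, ns_names) -> bool:
    """True if the MNAME is one of the given NS names (case-insensitive)."""
    def normalise(name: str) -> str:
        return name.lower().rstrip(".")
    return normalise(soa.mname) in {normalise(n) for n in ns_names}


nz = parse_soa_rdata(
    "loopback.dns.net.nz. soa.nzrs.net.nz. 2012112944 900 300 604800 3600"
)
nz_ns = ["ns%d.dns.net.nz." % i for i in range(1, 8)]
print(nz.mname, mname_in_ns_set(nz, nz_ns))
```

The same parser handles the hopcount.ca record, where the MNAME is simply ".".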
On 30/11/12 10:08, Joe Abley wrote:
Don, your example of loopback.dns.net.nz is a good one. I tend to be more crude in my zones, e.g. see below.
[ '.' in the SOA MNAME field ]

Yeah, I thought about that. But in the end I decided that giving a positive A record answer for the MNAME would mean that the 1 day TTLs on the answers would apply and sites with the broken hosts wouldn't bother us or anyone else for another 24 hours, whereas negative TTLs are generally shorter, and negative caching was iffy at the time.

I'd be lying if I said I didn't think the behaviour of sending unsolicited UPDATEs by default was obnoxious, or that telling the broken hosts to update themselves didn't tickle my sense of humour...

On 30/11/12 10:28, Phil Regnauld wrote:
I'd have hoped loopback.dns.net.nz to be v6-enabled :)
Of course I totally didn't expect that ... ;-) Note though that this was done a dozen years ago, when the flood of UPDATEs was actually a (small) problem, and never revisited. I'm actually curious as to how common unsolicited UPDATEs are now. -- don
On 30/11/12 12:22, Don Stokes wrote:
On 30/11/12 10:08, Joe Abley wrote:
Don, your example of loopback.dns.net.nz is a good one. I tend to be more crude in my zones, e.g. see below.
On 30/11/12 10:28, Phil Regnauld wrote:
I'd have hoped loopback.dns.net.nz to be v6-enabled :)
Of course I totally didn't expect that ... ;-)
Note though that this was done a dozen years ago, when the flood of UPDATEs was actually a (small) problem, and never revisited. I'm actually curious as to how common unsolicited UPDATEs are now.
The .nz nameservers see in the order of 0.1 to 0.5 updates per second. We haven't analyzed the sources of those updates. Cheers,
-- don
-- Sebastian Castro DNS Specialist .nz Registry Services (New Zealand Domain Name Registry Limited) desk: +64 4 495 2337 mobile: +64 21 400535
participants (6):
- Don Stokes
- Joe Abley
- Martin D Kealey
- Phil Regnauld
- Richard Hector
- Sebastian Castro