Re: SERVFAIL - DNS


Thread: Re: SERVFAIL

  1. Re: SERVFAIL

    I wrote:
    >> In response to a posting "Re: Two DNS Servers inside a firewall"
    >> Mark Andrews wrote on September 5:
    >>
    >>
    >>> Below is an example of such a bad delegation. The last SOA
    >>> record should be owned by www.lawlink.nsw.gov.au not
    >>> lawlink.nsw.gov.au. It results in SERVFAIL being returned.
    >>>
    >>> Mark
    >>>
    >>>
    >>> ; <<>> DiG 9.3.4-P1 <<>> aaaa www.lawlink.nsw.gov.au
    >>> ;; global options: printcmd
    >>> ;; Got answer:
    >>> ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 56606
    >>> ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
    >>>
    >>> ;; QUESTION SECTION:
    >>> ;www.lawlink.nsw.gov.au. IN AAAA
    >>>
    >>> ;; Query time: 63 msec
    >>> ;; SERVER: 127.0.0.1#53(127.0.0.1)
    >>> ;; WHEN: Fri Sep 5 12:01:30 2008
    >>> ;; MSG SIZE rcvd: 40
    >>>
    >>> ; <<>> DiG 9.3.4-P1 <<>> www.lawlink.nsw.gov.au aaaa +trace
    >>> ;; global options: printcmd
    >>> . 440024 IN NS h.root-servers.net.
    >>> . 440024 IN NS d.root-servers.net.
    >>> . 440024 IN NS g.root-servers.net.
    >>> . 440024 IN NS i.root-servers.net.
    >>> . 440024 IN NS b.root-servers.net.
    >>> . 440024 IN NS l.root-servers.net.
    >>> . 440024 IN NS m.root-servers.net.
    >>> . 440024 IN NS e.root-servers.net.
    >>> . 440024 IN NS f.root-servers.net.
    >>> . 440024 IN NS a.root-servers.net.
    >>> . 440024 IN NS j.root-servers.net.
    >>> . 440024 IN NS c.root-servers.net.
    >>> . 440024 IN NS k.root-servers.net.
    >>> ;; Received 504 bytes from 127.0.0.1#53(127.0.0.1) in 3 ms
    >>>
    >>> au. 172800 IN NS ns1.audns.net.au.
    >>> au. 172800 IN NS dns1.telstra.net.
    >>> au. 172800 IN NS sec1.apnic.net.
    >>> au. 172800 IN NS sec3.apnic.net.
    >>> au. 172800 IN NS adns1.berkeley.edu.
    >>> au. 172800 IN NS adns2.berkeley.edu.
    >>> au. 172800 IN NS audns.optus.net.
    >>> au. 172800 IN NS aunic.aunic.net.
    >>> ;; Received 430 bytes from 2001:500:1::803f:235#53(h.root-servers.net) in 244 ms
    >>>
    >>> lawlink.nsw.gov.au. 3600 IN NS ns3.uecomm.net.au.
    >>> lawlink.nsw.gov.au. 3600 IN NS ns1.uecomm.net.au.
    >>> lawlink.nsw.gov.au. 3600 IN NS ns2.uecomm.net.au.
    >>> ;; Received 105 bytes from 58.65.255.73#53(ns1.audns.net.au) in 42 ms
    >>>
    >>> www.lawlink.nsw.gov.au. 3600 IN NS ns1.lawlink.nsw.gov.au.
    >>> www.lawlink.nsw.gov.au. 3600 IN NS ns2.lawlink.nsw.gov.au.
    >>> ;; Received 108 bytes from 203.94.128.54#53(ns1.uecomm.net.au) in 39 ms
    >>>
    >>> lawlink.nsw.gov.au. 86400 IN SOA lawlink.nsw.gov.au. administrator.lawlink.nsw.gov.au. 998545544 28800 7200 604800 86400
    >>> ;; Received 144 bytes from 203.3.186.53#53(ns1.lawlink.nsw.gov.au) in 32 ms
    >>>
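The ownership rule Mark describes can be sketched in Python. This is only an illustration under a stated assumption, not BIND's actual code: it assumes the resolver accepts the SOA in a negative response only when the SOA owner is at or below the zone it was most recently referred to. The function names are hypothetical.

```python
def labels(name: str) -> list:
    """Lower-case a domain name and split it into labels (trailing dot ignored)."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def is_at_or_below(name: str, zone: str) -> bool:
    """True if `name` equals `zone` or is a descendant of it."""
    n, z = labels(name), labels(zone)
    return len(n) >= len(z) and n[len(n) - len(z):] == z


def soa_owner_acceptable(delegated_zone: str, soa_owner: str) -> bool:
    """Assumed rule: the SOA owner must lie inside the zone we were referred to."""
    return is_at_or_below(soa_owner, delegated_zone)


# The trace above: the referral named www.lawlink.nsw.gov.au as a zone,
# but the final server answered with an SOA owned by the parent name.
bad = soa_owner_acceptable("www.lawlink.nsw.gov.au.", "lawlink.nsw.gov.au.")
good = soa_owner_acceptable("www.lawlink.nsw.gov.au.", "www.lawlink.nsw.gov.au.")
```

Under this assumed rule `bad` is rejected and `good` is accepted, which matches Mark's statement that the SOA should be owned by www.lawlink.nsw.gov.au.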

    >>
    >>
    >> I have a user who cannot resolve
    >>
    >> www.flickr.com
    >>
    >> The name server I am querying is 9.5.0-P1 (to be updated to a patched
    >> P2 tomorrow). When I query one of the authoritative name servers,
    >> I get:
    >>
    >> oberon% dig www.flickr.com @ns1.yahoo.com.
    >>
    >> ; <<>> DiG 8.3 <<>> www.flickr.com @ns1.yahoo.com.
    >> ; (1 server found)
    >> ;; res options: init recurs defnam dnsrch
    >> ;; got answer:
    >> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4
    >> ;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 5, ADDITIONAL: 5
    >> ;; QUERY SECTION:
    >> ;; www.flickr.com, type = A, class = IN
    >>
    >> ;; ANSWER SECTION:
    >> www.flickr.com. 5M IN CNAME www.flickr.vip.mud.yahoo.com.
    >> www.flickr.vip.mud.yahoo.com. 15M IN A 68.142.214.24
    >>
    >> ;; AUTHORITY SECTION:
    >> mud.yahoo.com. 2D IN NS ns1.yahoo.com.
    >> mud.yahoo.com. 2D IN NS ns2.yahoo.com.
    >> mud.yahoo.com. 2D IN NS ns3.yahoo.com.
    >> mud.yahoo.com. 2D IN NS ns4.yahoo.com.
    >> mud.yahoo.com. 2D IN NS ns5.yahoo.com.
    >>
    >> ;; ADDITIONAL SECTION:
    >> ns1.yahoo.com. 2D IN A 66.218.71.63
    >> ns2.yahoo.com. 2D IN A 68.142.255.16
    >> ns3.yahoo.com. 2D IN A 217.12.4.104
    >> ns4.yahoo.com. 2D IN A 68.142.196.63
    >> ns5.yahoo.com. 30M IN A 119.160.247.124
    >>
    >> ;; Total query time: 64 msec
    >> ;; FROM: oberon.it.anl.gov to SERVER: ns1.yahoo.com. 66.218.71.63
    >> ;; WHEN: Tue Sep 9 13:25:03 2008
    >> ;; MSG SIZE sent: 32 rcvd: 257
    >>
    >> oberon%
    >>
    >> but a general query results in SERVFAIL:
    >>
    >> oberon% dig www.flickr.com
    >>
    >> ; <<>> DiG 8.3 <<>> www.flickr.com
    >> ;; res options: init recurs defnam dnsrch
    >> ;; got answer:
    >> ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 2
    >> ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
    >> ;; QUERY SECTION:
    >> ;; www.flickr.com, type = A, class = IN
    >>
    >> ;; Total query time: 9 msec
    >> ;; FROM: oberon.it.anl.gov to SERVER: default -- 146.139.254.5
    >> ;; WHEN: Tue Sep 9 13:22:46 2008
    >> ;; MSG SIZE sent: 32 rcvd: 32
    >>
    >> oberon%
    >>
    >> I notice that when I query one of the authoritative name servers I
    >> get
    >>
    >> ;; ANSWER SECTION:
    >> www.flickr.com. 5M IN CNAME www.flickr.vip.mud.yahoo.com.
    >> www.flickr.vip.mud.yahoo.com. 15M IN A 68.142.214.24
    >>
    >> ;; AUTHORITY SECTION:
    >> mud.yahoo.com. 2D IN NS ns1.yahoo.com.
    >> mud.yahoo.com. 2D IN NS ns2.yahoo.com.
    >> mud.yahoo.com. 2D IN NS ns3.yahoo.com.
    >> mud.yahoo.com. 2D IN NS ns4.yahoo.com.
    >> mud.yahoo.com. 2D IN NS ns5.yahoo.com.
    >>
    >> Is the SERVFAIL because I queried
    >>
    >> flickr.com
    >>
    >> and the authority is
    >>
    >> mud.yahoo.com ?
    >>



    And Kevin Darcy replied:
    >No, that's perfectly normal. CNAMEs point to names in other domains all
    >the time. The only thing slightly unusual here is that the nameservers
    >for flickr.com also happen to be authoritative for the zone which
    >contains the target of the alias (www.flickr.vip.mud.yahoo.com) and are
    >therefore able to provide the A record without any further need for
    >referral-chasing. But that's _relatively_ normal too.
    >> If not, then why am I getting SERVFAIL? Thanks.
    >>

    >Does a dig +trace for www.flickr.com work?
    >
    >If you have port and/or source-address restrictions in named.conf, make
    >sure you're using the same port and/or source-address for your test
    >queries. Otherwise it's not really a valid test.
    >
    >If you're still getting SERVFAIL for your regular queries, but not for
    >your test queries, dump your cache and see if maybe you're trying to use
    >some bad/stale/obsolete cached glue/referral data in order to resolve
    >the name.
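When comparing a direct test query against a regular recursive query, as Kevin suggests, it can help to pull the status and flags out of each dig header mechanically. The following is a minimal Python sketch; the helper name `parse_dig_header` is hypothetical, and the two sample headers are copied from the dig output earlier in this thread.

```python
import re

# Patterns for the two header lines that dig prints for every response.
STATUS_RE = re.compile(r"status:\s*([A-Z]+),\s*id:\s*(\d+)")
FLAGS_RE = re.compile(r"flags:\s*([a-z ]*);")


def parse_dig_header(output: str) -> dict:
    """Extract the rcode and the header flags from dig's textual output."""
    status = STATUS_RE.search(output)
    flags = FLAGS_RE.search(output)
    return {
        "status": status.group(1) if status else None,
        "flags": flags.group(1).split() if flags else [],
    }


DIRECT = """;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 5, ADDITIONAL: 5"""

RECURSIVE = """;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 2
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0"""

direct = parse_dig_header(DIRECT)
recursive = parse_dig_header(RECURSIVE)
```

Here `direct` carries the `aa` flag with NOERROR, while `recursive` carries `ra` with SERVFAIL, so the failure lies somewhere in the recursive server's own traversal rather than at the authoritative server.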


    I did an "rndc dumpdb", and I did not see any stale glue in the cache.
    But I am not sure exactly what to search for.

    I have no port and/or source-address restrictions in named.conf.
    When I do the "dig www.flickr.com" on my two external DNS servers
    (both 9.5.0-P2 with Jinmei's dumpdb patch) the queries succeed.
    When I issue the command on my two internal DNS servers (one the
    patched -P2 and one still 9.5.0-P1), both servers give SERVFAIL.
    I looked at the source code (query.c) yesterday, and there are 23
    cases for SERVFAIL. Before some of the SERVFAIL lines I see

    CTRACE("...");

    How do I enable this tracing? Or is there another way to determine
    which SERVFAIL code is matching in query.c?
    ----------------------------------------------------------------------
    Barry S. Finkel
    Computing and Information Systems Division
    Argonne National Laboratory Phone: +1 (630) 252-7277
    9700 South Cass Avenue Facsimile:+1 (630) 252-4601
    Building 222, Room D209 Internet: BSFinkel@anl.gov
    Argonne, IL 60439-4828 IBMMAIL: I1004994


  2. Re: SERVFAIL

    >>> I have a user who cannot resolve
    >>>
    >>> www.flickr.com

    ....
    >>> but a general query results in SERVFAIL:


    Hairball!
    --
    Paul Vixie


  3. Re: SERVFAIL

    i believe that the hard part of the traversal for www.flickr.com is:

    ; <<>> DiG 9.4.1-P1 <<>> @ns3.yahoo.com www.flickr.vip.mud.yahoo.com
    ; (1 server found)
    ;; global options: printcmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41226
    ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 5, ADDITIONAL: 5
    ;; WARNING: recursion requested but not available

    ;; QUESTION SECTION:
    ;www.flickr.vip.mud.yahoo.com. IN A

    ;; ANSWER SECTION:
    www.flickr.vip.mud.yahoo.com. 900 IN A 68.142.214.24

    ;; AUTHORITY SECTION:
    mud.yahoo.com. 172800 IN NS ns1.yahoo.com.
    mud.yahoo.com. 172800 IN NS ns2.yahoo.com.
    mud.yahoo.com. 172800 IN NS ns3.yahoo.com.
    mud.yahoo.com. 172800 IN NS ns4.yahoo.com.
    mud.yahoo.com. 172800 IN NS ns5.yahoo.com.

    ;; ADDITIONAL SECTION:
    ns1.yahoo.com. 172800 IN A 66.218.71.63
    ns2.yahoo.com. 172800 IN A 68.142.255.16
    ns3.yahoo.com. 172800 IN A 217.12.4.104
    ns4.yahoo.com. 172800 IN A 68.142.196.63
    ns5.yahoo.com. 1800 IN A 119.160.247.124

    ;; Query time: 153 msec
    ;; SERVER: 217.12.4.104#53(217.12.4.104)
    ;; WHEN: Wed Sep 10 16:58:43 2008
    ;; MSG SIZE rcvd: 232

    because this is a yahoo.com nameserver which is simultaneously answering
    and delegating. this is a sensible thing for it to do since it's
    authoritative for both yahoo.com and mud.yahoo.com, but it's also an
    insensible thing for it to do since the downward referral trumps the
    non-empty answer section. (it would also trump an empty answer
    section which would otherwise be seen as a NODATA response.) i'm not
    throwing stones, since this is ambiguous in the spec, and for all i know
    it's what BIND9 would do. but my own toy traversal tool spake thusly:

    response from 217.12.4.104 (ns3.yahoo.com) is NOERROR (1 1 5 5) (AA)
    down-referral
    downward referral trumps nonempty ANSWER
    cache modified by AUTHORITY
    cache unmodified by ADDITIONAL
    upstream transaction complete (tryagain)
    requires iteration (#3)

    and the complexity thus revealed may behoove yahoo to put the mud.yahoo.com
    zone on separate nameservers (or separate views) from the yahoo.com zone.
    --
    Paul Vixie


  4. Re: SERVFAIL

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA1

    A name server may be authoritative for both a zone and its subzone.
    Your traversal tool is wrong - the server is giving an authoritative
    answer, not a downward referral. Your tool should consider an
    authoritative answer as trumping the authority section, if there is
    any conflict.

    It is common for an authoritative answer to contain the NS records of
    the zone containing the answer, along with any known addresses for
    those servers.

    Chris Buxton
    Professional Services
    Men & Mice

    On Sep 10, 2008, at 10:04 AM, Paul Vixie wrote:

    > i believe that the hard part of the traversal for www.flickr.com is:
    >
    > ; <<>> DiG 9.4.1-P1 <<>> @ns3.yahoo.com www.flickr.vip.mud.yahoo.com
    > ; (1 server found)
    > ;; global options: printcmd
    > ;; Got answer:
    > ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41226
    > ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 5, ADDITIONAL: 5
    > ;; WARNING: recursion requested but not available
    >
    > ;; QUESTION SECTION:
    > ;www.flickr.vip.mud.yahoo.com. IN A
    >
    > ;; ANSWER SECTION:
    > www.flickr.vip.mud.yahoo.com. 900 IN A 68.142.214.24
    >
    > ;; AUTHORITY SECTION:
    > mud.yahoo.com. 172800 IN NS ns1.yahoo.com.
    > mud.yahoo.com. 172800 IN NS ns2.yahoo.com.
    > mud.yahoo.com. 172800 IN NS ns3.yahoo.com.
    > mud.yahoo.com. 172800 IN NS ns4.yahoo.com.
    > mud.yahoo.com. 172800 IN NS ns5.yahoo.com.
    >
    > ;; ADDITIONAL SECTION:
    > ns1.yahoo.com. 172800 IN A 66.218.71.63
    > ns2.yahoo.com. 172800 IN A 68.142.255.16
    > ns3.yahoo.com. 172800 IN A 217.12.4.104
    > ns4.yahoo.com. 172800 IN A 68.142.196.63
    > ns5.yahoo.com. 1800 IN A 119.160.247.124
    >
    > ;; Query time: 153 msec
    > ;; SERVER: 217.12.4.104#53(217.12.4.104)
    > ;; WHEN: Wed Sep 10 16:58:43 2008
    > ;; MSG SIZE rcvd: 232
    >
    > because this is a yahoo.com nameserver which is simultaneously
    > answering
    > and delegating. this is a sensible thing for it to do since it's
    > authoritative for both yahoo.com and mud.yahoo.com, but it's also an
    > insensible thing for it to do since the downward referral trumps the
    > non-empty answer section. (it would also trump an empty answer
    > section which would otherwise be seen as a NODATA response.) i'm not
    > throwing stones, since this is ambiguous in the spec, and for all i
    > know
    > it's what BIND9 would do. but my own toy traversal tool spake thusly:
    >
    > response from 217.12.4.104 (ns3.yahoo.com) is NOERROR (1 1 5 5) (AA)
    > down-referral
    > downward referral trumps nonempty ANSWER
    > cache modified by AUTHORITY
    > cache unmodified by ADDITIONAL
    > upstream transaction complete (tryagain)
    > requires iteration (#3)
    >
    > and the complexity thus revealed may behoove yahoo to put the
    > mud.yahoo.com
    > zone on separate nameservers (or separate views) from the yahoo.com zone.
    > --
    > Paul Vixie
    >


    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.4.8 (Darwin)

    iEYEARECAAYFAkjITE8ACgkQ0p/8Jp6Boi3wgQCfQe8ybx0sENKX80aIn2M1k5tL
    z7UAoJBGxp/JuR/2xEkTl+hS2SqZT1F5
    =bpSG
    -----END PGP SIGNATURE-----


  5. Re: SERVFAIL

    > From: Chris Buxton
    >
    > A name server may be authoritative for both a zone and its subzone. Your
    > traversal tool is wrong - the server is giving an authoritative answer,
    > not a downward referral. Your tool should consider an authoritative
    > answer as trumping the authority section, if there is any conflict.


    chris, i'm not sure you read my earlier statement. i will try again,
    differently. there are many ambiguities in the dns protocol specification
    and this is one of them. the meaning of (AA && ANCOUNT==0) is sometimes
    that there are no records of type=QTYPE and sometimes that there is a zone
    cut between the zone whose server you queried and the zone that contains
    your data, and to disambiguate you have to look at the authority section
    to see if there are some NS RRs whose owner names are below the zone whose
    server you were querying (in which case you know (AA && ANCOUNT==0) means
    it's a delegation) or at the zone you were querying (in which case (AA &&
    ANCOUNT==0) means that there are no records of type==QTYPE).
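The disambiguation described in the paragraph above can be sketched in a few lines of Python. This is an illustrative reading of the prose, not the code of the traversal tool itself; the function names are hypothetical, and the input is assumed to be just the owner names of the NS RRs found in the authority section.

```python
def labels(name):
    """Lower-case a domain name and split it into labels (trailing dot ignored)."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def strictly_below(name, zone):
    """True if `name` is a proper descendant of `zone`."""
    n, z = labels(name), labels(zone)
    return len(n) > len(z) and n[len(n) - len(z):] == z


def classify_empty_authoritative(zone, authority_ns_owners):
    """Interpret a response with AA set and ANCOUNT == 0.

    zone: the zone whose server was queried.
    authority_ns_owners: owner names of the NS RRs in the AUTHORITY section.
    """
    for owner in authority_ns_owners:
        if strictly_below(owner, zone):
            return "delegation"  # a zone cut lies between us and the data
    return "nodata"  # no records of type QTYPE at this name


cut = classify_empty_authoritative("yahoo.com.", ["mud.yahoo.com."])
empty = classify_empty_authoritative("mud.yahoo.com.", ["mud.yahoo.com."])
```

With these inputs `cut` is classified as a delegation and `empty` as a no-data response, because only in the first case do the NS owner names sit below the zone that was queried.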

    in preparing my traversal tool many things dawned on me since i did it from
    memory and without reference to any RFC (except RFC 2671 which i had to refer
    to and which i found to be badly written in the extreme). one of the dawnings
    was that (AA && ANCOUNT>0) actually presents the same ambiguity, since many
    servers will provide an answer if QTYPE=NS even though we all know from the
    years spent on DNSSEC that NS RRs are only authoritative in the child zone.
    therefore my traversal tool looks first to see if there is a delegation (which
    means a non-empty authority section containing NS RRs whose owners are below
    the zone whose servers i'm querying, also called "the bailiwick"). if
    these are present, then the delegation is followed, and the answer, whether
    empty or not, is ignored.
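The two competing policies in this thread, delegation first versus authoritative answer first, can be compared side by side in Python. This is a sketch under assumptions: the `Response` type and both function names are hypothetical, and neither function is taken from the traversal tool or from BIND. The sample response is the one from ns3.yahoo.com shown earlier (AA set, one answer, five NS RRs owned by mud.yahoo.com).

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    aa: bool                 # authoritative-answer flag
    answer_count: int        # ANCOUNT
    ns_owners: list = field(default_factory=list)  # owners of NS RRs in AUTHORITY


def labels(name):
    return [label for label in name.lower().rstrip(".").split(".") if label]


def strictly_below(name, zone):
    n, z = labels(name), labels(zone)
    return len(n) > len(z) and n[len(n) - len(z):] == z


def referral_first(resp, bailiwick):
    """Delegation-first policy: a downward referral trumps the ANSWER section."""
    cuts = [owner for owner in resp.ns_owners if strictly_below(owner, bailiwick)]
    if cuts:
        return ("iterate", cuts[0])   # re-ask the servers of the child zone
    if resp.answer_count > 0:
        return ("answer", None)
    return ("nodata", None)


def answer_first(resp, bailiwick):
    """Answer-first policy: an authoritative, non-empty answer is final."""
    if resp.aa and resp.answer_count > 0:
        return ("answer", None)
    return referral_first(resp, bailiwick)


ns3_reply = Response(aa=True, answer_count=1, ns_owners=["mud.yahoo.com."] * 5)
vixie = referral_first(ns3_reply, "yahoo.com.")
buxton = answer_first(ns3_reply, "yahoo.com.")
```

For this response the delegation-first policy decides to iterate into mud.yahoo.com and ask again, while the answer-first policy accepts the A record immediately. Once the bailiwick has moved down to mud.yahoo.com, both policies accept the answer, which is why the cost of the first policy is one repeated query.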

    this is not the only possible way to interpret the ambiguities of RFC 1034
    and RFC 1035, but i like it since it helps work around various
    misconfigurations which have in the past caused me to cache bad data. now,
    the server isn't doing the wrong thing, but the server operator had better
    be prepared to accept the same query a second time. the real problem, if
    there is a problem, is that a server for both a zone and its child, has no
    way to know what bailiwick the resolver has iterated down to. there is no
    fix for the server absent this important information.

    > It is common for an authoritative answer to contain the NS records of the
    > zone containing the answer, along with any known addresses for those
    > servers.


    thanks for explaining that.


  6. Re: SERVFAIL

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA1

    On Sep 10, 2008, at 4:26 PM, Paul Vixie wrote:
    >> From: Chris Buxton
    >>
    >> A name server may be authoritative for both a zone and its
    >> subzone. Your
    >> traversal tool is wrong - the server is giving an authoritative
    >> answer,
    >> not a downward referral. Your tool should consider an authoritative
    >> answer as trumping the authority section, if there is any conflict.

    >
    > chris, i'm not sure you read my earlier statement.


    I thought I had. It appeared to me you had made a mistake. Perhaps I
    read it wrong; I'm willing to be corrected. However, I have had
    occasion to correct your thinking once before...

    > i will try again,
    > differently. there are many ambiguities in the dns protocol
    > specification
    > and this is one of them. the meaning of (AA && ANCOUNT==0) is
    > sometimes
    > that there are no records of type=QTYPE


    an NXRRSet (aka no data) response

    > and sometimes that there is a zone
    > cut between the zone whose server you queried and the zone that
    > contains
    > your data,


    Really? I've never seen a referral marked with the aa flag. (But you
    obviously have more years of experience than I.) I thought it was
    pretty clear in the RFC that a referral does not constitute an
    authoritative answer - it's neither authoritative nor an answer.

    Can you point to a name server version that gives a pure referral with
    the aa flag set?

    > and to disambiguate you have to look at the authority section
    > to see if there are some NS RRs whose owner names are below the zone
    > whose
    > server you were querying (in which case you know (AA && ANCOUNT==0)
    > means
    > it's a delegation)


    Again, I thought aa was supposed to be reserved for (final) answers.
    Now granted I've seen the aa flag on answers from resolvers when the
    answer did not come from cache, but that's a different issue. If you
    send an iterative query and the response has aa set, it should be both
    authoritative and an answer (not a referral).

    > or at the zone you were querying (in which case (AA &&
    > ANCOUNT==0) means that there are no records of type==QTYPE).


    Right, the NXRRSet response.

    > in preparing my traversal tool many things dawned on me since i did
    > it from
    > memory and without reference to any RFC (except RFC 2671 which i had
    > to refer
    > to and which i found to be badly written in the extreme).


    I'll defer to the opinion of the author, but I've seen worse. The
    RFCs describing IDNA, ACE, and punycode, for example, are completely
    opaque to me.

    > one of the dawnings
    > was that (AA && ANCOUNT>0) actually presents the same ambiguity,
    > since many
    > servers will provide an answer if QTYPE=NS even though we all know
    > from the
    > years spent on DNSSEC that NS RRs are only authoritative in the
    > child zone.


    Can you point to a name server software version that marks delegation
    NS records as authoritative even when specifically asked for them? I
    would call that a protocol violation. I've seen them go in the answer
    section (BIND 8), and I've seen them put in the authority section
    (BIND 9), but I've never seen them marked with aa.

    > therefore my traversal tool looks first to see if there is a
    > delegation (which
    > means a non-empty authority section containing NS RRs whose owners
    > are below
    > the zone whose servers i'm querying, also called "the bailiwick").
    > if
    > these are present, then the delegation is followed, and the answer,
    > whether
    > empty or not, is ignored.


    Hmm...

    If we don't allow below-bailiwick assertions of authority in an
    authoritative answer, then the resolver has to consider it a
    delegation and (effectively) re-send the query (which then requires
    that the NS records be accurate...). If we do, then if a broken name
    server sets aa for a referral, we have a problem. However, the damage
    is limited to the zones served by that server and the descendants of
    those zones.

    However, let's suppose that we consider it a delegation and resend the
    query - in this case, the query goes to one of the same servers as the
    parent zone. You had described this as "the hard part of the
    traversal"; I don't see how throwing an extra query into the recursion
    process would result in a SERVFAIL response.

    In fact, a BIND 9.4.x resolver on my laptop is able to look up www.flickr.com/IN/A
    just fine. I don't have 9.5 installed to test with, but unless it's
    doing something different in the resolver algorithm, I would guess
    this is a configuration, resource, or network/routing/firewall issue
    for Barry.

    > this is not the only possible way to interpret the ambiguities of
    > RFC 1034
    > and RFC 1035, but i like it since it helps work around various
    > misconfigurations which have in the past caused me to cache bad
    > data. now,
    > the server isn't doing the wrong thing, but the server operator had
    > better
    > be prepared to accept the same query a second time. the real
    > problem, if
    > there is a problem, is that a server for both a zone and its child,
    > has no
    > way to know what bailiwick the resolver has iterated down to. there
    > is no
    > fix for the server absent this important information.


    Exactly. The responding server simply sends the best answer it can
    make from available data.

    This sounds like you're arguing against including NS records in the
    authority section of an answer (as opposed to a referral), because it
    can confuse the resolver. This is the default behavior of at least one
    non-BIND name server - an authoritative positive answer has just an
    answer section.

    Chris Buxton
    Professional Services
    Men & Mice

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.4.8 (Darwin)

    iEYEARECAAYFAkjIaH0ACgkQ0p/8Jp6Boi2KMgCfVJQnjaOF2bim6VnceggslBIq
    J5gAn0cOTSvPwaEKKtGIncxMIo1q3pIT
    =RRty
    -----END PGP SIGNATURE-----


  7. Re: SERVFAIL

    > Really? I've never seen a referral marked with the aa flag. (But you
    > obviously have more years of experience than I.) I thought it was
    > pretty clear in the RFC that a referral does not constitute an
    > authoritative answer - it's neither authoritative nor an answer.


    load balancers do funny things.

    > Can you point to a name server version that gives a pure referral with
    > the aa flag set?


    not offhand. i ran into them when coding my tool, and coded around them.

    > Again, I thought aa was supposed to be reserved for (final) answers.


    it is. supposed to be, that is.

    > > or at the zone you were querying (in which case (AA &&
    > > ANCOUNT==0) means that there are no records of type==QTYPE).

    >
    > Right, the NXRRSet response.


    i wish you'd stop calling it that. i added the nxrrset rcode in RFC 2136
    but it is only applicable to updates. if you mean ANCOUNT==0, say so.

    > > ... and without reference to any RFC (except RFC 2671 which i had to
    > > refer to and which i found to be badly written in the extreme).

    >
    > I'll defer to the opinion of the author, but I've seen worse. The RFCs
    > describing IDNA, ACE, and punycode, for example, are completely opaque to
    > me.


    RFC 2671 seemed perfectly adequate to me as its author. it wasn't until
    years later, when i tried to implement EDNS0 using that document as my
    reference, that its many shortcomings glowed in neon.

    > Can you point to a name server software version that marks delegation NS
    > records as authoritative even when specifically asked for them? I would
    > call that a protocol violation. I've seen them go in the answer section
    > (BIND 8), and I've seen them put in the authority section (BIND 9), but
    > I've never seen them marked with aa.


    try bind4.8 i think. or the microsoft NT 3.51 resource kit. or many load
    balancers. it's only a protocol "violation" if the other guy can't and also
    won't work around it.

    > If we don't allow below-bailiwick assertions of authority in an
    > authoritative answer, then the resolver has to consider it a
    > delegation and (effectively) re-send the query (which then requires
    > that the NS records be accurate...).


    yes. which works, by the way.

    > However, let's suppose that we consider it a delegation and resend the
    > query - in this case, the query goes to one of the same servers as the
    > parent zone. You had described this as "the hard part of the traversal";
    > I don't see how throwing an extra query into the recursion process would
    > result in a SERVFAIL response.


    i don't know that it is the cause of this servfail, only that it's
    suspicious based on the output of my toy resolver in "trace" mode.

    > In fact, a BIND 9.4.x resolver on my laptop is able to look up
    > www.flickr.com/IN/A just fine. I don't have 9.5 installed to test with,
    > but unless it's doing something different in the resolver algorithm, I
    > would guess this is a configuration, resource, or network / routing /
    > firewall issue for Barry.


    i am sure that if bind had a problem looking up flickr, it would be on the
    front page of every newspaper, etc. so i agree that this is likely to be
    some kind of middlebox issue. but possibly a data dependent middlebox issue.

    > This sounds like you're arguing against including NS records in the
    > authority section of an answer (as opposed to a referral), because it
    > can confuse the resolver. This is the default behavior of at least one
    > non-BIND name server - an authoritative positive answer has just an
    > answer section.


    i'm not arguing that the server should do anything differently. i just think
    server operators should be prepared for some extra queries when they do this.

