Latest revision as of 09:55, 15 January 2016
Using libc functions
Name resolution features are provided by the GNU C Library (glibc), which is not yet ready for proper IPv6 and dual-stack operation, as you can see when performing your tests. The C library comes with its own testing tool, getent, which has a special database called ahosts that runs getaddrinfo(), the library function that translates names to objects with addressing information. For your testing it is best used together with tools like strace, ltrace or even gdb so that you know exactly what is happening behind the scenes.
As the getent tool is very primitive, we created a tool called getaddrinfo, named after the library function, that handles a larger subset of the function's API.
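If neither tool is at hand, the same library call can be exercised from a script. The following is a minimal sketch, assuming Python's standard socket module as a thin wrapper around getaddrinfo(); the ahosts helper and its output layout are our own illustration and only approximate what getent ahosts prints.

```python
import socket

# Map socket type constants to the labels printed by `getent ahosts`.
SOCKTYPE_LABELS = {
    socket.SOCK_STREAM: "STREAM",
    socket.SOCK_DGRAM: "DGRAM",
    socket.SOCK_RAW: "RAW",
}


def ahosts(nodename, flags=0):
    """Return lines resembling `getent ahosts` output for nodename.

    Hypothetical helper for illustration; AI_CANONNAME is always added so
    that the canonical name can be shown next to the first result.
    """
    lines = []
    results = socket.getaddrinfo(nodename, None, flags=flags | socket.AI_CANONNAME)
    for family, socktype, proto, canonname, sockaddr in results:
        label = SOCKTYPE_LABELS.get(socktype, str(socktype))
        # sockaddr[0] is the textual address for both AF_INET and AF_INET6.
        # Empty canonical names are dropped from the line.
        lines.append(" ".join(filter(None, [sockaddr[0], label, canonname])))
    return lines
```

Printing each line of ahosts("www.nix.cz") should give a listing comparable to the getent examples further down, and running the script under strace shows the same library activity.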
Name resolution input
When an application requests addressing information for a hostname with an optional service name, the library returns a list of addressing information objects. The order of objects in the list is significant and depends on operating system configuration and connectivity.
From the application
- nodename
- servname
- protocol
- socktype
- flags such as AI_CANONNAME
- ...
From local configuration and connectivity checks
- Files in /etc/ including nsswitch, hosts, services and more
- To what extent IPv4 and IPv6 are available
From the outside world
- DNS information
- Multicast DNS information
- LDAP information
Name resolution processing
What is requested
Not all information is requested at all times. Some information, like the canonical name, must be explicitly requested by the application via the AI_CANONNAME flag. It may be desirable to suppress other requests based on local configuration or connectivity checks, a notable example being suppression of DNS AAAA queries on hosts without global IPv6 connectivity.
What is passed to the application
Not all information that is learnt via requests is presented to the application. It is typically filtered according to input from the application. It is sometimes also filtered according to connectivity checks but that has caused more problems than improvements.
How it is sorted
There are rules for sorting the addressing information returned by getaddrinfo(). One of the basic features is to return global IPv6 destinations before global IPv4 destinations. But when the library detects that IPv6 connectivity is not available, the reverse applies.
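The family preference described above can be illustrated with a deliberately simplified sketch. It is not the algorithm glibc uses; the real sorting applies more rules than a plain family preference. The function name and the ipv6_available parameter are assumptions made for this example.

```python
import ipaddress


def sort_destinations(addresses, ipv6_available):
    """Order destination addresses by family, preferring IPv6 when usable.

    Simplified illustration only: it models just the IPv6-first versus
    IPv4-first decision, nothing else.
    """
    preferred_version = 6 if ipv6_available else 4
    parsed = [ipaddress.ip_address(a) for a in addresses]
    # sorted() is stable, so the original order is kept within each family.
    # False sorts before True, so the preferred family comes first.
    ordered = sorted(parsed, key=lambda ip: ip.version != preferred_version)
    return [str(ip) for ip in ordered]
```

With the two addresses used in the test cases below, the IPv6 address comes first when ipv6_available is true and last otherwise.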
Using netresolve
There is an experimental package called netresolve (TODO: not yet in Fedora) that consists of a library somewhat similar to the glibc name resolution API implementation and a set of debugging tools. Any application using the libc API and a couple of other APIs can be run using wrapresolve to use the netresolve implementation instead and benefit from some advanced features and more extensive debugging. When using the libc backend, netresolve can also be used to test the glibc implementation. With the nss backend it can be used to test glibc nsswitch backends directly.
$ ./netresolve --node www.nix.cz
response netresolve 0.0.1
name info.nix.cz
ip 2a02:38::1001 any any 0 0 0 21599
ip 195.47.235.3 any any 0 0 0 12589
You can see that netresolve behaves slightly differently from getent ahosts: by default it returns one item per IP address. But you can easily tweak it to behave the same way.
$ ./netresolve --node www.nix.cz --service ''
response netresolve 0.0.1
name info.nix.cz
ip 2a02:38::1001 stream tcp 0 0 0 21021
ip 2a02:38::1001 dgram udp 0 0 0 21021
ip 2a02:38::1001 raw any 0 0 0 21021
ip 195.47.235.3 stream tcp 0 0 0 12592
ip 195.47.235.3 dgram udp 0 0 0 12592
ip 195.47.235.3 raw any 0 0 0 12592
Note that netresolve is both the name of the command and the name of the library. The above tests are done using the command, but the same results would be given to an application using the library. Unless you explicitly request otherwise, netresolve uses its internal name resolution modules and not the libc functions.
Comparison of name resolution APIs
Now that there is the good old glibc implementation plus a new name resolution library, I would like to maintain an honest comparison table of their features.
To begin with, I'd like to present the API possibilities. Note that glibc is limited at two stages: one is the getaddrinfo API and the other is the nsswitch backend API. Unfortunately, the nsswitch backend API has multiple entry points (even for the same task) and none of them provides a superset of the others' functionality.
Feature | getaddrinfo | getaddrinfo_a | asyncns | c-ares | netresolve | notes |
---|---|---|---|---|---|---|
POSIX | yes | no | no | no | no | |
glibc | yes | yes | no | no | no | |
nonblocking API | no | yes | yes | yes | yes | |
nonblocking backends | no | no | no | N/A | yes | |
pollable FD API | no | no | yes | yes | yes | |
extensible API | no | no | no | N/A | yes | |
connect/bind wrappers | no | no | no | no | yes | |
happy eyeballs | no | no | no | no | yes | netresolve optimizes DNS, TCP and (with application support) UDP |
SRV records | no | no | DNS API | yes | yes | |
DNS API | no | no | yes | yes | yes | |
input: address family | possible | possible | possible | yes | yes | glibc nsswitch broken (gethostbyname2/gethostbyname3 only) |
output: ifindex/scope_id | possible | possible | possible | N/A | yes | glibc nsswitch broken (gethostbyname4 only) |
TTL | no | no | ? | yes | yes | glibc nsswitch broken (gethostbyname3/gethostbyname4 only) |
output: validity (e.g. DNSSEC) | no | no | no | yes | planned | |
And some notable backend implementations
Source | glibc | netresolve | notes |
---|---|---|---|
/etc/hosts | broken | yes | may even return a different address family at times |
/etc/services | yes | yes | |
DNS | yes | yes | |
custom binary/script | no | yes | |
TBD... |
Test cases
Isolated host name resolution
Test workflow
- Check localhost name resolution using the table below, including result order
- Repeat the tests on other connectivity configurations (except disabled IPv6)
- Repeat the tests with AI_ADDRCONFIG
You can use the getaddrinfo command from the netresolve package; see the example below.
$ getaddrinfo --raw --node localhost
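The workflow can also be scripted so that all three address families are checked in one go. The sketch below assumes Python's socket.getaddrinfo() wrapper; resolve_row and FAMILIES are hypothetical names, and errors are reported by number and message rather than by the symbolic EAI_* names used in the expected results table.

```python
import socket

# Address families to test, in the column order of the expected results.
FAMILIES = {
    "AF_UNSPEC": socket.AF_UNSPEC,
    "AF_INET6": socket.AF_INET6,
    "AF_INET": socket.AF_INET,
}


def resolve_row(nodename, flags=0):
    """Return {family name: ordered address list or error text} for one nodename."""
    row = {}
    for label, family in FAMILIES.items():
        try:
            results = socket.getaddrinfo(
                nodename, 0, family=family, type=socket.SOCK_STREAM, flags=flags
            )
            # Keep the order returned by the library; it is part of the test.
            row[label] = [sockaddr[0] for *_, sockaddr in results]
        except socket.gaierror as error:
            # Name resolution failures end up here (for example no data
            # for the requested family).
            row[label] = f"error {error.errno}: {error.strerror}"
    return row
```

Calling resolve_row for None, "localhost", "localhost4" and "localhost6" gives one row per nodename to compare against the expected results, and passing flags=socket.AI_ADDRCONFIG covers the last step of the workflow.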
nodename | AF_UNSPEC | AF_INET6 | AF_INET |
---|---|---|---|
NULL or "" | ::0, 0.0.0.0 | ::0 | 0.0.0.0 |
"localhost" | ::1, 127.0.0.1 | ::1 | 127.0.0.1 |
"localhost4" | 127.0.0.1 | EAI_NODATA | 127.0.0.1 |
"localhost6" | ::1 | ::1 | EAI_NODATA |
Dual-stack host, destination with global IPv4 and IPv6
On a host with IPv4 and IPv6 connectivity we request addressing information of another host that is announced as dual-stack in DNS.
$ getent ahosts www.nix.cz
2a02:38::1001 STREAM info.nix.cz
2a02:38::1001 DGRAM
2a02:38::1001 RAW
195.47.235.3 STREAM
195.47.235.3 DGRAM
195.47.235.3 RAW
You can see that getaddrinfo() returned six items for two unique IP addresses with the IPv6 address sorted first. If you expected only two items, one for each IP address, see upstream bug 14990.
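An application that only cares about the addresses can collapse such a list itself. This is a minimal sketch, assuming results shaped like the tuples returned by Python's socket.getaddrinfo(); unique_addresses is a hypothetical helper, not part of any library.

```python
def unique_addresses(results):
    """Collapse getaddrinfo()-style results to one entry per IP address.

    The order of first appearance is preserved, so the library's sorting
    (for example IPv6 before IPv4) survives the deduplication.
    """
    seen = set()
    unique = []
    for family, socktype, proto, canonname, sockaddr in results:
        address = sockaddr[0]
        if address not in seen:
            seen.add(address)
            unique.append(address)
    return unique
```

Alternatively, passing a specific socket type as a hint to getaddrinfo() avoids the per-type duplicates in the first place.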
IPv4-only host, destination with global IPv4 and IPv6
We do the same on a host without IPv6 connectivity.
$ getent ahosts www.nix.cz
195.47.235.3 STREAM info.nix.cz
195.47.235.3 DGRAM
195.47.235.3 RAW
2a02:38::1001 STREAM
2a02:38::1001 DGRAM
2a02:38::1001 RAW
You can see that the result is the same as before except that IPv4 is sorted first.
Dual-stack to dual-stack with lost AAAA answer
Source connectivity | IPv4 | Global or masqueraded |
---|---|---|
Source connectivity | IPv6 | Global |
Target connectivity | IPv4 | Global |
Target connectivity | IPv6 | Global |
Other | | AAAA request or reply is lost |
Same as above, except that the AAAA answer is lost by a broken DNS server.
What is tested
- Whether the component reverts to IPv4 in reasonable time when the AAAA answer gets lost.
Steps to reproduce
- Block IPv6 DNS packets on the firewall.
- Let the client connect to a dual-stack server.
- Check all tested properties including the delay.
Expected result (sequential, IPv6 preferred)
- Host requests AAAA record and gives up after a delay (e.g. 15 seconds).
- Host requests A record and receives reply.
- Host connects via IPv4.
Expected result (parallel, IPv6 preferred)
- Host requests A and AAAA records simultaneously and receives the A reply.
- Host gives up waiting for AAAA record after a short delay (e.g. 300 milliseconds).
- Host connects via IPv4.
Bad result (parallel, first result wins)
- Host requests A and AAAA records simultaneously and receives the A reply.
- Host connects via IPv4.
- No delay.
Rationale: IPv6 should be preferred over IPv4 by default.
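The parallel, IPv6-preferred behaviour can be sketched with simulated lookups. This is an illustration under stated assumptions, not how any particular resolver is implemented: query_a and query_aaaa are hypothetical coroutine functions returning lists of address strings, and aaaa_grace mirrors the 300 millisecond example delay.

```python
import asyncio


async def resolve_parallel(query_a, query_aaaa, aaaa_grace=0.3):
    """Run both lookups at once; prefer AAAA if it arrives within the grace period."""
    # Start both queries before waiting for either of them.
    task_a = asyncio.create_task(query_a())
    task_aaaa = asyncio.create_task(query_aaaa())
    ipv4 = await task_a
    try:
        # Once the A reply is in, wait only a short while longer for AAAA.
        ipv6 = await asyncio.wait_for(task_aaaa, timeout=aaaa_grace)
    except asyncio.TimeoutError:
        # AAAA answer considered lost; wait_for() has cancelled the task.
        ipv6 = []
    # IPv6 results go first so that IPv6 stays preferred when available.
    return ipv6 + ipv4
```

The "first result wins" variant would return as soon as the A reply arrives, which removes the delay but also gives up the IPv6 preference even when the AAAA answer is only slightly slower.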
IPv4 to dual-stack with lost AAAA answer
Source connectivity | IPv4 | Global or masqueraded |
---|---|---|
Source connectivity | IPv6 | Link-local (alternatively None or Disabled) |
Target connectivity | IPv4 | Global |
Target connectivity | IPv6 | Global |
Other | | AAAA request or reply is lost |
Same as above, except that the host doesn't have any IPv6 address except link-local and loopback.
What is tested
- Whether the component suppresses AAAA queries when lacking global IPv6 connectivity.
Steps to reproduce
- Block IPv6 DNS packets on the firewall.
- Let the client connect to a dual-stack server.
- Check all tested properties including absence of a delay.
Expected result
- Host requests A record and receives reply.
- Host connects via IPv4.
- No delay, no AAAA query.
Bad result
- Host requests A and AAAA records and receives the A reply.
- Host gives up waiting for AAAA reply.
- Host connects via IPv4 after an excessive delay.
Rationale: IPv4-only host should connect to IPv4 address without delay.
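The decision tested here can be sketched as a small function. It follows the wording of this test case (any IPv6 address other than link-local and loopback counts as connectivity) and is an assumption about a reasonable check, not a description of what glibc or netresolve do; both function names are hypothetical.

```python
import ipaddress


def has_usable_ipv6(local_addresses):
    """Return True if any local address is IPv6 and neither loopback nor link-local."""
    for text in local_addresses:
        ip = ipaddress.ip_address(text)
        if ip.version == 6 and not (ip.is_loopback or ip.is_link_local):
            return True
    return False


def query_types(local_addresses):
    """Pick the DNS record types to request; skip AAAA without usable IPv6."""
    if has_usable_ipv6(local_addresses):
        return ["AAAA", "A"]
    return ["A"]
```

For a host that only has 127.0.0.1, ::1, a link-local fe80:: address and one IPv4 address, query_types returns only the A record type, so a lost AAAA answer cannot cause a delay.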