Revision as of 00:28, 22 November 2012
Name resolution
Resolving using getaddrinfo() in applications
The getaddrinfo() function is a dual-stack-friendly API for name resolution. Applications use it to translate host and service names into a linked list of struct addrinfo objects.
Running getaddrinfo()
An example of a getaddrinfo() call:
    const char *node = "www.fedoraproject.org";
    const char *service = "http";
    struct addrinfo hints = {
        .ai_family = AF_UNSPEC,
        .ai_socktype = SOCK_DGRAM,
        .ai_flags = 0,
        .ai_protocol = 0,
        .ai_canonname = NULL,
        .ai_addr = NULL,
        .ai_next = NULL
    };
    struct addrinfo *result;
    int error;

    error = getaddrinfo(node, service, &hints, &result);
The input of getaddrinfo() consists of node specification, service specification and further hints.
- node: literal IPv4 or IPv6 address, or a hostname to be resolved
- service: numeric port number or a symbolic service name
- hints.ai_family: enable dual-protocol (AF_UNSPEC), IPv4-only (AF_INET) or IPv6-only (AF_INET6) queries
- hints.ai_socktype: select socket type (and thus the transport protocol, for example SOCK_STREAM for TCP or SOCK_DGRAM for UDP)
getaddrinfo() can be further tweaked with hints.ai_flags. Other attributes are either not needed (ai_protocol) or not supposed to be set in hints (ai_canonname, ai_addr and ai_next).
On success, error is set to 0 and result points to a linked list of one or more struct addrinfo objects.
Never assume that getaddrinfo() returns only one result or that the first result actually works!
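To make the warning above concrete, here is a minimal sketch in Python (used for brevity; the helper name list_addresses is illustrative and not part of any API) that collects every address the resolver offers instead of only the first one:

```python
import socket

def list_addresses(host, service, socktype=socket.SOCK_STREAM):
    """Return every (family, sockaddr) pair the resolver offers for host."""
    results = socket.getaddrinfo(host, service, socket.AF_UNSPEC, socktype)
    # Each entry is (family, socktype, protocol, canonname, sockaddr).
    return [(family, sockaddr)
            for family, _socktype, _protocol, _canonname, sockaddr in results]
```

On a dual-stack machine, calling list_addresses("localhost", 80) will often yield both an IPv6 and an IPv4 entry, though the exact list and its order depend on the local configuration.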
Using getaddrinfo() results
It is necessary to try all results until one successfully connects. This works perfectly for TCP connections as they can fail gracefully at this stage.
    struct addrinfo *item;
    int sock;

    /* Walk the whole list; stop at the first address that connects. */
    for (item = result; item; item = item->ai_next) {
        sock = socket(item->ai_family, item->ai_socktype, item->ai_protocol);

        if (sock == -1)
            continue;

        if (connect(sock, item->ai_addr, item->ai_addrlen) != -1) {
            fprintf(stderr, "Connected successfully.\n");
            break;
        }

        close(sock);
    }
    /* item is NULL here if every address failed. */
For UDP, connect() succeeds without contacting the other side (if you are using connect() with UDP at all). Therefore you might want to perform additional actions (such as sending a message and receiving a reply) before crying out "success!".
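One way to do such a check is sketched below in Python. It assumes the remote service answers any datagram with some reply; the function name udp_probe, the payload and the timeout are illustrative choices, not part of any protocol or API.

```python
import socket

def udp_probe(host, port, payload=b"ping", timeout=1.0):
    """Try each resolved address; return the first socket that gets a reply.

    Returns None if no address answers. The caller closes the returned socket.
    """
    results = socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_DGRAM)
    for family, socktype, protocol, _canonname, sockaddr in results:
        try:
            sock = socket.socket(family, socktype, protocol)
        except OSError:
            continue
        try:
            sock.settimeout(timeout)
            sock.connect(sockaddr)  # only records the peer, sends nothing
            sock.send(payload)
            sock.recv(4096)         # raises on timeout or on an ICMP error
            return sock
        except OSError:
            sock.close()
    return None
```

A timeout is treated the same as an explicit error here, so a slow but working server may be skipped; a real client would pick the timeout and retry policy to match its protocol.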
Freeing getaddrinfo() results
When we're done with the results, we'll free the linked list.
freeaddrinfo(result);
Using getaddrinfo() in Python
Python's socket.getaddrinfo() API tries to be a little bit more sane than the C API.
    #!/usr/bin/python3

    import sys, socket

    host = "www.fedoraproject.org"
    service = "http"
    family = socket.AF_UNSPEC
    socktype = socket.SOCK_DGRAM
    protocol = 0
    flags = 0

    result = socket.getaddrinfo(host, service, family, socktype, protocol, flags)

    sock = None
    for family, socktype, protocol, canonname, sockaddr in result:
        try:
            sock = socket.socket(family, socktype, protocol)
        except socket.error:
            continue
        try:
            sock.connect(sockaddr)
            print("Successfully connected to: {}".format(sockaddr))
        except socket.error:
            sock.close()
            sock = None
            continue
        break

    if sock is None:
        print("Failed to connect.", file=sys.stderr)
        sys.exit(1)
Tweaking getaddrinfo() flags
- AI_NUMERICHOST: use literal address, don't perform host resolution
- AI_PASSIVE: return socket addresses suitable for bind() instead of connect(), sendto() and sendmsg()
- AI_NUMERICSERV: use numeric service, don't perform service resolution
- AI_CANONNAME: save canonical name to the first result
- AI_ADDRCONFIG: this never really worked, as far as I know
- AI_V4MAPPED+AI_ALL: only with AF_INET6, return IPv4 addresses mapped into IPv6 space
- AI_V4MAPPED: I don't see any real use for this, only returns mapped IPv4 if there are no IPv6 addresses
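As a rough illustration of the less controversial flags, the following Python sketch combines AI_NUMERICHOST with AI_NUMERICSERV to parse literals without any lookup, and uses AI_PASSIVE with no node to obtain wildcard addresses for bind(). The helper names resolve_numeric and bind_addresses are made up for this example.

```python
import socket

def resolve_numeric(address, port):
    """Parse a literal address and port without any name or service lookup."""
    flags = socket.AI_NUMERICHOST | socket.AI_NUMERICSERV
    return socket.getaddrinfo(address, str(port), socket.AF_UNSPEC,
                              socket.SOCK_STREAM, 0, flags)

def bind_addresses(port):
    """Return wildcard addresses suitable for bind(), using AI_PASSIVE."""
    flags = socket.AI_PASSIVE | socket.AI_NUMERICSERV
    # With AI_PASSIVE and no node, the resolver returns wildcard addresses.
    return socket.getaddrinfo(None, str(port), socket.AF_UNSPEC,
                              socket.SOCK_STREAM, 0, flags)
```

Passing a hostname rather than a literal to resolve_numeric() raises socket.gaierror, which makes it usable as a strict "is this an address literal?" check.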
Flag AI_ADDRCONFIG considered harmful
As far as I know, AI_ADDRCONFIG was added for the following reasons:
- Some buggy DNS servers would be confused by AAAA requests
- Reducing the number of DNS queries
Currently, I'm aware of several documents that define AI_ADDRCONFIG:
- POSIX.1-2008: useless but harmless
- RFC 3493 (informational): useless but (partially) breaks IPv4/IPv6 localhost
- RFC 2553 (obsolete informational): useless but hopefully harmless
- GLIBC getaddrinfo(3): like RFC 3493
Actual GLIBC getaddrinfo() behavior differs from the manual page.
Problem statement
Currently, any of the definitions above makes AI_ADDRCONFIG a no-op when a link-local IPv6 address is present. These addresses are automatically added to interfaces, including interfaces that are otherwise used only for IPv4. Therefore, on a typical Linux system, AI_ADDRCONFIG never meets its goals.
It also builds on a false assumption: that no communication over a given protocol is feasible without a non-loopback address of that protocol. But why would we have a loopback address if we can't use it for node-local communication? AI_ADDRCONFIG breaks localhost, localhost4, localhost6, 127.0.0.1, ::1 and more if there's no non-loopback address of the respective protocol.
This can happen if the computer is connected to an IPv4-only network or an IPv6-only network, when it loses IPv4 or IPv6 connectivity, and when it's used offline.
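Whether a given machine is affected can be checked with a small Python sketch like the one below. The function name addrconfig_breaks is illustrative, and the outcome depends entirely on the local libc and on which addresses are configured when it runs.

```python
import socket

def addrconfig_breaks(literals=("127.0.0.1", "::1")):
    """Map each literal to True if resolving it with AI_ADDRCONFIG fails."""
    outcome = {}
    for literal in literals:
        try:
            socket.getaddrinfo(literal, None, socket.AF_UNSPEC,
                               socket.SOCK_STREAM, 0, socket.AI_ADDRCONFIG)
            outcome[literal] = False
        except socket.gaierror:
            # The resolver refused the literal, the failure described above.
            outcome[literal] = True
    return outcome
```

Running it once online and once with all non-loopback addresses removed is a simple way to compare the two situations.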
What next
A possible solution for the first problem (that AI_ADDRCONFIG is useless) is to treat link-local addresses the same as loopback (or node-local) addresses. But this is even more harmful.
Fedora's GLIBC was patched to do exactly the above thing. The consequence was that even link-local IPv6 stopped working when a global IPv6 address was absent.
... leaves out IPv6 link-local addresses which are automatically added to active interfaces. Therefore the whole AI_ADDRCONFIG is not in effect in the cases it was made for. A patch was used in Fedora, that also disregarded ...
The whole idea of filtering out non-DNS addresses is flawed and breaks even IPv4 and IPv6 literals.
Filtering of non-DNS addresses in getaddrinfo() has no real use and only causes problems. There's no reason to filter based on the mere existence of addresses. Filtering based on the existence of a global address may only be desirable for global address resolution, which is DNS. But that should be done by the DNS resolver, which should only ask for addresses that make sense and only accept addresses that it asked for.
- IPv4: getaddrinfo("127.0.0.1", ...) fails with some AI_ADDRCONFIG configurations
- IPv6: Fedora 808147 - getaddrinfo("::1", ...) fails with some configurations of AI_ADDRCONFIG
- IPv6: getaddrinfo("fe80::1234:56ff:fe78:90%eth0", ...) also fails as above
- IPv6: GLIBC's nsswitch doesn't support overriding getaddrinfo, which is required to resolve link-local IPv6 addresses