I don't think I could've found as much insight into the sockets library in a week of searching the intertubes as there is contained in that there email. Bravo!


From: Perry Lorier [mailto:perry@coders.net]
Sent: Mon 16/02/2009 10:58 a.m.
To: Joe Abley
Cc: NZNOG@list.waikato.ac.nz
Subject: Re: [nznog] FileMaker Pro 10 - now with IPv6


>> On Thu, 2009-02-12 at 14:45 +1300, Ian Batterbee wrote:
>>    
>>> But wasn't there a big discussion at NZNOG about how applications
>>> shouldn't know anything (or care) about the underlying communication
>>> protocols being used ?
>>>      
>> Yes, and wouldn't it be nice if *libc* didn't want applications to 
>> make
>> different calls when using IPv6 addresses vs. IPv4 addresses?
>>    
>
> Surely the bits of libc that are address-family dependent are those 
> that remain entrenched for historical reasons and rely upon data 
> structures that carry endpoint addresses in 32-bit words. That being 
> the case, requiring the use of different APIs seems fairly 
> understandable.
>
>  
The BSD sockets layer was designed from the outset to support multiple
address families (AFs).

There is a structure called "struct sockaddr" which represents a
"generic" address of some unknown type.  It contains a "sa_family"
member which tells you what type of address it is (AF_INET meaning IPv4
address, AF_UNIX meaning unix domain socket, and so on).  You then cast
it to the correct type (struct sockaddr_in for AF_INET, or struct
sockaddr_un for AF_UNIX) to get the correct structure for the correct
address family.

All system calls (bind,connect,accept and so on) take a (struct sockaddr
*), and thus can work with any address family.
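
As a rough sketch of the cast-by-family pattern described above (the
function name sockaddr_port is made up for illustration, it isn't part
of any library), code that receives a generic address might look like:

```c
#include <sys/socket.h>
#include <sys/un.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: return the port (host byte order) carried by a
 * generic address, or -1 if the address family has no port number. */
int sockaddr_port(const struct sockaddr *sa)
{
    switch (sa->sa_family) {
    case AF_INET:   /* really a struct sockaddr_in */
        return ntohs(((const struct sockaddr_in *)sa)->sin_port);
    case AF_INET6:  /* really a struct sockaddr_in6 */
        return ntohs(((const struct sockaddr_in6 *)sa)->sin6_port);
    default:        /* AF_UNIX and friends carry no port */
        return -1;
    }
}
```

The caller only ever handles a (struct sockaddr *); the family check
decides which concrete structure it is safe to cast to.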

However, some of the libc functions weren't designed in a way that
makes IPv6 useful.  gethostbyname(3), for instance, can return a list of
addresses, but they all have to be of the same address family (thus you
can't get AF_INET and AF_INET6 addresses at the same time).  Even more
annoyingly, the programmer doesn't get to ask whether they get AF_INET
or AF_INET6 addresses back; the call just returns one or the other (in
practice, only AF_INET addresses).  This appears to have been "fixed" at
some point by the getipnodebyname() call, which lets you specify that
you want AF_INET or AF_INET6 (or any other address family) addresses.
Not that this really solves the problem.

Thus these were all superseded by getaddrinfo(), which, given a name,
can return a list of addresses, in a specific order, of varying address
families.

So old apps use gethostbyname(), which is a nice trivial interface that
doesn't do what you want.  New apps use getaddrinfo(), which is a useful
interface that does what you want, but is slightly more annoying to use
(you have to provide a lot more information to it saying what you want),
so it involves a lot more typing.  Nobody ever uses getipnodebyname().
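
To give a feel for the extra typing, here is a minimal sketch of a
family-agnostic lookup with getaddrinfo(), the POSIX call that maps a
name to a list of addresses.  The helper name count_addresses and the
idea of just counting results per family are only for illustration:

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper: resolve host/service and count how many IPv4 and
 * IPv6 results come back.  Returns 0 or a getaddrinfo() error code. */
int count_addresses(const char *host, const char *service,
                    int *n_v4, int *n_v6)
{
    struct addrinfo hints, *res, *ai;
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* any address family */
    hints.ai_socktype = SOCK_STREAM;  /* one entry per address */

    *n_v4 = *n_v6 = 0;
    rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return rc;
    }
    /* The result is a linked list that may mix address families. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        if (ai->ai_family == AF_INET)
            (*n_v4)++;
        else if (ai->ai_family == AF_INET6)
            (*n_v6)++;
    }
    freeaddrinfo(res);
    return 0;
}
```

A real client would normally try connect() on each entry in turn rather
than count them, using ai_addr and ai_addrlen straight from the list.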

Things get even more annoying when you consider that "struct sockaddr"
was supposed to be big enough to hold any address type.  But it's not
quite big enough to fit an IPv6 address and all the ancillary
information (port number, scope ID and so on), so now there's "struct
sockaddr_storage".  So if you're allocating space for a generic address
you use "sockaddr_storage"; if you want to talk about a generic address
of unknown type, you use "sockaddr"; and if you want IPv4, IPv6 or unix
domain sockets, you use sockaddr_in, sockaddr_in6, or sockaddr_un
respectively.
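
A small sketch of that split between "allocate" and "refer to" (the
helper save_address is invented for this example):

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: copy an address of any family into storage the
 * caller owns.  dst is sockaddr_storage because it allocates space; src
 * is a plain sockaddr pointer because it only refers to an address.
 * Returns 0 on success, -1 if len is larger than any valid address. */
int save_address(struct sockaddr_storage *dst,
                 const struct sockaddr *src, socklen_t len)
{
    if (len > sizeof *dst)
        return -1;
    memset(dst, 0, sizeof *dst);
    memcpy(dst, src, len);
    return 0;
}
```

The same rule applies to accept() or recvfrom(): declare the buffer as
struct sockaddr_storage and pass it cast to (struct sockaddr *), since
on common platforms a plain struct sockaddr is smaller than a struct
sockaddr_in6.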

Then we get into the messy condition of converting addresses to a
printable form (not resolving them, just displaying them).

The original interface to convert a sockaddr_in to a printable address
was inet_ntoa().  This has the problem that it only works on an IPv4
address (sockaddr_ins generally contain an IP address and a port
number), and doesn't work with IPv6 addresses at all.  Then there's
inet_ntop(), which can convert an IPv4 or IPv6 address, but you have to
figure out whether you're talking about IPv4 or IPv6 addresses before
you call it, which again makes it mostly useless and annoying to use.
The "modern" way is to use getnameinfo() and pass in the NI_NUMERICHOST
flag, which avoids trying to resolve the address.

If your application specifically wants to talk about just IPv4 addresses
it can in theory use "struct in_addr", and "struct in6_addr" for IPv6
addresses. Don't do this. :P

Converting an application to IPv6 is generally fairly straightforward.
Convert the code that called gethostbyname() to getaddrinfo(), and
convert anything that used inet_*to*() to use getnameinfo() with
NI_NUMERICHOST (for printing) or getaddrinfo() with AI_NUMERICHOST (for
parsing).  Convert anything that's listen()ing on sockets to also
listen on a v6 socket.  Things get hairier when you want to step
through a list of addresses returned by getaddrinfo().  Make sure that
some of the "sockaddr"s become "sockaddr_storage"s (but only the ones
that allocate space, not ones that refer to a generic address that's
allocated elsewhere).
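
One way to do the "also listen on a v6 socket" step is to let
getaddrinfo() with AI_PASSIVE hand back the wildcard addresses and open
one socket per entry.  This is only a sketch: the helper name
open_listeners, the backlog of 16 and the choice to set IPV6_V6ONLY are
my assumptions, not the one true way:

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/* Hypothetical helper: open one listening TCP socket per local wildcard
 * address (typically 0.0.0.0 and ::).  Stores up to max descriptors in
 * fds; returns how many were opened, or -1 if getaddrinfo() failed. */
int open_listeners(const char *service, int *fds, int max)
{
    struct addrinfo hints, *res, *ai;
    int n = 0;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE;       /* wildcard addresses for bind() */

    if (getaddrinfo(NULL, service, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL && n < max; ai = ai->ai_next) {
        int on = 1;
        int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;                  /* family not available here */
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);
        if (ai->ai_family == AF_INET6) /* stop the v6 socket claiming v4 too */
            setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof on);
        if (bind(fd, ai->ai_addr, ai->ai_addrlen) < 0 ||
            listen(fd, 16) < 0) {
            close(fd);
            continue;
        }
        fds[n++] = fd;
    }
    freeaddrinfo(res);
    return n;
}
```

The caller then has to poll()/select() across all the returned
descriptors instead of blocking in accept() on a single one, which is
usually the part of the conversion that touches the most code.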

And then you have to deal with internal address policies.  Where are
addresses displayed?  Are they stored, or transferred over the network?
Are they hashed?  Are there matches or other bits of code that know
about the structure of an IP address being used?  Does the application
use : at the end of an address to represent a port number?  Do the
internal protocol(s) use : for anything special?  This is often where
porting an application becomes really difficult.
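
The colon problem in particular tends to bite anything that parses
"host:port" strings.  A sketch of a parser that accepts the common
"[v6addr]:port" bracket convention is below; split_host_port is a
made-up name, and rejecting an unbracketed IPv6 literal as ambiguous is
a policy choice of this example:

```c
#include <string.h>

/* Hypothetical helper: split "host:port" or "[v6addr]:port".
 * Returns 0 on success, -1 if the input is malformed or ambiguous. */
int split_host_port(const char *in, char *host, size_t hostlen,
                    char *port, size_t portlen)
{
    const char *host_start, *host_end, *colon;

    if (in[0] == '[') {                  /* bracketed IPv6 literal */
        host_start = in + 1;
        host_end = strchr(host_start, ']');
        if (host_end == NULL || host_end[1] != ':')
            return -1;
        colon = host_end + 1;
    } else {
        colon = strchr(in, ':');
        if (colon == NULL || strchr(colon + 1, ':') != NULL)
            return -1;                   /* no port, or bare IPv6 */
        host_start = in;
        host_end = colon;
    }
    if (host_end == host_start || colon[1] == '\0')
        return -1;                       /* empty host or empty port */
    if ((size_t)(host_end - host_start) >= hostlen ||
        strlen(colon + 1) >= portlen)
        return -1;                       /* would not fit */

    memcpy(host, host_start, (size_t)(host_end - host_start));
    host[host_end - host_start] = '\0';
    strcpy(port, colon + 1);
    return 0;
}
```

With this, "192.0.2.1:80" and "[2001:db8::1]:443" both parse, while a
bare "2001:db8::1" is refused rather than guessed at.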




