Here are the before and after pictures. We used to have two independent single points of failure.
If boris was down, nobody could get an IP address, and internal name resolution didn’t work because the cache couldn’t see the tinydns server (on boris) that is authoritative for the internal network.
If draco the firewall was down, nobody could connect to the internet (after all, that is the point of having a firewall), and external name resolution didn’t work. But internal name resolution didn’t work either, because the single caching name server (dnscache) lived on draco.
That’s a result of the “djb way,” where name servers are separate from caches. Tinydns won’t answer queries for a domain it isn’t authoritative for, and dnscache never gives an authoritative answer: it only processes requests from clients, gets answers from authoritative servers, and caches the results.
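To make the split concrete, here is a toy model of the two roles in Python. It is only a sketch: the class names, the `example.internal` zone used in the usage note, and the lack of TTL handling are my own simplifications, not how tinydns or dnscache are implemented.

```python
from dataclasses import dataclass, field


@dataclass
class AuthoritativeServer:
    """Toy stand-in for tinydns: answers only for its own zone."""
    zone: str
    records: dict  # fully qualified name -> IP address
    up: bool = True

    def query(self, name):
        if not self.up:
            raise ConnectionError("server is down")
        if not (name == self.zone or name.endswith("." + self.zone)):
            return None  # outside the zone: this server has nothing to say
        return self.records.get(name)


@dataclass
class Cache:
    """Toy stand-in for dnscache: asks authoritative servers, caches results."""
    upstreams: list
    cache: dict = field(default_factory=dict)

    def resolve(self, name):
        if name in self.cache:
            return self.cache[name]
        for server in self.upstreams:
            try:
                answer = server.query(name)
            except ConnectionError:
                continue  # unreachable, try the next authoritative server
            if answer is not None:
                self.cache[name] = answer  # real caches also honor TTLs
                return answer
        return None
```

With a single `AuthoritativeServer` in `upstreams`, setting `up = False` makes every uncached internal name unresolvable, which is the old failure mode described above.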
In the new setup, there are no single points of failure for dhcp or name service. (There’s still a single firewall host.) We moved all functions off of boris, which is slated for decommissioning.
Draco keeps a dnscache, since it has easy access to outside name servers. It picks up a dhcpd server and associated tinydns. I made it the dhcpd master, because it’s the least-frequently-rebooted server.
Since dnscache and tinydns both listen on the same port (53/udp, the domain service), I had to configure an ethernet alias on draco and serve tinydns from that address. (I couldn’t use localhost, because the tinydns on draco needs to be visible to the other dnscache server if the other tinydns server is down.)
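The port clash is easy to reproduce with plain sockets. The sketch below is an illustration only: it uses an OS-assigned unprivileged port instead of 53, and loopback addresses instead of draco's real address and alias, both of which are my assumptions for the sake of a runnable example.

```python
import socket


def can_bind(address, port):
    """Try to bind a UDP socket to address:port; return (ok, socket_or_None)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((address, port))
    except OSError:
        sock.close()
        return False, None
    return True, sock


def demo():
    # Port 0 asks the OS for a free unprivileged port; the real services use 53.
    ok, first = can_bind("127.0.0.1", 0)
    assert ok
    port = first.getsockname()[1]

    # A second listener on the same address and port is refused.
    same_addr_ok, second = can_bind("127.0.0.1", port)

    # The same port on a different address is a separate socket. 127.0.0.2
    # is usable by default on Linux; other systems may need an alias first.
    other_addr_ok, third = can_bind("127.0.0.2", port)

    for sock in (first, second, third):
        if sock is not None:
            sock.close()
    return same_addr_ok, other_addr_ok
```

`demo()` returns a pair of booleans. The first should be `False` (same address, same port). On Linux the second should be `True`, since the whole 127.0.0.0/8 range is local; that second address plays the role of the alias.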
Taurus, a new machine still figuring out its identity (external web/intranet/wiki/login/database/…?), gets the dhcpd slave server and an associated tinydns. At first I thought it made sense to separate the dhcpd from the tinydns server, but eventually I recognized that if one dhcpd server was down, there would be no point in trying to fetch its dhcpd.leases file to build a dns table. The functions are so closely associated: dhcp gives named hosts an IP address, while tinydns reports what that IP address is.
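Building the dns table from dhcpd.leases could look roughly like the sketch below. It is not the script I run; it assumes ISC-style lease blocks with a `client-hostname` line, a made-up domain passed in by the caller, and an arbitrary TTL of 300 seconds. It also ignores lease expiry and any nested braces inside a lease block, which a real script would have to handle.

```python
import re

# One "lease <ip> { ... }" block; the body runs to the first closing brace.
LEASE_RE = re.compile(r"lease\s+(\d+\.\d+\.\d+\.\d+)\s*\{(.*?)\}", re.DOTALL)
HOSTNAME_RE = re.compile(r'client-hostname\s+"([^"]+)"\s*;')


def leases_to_tinydns(leases_text, domain, ttl=300):
    """Turn dhcpd.leases text into tinydns-data '=' lines (A plus PTR)."""
    by_name = {}
    for ip, body in LEASE_RE.findall(leases_text):
        match = HOSTNAME_RE.search(body)
        if match:
            # The leases file is a log: later entries supersede earlier ones.
            by_name[match.group(1).lower()] = ip
    return [f"={name}.{domain}:{ip}:{ttl}" for name, ip in sorted(by_name.items())]
```

Each returned line uses the tinydns-data `=fqdn:ip:ttl` form, which creates both the forward and the reverse record. Leases without a client hostname are skipped.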
Scorpio, the backup server (BackupPC/snapback), picks up an additional function as an internal name cache. That was a last-minute decision when I realized I needed a second name cache, and I’m not entirely happy about it. Since scorpio carries backup data, I’d like to be able to unplug it and take it offsite (e.g., if I’m reconstructing a crashed machine from its backups) without disrupting the network. So that’s going to have to change. I suppose there’s no deep reason why I can’t put the dnscache on taurus, but I’d have to configure another alias…
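The before and after layouts can be checked mechanically. In this sketch the two dictionaries simply restate the placement described above; the function name is my own, and it only looks at the three services listed, so it does not model the firewall or the dependencies between services.

```python
OLD_LAYOUT = {
    "dhcpd": {"boris"},
    "tinydns": {"boris"},
    "dnscache": {"draco"},
}

NEW_LAYOUT = {
    "dhcpd": {"draco", "taurus"},
    "tinydns": {"draco", "taurus"},
    "dnscache": {"draco", "scorpio"},
}


def single_points_of_failure(layout):
    """Map each host to the services that vanish entirely if it goes down."""
    hosts = set().union(*layout.values())
    result = {}
    for host in sorted(hosts):
        # A service is lost when this host is its only home.
        lost = sorted(svc for svc, where in layout.items() if where <= {host})
        if lost:
            result[host] = lost
    return result
```

For `OLD_LAYOUT` this reports boris (dhcpd, tinydns) and draco (dnscache); for `NEW_LAYOUT` it returns an empty dictionary. Moving the second dnscache from scorpio to taurus would keep that result as long as draco still carries the other one.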
That’s the overall layout for redundant dhcpd and internal name service using tinydns and dnscache. Next post: “challenges” (aka annoying problems) I encountered making this work.