DNS high availability using ExaBGP

Posted in high availability by aangelis on 17 March 2016

When an infrastructure must meet high-availability standards, all of its components must be resilient to temporary or permanent failure.

DNS service is a critical part of every infrastructure and can be easily replicated on several physical or virtual machines.

But how can we have several live replicas of the same server? One way is to have each DNS server announce a route for the service IP to the router. Dynamic routing will then automatically forward IP traffic to one or several of the DNS servers that announce that IP.

Static prefix announcement is only half the job. Problems arise when one of the DNS daemons dies and the router keeps forwarding traffic to all replicas of the DNS service, including the dead one. We must make our setup smart enough to forward DNS traffic only to live daemons.

Let's build a demo setup of a DNS instance on Debian Jessie.

This instance already has a DNS service up and running, serving requests on a loopback IP (32.6.104.11).

First we will install the ExaBGP package, and then we will proceed with its configuration.

@dns-01# apt install exabgp

We can now overwrite the ExaBGP configuration file:

@dns-01# cat << EOF > /etc/exabgp/exabgp.conf
group anycast-test {
  router-id 32.6.103.71;
  local-as 444;

  process watch-application {
    run /bin/bash /etc/exabgp/check-dns.sh;
  }

  neighbor 32.6.103.74 {
    local-address 32.6.103.71;
    peer-as 444;

    static {
      route  32.6.104.11/32 next-hop self watchdog dns;
    }
  }

}
EOF

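And finally we must create the bash script that regularly checks the health of the local DNS daemon. Note that the heredoc delimiter is quoted ('EOF') so that the shell writes $? and $STATE into the file literally instead of expanding them while the file is being created.

@dns-01# cat << 'EOF' > /etc/exabgp/check-dns.sh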
#!/usr/bin/env bash
STATE="down"
while true; do
  dig somehost.ourdomain.com @127.0.0.1 +short | grep 32 >/dev/null 2>&1 # 32 is part of the somehost.ourdomain.com IP
  if [[ $? == 0 ]]; then
    if [[ "$STATE" != "up" ]]; then
      echo "announce watchdog dns"
      STATE="up"
    fi
  else
    if [[ "$STATE" != "down" ]]; then
      echo "withdraw watchdog dns"
      STATE="down"
    fi
  fi
  sleep 2
done
EOF

@dns-01# chmod u+x /etc/exabgp/check-dns.sh
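
The script only talks to ExaBGP when the state changes, which is easy to get wrong. The following is a minimal sketch of the same decision logic, pulled out into a function so it can be replayed without a DNS daemon or ExaBGP. The names next_command and replay, and the sequence of check results, are made up for illustration and are not part of ExaBGP or of the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide what (if anything) to tell ExaBGP.
#   $1 = previous state ("up" or "down")
#   $2 = exit status of the health check (0 means the daemon answered)
# Prints the ExaBGP command on a state transition, and nothing otherwise.
next_command() {
  local state="$1" rc="$2"
  if [[ "$rc" -eq 0 && "$state" != "up" ]]; then
    echo "announce watchdog dns"
  elif [[ "$rc" -ne 0 && "$state" != "down" ]]; then
    echo "withdraw watchdog dns"
  fi
}

# Hypothetical driver: replay a list of check results, starting from "down",
# exactly as check-dns.sh starts.
replay() {
  local state="down" rc cmd
  for rc in "$@"; do
    cmd="$(next_command "$state" "$rc")"
    if [[ -n "$cmd" ]]; then
      echo "$cmd"
      [[ "$cmd" == announce* ]] && state="up" || state="down"
    fi
  done
}

# Made-up sequence: fail, ok, ok, fail, fail, ok.
replay 1 0 0 1 1 0
```

For this sequence the sketch prints one announce, one withdraw and one more announce: repeated identical results produce no output, so ExaBGP hears about transitions only.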

We are ready to restart the exabgp service and test that our configuration works as expected.

@dns-01# systemctl restart exabgp

Output of the BGP peer list and the routing table on the router:

#rtr-01> / routing bgp peer print
Flags: X - disabled, E - established 
 #   INSTANCE   REMOTE-ADDRESS   REMOTE-AS  
 1 E dc-21-wb   32.6.103.71      444 

#rtr-01> / ip route print where received-from=dns-01
Flags: X - disabled, A - active, D - dynamic, C - connect,
S - static, r - rip, b - bgp, o - ospf, m - mme, 
B - blackhole, U - unreachable, P - prohibit 
 #       DST-ADDRESS      GATEWAY
 0 ADb   32.6.104.11/32   32.6.103.71

As we can see, the router received our single IP prefix and forwards packets to this new DNS instance.

The final step is to stop the DNS daemon and check that the DNS service IP is withdrawn from the routing table.

@dns-01# systemctl stop bind9

#rtr-01> / ip route print where received-from=dns-01
Flags: X - disabled, A - active, D - dynamic, C - connect,
S - static, r - rip, b - bgp, o - ospf, m - mme, 
B - blackhole, U - unreachable, P - prohibit 
 #       DST-ADDRESS      GATEWAY

Great, everything works as planned. We can now replicate our setup to more machines, which can reside on different network subnets and even in different data centers.
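
When replicating, the only values that should need to change per machine are the host's own address (used for router-id and local-address) and the router it peers with. As a rough sketch, assuming every replica keeps the same AS number, service IP and check script as dns-01, the configuration could be generated by a small function. The name render_exabgp_conf and the address 32.6.103.72 are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical generator: print an ExaBGP configuration for one replica.
#   $1 = address of this DNS host (used as router-id and local-address)
#   $2 = address of the router it peers with
# The AS number, service IP and check script path are copied from the
# dns-01 example; adjust them if your replicas differ.
render_exabgp_conf() {
  local self="$1" router="$2"
  # EOF is left unquoted on purpose: ${self} and ${router} must be expanded.
  cat << EOF
group anycast-test {
  router-id ${self};
  local-as 444;

  process watch-application {
    run /bin/bash /etc/exabgp/check-dns.sh;
  }

  neighbor ${router} {
    local-address ${self};
    peer-as 444;

    static {
      route 32.6.104.11/32 next-hop self watchdog dns;
    }
  }
}
EOF
}

# 32.6.103.72 is a made-up address for a second replica.
render_exabgp_conf 32.6.103.72 32.6.103.74
```

The output can be reviewed and then redirected to /etc/exabgp/exabgp.conf on the new machine. A replica in another subnet or data center would pass its own local router as the second argument.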