While it is no longer shiny and new, I just recently got a chance to sit down andplay with Amazon Route 53’s DNS failover
feature. So far, I have found it to be simple and very useful for simple
cases where DNS fail-over is acceptable.
My usage case
I run EVE Market Data Relay (EMDR), which
is a distributed EVE Online market data distribution system. All pieces of
the infrastructure have at least one redundant copy, and the only single
point of failure is the DNS service itself. We can afford to lose a little
bit of data during fail-over, but a complete outage is something we can’t have.
Sitting at the top of the system are two HTTP gateways on different
machines at different ISPs. These are set up as a weighted record set, with
each gateway weighing in at 50/50 (requests are divided evenly).
We introduce the Route 53 magic by adding in a health check and associating
it with each of the two resource record sets. The health check involves
Route 53 servers around the world periodically calling a pre-determined
URL on the two HTTP gateways in search for a non-error HTTP status code.
If any of the entries fails more than three times (they check roughly
every 30 seconds), said entry is removed from the weighted set.
By the time that Route 53 picks up on the failure, yanks the entry from
the weighted set, and most fast ISP DNS servers notice the change, about
two minutes have elapsed.
Why this is a good fit for EMDR
With EVE Market Data Relay, it’s not the end of
the world if 50% of user-submitted data gets lost over the minute and a half
it takes for Route 53 to remove the unhealthy gateway. It’s highly likely that
another user will re-submit the very same data that was lost. Even if we never
see the data, the loss of a few data points here and there doesn’t hurt us much
in our case.
With that said, DNS failover in general can be sub-optimal in a few basic cases:
- You don’t want to leave failover up to the many crappy ISP DNS servers
around the net. Not all will pick up the change in a timely manner.
- You can’t afford to lose some requests here and there. DNS failover isn’t
seamless, so your application would need to be smart enough on both
ends if data loss is unacceptable.
For more simple cases like mine, it’s wonderful.
Price
In my case, Route 53 is health checking two servers that are external to
AWS, which means I spend a whopping $1.50/month on Route 53’s DNS failover.