North American Network Operators Group
IP failover/migration question.
I've got a network reconfiguration question that I'm hoping someone on NANOG might be able to advise on.

I'm working on a project to provide failover of entire cluster-based (and so multi-host) applications to a geographically distinct backup site. The general idea is that as one datacentre burns down, a live service may be moved over to an alternate site without any interruption to clients. All of the host-state migration is done using virtual machines and associated magic; I'm trying to get a clearer understanding of what is involved in moving the IPs, and how fast it can potentially be done.

I'm fairly sure that what I would like to do is arrange what is effectively dual-homing, but with two geographically distinct homes. Assume that I have an in-service primary site A and an emergency backup site B, each with a distinct link into a common provider AS. I would configure B's link as the redundant link into the stub AS for A, as if the link to B were the redundant link in a traditional single-site dual-homing setup. B would additionally host its own IP range, used for control traffic between the two sites in normal operation.

When I want to migrate hosts to the failover site, B would send a BGP update advertising that the redundant link should become preferred, and (hopefully) the IGP in the provider AS would seamlessly redirect traffic. Assuming that everything works okay with the virtual machine migration, connections would continue as they were and clients would be unaware of the reconfiguration.

My questions:

Does the routing reconfiguration story here sound plausible?

Does anyone have any insight into how long such a reconfiguration would reasonably take, and whether it is something I might be able to negotiate an SLA for with a provider if I wanted to actually deploy this sort of redundancy as a service?

Is anyone aware of similar high-speed failover schemes in use on the network today?
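To make the intended preference switch concrete, here is a minimal Python sketch of the idea, not a model of any real router or provider configuration. It assumes a heavily simplified BGP decision process (highest local preference wins, then shortest AS path) and ignores every later tie-breaker, timers, and convergence delay. The names `Route`, `best_route`, and `fail_over`, the site labels, the preference values, and the documentation prefix 192.0.2.0/24 are all hypothetical. In practice the local preference is set inside the provider AS, so B would have to influence it indirectly, for example through a provider-defined community or by changing AS-path prepending.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Route:
    prefix: str        # advertised prefix
    via: str           # hypothetical label for the site link (A or B)
    local_pref: int    # higher is preferred
    as_path_len: int   # shorter is preferred


def best_route(routes):
    """Pick a route using a simplified decision process:
    highest local_pref first, then shortest AS path."""
    return min(routes, key=lambda r: (-r.local_pref, r.as_path_len))


def fail_over(routes, backup_site, new_local_pref):
    """Return a new route list in which the backup site's route
    carries a raised local preference (the 'become preferred' update)."""
    return [
        replace(r, local_pref=new_local_pref) if r.via == backup_site else r
        for r in routes
    ]


PREFIX = "192.0.2.0/24"  # documentation prefix, an assumption for illustration

# Normal operation: A is preferred, B is the redundant link.
routes = [
    Route(PREFIX, via="site-A", local_pref=200, as_path_len=1),
    Route(PREFIX, via="site-B", local_pref=100, as_path_len=1),
]

before = best_route(routes).via
after = best_route(fail_over(routes, "site-B", 300)).via
print(before, "->", after)
```

The sketch only captures which path is selected before and after the update; how quickly the provider's routers actually converge on the new path is exactly the open question in this message.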
Thoughts appreciated; I hope this is reasonably on-topic for the list.

best,
a.