Quote: Originally posted by guyxgreen
You were the closest one to the real core of the problem.
After contacting CP, I can finally say this issue is over! :)
What we found is that the passive member wasn't learning the BGP routes that the active member had in its routing table (apparently that's how CP works with BGP).
The routing updates are sent over the synchronization interface(s) on port 2010, as varera mentioned.
When we saw that nothing was being sent, we manually restarted the routing daemon on the passive member and rebooted the machine.
After it powered on - voila!
I could see every BGP route in the routing table marked as 'Kernel'.
Now the BGP failover times were symmetric, and after some tuning on both the CP and Cisco sides I could lower them to 10-12 seconds.
It may be worth noting that BGP graceful restart can help significantly here. I run BGP on some of my GAiA firewalls, and I see maybe four dropped frames when my firewalls fail over. Routing is never lost. This is one of the reasons to peer with the VIP rather than the individual cluster members.
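If you want to put numbers on a failover like the ones mentioned above, here is a rough Python sketch. It is not anything from Check Point or Cisco; the function names are made up for illustration. It assumes you have already collected the timestamps (in seconds) of the replies to probes sent through the cluster at a fixed interval during a failover test, for example from a timestamped ping log.

```python
from typing import Sequence


def longest_outage(reply_times: Sequence[float]) -> float:
    """Return the longest gap, in seconds, between consecutive probe replies.

    reply_times holds the arrival time of each reply that was received;
    lost probes simply have no entry. (Hypothetical helper, for illustration.)
    """
    if len(reply_times) < 2:
        return 0.0
    ordered = sorted(reply_times)
    # Compare each reply with the one that follows it and keep the widest gap.
    return max(later - earlier for earlier, later in zip(ordered, ordered[1:]))


def lost_probes(reply_times: Sequence[float], interval: float) -> int:
    """Estimate how many probes were lost during the longest gap.

    Assumes probes were sent every `interval` seconds, so a gap of
    n * interval means roughly n - 1 probes went unanswered.
    """
    gap = longest_outage(reply_times)
    return max(0, round(gap / interval) - 1)


if __name__ == "__main__":
    # Made-up sample: one probe per second, replies missing between t=3 and t=8.
    replies = [0.0, 1.0, 2.0, 3.0, 8.0, 9.0]
    print("longest outage (s):", longest_outage(replies))
    print("estimated lost probes:", lost_probes(replies, interval=1.0))
```

Run the same test in both failover directions and compare the two results; that is a quick way to check whether the failover times are symmetric.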