CPUG: The Check Point User Group

Resources for the Check Point Community, by the Check Point Community.



Thread: RouteD daemon crash - OSPF on GAia

  1. #1
    Join Date
    2012-08-10
    Posts
    18
    Rep Power
    0

    Default RouteD daemon crash - OSPF on GAia

    Hello Guys,

    We had a strange issue a couple of days back: all of a sudden OSPF between the Check Point firewall and an upstream device failed, and a cpstop followed by a cpstart solved it. When we checked the logs for that time, we found the messages below:

    Oct 22 17:23:59 KLIFW2 routed[13363]: OspfClusterTransition(3360): slave to slave event ignoring
    Oct 22 17:24:00 KLIFW2 routed[13363]: Assertion failed routed[13363]: file "ospf/ospf_cluster.c", line 1832: "exportRt->oreASELSA == lsaNodep"
    Oct 22 17:24:00 KLIFW2 routed[13363]: Abort routed[13363] version routed-07.17.2012-21:50:25: Invalid argument
    Oct 22 17:24:00 KLIFW2 pm[5790]: Reaped: routed[13363]
    Oct 22 17:24:00 KLIFW2 pm[5790]: Scheduled routed for +1 secs
    Oct 22 17:24:01 KLIFW2 pm[5790]: Restarted /bin/routed[1269], count=24
    Oct 22 17:24:01 KLIFW2 routed[1269]: task_cmd_init(138): command subsystem initialized.
    Oct 22 17:24:01 KLIFW2 routed[1269]: Start routed[1269] version routed-07.17.2012-21:50:25
    Oct 22 17:24:01 KLIFW2 routed[1269]: task_get_port: getservbyname("iclid", "tcp") failed, using port 667
    Oct 22 17:24:01 KLIFW2 routed[1269]: task_set_option: task IGMP socket 17 option MulticastForwarding(18): Permission denied
    Oct 22 17:24:01 KLIFW2 routed[1269]: igmp_set_mfwd: configuring multicast forwarding fwd=OFF
    Oct 22 17:24:01 KLIFW2 routed[1269]: vrrp_init: setting fw_is_running_vrrp to 0
    Oct 22 17:24:01 KLIFW2 routed[1269]: CLUSTER: Proto 7 enables sending in cluster
    Oct 22 17:24:01 KLIFW2 routed[1269]: Commence routing updates
    Oct 22 17:24:03 KLIFW2 kernel: VPN-1: disconnected from FW-1

    I searched through the SKs but couldn't find the root cause for this issue. Any information is much appreciated, thanks.

    - Krishna
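
    When triaging an excerpt like the one above, it can help to pull out the failed assertion and the pm restart counter mechanically. The following is a minimal Python sketch, not a Check Point tool: the regular expressions are assumptions derived only from the five sample lines embedded below, and reading count=24 as "pm has restarted routed 24 times" is an interpretation, not something the log states.

```python
import re
from collections import Counter

# Sample lines copied from the log excerpt in this post.
LOG = """\
Oct 22 17:23:59 KLIFW2 routed[13363]: OspfClusterTransition(3360): slave to slave event ignoring
Oct 22 17:24:00 KLIFW2 routed[13363]: Assertion failed routed[13363]: file "ospf/ospf_cluster.c", line 1832: "exportRt->oreASELSA == lsaNodep"
Oct 22 17:24:00 KLIFW2 routed[13363]: Abort routed[13363] version routed-07.17.2012-21:50:25: Invalid argument
Oct 22 17:24:00 KLIFW2 pm[5790]: Reaped: routed[13363]
Oct 22 17:24:01 KLIFW2 pm[5790]: Restarted /bin/routed[1269], count=24
"""

# Patterns are assumptions based on the sample lines only.
ASSERT_RE = re.compile(
    r'Assertion failed routed\[(\d+)\]: file "([^"]+)", line (\d+): "(.*)"$'
)
RESTART_RE = re.compile(r'pm\[\d+\]: Restarted /bin/routed\[(\d+)\], count=(\d+)')


def summarize(text):
    """Return (assertion counter keyed by (file, line), last pm restart count)."""
    assertions = Counter()
    restart_count = None
    for line in text.splitlines():
        match = ASSERT_RE.search(line)
        if match:
            assertions[(match.group(2), int(match.group(3)))] += 1
            continue
        match = RESTART_RE.search(line)
        if match:
            restart_count = int(match.group(2))
    return assertions, restart_count


assertions, restart_count = summarize(LOG)
for (source_file, line_no), hits in assertions.items():
    print(f"{source_file}:{line_no} asserted {hits} time(s)")
print("last pm restart count:", restart_count)
```

    If the counter is read that way, a value of 24 would suggest this was not the first restart of routed on this box, which may be worth mentioning when opening a support ticket.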

  2. #2
    Join Date
    2011-08-02
    Location
    http://spikefishsolutions.com
    Posts
    1,648
    Rep Power
    9

    Default Re: RouteD daemon crash - OSPF on GAia

    I've seen crashes with R77's routed. The fix is included in the R77 jumbo patch. Not sure if it's in one of the HFAs. Since installing the R77 patch we have had no more routed crashes related to OSPF.

  3. #3
    Join Date
    2008-07-31
    Location
    Netherlands, Europe
    Posts
    1,146
    Rep Power
    12

    Default Re: RouteD daemon crash - OSPF on GAia

    Which version?

    We had a similar issue with BGP, on the same day, on 2 different clusters. VRRP showed state INIT when it happened. Later we found that BGP had caused the problem: it kept on crashing, which prevented VRRP from restoring the Master state.
    Worst of all, the other cluster member did not take over until we changed the priority.

    That was our solution: just flip the cluster. After that all was back to normal and it has not happened since. This was about 3 weeks ago.

    Did you open a ticket? If so, can you PM me the ticket number? That way they can combine the 2 tickets. We did not have core dumps of the routed daemon.
    Regards, Maarten.
    Triple MDS on R77.30, MDS on R80.10, VSX, GAIA.

  4. #4
    Join Date
    2011-08-02
    Location
    http://spikefishsolutions.com
    Posts
    1,648
    Rep Power
    9

    Default Re: RouteD daemon crash - OSPF on GAia

    You know, that's one thing that makes me really sad about Gaia. IPSO VRRP has ZERO bugs. The same can't be said about VRRP on Gaia.
    Last edited by jflemingeds; 2014-10-24 at 11:08. Reason: autocomplete?

  5. #5
    Join Date
    2011-08-02
    Location
    http://spikefishsolutions.com
    Posts
    1,648
    Rep Power
    9

    Default Re: RouteD daemon crash - OSPF on GAia

    Jumbo patch is documented in sk98028 btw.

  6. #6
    Join Date
    2009-04-30
    Location
    Colorado, USA
    Posts
    2,248
    Rep Power
    14

    Default Re: RouteD daemon crash - OSPF on GAia

    Definitely a code bug. If you search for "abort routed" in SecureKnowledge, there are three articles describing various crashes of routed, and at the end of all three articles it says either "Contact support for a hotfix" or "The problem was fixed in R77.XX".

  7. #7
    Join Date
    2011-08-02
    Location
    http://spikefishsolutions.com
    Posts
    1,648
    Rep Power
    9

    Default Re: RouteD daemon crash - OSPF on GAia

    I kind of think routed is turning into gated. Going to turn up dynamic routing? Better call support and find out how to get the latest Routed first.

    Routed crashed? Please send us the output of cpvinfo routed so we can figure out which build you have. :( (cough cough ipsrd better cough cough)

  8. #8
    Join Date
    2008-07-31
    Location
    Netherlands, Europe
    Posts
    1,146
    Rep Power
    12

    Default Re: RouteD daemon crash - OSPF on GAia

    We are running on R77.20 btw and I need to take a look at that SK.
    Regards, Maarten.
    Triple MDS on R77.30, MDS on R80.10, VSX, GAIA.

  9. #9
    Join Date
    2006-09-26
    Posts
    3,190
    Rep Power
    16

    Default Re: RouteD daemon crash - OSPF on GAia

    Quote Originally Posted by msjouw View Post
    We are running on R77.20 btw and I need to take a look at that SK.
    Here is my 2c about running dynamic routing protocols such as RIPv2/OSPF/BGP, and especially multicast, on Check Point firewalls (I am "NOT" bashing Check Point here, I am just stating the facts):

    Do NOT do this, because the way Check Point implements these things is very buggy. Even the Check Point SE who supports my account (who is a great guy btw) admitted it. Furthermore, the support you get from Check Point TAC engineers is extremely limited in this area. You're not going to find more than one or two Check Point TAC engineers who actually have in-depth experience of how dynamic routing protocols and/or multicast work.

    I once had a case with Check Point related to a dynamic routing protocol and it took nine (9) months to get the issue resolved. Yes, 9 months, with a lot of pain and frustration.

    I also recently had a Sev 1 at work and the TAC engineer was not helpful at all. It seems to me that none of them have any training in this area. They just search the database of issues that other customers have had and relay them back to me. It was funny: the TAC engineer read out a case that someone had opened with Check Point last year with similar issues, and I told him that it was me who had opened that case. He read further and confirmed that it was me. He ended up providing us with instructions to get it working again, but they did not work. It was my colleague and my boss who figured it out. We were down for about 2 days on this one.

    Avoid running dynamic routing protocols and multicast on Check Point firewalls. You will NOT get much support from Check Point in this area.

    If you want to run dynamic routing and multicast, get a device from Cisco or Juniper. These guys have been doing this for years and their TAC engineers know what they are doing when you call in for support. That is not the case with Check Point on dynamic routing and multicast.

    If you are an F1 driver, you want to be in a car built by the Germans or the Italians, not a car made in Korea. Cisco and Juniper are the German (Porsche) and Italian (Ferrari), while Check Point is the Korean (Kia).
    Last edited by cciesec2006; 2014-10-27 at 06:14.

  10. #10
    Join Date
    2009-04-30
    Location
    Colorado, USA
    Posts
    2,248
    Rep Power
    14

    Default Re: RouteD daemon crash - OSPF on GAia

    Quote Originally Posted by cciesec2006 View Post
    Here is my 2c about running dynamic routing protocols such as RIPv2/OSPF/BGP, and especially multicast, on Check Point firewalls
    Just in general, I've never been a big fan of performing dynamic routing on the firewall itself unless absolutely necessary. In my opinion the routers in a network should focus on the routing and firewalls should focus on firewalling. When troubleshooting on a statically-routed firewall it is good to know that the IP routing table is not shifting around underneath you due to an OSPF flap somewhere and a subsequent failure to stay converged with the rest of the network. I have set up dynamic routing on Check Point firewalls numerous times and my impression is that as long as you are not trying to do anything too complicated it is mostly OK.

  11. #11
    Join Date
    2008-07-31
    Location
    Netherlands, Europe
    Posts
    1,146
    Rep Power
    12

    Default Re: RouteD daemon crash - OSPF on GAia

    As an employee of an MSP I can tell you I have many customers where we use dynamic routing: a VPN router attached to an internet-facing DMZ and one towards the customer's WAN. This router is often used as an IPsec backup for an MPLS connection, and this is just where you NEED dynamic routing.

    That said, I hate to see that not much has changed here.

    CP bashing (although we say we do not mean to) is still the same, as is the bashing of support engineers.

    I have enough experience with them, and sure, I have my beefs with some and I need to escalate more than I would like to, but still they do what they can.

    The case has been sent up to R&D and that is where it should be.

    A product like this is just supposed to do what it advertises.

    And please stop about Cisco, as they have so many sub versions of their code that nobody really knows which version to use in any particular situation.
    Last edited by msjouw; 2014-10-25 at 13:51.
    Regards, Maarten.
    Triple MDS on R77.30, MDS on R80.10, VSX, GAIA.

  12. #12
    Join Date
    2011-08-02
    Location
    http://spikefishsolutions.com
    Posts
    1,648
    Rep Power
    9

    Default Re: RouteD daemon crash - OSPF on GAia

    I think maybe he is missing a "not" in there.

  13. #13
    Join Date
    2006-09-26
    Posts
    3,190
    Rep Power
    16

    Default Re: RouteD daemon crash - OSPF on GAia

    Quote Originally Posted by msjouw View Post
    As an employee of an MSP I can tell you I have many customers where we use dynamic routing: a VPN router attached to an internet-facing DMZ and one towards the customer's WAN. This router is often used as an IPsec backup for an MPLS connection, and this is just where you NEED dynamic routing.
    A workaround for this is to use GRE over IPsec, so you can avoid running dynamic routing on the firewalls.

    Quote Originally Posted by msjouw View Post
    I have enough experience with them, and sure, I have my beefs with some and I need to escalate more than I would like to, but still they do what they can.
    Doing what they can? I am paying for support and they do what they can. Are you kidding me?

    Quote Originally Posted by msjouw View Post
    A product like this is just supposed to do what it advertises.
    You mean the product stopped working after a reboot from a simple patch? Apparently a subsequent release of this patch didn't require cpstop/cpstart or a reboot. Go figure.

    Quote Originally Posted by msjouw View Post
    And please stop about Cisco, as they have so many sub versions of their code that nobody really knows which version to use in any particular situation.
    At least with Cisco, their engineers are knowledgeable about dynamic routing and multicast because that's what they do, unlike Check Point TAC engineers. Every time I call them about multicast and dynamic routing, it is like talking to a 10-year-old.

  14. #14
    Join Date
    2008-07-31
    Location
    Netherlands, Europe
    Posts
    1,146
    Rep Power
    12

    Default Re: RouteD daemon crash - OSPF on GAia

    Update
    After a lot of problems with this, I got to a point where I could reproduce the problem.

    Finally, after the problem showed up again on one cluster that had it earlier, and now on another one as well, we have a good view of what is happening and how you can see it.
    When routeD is restarted on the VRRP Master:
    • the routeD process crashes every 10 seconds; using pidof routed you can see the routed process IDs
    • VRRP goes back to its initial state and the coldstart timer starts counting down, restarting every time routeD crashes
    • the VRRP driver holds the VIPs on the failing box
    • the VRRP driver keeps sending VRRP hello packets from the failing box
    • the backup system remains in Backup state, as it keeps receiving VRRP hellos
    • when you look at show route, part of the learned routes show as kernel routes and part of them show no nexthop and a ? as the route type


    Today we were finally able to create a routeD crash dump. By default, user-mode crash dumps are not enabled; enable them with um_core enable and reboot. None of the core dump settings in clish will enable them, and ulimit will not do the trick either.

    To get the VRRP cluster back to normal operation, just raise the priority on the Backup! Changing the priority on the Master will not have any effect, as the change is not forwarded to the VRRP driver.
    As soon as the Master receives a hello packet with a higher priority, however, the driver switches to Backup mode; routed then stops crashing and the system comes back to normal life.
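
    Why raising the priority on the Backup works can be illustrated with a toy model of the VRRP election rules. The sketch below follows the generic Master/Backup rules from the VRRP RFCs (a Backup with preempt enabled discards hellos of lower priority, and a Master yields to a hello of higher priority). It is not Gaia's driver, it ignores timers and the IP address tie-break, and the priorities 100, 95 and 110 are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Router:
    name: str
    priority: int
    state: str  # "Master" or "Backup"
    preempt: bool = True


def backup_accepts(backup, adv_priority):
    """A Backup honours a hello unless it preempts a lower-priority sender."""
    return not (backup.preempt and adv_priority < backup.priority)


def master_yields(master, adv_priority):
    """A Master steps down when it hears a strictly higher priority."""
    return adv_priority > master.priority


def settle(stuck_master, backup):
    """Run one round of hello exchange and return both resulting states."""
    # The stuck Master keeps sending hellos. If the Backup discards them,
    # its master-down timer is never reset and it becomes Master.
    if backup.state == "Backup" and not backup_accepts(backup, stuck_master.priority):
        backup.state = "Master"
    # The new Master now sends hellos with its own, higher priority.
    if backup.state == "Master" and master_yields(stuck_master, backup.priority):
        stuck_master.state = "Backup"
    return stuck_master.state, backup.state


# Hypothetical priorities: the Backup is lower, so nothing changes.
print(settle(Router("fw1", 100, "Master"), Router("fw2", 95, "Backup")))
# After raising the Backup above the stuck Master, the roles flip.
print(settle(Router("fw1", 100, "Master"), Router("fw2", 110, "Backup")))
```

    In this model, changing the stuck Master's own configured priority only helps if the new value reaches the component that sends the hellos, which matches the observation above that the change is not forwarded to the VRRP driver.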

    R&D is now investigating the crash dump.

    PS: the way to get routed into the crash loop.
    On the active member, issue:
    tellpm process:routed
    tellpm process:routed t

    Make sure that this is not a production cluster as it WILL stop routing....
    Last edited by msjouw; 2015-02-11 at 12:46.
    Regards, Maarten.
    Triple MDS on R77.30, MDS on R80.10, VSX, GAIA.

  15. #15
    Join Date
    2015-03-10
    Location
    Boston, MA
    Posts
    2
    Rep Power
    0

    Default Re: RouteD daemon crash - OSPF on GAia

    Check Point R&D was able to replicate this issue in their labs and has included a fix in R77.30.
    If you need the fix for an earlier release, please contact Check Point Support.

    Mark Eliscu
    Director of US R&D
    Check Point Software Technologies




