CPUG: The Check Point User Group

Resources for the Check Point Community, by the Check Point Community.



Thread: ClusterXL standby node strange behavior

  1. #1
    Join Date
    2015-10-21
    Posts
    32
    Rep Power
    0

    Default ClusterXL standby node strange behavior

    Hello.

    We have two identical ClusterXL nodes running Gaia R77.30 with the following hotfixes installed:
    • Jumbo Hotfix Accumulator General Availability for R77.30 Take 216
    • HOTFIX_R77_30
    Implicitly Installed Hotfixes (installed as part of another hotfix):
    • Check_Point_R77_30_JUMBO_HF_1_Bundle_T205_FULL.tgz

    CoreXL and SecureXL acceleration are enabled on both nodes.

    The cluster runs in HA mode:
    [Expert@gw2:0]# cphaprob stat
    Cluster Mode: High Availability (Primary Up) with IGMP Membership
    Number      Unique Address   Assigned Load   State
    1           172.16.0.2       100%            Active
    2 (local)   172.16.0.3       0%              Standby
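For anyone who wants to check this state from a script, here is a minimal sketch that parses the `cphaprob stat` text shown above and confirms that the local member is Standby with 0% assigned load. The function names and the regular expression are my own assumptions based only on the output pasted in this post; other versions may print a different layout.

```python
import re

# Output copied from the post above; any other layout is untested.
SAMPLE = """\
Cluster Mode: High Availability (Primary Up) with IGMP Membership
Number Unique Address Assigned Load State
1 172.16.0.2 100% Active
2 (local) 172.16.0.3 0% Standby
"""

# One member row: id, optional "(local)", IPv4 address, load percentage, state.
MEMBER_RE = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<local>\(local\)\s+)?"
    r"(?P<addr>\d+\.\d+\.\d+\.\d+)\s+(?P<load>\d+)%\s+(?P<state>.+?)\s*$"
)


def parse_members(text):
    """Return one dict per cluster member row found in the text."""
    members = []
    for line in text.splitlines():
        m = MEMBER_RE.match(line)
        if m:
            members.append({
                "id": int(m.group("id")),
                "local": m.group("local") is not None,
                "address": m.group("addr"),
                "load": int(m.group("load")),
                "state": m.group("state"),
            })
    return members


def local_member_is_idle_standby(members):
    """True when the local member reports Standby with 0% assigned load."""
    local = next(m for m in members if m["local"])
    return local["state"] == "Standby" and local["load"] == 0


if __name__ == "__main__":
    for member in parse_members(SAMPLE):
        print(member)
```

A member that reports Standby with 0% load here but still forwards packets is exactly the mismatch described in this thread.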

    Problem:
    A few days ago our standby node (gw2) "automagically", without any admin intervention, started to forward some traffic (about 1,000 pkt/s), and its CPU load (as shown by SmartViewMonitor->System Counters->FireWall History) rose to 30-40%. The active node gw1 is processing far more traffic: about 20,000 pkt/s at 70% CPU. At the same time, ICMP (ping) round-trip times for traffic passing through the cluster rose from a few ms to about 60-80 ms.

    Gw2 behaves strangely, as if it were working in Load Sharing (LS) pivot mode (?!), although it is definitely in HA. We have installed the security policy several times and even rebooted both nodes (not simultaneously, of course), but the issue persists. What's wrong with gw2? If we switch gw2 from standby to active, then gw1 behaves the same way (1,000 pkt/s, 30-40% CPU).

    Regards
    Mariusz1
    Last edited by Mariusz1; 2017-05-17 at 05:05. Reason: Thread subscription, editing some errors

  2. #2
    Join Date
    2008-01-25
    Location
    Karlsruhe / Germany
    Posts
    15
    Rep Power
    0

    Default Re: ClusterXL standby node strange behavior

    Hi Mariusz,

    My Check Point partner told me there are several instabilities in Jumbo 216 that are already fixed in Jumbo 225.

    ...but I have no details.

    BR
    Sven



  3. #3
    Join Date
    2006-09-26
    Posts
    3,200
    Rep Power
    20

    Default Re: ClusterXL standby node strange behavior

    Quote: Originally posted by Chili
    Hi Mariusz,

    My Check Point partner told me there are several instabilities in Jumbo 216 that are already fixed in Jumbo 225.

    ...but I have no details.

    BR
    Sven


    I looked at the release notes https://supportcenter.checkpoint.com...Ongoing%20Take and couldn't find anything related to this issue.

    It is a big concern for me as well, because I am also running JHFA 216 :-(

  4. #4
    Join Date
    2011-08-02
    Location
    http://spikefishsolutions.com
    Posts
    1,668
    Rep Power
    13

    Default Re: ClusterXL standby node strange behavior

    Can you look at the router on the inside of that firewall and check its ARP table and routing table?

    Here is what you're looking for.

    Do you have any routes pointing directly to fw02's interface instead of the VIP? (I know, just making sure.)

    Next, check the ARP table and make sure all the next-hop gateway addresses have the MAC address of the active member.

    I think VMAC will show in the cphaprob stat output, but can you confirm whether you're using VMAC or not?


    If that doesn't turn up anything, I would look at tcpdump -ni $INTERFACE -w output.cap 'host $IP'

    Replace $INTERFACE with the interface seeing the traffic and replace $IP with an IP you're seeing go to the standby. Pull the capture into Wireshark and check the source and destination MAC addresses of the packets to see if they help uncover who is sending the traffic to fw02.
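If Wireshark isn't handy, the same MAC check can be roughed out in a script. The sketch below assumes the capture has been dumped to text with `tcpdump -e -nn -r output.cap` (`-e` prints the link-level header) and tallies source/destination MAC pairs. The sample lines, MAC addresses, and IPs are invented for illustration, and `tally_mac_pairs` is a hypothetical helper name, not part of any tool.

```python
import re
from collections import Counter

# Invented lines in the shape `tcpdump -e -nn -r output.cap` prints.
SAMPLE = """\
10:00:00.000001 00:aa:bb:cc:dd:01 > 00:1c:7f:00:00:02, ethertype IPv4 (0x0800), length 98: 10.1.1.10 > 10.2.2.20: ICMP echo request, id 1, seq 1, length 64
10:00:00.000950 00:aa:bb:cc:dd:01 > 00:1c:7f:00:00:02, ethertype IPv4 (0x0800), length 98: 10.1.1.10 > 10.2.2.20: ICMP echo request, id 1, seq 2, length 64
10:00:01.000100 00:aa:bb:cc:dd:09 > 00:1c:7f:00:00:02, ethertype IPv4 (0x0800), length 74: 10.1.1.30 > 10.2.2.20: ICMP echo request, id 7, seq 1, length 40
"""

MAC = r"(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}"
# Timestamp, source MAC, ">", destination MAC, then a comma.
LINE_RE = re.compile(rf"^\S+\s+(?P<src>{MAC})\s+>\s+(?P<dst>{MAC}),")


def tally_mac_pairs(text):
    """Count packets per (source MAC, destination MAC) pair."""
    pairs = Counter()
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            pairs[(m.group("src").lower(), m.group("dst").lower())] += 1
    return pairs


if __name__ == "__main__":
    for (src, dst), count in tally_mac_pairs(SAMPLE).most_common():
        print(f"{count:6d}  {src} -> {dst}")
```

The source MACs with the highest counts point to the devices that are sending traffic straight to the standby member, and the destination MAC shows whether they address the member's physical MAC or the VMAC.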


    Maybe that will help uncover something. The other option would be disabling SecureXL and running a firewall monitor (fw monitor) on both firewalls at the same time to verify whether traffic is really being pivoted, but you'll need to do that during off-peak hours, as disabling SecureXL can drive CPU load up.

  5. #5
    Join Date
    2015-10-21
    Posts
    32
    Rep Power
    0

    Default Re: ClusterXL standby node strange behavior

    First of all, the problem appeared before we installed T216. We installed T216 hoping it would resolve the issue. Yes, I confirm we are using VMAC, but it works fine.

    By the way, the Gaia Portal (GUI) still shows:
    Check Point Upgrade Service Engine (CPUSE) | R77.30 take 204 Hotfixes | Last updated on: Fri May 19 8:08 2017
    and on Overview:
    Kernel: 2.6.18-92cpx86_64
    Edition: 64-bit
    Build Number: 204

    Why 204 and not 216?

    Also, on gw2 the Gaia Portal page Upgrades (CPUSE)->Status and Actions has "Check for Updates", "Import Package" and "Add hotfixes from the cloud" buttons in the upper right, but on gw1 we have only the last one (and "Import Package" is in the More submenu).

    Regards
    Mariusz1
    Last edited by Mariusz1; 2017-05-19 at 03:50. Reason: Adding few words

  6. #6
    Join Date
    2011-08-02
    Location
    http://spikefishsolutions.com
    Posts
    1,668
    Rep Power
    13

    Default Re: ClusterXL standby node strange behavior

    Ah OK, so changing gears. Run the following on both members to try to figure out the take version difference.

    installed_jumbo_take
    /opt/CPinfo-10/bin/cpinfo -y all

    Compare the output from the two members.

    Might as well post the output of all the cphaprob commands from both members as well:

    cphaprob stat
    cphaprob -i list
    cphaprob -a if
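Comparing the saved output by eye gets tedious, so here is a small sketch that diffs two text dumps with Python's standard `difflib`. It assumes the output from each member has already been saved as text. The two sample strings are made up to show the mechanics and are not real output from either gateway.

```python
import difflib


def diff_outputs(gw1_text, gw2_text):
    """Return unified-diff lines between two command outputs (empty if equal)."""
    return list(difflib.unified_diff(
        gw1_text.splitlines(),
        gw2_text.splitlines(),
        fromfile="gw1",
        tofile="gw2",
        lineterm="",
        n=0,  # no context lines: show only what differs
    ))


if __name__ == "__main__":
    # Hypothetical dumps, for illustration only.
    gw1 = "[FW1]\nHOTFIX_R77_30\nHOTFIX_R77_30_JUMBO_HF"
    gw2 = "[FW1]\nHOTFIX_R77_30"
    for line in diff_outputs(gw1, gw2):
        print(line)
```

An empty result means the two members report the same thing, which shifts the question from the hotfix level to the cluster state itself.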

  7. #7
    Join Date
    2015-10-21
    Posts
    32
    Rep Power
    0

    Default Re: ClusterXL standby node strange behavior

    On both gateways we have:
    [Expert@gw1:0]# installed_jumbo_take
    R77.30 Jumbo Hotfix Accumulator take_216 is installed, see sk106162.
    [Expert@gw1:0]# /opt/CPinfo-10/bin/cpinfo -y all

    This is Check Point CPinfo Build 914000173 for GAIA
    [FW1]
    HOTFIX_R77_30
    HOTFIX_R77_30_JUMBO_HF

    FW1 build number:
    This is Check Point's software version R77.30 - Build 048
    kernel: R77.30 - Build 048

    [SecurePlatform]
    HOTFIX_R77_30_JUMBO_HF

    [CPinfo]
    No hotfixes..

    [PPACK]
    HOTFIX_R77_30
    HOTFIX_R77_30_JUMBO_HF

    [CVPN]
    HOTFIX_R77_30
    HOTFIX_R77_30_JUMBO_HF

    [CPUpdates]
    BUNDLE_R77_30_JUMBO_HF

    [DIAG]
    HOTFIX_R77_30

    [rtm]
    No hotfixes..
    ##########################
    Important:

    It seems like gw2 steps in when the CPU load on gw1 is at or above about 80%. Right now gw1 is at about 65% and gw2 is normal, at about 4%.
    Is this the Cluster Under Load (CUL) mechanism?
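One way to test this hunch is to log gw1 CPU and gw2 packet rate side by side and split the samples at the 80% mark. The sketch below does that. The threshold comes from the observation above, not from any documentation I have checked, and the sample values are invented purely to show how the comparison works.

```python
from statistics import mean

# Threshold taken from the observation in this post (an assumption, not a documented value).
CPU_THRESHOLD = 80.0

# Invented (gw1 CPU %, gw2 pkt/s) samples, for illustration only.
SAMPLES = [(65.0, 20), (70.0, 30), (82.0, 1000), (85.0, 1100)]


def split_by_cpu(samples, threshold=CPU_THRESHOLD):
    """Split gw2 packet rates by whether gw1 CPU was at/above the threshold."""
    above = [pps for cpu, pps in samples if cpu >= threshold]
    below = [pps for cpu, pps in samples if cpu < threshold]
    return above, below


def summarize(samples, threshold=CPU_THRESHOLD):
    """Mean gw2 packet rate on each side of the threshold (None if no samples)."""
    above, below = split_by_cpu(samples, threshold)
    return {
        "mean_pps_above": mean(above) if above else None,
        "mean_pps_below": mean(below) if below else None,
    }


if __name__ == "__main__":
    print(summarize(SAMPLES))
```

A large, consistent gap between the two means on real measurements would support the idea that gw2's activity follows gw1's CPU load, though it would not by itself prove that CUL is the cause.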

    Regards
    Mariusz1
    Last edited by Mariusz1; 2017-05-22 at 02:39. Reason: Adding some info
