CPUG: The Check Point User Group

Resources for the Check Point Community, by the Check Point Community.



Thread: ClusterXL unexpected/hidden failover

  1. #1
    Join Date
    2013-09-25
    Location
    Bucharest
    Posts
    649
    Rep Power
    10

    Default ClusterXL unexpected/hidden failover

    Hi guys,

    Today I noticed that a failover occurred last evening:

    [Expert@krak-001b:0]# cphaprob -l list

    Built-in Devices:

    Device Name: Interface Active Check
    Current state: OK

    Device Name: Recovery Delay
    Current state: OK

    Registered Devices:

    Device Name: Synchronization
    Registration number: 0
    Timeout: none
    Current state: OK
    Time since last report: 81106.9 sec

    Device Name: Filter
    Registration number: 1
    Timeout: none
    Current state: OK
    Time since last report: 81098.4 sec

    Device Name: routed
    Registration number: 2
    Timeout: none
    Current state: OK
    Time since last report: 81097.6 sec

    Device Name: cphad
    Registration number: 3
    Timeout: 30 sec
    Current state: OK
    Time since last report: 828507 sec
    Process Status: UP

    Device Name: fwd
    Registration number: 4
    Timeout: 30 sec
    Current state: OK
    Time since last report: 828495 sec
    Process Status: UP


    Can you guide me in finding the root cause, please?
    I checked the wrench logs in SmartView Tracker but found no useful information.

    Thanks!

    P.S. "Time since last report: 81106.9 sec" shows the time since the failover happened, right?
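
    To make the counters in the cphaprob output easier to read, here is a minimal Python sketch that converts "Time since last report" values into days, hours and minutes and, given the moment the command was run, into a wall-clock time. The function names are hypothetical and not part of any Check Point tool. The conversion only shows when a counter last reset, not what reset it.

```python
from datetime import datetime, timedelta


def describe_age(seconds: float) -> str:
    """Render a 'Time since last report' value as days, hours, minutes."""
    delta = timedelta(seconds=round(seconds))
    hours, remainder = divmod(delta.seconds, 3600)
    minutes = remainder // 60
    return f"{delta.days}d {hours}h {minutes}m"


def last_report_time(seconds: float, now: datetime) -> datetime:
    """Wall-clock time of the last report, given when the command was run."""
    return now - timedelta(seconds=seconds)


# Values copied from the cphaprob output in this post.
for name, age in [("Synchronization", 81106.9), ("cphad", 828507)]:
    print(name, describe_age(age))
```

    For the values in this post it prints "Synchronization 0d 22h 31m" and "cphad 9d 14h 8m", so the two groups of devices last reported at very different times.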

  2. #2
    Join Date
    2011-08-02
    Location
    http://spikefishsolutions.com
    Posts
    1,668
    Rep Power
    13

    Default Re: ClusterXL unexpected/hidden failover

    I'm not 100% sure, but I think that counter resets after a policy push, because the ClusterXL policy gets reloaded during a policy push.

  3. #3
    Join Date
    2009-04-30
    Location
    Colorado, USA
    Posts
    2,252
    Rep Power
    18

    Default Re: ClusterXL unexpected/hidden failover

    Look for events of type "Control" (a grey wrench icon) in SmartView Tracker; "Type" is a very skinny column and hard to find for filtering. In SmartLog and R80+ the filter query is type:Control which is case-sensitive.
    --
    Third Edition of my "Max Power 2020" Firewall Book
    Now Available at http://www.maxpowerfirewalls.com

  4. #4
    Join Date
    2013-09-25
    Location
    Bucharest
    Posts
    649
    Rep Power
    10

    Default Re: ClusterXL unexpected/hidden failover

    Quote Originally Posted by ShadowPeak.com View Post
    Look for events of type "Control" (a grey wrench icon) in SmartView Tracker; "Type" is a very skinny column and hard to find for filtering. In SmartLog and R80+ the filter query is type:Control which is case-sensitive.
    I looked over, but I couldn't find any explanation/message that says anything about a failover occurring hence my forum post...
    Isn't there any other system file on firewall that would tell me more?

  5. #5
    Join Date
    2009-04-30
    Location
    Colorado, USA
    Posts
    2,252
    Rep Power
    18

    Default Re: ClusterXL unexpected/hidden failover

    Quote Originally Posted by laf_c View Post
    I looked over, but I couldn't find any explanation/message that says anything about a failover occurring hence my forum post...
    Isn't there any other system file on firewall that would tell me more?
    /var/log/messages or $FWDIR/log/fwd.elg, also try sniffing around any other .elg files in the $FWDIR/log directory that were touched around that time. Could also try the .elg files in $CPDIR/log but I doubt you'll find anything useful about clustering state there.
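
    As a rough illustration of "touched around that time", here is a small Python sketch that lists the .elg files in a directory whose modification time falls within a window around a suspected failover time. The function name, the 30-minute default window and the idea of running Python against the log directory (or a copy of it) are all assumptions for illustration, not a Check Point utility.

```python
from datetime import datetime, timedelta
from pathlib import Path


def files_touched_near(log_dir, when, window=timedelta(minutes=30), pattern="*.elg"):
    """Return files in log_dir matching pattern whose mtime is within window of when."""
    hits = []
    for path in sorted(Path(log_dir).glob(pattern)):
        # st_mtime is the last write time, interpreted here in local time.
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        if abs(modified - when) <= window:
            hits.append(path)
    return hits
```

    A call could look like files_touched_near("/path/to/fwdir/log", datetime(2017, 6, 6, 13, 30)), where both the path and the timestamp are placeholders. Note that the modification time only reflects the most recent write, so a file that is still being written to (fwd.elg, typically) will not match an older window even if it contains entries from that time.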

    Incidentally, on an R80.10 gateway there is a new ClusterXL screen in the cpview CLI tool. I assume that history is kept for it as for any other cpview screen, which could have been helpful in your situation.

  6. #6
    Join Date
    2014-07-21
    Posts
    57
    Rep Power
    9

    Default Re: ClusterXL unexpected/hidden failover

    Hi,

    I don't know which version you are using, but GAiA R77.30 with JHFA takes from at least Take 162 up to Take 225 has major memory leaks in FWD.
    If you run "top" and see that, e.g., fw_full has a high "VIRT" value which increases after every policy install, then you are probably hit by that memory leak. When VIRT reaches 4000m+, fwd will crash and you can see a failover.
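
    To watch for this, the VIRT column could be checked against the 4000m figure mentioned above. The following Python sketch is only an illustration: the function names and the 90% warning margin are made up, and it assumes that top prints VIRT either as a plain number in KiB or with a k/m/g/t suffix.

```python
def virt_to_mib(field: str) -> float:
    """Convert a top VIRT field such as '3912m', '4.1g' or '123456' to MiB."""
    units = {"k": 1 / 1024, "m": 1.0, "g": 1024.0, "t": 1024.0 * 1024}
    text = field.strip().lower()
    if text and text[-1] in units:
        return float(text[:-1]) * units[text[-1]]
    # Assumption: a value without a suffix is in KiB.
    return float(text) / 1024


def near_limit(field: str, limit_mib: float = 4000.0, margin: float = 0.9) -> bool:
    """True once VIRT has reached margin * limit_mib (illustrative threshold)."""
    return virt_to_mib(field) >= limit_mib * margin
```

    With these defaults, near_limit("3700m") is True and near_limit("1200m") is False. Comparing readings before and after a few policy installs would show whether the value keeps growing.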

    After 4 months of debugging, testing and escalating, we got an additional hotfix based on JHFA 225 which fixed the memory leaks.

    Sorry if this is not an exact answer to your question.

    Regards

  7. #7
    Join Date
    2006-03-08
    Location
    Lausanne
    Posts
    1,030
    Rep Power
    19

    Default Re: ClusterXL unexpected/hidden failover

    How do you figure that a failover happened? The info provided does not indicate a failover at all. Is there anything else, I guess?
    -------------

    Valeri Loukine
    CCMA, CCSM, CCSI
    http://checkpoint-master-architect.blogspot.com/

  8. #8
    Join Date
    2009-04-30
    Location
    Colorado, USA
    Posts
    2,252
    Rep Power
    18

    Default Re: ClusterXL unexpected/hidden failover

    Try this undocumented command option for a nice concise history of ClusterXL failovers, great when trying to figure out if there is a pattern:

    show routed cluster-state detailed
    Your firewall can be statically routed with no dynamic routing configured and this command will still work.

  9. #9
    Join Date
    2005-11-25
    Location
    United States, Southeast
    Posts
    857
    Rep Power
    18

    Default Re: ClusterXL unexpected/hidden failover

    Quote Originally Posted by laf_c View Post
    I looked over, but I couldn't find any explanation/message that says anything about a failover occurring hence my forum post...
    Isn't there any other system file on firewall that would tell me more?
    If you don't see any Control messages, then the firewall didn't fail-over.

    It was likely a failure of some other device.

  10. #10
    Join Date
    2013-09-25
    Location
    Bucharest
    Posts
    649
    Rep Power
    10

    Default Re: ClusterXL unexpected/hidden failover

    Quote Originally Posted by varera View Post
    How do you figure that a failover happened? The info provided does not indicate a failover at all. Is there anything else, I guess?
    That was part of my original question :). My assumption was based on

    Time since last report: 81106.9 sec

    But since I couldn't find any wrench info in SmartView Tracker, I now doubt when the failover really occurred. About 1 year ago the 001A unit was active, but last week while doing some troubleshooting I found out that 001B is active, and I assumed 81106.9 sec was the value telling me when this occurred.

    Later edit: that "show routed cluster-state detailed" command looks pretty cool! Thanks!
    Now I can't use it to trace what originally happened as I had to perform emergency maintenance last evening on that site and I rebooted both units.
    Last edited by laf_c; 2017-06-08 at 09:29.

  11. #11
    Join Date
    2006-03-08
    Location
    Lausanne
    Posts
    1,030
    Rep Power
    19

    Default Re: ClusterXL unexpected/hidden failover

    Quote Originally Posted by laf_c View Post
    Now I can't use it to trace what originally happened as I had to perform emergency maintenance last evening on that site and I rebooted both units.
    Mystery solved :-)
