CPUG: The Check Point User Group

Thread: Freezes/Lock-Out on our firewall that have CP puzzled.

  1. #1
    Join Date
    2017-11-01
    Posts
    20
    Rep Power
    0

    Default Freezes/Lock-Out on our firewall that have CP puzzled.

    Hi,

    We have some ‘freezes’/lock-outs with our DMZ firewall. This is a 12400 appliance for reference.

    These instances happen at any time. There is no correlation at all with these events. Some have happened at 1am, and some have happened at 3pm.

    The issue is that traffic does not pass through the firewall. TCPDUMP shows traffic hitting the inbound interface and not leaving the outbound interface.

    Fw ctl debugs with various flags show no drops at all. The traffic disappears!

    This is having a critical impact on the business, however, and the only fix is a cpstop and cpstart.

    CPU utilisation during this time does not hit anything over 35% and memory nothing over 30%.

    Checkpoint are puzzled. We have done various CPSizeMe’s and CPInfo’s and Checkpoint have verified the boxes are completely healthy and not overworking.

    The box during this time is alive and well from a monitoring PoV. SmartDashboard sees it fine too.

    The only other time this happens, with the exact same symptoms, is when we do a policy installation. The only difference is that the issue then self-rectifies after 10 to 15 minutes (which is still not ideal, of course).

    Traffic after a policy installation will disappear in the firewall, and not leave the other side, with no logging or drop events.

    Weirdly, some services in the same rule set will also come back sooner than others after a policy install. For example, Web Servers A, B and C are all on rule 50. A and C will be accessible after about 10 minutes, but B will stay down for the next 15 minutes, which makes no sense at all.

    So, in summary we have 2 scenarios: 1) Random occurrences which cause a complete loss in service from the firewall, only rectified with a CPSTOP and Start. 2) After a policy installation, which has exact same symptoms but will self-rectify.

    Checkpoint are puzzled. Anyone else have any ideas?

    This is business critical, and any suggestions would be greatly appreciated

  2. #2
    Join Date
    2006-09-26
    Posts
    3,055
    Rep Power
    15

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    Which Check Point version and hotfix take are you running?

  3. #3
    Join Date
    2017-11-01
    Posts
    20
    Rep Power
    0

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    R77.30 with take 216. Take 286 not installed.

  4. #4
    Join Date
    2011-08-02
    Location
    http://spikefishsolutions.com
    Posts
    1,480
    Rep Power
    8

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    Originally posted by JPYDX:
    R77.30 with take 216. Take 286 not installed.
    Reply with the output of the following from both cluster members. Can you explain the firewall topology as well? It seems like you're saying only the DMZ interface is going MIA, correct?

    dmesg
    fw ctl arp
    ifconfig
    fwaccel stat
    cphaprob stat
    cphaprob -a if
    cphaprob -i list
    fw ctl pstat

    Also what blades do you have enabled? Can’t remember the command to show that.

  5. #5
    Join Date
    2009-04-30
    Location
    Colorado, USA
    Posts
    2,088
    Rep Power
    12

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    This sounds like it might be an ARP/network issue, read on...

    Originally posted by JPYDX:
    Hi,

    We have some ‘freezes’/lock-outs with our DMZ firewall. This is a 12400 appliance for reference.

    These instances happen at any time. There is no correlation at all with these events. Some have happened at 1am, and some have happened at 3pm.

    The issue is that traffic does not pass through the firewall. TCPDUMP shows traffic hitting the inbound interface and not leaving the outbound interface.
    Just because tcpdump shows traffic hitting an interface does not mean that Gaia picked up the packet off the wire and processed it, especially since running tcpdump puts the interface into promiscuous mode at which point it will forward everything. During the next problem period, run tcpdump with the -p option (disable promisc mode) and -e option. Do the packets still appear? Are they going to the correct destination MAC address corresponding to that interface of the firewall? Once those two things are verified, run fw monitor -m i, is the packet actually arriving at the INSPECT driver on the firewall? My guess is no.


    Fw ctl debugs with various flags show no drops at all. The traffic disappears!

    This is having a critical impact on the business, however, and the only fix is a cpstop and cpstart.
    Probably because the packet is never hitting the INSPECT driver in the first place. Also during the next problem period, disable SecureXL with "fwaccel off". Does the problem immediately correct itself? Could be an issue between SecureXL and the INSPECT driver (PXL/F2F).

    CPU utilisation during this time does not hit anything over 35% and memory nothing over 30%.

    Checkpoint are puzzled. We have done various CPSizeMe’s and CPInfo’s and Checkpoint have verified the boxes are completely healthy and not overworking.

    The box during this time is alive and well from a monitoring PoV. SmartDashboard sees it fine too.

    The only other time this happens, with the exact same symptoms, is when we do a policy installation. The only difference is that the issue then self-rectifies after 10 to 15 minutes (which is still not ideal, of course).
    That still sounds like a network/ARP issue to me.


    Traffic after a policy installation will disappear in the firewall, and not leave the other side, with no logging or drop events.

    Weirdly, some services in the same rule set will also come back sooner than others after a policy install. For example, Web Servers A, B and C are all on rule 50. A and C will be accessible after about 10 minutes, but B will stay down for the next 15 minutes, which makes no sense at all.

    So, in summary we have 2 scenarios: 1) Random occurrences which cause a complete loss in service from the firewall, only rectified with a CPSTOP and Start. 2) After a policy installation, which has exact same symptoms but will self-rectify.

    Checkpoint are puzzled. Anyone else have any ideas?

    This is business critical, and any suggestions would be greatly appreciated
    Are you seeing any "neighbor table overflow" messages in syslog? How many ARP entries in the cache during the problem period? (arp -an | wc -l)
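
The two checks asked for above (ARP entry count and "neighbor table overflow" messages) can be scripted so they are quick to run during an outage. This is only a rough sketch: it assumes the usual Linux `arp -an` line layout, it treats any log line matching "neighbor/neighbour table overflow" as a hit, and the sample addresses and log lines are made up for illustration.

```python
import re

# Assumed `arp -an` layout: "? (192.0.2.1) at 00:1c:7f:00:00:01 [ether] on eth1"
ARP_LINE = re.compile(r"^\S+ \((?P<ip>[0-9.]+)\) at (?P<mac>\S+)")
# The kernel message is spelled both ways depending on version.
OVERFLOW = re.compile(r"neighbou?r table overflow", re.IGNORECASE)


def summarize_arp(arp_output):
    """Return (total_entries, incomplete_entries) for `arp -an` style text."""
    total = incomplete = 0
    for line in arp_output.splitlines():
        match = ARP_LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not an ARP entry
        total += 1
        if match.group("mac") == "<incomplete>":
            incomplete += 1
    return total, incomplete


def overflow_lines(log_text):
    """Return the log lines that mention a neighbor table overflow."""
    return [line for line in log_text.splitlines() if OVERFLOW.search(line)]


# Hypothetical sample data, not taken from the firewall in this thread.
SAMPLE_ARP = """\
? (192.0.2.1) at 00:1c:7f:00:00:01 [ether] on eth1
? (192.0.2.7) at <incomplete> on eth1
? (192.0.2.9) at 00:1c:7f:00:00:09 [ether] on eth1
"""
SAMPLE_LOG = """\
kernel: Neighbour table overflow.
kernel: [fw4_1];Starting ClusterXL
"""

if __name__ == "__main__":
    print(summarize_arp(SAMPLE_ARP))
    print(overflow_lines(SAMPLE_LOG))
```

Feeding it the saved output of `arp -an` and of `dmesg` (or /var/log/messages) gives the entry count to compare against the configured cache size.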

    Please provide output of following commands, hopefully right after a problem and assumes the firewall has not been rebooted since a problem period:

    ethtool -S (internal interface name)
    netstat -ni
    netstat -s
    cpstat -f sensors os
    cpstat -f power_supply os

    So to summarize during the next problem period try this:

    1) Verify packets are actually hitting INSPECT driver
    tcpdump -p -e -ni (internal interface name)
    fw monitor -m i

    2) Check ARP Cache Size
    arp -an | wc -l

    3) Try disabling SecureXL
    fwaccel off
    (check condition)
    fwaccel on
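
For step 1, the destination MAC check on a saved `tcpdump -p -e -n` capture can be done with a few lines of Python. This is a sketch under assumptions: it expects the common tcpdump link-level layout (`time src > dst, ethertype ...`), the firewall MAC and sample frames are invented, and broadcast/multicast destinations are skipped because they are legitimately not addressed to the interface.

```python
import re

MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"
# Assumed `tcpdump -e -n` layout: "<time> <src mac> > <dst mac>, ethertype ..."
FRAME = re.compile(
    rf"^\S+ (?P<src>{MAC}) > (?P<dst>{MAC}),",
    re.IGNORECASE,
)


def misaddressed_frames(tcpdump_output, firewall_mac):
    """Return unicast frames whose destination MAC is not the firewall's."""
    firewall_mac = firewall_mac.lower()
    wrong = []
    for line in tcpdump_output.splitlines():
        match = FRAME.match(line.strip())
        if not match:
            continue
        dst = match.group("dst").lower()
        # Group bit set in the first octet means broadcast or multicast.
        if int(dst[:2], 16) & 1:
            continue
        if dst != firewall_mac:
            wrong.append(line.strip())
    return wrong


# Hypothetical capture; 00:1c:7f:00:00:10 stands in for the interface MAC.
SAMPLE_CAPTURE = """\
13:01:02.000001 00:50:56:aa:00:01 > 00:1c:7f:00:00:10, ethertype IPv4 (0x0800), length 74: 198.51.100.5.40112 > 192.0.2.80.443: Flags [S], length 0
13:01:02.000150 00:50:56:aa:00:01 > 00:1c:7f:00:00:99, ethertype IPv4 (0x0800), length 74: 198.51.100.5.40113 > 192.0.2.81.443: Flags [S], length 0
13:01:02.000300 00:50:56:aa:00:02 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.0.2.1 tell 192.0.2.5, length 46
"""

if __name__ == "__main__":
    for frame in misaddressed_frames(SAMPLE_CAPTURE, "00:1c:7f:00:00:10"):
        print(frame)
```

Any frame it prints was sent to some other MAC, which would point at the neighbouring device's ARP entry rather than at the firewall.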
    --
    My Book "Max Power: Check Point Firewall Performance Optimization"
    Second Edition Coming Soon

  6. #6
    Join Date
    2011-08-02
    Location
    http://spikefishsolutions.com
    Posts
    1,480
    Rep Power
    8

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    I was thinking ARP issue as well, but couldn't tie it to policy install, unless maybe there are a lot of NATs using the local subnet and a failover is happening post policy install. Maybe GARPs aren't being processed or something.

    Neighbor table overflow will show up in dmesg as well.

  7. #7
    Join Date
    2017-11-01
    Posts
    20
    Rep Power
    0

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    I will check syslog now, and also take all this down for the next occurrence.

    Thanks all again, this is massively helpful.

    Question - when you suggest it could be ARP issues, what in particular about ARP? Is there anything else I could check outside of the CP to check ARP?

  8. #8
    Join Date
    2011-08-02
    Location
    http://spikefishsolutions.com
    Posts
    1,480
    Rep Power
    8

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    Linux has a max size for the ARP table. Neighbor table overflow means you hit the max, which by default is I think 1024? You can bump the size without issue. From clish I think it's something like set arp-cache size.

    Btw you don’t have to wait until next failure to run those commands.

  9. #9
    Join Date
    2009-04-30
    Location
    Colorado, USA
    Posts
    2,088
    Rep Power
    12

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    Originally posted by jflemingeds:
    Linux has a max size for the ARP table. Neighbor table overflow means you hit the max, which by default is I think 1024? You can bump the size without issue. From clish I think it's something like set arp-cache size.

    Btw you don't have to wait until next failure to run those commands.
    Default ARP cache size is 4096 in modern Gaia versions and should not be increased unless necessary.

    Overall this just smells like a network-level issue which can stymie Check Point support since the issue (probably) has nothing to do directly with Check Point code.

  10. #10
    Join Date
    2006-09-26
    Posts
    3,055
    Rep Power
    15

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    Originally posted by jflemingeds:
    Linux has a max size for the ARP table. Neighbor table overflow means you hit the max, which by default is I think 1024? You can bump the size without issue. From clish I think it's something like set arp-cache size.

    Btw you don't have to wait until next failure to run those commands.
    I ran into this issue about 4 months ago and it was an ARP table overflow. A network tool tried to detect all the hosts behind the firewalls. There are fewer than 500 hosts, but because it scans a whole /16 network it took out the firewall, even though most of the resulting ARP entries were incomplete.

    Since you have 12400 appliances, just increase the ARP cache size from the default of 1024 to 16384. It takes effect on the fly:

    Power-1-P> show configuration arp
    set arp table cache-size 16384
    set arp table validity-timeout 60
    set arp announce 2
    Power-1-P> set arp table cache-size 16384
    Power-1-P> save config
    Power-1-P>
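
To put rough numbers on the /16 sweep described above, here is a small back-of-the-envelope sketch. It assumes the swept network is directly attached, so each probed address can create its own (possibly incomplete) ARP entry, and it ignores entry aging. The 10.20.0.0/16 network is a made-up example.

```python
import ipaddress


def sweep_vs_cache(network, cache_size):
    """Return (addresses in the network, addresses per ARP cache slot)."""
    addresses = ipaddress.ip_network(network).num_addresses
    return addresses, addresses / cache_size


if __name__ == "__main__":
    # Cache sizes mentioned in this thread: 1024, 4096 and 16384.
    for size in (1024, 4096, 16384):
        addresses, ratio = sweep_vs_cache("10.20.0.0/16", size)
        print(f"cache-size {size}: {addresses} addresses, {ratio:.0f}x the table")
```

A full /16 is still larger than a 16384-entry table, so the bigger cache presumably helps because incomplete entries age out before the table fills. That aging behaviour is not modelled here.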

  11. #11
    Join Date
    2017-11-01
    Posts
    20
    Rep Power
    0

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    The ARP bit is confusing me.

    If traffic is getting to the outside interface on the firewall, then surely ARP isn't an issue, since the traffic has made it there?

    Also, if you are suggesting ARP is an issue on the external interface of the firewall, wouldn't this show in the logs?

    dmesg is currently showing:

    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];FW-1: fwha_set_new_local_state: Setting state of fwha_local_id(0) to ACTIVE
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=188361678)
    [fw4_1];FW-1: [CUL - Member] Policy Freeze mechanism disabled, Enabling state machine at 4 (time=188361678, caller=fwioctl: FWHA_CUL_POLICY_STATE_FREEZE)
    [fw4_1];FW-1: [freeze_on_remote] freeze state on remote member 1 has changed from 1 to 0
    [fw4_1]; eth7
    [fw4_1];Stopping ClusterXL
    [fw4_1];Starting ClusterXL
    [fw4_1];FW-1: fwha_set_new_local_state: Setting state of fwha_local_id(0) to ACTIVE
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=188465695)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=188609835)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=188753961)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=188898082)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=189042210)
    [fw4_0];fw_kmalloc_impl: alloc_ranges: allocates 0 bytes
    [fw4_1];fw_kmalloc_impl: alloc_ranges: allocates 0 bytes
    [fw4_2];fw_kmalloc_impl: alloc_ranges: allocates 0 bytes
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=189186351)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=189330491)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=189474627)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=189618750)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=189762887)
    [fw4_1];FW-1: [freeze_on_remote] freeze state on remote member 1 has changed from 0 to 1
    [fw4_1];fwioctl: Policy has started. Extending dead timeouts
    [fw4_1];FW-1: [cul_policy_freeze][CUL - Member] fwha_cul_policy_freeze_state_change: set Policy Freeze [ON], FREEZING state machine at ACTIVE (time=189888994, caller=fwioctl: FWHA_CUL_POLICY_STATE_FREEZE, freeze_timeout=300, freeze_event_timeout=150)
    [fw4_1];fwha_hp_periodic_run: Policy has ended 120 seconds ago. Returning to regular timeouts
    [fw4_1];FW-1: [freeze_on_remote] freeze state on remote member 1 has changed from 1 to 0
    [fw4_0];fw_kmalloc_impl: alloc_ranges: allocates 0 bytes
    [fw4_1];FW-1: [CUL - Member] Policy Freeze mechanism disabled, Enabling state machine at 4 (time=189889294, caller=fwha_hp_periodic_run: FWHA_CUL_POLICY_STATE_FREEZE_TIMEDOUT)
    [fw4_1];fw_kmalloc_impl: alloc_ranges: allocates 0 bytes
    [fw4_2];fw_kmalloc_impl: alloc_ranges: allocates 0 bytes
    [fw4_0];fw_kmalloc_impl: b_replace: allocates 0 bytes
    [fw4_1];fw_kmalloc_impl: b_replace: allocates 0 bytes
    [fw4_2];fw_kmalloc_impl: b_replace: allocates 0 bytes
    [fw4_1];FW-1: fwha_set_new_local_state: Setting state of fwha_local_id(0) to FAILURE
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];FW-1: fwha_set_new_local_state: Setting state of fwha_local_id(0) to ACTIVE
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=189889303)
    [fw4_1]; eth7
    [fw4_1];Stopping ClusterXL
    [fw4_1];Starting ClusterXL
    [fw4_1];FW-1: fwha_set_new_local_state: Setting state of fwha_local_id(0) to ACTIVE
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=189908028)
    [fw4_0];fw_kmalloc_impl: alloc_ranges: allocates 0 bytes
    [fw4_1];fw_kmalloc_impl: alloc_ranges: allocates 0 bytes
    [fw4_2];fw_kmalloc_impl: alloc_ranges: allocates 0 bytes
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=190052172)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=190196322)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=190340557)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=190484689)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=190628806)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=190772942)
    [fw4_0];fw_kmalloc_impl: alloc_ranges: allocates 0 bytes
    [fw4_1];fw_kmalloc_impl: alloc_ranges: allocates 0 bytes
    [fw4_2];fw_kmalloc_impl: alloc_ranges: allocates 0 bytes
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=190917082)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=191061321)
    [fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
    [fw4_1];fwioctl: Policy has ended. Continuing extending dead timouts (fwha_cul_policy_done_time=191205448)
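
A dump like the one above is mostly repetition, so one way to skim it is to pull out only the cluster state changes. The sketch below assumes the `fwha_set_new_local_state` message keeps the wording seen in this output. The sample lines are copied from the dump, and the function names are just illustrative.

```python
import re

# Wording taken from the dmesg lines quoted in this post.
STATE = re.compile(
    r"fwha_set_new_local_state: Setting state of fwha_local_id\((\d+)\) to (\w+)"
)


def state_changes(dmesg_text):
    """Return (member_id, new_state) tuples in the order they were logged."""
    return [(int(m.group(1)), m.group(2)) for m in STATE.finditer(dmesg_text)]


def non_active(dmesg_text):
    """Return only the transitions into a state other than ACTIVE."""
    return [change for change in state_changes(dmesg_text) if change[1] != "ACTIVE"]


SAMPLE = """\
[fw4_1];FW-1: fwha_set_new_local_state: Setting state of fwha_local_id(0) to FAILURE
[fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.
[fw4_1];FW-1: fwha_set_new_local_state: Setting state of fwha_local_id(0) to ACTIVE
"""

if __name__ == "__main__":
    print(state_changes(SAMPLE))
    print(non_active(SAMPLE))
```

On the output above this would surface the single transition to FAILURE that sits in the middle of the policy install messages.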

    This conversation is really appreciated.

  12. #12
    Join Date
    2017-11-01
    Posts
    20
    Rep Power
    0

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    Also - here we are.

    This issue has only started in the last 4 months or so. The checkpoints have been in for over 2 years.

    Why suddenly would ARP be an issue?

    show configuration arp
    set arp table cache-size 4096
    set arp table validity-timeout 60
    set arp announce 2

  13. #13
    Join Date
    2007-03-30
    Location
    DFW, TX
    Posts
    103
    Rep Power
    11

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    Originally posted by JPYDX:
    The ARP bit is confusing me.

    If traffic is getting to the outside interface on the firewall, then surely ARP isn't an issue, since the traffic has made it there?

    Also, if you are suggesting ARP is an issue on the external interface of the firewall, wouldn't this show in the logs?
    ARP issues manifest as the firewall thinking it’s passing a connection, but fw monitor only shows i-I with no o-O. This is because the ARP resolution happens at the OS level before the system tries to clock a frame out to the wire. If the system can’t get a destination MAC to put in the frame, it never tries to actually serialize the frame for transmission. It would almost certainly happen on all interfaces at once. It sounds like this matches your symptoms.

    If it is an ARP issue, a tcpdump filtered to exclude the firewall’s MAC address in the destination (with the intent of only seeing broadcast or traffic from the firewall) should show it. Specifically, it would show a sudden drop off in other traffic and a set of ARP requests.
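
The "i-I with no o-O" pattern can be picked out of saved fw monitor output with a short script. This is a sketch under assumptions: the line layout (`iface:pos[len]: src -> dst (PROTO) len=N id=N`) is the usual fw monitor header as I remember it, packets are matched on protocol plus IP ID because NAT may rewrite the addresses between inspection points, and the sample lines are invented.

```python
import re
from collections import defaultdict

# Assumed layout: "[vs_0][fw_1] eth1:i[60]: 1.2.3.4 -> 5.6.7.8 (TCP) len=60 id=4101"
MONITOR = re.compile(
    r"(?P<iface>[\w.\-]+):(?P<pos>[iIoO])\[\d+\]: "
    r"(?P<src>\S+) -> (?P<dst>\S+) \((?P<proto>\w+)\) len=\d+ id=(?P<ipid>\d+)"
)


def stuck_packets(monitor_output):
    """Return 'src -> dst' for packets seen at i/I but never at o/O."""
    seen = defaultdict(set)   # (proto, ip id) -> inspection points observed
    label = {}                # (proto, ip id) -> addresses at first sighting
    for line in monitor_output.splitlines():
        m = MONITOR.search(line)
        if not m:
            continue
        key = (m.group("proto"), m.group("ipid"))
        seen[key].add(m.group("pos"))
        label.setdefault(key, f'{m.group("src")} -> {m.group("dst")}')
    return sorted(
        label[key]
        for key, points in seen.items()
        if points & {"i", "I"} and not points & {"o", "O"}
    )


# Hypothetical capture: the first packet leaves, the second never does.
SAMPLE = """\
[vs_0][fw_1] eth1:i[60]: 198.51.100.5 -> 192.0.2.80 (TCP) len=60 id=4101
[vs_0][fw_1] eth1:I[60]: 198.51.100.5 -> 192.0.2.80 (TCP) len=60 id=4101
[vs_0][fw_1] eth2:o[60]: 198.51.100.5 -> 192.0.2.80 (TCP) len=60 id=4101
[vs_0][fw_1] eth2:O[60]: 198.51.100.5 -> 192.0.2.80 (TCP) len=60 id=4101
[vs_0][fw_1] eth1:i[60]: 198.51.100.5 -> 192.0.2.81 (TCP) len=60 id=4102
[vs_0][fw_1] eth1:I[60]: 198.51.100.5 -> 192.0.2.81 (TCP) len=60 id=4102
"""

if __name__ == "__main__":
    print(stuck_packets(SAMPLE))
```

IP IDs can repeat in a long capture, so treat the result as a pointer to where to look rather than proof.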
    Zimmie

  14. #14
    Join Date
    2017-11-01
    Posts
    20
    Rep Power
    0

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    Okay - so ARP could be an issue we are saying?

    1) - Why would this only be rectified with a CPSTOP and Start? What happens to the ARP cache then to cause this to start working again?
    2) - What happens to the arp during a policy installation? What causes the temporary loss there?

    Sorry for the questions. It's helping me get a better understanding of how this could be the issue.

  15. #15
    Join Date
    2011-08-02
    Location
    http://spikefishsolutions.com
    Posts
    1,480
    Rep Power
    8

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    Yeah that is the thing. If it was arp I would expect it to be happening at random times and not around policy install.

    Those kmalloc lines don't look good. Can you paste fw ctl pstat and the memory line from top?

  16. #16
    Join Date
    2017-11-01
    Posts
    20
    Rep Power
    0

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    Adding more to this, as I forgot some bits of possibly useful information.

    When a policy push is occurring, the slave member in the VRRP setup becomes active for roughly 1 second, no longer. So both members are master/master, but only for that second. Is this usual?

    The last fw ctl debugs didn't show much apart from fw_run_ filter and some dropped packets due to INDOM, which I'm now aware relates to domain objects. We only use 2 domain objects in our rule set, and they are at the bottom as per CP's suggestions. I fail to see how 2 objects could affect a rule base of hundreds of rules, but it's worth noting.

    Also, directly after the issue occurs (regardless of whether it's a policy push or not), cpinfo fails to verify the CK. It says it is unable to resolve the host. DNS?

    Hopefully these extra bits might help with a few theories.

    Jflemingeds - I’ll get back to you shortly. What does this mean?

  17. #17
    Join Date
    2006-09-26
    Posts
    3,055
    Rep Power
    15

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    Originally posted by JPYDX:
    Okay - so ARP could be an issue we are saying?

    1) - Why would this only be rectified with a CPSTOP and Start? What happens to the ARP cache then to cause this to start working again?
    2) - What happens to the ARP table during a policy installation? What causes the temporary loss there?

    Sorry for the questions. It's helping me get a better understanding of how this could be the issue.
    I can't explain your situation either, but in my case a cpstop/cpstart did NOT resolve the issue. I had to make the standby firewall active and reboot the previously active, troubled firewall to resolve it.

    I can reproduce this issue repeatedly when I have the ARP cache size set to the default of 4096. Since I set it to 16384, I have not had the issue yet :-) but I also control the ping sweep server that blasts out a /16 and causes the ARP table to fill up.

    Please increase the ARP cache size to 16384 on your production firewalls. It will do nothing but help. Not sure why Check Point does not do that in the first place.

    Do you send syslog from the gateways to an external syslog server, so that when the issue occurs you can go back and track it down?

  18. #18
    Join Date
    2017-11-01
    Posts
    20
    Rep Power
    0

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    Originally posted by jflemingeds:
    Yeah that is the thing. If it was ARP I would expect it to be happening at random times and not around policy install.

    Those kmalloc lines don't look good. Can you paste fw ctl pstat and the memory line from top?
    https://supportcenter.checkpoint.com...ionid=sk116956

    Looks like it's normal.

  19. #19
    Join Date
    2007-03-30
    Location
    DFW, TX
    Posts
    103
    Rep Power
    11

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    Originally posted by JPYDX:
    Okay - so ARP could be an issue we are saying?

    1) - Why would this only be rectified with a CPSTOP and Start? What happens to the ARP cache then to cause this to start working again?
    2) - What happens to the ARP table during a policy installation? What causes the temporary loss there?

    Sorry for the questions. It's helping me get a better understanding of how this could be the issue.
    I'll answer question 2 first, because it's easier. Absolutely nothing should happen to the ARP table on policy install. The firewall will flush out gratuitous ARP replies for all configured proxy ARP entries, but that isn't related to the same ARP table used for transmitting frames.

    As for question 1, I don't know. It could just be the cpstop/cpstart takes long enough for the issue to resolve "on its own". It's also possible something wacky in your state is causing the firewall to drop ARP replies for some reason before they get to the OS. I don't like blaming state, because it's only rarely the real problem. Clearing state happens to involve steps which fix a lot of other problems not directly related to bad state.
    Zimmie

  20. #20
    Join Date
    2017-11-01
    Posts
    20
    Rep Power
    0

    Default Re: Freezes/Lock-Out on our firewall that have CP puzzled.

    The issue is very much firewall related, I believe. After many lines of further investigation on our network, DNS etc., I'm pretty confident now that it is the box.

    Aside from ARP, any other suggestions? I would increase the ARP cache, but I don't really see why the current ARP cache isn't big enough, if you understand me?
