CPUG: The Check Point User Group

Resources for the Check Point Community, by the Check Point Community.



Thread: Checkpoint 5400 100% CPU usage

  1. #1
    Join Date
    2018-02-26
    Posts
    12
    Rep Power
    0

    Default Checkpoint 5400 100% CPU usage

    Hi,

    In my place of work we have two Check Point 5400 appliances running in a clustered configuration. We're struggling badly with them at the moment, as CPU usage seems to be maxed out most of the time.

    We have all of the acceleration templates, drop templates, etc. enabled. I have tried to enable hyperthreading, but it looks like either the 5400 doesn't support it or it's disabled in the BIOS (one for the support contractors to resolve).

    When running cpview, I have noticed one connection that stands out when CPU usage is extremely high (90% - 100%): a TCP connection carrying iSCSI. I'm pretty sure we shouldn't have iSCSI traffic running through the firewall, and that's something I'll look into resolving when back in the office. However, this traffic is only 7,500 pps and its throughput is a measly 65 Mbps or so; everything else is barely hitting three figures in pps and doesn't even register in the Mbps column.
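
    As a rough sanity check on the figures quoted for the iSCSI connection, the average packet size can be worked out from the bit rate and the packet rate. This is only a sketch: it assumes "Mbps" means 10^6 bits per second and that both figures were sampled at the same moment, and the function name is made up for illustration.

```python
def average_packet_bytes(bits_per_second: float, packets_per_second: float) -> float:
    """Average packet size in bytes implied by a bit rate and a packet rate."""
    if packets_per_second <= 0:
        raise ValueError("packets_per_second must be positive")
    return bits_per_second / 8 / packets_per_second


# Figures quoted in the post: roughly 65 Mbps at 7,500 pps.
size = average_packet_bytes(65_000_000, 7_500)
print(f"average packet size: {size:.0f} bytes")
```

    That works out to a little under 1,100 bytes per packet, i.e. fairly large packets, which is what bulk storage traffic would be expected to look like.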

    According to this document

    https://www.checkpoint.com/downloads...ison-chart.pdf

    the 5400 should be capable of 15,000 pps. What gives? Why is our appliance struggling so badly? Is it because it's iSCSI traffic, or is there something else that I've missed? (Highly likely, as I'm very new to Check Point.)

    Any and all advice is very much appreciated.

  2. #2
    Join Date
    2007-03-30
    Location
    DFW, TX
    Posts
    320
    Rep Power
    13

    Default Re: Checkpoint 5400 100% CPU usage

    The comparison chart you linked actually says these boxes should be able to handle 150,000 new connections per second (under ideal testing conditions, of course). Setting up new connections is computationally expensive. They should be able to handle far more than that in terms of packets per second on existing connections.



    Where are you seeing CPU usage maxed out? Different tools report different levels of usage as "100%". Some report an average of all cores (so 100% on one core would be reported as 25%, and 100% of four cores would be reported as 100%), while others add the cores together (so 100% on one core would be reported as 100%, but 100% of four cores would be reported as 400%).
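
    The two reporting conventions described above can be illustrated with a small sketch. The function names are hypothetical and not taken from any monitoring tool; the per-core values are the four-core examples from the paragraph.

```python
def averaged_usage(per_core: list[float]) -> float:
    """Convention 1: average across all cores, so the maximum is 100%."""
    return sum(per_core) / len(per_core)


def summed_usage(per_core: list[float]) -> float:
    """Convention 2: cores added together, so the maximum is 100% x core count."""
    return sum(per_core)


one_core_busy = [100.0, 0.0, 0.0, 0.0]
all_cores_busy = [100.0, 100.0, 100.0, 100.0]

print(averaged_usage(one_core_busy), summed_usage(one_core_busy))    # 25.0 100.0
print(averaged_usage(all_cores_busy), summed_usage(all_cores_busy))  # 100.0 400.0
```

    The same "100%" can therefore mean one saturated core or all of them, depending on which convention the tool uses.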

    What cluster mode are you running? You can check this with 'cphaprob state'.

    Do you have a separate SmartCenter, or are these firewalls also management servers? To check this, run 'fwm ver'.

    On the active member, what does your RAM usage look like? Check this with the 'free -m' command.



    Depending on what features you have enabled, the boxes may be running low on RAM, which causes them to swap data out to the disk. Swapping data out to disk, then swapping other data back into RAM is a synchronous operation. The time spent doing that gets booked as consumed processor time, even though it isn't really the processor doing any work.
    Zimmie

  3. #3
    Join Date
    2009-04-30
    Location
    Colorado, USA
    Posts
    2,252
    Rep Power
    14

    Default Re: Checkpoint 5400 100% CPU usage

    The 5400 does not support SMT/Hyperthreading; support for SMT starts with the 5800 model and higher.

    Please provide the output of the following commands for further diagnosis, ideally run when the system is exhibiting its worst performance:

    free -m

    netstat -ni

    enabled_blades

    fwaccel stat

    fwaccel stats -s

    fw ctl multik stat

    fw ctl affinity -l -r

    fw ctl multik get_mode (R77.30) or fw ctl multik dynamic_dispatching get_mode (R80.10+)

    cpstat os -f multi_cpu -o 1

    cpconfig (the menu displayed by this command)
    --
    Third Edition of my "Max Power 2020" Firewall Book
    Now Available at http://www.maxpowerfirewalls.com

  4. #4
    Join Date
    2018-02-26
    Posts
    12
    Rep Power
    0

    Default Re: Checkpoint 5400 100% CPU usage

    Quote Originally Posted by ShadowPeak.com View Post
    The 5400 does not support SMT/Hyperthreading; support for SMT starts with the 5800 model and higher.
    That's annoying; Check Point support suggested enabling it as a potential solution!

    Quote Originally Posted by ShadowPeak.com View Post
    free -m
                         total    used    free  shared buffers  cached
    Mem:                 15812    8474    7338       0     358    4066
    -/+ buffers/cache:            4049   11763
    Swap:                17390       0   17390

    Quote Originally Posted by ShadowPeak.com View Post
    netstat -ni
    Iface       MTU Met       RX-OK RX-ERR RX-DRP RX-OVR       TX-OK TX-ERR TX-DRP TX-OVR Flg
    Mgmt       1500   0    55782830      0     14     14   665957109      0      0      0 BMRU
    Sync       1500   0    64227211      0      0      0   258529161      0      0      0 BMRU
    bond0      1500   0 15387506853      0 393325 393325 12723291410      0      0      0 BMmRU
    bond1      1500   0 54812944762      0  18592  18592 56658898954      0      0      0 BMmRU
    bond1.103  1500   0 22599292224      0      0      0 20259908774      0      0      0 BMmRU
    bond1.104  1500   0  4714192578      0      0      0  5145028503      0      0      0 BMmRU
    bond1.301  1500   0        4713      0      0      0       61354      0      0      0 BMmRU
    bond1.302  1500   0       61196      0      0      0     1525687      0      0      0 BMmRU
    bond1.401  1500   0    12121689      0      0      0           0      0      0      0 BMmRU
    bond1.410  1500   0 27466905116      0      0      0 31249907646      0      0      0 BMmRU
    bond1.411  1500   0    15834510      0      0      0    15271286      0      0      0 BMmRU
    bond1.990  1500   0        1977      0      0      0        2571      0      0      0 BMmRU
    eth1       1500   0  6393994266      0      0      0  6267717651      0      0      0 BMsRU
    eth2       1500   0  8993512627      0 393325 393325  6455573799      0      0      0 BMsRU
    eth3       1500   0 26659358408      0  10289  10289 30857876234      0      0      0 BMsRU
    eth4       1500   0 28153586408      0   8303   8303 25801022770      0      0      0 BMsRU
    lo        16436   0    13644985      0      0      0    13644985      0      0      0 LRU

    Quote Originally Posted by ShadowPeak.com View Post
    enabled_blades
    fw appi ips identityServer

    Quote Originally Posted by ShadowPeak.com View Post
    fwaccel stat
    Accelerator Status : on
    Accept Templates : enabled
    Drop Templates : disabled
    NAT Templates : disabled by user

    Accelerator Features : Accounting, NAT, Cryptography, Routing,
    HasClock, Templates, Synchronous, IdleDetection,
    Sequencing, TcpStateDetect, AutoExpire,
    DelayedNotif, TcpStateDetectV2, CPLS, McastRouting,
    WireMode, DropTemplates, NatTemplates,
    Streaming, MultiFW, AntiSpoofing, Nac,
    ViolationStats, AsychronicNotif, ERDOS,
    NAT64, GTPAcceleration, SCTPAcceleration,
    McastRoutingV2
    Cryptography Features : Tunnel, UDPEncapsulation, MD5, SHA1, NULL,
    3DES, DES, CAST, CAST-40, AES-128, AES-256,
    ESP, LinkSelection, DynamicVPN, NatTraversal,
    EncRouting, AES-XCBC, SHA256

    Quote Originally Posted by ShadowPeak.com View Post
    fwaccel stats -s
    Accelerated conns/Total conns : 1073/6949 (15%)
    Delayed conns/(Accelerated conns + PXL conns) : 151/6537 (2%)
    Accelerated pkts/Total pkts : 1272408892/1600746449 (79%)
    F2Fed pkts/Total pkts : 15375115/1600746449 (0%)
    PXL pkts/Total pkts : 312962442/1600746449 (19%)
    QXL pkts/Total pkts : 0/1600746449 (0%)

    Quote Originally Posted by ShadowPeak.com View Post
    fw ctl multik stat
    ID | Active | CPU | Connections | Peak
    ----------------------------------------------
    0 | Yes | 1 | 3852 | 31734
    1 | Yes | 0 | 3513 | 25076

    Quote Originally Posted by ShadowPeak.com View Post
    fw ctl affinity -l -r
    CPU 0: eth1 eth2 Sync
    fw_1
    CPU 1: eth3 eth4 Mgmt
    fw_0
    All: rad vpnd fwd pdpd pepd lpd rtmd mpdaemon cpd cprid

    Quote Originally Posted by ShadowPeak.com View Post
    fw ctl multik get_mode (R77.30) or fw ctl multik dynamic_dispatching get_mode (R80.10+)
    Current mode is Off. I've actually turned this on (another Check Point suggested solution); I just haven't rebooted the firewalls yet. That's this evening's job.

    Quote Originally Posted by ShadowPeak.com View Post
    cpstat os -f multi_cpu -o 1
    Processors load
    ---------------------------------------------------------------------------------
    |CPU#|User Time(%)|System Time(%)|Idle Time(%)|Usage(%)|Run queue|Interrupts/sec|
    ---------------------------------------------------------------------------------
    | 1| 0| 7| 93| 7| ?| 7286|
    | 2| 0| 6| 93| 7| ?| 7286|
    ---------------------------------------------------------------------------------

    Quote Originally Posted by ShadowPeak.com View Post
    cpconfig (the menu displayed by this command)
    Configuration Options:
    ----------------------
    (1) Licenses and contracts
    (2) SNMP Extension
    (3) PKCS#11 Token
    (4) Random Pool
    (5) Secure Internal Communication
    (6) Disable cluster membership for this gateway
    (7) Enable Check Point Per Virtual System State
    (8) Enable Check Point ClusterXL for Bridge Active/Standby
    (9) Disable Check Point SecureXL
    (10) Check Point CoreXL
    (11) Automatic start of Check Point Products

    (12) Exit


    At the moment it's only consuming between 9% and 40% CPU; I'll grab the relevant outputs again when it's maxed out.
    Last edited by RichardPriest; 2018-03-31 at 10:28.

  5. #5
    Join Date
    2009-04-30
    Location
    Colorado, USA
    Posts
    2,252
    Rep Power
    14

    Default Re: Checkpoint 5400 100% CPU usage

    Quote Originally Posted by RichardPriest View Post
    That's annoying; Check Point support suggested enabling it as a potential solution!
    The underlying 5400 processor does not support it at all; SMT is not deliberately disabled by Check Point:


    https://ark.intel.com/products/77775...Cache-3_20-GHz

    total used free shared buffers cached
    Mem: 15812 8474 7338 0 358 4066
    -/+ buffers/cache: 4049 11763
    Swap: 17390 0 17390


    Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
    Mgmt 1500 0 55782830 0 14 14 665957109 0 0 0 BMRU
    Sync 1500 0 64227211 0 0 0 258529161 0 0 0 BMRU
    bond0 1500 0 15387506853 0 393325 393325 12723291410 0 0 0 BMmRU
    bond1 1500 0 54812944762 0 18592 18592 56658898954 0 0 0 BMmRU
    bond1.103 1500 0 22599292224 0 0 0 20259908774 0 0 0 BMmRU
    bond1.104 1500 0 4714192578 0 0 0 5145028503 0 0 0 BMmRU
    bond1.301 1500 0 4713 0 0 0 61354 0 0 0 BMmRU
    bond1.302 1500 0 61196 0 0 0 1525687 0 0 0 BMmRU
    bond1.401 1500 0 12121689 0 0 0 0 0 0 0 BMmRU
    bond1.410 1500 0 27466905116 0 0 0 31249907646 0 0 0 BMmRU
    bond1.411 1500 0 15834510 0 0 0 15271286 0 0 0 BMmRU
    bond1.990 1500 0 1977 0 0 0 2571 0 0 0 BMmRU
    eth1 1500 0 6393994266 0 0 0 6267717651 0 0 0 BMsRU
    eth2 1500 0 8993512627 0 393325 393325 6455573799 0 0 0 BMsRU
    eth3 1500 0 26659358408 0 10289 10289 30857876234 0 0 0 BMsRU
    eth4 1500 0 28153586408 0 8303 8303 25801022770 0 0 0 BMsRU
    lo 16436 0 13644985 0 0 0 13644985 0 0 0 LRU


    fw appi ips identityServer
    Memory and network interfaces look good. However, you have Application Control enabled but not URL Filtering? That's a bit odd, but not related to your performance problem.
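
    One way to put a number on "interfaces look good" is to express RX-DRP as a percentage of RX-OK per interface. The sketch below parses a few rows copied from the netstat -ni output in this thread; the function name is made up for illustration, and no particular threshold is implied.

```python
NETSTAT_NI = """\
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
Mgmt 1500 0 55782830 0 14 14 665957109 0 0 0 BMRU
eth2 1500 0 8993512627 0 393325 393325 6455573799 0 0 0 BMsRU
eth3 1500 0 26659358408 0 10289 10289 30857876234 0 0 0 BMsRU
eth4 1500 0 28153586408 0 8303 8303 25801022770 0 0 0 BMsRU
"""


def rx_drop_percent(netstat_output: str) -> dict[str, float]:
    """Return RX-DRP as a percentage of RX-OK for each interface."""
    lines = [line.split() for line in netstat_output.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    ok_col = header.index("RX-OK")
    drp_col = header.index("RX-DRP")
    result = {}
    for row in rows:
        rx_ok = int(row[ok_col])
        if rx_ok == 0:
            continue  # nothing received, so a ratio is meaningless
        result[row[0]] = 100.0 * int(row[drp_col]) / rx_ok
    return result


for iface, pct in rx_drop_percent(NETSTAT_NI).items():
    print(f"{iface}: {pct:.4f}% of received frames dropped")
```

    For the rows above, every interface comes out below one hundredth of a percent, with eth2 the highest at roughly 0.0044%.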



    Accelerator Status : on
    Accept Templates : enabled
    Drop Templates : disabled
    NAT Templates : disabled by user

    Accelerator Features : Accounting, NAT, Cryptography, Routing,
    HasClock, Templates, Synchronous, IdleDetection,
    Sequencing, TcpStateDetect, AutoExpire,
    DelayedNotif, TcpStateDetectV2, CPLS, McastRouting,
    WireMode, DropTemplates, NatTemplates,
    Streaming, MultiFW, AntiSpoofing, Nac,
    ViolationStats, AsychronicNotif, ERDOS,
    NAT64, GTPAcceleration, SCTPAcceleration,
    McastRoutingV2
    Cryptography Features : Tunnel, UDPEncapsulation, MD5, SHA1, NULL,
    3DES, DES, CAST, CAST-40, AES-128, AES-256,
    ESP, LinkSelection, DynamicVPN, NatTraversal,
    EncRouting, AES-XCBC, SHA256


    Accelerated conns/Total conns : 1073/6949 (15%)
    Delayed conns/(Accelerated conns + PXL conns) : 151/6537 (2%)
    Accelerated pkts/Total pkts : 1272408892/1600746449 (79%)
    F2Fed pkts/Total pkts : 15375115/1600746449 (0%)
    PXL pkts/Total pkts : 312962442/1600746449 (19%)
    QXL pkts/Total pkts : 0/1600746449 (0%)


    ID | Active | CPU | Connections | Peak
    ----------------------------------------------
    0 | Yes | 1 | 3852 | 31734
    1 | Yes | 0 | 3513 | 25076


    CPU 0: eth1 eth2 Sync
    fw_1
    CPU 1: eth3 eth4 Mgmt
    fw_0
    All: rad vpnd fwd pdpd pepd lpd rtmd mpdaemon cpd cprid
    All that looks very good; about 80% of your traffic is accelerated, which is great!
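
    The "about 80%" figure can be reproduced from the raw packet counters in the fwaccel stats -s output. This is a minimal sketch that parses the three packet lines pasted in this thread; the regular expression and function name are illustrative assumptions about this text layout, not part of any Check Point tooling.

```python
import re

FWACCEL_STATS = """\
Accelerated pkts/Total pkts : 1272408892/1600746449 (79%)
F2Fed pkts/Total pkts : 15375115/1600746449 (0%)
PXL pkts/Total pkts : 312962442/1600746449 (19%)
"""

# Matches lines of the form "<label> pkts/Total pkts : <count>/<total>".
LINE = re.compile(r"^(?P<label>.+?) pkts/Total pkts\s*:\s*(?P<num>\d+)/(?P<den>\d+)")


def path_shares(text: str) -> dict[str, float]:
    """Return the percentage of packets handled in each processing path."""
    shares = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            shares[match["label"]] = 100.0 * int(match["num"]) / int(match["den"])
    return shares


for path, pct in path_shares(FWACCEL_STATS).items():
    print(f"{path}: {pct:.2f}%")
```

    With two decimals the split is roughly 79.49% accelerated, 19.55% PXL and 0.96% F2F, so the whole-number percentages printed by the tool appear to be rounded down.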


    Current mode is Off. I've actually turned this on (another Check Point suggested solution); I just haven't rebooted the firewalls yet. That's this evening's job.
    Turning on the Dynamic Dispatcher (DD) may help a little, but it won't make a huge difference since so much of your traffic is accelerated. DD only helps balance traffic that is handled in the PXL/F2F paths.

    Processors load
    ---------------------------------------------------------------------------------
    |CPU#|User Time(%)|System Time(%)|Idle Time(%)|Usage(%)|Run queue|Interrupts/sec|
    ---------------------------------------------------------------------------------
    | 1| 0| 7| 93| 7| ?| 7286|
    | 2| 0| 6| 93| 7| ?| 7286|
    ---------------------------------------------------------------------------------



    Configuration Options:
    ----------------------
    (1) Licenses and contracts
    (2) SNMP Extension
    (3) PKCS#11 Token
    (4) Random Pool
    (5) Secure Internal Communication
    (6) Disable cluster membership for this gateway
    (7) Enable Check Point Per Virtual System State
    (8) Enable Check Point ClusterXL for Bridge Active/Standby
    (9) Disable Check Point SecureXL
    (10) Check Point CoreXL
    (11) Automatic start of Check Point Products

    (12) Exit


    At the moment it's only consuming between 9% and 40% CPU; I'll grab the relevant outputs again when it's maxed out.
    Distributed configuration (good). The CPU was obviously not too busy when these commands were run.

    High CPU *might* be caused by an overloaded sync network between cluster members, in which case you will need to consider selective synchronization of services. To determine that, please provide the output of the following as well:

    fw ctl pstat

    Edit: Using cpview -t go back in time to a known period of high CPU utilization and please report the type of numbers being displayed for Bits/sec, Packets/sec, Connections/sec, & Concurrent connections on the Overview screen.

    Since you suspect iSCSI traffic may be the culprit, make sure that traffic is not getting dragged into the PXL/F2F path by appi (ensure you are not using Any as a destination in APCL/URLF policy) or IPS (can be immediately disabled for new connections with the ips off command for testing). fwaccel conns can be used to verify which path the iSCSI traffic is getting processed in.
    Last edited by ShadowPeak.com; 2018-03-31 at 13:04.

  6. #6
    Join Date
    2018-02-26
    Posts
    12
    Rep Power
    0

    Default Re: Checkpoint 5400 100% CPU usage

    Quote Originally Posted by ShadowPeak.com View Post
    The underlying 5400 processor does not support it at all; SMT is not deliberately disabled by Check Point:


    https://ark.intel.com/products/77775...Cache-3_20-GHz
    Sorry, what I meant was that our support contractors passed this issue on to Check Point, who suggested turning Hyperthreading on! This issue has been going on for far too long, and I've been trying to resolve it myself and research as much as I can, which has led me to this forum.


    Quote Originally Posted by ShadowPeak.com View Post
    fw ctl pstat

    Edit: Using cpview -t go back in time to a known period of high CPU utilization and please report the type of numbers being displayed for Bits/sec, Packets/sec, Connections/sec, & Concurrent connections on the Overview screen.

    Since you suspect iSCSI traffic may be the culprit, make sure that traffic is not getting dragged into the PXL/F2F path by appi (ensure you are not using Any as a destination in APCL/URLF policy) or IPS (can be immediately disabled for new connections with the ips off command for testing). fwaccel conns can be used to verify which path the iSCSI traffic is getting processed in.
    OK, CPU usage is now hovering at around 98% on a Saturday night!

    result of fw ctl pstat is:

    System Capacity Summary:
    Memory used: 8% (1036 MB out of 11763 MB) - below watermark
    Concurrent Connections: 6314 (Unlimited)
    Aggressive Aging is in detect mode

    Hash kernel memory (hmem) statistics:
    Total memory allocated: 1233125376 bytes in 301056 (4096 bytes) blocks using 1 pool
    Total memory bytes used: 146833184 unused: 1086292192 (88.09%) peak: 482980276
    Total memory blocks used: 54460 unused: 246596 (81%) peak: 123144
    Allocations: 233482188 alloc, 0 failed alloc, 232013950 free

    System kernel memory (smem) statistics:
    Total memory bytes used: 1926550188 peak: 1975698744
    Total memory bytes wasted: 3592549
    Blocking memory bytes used: 4424456 peak: 10920712
    Non-Blocking memory bytes used: 1922125732 peak: 1964778032
    Allocations: 7834977 alloc, 0 failed alloc, 7832813 free, 0 failed free
    vmalloc bytes used: 1918308932 expensive: no

    Kernel memory (kmem) statistics:
    Total memory bytes used: 834664984 peak: 1130048568
    Allocations: 241314314 alloc, 0 failed alloc
    239844621 free, 0 failed free
    External Allocations: 1221120 for packets, 86497657 for SXL

    Cookies:
    1362187951 total, 0 alloc, 0 free,
    668 dup, 2940940326 get, 270591025 put,
    1907131572 len, 3108612 cached len, 0 chain alloc,
    0 chain free

    Connections:
    287972655 total, 117303352 TCP, 161463590 UDP, 9188578 ICMP,
    17135 other, 101776 anticipated, 95337 recovered, 6314 concurrent,
    55008 peak concurrent

    Fragments:
    3179698 fragments, 1588725 packets, 139 expired, 0 short,
    0 large, 0 duplicates, 6 failures

    NAT:
    94304039/0 forw, 125589496/0 bckw, 123910519 tcpudp,
    13239464 icmp, 25916962-13992922 alloc

    Sync:
    Version: new
    Status: Able to Send/Receive sync packets
    Sync packets sent:
    total : 241317139, retransmitted : 20019, retrans reqs : 472, acks : 8759
    Sync packets received:
    total : 11324288, were queued : 15220, dropped by net : 1389
    retrans reqs : 15448, received 19126 acks
    retrans reqs for illegal seq : 0
    dropped updates as a result of sync overload: 0
    Callback statistics: handled 1727 cb, average delay : 1, max delay : 16
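
    To judge how busy the sync network is, one simple figure is the share of sent sync packets that had to be retransmitted. The sketch below pulls the two counters out of the "Sync packets sent" line pasted above; the regular expression and function name are assumptions about this text layout, and no official threshold is implied.

```python
import re

PSTAT_SYNC = """\
Sync packets sent:
total : 241317139, retransmitted : 20019, retrans reqs : 472, acks : 8759
"""


def sync_retransmit_percent(text: str) -> float:
    """Percentage of sent sync packets that were retransmitted."""
    match = re.search(r"total\s*:\s*(\d+),\s*retransmitted\s*:\s*(\d+)", text)
    if match is None:
        raise ValueError("no 'Sync packets sent' counters found")
    total, retransmitted = int(match.group(1)), int(match.group(2))
    return 100.0 * retransmitted / total


print(f"retransmitted: {sync_retransmit_percent(PSTAT_SYNC):.4f}% of sent sync packets")
```

    For this output the ratio is about 0.0083%, and the same paste reports zero updates dropped as a result of sync overload.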


    Result of free -m

    total used free shared buffers cached
    Mem: 15812 8479 7333 0 358 4073
    -/+ buffers/cache: 4047 11765
    Swap: 17390 0 17390


    result of cpstat os -f multi_cpu -o 1

    Processors load
    ---------------------------------------------------------------------------------
    |CPU#|User Time(%)|System Time(%)|Idle Time(%)|Usage(%)|Run queue|Interrupts/sec|
    ---------------------------------------------------------------------------------
    | 1| 1| 22| 78| 22| ?| 524|
    | 2| 0| 96| 4| 96| ?| 1048|
    ---------------------------------------------------------------------------------
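
    The per-core figures in this table can be extracted to show how uneven the load is between the two cores. This is a sketch that parses the cpstat table exactly as pasted here; the function name is hypothetical and the parser assumes the pipe-delimited layout shown.

```python
CPSTAT = """\
Processors load
---------------------------------------------------------------------------------
|CPU#|User Time(%)|System Time(%)|Idle Time(%)|Usage(%)|Run queue|Interrupts/sec|
---------------------------------------------------------------------------------
| 1| 1| 22| 78| 22| ?| 524|
| 2| 0| 96| 4| 96| ?| 1048|
---------------------------------------------------------------------------------
"""


def usage_by_cpu(table: str) -> dict[int, int]:
    """Map each CPU number to its Usage(%) value."""
    rows = []
    for line in table.splitlines():
        if not line.lstrip().startswith("|"):
            continue  # skip the title and the dashed ruler lines
        rows.append([cell.strip() for cell in line.strip().strip("|").split("|")])
    header, data = rows[0], rows[1:]
    cpu_col = header.index("CPU#")
    usage_col = header.index("Usage(%)")
    return {int(row[cpu_col]): int(row[usage_col]) for row in data}


usage = usage_by_cpu(CPSTAT)
spread = max(usage.values()) - min(usage.values())
print(usage, f"spread between cores: {spread} points")
```

    Here one core sits at 96% while the other is at 22%, a spread of 74 points, so an averaged reading would understate how saturated the busy core is.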

    This image is a snip of the cpview overview screen. I couldn't copy and paste that screen, and I don't think it would have come out in a nice format anyway.
    [Image attachment: cpview1.PNG]

    This is the network tab:
    [Image attachment: cpview-network.PNG]

    I've run the fwaccel conns command as you suggested, but I'm not really sure how to decipher the output. I get an awful lot of output, more than SecureCRT can handle in its view buffer anyway! Can you explain to me what "PXL/F2F path by appi" means? Apologies if this is a very simple question; I am very new to Check Point firewalls!
    Last edited by RichardPriest; 2018-03-31 at 16:04.

  7. #7
    Join Date
    2018-02-26
    Posts
    12
    Rep Power
    0

    Default Re: Checkpoint 5400 100% CPU usage

    Quote Originally Posted by Bob_Zimmerman View Post
    Where are you seeing CPU usage maxed out? Different tools report different levels of usage as "100%". Some report an average of all cores (so 100% on one core would be reported as 25%, and 100% of four cores would be reported as 100%), while others add the cores together (so 100% on one core would be reported as 100%, but 100% of four cores would be reported as 400%).
    We use SolarWinds to monitor all our kit, and I also SSH to each node in the cluster and run cpview; both report similar numbers.

    Quote Originally Posted by Bob_Zimmerman View Post
    What cluster mode are you running? You can check this with 'cphaprob state'.
    Cluster Mode: High Availability (Active Up) with IGMP Membership

    Number Unique Address Assigned Load State

    1 (local) XXX.XXX.XXX.XXX 100% Active
    2 0% Standby

    Quote Originally Posted by Bob_Zimmerman View Post
    Do you have a separate SmartCenter, or are these firewalls also management servers? To check this, run 'fwm ver'.
    We have a separate management appliance. Running that command gives the following output:

    This is not a Security Management Server station


    Quote Originally Posted by Bob_Zimmerman View Post
    On the active member, what does your RAM usage look like? Check this with the 'free -m' command.
    total used free shared buffers cached
    Mem: 15812 8476 7336 0 358 4065
    -/+ buffers/cache: 4051 11761
    Swap: 17390 0 17390

    Looks like there's plenty free to me?
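
    To check the swapping theory directly, the relevant numbers can be read out of the free -m output: swap in use, and memory used once buffers and page cache are excluded. This sketch assumes the older free layout with a "-/+ buffers/cache:" row, as pasted above; the function and key names are made up for illustration.

```python
FREE_M = """\
total used free shared buffers cached
Mem: 15812 8476 7336 0 358 4065
-/+ buffers/cache: 4051 11761
Swap: 17390 0 17390
"""


def memory_summary(free_output: str) -> dict[str, int]:
    """Extract the figures (in MB) that matter for a swap check."""
    summary = {}
    for line in free_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Mem:":
            summary["total_mb"] = int(fields[1])
        elif fields[0] == "-/+":
            # "-/+ buffers/cache: <used> <free>" excludes buffers and page cache.
            summary["used_excl_cache_mb"] = int(fields[2])
            summary["free_incl_cache_mb"] = int(fields[3])
        elif fields[0] == "Swap:":
            summary["swap_used_mb"] = int(fields[2])
    return summary


print(memory_summary(FREE_M))
```

    With 0 MB of swap in use and about 4 GB of the 15.8 GB actually consumed by processes and the kernel, this output gives no sign of the box swapping.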


    Quote Originally Posted by Bob_Zimmerman View Post
    Depending on what features you have enabled, the boxes may be running low on RAM, which causes them to swap data out to the disk. Swapping data out to disk, then swapping other data back into RAM is a synchronous operation. The time spent doing that gets booked as consumed processor time, even though it isn't really the processor doing any work.
    To be honest, in an effort to reduce the CPU usage we've taken to basically turning 90% of the features off. Only the IPS is really running at the moment.
