CPUG: The Check Point User Group


Thread: R76 fw_worker_0 gets all CPU

  1. #1
    Join Date
    2011-09-22
    Location
    Athens, Greece
    Posts
    10
    Rep Power
    0

    Default R76 fw_worker_0 gets all CPU

    Good day, everyone!

    I'm running Check Point R76 on a Dell R610 server.
    My problem is that the fw_worker_0 process (Command column) shows 99%, 100% or even 102% CPU when I run top from the CLI.
    When I reboot the server, the problem goes away.

    I thought the backup process was causing this. I take a backup every day at 00:10 to an internal TFTP server I have installed.
    The backup .tgz file seems to have the correct size (1.1 GB).

    Is there a way to find out what is causing this process to use so much CPU?

    Which log files should I look into?

    Thank you for your time!

    I include a picture of the top and fw ver output that I captured a while ago.
    [Attached image: Checkpoint 99 Spike.JPG]

  2. #2
    Join Date
    2006-01-25
    Location
    Americas
    Posts
    1,535
    Rep Power
    15

    Default Re: R76 fw_worker_0 gets all CPU

    Do you only have 1 fw_worker (aka CoreXL instance)? With four cores I would suggest at least 2.

    Are you using SecureXL?

    Since it's SI traffic, my primary concern would be throughput. I would suggest finding out who the top talkers are: enable Monitoring on the gateway object (and push policy before the event), then use SmartView Monitor - Traffic - Top connections.

    Some commands to run next time it happens:

    fw tab -t connections -s
    fw ctl multik stat
    fw ctl pstat

    and this is helpful to know regarding configuration:

    fw ctl affinity -l -r -a
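
Since the spike only shows up every few days, it may help to capture all four outputs in one go, with a timestamp, the moment it happens. What follows is a minimal sketch, not a Check Point tool: it assumes Python 3.7+ is available wherever the commands can be run (that may not be true on the gateway itself, where a plain shell loop would do the same job). The names `run_command` and `snapshot` are made up for illustration; only the four command strings come from this thread.

```python
import subprocess
from datetime import datetime

# Commands quoted in the post above.
COMMANDS = [
    "fw tab -t connections -s",
    "fw ctl multik stat",
    "fw ctl pstat",
    "fw ctl affinity -l -r -a",
]


def run_command(command):
    """Run one command and return its combined stdout and stderr as text."""
    try:
        done = subprocess.run(
            command.split(), capture_output=True, text=True, check=False
        )
    except OSError as exc:  # for example, the binary is not on PATH
        return f"(could not run: {exc})\n"
    return done.stdout + done.stderr


def snapshot(commands=COMMANDS, runner=run_command, now=datetime.now):
    """Return one text report with a timestamped header per command."""
    stamp = now().isoformat(timespec="seconds")
    sections = []
    for command in commands:
        sections.append(f"===== {stamp}  {command} =====\n{runner(command)}")
    return "\n".join(sections)
```

Calling `print(snapshot())` during a spike and redirecting the result to a file keeps the outputs together for later comparison. The `runner` and `now` parameters exist only so the function can be tried out without a gateway.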

    You could even go as far as turning on debugging and looking at fwd.elg / vpnd.elg while it's happening.
    It's all in the documentation.

  3. #3
    Join Date
    2011-09-22
    Location
    Athens, Greece
    Posts
    10
    Rep Power
    0

    Default Re: R76 fw_worker_0 gets all CPU

    Thank you very much for your reply, melipla.

    I came in this morning and found the same situation as described above.
    So I just rebooted the firewall, and now the CPU ranges from 6% or 8% up to 20% or 29%.

    My idea that the backup could be causing this was wrong. The spike happens even when the backup is not scheduled (yes, I removed the backup schedule).

    CoreXL is currently disabled. Although I have four cores, my license allows me to use only two. Four cores cost way too much!

    Next time it happens I will run the commands you listed and let you know what happens.

    Thank you once more for your time!

    fwall> fw ctl affinity -l -r -a
    CPU 0:
    CPU 1: fw_0
    CPU 2:
    CPU 3:
    All: eth4 eth0 eth1 eth2 eth3
    in.aufpd vpnd fwm cpsead fgd50 status_proxy usrchkd stormd in.asessiond rtmd in.geod mpdaemon cpca fwd rad cpstat_monitor cprid cpd
    The current license permits the use of CPUs 0, 1 only.
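
The listing above can be read mechanically to see where the firewall instance sits relative to the licensed CPUs. This is a small sketch under stated assumptions: the sample text is copied from the output above (daemon list left out), the function names are hypothetical, and the parsing relies only on the line shapes visible in this post, not on any documented output format.

```python
import re

# Output copied from the post above (daemon list omitted for brevity).
SAMPLE = """\
CPU 0:
CPU 1: fw_0
CPU 2:
CPU 3:
All: eth4 eth0 eth1 eth2 eth3
The current license permits the use of CPUs 0, 1 only.
"""


def parse_affinity(text):
    """Return ({cpu: [entries]}, [licensed cpus]) from the listing."""
    per_cpu = {}
    licensed = []
    for line in text.splitlines():
        cpu_line = re.match(r"\s*CPU (\d+):(.*)$", line)
        if cpu_line:
            per_cpu[int(cpu_line.group(1))] = cpu_line.group(2).split()
            continue
        license_line = re.search(r"permits the use of CPUs ([\d, ]+) only", line)
        if license_line:
            licensed = [int(n) for n in re.findall(r"\d+", license_line.group(1))]
    return per_cpu, licensed


def firewall_instances(per_cpu):
    """Map each fw_N entry to the CPU it is listed under."""
    return {
        entry: cpu
        for cpu, entries in per_cpu.items()
        for entry in entries
        if re.fullmatch(r"fw_\d+", entry)
    }


per_cpu, licensed = parse_affinity(SAMPLE)
instances = firewall_instances(per_cpu)
idle_licensed = [cpu for cpu in licensed if not per_cpu.get(cpu)]
print("firewall instances:", instances)
print("licensed CPUs with nothing pinned:", idle_licensed)
```

For this sample it finds a single instance, fw_0, listed under CPU 1, and nothing pinned to the other licensed CPU. The interfaces under "All:" are not assigned to any one CPU in this listing, so "nothing pinned" should not be read as "idle".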

  4. #4
    Join Date
    2011-09-22
    Location
    Athens, Greece
    Posts
    10
    Rep Power
    0

    Default Re: R76 fw_worker_0 gets all CPU

    Hello Again !

    A week has passed and today I see the same 101% CPU spike.

    I ran all the commands and am posting the results below.

    I'm going to check fwd.elg and vpnd.elg as well and will post if I see anything strange!


    Thank you for your time !!!

    [Expert@fwall:0]# fw tab -t connections -s
    HOST NAME ID #VALS #PEAK #SLINKS
    localhost connections 8158 2046 5748 7303
    [Expert@fwall:0]# fw ctl multik stat
    CoreXL is disabled

    [Expert@fwall:0]# fw ctl pstat

    System Capacity Summary:
    Memory used: 4% (72 MB out of 1587 MB) - below watermark
    Concurrent Connections: 4% (2345 out of 49900) - below watermark
    Aggressive Aging is not active

    Hash kernel memory (hmem) statistics:
    Total memory allocated: 163577856 bytes in 39936 (4096 bytes) blocks using 39 pools
    Total memory bytes used: 20052016 unused: 143525840 (87.74%) peak: 65352704
    Total memory blocks used: 6353 unused: 33583 (84%) peak: 16584
    Allocations: 3088916125 alloc, 0 failed alloc, 3088676892 free

    System kernel memory (smem) statistics:
    Total memory bytes used: 199331008 peak: 206253260
    Total memory bytes wasted: 2433727
    Blocking memory bytes used: 845616 peak: 6223528
    Non-Blocking memory bytes used: 198485392 peak: 200029732
    Allocations: 18642701 alloc, 0 failed alloc, 18641648 free, 0 failed free
    vmalloc bytes used: 3754712 expensive: yes

    Kernel memory (kmem) statistics:
    Total memory bytes used: 55646792 peak: 97913004
    Allocations: 3107558257 alloc, 0 failed alloc
    3107318532 free, 0 failed free
    External Allocations: 1384 for packets, 0 for SXL

    Cookies:
    3493192938 total, 532760337 alloc, 532760337 free,
    103847 dup, 510366182 get, 2426866678 put,
    1706244162 len, 170527 cached len, 532621723 chain alloc,
    532621723 chain free

    Connections:
    26554464 total, 17944965 TCP, 7852957 UDP, 756541 ICMP,
    1 other, 815 anticipated, 6 recovered, 2345 concurrent,
    5748 peak concurrent

    Fragments:
    233348 fragments, 116343 packets, 114 expired, 0 short,
    0 large, 0 duplicates, 0 failures

    NAT:
    342228865/0 forw, 344898322/0 bckw, 686232512 tcpudp,
    884534 icmp, 25486654-18838004 alloc

    Sync: off

    [Expert@fwall:0]#
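
As a quick sanity check on the capacity figures in this output, the percentages can be recomputed from the raw numbers. The sketch below is only arithmetic on lines copied from the output above. The function names are invented for illustration, and comparing the peak against the same 49900 limit is my assumption about how the two numbers relate.

```python
import re

# Lines copied from the `fw ctl pstat` output above.
SAMPLE = """\
Memory used: 4% (72 MB out of 1587 MB) - below watermark
Concurrent Connections: 4% (2345 out of 49900) - below watermark
1 other, 815 anticipated, 6 recovered, 2345 concurrent,
5748 peak concurrent
"""


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def capacity(text):
    """Recompute memory and connection-table usage from the summary lines."""
    mem = re.search(r"Memory used: \d+% \((\d+) MB out of (\d+) MB\)", text)
    conns = re.search(r"Concurrent Connections: \d+% \((\d+) out of (\d+)\)", text)
    peak = re.search(r"(\d+) peak concurrent", text)
    limit = int(conns.group(2))
    return {
        "memory_pct": percent(int(mem.group(1)), int(mem.group(2))),
        "connections_pct": percent(int(conns.group(1)), limit),
        "peak_connections_pct": percent(int(peak.group(1)), limit),
    }


result = capacity(SAMPLE)
for name, value in result.items():
    print(f"{name}: {value:.1f}%")
```

Both current figures come out a little under 5%, and even the peak of 5748 concurrent connections is well under the 49900 limit. On these numbers the spike does not look like a memory or connection-table capacity problem.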

  5. #5
    Join Date
    2006-01-25
    Location
    Americas
    Posts
    1,535
    Rep Power
    15

    Default Re: R76 fw_worker_0 gets all CPU

    The only thing that sticks out is your NAT. You're doing more NAT than you have connections, which could mean some double NAT rules--not a terrible thing but odd. What's the output of this:

    fw tab -t fwx_alloc -s

    I would also try this:

    fw ctl zdebug + drop
    (outputs to console - Ctrl-C to end it)
    and look for unusual activity.

    Check the /var/log/messages file for unusual activity.
    Check $CPDIR/log/cpwd.elg for unusual activity around the start time for the high cpu usage.

    If $FWDIR/log/fwd.elg has nothing in it, you can try this:

    fw ctl debug 0

    That sets all debug flags to their defaults (whereas "fw ctl debug -x" turns them all off). There isn't much in the defaults, so it's safe to run that command; the output goes to the *.elg files in the log directory. Beyond this you're into more advanced FWD debugging, where you need to set flags specific to your problem.

    As an aside, even with a two-core license you should have CoreXL running. I don't think you can get a FW license without CoreXL, so I would recommend you enable it.

    HTH
    It's all in the documentation.

  6. #6
    Join Date
    2011-09-22
    Location
    Athens, Greece
    Posts
    10
    Rep Power
    0

    Default Re: R76 fw_worker_0 gets all CPU

    Good day, melipla, and thank you for your help!
    I'm also thinking of enabling CoreXL, since I have 4 cores and paid for 2. I will do this tonight around 02:00, after the backup process finishes.

    fwall> fw tab -t fwx_alloc -s
    HOST NAME ID #VALS #PEAK #SLINKS
    localhost fwx_alloc 8187 593 1782 0
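
To put this fwx_alloc summary next to the connections table summary posted earlier in the thread, both rows can be parsed the same way. This is a rough sketch: the two rows are copied from the thread, the function name is hypothetical, and the two commands were run on different days, so the ratio is only indicative.

```python
# Summary rows copied from the two `fw tab ... -s` outputs in this thread.
SAMPLE = """\
HOST NAME ID #VALS #PEAK #SLINKS
localhost connections 8158 2046 5748 7303
localhost fwx_alloc 8187 593 1782 0
"""


def parse_tab_summary(text):
    """Return {table name: {id, vals, peak, slinks}} from summary rows."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[0] == "HOST":
            continue  # skip the header and anything that is not a data row
        _host, name, table_id, vals, peak, slinks = fields
        rows[name] = {
            "id": int(table_id),
            "vals": int(vals),
            "peak": int(peak),
            "slinks": int(slinks),
        }
    return rows


tables = parse_tab_summary(SAMPLE)
for name, row in tables.items():
    print(f"{name}: {row['vals']} entries now, {row['peak']} at peak")
ratio = tables["fwx_alloc"]["vals"] / tables["connections"]["vals"]
print(f"fwx_alloc entries per connections entry: {ratio:.2f}")
```

On these two snapshots fwx_alloc holds fewer entries than the connections table, both currently and at peak.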

    In /var/log/messages I found this:

    Nov 20 09:04:03 fwall modprobe: FATAL: Could not open '/lib/modules/2.6.18-92cp/kernel/net/ipv6/ipv6.ko': No such file or directory
    Nov 20 09:04:04 fwall last message repeated 7 times
    Nov 20 09:04:29 fwall syslogd: sendto: Bad file descriptor
    Nov 20 09:04:32 fwall monitord[4387]: SQL error: columns time_stamp, sensor_name are not unique rc=19
    Nov 20 09:04:32 fwall last message repeated 5 times
    Nov 20 09:05:03 fwall modprobe: FATAL: Could not open '/lib/modules/2.6.18-92cp/kernel/net/ipv6/ipv6.ko': No such file or directory
    Nov 20 09:05:04 fwall last message repeated 7 times
    Nov 20 09:05:32 fwall monitord[4387]: SQL error: columns time_stamp, sensor_name are not unique rc=19
    Nov 20 09:05:32 fwall last message repeated 5 times
    Nov 20 09:06:03 fwall modprobe: FATAL: Could not open '/lib/modules/2.6.18-92cp/kernel/net/ipv6/ipv6.ko': No such file or directory
    Nov 20 09:06:04 fwall last message repeated 7 times
    Nov 20 09:06:32 fwall monitord[4387]: SQL error: columns time_stamp, sensor_name are not unique rc=19
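
Because syslog folds duplicates into "last message repeated N times", the real frequency of each error is easy to underestimate. The sketch below expands those repeat lines and counts messages per program. It uses the first minute of the excerpt above as sample data, and the regular expressions assume only the line layout visible in that excerpt.

```python
import re
from collections import Counter

# First minute of the /var/log/messages excerpt above.
SAMPLE = """\
Nov 20 09:04:03 fwall modprobe: FATAL: Could not open '/lib/modules/2.6.18-92cp/kernel/net/ipv6/ipv6.ko': No such file or directory
Nov 20 09:04:04 fwall last message repeated 7 times
Nov 20 09:04:29 fwall syslogd: sendto: Bad file descriptor
Nov 20 09:04:32 fwall monitord[4387]: SQL error: columns time_stamp, sensor_name are not unique rc=19
Nov 20 09:04:32 fwall last message repeated 5 times
"""

LINE = re.compile(r"^\w{3} +\d+ [\d:]{8} (\S+) (.*)$")
REPEAT = re.compile(r"^last message repeated (\d+) times$")


def count_messages(text):
    """Count messages per program, expanding 'last message repeated' lines."""
    counts = Counter()
    previous = None
    for raw in text.splitlines():
        match = LINE.match(raw)
        if not match:
            continue
        message = match.group(2)
        repeat = REPEAT.match(message)
        if repeat:
            if previous is not None:
                counts[previous] += int(repeat.group(1))
            continue
        # Key on the program name: the text before the first ':' or '['.
        previous = re.split(r"[:\[]", message, maxsplit=1)[0]
        counts[previous] += 1
    return counts


counts = count_messages(SAMPLE)
for program, total in counts.most_common():
    print(f"{total:3d}  {program}")
```

Run over the full excerpt, the same logic would show the modprobe error recurring every minute (09:04, 09:05, 09:06), each time followed by seven repeats.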

    In $FWDIR/log, the fwd.elg file shows this message:
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory

    The fwd.elg has a ton of entries and no timestamps. I will reboot the server tonight, see if I can get the information sorted, and get back to you!

    Thank you once again for your time!

  7. #7
    Join Date
    2011-09-22
    Location
    Athens, Greece
    Posts
    10
    Rep Power
    0

    Default Re: R76 fw_worker_0 gets all CPU

    Good day, everyone!

    The CPU went up to 100% again. (I rebooted on November 21 around 02:00.)
    In my company I use QoS.
    When I try to enable CoreXL, it disables QoS. I need QoS, so I had to disable CoreXL again and reboot.
    (I first disabled QoS, installed the policy, enabled CoreXL from cpconfig and rebooted. Then I tried to enable QoS again, but it wouldn't let me.)

    There is no cpwd.elg file. The files I see are cpca.elg, cphttp.elg, cplmd.elg and cpstat_monitor.elg.

    The fwd.elg gives me some of these errors:

    wsdns_reconf: wsdns_reconf called
    wsdns_reconf: proxy_enable_override_settings is false, searching for default_proxy_settings
    wsdns_reconf: proxy is off or next proxy is on; clear CDnsHandler
    fw_ciu_reconf:
    appi_top_counters_reconf: appi_top_counters_reconf called
    appi_top_counters_init: called, env=(0x89226f0)
    fcall: cpFileExist /opt/CPsuite-R76/fw1/bin/xrm failed
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    fw_get_kernel6_instance_num_ctx: Invalid instance num 0 - using only one

    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    coreXL_aff_handler: This is a cb respond to: FW1_INSTALLED msg
    coreXL_aff_handler: User has not enabled auto core affinity
    Unable to open '/dev/fw6v0': No such file or directory
    fw_get_kernel6_instance_num_ctx: Invalid instance num 0 - using only one

    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    is_urlf_ssl_enabled: advanced_uf_blade installed
    fwd_reload_event: calling Connect Control fwbalance_init_ping
    Unable to open '/dev/fw6v0': No such file or directory
    fwarp_get_arp_interface: no interface found on same subnet as valid ip address: xxx.xxx.xx.x (the xxx marks where public IPs were)
    fwarp_make_arp_entry: can't find arp interface for address: xxx.xxx.xx.x
    fwarp_get_arp_interface: no interface found on same subnet as valid ip address: xxx.xxx.xx.xx
    fwarp_make_arp_entry: can't find arp interface for address: xxx.xxx.xx.xx
    fwarp_get_arp_interface: no interface found on same subnet as valid ip address: xxx.xxx.xx.xx
    fwarp_make_arp_entry: can't find arp interface for address: xxx.xxx.x.xx
    Unable to open '/dev/fw6v0': No such file or directory
    wsdns_reconf: wsdns_reconf called
    wsdns_reconf: proxy_enable_override_settings is false, searching for default_proxy_settings
    wsdns_reconf: proxy is off or next proxy is on; clear CDnsHandler
    fw_ciu_reconf:
    appi_top_counters_reconf: appi_top_counters_reconf called
    appi_top_counters_init: called, env=(0x89226f0)
    fcall: cpFileExist /opt/CPsuite-R76/fw1/bin/xrm failed
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    fw_get_kernel6_instance_num_ctx: Invalid instance num 0 - using only one
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    coreXL_aff_handler: This is a cb respond to: FW1_INSTALLED msg
    coreXL_aff_handler: User has not enabled auto core affinity
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory
    Unable to open '/dev/fw6v0': No such file or directory

  8. #8
    Join Date
    2016-10-19
    Posts
    43
    Rep Power
    0

    Default Re: R76 fw_worker_0 gets all CPU

    Hello

    We had this same situation: a 4-core gateway with only a 2-core license. You could be running into sk110422. What we did to temporarily fix the problem was to change the automatic SIM affinity to static. I know CP does not recommend this, but it did fix our issue, at least temporarily. We moved eth4 and eth5 to CPU1, and the other 5 interfaces are still on CPU0. I guess you need to edit the $FWDIR/conf/fwaffinity.conf file to make these changes, so that typically both CPUs are running SND + fw_worker.

    This is how ours is set up currently:
    i eth4 1
    i eth5 1
    i default 0
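
For anyone checking a file like this before applying it, the three entries can be read as "interface name to CPU, with a default for the rest". This is a minimal sketch. The field meaning (type, name, CPU) is inferred from the post's description rather than taken from Check Point documentation, the function names are hypothetical, and only the `i` lines shown above are handled.

```python
# Entries copied from the post above.
SAMPLE = """\
i eth4 1
i eth5 1
i default 0
"""


def parse_interface_affinity(text):
    """Return ({interface: cpu}, default_cpu) from 'i <name> <cpu>' lines."""
    pinned = {}
    default_cpu = None
    for raw in text.splitlines():
        fields = raw.split()
        if len(fields) != 3 or fields[0] != "i":
            continue  # ignore blank lines and entry types not shown in the post
        _kind, name, cpu = fields
        value = int(cpu) if cpu.isdigit() else cpu  # keep non-numeric values as text
        if name == "default":
            default_cpu = value
        else:
            pinned[name] = value
    return pinned, default_cpu


def cpu_for(interface, pinned, default_cpu):
    """Return the CPU an interface would use under this configuration."""
    return pinned.get(interface, default_cpu)


pinned, default_cpu = parse_interface_affinity(SAMPLE)
for nic in ["eth0", "eth4", "eth5"]:
    print(nic, "-> CPU", cpu_for(nic, pinned, default_cpu))
```

This matches the description in the post: eth4 and eth5 go to CPU 1, and every other interface falls back to CPU 0.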

    I would add that it is always better to get the 4-core license and run 3 fw_workers + 1 SND, which we are going to do shortly.

    Thanks.
