CPUG: The Check Point User Group

Resources for the Check Point Community, by the Check Point Community.




Thread: RCV Overruns on bond interface

  1. #1 by slowfood27 (Zurich, Switzerland)

    RCV Overruns on bond interface

    On a clustered 15400 appliance (R77.30, HFA 312), we observed a receive-overrun rate of 0.008 percent on the external interface eth2-08. The interface runs at 1 Gbps and is connected to a Cisco Nexus 7000 switch; on the switch port, there is no indication of any errors.
    The interface eth2-08 shows an average utilization of 40%, with peaks up to 900 Mbps.
    To get rid of the overruns, we created a bond interface from eth2-07 and eth2-08. Unfortunately, the overruns are still there, although the rate has dropped to 0.002 percent.
    What are we missing?
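    For reference, the reported 0.008 percent can be recomputed from the netstat -ni counters quoted later in the thread: RX-OVR divided by (RX-OK + RX-OVR). A minimal sketch, with the eth2-08 counters pasted in as sample input (on a live gateway you would pipe the real netstat -ni output into awk instead):

```shell
# Receive-overrun rate for eth2-08: RX-OVR / (RX-OK + RX-OVR).
# The here-doc holds counters copied from this thread; replace it with
# live "netstat -ni" output on the gateway.
awk '$1 == "eth2-08" { printf "%.4f%%\n", 100 * $7 / ($4 + $7) }' <<'EOF'
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth2-08 1500 0 77597827521 0 6179339 6179339 60800679143 0 0 0 BMsRU
EOF
```

    This prints 0.0080%, matching the observed rate.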

  2. #2 by ShadowPeak.com (Colorado, USA)

    Re: RCV Overruns on bond interface

    Quote Originally Posted by slowfood27 in post #1 (quote trimmed)
    Please provide the output of netstat -ni and ethtool -S <interface> for all physical interfaces in the bond for further analysis.

    How is your bond interface set for load balancing of traffic among the physical interfaces, by MAC or Layer3/4?
    --
    Second Edition of my "Max Power" Firewall Book
    Now Available at http://www.maxpowerfirewalls.com

  3. #3 by slowfood27 (Zurich, Switzerland)

    Re: RCV Overruns on bond interface

    The load-balancing method is the default (Layer 2).
    The physical interfaces in bond1 are eth2-07 and eth2-08.
    Comparing the number of packets on both physical interfaces over the last 24 hours, we get:

    19.7.2018 08:45
    ==========
    Delta since 18.7.2018 09:00
    eth2-07

    Delta Receive Packets: 585'293'769
    Delta transmit packets: 536'643'642

    eth2-08

    Delta Receive Packets: 956'808'875
    Delta transmit packets: 584'837'108

    Below you'll find the requested info.

    [Expert@1100prfw101:0]# netstat -ni
    Kernel Interface table
    Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
    Mgmt 1500 0 630323508 0 178 178 644749044 0 0 0 BMRU
    Sync 1500 0 445014573 0 21756 21756 10584623474 0 0 0 BMRU
    bond1 1500 0 80205346509 0 6238477 6238477 63173952466 0 0 0 BMmRU
    eth2-01 1500 0 73928112 0 0 0 40952856 0 0 0 BMRU
    eth2-01.608 1500 0 47865288 0 0 0 40952852 0 0 0 BMRU
    eth2-07 1500 0 2607519440 0 59138 59138 2373273643 0 0 0 BMsRU
    eth2-08 1500 0 77597827521 0 6179339 6179339 60800679143 0 0 0 BMsRU

    eth3-01 1500 0 79541744718 0 1270139 0 109969341527 0 0 0 BMRU
    eth3-02 1500 0 352753854576 0 207619 0 334735808367 0 0 0 BMRU
    eth3-02.90 1500 0 69694565 0 0 0 86732130 0 0 0 BMRU
    eth3-02.101 1500 0 120057367038 0 0 0 115192710389 0 0 0 BMRU
    eth3-02.107 1500 0 464197451 0 0 0 88830665 0 0 0 BMRU
    eth3-02.223 1500 0 2538501934 0 0 0 4000727853 0 0 0 BMRU
    eth3-02.225 1500 0 23360788334 0 0 0 28378636496 0 0 0 BMRU
    eth3-02.325 1500 0 81226557786 0 0 0 58875231843 0 0 0 BMRU
    eth3-02.327 1500 0 7266208318 0 0 0 6097412792 0 0 0 BMRU
    eth3-02.328 1500 0 288420337 0 0 0 777387994 0 0 0 BMRU
    eth3-02.604 1500 0 1179938865 0 0 0 981190183 0 0 0 BMRU
    eth3-02.605 1500 0 53366518192 0 0 0 83239808947 0 0 0 BMRU
    eth3-02.701 1500 0 62673458841 0 0 0 36544705250 0 0 0 BMRU
    eth3-02.702 1500 0 134966176 0 0 0 140591697 0 0 0 BMRU
    eth3-02.703 1500 0 84716118 0 0 0 38651443 0 0 0 BMRU
    eth3-02.704 1500 0 34863636 0 0 0 29473688 0 0 0 BMRU
    eth3-02.705 1500 0 3304442 0 0 0 6716178 0 0 0 BMRU
    eth3-02.706 1500 0 3304417 0 0 0 6716228 0 0 0 BMRU
    lo 16436 0 26157347 0 0 0 26157347 0 0 0 LRU
    [Expert@1100prfw101:0]#

    [Expert@1100prfw101:0]# ethtool -S eth2-07
    NIC statistics:
    rx_packets: 2608438208
    tx_packets: 2374446541
    rx_bytes: 2447144126782
    tx_bytes: 1174986289607
    rx_broadcast: 176719
    tx_broadcast: 9450
    rx_multicast: 565605
    tx_multicast: 18926
    multicast: 565605
    collisions: 0
    rx_crc_errors: 0
    rx_no_buffer_count: 854
    rx_missed_errors: 59138
    tx_aborted_errors: 0
    tx_carrier_errors: 0
    tx_window_errors: 0
    tx_abort_late_coll: 0
    tx_deferred_ok: 0
    tx_single_coll_ok: 0
    tx_multi_coll_ok: 0
    tx_timeout_count: 0
    rx_long_length_errors: 0
    rx_short_length_errors: 0
    rx_align_errors: 0
    tx_tcp_seg_good: 0
    tx_tcp_seg_failed: 0
    rx_flow_control_xon: 0
    rx_flow_control_xoff: 0
    tx_flow_control_xon: 0
    tx_flow_control_xoff: 0
    rx_long_byte_count: 2447144126782
    tx_dma_out_of_sync: 0
    lro_aggregated: 0
    lro_flushed: 0
    lro_recycled: 0
    tx_smbus: 0
    rx_smbus: 0
    dropped_smbus: 0
    os2bmc_rx_by_bmc: 0
    os2bmc_tx_by_bmc: 0
    os2bmc_tx_by_host: 0
    os2bmc_rx_by_host: 0
    rx_errors: 0
    tx_errors: 0
    tx_dropped: 0
    rx_length_errors: 0
    rx_over_errors: 0
    rx_frame_errors: 0
    rx_fifo_errors: 59138
    tx_fifo_errors: 0
    tx_heartbeat_errors: 0
    tx_queue_0_packets: 2374446542
    tx_queue_0_bytes: 1161756422212
    tx_queue_0_restart: 0
    rx_queue_0_packets: 2608438208
    rx_queue_0_bytes: 2436710373950
    rx_queue_0_drops: 0
    rx_queue_0_csum_err: 1023
    rx_queue_0_alloc_failed: 0
    [Expert@1100prfw101:0]#

    [Expert@1100prfw101:0]# ethtool -S eth2-08
    NIC statistics:
    rx_packets: 77601344472
    tx_packets: 60802237357
    rx_bytes: 70937387802886
    tx_bytes: 29272549152054
    rx_broadcast: 874019
    tx_broadcast: 100666
    rx_multicast: 14765246
    tx_multicast: 6625122
    multicast: 14765246
    collisions: 0
    rx_crc_errors: 0
    rx_no_buffer_count: 255176
    rx_missed_errors: 6179371
    tx_aborted_errors: 0
    tx_carrier_errors: 0
    tx_window_errors: 0
    tx_abort_late_coll: 0
    tx_deferred_ok: 0
    tx_single_coll_ok: 0
    tx_multi_coll_ok: 0
    tx_timeout_count: 0
    rx_long_length_errors: 0
    rx_short_length_errors: 0
    rx_align_errors: 0
    tx_tcp_seg_good: 0
    tx_tcp_seg_failed: 0
    rx_flow_control_xon: 0
    rx_flow_control_xoff: 0
    tx_flow_control_xon: 0
    tx_flow_control_xoff: 0
    rx_long_byte_count: 70937387802886
    tx_dma_out_of_sync: 0
    lro_aggregated: 0
    lro_flushed: 0
    lro_recycled: 0
    tx_smbus: 0
    rx_smbus: 0
    dropped_smbus: 0
    os2bmc_rx_by_bmc: 0
    os2bmc_tx_by_bmc: 0
    os2bmc_tx_by_host: 0
    os2bmc_rx_by_host: 0
    rx_errors: 0
    tx_errors: 0
    tx_dropped: 0
    rx_length_errors: 0
    rx_over_errors: 0
    rx_frame_errors: 0
    rx_fifo_errors: 6179371
    tx_fifo_errors: 0
    tx_heartbeat_errors: 0
    tx_queue_0_packets: 60802237358
    tx_queue_0_bytes: 28929492900508
    tx_queue_0_restart: 73
    rx_queue_0_packets: 77601344476
    rx_queue_0_bytes: 70626982431054
    rx_queue_0_drops: 0
    rx_queue_0_csum_err: 34616
    rx_queue_0_alloc_failed: 0
    [Expert@1100prfw101:0]#

  4. #4 by ShadowPeak.com (Colorado, USA)

    Re: RCV Overruns on bond interface

    OK, I've seen this before: netstat -ni increments RX-DRP and RX-OVR in lockstep, so from that output alone it is impossible to determine whether the drop issue is a ring buffer overflow (RX-DRP) or a NIC hardware buffering drop (RX-OVR). However, the output of ethtool -S reveals:

    On eth2-07 there have been 854 overruns in the NIC hardware (rx_no_buffer_count) and 59138 ring buffer misses (rx_fifo_errors/rx_missed_errors). On eth2-08 there have been 255176 NIC hardware overruns and 6179371 ring buffer misses. Based on those counters, and on the netstat -ni numbers showing a much larger number of frames passing through eth2-08, you may want to change the distribution algorithm used by your bond. If this network is some kind of transit VLAN that carries only a core router and the firewalls, using L2 (MAC addresses) for bond load balancing is inappropriate, as almost all of the traffic will pummel one physical interface. I'd suggest using L3/L4 to balance the bond traffic before any other tuning.
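    To see why L2 hashing pins a transit network to one link, here is a simplified sketch of the bonding driver's layer2 transmit hash (an XOR of the low MAC octets, modulo the slave count; the octet values below are made-up examples). With a single core-router MAC on the far side, every flow yields the same result, i.e. the same physical interface:

```shell
# Simplified layer2 xmit hash: XOR of the last MAC octets, modulo the
# number of bond slaves. Octet values are hypothetical examples.
fw_mac_last=0x33     # firewall-side MAC, last octet (example)
rtr_mac_last=0x01    # core-router MAC, last octet (example)
slaves=2
echo $(( (fw_mac_last ^ rtr_mac_last) % slaves ))   # one result, one link
```

    Note that the switch chooses the inbound distribution with its own port-channel hash, so the same reasoning applies on the Nexus side.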
    Last edited by ShadowPeak.com; 2018-07-19 at 08:57.

  5. #5 by slowfood27 (Zurich, Switzerland)

    Re: RCV Overruns on bond interface

    Quote Originally Posted by ShadowPeak.com in post #4 (quote trimmed)
    Don't be misled by the much higher packet count on eth2-08 compared to eth2-07: eth2-08 was the single physical interface before we introduced the bond, and we never reset the counters, restarted the interfaces, or rebooted the member. eth2-08 therefore shows much more historical data than eth2-07, which was not in use before we started bonding.

    Tonight we will raise the rx ring size on both physical interfaces from the current 256 to 1024, reboot both members, and watch the counters afterwards.
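    The planned change would look roughly like this with plain ethtool (interface names from this thread; check the supported maximum first, and note that on Check Point gateways a bare ethtool change does not survive a reboot, so it is normally made persistent through the Gaia configuration):

```shell
# Show the pre-set maximum and current RX ring size, then raise it.
ethtool -g eth2-07            # compare "Pre-set maximums" vs current values
ethtool -G eth2-07 rx 1024    # raise RX ring from 256 to 1024
ethtool -G eth2-08 rx 1024
```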

  6. #6 by slowfood27 (Zurich, Switzerland)

    Re: RCV Overruns on bond interface

    In the meantime, we have this situation:

    netstat -ni
    Kernel Interface table
    Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
    Mgmt 1500 0 40148390 0 0 0 43747539 0 0 0 BMRU
    Sync 1500 0 33414868 0 5467 5467 302832853 0 0 0 BMRU
    bond1 1500 0 5039522743 0 14626 14626 4048791046 0 0 0 BMmRU
    eth2-01 1500 0 5094431 0 0 0 2786461 0 0 0 BMRU
    eth2-01.608 1500 0 3178861 0 0 0 2786461 0 0 0 BMRU
    eth2-07 1500 0 2084924913 0 4965 4965 1932377075 0 0 0 BMsRU
    eth2-08 1500 0 2954598293 0 9661 9661 2116414316 0 0 0 BMsRU

    eth3-01 1500 0 5139415631 0 50533 0 6725680446 0 0 0 BMRU
    eth3-02 1500 0 22033362164 0 8583 0 21354359895 0 0 0 BMRU
    eth3-02.90 1500 0 4840431 0 0 0 6015613 0 0 0 BMRU
    eth3-02.101 1500 0 7450562027 0 0 0 7261225671 0 0 0 BMRU
    eth3-02.107 1500 0 24398417 0 0 0 5190613 0 0 0 BMRU
    eth3-02.223 1500 0 199383370 0 0 0 307946810 0 0 0 BMRU
    eth3-02.225 1500 0 1504636320 0 0 0 1532580487 0 0 0 BMRU
    eth3-02.325 1500 0 5944438936 0 0 0 3266402326 0 0 0 BMRU
    eth3-02.327 1500 0 360327255 0 0 0 446689022 0 0 0 BMRU
    eth3-02.328 1500 0 11308897 0 0 0 29785791 0 0 0 BMRU
    eth3-02.604 1500 0 78035619 0 0 0 65785207 0 0 0 BMRU
    eth3-02.605 1500 0 2551010547 0 0 0 6029425262 0 0 0 BMRU
    eth3-02.701 1500 0 3889630350 0 0 0 2377764285 0 0 0 BMRU
    eth3-02.702 1500 0 5867968 0 0 0 6945738 0 0 0 BMRU
    eth3-02.703 1500 0 5803436 0 0 0 2488867 0 0 0 BMRU
    eth3-02.704 1500 0 2546316 0 0 0 1936694 0 0 0 BMRU
    eth3-02.705 1500 0 247639 0 0 0 503545 0 0 0 BMRU
    eth3-02.706 1500 0 247644 0 0 0 503558 0 0 0 BMRU
    lo 16436 0 1873116 0 0 0 1873116 0 0 0 LRU

    The traffic on the bond1 interface now seems to be reasonably balanced between eth2-07 and eth2-08, but we still see a small number of receive overruns. Why should a change from L2 to L3/L4 load balancing be better in terms of RCV-OVR?
    It is important to note that ALL traffic leaving the firewall on bond1 is NATed to a maximum of 7 different IP addresses.
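    For what it's worth, the RX-OK counters above put the split at about 41/59, and the bond1 drop rate at roughly 0.0003 percent. A quick check with the pasted counters:

```shell
# Share of received packets per bond member, from the RX-OK counters in
# the netstat -ni output above.
awk 'BEGIN {
    e07 = 2084924913; e08 = 2954598293      # RX-OK: eth2-07, eth2-08
    printf "%.0f%% / %.0f%%\n", 100 * e07 / (e07 + e08), 100 * e08 / (e07 + e08)
}'
```

    This prints "41% / 59%".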

  7. #7 by ShadowPeak.com (Colorado, USA)

    Re: RCV Overruns on bond interface

    Quote Originally Posted by slowfood27 in post #6 (quote trimmed)
    A change to L3/L4 load balancing on the switch should help balance inbound traffic across the firewall interfaces and help avoid RX-OVR. However, you need to provide ethtool -S output for eth2-07 and eth2-08, as the netstat output does not distinguish between RX-OVR and RX-DRP, which are two very different things. Also, please provide output from the following commands:

    fwaccel stat
    fwaccel stats -s
    grep -c ^processor /proc/cpuinfo
    /sbin/cpuinfo
    fw ctl affinity -l -r
    sim affinity -l
    enabled_blades

  8. #8 by slowfood27 (Zurich, Switzerland)

    Re: RCV Overruns on bond interface

    Many thanks for your quick response.
    Here we go:
    [Expert@1100prfw101:0]# ethtool -S eth2-07
    NIC statistics:
    rx_packets: 2186714449
    tx_packets: 2028555817
    rx_bytes: 2003239137553
    tx_bytes: 1102808832413
    rx_broadcast: 170654
    tx_broadcast: 8389
    rx_multicast: 465023
    tx_multicast: 16778
    multicast: 465023
    collisions: 0
    rx_crc_errors: 0
    rx_no_buffer_count: 513
    rx_missed_errors: 9563
    tx_aborted_errors: 0
    tx_carrier_errors: 0
    tx_window_errors: 0
    tx_abort_late_coll: 0
    tx_deferred_ok: 0
    tx_single_coll_ok: 0
    tx_multi_coll_ok: 0
    tx_timeout_count: 0
    rx_long_length_errors: 0
    rx_short_length_errors: 0
    rx_align_errors: 0
    tx_tcp_seg_good: 0
    tx_tcp_seg_failed: 0
    rx_flow_control_xon: 0
    rx_flow_control_xoff: 0
    tx_flow_control_xon: 0
    tx_flow_control_xoff: 0
    rx_long_byte_count: 2003239137553
    tx_dma_out_of_sync: 0
    lro_aggregated: 0
    lro_flushed: 0
    lro_recycled: 0
    tx_smbus: 0
    rx_smbus: 0
    dropped_smbus: 0
    os2bmc_rx_by_bmc: 0
    os2bmc_tx_by_bmc: 0
    os2bmc_tx_by_host: 0
    os2bmc_rx_by_host: 0
    rx_errors: 0
    tx_errors: 0
    tx_dropped: 0
    rx_length_errors: 0
    rx_over_errors: 0
    rx_frame_errors: 0
    rx_fifo_errors: 9563
    tx_fifo_errors: 0
    tx_heartbeat_errors: 0
    tx_queue_0_packets: 2028555817
    tx_queue_0_bytes: 1091722984683
    tx_queue_0_restart: 0
    rx_queue_0_packets: 2186714449
    rx_queue_0_bytes: 1994492279757
    rx_queue_0_drops: 0
    rx_queue_0_csum_err: 1259
    rx_queue_0_alloc_failed: 0
    [Expert@1100prfw101:0]# ethtool -S eth2-08
    NIC statistics:
    rx_packets: 3101182050
    tx_packets: 2220436282
    rx_bytes: 2677874866218
    tx_bytes: 1150438889217
    rx_broadcast: 33107
    tx_broadcast: 0
    rx_multicast: 714304
    tx_multicast: 519960
    multicast: 714304
    collisions: 0
    rx_crc_errors: 0
    rx_no_buffer_count: 237
    rx_missed_errors: 12437
    tx_aborted_errors: 0
    tx_carrier_errors: 0
    tx_window_errors: 0
    tx_abort_late_coll: 0
    tx_deferred_ok: 0
    tx_single_coll_ok: 0
    tx_multi_coll_ok: 0
    tx_timeout_count: 0
    rx_long_length_errors: 0
    rx_short_length_errors: 0
    rx_align_errors: 0
    tx_tcp_seg_good: 0
    tx_tcp_seg_failed: 0
    rx_flow_control_xon: 0
    rx_flow_control_xoff: 0
    tx_flow_control_xon: 0
    tx_flow_control_xoff: 0
    rx_long_byte_count: 2677874866218
    tx_dma_out_of_sync: 0
    lro_aggregated: 0
    lro_flushed: 0
    lro_recycled: 0
    tx_smbus: 0
    rx_smbus: 0
    dropped_smbus: 0
    os2bmc_rx_by_bmc: 0
    os2bmc_tx_by_bmc: 0
    os2bmc_tx_by_host: 0
    os2bmc_rx_by_host: 0
    rx_errors: 0
    tx_errors: 0
    tx_dropped: 0
    rx_length_errors: 0
    rx_over_errors: 0
    rx_frame_errors: 0
    rx_fifo_errors: 12437
    tx_fifo_errors: 0
    tx_heartbeat_errors: 0
    tx_queue_0_packets: 2220436284
    tx_queue_0_bytes: 1138483719821
    tx_queue_0_restart: 22
    rx_queue_0_packets: 3101182052
    rx_queue_0_bytes: 2665470141046
    rx_queue_0_drops: 0
    rx_queue_0_csum_err: 796
    rx_queue_0_alloc_failed: 0
    [Expert@1100prfw101:0]# fwaccel stat
    Accelerator Status : on
    Accept Templates : disabled by Firewall
    disabled from rule #472
    Drop Templates : disabled
    NAT Templates : disabled by user

    Accelerator Features : Accounting, NAT, Cryptography, QOS, Routing,
    HasClock, Templates, Synchronous, IdleDetection,
    Sequencing, TcpStateDetect, AutoExpire,
    DelayedNotif, TcpStateDetectV2, CPLS, McastRouting,
    WireMode, DropTemplates, NatTemplates,
    Streaming, MultiFW, AntiSpoofing, Nac,
    ViolationStats, AsychronicNotif, ERDOS,
    NAT64, GTPAcceleration, SCTPAcceleration,
    McastRoutingV2
    Cryptography Features : Tunnel, UDPEncapsulation, MD5, SHA1, NULL,
    3DES, DES, CAST, CAST-40, AES-128, AES-256,
    ESP, LinkSelection, DynamicVPN, NatTraversal,
    EncRouting, AES-XCBC, SHA256
    [Expert@1100prfw101:0]# fwaccel stats -s
    Accelerated conns/Total conns : 48121/142916 (33%)
    Delayed conns/(Accelerated conns + PXL conns) : 20519/127364 (16%)
    Accelerated pkts/Total pkts : 266291390/808643189 (32%)
    F2Fed pkts/Total pkts : 9981174/808643189 (1%)
    PXL pkts/Total pkts : 532370625/808643189 (65%)
    QXL pkts/Total pkts : 306184141/808643189 (37%)
    [Expert@1100prfw101:0]# grep -c ^processor /proc/cpuinfo
    8
    [Expert@1100prfw101:0]# /sbin/cpuinfo
    HyperThreading=disabled
    [Expert@1100prfw101:0]# fw ctl affinity -l -r
    CPU 0: eth2-01 eth3-01 eth2-07 eth2-08
    CPU 1: eth3-02 Mgmt Sync
    CPU 2: fw_5
    CPU 3: fw_4
    CPU 4: fw_3
    CPU 5: fw_2
    CPU 6: fw_1
    CPU 7: fw_0
    All: in.msd usrchkd in.acapd vpnd rad fgd50 lpd mpdaemon rtmd fwd cpd cprid
    [Expert@1100prfw101:0]# sim affinity -l
    Mgmt : 1
    Sync : 1
    eth2-01 : 0
    eth2-07 : 0
    eth2-08 : 0
    eth3-01 : 0
    eth3-02 : 1
    [Expert@1100prfw101:0]# enabled_blades
    fw vpn ips anti_bot

  9. #9 by ShadowPeak.com (Colorado, USA)

    Re: RCV Overruns on bond interface

    Quote Originally Posted by slowfood27 in post #8 (excerpted)
    [Expert@1100prfw101:0]# ethtool -S eth2-07
    NIC statistics:
    rx_no_buffer_count: 513
    rx_missed_errors: 9563
    [Expert@1100prfw101:0]# ethtool -S eth2-08
    NIC statistics:
    rx_no_buffer_count: 237
    rx_missed_errors: 12437

    [Expert@1100prfw101:0]# fwaccel stats -s
    Accelerated conns/Total conns : 48121/142916 (33%)
    Delayed conns/(Accelerated conns + PXL conns) : 20519/127364 (16%)
    Accelerated pkts/Total pkts : 266291390/808643189 (32%)
    F2Fed pkts/Total pkts : 9981174/808643189 (1%)
    PXL pkts/Total pkts : 532370625/808643189 (65%)
    QXL pkts/Total pkts : 306184141/808643189 (37%)

    [Expert@1100prfw101:0]# sim affinity -l
    Mgmt : 1
    Sync : 1
    eth2-01 : 0
    eth2-07 : 0
    eth2-08 : 0
    eth3-01 : 0
    eth3-02 : 1
    [Expert@1100prfw101:0]# enabled_blades
    fw vpn ips anti_bot
    The main issue is RX-DRPs (rx_missed_errors), which indicate insufficient CPU resources on the SND/IRQ cores (CPUs 0 & 1) to empty the interface ring buffers in a timely fashion, although the drop percentage is well under 0.1%. Full ring buffers can back up into the NIC hardware buffers, where RX-OVR (rx_no_buffer_count) occurs. Given the blades you have enabled and the amount of fully-accelerated traffic (32%), I'd recommend decreasing the number of Firewall Workers (kernel instances) from 6 to 4 via cpconfig. This will result in a 4/4 split of SND/IRQ cores vs. Firewall Workers, and the eth2-07/eth2-08 interfaces will end up on their own dedicated SND/IRQ cores via automatic interface affinity (checked with sim affinity -l). There should be plenty of CPU in that configuration to empty the ring buffers expeditiously and avoid RX-DRP/RX-OVR.

    Just make sure that the existing six Firewall Worker cores (CPUs 2-7) have at least 50% idle during the firewall's busiest period (check with top, then press 1) before making this change; I'm pretty sure you will see that CPUs 0 & 1 are quite a bit busier than CPUs 2-7. Also keep in mind that changing core allocations in a cluster should be treated like a version upgrade: cluster members with differing core allocations will not sync with each other until they have an identical core configuration.
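    A scriptable stand-in for that idle check, assuming the standard /proc/stat layout (field 5 is idle jiffies). The here-doc rows are made-up samples; on the gateway, point awk at /proc/stat itself, and keep in mind that a since-boot average hides exactly the busy-period peaks that interactive top (press 1) would show:

```shell
# Per-core idle share from /proc/stat-style rows; replace the here-doc
# with /proc/stat on a real system.
awk '/^cpu[0-9]/ { tot = 0; for (i = 2; i <= NF; i++) tot += $i
                   printf "%s idle %.0f%%\n", $1, 100 * $5 / tot }' <<'EOF'
cpu0 8000 0 1500 500 0 0 0 0 0 0
cpu1 7000 0 2000 1000 0 0 0 0 0 0
EOF
```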
    Last edited by ShadowPeak.com; 2018-07-25 at 10:23.

