| CPUG | |
| The Check Point User Group | |
| A Resource For The Check Point Community. Fast. Useful. Independent. | |
| |||
| Hi everyone. I am replacing my existing Nokia IP530 gateway running NG FP3 with a fresh install of NGX R60 on Splat, with the most current hotfixes, on a new HP DL320. SIC is fine and I have successfully pushed my existing policy to it. However, when I put it into our production environment, it runs fine for a period of time and then just stops responding. The following is what I have and what I have tried. Any suggestions would be appreciated.

Hardware environment: HP DL320 with a 72 GB HD, 2.8 GHz CPU, 2 GB RAM, two onboard NICs, and an Intel PRO/1000 MT quad NIC also installed. We're using the two onboard NICs (one internal, one external) and only one of the quad ports (for the management interface). The NICs are configured for 100 Mb/full duplex.

Software/OS: Splat NGX R60 HFA_01 gateway, managed by a SmartCenter running the same. These were not upgraded OSes but fresh installs (accepting all defaults) with hotfixes on new servers, using the NGX import tool for the rulebase/users.

Issue: After the gateway is put in place in an MPLS environment (on our external side), it runs great for a while (usually 3-6 hours) before it stops responding. Sometimes the system health indicator on the DL320 comes on when this happens, but not always. It also doesn't always lock up. The CPU utilization runs consistently around .08% until it reaches a point at which it jumps to 8% and the box stops responding. After a fresh boot, the virtual memory and the active real memory usage increase at exactly the same rate. My other gateway (R55) never touches the virtual memory, and it has only 1 GB RAM and much more traffic and many more rules to process. The Nokia IP530 that I'm replacing runs NG FP3 with 256 MB RAM and never has a problem. Obviously all communications with the SmartCenter server work fine.

Here are some things that I have noticed and have tried in order to fix it:
- Updated to the latest hotfixes.
- Replaced and increased the RAM.
- Replaced the quad NIC.
- Increased /proc/sys/net/ipv4/neigh/default/gc_thresh3, since I noticed a "neighbor table overflow" message in the /var/messages logs. The default ARP table size on Splat is 1024; I increased it to 10000. Since doing so I have no longer seen those errors.
- Decreased /proc/sys/net/ipv4/neigh/default/gc_stale_time to 30 from the default of 60.

I have read some posts saying that some people had issues with the Intel PRO/1000 MT drivers for Splat on R60. Has anyone been able to verify this? And was that fixed in HFA_01? I have also read that R61 will be out 1-06-06. Maybe I should wait until then?

Last edited by jobroco; 2006-01-03 at 09:21. |
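For anyone chasing the same "neighbor table overflow" message, here is a rough sketch (not from the original poster) that compares the current ARP entry count with gc_thresh3. It reads /proc/net/arp and the two /proc/sys/net/ipv4/neigh/default values mentioned above. The helper names and the 80% warning level are assumptions made for illustration, and the count is only approximate.

```python
from pathlib import Path

# Paths taken from the post; helper names and the 80% level are illustrative.
NEIGH_DIR = Path("/proc/sys/net/ipv4/neigh/default")
ARP_TABLE = Path("/proc/net/arp")
WARN_FRACTION = 0.8  # assumed warning level, not a Check Point recommendation


def count_arp_entries(arp_text: str) -> int:
    """Count entries in /proc/net/arp-style text (the first line is a header)."""
    lines = [ln for ln in arp_text.splitlines() if ln.strip()]
    return max(len(lines) - 1, 0)


def usage_fraction(entries: int, gc_thresh3: int) -> float:
    """Fraction of the hard limit (gc_thresh3) currently in use."""
    return entries / gc_thresh3 if gc_thresh3 > 0 else float("inf")


def read_int(path: Path) -> int:
    """Read a single integer from a /proc/sys file."""
    return int(path.read_text().strip())


def main() -> None:
    try:
        entries = count_arp_entries(ARP_TABLE.read_text())
        limit = read_int(NEIGH_DIR / "gc_thresh3")
        stale = read_int(NEIGH_DIR / "gc_stale_time")
    except (OSError, ValueError) as exc:
        print(f"cannot read kernel neighbor settings: {exc}")
        return
    used = usage_fraction(entries, limit)
    print(f"ARP entries: {entries} / gc_thresh3 {limit} ({used:.0%}), "
          f"gc_stale_time {stale}s")
    if used >= WARN_FRACTION:
        print("warning: neighbor table is close to its hard limit")


if __name__ == "__main__":
    main()
```

Running it from cron every few minutes and logging the output would show whether the table keeps growing toward the limit in the hours before a hang.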
| |||
| There are some versions of the Pro/1000 MT that just don't seem to work. Open a call with TAC; I seem to recall someone saying they have new Pro/1000 drivers for R60. R61 is due out this month, as I recall. -jlh |
| |||
| Did you export/import the config from the old FW, or recreate all the rules from scratch? Try a fresh install: you can import the objects only and create the rules from scratch. The problem could be some advanced option imported from the old config. P.S. I did not spend a lot of time on this, as these are only my guesses. |
| |||
| Thanks for the replies. Here's how my upgrade process went:
- Upgraded all licenses to NGX in the UserCenter.
- Used the NGX upgrade_export utility to export all rules and objects from the existing SmartCenter server.
- Built a fresh install (on a different machine) of SecurePlatform R60 for my new SmartCenter server, making sure to apply all hotfixes.
- Used the NGX upgrade_import utility to import all rules and objects to the new SmartCenter server. It worked flawlessly.
- Built a fresh install (on a different machine) of SecurePlatform R60 for my new gateway, making sure to apply all hotfixes.
- Re-established SIC between the SmartCenter server and the gateway.
- Made certain the topology was correct by getting the topology from the new gateway.
- Attached licenses, got gateway data, etc. in SmartUpdate.
- Saved and installed the policy on the new gateway successfully.

Once again everything worked flawlessly (in my mock network environment). When I put the new gateway in service, though, it runs fine for about 3-6 hours and then just becomes unresponsive. As I mentioned before, I have since changed the gc_thresh3 and gc_stale_time settings because of the "neighbor table overflow" message that I was receiving. I've used ethtool to ensure that my NICs are running at the same speed as the switches I'm connecting through (100 Mb/full), so that the NICs don't try to autonegotiate the speed. I have opened a new ticket with Check Point and am waiting for a response. It may still be an issue with the NIC. Stay tuned for more. Thanks. |
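The ethtool check described above can be repeated automatically. Below is a minimal sketch that parses the output of `ethtool <iface>` and reports any field that differs from the forced 100 Mb/full setting. The interface names eth0 and eth1 are assumptions (the thread does not say which are in use), and the helper names are made up for this example.

```python
import subprocess

# Expected values for a port forced to 100 Mb/full with autonegotiation off.
EXPECTED = {"Speed": "100Mb/s", "Duplex": "Full", "Auto-negotiation": "off"}
INTERFACES = ("eth0", "eth1")  # assumed names; adjust to the real interfaces


def parse_ethtool(output: str) -> dict:
    """Turn `ethtool <iface>` output into a {field: value} dict."""
    settings = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            settings[key.strip()] = value.strip()
    return settings


def link_mismatches(settings: dict, expected: dict) -> dict:
    """Return {field: (actual, wanted)} for every field that differs."""
    return {
        field: (settings.get(field), wanted)
        for field, wanted in expected.items()
        if settings.get(field) != wanted
    }


def main() -> None:
    for iface in INTERFACES:
        try:
            result = subprocess.run(
                ["ethtool", iface], capture_output=True, text=True, check=False
            )
        except FileNotFoundError:
            print("ethtool is not installed on this machine")
            return
        diffs = link_mismatches(parse_ethtool(result.stdout), EXPECTED)
        print(f"{iface}: {'OK' if not diffs else diffs}")


if __name__ == "__main__":
    main()
```

A duplex mismatch against the switch port would appear here as a `Duplex` entry in the reported differences.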
| |||
| [quote=jobroco]Once again everything worked flawlessly (in my mock network environment). When I put the new gateway in service though, it runs fine for a period of about 3-6 hours when it finally just becomes unresponsive.[/quote] Does the OS lock up, or does just the network stop? -jlh |
| |||
| Jim, both have happened: the OS locking up and the network not responding. The OS is no longer locking up now that I have increased the size of the ARP table. We checked the switch and the gateway to make sure both speeds are set manually to 100 Mb/full. We have also tried different ports on the switch. Check Point seems to think it could be a hardware problem. That's odd, though: I had previously installed Windows 2003 Server on this machine and ran it for over a month in production as a WebFilter resource with my primary firewall, and never had a problem with network traffic. That leads me to believe it's a Splat R60 problem, not hardware. Thanks again, I'll keep working on it. -jj |