| CPUG | |
| The Check Point User Group | |
| A Resource For The Check Point Community. Fast. Useful. Independent. | |
| Yesterday I had a total failure of a two-node cluster running SPLAT. After the first member crashed, the standby appears to have followed within one minute. The cluster is VSX R60 HFA01, hot-standby only, with 8 customer VSs totaling about 4000 users. Recovery required a power cycle; the console showed a blank screen and did not respond to the keyboard. The hardware is Dell 2850s with an Intel Pro1000 quad card, using the on-board Intel NICs for sync and management. Layer 3 only, no bridging. The only thing I've found so far is a log message stating that the cluster is under high load and will stop logging to the CLM. It looks like the largest customer was pushing a steady 15k connections, and throughput was less than 300 Mbps. Check Point hasn't said much yet. Has anyone seen anything like this using SPLAT and VSX? |
| |||
| I'm really having some issues with my R60 VSX clusters. Since Feb 2008 I've had a total of 7 cluster crashes. One cluster seems to crash every 7 days. That particular cluster doesn't have a high load at all; it processes about 9 million connections per day. Nothing ever shows up in the messages file before a crash, and the first attempt at getting a stack trace failed: the failed cluster member would not respond to a laptop on the console. The only thing I've found is a lot of high-load messages in the log files of each VS, about stopping log forwarding due to high load. Neither top nor cpstat -f os cpu shows a high load, and the MDS and MLM are not loaded at all. Check Point has not seen this before and needs a stack trace to debug the issue. The hardware is Dell 2850s running R60 VSX. Any insight on this would be really helpful. |