CPUG

The Check Point User Group

A Resource For The Check Point Community.  Fast.  Useful.  Independent.

  #1 (permalink)  
Old 2006-04-14
crucial crucial is offline
Member
 
Join Date: 2006-03-24
Posts: 51
Rep Power: 3
crucial has an average reputation (10+)
Default Failed R60 upgrade, Rolled Back, Cluster not syncing

We have a management server running R60 and two cluster members on R54. On Tuesday night, we attempted to upgrade our R54 servers to R60. We brought one of the servers down, rebuilt it and brought it online. We had planned to push a policy to this server, bounce the other one, and rebuild that one after the first was online.

After the first one was back up, we were unable to push out a policy to the servers. So, we backtracked, wiped the first firewall again and reinstalled R54. So both servers are on R54 again.

Now, I am unable to get the cluster to be stable. The sync interface on the second firewall (the one we never changed) shows down. Also, we don't have the password for the second firewall (which is why we were wiping/rebuilding them to upgrade to R60.)

Here is some information from the first FW. Maybe somebody can make some recommendations on what to try next?



Code:
[fw-ep1]# cphaprob state

Cluster Mode:   New High Availability (Active Up)

Number     Unique Address  Assigned Load   State

1 (local)  192.168.32.44   100%            active
2          192.168.16.45   0%              down

Code:
Sync:
        Version: new
        Status: Able to Send/Receive sync packets
        Sync packets sent:
         total : 15096585,  retransmitted : 0, retrans reqs : 0,  acks : 0
        Sync packets received:
         total : 0,  were queued : 0, dropped by net : 0
         retrans reqs : 0, received 0 acks
         retrans reqs for illegal seq : 0
         dropped updates as a result of sync overload: 0 

[fw-ep1]#
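As a rough aid for reading output like the above, here is a minimal Python sketch that parses the pasted cphaprob state table and the sync counters, then flags the two symptoms visible in this post: a member reported as down, and sync packets being sent while none are received. The function names and regular expressions are illustrative assumptions based only on the text pasted here; other versions may format this output differently.

```python
import re

STATE_SAMPLE = """\
Cluster Mode:   New High Availability (Active Up)

Number     Unique Address  Assigned Load   State

1 (local)  192.168.32.44   100%            active
2          192.168.16.45   0%              down
"""

SYNC_SAMPLE = """\
Sync:
        Version: new
        Status: Able to Send/Receive sync packets
        Sync packets sent:
         total : 15096585,  retransmitted : 0, retrans reqs : 0,  acks : 0
        Sync packets received:
         total : 0,  were queued : 0, dropped by net : 0
"""

# Matches member rows such as "1 (local)  192.168.32.44   100%   active".
ROW = re.compile(
    r"^\s*(\d+)\s*(\(local\))?\s+(\d+\.\d+\.\d+\.\d+)\s+(\d+)%\s+(.+?)\s*$"
)


def parse_members(text):
    """Return one dict per cluster member row found in the text."""
    members = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            members.append({
                "number": int(m.group(1)),
                "local": m.group(2) is not None,
                "address": m.group(3),
                "load": int(m.group(4)),
                "state": m.group(5).lower(),
            })
    return members


def unhealthy(members):
    """Return members whose state is neither active nor standby."""
    return [m for m in members if m["state"] not in ("active", "standby")]


def sync_totals(text):
    """Return (sent_total, received_total), or None if not found."""
    sent = re.search(r"Sync packets sent:\s*total\s*:\s*(\d+)", text)
    recv = re.search(r"Sync packets received:\s*total\s*:\s*(\d+)", text)
    if not sent or not recv:
        return None
    return int(sent.group(1)), int(recv.group(1))


def one_way_sync(text):
    """True when packets are being sent but none have been received."""
    totals = sync_totals(text)
    return totals is not None and totals[0] > 0 and totals[1] == 0


if __name__ == "__main__":
    for member in unhealthy(parse_members(STATE_SAMPLE)):
        print("member", member["number"], member["address"], "is", member["state"])
    if one_way_sync(SYNC_SAMPLE):
        print("sync packets are sent but none are received")
```

With the samples from this post, the sketch reports member 2 as down and the sync traffic as one-way, which is consistent with the sync link to the second firewall not working.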
  #2 (permalink)  
Old 2006-04-14
maurox maurox is offline
Member
 
Join Date: 2005-11-17
Location: Italy
Posts: 82
Rep Power: 3
maurox has an average reputation (10+)
Default Re: Failed R60 upgrade, Rolled Back, Cluster not syncing

Which OS do you have on the firewall machines? (I ask because of the password recovery procedure.)
If the problem is only on the secondary (down) machine, why don't you reinstall that machine (OS and FW-1)?
What problem did you have when you tried to push the policy to the NGX firewall?

Regards,
Maurox
  #3 (permalink)  
Old 2006-04-14
crucial crucial is offline
Member
 
Join Date: 2006-03-24
Posts: 51
Rep Power: 3
crucial has an average reputation (10+)
Default Re: Failed R60 upgrade, Rolled Back, Cluster not syncing

We are running SecurePlatform (SPLAT) on the enforcement points and Windows 2000 on the management server.

We are planning to wipe FW2 and rebuild it as R54 (with a known password), but it is currently passing all of the traffic and we do not know what will happen if we attempt to fail over. Our only scheduled downtime is almost a month out. Any time we attempt to push a policy to the servers, they stop passing traffic and SmartView Monitor shows both as 'disconnected' in ClusterXL.

After we brought the 1st server back online as R54, we were unable to push the cluster topology to the members. We get an error saying the interface configurations did not match, although they match as well as I can see.
  #4 (permalink)  
Old 2006-04-14
maurox maurox is offline
Member
 
Join Date: 2005-11-17
Location: Italy
Posts: 82
Rep Power: 3
maurox has an average reputation (10+)
Default Re: Failed R60 upgrade, Rolled Back, Cluster not syncing

Are you sure that all traffic is passing through FW2?
In your previous post, FW1 reports the HA process on FW2 as down.
  #5 (permalink)  
Old 2006-04-14
crucial crucial is offline
Member
 
Join Date: 2006-03-24
Posts: 51
Rep Power: 3
crucial has an average reputation (10+)
Default Re: Failed R60 upgrade, Rolled Back, Cluster not syncing

All of the logs in SmartView Tracker show FW2 as the log origin.


Since my last message, I attempted to push a policy to the cluster and got an error:
Code:
Installation Targets	Version	Policy	Type	Details
fw-ep-shrd	NG AI	Advanced Security		

Reason: TCP connectivity failure ( port = 18191 )( IP = aaa.bbb.191.244 )[ error no. 10 ].   ( message from member fw-1 )
After that, both members showed 'disconnected' in ClusterXL. I was forced to shut down FW-1 and leave everything running on FW-2. Right now we are running steadily on FW-2, and FW-1 is powered off. I think I'm going to have to leave it like this until the next scheduled downtime or until somebody with more experience gets here to look at it.

Thanks for the help
  #6 (permalink)  
Old 2006-04-14
Lackie Lackie is offline
Senior Member
 
Join Date: 2005-08-22
Location: Ottawa, Canada
Posts: 347
Rep Power: 4
Lackie has an average reputation (10+)
Default Re: Failed R60 upgrade, Rolled Back, Cluster not syncing

That connectivity error is usually due to SIC or to the management server being unable to reach the firewall, either because a policy or rule on the firewall (the one with the aaa.bbb.191.244 address in your case) is blocking the connection or because there is a routing problem. As this was working before, I would say routing is not the problem. After rebuilding FW-1, were you able to reset SIC with the firewall? Is there a policy currently installed on the firewall? Either the Initial Policy or the DefaultFilter will block a policy push.
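One way to separate a plain reachability problem from a SIC problem is to check, from the management server, whether the TCP port named in the error accepts connections at all. Here is a minimal Python sketch, assuming Python is available on the machine you test from. The function name is made up for illustration, and 192.0.2.10 in the usage comment is a placeholder, not an address from this thread.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example (placeholder address; substitute the member's real IP):
#     tcp_port_open("192.0.2.10", 18191)
```

A True result only shows that the port is reachable; it says nothing about whether SIC trust is established. A False result points toward a blocking policy, a routing problem, or the service not listening.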
  #7 (permalink)  
Old 2006-06-15
seanmac1904 seanmac1904 is offline
Member
 
Join Date: 2005-09-04
Location: Perth
Posts: 40
Rep Power: 0
seanmac1904 has an average reputation (10+)
Default Re: Failed R60 upgrade, Rolled Back, Cluster not syncing

I had this SIC error on Solaris. My issue was that in /etc/rc3.d, S99cpboot ran before my S99staticroutes file.

As a result, my module had no route to the management server and loaded the default policy.

I ran fw unloadlocal and pushed a new policy after the routes were added, and all was fine. (I also renamed my static routes script to S98staticroutes so that it runs before cpboot.)
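To illustrate the ordering issue: start scripts in an rc directory run in lexical order of their file names, so two scripts that share the S99 prefix are ordered by the characters that follow it. Here is a small Python sketch of that comparison, assuming plain byte-wise name ordering; the function name is illustrative only.

```python
def start_order(script_names):
    """Return start scripts (names beginning with 'S') in lexical order."""
    return sorted(name for name in script_names if name.startswith("S"))


if __name__ == "__main__":
    # 'c' sorts before 's', so cpboot runs before the routes exist.
    print(start_order(["S99staticroutes", "S99cpboot"]))
    # With the S98 prefix, the routes script runs first.
    print(start_order(["S98staticroutes", "S99cpboot"]))
```

The first call puts S99cpboot ahead of S99staticroutes, and the second puts S98staticroutes ahead of S99cpboot, which matches the behaviour described above.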

I noticed you said you pushed a policy in the middle of the upgrade. My reading of the NGX upgrade guide is that this is a "bad idea" for a zero-downtime cluster upgrade.

This is how I did it (and it went reasonably smoothly).

I needed to add set nautopush=64 to my /etc/system and change my routes file (as above).


Here is my process:

NGX Upgrade Process

1. run cphaconf set_ccp broadcast on all cluster members

2. choose cluster_member1 as the final cluster member (upgrade cluster_member2 first)

3. attach NGX licenses to both firewalls

4. upgrade cluster_member2 using smartupdate

5. issue cphaprob stat on cluster_member1 and verify it is active or active-attention

6. issue command fw ctl setsync off on cluster_member1

7. issue cphastop on cluster_member1; at this point cluster_member2 will take up the load

8. use smartupdate to upgrade cluster_member1

9. reboot cluster_member1

10. run cphaconf set_ccp multicast followed by cphastart on all cluster members
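The steps above can be written down as an ordered plan, which makes it easier to check which member each step applies to before a change window. This is only a sketch of the list as posted: the function name and the wording of each action are my own, and nothing here runs any command on a firewall.

```python
def upgrade_plan(first, last):
    """Return ordered (target, action) steps; `first` is upgraded first."""
    both = first + " and " + last
    return [
        (both, "run: cphaconf set_ccp broadcast"),
        (both, "attach NGX licenses"),
        (first, "upgrade with SmartUpdate"),
        (last, "run: cphaprob stat (expect active or active-attention)"),
        (last, "run: fw ctl setsync off"),
        (last, "run: cphastop"),
        (last, "upgrade with SmartUpdate"),
        (last, "reboot"),
        (both, "run: cphaconf set_ccp multicast"),
        (both, "run: cphastart"),
    ]


if __name__ == "__main__":
    plan = upgrade_plan("cluster_member2", "cluster_member1")
    for number, (target, action) in enumerate(plan, start=1):
        print(number, target, "->", action)
```

Step 2 of the list (choosing which member is upgraded last) is expressed through the two arguments rather than as a separate entry.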


If you don't use SmartUpdate, there are a couple of extra steps you need to do that have to do with compiling the policy.

hope this is of some help

cheers for now

Sean in Perth
  #8 (permalink)  
Old 2006-08-02
joefav joefav is offline
Junior Member
 
Join Date: 2006-02-15
Posts: 5
Rep Power: 0
joefav has an average reputation (10+)
Default Re: Failed R60 upgrade, Rolled Back, Cluster not syncing

I tried the zero-downtime upgrade and the NGX modules never came back online; they remained in a READY state. I was using SecurePlatform, though, and did an 'in place' upgrade. How can I do some troubleshooting with ClusterXL?