Boot failure after revert



polymath
2010-02-13, 16:15
Here is an account of a problem I had. I hope this report will speed your recovery.

Scenario:
HP Proliant DL380 G4
SmartArray 6i
Secure Platform Pro
VPN-1 VSX NGX R65 - Build 550

Before attempting a configuration repair I created a snapshot. The firewall wound up in worse shape and I decided to revert. As the last step of the revert the firewall rebooted (without warning). Now the real fun started. The system hung on the boot showing GRUB and a flashing cursor. Power cycling provided no relief.

Solution:
I had limited remote access and my favorite con$ultant thought a complete reinstall was likely. When I got on site I booted from a Linux CD (SLES 10 in this case, but most distributions with cciss drivers should do - Secure Platform is a RHEL derivative). The file systems were clean and everything looked normal, but booting stalled in GRUB (the boot loader).

GRUB could not find the boot volume when I tried the standard installation method.

Google turned up this article (Bug 1510 – GRUB fails to correctly install to cciss device (possibly others): http://bugs.contribs.org/show_bug.cgi?id=1510, kudos to Graeme Fleming), part of which I incorporated into the following procedure.

Mount the root and boot filesystems under /mnt. [Your partitions could differ from these.]

# mount -o ro /dev/cciss/c0d0p7 /mnt
# mount -o rw /dev/cciss/c0d0p1 /mnt/boot
# mount -o bind /proc /mnt/proc
# mount -o bind /dev /mnt/dev
# chroot /mnt /bin/bash

Uncomment "boot=/dev/cciss/c0d0" in /boot/grub/grub.conf. [vi is available.]
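
If you would rather not edit by hand, the uncomment step can be sketched with sed. This assumes the line is commented out with a leading "#" (as in a stock RHEL-style grub.conf); uncomment_boot is a hypothetical helper name, not something shipped with Secure Platform.

```shell
# Hypothetical helper: uncomment a "#boot=..." line in a GRUB legacy config.
# Assumes the line starts with '#', optionally followed by spaces.
uncomment_boot() {
    conf="$1"
    # Keep a backup copy next to the original before changing anything.
    cp "$conf" "$conf.bak" || return 1
    # Strip the leading '#' only from the boot= line; other comments stay.
    sed 's/^#[[:space:]]*\(boot=\)/\1/' "$conf.bak" > "$conf"
}
```

Inside the chroot it would be called as: uncomment_boot /boot/grub/grub.conf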

# /sbin/grub --batch --device-map=/boot/grub/device.map --config-file=/boot/grub/grub.conf --no-floppy

grub> root (hd0,0)
grub> setup (hd0)
grub> quit
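
The three commands above can also be fed to the GRUB legacy shell on standard input instead of typing them at the prompt. A minimal sketch, assuming the same (hd0,0) boot partition and (hd0) target disk as above; grub_commands is a hypothetical helper name.

```shell
# Hypothetical helper: print the GRUB legacy shell commands, one per line.
# (hd0,0) and (hd0) are taken from the procedure above; adjust to your layout.
grub_commands() {
    printf 'root (hd0,0)\nsetup (hd0)\nquit\n'
}
```

Inside the chroot the output could be piped to grub, for example: grub_commands | /sbin/grub --batch --device-map=/boot/grub/device.map --no-floppy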

Remove your recovery media.

# reboot


If you revert again later, a grub prompt may appear after the reboot. In that case just enter the first two grub commands and reboot the system.

Analysis:
I tested revert in a virtual mock-up of my firewall and it worked fine. I think it's a hardware compatibility/software configuration issue even though the hardware is supported by Check Point.
The problem is reproducible - it occurred again on my backup firewall.

Please report any errors in this write-up.

Good luck.

serlud
2010-04-04, 04:10
The best approach is to open an SR and push CP to fix this issue.

We have already got a fix for normal FW and Provider-1 (not VSX):

http://www.cpug.org/forums/versions-firewall-1-vpn-1/11177-ngx-r65-include-hfa50-snapshoot-revert-cause-splat-machine-hang-stuck.html?highlight=snapshot

paddy
2011-01-26, 11:53
Thanks to Polymath for this post. Very helpful.

I had pretty much the same issue while installing Provider-1 R70 (from Check_Point_R70_P1.Splat.iso).

The setup:

HP Proliant DL380 G4
SmartArray 6i
Secure Platform Pro
This is Provider-1/SiteManager-1 R70 Build 097

In this version of the code the grub files were not being picked up from /boot and never loaded. The R70 code also seemed to think I had a SATA disk on the DL380 G4, which could have been a factor in the installation not working. This is the procedure that got my system working:

First update device.map (remember to mount the file system /dev/cciss/c0d0p1 and edit the files in that mount!) from:

(fd0) /dev/fd0
(hd1) /dev/cciss/c0d0
(hd0) /dev/sda

Delete the (hd0) /dev/sda entry.
Change (hd1) /dev/cciss/c0d0 to (hd0) /dev/cciss/c0d0

so your device.map looks like:

(fd0) /dev/fd0
(hd0) /dev/cciss/c0d0
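
The device.map change can be sketched with sed as well. This assumes the file contains exactly the entries shown above (a phantom /dev/sda on hd0 and the array on hd1); fix_device_map is a hypothetical helper name.

```shell
# Hypothetical helper: drop the phantom /dev/sda entry and promote the
# cciss array from (hd1) to (hd0) in a GRUB legacy device.map.
fix_device_map() {
    map="$1"
    # Keep a backup copy before rewriting the file.
    cp "$map" "$map.bak" || return 1
    # 1st expression: delete the "(hd0) /dev/sda" line.
    # 2nd expression: rename "(hd1) /dev/cciss/c0d0" to "(hd0) ...".
    sed -e '/^(hd0)[[:space:]]*\/dev\/sda[[:space:]]*$/d' \
        -e 's/^(hd1)\([[:space:]]*\/dev\/cciss\/c0d0\)/(hd0)\1/' \
        "$map.bak" > "$map"
}
```

With the boot partition mounted, it would be called on the device.map inside that mount, e.g. fix_device_map /mnt1/boot/grub/device.map (the path depends on where you mounted /dev/cciss/c0d0p1).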

Edit grub.conf (remember to mount the file system /dev/cciss/c0d0p1 and edit the files in that mount!)

Change every reference to hd1 into hd0 so that it matches device.map, e.g.
splashimage=(hd1,0)/grub/splash.xpm.gz
to
splashimage=(hd0,0)/grub/splash.xpm.gz
and all instances of root (hd1,0)
to
root (hd0,0)
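
The same substitution can be applied to the whole file in one pass. A sketch only, assuming every "(hd1," in grub.conf refers to the RAID array; retarget_grub_conf is a hypothetical helper name.

```shell
# Hypothetical helper: point every "(hd1,N)" reference in a GRUB legacy
# config at "(hd0,N)" instead.
retarget_grub_conf() {
    conf="$1"
    # Keep a backup copy before rewriting the file.
    cp "$conf" "$conf.bak" || return 1
    # Replace the drive part only; the partition number is left untouched.
    sed 's/(hd1,/(hd0,/g' "$conf.bak" > "$conf"
}
```

Review the result against the backup (grub.conf.bak) before running grub setup.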

Carry out the procedure as detailed by polymath. I used mount point /mnt1 as /mnt already existed. I did not uncomment "boot=/dev/cciss/c0d0" in /boot/grub/grub.conf either.

# mount -o ro /dev/cciss/c0d0p7 /mnt1
# mount -o rw /dev/cciss/c0d0p1 /mnt1/boot
# mount -o bind /proc /mnt1/proc
# mount -o bind /dev /mnt1/dev
# chroot /mnt1 /bin/bash

# /sbin/grub --batch --device-map=/boot/grub/device.map --config-file=/boot/grub/grub.conf --no-floppy

grub> root (hd0,0)
grub> setup (hd0)
grub> quit

Remove your recovery media.

# reboot

Now I'm going to raise a support call...

paddy
2011-02-07, 06:06
Update from Checkpoint on this issue.

This problem occurs if you mount your ISO image via iLO Virtual Device.

It causes the system to see the following disk in addition to the RAID array:

hd0 = /dev/sda

The /dev/sda disk should not be seen; however, the installer references this disk in /boot/grub/device.map and /boot/grub/grub.conf.
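
A quick way to see whether an install was affected is to look for /dev/sda in device.map. This is only a sketch and assumes the RAID array is the only disk that should be listed; has_phantom_sda is a hypothetical helper name.

```shell
# Hypothetical helper: succeed (exit status 0) if the given file
# mentions /dev/sda, fail otherwise.
has_phantom_sda() {
    grep -q '/dev/sda' "$1"
}
```

For example: has_phantom_sda /boot/grub/device.map && echo "phantom disk listed"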

Checkpoint have an internal fix for this issue which is almost identical to the solution previously posted, although they have not seen this problem with a DL380 G4 previously.