Hi,

Once again I have to express my dissatisfaction with HP-UX here (because
anything you tell the guys at the RC will never make it to those who are
actually responsible for it). Anyway:

Recently I've learned that customers do not want to install OS updates or
patches frequently, so HP only issues updates to paying customers twice a year
(look for the latest Support Plus in SUM, if you find any). OK, if the software
quality were so high that you didn't need frequent patches, everything would be
fine, but...

Trying to update to the OS Core (Operating Environment) 05/2005 recently, I
learned (according to the RC) that it will only install (=update) cleanly if
I had installed the OE dated 09/2004 before. Installing OE 12/2004 would not
help. The problem seems to be related to patch PHKL_31151, which is
"required" but not "provided" on the OS DVD 05/2005. Don't ask me who is
responsible for that.

Anyway, the reason for this article is not that problem, but the failed
upgrade from HP-UX 11.11 (64bit) to 11.23 (64bit) on an L-class server last
week. The upgrade failed miserably due to a series of software bugs.

While the upgrade from 11.0 to 11.11 was a rather smooth process (I've been
doing OS upgrades since 8.02 (1992)), the upgrade from 11.11 to 11.23 is a mess.

I started out with Update-UX B.11.11.0412 (which is OK according to the
manual). After a little preparation, the first (minor) problem came:

[...]
Executing pre-update script NsaHttp.100
Executing pre-update script OS-CorePatchCheck.100
Executing pre-update script RAMIPv6.100
Executing pre-update script RNG-DKRN.100
ERROR: The /dev/random or /dev/urandom device special files may not be in
use during update-ux. Use the fuser(1M) command to identify these
processes, then terminate them.
ERROR: Pre-Update script RNG-DKRN.100 failed.
[...]

CIFS Server (samba) uses the random device, so I had to stop CIFS. No big
problem, but I had to restart the update procedure. After restarting, the
pre-checks all succeeded(!) and the update began.

Then the next surprise:
ERROR: "host:/": The kernel build script failed.
[...]
ERROR: There were errors encountered while installing the new OS. These
errors resulted in the install phase being aborted. If you can
correct the problems, you can re-enter the update-ux command as
before, or enter: "update-ux -first_depot" to start the update at the
installation phase.
NOTE: One or more errors encountered. Take appropriate corrective action
and re-run update-ux to complete the update.

The postinstall script for OS-Core.CORE2-KRN had failed for the following
reason: "(NPROC): The specified number is not valid for this command."
Please note that all kernel configuration had been done with SAM and the
settings were perfectly OK up to then. Anyway, I simply replaced all occurrences
of "(NPROC)" with "999" (about OK for that machine).
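
For reference, replacing "(NPROC)" with a fixed number can be done with sed
rather than by hand. This is only a sketch: the helper name replace_nproc is
made up for illustration, it assumes the file being edited is /stand/system,
and you would want to keep a backup copy before overwriting anything.

```shell
#!/bin/sh
# replace_nproc FILE VALUE
# Print FILE with every literal "(NPROC)" replaced by VALUE.
# (In a basic regular expression, plain parentheses match themselves.)
replace_nproc() {
    sed "s/(NPROC)/$2/g" "$1"
}

# Hypothetical use, keeping the original file as a backup:
#   cp /stand/system /stand/system.orig
#   replace_nproc /stand/system.orig 999 > /stand/system
```

Note that only the exact string "(NPROC)" is touched; formulas that merely
contain NPROC are left as they are.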

After restarting the process using "update-ux -first_depot", installation of
kernel filesets worked fine, but then (again):

ERROR: "host:/": The kernel build script failed.

The reason for this error was:
ld: Unsatisfied protected symbol "tape2_install" in file "/usr/conf/lib/libsio.a[tape2.modmeta.o]"

So I just deleted the "tape2" line in /stand/system and restarted the update
as before. Then all of the 1600 (or so) filesets installed correctly, BUT NO
KERNEL WAS BUILT! (Hey, who wrote that code?)
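
Deleting the "tape2" line can also be scripted. The sketch below assumes the
driver sits on a line of its own in /stand/system; the helper name
drop_driver is invented for this example.

```shell
#!/bin/sh
# drop_driver FILE NAME
# Print FILE without the lines that consist only of NAME (leading and
# trailing blanks are ignored). All other lines pass through unchanged.
drop_driver() {
    awk -v name="$2" '!(NF == 1 && $1 == name)' "$1"
}

# Hypothetical use, keeping the original file as a backup:
#   cp /stand/system /stand/system.orig
#   drop_driver /stand/system.orig tape2 > /stand/system
```

Matching the whole line rather than a substring avoids removing other entries
whose names merely contain "tape2".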

So we have:
[...]
======= 08/19/05 14:02:57 METDST END swinstall SESSION
(non-interactive)
(jobid=host-0161)

* Obtaining some information from the source depot.
* Executing Pre-Reboot scripts
* Rebooting


======= END update-ux


* Discarding pre-update kernel registry database.
ERROR: Unable to create /stand/vmunix -> /stand/current/vmunix link.
You will need to tell the boot loader to boot
/stand/current/vmunix.
link: No such file or directory

sync'ing disks (0 buffers to flush):
0 buffers not flushed
0 buffers still dirty

Closing open logical volumes...
Done

Why didn't the process stop at this point for manual intervention when no
kernel could be installed??? You can guess what the boot looked like:

Proceeding...

Trying Primary Boot Path
------------------------
Booting...
Boot IO Dependent Code (IODC) revision 1


HARD Booted.

ISL Revision A.00.44 Mar 12, 2003

ISL booting hpux

Boot
: disk(0/6/0/0.5.0.0.0.0.0;0)/stand/vmunix
disk(0/6/0/0.5.0.0.0.0.0;0)/stand/vmunix: cannot open, or not executable
Exec failed: No such file or directory
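
A check along the following lines, run before the reboot, would have caught
the missing kernel. It is just a sketch of the idea, not what update-ux
does: the function name kernel_ok and the KERNEL variable are made up, and
/stand/vmunix is simply the path the boot loader tried above.

```shell
#!/bin/sh
# kernel_ok PATH
# Succeed only if PATH exists, is a regular file (symbolic links are
# followed) and is not empty.
kernel_ok() {
    [ -f "$1" ] && [ -s "$1" ]
}

# Hypothetical pre-reboot check; set KERNEL to test another path.
if kernel_ok "${KERNEL:-/stand/vmunix}"; then
    echo "kernel present"
else
    echo "kernel missing or empty, do not reboot" >&2
fi
```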

At this point I still had hope of fixing this (as I thought) rather trivial
problem: just use the "Recovery feature" of the installation media. As you can
guess, more fun was waiting:

Booting...
Boot IO Dependent Code (IODC) revision 1


SOFT Booted.

ISL Revision A.00.44 Mar 12, 2003

ISL booting hpux (;0):INSTALL

Boot
: disk(0/0/1/0.1.0.0.0.0.0;0):WINSTALL
14426112 + 6750440 + 3362904 start 0x35c168


[...]
Loading some basic commands...
Required files exist ... Proceeding
/duped_root/.kshrc[306]: /sbin/fs/cdfs/mount: not found
[...]
The disk is LVM.
Setting boot and root device file for c4t5d0...
Loading /sbin/umount...
[...]
Enter the fsck command string (e.g., /sbin/fs/vxfs/fsck -P ) :
Loading ...

/: write failed, file system is full
ERROR: tar: ./usr/lib/libc.1: HELP - extract write error: No space left on
device (errno = 28)
ERROR: File: . not found.
/: file system full
NOTE: Retrying loadfile command...
ERROR: Cannot make link to (busy?) file: ./usr/lib/#libdld.2: File exists
(errno = 17)
ERROR: tar: ./usr/lib/libdld.2 - cannot create: Text file busy (errno = 26).

/: write failed, file system is full
ERROR: tar: ./usr/lib/libc.1: HELP - extract write error: No space left on
device (errno = 28)
ERROR: File: . not found.
/: file system full
NOTE: Retrying loadfile command...
ERROR: Cannot make link to (busy?) file: ./usr/lib/#libdld.2: File exists
(errno = 17)
ERROR: tar: ./usr/lib/libdld.2 - cannot create: Text file busy (errno = 26).

[...]

So it seems the ramdisk was full, and the ramdisk is so small that I cannot
get the two fscks for HFS and VxFS into it. However recovery insists on
checking the filesystems before doing anything.

[...]
/duped_root/.kshrc[310]: /dev/rdsk/c4t5d0s1lvm: cannot execute
'fsck' could not fix all customer file system errors with the options and
answers specified. You may wish to run 'fsck' again with different options or
attempt to repair the file system with the 'fsck -b' option to use a redundant
superblock. Automatic recovery cannot be completed unless the target file
system can be successfully 'fsck'ed.
Mounting /dev/dsk/ to /ROOT/stand as read only.
Loading /sbin/fs/NONE/mount...

/: write failed, file system is full

/: write failed, file system is full
ERROR: tar: ./sbin/mount: HELP - extract write error: No space left on device
(errno = 28)
ERROR: tar: ./sbin/mount: HELP - extract write error: No space left on device
/: file system full
[...]

I can only guess that the media and software are thoroughly tested at HP labs
before being sent out to customers. Oops, maybe they just sent out alpha
test media? ;-)

Anyway, I would not give up that easily:

[...]
INFORMATION to verify:
Device file used for '/'(ROOT) is c4t5d0.
The hardware path to disk is 0/6/0/0.5.0.
[...]
Selection: a
The disk is LVM.
Setting boot and root device file for c4t5d0...
Loading /sbin/umount...
[...]
Select one of the following:
a. Mount the root disk and exit to a shell only.
b. Recover the bootlif/os partitions.
c. Replace the kernel on the root file system.
d. Both options: b and c.
v. Read information about LVM recovery (LVM.RECOVER).

m. Return to 'HP-UX Recovery Media Main Menu'.
x. Exit to the shell.

Selection: c
Entering file system checking...
[...]
Loading /sbin/fs/hfs/fsck...
[...]
** /dev/rdsk/c4t5d0s1lvm
** Last Mounted on /stand
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
79 files, 0 icont, 27134 used, 291991 free (151 frags, 36480 blocks)
Mounting /dev/dsk/c4t5d0s1lvm to /ROOT/stand as read only.
Loading /sbin/fs/hfs/mount...
Loading /sbin/umount...
umount /ROOT/stand.
Executing fsck on /ROOT file system.
Loading /sbin/fs/vxfs/fsck...

/: write failed, file system is full

/: write failed, file system is full
ERROR: tar: ./sbin/fs/vxfs/fsck: HELP - extract write error: No space left on
device (errno = 28)
ERROR: tar: ./sbin/fs/vxfs/fsck: HELP - extract write error: No space left on/: file system full
/: file system full
[...]

OK, we had that before...

[...]
The disk is LVM.
Setting boot and root device file for c4t5d0...
Loading /sbin/umount...

/: write failed, file system is full

/: write failed, file system is full
ERROR: tar: ./sbin/umount: HELP - extract write error: No space left on
device (errno = 28)
ERROR: tar: ./sbin/umount: HELP - extract write error: No space left on
de/: file system full
[...]

After a few more attempts with similar results, I decided to try booting an
older 11.23 install media (booting an L class with 8GB RAM and two CPUs takes
its time).

[...]
NOTE: Creating the second RAM disc and mounting on /dev ...
* Generating device file for the second ramdisc...
* Loading mkfs to make a file system...
version 5 layout
15625 sectors, 15625 blocks of size 1024, log size 1024 blocks
unlimited inodes, largefiles not supported
15625 data blocks, 14529 free data blocks
1 allocation units of 32768 blocks, 32768 data blocks
last allocation unit has 15625 data blocks
* Loading mount to mount/dev/ram1 file system...
vxfs mount: Cannot open portal device: No such file or directory
* Mounting /dev/ram1 file system succeeded!
* Copying /dev.old files back to /dev succeeds!
* Loading insf to create disk device files...
* Creating disk device files...
* Loading the recovery commands...
[...]
FILE SYSTEM CHECK
MENU
The file system check '/sbin/fs/vxfs/fsck -y /dev/rdsk/c4t5d0s1lvm'
will now be run.

Select one of the following:
a. Run fsck -y.
b. Prompt for the fsck run string on c4t5d0s1lvm.
m. Return to the 'HP-UX Recovery MENU.'

Selection: a
Loading /sbin/fs/hfs/fsck...


[...]

** /dev/rdsk/c4t5d0s1lvm
** Last Mounted on /ROOT/stand
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
79 files, 0 icont, 27134 used, 291991 free (151 frags, 36480 blocks)
Going to mount /dev/dsk/c4t5d0s1lvm to /ROOT/stand as read only
Loading /sbin/fs/hfs/mount...
Loading /sbin/umount ......
Umount /ROOT/stand
Doing fsck on /ROOT file system
Loading /sbin/fs/vxfs/fsck...
/sbin/fs/vxfs/fsck -y /dev/rdsk/c4t5d0s2lvm
file system is clean - log replay is not required
Mount /ROOT file system
/sbin/fs/vxfs/mount /dev/dsk/c4t5d0s2lvm /ROOT
Loading mount commands ...
vxfs mount: Cannot open portal device: No such file or directory
Mount /ROOT/stand again
/sbin/fs/hfs/mount -F hfs /dev/dsk/c4t5d0s1lvm /ROOT/stand
Entering loading kernel....
The kernel is missing or zero size! Please try to determine the problem, and then retry.


[...]

So it seems an older version of the installation media does not know about the
filesystem layout of the current media (and the current media doesn't work).

I still hoped to be able to fix the problem by going into "manual mode"
(root shell):
Volume group "/dev/vg02" has been successfully changed.
# mount -ea
mount: rksapfs1:/exports/home is already mounted on /home
mount: /dev/vg00/lvol8 is already mounted on /var
mount: /dev/vg00/lvol7 is already mounted on /usr
mount: /dev/vg00/lvol6 is already mounted on /opt
mount: /dev/vg00/lvol4 is already mounted on /tmp
mount: /dev/vg00/lvol1 is already mounted on /stand
mount: /opt/ora: No such file or directory
mount: /opt/sap: No such file or directory
[...]
However:
# /usr/bin/find / -name system _prep
/sbin/sh: /usr/bin/find: not found
# ls /usr
# mount /usr
mount: /dev/vg00/lvol7 is already mounted on /usr
# umount /usr
umount: cannot unmount /usr : Block device required
umount: return error 1.
[...]
# umount /usr
umount: cannot unmount /usr : Block device required
umount: return error 1.
# mount /usr
mount: /dev/vg00/lvol7 is already mounted on /usr
# umount /dev/vg00/lvol7
umount: cannot unmount /usr : Block device required
umount: return error 1.
# umount /dev/vg00/rlvol7
umount: cannot find /dev/vg00/rlvol7 in /etc/mnttab
cannot unmount /dev/vg00/rlvol7
# mount /usr
vxfs mount: Cannot open portal device: No such device
# ls -l /opt
total 0
# mount /opt
vxfs mount: Cannot open portal device: No such device
[...]

At that point I still had a faint hope of booting vmunix.prev:
[...]
ISL> hpux /stand/vmunix.prev

Boot
: disk(0/6/0/0.5.0.0.0.0.0;0)/stand/vmunix.prev
9232384 + 1716224 + 4010432 start 0x1fc0e8




alloc_pdc_pages: Relocating PDC from 0xf0f0000000 to 0x7eb00000.
gate64: sysvec_vaddr = 0xc0002000 for 2 pages
NOTICE: autofs_link(): File system was registered at index 3.
NOTICE: cachefs_link(): File system was registered at index 5.
NOTICE: nfs3_link(): File system was registered at index 6.

System Console is on the Built-In Serial Interface
Entering cifs_init...
Initialization finished successfully... slot is 9
Logical volume 64, 0x3 configured as ROOT
Logical volume 64, 0x2 configured as SWAP
Logical volume 64, 0x2 configured as DUMP
Swap device table: (start & size given in 512-byte blocks)
entry 0 - major is 64, minor is 0x2; start = 0, size = 40304640
Starting the STREAMS daemons-phase 1
Checking root file system.
Root check done.
Create STCP device files
Starting the STREAMS daemons-phase 2
$Revision: vmunix: vw: -proj selectors: CUPI80_BL2000_1108 -c 'Vw for CUPI80_BL2000_1108 build' -- cupi80_bl2000_1108 'CUPI80_BL2000_1108' Wed Nov 8 19:24:56 PST 2000 $
Memory Information:
physical page size = 4096 bytes, logical page size = 4096 bytes
Physical: 8388608 Kbytes, lockable: 7814288 Kbytes, available: 7427756 Kbytes


INIT: /etc/inittab: WARNING: Cannot pstat_getstatic: Invalid argument
[...no more output appeared...]
--------------------
At that point it was obvious that I had to recover!

At least one thing worked:
======= 08/19/05 12:01:35 EDT Installation complete: Successful


####### # #
# # # #
# # # #
# # ###
# # # #
# # # #
####### # #
-------------------------------
So that was a short story on how to waste 10 hours of work using HP-UX.

Regards,
Ulrich
P.S. Better luck with your upgrade!