Hi.

[This is a slight revision of a previous post, which I have cancelled.]

We have a lab with around 20 Linux (Fedora Core 4) boxes. We would like
these machines to be essentially identical. The idea is that updates are
done on only one machine, and that changes migrate from this machine to
the others. We do this via an rsync-based script (called sync-lab),
which will run overnight, courtesy of cron.

Of course:

(1) The hostnames and IP addresses shouldn't be replicated.
(2) We use autorpm to update the "source" machine only. We don't want to
run autorpm on the other lab machines.
(3) We don't want to mess with the log files on the other lab machines.

So here's the sync-lab script:

------%<--%<--%<---cut here---%<--%<--%<----------------------------
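#!/bin/sh
# sync-lab: push this ("source") machine's files to each lab machine.

remoteMachines=/usr/local/adm/sync-lab/remoteMachines
excludedFiles=/usr/local/adm/sync-lab/excludedFiles

# Have rsync use ssh as its remote shell.
export RSYNC_RSH=/usr/bin/ssh
for machine in $(cat "$remoteMachines")
do
    # -a: archive mode; -x: do not cross file system boundaries;
    # --delete: remove files on the target that are absent on the source.
    rsync -ax --delete --exclude-from="$excludedFiles" / "$machine":/
    rsync -ax --delete /boot/ "$machine":/boot/
    rsync -ax --delete /usr/local/ "$machine":/usr/local/
done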
------%<--%<--%<---cut here---%<--%<--%<----------------------------

The file "excludedFiles" contains the following:

/etc/sysconfig/network
/etc/sysconfig/network-scripts/ifcfg-eth0
/etc/autorpm.d/autorpm.cron
/var/spool/cron/root
/var/log/*

and "remoteMachines" contains the IP address of only one machine (while
we're in "shakedown" mode).
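
While shaking this down, it may help to see what rsync *would* do before
letting it touch the target. The following is only a sketch, not part of
sync-lab: the function name preview_sync is made up for illustration, and
it simply prints the three commands with rsync's -n (--dry-run) and -v
options added, so that they can be inspected or pasted by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of sync-lab): print the dry-run form of the
# three rsync commands for one machine instead of executing them.
preview_sync() {
    machine=$1
    # Second argument is optional; defaults to the path sync-lab uses.
    excludedFiles=${2:-/usr/local/adm/sync-lab/excludedFiles}
    # -n (--dry-run): report what would be done without changing anything.
    # -v: list the files that would be transferred or deleted.
    echo "rsync -axnv --delete --exclude-from=$excludedFiles / $machine:/"
    echo "rsync -axnv --delete /boot/ $machine:/boot/"
    echo "rsync -axnv --delete /usr/local/ $machine:/usr/local/"
}
```

For example, "preview_sync 192.0.2.10" (a placeholder address) prints the
three commands for that machine.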

Here's the output of sync-lab:

---%<------%<------%<------%<--Cut-Here-%<------%<------%<------%<---
rsync: delete_file: rmdir "/dev/shm" failed: Device or resource busy (16)
rsync: delete_file: rmdir "/proc/tty" failed: Operation not permitted (1)
rsync: delete_file: rmdir "/selinux/booleans" failed: Operation not permitted (1)
rsync: delete_file: rmdir "/sys/power" failed: Operation not permitted (1)
rsync: delete_file: rmdir "/var/lib/nfs/rpc_pipefs/statd" failed: Operation not permitted (1)
rsync error: some files could not be transferred (code 23) at main.c(789)
---%<------%<------%<------%<--Cut-Here-%<------%<------%<------%<---
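
All five paths in these messages appear to sit on or under special file
systems on the target (tmpfs, proc, selinuxfs, sysfs, rpc_pipefs) rather
than on the root disk. A rough way to check that is sketched below. The
name fs_for_path is made up for illustration, and the sketch assumes a
Linux-style mounts table such as /proc/mounts with no whitespace in the
mount point names.

```shell
#!/bin/sh
# Hypothetical helper: print the mount point and file system type that a
# path lives on, given a mounts table in /proc/mounts format.
fs_for_path() {
    path=$1
    mounts=${2:-/proc/mounts}
    # Fields in /proc/mounts: device, mount point, type, options, dump, pass.
    # Keep the longest mount point that is the path itself or a parent of it.
    awk -v p="$path" '
        {
            mp = $2
            if (p == mp || mp == "/" || index(p, mp "/") == 1) {
                if (length(mp) >= length(best)) { best = mp; type = $3 }
            }
        }
        END { print best, type }
    ' "$mounts"
}
```

Run on the target as, e.g., "fs_for_path /proc/tty"; anything that does
not report the root file system is not an ordinary on-disk directory.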

This was the initial version of our problem:

When the target machine was rebooted after running sync-lab, the X login
would hang. It would accept the username and password, but nothing
would happen after the login widget disappeared (i.e., we didn't get the
usual GNOME desktop). I eventually had to reboot the target machine.

After that second reboot, everything seemed OK.

Here is an additional data point:

We rebooted the target machine after running sync-lab. I then brought up
a virtual text console (ALT-F1). I *carefully* typed in a username and
a password. The screen totally cleared and I was presented with the
login prompt; if you had looked away for a second, it would've looked as
if I hadn't typed anything at all. I then typed in the username/password
a second time, and I was allowed to log in. Moreover, the X windows
login did *not* hang.

Does anybody have an idea why this is happening? What should I change to
fix the problem?

Many thanks.

--
Art Werschulz (8-{)} "Metaphors be with you." -- bumper sticker
GCS/M (GAT): d? -p+ c++ l u+(-) e--- m* s n+ h f g+ w+ t++ r- y?
Internet: agw STRUDEL cs.columbia.edu
ATTnet: Columbia U. (212) 939-7060, Fordham U. (212) 636-6325