Wednesday, April 25, 2007

lustre failover experimentation

'lconf --failover' didn't seem to work, but 'lconf --cleanup --force --service=mds /root/config.xml' did; it stopped the service and removed all the modules. Once I was satisfied that mds-1 was no longer using the device, I started the failover node, mds-2, just by running 'lconf --node mds-2 /root/config.xml'.
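Condensed, the sequence looked like this (hostnames and the config path are specific to this setup):

# on mds-1: force cleanup of the mds service, which also unloads the modules
lconf --cleanup --force --service=mds /root/config.xml

# make sure mds-1 has actually released the backing device, e.g.
lsof /dev/sdb1

# on mds-2: bring the service up on the failover node
lconf --node mds-2 /root/config.xml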

On the client that was mounting the resource:

LustreError: 19533:0:(client.c:940:ptlrpc_expire_one_request()) @@@ timeout (sent at 1177507980, 0s ago) req@f7d55600 x8852471556/t0 o400->mds_UUID@mds-1_UUID:12 lens 64/64 ref 1 fl Rpc:N/0/0 rc 0/0
LustreError: 19533:0:(client.c:940:ptlrpc_expire_one_request()) Skipped 29 previous similar messages
Lustre: 12:0:(linux-debug.c:96:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.241.229@tcp,down,1177507956
LustreError: MDC_mds-1_mds_MNT_client-f70a0400: Connection to service mds via nid 192.168.241.229@tcp was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Changing connection for MDC_mds-1_mds_MNT_client-f70a0400 to mds-2_UUID/192.168.241.227@tcp

The share was fine from the client after the switchover.
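A trivial check from the client side (the mount point here is hypothetical):

# any metadata operation will exercise the new MDS
ls /mnt/lustre
df -h /mnt/lustre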

There were no messages on the OSSes, but I checked connectivity from one of them:

oss-4:/home/cmcleay/lustre-1.4.10# lctl ping mds-2
12345-0@lo
12345-192.168.241.227@tcp
oss-4:/home/cmcleay/lustre-1.4.10# lctl ping mds-1
failed to ping 192.168.241.229@tcp: Input/output error



Once I was satisfied with this, I repeated the exercise using the init scripts - all worked well.
Doing an 'ls' on the share hung, but it came back a little while after starting the failover MDS, mds-1.

You need to have the config file in the right place for the init scripts to work properly.
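On most installs this means /etc/lustre/config.xml, though that path is an assumption here - check what the init script actually reads:

# assumption: the init script reads /etc/lustre/config.xml
cp /root/config.xml /etc/lustre/config.xml
/etc/init.d/lustre start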

There were some messages on the OSS:
Apr 25 23:40:48 oss-4 kernel: Lustre: 6:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.241.229@tcp,down,1176793786
Apr 25 23:44:11 oss-4 kernel: Lustre: 17433:0:(filter.c:3236:filter_set_info_async()) ost-beta: received MDS connection from 192.168.241.229@tcp


Then it all stopped working :(

lustre errors with a new config

The MDS writes configuration logs on the MDS device itself (if you mount an MDS volume directly, you can see them in the /LOGS directory).
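A quick way to look at them is to mount the backing device as ldiskfs - a sketch, using the device from this setup and a hypothetical mount point, with the MDS stopped first:

# with the MDS stopped, mount the MDS backing device read-only
mkdir -p /mnt/mdsdev
mount -t ldiskfs -o ro /dev/sdb1 /mnt/mdsdev
ls /mnt/mdsdev/LOGS
umount /mnt/mdsdev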

I tried re-writing it, but decided to reformat instead (I'd also tried a reboot, which didn't help either). The error looked like this:

MDSDEV: mds mds_UUID /dev/sdb1 ldiskfs no
! /usr/sbin/lctl (22): error: setup: Invalid argument


mds-1:~# lconf -v --node mds-1 /root/config.xml
configuring for host: ['mds-1']
Checking XML modification time
+ debugfs -c -R 'stat /LOGS' /dev/sdb1 2>&1 | grep mtime
xmtime 1177503678 > kmtime 1176793916
Error: MDS startup logs are older than config /root/config.xml. Please run --write_conf on stopped MDS to update. Use '--old_conf' to start anyways.
mds-1:~# lconf -v --node mds-1 --write-conf /root/config.xml
configuring for host: ['mds-1']
Service: network NET_mds-1_lnet NET_mds-1_lnet_UUID
loading module: libcfs srcdir None devdir libcfs
+ /sbin/modprobe libcfs
loading module: lnet srcdir None devdir lnet
+ /sbin/modprobe lnet
+ /sbin/modprobe lnet
loading module: ksocklnd srcdir None devdir klnds/socklnd
+ /sbin/modprobe ksocklnd
Service: ldlm ldlm ldlm_UUID
loading module: lvfs srcdir None devdir lvfs
+ /sbin/modprobe lvfs
loading module: obdclass srcdir None devdir obdclass
+ /sbin/modprobe obdclass
loading module: ptlrpc srcdir None devdir ptlrpc
+ /sbin/modprobe ptlrpc
Service: mdsdev MDD_mds_mds-1 MDD_mds_mds-1_UUID
original inode_size 0
stripe_count 1 inode_size 512
loading module: lquota srcdir None devdir quota
+ /sbin/modprobe lquota
loading module: mdc srcdir None devdir mdc
+ /sbin/modprobe mdc
loading module: osc srcdir None devdir osc
+ /sbin/modprobe osc
loading module: lov srcdir None devdir lov
+ /sbin/modprobe lov
loading module: mds srcdir None devdir mds
+ /sbin/modprobe mds
loading module: ldiskfs srcdir None devdir ldiskfs
+ /sbin/modprobe ldiskfs
loading module: fsfilt_ldiskfs srcdir None devdir lvfs
+ /sbin/modprobe fsfilt_ldiskfs
Service: mdsdev MDD_mds_mds-1 MDD_mds_mds-1_UUID
original inode_size 0
stripe_count 1 inode_size 512
MDSDEV: mds mds_UUID /dev/sdb1 ldiskfs no
+ /usr/sbin/lctl
attach mds mds mds_UUID
quit
+ /usr/sbin/lctl
cfg_device mds
setup /dev/sdb1 ldiskfs
quit
+ /usr/sbin/lctl
ignore_errors
cfg_device $mds
cleanup
detach
quit
! /usr/sbin/lctl (22): error: setup: Invalid argument
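Reformatting was done via lconf as well - a sketch, and note that --reformat destroys all data on the device:

# WARNING: wipes and re-creates the MDS device (/dev/sdb1 here)
lconf --reformat --node mds-1 /root/config.xml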

Wednesday, April 18, 2007

gotcha with /dev and udev when copying system

Usually I use

cd /
tar -clf - . | (cd /mnt; tar -xpf -)

to make a copy of a system. However, on newer systems running udev this causes problems: the -l flag keeps tar on one filesystem, and /dev is mounted as its own filesystem, so it does not get copied. You will not be able to boot the copy unless you recreate the /dev entries - it needs a basic skeleton including vital nodes such as /dev/console and /dev/sda; everything not needed early in boot is created dynamically by udev.
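A minimal fix, assuming the copy is mounted at /mnt as above, is to create the vital nodes by hand and let udev fill in the rest at boot (the major/minor numbers are the standard ones):

# recreate the bare-minimum /dev skeleton on the copy
mknod -m 600 /mnt/dev/console c 5 1
mknod -m 666 /mnt/dev/null c 1 3
mknod -m 660 /mnt/dev/sda b 8 0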

Tuesday, April 17, 2007

Failover for Lustre nodes

Lustre itself does not provide the system-level components necessary for a complete failover solution (node failure detection, power control, and so on), since this functionality has long been available from third-party tools. CFS does provide the scripts needed to interact with these packages, and exposes health information for system monitoring. The recommended choice is the Heartbeat package from linux-ha.org, but Lustre will work with any HA software that supports resource (I/O) fencing. The Heartbeat software is responsible for detecting failure of the primary server node and controlling the failover.
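For what it's worth, with Heartbeat v1 the resource wiring can be as small as a one-line haresources entry. A hypothetical sketch, assuming mds-1 is the preferred node and the lustre init script is used as the resource agent:

# /etc/ha.d/haresources (Heartbeat v1): preferred node, then resource(s)
# 'lustre' refers to /etc/init.d/lustre acting as the resource script
mds-1 lustre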