The Ninth Dimension: 2007

Saturday, December 15, 2007

UUID grub fstab

I wanted to migrate my data from an IDE disk to a SATA disk on an Ubuntu 7.10 system. In the past, I have simply tarred the contents of the partitions to the new disk, updated the fstab and bootloader configs, reinstalled the MBR, and rebooted. This time, since the new disk was laid out the same as the old one, I assumed I didn't need to update the partition info in the grub config and fstab. If I had looked, I would have noticed that Ubuntu identifies partitions by UUID in both fstab and the grub config, and the freshly created partitions had new UUIDs. So I found myself with a system unable to get past the bootloader. My 'quick' fix was to boot an Ubuntu CD in rescue mode and change /etc/fstab and /boot/grub/menu.lst to the 'old' style of /dev/sda1 etc. I then acquainted myself with the concept of UUIDs ('apropos UUID', plus some web searches). I found that you can get the UUID of a partition by running 'vol_id -u <device>', e.g., 'vol_id -u /dev/sda1'. I then copied the UUIDs into fstab and menu.lst.
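
As a rough sketch of that last step, the helper below rewrites one device entry in fstab-style text to its UUID= form. The function name fstab_use_uuid is made up for illustration, and it only reads stdin and writes stdout, so nothing is changed in place.

```shell
# Sketch only: swap a device name for UUID=<uuid> in fstab-style text.
# fstab_use_uuid is a hypothetical helper, not part of any package.
fstab_use_uuid() {
    dev=$1
    uuid=$2
    awk -v dev="$dev" -v uuid="$uuid" '
        # Only touch lines whose first field is exactly the device.
        $1 == dev { sub(/^[ \t]*[^ \t]+/, "UUID=" uuid) }
        { print }
    '
}

# Example (writes a new file to review before replacing /etc/fstab):
#   fstab_use_uuid /dev/sda1 "$(vol_id -u /dev/sda1)" < /etc/fstab > /tmp/fstab.new
```

Commented-out lines are left alone because their first field is '#', not the device name.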

I think that UUID is a handy way of doing things, in that it is possible to attach another disk to a machine without worrying whether it will change the order of things (e.g., by becoming the first SCSI disk, so that your system disk becomes the second disk while all the fstab and grub entries still point to the first). It is still a slight mystery to me why the BIOS is able to boot from the second disk without modification though. Also, I am not sure why grub, when configured with (hd0), isn't bothered after a new disk is added and becomes primary. Maybe because this is just used in the initial setup, and the bootloader is installed there anyway.

Wednesday, September 19, 2007

running out of memory when copying large files

From Coraid support:

5.19 Q: How can I avoid running out of memory when copying large files?

A: You can tell the Linux kernel not to wait so long before writing data out to backing storage.

echo 5 > /proc/sys/vm/dirty_ratio
echo 5 > /proc/sys/vm/dirty_background_ratio

These settings are even tighter and should help even on a system that is doing a large amount of larger-than-RAM data transfers.

echo 3 > /proc/sys/vm/dirty_ratio
echo 3 > /proc/sys/vm/dirty_background_ratio
echo 5120 > /proc/sys/vm/min_free_kbytes
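
The echo commands above only last until the next reboot. A minimal sketch for keeping them, assuming the usual /etc/sysctl.conf mechanism: the hypothetical helper vm_writeback_sysctl prints the matching sysctl lines so they can be reviewed before being appended.

```shell
# Sketch: emit sysctl.conf lines matching the echo commands above.
# vm_writeback_sysctl is a hypothetical helper name.
vm_writeback_sysctl() {
    ratio=$1             # percent, used for both dirty ratios
    min_free_kb=${2:-}   # optional value for min_free_kbytes
    printf 'vm.dirty_ratio = %s\n' "$ratio"
    printf 'vm.dirty_background_ratio = %s\n' "$ratio"
    if [ -n "$min_free_kb" ]; then
        printf 'vm.min_free_kbytes = %s\n' "$min_free_kb"
    fi
}

# Example (as root):
#   vm_writeback_sysctl 3 5120 >> /etc/sysctl.conf && sysctl -p
```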

Thursday, August 16, 2007

mysql replication

If mysql replication breaks with:

070816 13:03:17 Slave I/O thread: connected to master 'repl@mydns01.securepod.com:3306', replication started in log 'mydns01-bin.025' at position 20533567
070816 13:03:17 Error reading packet from server: Client requested master to start replication from impossible position (server_errno=1236)
070816 13:03:17 Got fatal error 1236: 'Client requested master to start replication from impossible position' from master when reading data from binary log'
070816 13:03:17 Slave I/O thread exiting, read up to log 'mydns01-bin.025', position 20533567

you can check the binlog that the slave last read (mydns01-bin.025) using mysqlbinlog, and look for the log position it read up to (in this case 20533567). Here, the last entry in mydns01-bin.025 is at log_pos 20533147:

mysqlbinlog mydns01-bin.025 | tail -10

#070815 16:01:10 server id 1 log_pos 20533147 Query thread_id=20094 exec_time=0 error_code=0
SET TIMESTAMP=1187211670;

INSERT INTO ZoneRecord
( ZoneRecord.zone, ZoneRecord.name, ZoneRecord.type, ZoneRecord.data, ZoneRecord.aux, ZoneRecord.ttl, ZoneRecord.active, ZoneRecord.categoryid )
VALUES ( '167857' , '' , 'NS' , 'ns1.securepod.com.' , 'None' , '14400' , '1' , '1' );

The next binlog, mydns01-bin.026, starts from a new position (this position actually corresponds to the file size of the binlog). So you can take the query from the last entry in bin.025 and check whether its result exists on the slave, then do the same for the first query in bin.026. If the first exists but the second does not, the slave has applied everything in bin.025 and nothing from bin.026, so you can move the slave's start position to the first position in bin.026. That is as easy as editing master.info on the slave(s) with the new log file and position and restarting mysql (or you could do it from the mysql command line with CHANGE MASTER TO if you can be bothered).
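
A hedged sketch of the command-line route: the two helpers below (both names are made up) pull the last log_pos out of mysqlbinlog output and print the SQL to repoint a slave. The header format is assumed to be the 'log_pos N' style shown above, and the statement spellings should be checked against the manual for your server version.

```shell
# Sketch only; both helper names are hypothetical.

# Print the last log_pos seen in mysqlbinlog output read from stdin.
# Assumes the "log_pos N" header format shown in the post.
last_log_pos() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "log_pos") pos = $(i + 1) }
         END { if (pos != "") print pos }'
}

# Print the SQL that repoints a slave at a given binlog file and position.
repoint_slave_sql() {
    printf "STOP SLAVE;\n"
    printf "CHANGE MASTER TO MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" "$1" "$2"
    printf "START SLAVE;\n"
}

# Example (FIRST_POS is a placeholder; read the real value from
# 'mysqlbinlog mydns01-bin.026 | head'):
#   mysqlbinlog mydns01-bin.025 | last_log_pos
#   repoint_slave_sql mydns01-bin.026 "$FIRST_POS" | mysql -u root -p
```

Printing the SQL rather than running it leaves a chance to read it before it touches the slave.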

Tuesday, July 10, 2007

debootstrap bootstraps a basic Debian system of SUITE (eg, sarge, etch, sid) into TARGET from MIRROR by running SCRIPT. MIRROR can be an http:// URL or a file:/// URL. Notice that file:/ URLs are translated to file:/// (correct scheme as described in RFC1738 for local filenames), and file:// will not work.

Debootstrap can be used to install Debian in a system without using an installation disk but can also be used to run a different Debian flavor in a chroot environment. This way you can create a full (minimal) Debian installation which can be used for testing purposes (see the EXAMPLES section). If you are looking for a chroot system to build packages please take a look at pbuilder.
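
To make the MIRROR rules above concrete, here is a small sketch that mimics them in shell. normalize_mirror is a hypothetical helper that covers only the URL forms mentioned in the excerpt; debootstrap does its own handling internally, and the paths in the usage comment are placeholders.

```shell
# Sketch: mimic the MIRROR URL rules described above.
# normalize_mirror is a hypothetical helper, not part of debootstrap.
normalize_mirror() {
    case $1 in
        http://*|file:///*)
            printf '%s\n' "$1" ;;                 # already acceptable
        file://*)
            echo "file:// is not accepted; use file:///path" >&2
            return 1 ;;
        file:/*)
            printf 'file://%s\n' "${1#file:}" ;;  # file:/x becomes file:///x
        *)
            echo "unsupported mirror URL: $1" >&2
            return 1 ;;
    esac
}

# Example (as root; suite, target and mirror path are placeholders):
#   debootstrap etch /srv/chroot/etch "$(normalize_mirror file:/srv/mirror/debian)"
```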

Monday, June 25, 2007

differences between openvz and virtuozzo

openvz doesn't have vzfs, for one

Virtuozzo File System (VZFS) :

VZFS is a file system that allows common files to be shared among multiple VPSs without sacrificing flexibility. It is possible for VPS users to modify, update, replace, and delete shared files. When a user modifies a shared file, VZFS creates a private copy of the file transparently for the user. Thus, the modifications do not affect the other users of the file. Main benefits of VZFS are the following:

- It saves memory required for executables and libraries. A typical VPS running a simple web site might consume around 20–30 MBytes of RAM just for executable images. Sharing this memory improves scalability and total system performance;

- It saves disk space. A typical Linux server installation occupies several hundred MBytes of disk space. Sharing the files allows you to save up to 90% of disk space;

- VZFS does not require having different physical partitions for different VPSs or creating a special “file system in a file” setup for a VPS. This significantly simplifies disk administration;

- Disk quota enables the administrator to limit disk resources available to a VPS on-the-fly, in the same manner as the standard disk quota system works on a per-user basis. Disk quota for users and groups inside VPSs is also supported.

Tuesday, June 12, 2007

blocking spam with sendmail

Can block by domain or by IP address or network

in /etc/mail/access:

localhost.localdomain RELAY
localhost RELAY
127.0.0.1 RELAY
192.168.1 RELAY
202.124.241.222 ERROR:"550 Take a hike"

As this is a hash, rebuild to db with:

makemap hash /etc/mail/access < /etc/mail/access
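
A small sketch for adding further blocks in the same format as the entry above. access_reject is a made-up helper and spammer.example is a placeholder domain; the key can be an IP address, a network prefix, or a domain.

```shell
# Sketch: print an access-map line that rejects a host, domain or network.
# access_reject is a hypothetical helper; the message text is free-form.
access_reject() {
    key=$1
    msg=${2:-"550 Access denied"}
    printf '%s\tERROR:"%s"\n' "$key" "$msg"
}

# Example (as root), followed by the map rebuild:
#   access_reject spammer.example "550 Take a hike" >> /etc/mail/access
#   makemap hash /etc/mail/access < /etc/mail/access
```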

Useful way of telling php module support

php -m

or

php5 -m

php -v shows version

Also, you can do: 'php /home/cm/phpinfo.php'
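
For scripting, the module list can be checked rather than read by eye. A minimal sketch, with has_php_module as a made-up helper name that reads 'php -m' output on stdin:

```shell
# Sketch: succeed if the named module appears in `php -m` output on stdin.
# has_php_module is a hypothetical helper name.
has_php_module() {
    # -i ignore case, -x match the whole line, -F treat the name literally
    grep -qixF -- "$1"
}

# Example:
#   php -m | has_php_module mysql && echo "mysql support is available"
```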

Thursday, May 24, 2007

ubuntu - BUG: soft lockup detected on CPU#0!

My Ubuntu box (Ubuntu 7.04, AMD 1800XP, 2.6.20-15-generic #2 SMP Sun Apr 15 07:36:31 UTC 2007 i686 GNU/Linux) would get halfway through booting and just sit there, doing nothing. ctrl+alt+del didn't do anything, and I couldn't get a console. To get more information, I rebooted and edited the grub command line, taking off the 'quiet splash' options. It would get most of the way through fscking the first partition and then report: BUG: soft lockup detected on CPU#0! I'm not sure exactly what causes this, but I did a quick search, and on some of the forums a lot of people mentioned a wireless network card as a possible source. The machine does have a wireless network card (a RaLink RT2561/RT61 802.11g PCI chipset), but I'm not sure if this has anything to do with it. After a couple of futile soft reboots, I pulled the power, checked that the wireless card was properly seated (it was), and booted the machine. It worked fine. I should try to find out more about what the error means though.

Sunday, May 13, 2007

setting up ldapv3 under Ubuntu 7.04

I had numerous problems getting LDAP running under Ubuntu 7.04. Ubuntu 7.04 comes with slapd v2.3.30 and libldap2.3. I was setting it up as a precursor to running postfix and courier with virtual domain support. The main issue I had was using SASL, which can be disabled I think, but I wanted to use it just as a challenge (and it may be required for LDAP to be fully v3 compliant). I have read that no MTA currently supports SASL anyway, they are all v2 compliant only. I previously used LDAP v2 under Debian Sarge, which does not require SASL, and is very easy to get running out of the box. In any case, as MTAs auth via v2 only, I think you need to enable v2 bind support in slapd.conf even if you are running LDAP v3.

It is a very good idea to read the guide at www.openldap.org, as well as the relevant man pages, of which there are numerous. The documentation requires very careful attention.

I found a few useful pages besides the official openldap documentation. Here's one that has been translated from Japanese:

http://www.tom.sfc.keio.ac.jp/~torry/ldap/ldap_en.html#doc4_15616

I didn't delete the sasldb2 file as the author suggests though

Another one:

http://defindit.com/readme_files/ldap.html

And a pdf called 'Surviving Cyrus SASL':

http://postfix.state-of-mind.de/patrick.koetter/surviving_cyrus_sasl.pdf

A bit about SASL. See the wikipedia entry for background.

LDAP specific:

SASL - you can authenticate requests via SASL (for e.g., ldapdelete or other write operations), which looks at a password in /etc/sasldb2, or you can authenticate from info in the LDAP db, as you would other users. I'm still not 100% clear on whether you need to use SASL or not, and exactly what is required to have it working (some of my config options like sasl-realm and sasl-host may be superfluous, and I didn't use a slapd.conf file in /usr/lib/sasl2, etc etc)
I am not sure what auth mechanisms you need to set up.

I found that if you do specify a dn (i.e., with the '-D' flag), it will attempt to bind against what is in the LDAP db, e.g.,:

ldapdelete -x -D "cn=admin,dc=cm,dc=net" -W "ou=people,dc=cm,dc=net" -h localhost

but if you don't specify a dn, it will use SASL:

root@cm:/home/cm# ldapdelete -U admin "ou=people,dc=cm,dc=net" -h localhost
SASL/DIGEST-MD5 authentication started
Please enter your password:
SASL username: admin
SASL SSF: 128
SASL installing layers

Note that while SASL provides an *authentication* mechanism, that user still needs permission to be able to do things such as read or write to the LDAP db (authorization). So, you still need entries in the LDAP db allowing authenticated users permission to read/write/etc to LDAP (this is the same as with earlier versions of LDAP of course). Under v3, you could set LDAP up so that it authenticates via SASL and then uses the ACLs in the LDAP db to control access.

cm@cm:~$ ldapadd -f 2nd.ldif -h localhost
SASL/DIGEST-MD5 authentication started
Please enter your password:
SASL username: cm
SASL SSF: 128
SASL installing layers
adding new entry "ou=People, dc=cm,dc=net"
ldap_add: Insufficient access (50)
additional info: no write access to parent

Note that you must specify the username (unless you are that user)

direct mapping vs search-based mapping

direct mapping avoids having to do an LDAP lookup. It uses 'authz-regexp' or 'sasl-regexp' to rewrite requests sent to it. 'authz-regexp' and 'sasl-regexp' are the same AFAIK; authz is short for authorization, probably to distinguish it from authentication. I could not find any reference to 'sasl-regexp' using apropos, but I was able to use both in slapd.conf interchangeably, and the debug log shows sasl-regexp as 'slap_authz_regexp'.

LDAP SASL library calls from the debug log:

May 13 22:53:09 cm slapd[24029]: do_sasl_bind: dn () mech DIGEST-MD5
May 13 22:53:09 cm slapd[24029]: SASL [conn=8] Debug: DIGEST-MD5 server step 2
May 13 22:53:09 cm slapd[24029]: slap_sasl_getdn: u:id converted to uid=admin,cn=cm.net,cn=DIGEST-MD5,cn=auth

search-mapping binds to the LDAP server and looks up the user via LDAP calls. It is more 'expensive', as you have to search the DIT
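
Direct mapping is essentially a regex rewrite of the SASL identity seen in the debug log above. The sketch below imitates that rewrite with sed purely as an illustration; slapd does this itself. The target layout uid=...,ou=people,dc=cm,dc=net is an assumption, and map_sasl_dn is a made-up name.

```shell
# Illustration only: slapd performs this rewrite internally.
# Roughly equivalent slapd.conf line (target DN layout is an assumption):
#   authz-regexp uid=([^,]*),cn=cm.net,cn=DIGEST-MD5,cn=auth uid=$1,ou=people,dc=cm,dc=net
map_sasl_dn() {
    printf '%s\n' "$1" |
        sed -E 's/^uid=([^,]*),cn=cm\.net,cn=DIGEST-MD5,cn=auth$/uid=\1,ou=people,dc=cm,dc=net/'
}

# Example:
#   map_sasl_dn 'uid=admin,cn=cm.net,cn=DIGEST-MD5,cn=auth'
```

No directory search is involved: the result is computed from the string alone, which is why this route is cheaper than search-based mapping.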

Tuesday, May 08, 2007

Linux-HA and heartbeat

http://www.linux-ha.org/

The goal of the Linux HA project is to provide 'a high availability (clustering) solution for Linux which promotes reliability, availability...'

The Linux-HA project's main software product is Heartbeat, which is a cluster management program for high availability clustering. Some of the features of heartbeat include:

- no fixed number of nodes
- resource monitoring
- fencing

Linux-HA state that their software (Heartbeat) is well integrated with separate projects like LVS and DRBD

Heartbeatv2 can do resource monitoring of Lustre for example, by using /etc/init.d/lustre with the 'status' argument (I think it looks in /proc or /sys), see https://mail.clusterfs.com/pipermail/lustre-discuss/2005-September/000870.html

keepalived

keepalived has been primarily developed as a means to provide high availability for LVS. It does this via health checks of protocols and services, and informs the kernel in case of failure (or if it becomes available again), and it uses a version of the VRRP to handle failover for the LVS director. (VRRP is an open standard based on Cisco's HSRP)

- It runs as a user-space program

- It checks multiple layers of the network stack, numbered as in the 7-layer OSI model: the IP level (layer 3), the TCP level (layer 4), the session or dialogue level (layer 5), and the application level (layer 7).

"Keepalived implements a framework based on three family checks : Layer3, Layer4 & Layer5/7. This framework gives the daemon the ability of checking a LVS server pool states. When one of the server of the LVS server pool is down, keepalived informs the linux kernel via a setsockopt call to remove this server entry from the LVS topology. In addition keepalived implements an independent VRRPv2 stack to handle director failover. So in short keepalived is a userspace daemon for LVS cluster nodes healthchecks and LVS directors failover."

LVS

LVS is described on the home page of the project as a 'load balancer'. The aim is to provide scalability for services. The main product of the LVS project is IPVS, which provides Layer-4 (transport layer) load balancing inside the kernel.

From the home page:

"IPVS implements transport-layer load balancing (layer-4 switching) inside the Linux kernel"

They also have under development Layer-7 (application level) switching inside the kernel, using KTCPVS

To provide redundancy, there are several possibilities listed on their home page.

- piranha
- ultramonkey
- keepalived
- heartbeat + mon + coda
- heartbeat + ldirectord

I have set up a cluster using the last option. The 2nd is what nr uses.

Wednesday, April 25, 2007

lustre failover experimentation

'lconf --failover' didn't seem to work, but 'lconf --cleanup --force --service=mds /root/config.xml' did. It removed all the modules. Once I was satisfied that mds-1 was not using the device, I started the failover device, mds-2, just by running 'lconf --node mds-2 /root/config.xml'.

On the client that was mounting the resource:

LustreError: 19533:0:(client.c:940:ptlrpc_expire_one_request()) @@@ timeout (sent at 1177507980, 0s ago) req@f7d55600 x8852471556/t0 o400->mds_UUID@mds-1_UUID:12 lens 64/64 ref 1 fl Rpc:N/0/0 rc 0/0
LustreError: 19533:0:(client.c:940:ptlrpc_expire_one_request()) Skipped 29 previous similar messages
Lustre: 12:0:(linux-debug.c:96:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.241.229@tcp,down,1177507956
LustreError: MDC_mds-1_mds_MNT_client-f70a0400: Connection to service mds via nid 192.168.241.229@tcp was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Changing connection for MDC_mds-1_mds_MNT_client-f70a0400 to mds-2_UUID/192.168.241.227@tcp

The share was fine from the client after the switchover.

There were no messages on the OSSes. But I did this:

oss-4:/home/cmcleay/lustre-1.4.10# lctl ping mds-2
12345-0@lo
12345-192.168.241.227@tcp
oss-4:/home/cmcleay/lustre-1.4.10# lctl ping mds-1
failed to ping 192.168.241.229@tcp: Input/output error

I used the init scripts after I was satisfied with this - all worked well.
Doing an 'ls' on the share hung, but it came back a little while after starting the failover mds-1.

You need to have the config file in the right place for the init scripts to work properly

There were some messages on the oss:
Apr 25 23:40:48 oss-4 kernel: Lustre: 6:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.241.229@tcp,down,1176793786
Apr 25 23:44:11 oss-4 kernel: Lustre: 17433:0:(filter.c:3236:filter_set_info_async()) ost-beta: received MDS connection from 192.168.241.229@tcp

Then it all stopped working :(

lustre errors with a new config

The MDS writes a log file on the MDS device (if you mount a MDS volume, you can see it)

I tried re-writing it, but I decided to reformat it instead (I'd also tried a reboot, but that didn't help either)

MDSDEV: mds mds_UUID /dev/sdb1 ldiskfs no
! /usr/sbin/lctl (22): error: setup: Invalid argument

mds-1:~# lconf -v --node mds-1 /root/config.xml
configuring for host: ['mds-1']
Checking XML modification time
+ debugfs -c -R 'stat /LOGS' /dev/sdb1 2>&1 | grep mtime
xmtime 1177503678 > kmtime 1176793916
Error: MDS startup logs are older than config /root/config.xml. Please run --write_conf on stopped MDS to update. Use '--old_conf' to start anyways.
mds-1:~# lconf -v --node mds-1 --write-conf /root/config.xml
configuring for host: ['mds-1']
Service: network NET_mds-1_lnet NET_mds-1_lnet_UUID
loading module: libcfs srcdir None devdir libcfs
+ /sbin/modprobe libcfs
loading module: lnet srcdir None devdir lnet
+ /sbin/modprobe lnet
+ /sbin/modprobe lnet
loading module: ksocklnd srcdir None devdir klnds/socklnd
+ /sbin/modprobe ksocklnd
Service: ldlm ldlm ldlm_UUID
loading module: lvfs srcdir None devdir lvfs
+ /sbin/modprobe lvfs
loading module: obdclass srcdir None devdir obdclass
+ /sbin/modprobe obdclass
loading module: ptlrpc srcdir None devdir ptlrpc
+ /sbin/modprobe ptlrpc
Service: mdsdev MDD_mds_mds-1 MDD_mds_mds-1_UUID
original inode_size 0
stripe_count 1 inode_size 512
loading module: lquota srcdir None devdir quota
+ /sbin/modprobe lquota
loading module: mdc srcdir None devdir mdc
+ /sbin/modprobe mdc
loading module: osc srcdir None devdir osc
+ /sbin/modprobe osc
loading module: lov srcdir None devdir lov
+ /sbin/modprobe lov
loading module: mds srcdir None devdir mds
+ /sbin/modprobe mds
loading module: ldiskfs srcdir None devdir ldiskfs
+ /sbin/modprobe ldiskfs
loading module: fsfilt_ldiskfs srcdir None devdir lvfs
+ /sbin/modprobe fsfilt_ldiskfs
Service: mdsdev MDD_mds_mds-1 MDD_mds_mds-1_UUID
original inode_size 0
stripe_count 1 inode_size 512
MDSDEV: mds mds_UUID /dev/sdb1 ldiskfs no
+ /usr/sbin/lctl
attach mds mds mds_UUID
quit
+ /usr/sbin/lctl
cfg_device mds
setup /dev/sdb1 ldiskfs
quit
+ /usr/sbin/lctl
ignore_errors
cfg_device $mds
cleanup
detach
quit
! /usr/sbin/lctl (22): error: setup: Invalid argument

Wednesday, April 18, 2007

gotcha with /dev and udev when copying system

Usually I use

cd /
tar -clf - .|(cd /mnt;tar -xpf -)

to make a copy of a system. However, on newer systems running udev this causes problems: /dev is mounted as its own filesystem, so the -l flag (which meant --one-file-system in older GNU tar; newer versions use -l for --check-links, so spell out --one-file-system there) makes tar skip it. You will not be able to boot the copy unless you also copy the /dev entries. The /dev on disk holds a basic skeleton including vital files such as /dev/console and /dev/sda etc.; stuff not needed for most boot environments is created by udev dynamically.
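
A quick way to see what a one-file-system copy of / will leave out is to list the other mount points. The sketch below does that from /proc/mounts-style input. skipped_mounts is a made-up name, and the bind-mount steps in the comment are one possible way to reach the on-disk /dev, with placeholder paths.

```shell
# Sketch: list mount points that a one-file-system copy of / leaves out.
# Reads /proc/mounts-style text on stdin; skipped_mounts is a hypothetical name.
skipped_mounts() {
    awk '$2 != "/" { print $2 }'
}

# Example:
#   skipped_mounts < /proc/mounts
# If /dev is listed, a plain bind mount of / exposes the on-disk /dev
# underneath it (as root; /tmp/rootbind is a placeholder path):
#   mkdir -p /tmp/rootbind && mount --bind / /tmp/rootbind
#   cp -a /tmp/rootbind/dev/. /mnt/dev/
#   umount /tmp/rootbind
```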

Tuesday, April 17, 2007

Failover for Lustre nodes

Lustre does not provide the tool set for the system-level components necessary for a complete failover solution (node failure detection, power control, and so on), as this functionality has been available for some time from third party tools. CFS does provide the necessary scripts to interact with these packages, and exposes health information for system monitoring. The recommended choice is the Heartbeat package from linux-ha.org. Lustre will work with any HA software that supports resource (I/O) fencing. The Heartbeat software is responsible for detecting failure of the primary server node and controlling the failover.

Thursday, February 22, 2007

linuxthreads being used instead of NPTL (native posix threading library)

Built a Debian machine which had an unusual problem whereby it was using linuxthreads instead of NPTL (native posix threading library) threads. Thus, when threaded services like nscd and java started, they would show multiple processes rather than a single (threaded) process. The /lib/tls directory was present, and all the right packages. But as a getconf showed, it was using linuxthreads:

# getconf GNU_LIBPTHREAD_VERSION
linuxthreads-0.10

ldconfig -v showed the /lib/tls libraries, it just wasn't using them

The clue was in an strace when starting nscd:

access("/etc/ld.so.nohwcap", F_OK) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f00000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=74936, ...}) = 0
mmap2(NULL, 74936, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7eed000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = 0
open("/lib/libncurses.so.5", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\200\345"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=263040, ...}) = 0
mmap2(NULL, 264196, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7eac000
mmap2(0xb7ee4000, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x38) = 0xb7ee4000
mmap2(0xb7eec000, 2052, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7eec000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = 0
open("/lib/libdl.so.2", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\20\f\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=9592, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7eab000
mmap2(NULL, 12404, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7ea7000
mmap2(0xb7ea9000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1) = 0xb7ea9000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = 0
open("/lib/libc.so.6", O_RDONLY) = 3

On a machine using nptl:

execve("/etc/init.d/nscd", ["/etc/init.d/nscd", "start"], [/* 16 vars */]) = 0
uname({sys="Linux", node="ws-6", ...}) = 0
brk(0) = 0x80e6000
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40017000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=19421, ...}) = 0
old_mmap(NULL, 19421, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40018000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/libncurses.so.5", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\220\342"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=252592, ...}) = 0
old_mmap(NULL, 257868, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x4001d000
old_mmap(0x40053000, 36864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x35000) = 0x40053000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/tls/libdl.so.2", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\320\32"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=9872, ...}) = 0
old_mmap(NULL, 8632, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x4005c000
old_mmap(0x4005e000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x2000) = 0x4005e000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/tls/libc.so.6", O_RDONLY) = 3

The presence of a file 'ld.so.nohwcap' was the problem. According to the man page of ld.so (on an Etch machine): "When this file is present the dynamic linker will load the non-optimized version of a library, even if the CPU supports the optimized version"

This file was present because glibc had been downgraded to an earlier version, which causes Debian to put this file in /etc. glibc was downgraded because this machine was mistakenly installed with 'etch' instead of 'sarge', and it was deemed easier at the time to downgrade via apt/aptitude etc. than to do a fresh install. Perhaps a mistake :)
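
A minimal sketch of a check that would have saved the strace session. nohwcap_present is a made-up helper, and the optional root argument is only there so a mounted or chrooted system can be inspected too.

```shell
# Sketch: report whether the override file exists under a given root.
# nohwcap_present is a hypothetical helper; ROOT defaults to the live system.
nohwcap_present() {
    root=${1:-}
    [ -e "$root/etc/ld.so.nohwcap" ]
}

# Example:
#   nohwcap_present && echo "ld.so.nohwcap found: optimized libraries are skipped"
#   getconf GNU_LIBPTHREAD_VERSION
```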

Friday, February 09, 2007

load balancing with f5s

I was wondering how f5s configured as active-active do load balancing. That is, for the nodes or real servers that they load balance, how do they a) accept packets from a source without configuring two separate default routes on the router sending the packets and having a different IP on each f5 (static default routes would be problematic if one of the f5s failed, so the router would have to be able to detect this), and b) accept traffic equally from the servers they are load balancing, as the servers only have one default route?

I poked around the web and found that the Ultramonkey project, using Saru, can do this, among others. I think the way it might work is to have a common MAC address shared between two f5s, so that the packet goes to both f5s, and then use some mechanism whereby the f5s compare the packets to make sure that they are the same, and one of them forwards the packet, and one of them drops it. In a way, this scenario is not really load balancing, as both machines still receive the packet and process it to some degree. It would only really be worthwhile if very little processing was done on each packet to decide which machine was to forward it. Otherwise, you get no real gain from having two machines active. Perhaps this could be a hash lookup, which, if the hashes match, then some very simple algorithm could then be used to decide which unit will forward and which unit will drop the packet. Maybe a lot more of the CPU and system resources would be dedicated to NATing the packet and applying various other rules to it, so this scheme would work. Anyway, I don't really know if that is how it does it. I'll have to research a bit more.
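
The 'hash, then a very simple rule' idea can be sketched in a few lines. This is only an illustration of the guess above, not how f5 or Saru actually work: every unit would compute the same cheap hash of the source address, and only the unit whose index matches would forward. forwarder_for is a made-up name and cksum stands in for whatever hash a real device would use.

```shell
# Speculative sketch: NOT how f5 or Saru actually decide.
# Every unit hashes the source address the same way, and only the
# unit whose index matches the result forwards the packet.
forwarder_for() {
    src_ip=$1
    units=${2:-2}
    # cksum prints "<crc> <byte count>"; use the CRC as a cheap hash.
    printf '%s' "$src_ip" | cksum | awk -v n="$units" '{ print $1 % n }'
}

# Example: unit 0 forwards only when this prints 0, unit 1 when it prints 1.
#   forwarder_for 10.0.0.1
```

The per-packet cost is one hash and one comparison, which is the kind of cheap decision the scheme would need for two active units to be worthwhile.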

Wednesday, January 24, 2007

fixing broken/missing /var/lib/dpkg/available file

/var/lib/dpkg/available went missing on a machine, breaking commands like 'dpkg -l':

ws-1:/usr/local/bin# dpkg -l
dpkg-query: failed to open package info file `/var/lib/dpkg/available' for reading: No such file or directory

'dselect update' (from man dpkg) fixed this (may also need to run apt-get update prior)

ws-1:/usr/local/bin# dselect update
Hit http://192.168.241.146 stable/main Packages
Hit http://192.168.241.146 stable/main Release
Hit http://192.168.241.146 stable/contrib Packages
Hit http://192.168.241.146 stable/contrib Release
Hit http://192.168.241.146 stable/non-free Packages
Hit http://192.168.241.146 stable/non-free Release
Hit http://192.168.241.146 stable/updates/main Packages
Hit http://192.168.241.146 stable/updates/main Release
Hit http://192.168.241.146 stable/updates/contrib Packages
Hit http://192.168.241.146 stable/updates/contrib Release
Hit http://192.168.241.146 sarge-backports/main Packages
Hit http://192.168.241.146 sarge-backports/main Release
Hit http://192.168.241.146 sarge-backports/contrib Packages
Hit http://192.168.241.146 sarge-backports/contrib Release
Hit http://192.168.241.146 sarge-backports/non-free Packages
Hit http://192.168.241.146 sarge-backports/non-free Release
Reading Package Lists... Done
Merging Available information
Replacing available packages info, using /var/cache/apt/available.
Information about 17330 package(s) was updated.

This also gets information about packages that were not retrieved through apt-get, i.e., locally made packages
