Thursday, May 24, 2007

ubuntu - BUG: soft lockup detected on CPU#0!

My Ubuntu box (Ubuntu 7.04, AMD 1800XP, 2.6.20-15-generic #2 SMP Sun Apr 15 07:36:31 UTC 2007 i686 GNU/Linux) would get halfway through booting and just sit there, doing nothing. Ctrl+Alt+Del didn't do anything, and I couldn't get a console. To get more information, I rebooted and edited the GRUB kernel command line, taking off the 'quiet splash' options. The boot then got most of the way through fscking the first partition and reported: BUG: soft lockup detected on CPU#0!

I'm not sure what causes this exactly, but a quick search turned up forum threads where a lot of people mentioned the wireless network card as a possible source. The machine does have a wireless network card (a RaLink RT2561/RT61 802.11g PCI chipset), but I'm not sure whether it has anything to do with the problem. After a couple of futile soft reboots, I pulled the power, checked that the wireless card was properly seated (it was), and booted the machine again. It worked fine. I should try to find out more about what the error means, though.

Sunday, May 13, 2007

setting up ldapv3 under Ubuntu 7.04

I had numerous problems getting LDAP running under Ubuntu 7.04, which comes with slapd v2.3.30 and libldap2.3. I was setting it up as a precursor to running postfix and courier with virtual domain support. The main issue was SASL. I think it can be disabled, but I wanted to use it as a challenge (and it may be required for LDAP to be fully v3 compliant). I have read that no MTA currently supports SASL anyway, and that they are all v2 compliant only. I previously used LDAP v2 under Debian Sarge, which does not require SASL and is very easy to get running out of the box. In any case, as MTAs authenticate via v2 only, I think you need to enable v2 bind support in slapd.conf even if you are running LDAP v3.

It is a very good idea to read the guide at www.openldap.org, as well as the relevant man pages, of which there are many. The documentation requires very careful attention.

I found a few useful pages besides the official OpenLDAP documentation. Here's one that has been translated from Japanese:

http://www.tom.sfc.keio.ac.jp/~torry/ldap/ldap_en.html#doc4_15616

I didn't delete the sasldb2 file as the author suggests, though.

Another one:

http://defindit.com/readme_files/ldap.html

And a PDF called 'Surviving Cyrus SASL':

http://postfix.state-of-mind.de/patrick.koetter/surviving_cyrus_sasl.pdf

A bit about SASL. See the Wikipedia entry for background.

LDAP specific:

SASL - you can authenticate requests via SASL (e.g., for ldapdelete or other write operations), which checks a password in /etc/sasldb2, or you can authenticate against information in the LDAP db, as you would for other users. I'm still not 100% clear on whether you need to use SASL or not, or exactly what is required to get it working (some of my config options, like sasl-realm and sasl-host, may be superfluous, and I didn't use a slapd.conf file in /usr/lib/sasl2, etc.). I am also not sure which auth mechanisms you need to set up.

I found that if you specify a DN with the '-D' flag (together with '-x', which selects simple authentication), it will attempt to bind against what is in the LDAP db, e.g.:

ldapdelete -x -D "cn=admin,dc=cm,dc=net" -W "ou=people,dc=cm,dc=net" -h localhost

but if you leave out '-x' and don't specify a DN, it will use SASL:

root@cm:/home/cm# ldapdelete -U admin "ou=people,dc=cm,dc=net" -h localhost
SASL/DIGEST-MD5 authentication started
Please enter your password:
SASL username: admin
SASL SSF: 128
SASL installing layers

Note that while SASL provides an *authentication* mechanism, the authenticated user still needs permission to do things such as read from or write to the LDAP db (authorization). So you still need entries in the LDAP configuration granting authenticated users permission to read, write, etc. (this is the same as with earlier versions of LDAP, of course). Under v3, you could set LDAP up so that it authenticates via SASL and then uses the ACLs to control access. In the example below, the user authenticates successfully but has no write access:

cm@cm:~$ ldapadd -f 2nd.ldif -h localhost
SASL/DIGEST-MD5 authentication started
Please enter your password:
SASL username: cm
SASL SSF: 128
SASL installing layers
adding new entry "ou=People, dc=cm,dc=net"
ldap_add: Insufficient access (50)
additional info: no write access to parent


Note that you must specify the username with the '-U' flag (unless you are already logged in as that user).


direct mapping vs search-based mapping

Direct mapping avoids having to do an LDAP lookup. It uses 'authz-regexp' or 'sasl-regexp' to rewrite the DN that SASL produces for a request into a DN in the directory. ('authz-regexp' and 'sasl-regexp' are the same AFAIK. authz is short for authorization, probably to distinguish it from authentication. I could not find any reference to 'sasl-regexp' using apropos, but I was able to use both interchangeably in slapd.conf, and the debug log shows sasl-regexp as 'slap_authz_regexp'.)

LDAP SASL library calls from the debug log:

May 13 22:53:09 cm slapd[24029]: do_sasl_bind: dn () mech DIGEST-MD5
May 13 22:53:09 cm slapd[24029]: SASL [conn=8] Debug: DIGEST-MD5 server step 2
May 13 22:53:09 cm slapd[24029]: slap_sasl_getdn: u:id converted to uid=admin,cn=cm.net,cn=DIGEST-MD5,cn=auth
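
To make the direct-mapping idea concrete, here is a minimal Python sketch of the kind of rewrite an 'authz-regexp' rule performs. It is only an illustration, not how slapd implements it: the pattern and the target DN layout (cn=<user>,dc=cm,dc=net) are assumptions modelled on the DNs shown in this post, and map_sasl_dn is a made-up helper name.

```python
import re

# Hypothetical rule, written the way it might appear in slapd.conf:
#   authz-regexp uid=([^,]*),cn=cm.net,cn=DIGEST-MD5,cn=auth cn=$1,dc=cm,dc=net
# Both the pattern and the replacement are assumptions for illustration.
PATTERN = re.compile(r"uid=([^,]*),cn=cm\.net,cn=DIGEST-MD5,cn=auth", re.IGNORECASE)
REPLACEMENT = r"cn=\1,dc=cm,dc=net"


def map_sasl_dn(sasl_dn):
    """Rewrite a SASL authentication DN into a directory DN.

    Returns the input unchanged when the rule does not match.
    """
    match = PATTERN.fullmatch(sasl_dn)
    if match is None:
        return sasl_dn
    # Substitute the captured username into the replacement template.
    return match.expand(REPLACEMENT)


if __name__ == "__main__":
    print(map_sasl_dn("uid=admin,cn=cm.net,cn=DIGEST-MD5,cn=auth"))
```

No directory search is involved: the username captured from the SASL DN is substituted straight into the replacement string, which is why direct mapping is the cheaper option.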

Search-based mapping binds to the LDAP server and looks up the user via LDAP calls. It is more 'expensive', as you have to search the DIT.

Tuesday, May 08, 2007

Linux-HA and heartbeat

http://www.linux-ha.org/

The goal of the Linux HA project is to provide 'a high availability (clustering) solution for Linux which promotes reliability, availability...'

The Linux-HA project's main software product is Heartbeat, which is a cluster management program for high availability clustering. Some of the features of heartbeat include:

- no fixed number of nodes
- resource monitoring
- fencing

Linux-HA states that its software (Heartbeat) is well integrated with separate projects like LVS and DRBD.

Heartbeat v2 can do resource monitoring of Lustre, for example, by calling /etc/init.d/lustre with the 'status' argument (I think the script looks in /proc or /sys). See https://mail.clusterfs.com/pipermail/lustre-discuss/2005-September/000870.html
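
As a rough illustration of status-based monitoring, the Python sketch below runs a status command and treats exit status 0 as 'running', which is the LSB init-script convention. This is not Heartbeat's own code; service_is_running is a hypothetical helper, and whether the Lustre init script follows the convention exactly is an assumption.

```python
import subprocess
import sys


def service_is_running(status_command):
    """Run a status command and report whether it exited with status 0.

    LSB-style init scripts return 0 from 'status' when the service is
    running; any other exit status is treated here as 'not running'.
    """
    try:
        result = subprocess.run(
            status_command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The script is missing or not executable.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Harmless stand-in command so the sketch runs anywhere; on a real
    # node this would be ["/etc/init.d/lustre", "status"].
    demo = [sys.executable, "-c", "raise SystemExit(0)"]
    print("running" if service_is_running(demo) else "not running")
```

A monitor would call something like this at a fixed interval and trigger a restart or failover after a failed check.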

keepalived

keepalived has been developed primarily as a means of providing high availability for LVS. It does this via health checks of protocols and services, informing the kernel when a server fails (or becomes available again), and it uses a version of VRRP to handle failover for the LVS director. (VRRP is an open standard based on Cisco's HSRP.)

-It runs as a user-space program

-It checks multiple layers of the network stack (in terms of the OSI 7-layer model): the IP level (3), the TCP level (4), the session layer (5), and the application layer (7).
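
To show what a layer-4 check amounts to, here is a small Python sketch of a TCP connect test against a real server. It only illustrates the idea and is not keepalived's implementation; tcp_check and the default timeout are made up for the example.

```python
import socket


def tcp_check(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection performs the full TCP handshake; the
        # connection is closed as soon as the 'with' block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, host unreachable, timeout, and so on.
        return False
```

A checker built on this would drop a server from the pool when tcp_check returns False and add it back once the check succeeds again.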


"Keepalived implements a framework based on three family checks : Layer3, Layer4 & Layer5/7. This framework gives the daemon the ability of checking a LVS server pool states. When one of the server of the LVS server pool is down, keepalived informs the linux kernel via a setsockopt call to remove this server entry from the LVS topology. In addition keepalived implements an independent VRRPv2 stack to handle director failover. So in short keepalived is a userspace daemon for LVS cluster nodes healthchecks and LVS directors failover."

LVS

LVS is described on the home page of the project as a 'load balancer'. The aim is to provide scalability for services. The main product of the LVS project is IPVS, which provides Layer-4 (transport layer) load balancing inside the kernel.

From the home page:

"IPVS implements transport-layer load balancing (layer-4 switching) inside the Linux kernel"

They also have Layer-7 (application-level) switching inside the kernel under development, using KTCPVS.
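
As a toy illustration of the layer-4 load balancing described above, the Python sketch below hands out real servers in round-robin order. IPVS does this in the kernel and offers several scheduling algorithms, round-robin being one of them; the class name and the addresses are made up for the example.

```python
class RoundRobinScheduler:
    """Pick real servers in strict rotation."""

    def __init__(self, real_servers):
        self._servers = list(real_servers)
        if not self._servers:
            raise ValueError("at least one real server is required")
        self._next = 0

    def pick(self):
        """Return the next real server and advance the rotation."""
        server = self._servers[self._next]
        self._next = (self._next + 1) % len(self._servers)
        return server


if __name__ == "__main__":
    # Hypothetical real-server addresses behind one virtual service.
    scheduler = RoundRobinScheduler(["192.168.0.11:80", "192.168.0.12:80"])
    for _ in range(4):
        print(scheduler.pick())
```

Each new connection to the virtual service would be sent to whatever pick() returns, so load is spread evenly when the servers are of similar capacity.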


To provide redundancy, there are several possibilities listed on their home page:

- Piranha
- Ultra Monkey
- keepalived
- heartbeat + mon + coda
- heartbeat + ldirectord

I have set up a cluster using the last option. The 2nd is what nr uses.