The Ninth Dimension: 2006

Wednesday, December 27, 2006

spamassassin

spamassassin and OCR plugin

I wanted to install something to weed out spam that carries its message inside GIF images. Normally,
these can't be parsed by standard methods. AC pointed me in the direction of an OCR plugin, which can
scan an image and recognise certain patterns. At first, I found OCR, which was based on some perl stuff
as well as gocr. I had some problems integrating it into the existing spamassassin setup: it just was
not running the scans, and I didn't know where to put the 'loadplugin' statements and perl modules.
I found that running spamassassin in debug mode is the best way to find out exactly where it looks and
what tests it does. By this point I'd found that there was a newer plugin which claimed improvements on OCR:

FuzzyOCR (wiki at: http://fuzzyocr.own-hero.net/wiki/Installation-3.x)

spamassassin --debug FuzzyOcr < ./gif_spam > /dev/null

So I ran it in debug mode, and it printed all the paths it searches, the plugins it loads, and the tests it runs.

It turned out that it was using /var/lib/spamassassin, not the standard /usr/share/spamassassin (which also existed).

The wiki comes with good instructions and suggests putting the plugin in /etc/spamassassin, where spamd just picked it up (after a reload). I had to set the db directory to something writeable by the nobody user though. Running it in debug mode (as above) reported all the issues it had.

http://spamassassin.apache.org/tests_3_1_x.html

AREA TESTED | LOCALE | DESCRIPTION OF TEST | TEST NAME | DEFAULT SCORES (local, net, with bayes, with bayes+net) | MORE INFO (additional wiki docs)
body | | Generic Test for Unsolicited Bulk Email | GTUBE | 1000.000 | Wiki
body | | Incorporates a tracking ID number | TRACKER_ID | 2.000 1.295 2.292 1.032 | Wiki
body | | Weird repeated double-quotation marks | WEIRD_QUOTING | 1.120 1.200 1.295 1.341 | Wiki

Tuesday, December 05, 2006

exim - removing frozen messages

Found this as a one-liner to remove frozen messages on an exim server
Haven't tried it - might need to replace 'exim' with 'exim4' though

mailq | awk '/frozen/ { print $3 }' | xargs exim -Mrm
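Since the one-liner is untested, a dry run seems a sensible first step. The sketch below wraps the same awk filter in a function so that the message IDs can be inspected before anything is removed. The name frozen_ids is made up for illustration, and it assumes the usual queue listing where the message ID is the third field of the line marked as frozen.

```shell
# List the IDs of frozen messages from an exim queue listing on stdin.
# frozen_ids is a hypothetical helper name; the awk filter is the one above.
frozen_ids() {
    awk '/frozen/ { print $3 }'
}

# Dry run: show which messages would be removed.
#   mailq | frozen_ids
# Then remove them (-r is a GNU xargs option that skips the call on empty input):
#   mailq | frozen_ids | xargs -r exim -Mrm
```

As noted above, 'exim' may need to be 'exim4' depending on the packaging.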

Thursday, November 16, 2006

Fixing timestamps on mail server

Messages that passed through one of our mailservers were stamped with UTC rather than EST, which caused some customers to complain. The hardware clock showed UTC instead of EST, and I wasn't able to change that via the hwclock options. The problem was that /etc/localtime was missing; it should be a symlink to the right zone file under /usr/share/zoneinfo.

It should have looked like this:

# ls -l /etc/localtime
lrwxrwxrwx 1 root root 48 Mar 31 11:19 /etc/localtime -> /usr/share/zoneinfo/Australia/Sydney
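Recreating the link is a one-line job. The sketch below wraps it in a small function; set_localtime is a made-up name, and the optional second argument exists only so that the function can be tried against a scratch path instead of the real /etc/localtime.

```shell
# Point a localtime file at a zoneinfo entry.
# set_localtime is a hypothetical helper; the target defaults to /etc/localtime.
set_localtime() {
    zonefile=$1
    target=${2:-/etc/localtime}
    # -s makes a symlink, -f replaces an existing file or link
    ln -sf "$zonefile" "$target"
}

# As root:
#   set_localtime /usr/share/zoneinfo/Australia/Sydney
```

Services that read the timezone at startup will probably need a restart before they pick up the change.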

Wednesday, November 15, 2006

mkinitrd, depmod

Note that mkinitrd does not work on Debian post 2.6.12 kernels. It has been replaced by other packages, e.g., initramfs-tools (which I have installed), or Yaird, or linux-initramfs-tool.

Problems trying to use a 3ware 9550-sx 4LP on a box running Lustre

The default 2.6.8-3 debian kernel image works fine, but could not get the SUSE 2.6.5 Lustre kernel to work. Tried using the default image, and then generating an initrd. Also tried using the kernel source tree of that version, with drivers compiled in, or as modules. No luck. It was not able to mount root fs. This was on a Sarge machine. Below is the process I used to mkinitrd for a Debian 'Sarge' box, using a lustre-patched redhat EL kernel (2.6.9).

First, I tried making initrd using default /etc/mkinitrd/mkinitrd.conf values. It was failing silently. Solution: change the following:

# Command to generate the initrd image.
MKIMAGE='mkcramfs %s %s > /dev/null'

I copied mkext2fs (obtained from another machine) into /usr/local/sbin and put the following line at the suggestion of a colleague:

# Command to generate the initrd image.
MKIMAGE='/usr/local/sbin/mkext2fs %s %s > /dev/null'

The command to generate the initrd.img was:

mkinitrd -d /etc/mkinitrd -o /boot/initrd.img- /lib/modules/

But this failed because it could not find a modules.dep file, which was not generated by the conversion of the .rpm kernel image I had downloaded from clusterfs.com (I used alien -c to generate the .deb). So I ran a 'depmod -a <kernel_ver>', where 'kernel_ver' was equivalent to the uname -r of the kernel that I wanted to run. This generated my modules.dep.

Tried again. This failed because it reported that /tmp was running out of space. The /tmp filesystem had 288GB free, so it wasn't actually the /tmp filesystem, but it must have made a loopback filesystem of a certain small size on /tmp. The solution to this was to change:

# What modules to install.
MODULES=most

to:

# What modules to install.
MODULES=dep

in the /etc/mkinitrd/mkinitrd.conf file, and list the essential modules in /etc/mkinitrd/modules

e.g.,

scsi_mod
libata
ata_piix
3w_9xxx
ext3

These must be built as modules, otherwise mkinitrd will complain that it can't find them (of course). That happens whether they are compiled into the kernel or not built at all. Even if it complains, it will probably still build the image, and the system may still boot ok if the drivers are compiled into the kernel.
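To see in advance which entries mkinitrd is likely to complain about, the module list can be compared against modules.dep. This is only a sketch: missing_modules is a made-up helper, it assumes one module name per line in the list (with '#' comments), and it treats '-' and '_' in module names as equivalent.

```shell
# Print the modules named in a list file that have no entry in modules.dep.
# Usage: missing_modules /etc/mkinitrd/modules /lib/modules/<kernel_ver>/modules.dep
missing_modules() {
    list=$1
    dep=$2
    # Reduce each modules.dep line to a bare module name: drop the
    # dependency list, the directory part and the .ko suffix.
    known=$(sed 's/:.*//; s|.*/||; s/\.ko.*//' "$dep" | tr '-' '_')
    # Skip blank lines and comments, then look each name up.
    grep -Ev '^[[:space:]]*(#|$)' "$list" | while read -r mod _; do
        norm=$(printf '%s\n' "$mod" | tr '-' '_')
        printf '%s\n' "$known" | grep -Fqx -- "$norm" || echo "$mod"
    done
}
```

A name printed here is either compiled into the kernel or not built at all; the check cannot tell the two cases apart.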

One question on my mind is whether the order in which they are listed in /etc/mkinitrd/modules matters. I suspect it doesn't, which might be why having a 'modules.dep' file is important: some modules need to be loaded before others. I think dpkg kernels run pre-inst scripts (e.g., /var/lib/dpkg/info/linux-image-2.6.16-2-686-smp.preinst). Maybe after they install the modules they run a mkinitrd against the module tree. Need to spend a bit more time mucking around to find out.

One thing I did find out was that copying driver source from a 2.6.17 kernel source tree to a 2.6.5 source tree and then trying to compile does not always work, although it has worked in some cases. I was doing this to get a later version of the driver into a Lustre kernel source tree, which was a fairly old one (2.6.5).

debian:/etc/mkinitrd# cd
debian:~# mkinitrd -d /etc/mkinitrd -o /boot/initrd.img-2.6.5lustre.1.4.7 /lib/modules/2.6.5lustre.1.4.7
/usr/sbin/mkinitrd: add_modules_dep_2_5: modprobe failed
FATAL: Module libata not found.
FATAL: Module ata_piix not found.
WARNING: This failure MAY indicate that your kernel will not boot!
but it can also be triggered by needed modules being compiled into
the kernel.

initial ramdisk creation in Debian stock kernels (from /var/lib/dpkg/info/linux-image-2.6.16-2-686-smp.postinst):

my @ramdisklist;
@ramdisklist = find_inird_tool($hostversion, $version, $ramdisk) if $ramdisk;
die "Failed to find suitable ramdisk generation tool for kernel version \n" .
    "$version on running kernel $hostversion in $ramdisk\n"
    if $#ramdisklist < 0;
my $success = 0;
for $ramdisk_cmd (@ramdisklist) {
    print STDERR "Using $ramdisk_cmd to build the ramdisk.\n";
    print STDERR "Other valid candidates: @ramdisklist\n" if $#ramdisklist > 0;

    my $initrd_path = $realimageloc . "initrd.img-$version";
    my $ret = system("$ramdisk_cmd " .
                     ($mkimage ? "-m '$mkimage' " : "") .
                     "-o $initrd_path.new $modules_base/$version");
    if ($ret) {
        warn "$ramdisk_cmd failed to create initrd image.\n";
    }
    else {
        rename("$initrd_path.new", "$initrd_path")
            or die("Failed to rename initrd ($initrd_path)\n");
        $success = 1;
        last;
    }
}

Monday, November 13, 2006

shutdown stuff

Running the following sequence of commands is useful if you have a machine that tries to stop a lot of processes that take a long time to stop or might hang the shutdown process (e.g., unmounting NFS shares when there are NFS problems). It assumes that it is ok to terminate processes immediately. You run the sync first to ensure that local data is written to disk; the -n on reboot then skips the sync it would otherwise attempt, which could hang on the NFS data.

sync
reboot -fn

'shutdown -c' will cancel a pending shutdown that has been issued

slay and kill -9 -1

slay is like kill -9 -1
It will kill all processes belonging to a user

AC used this to kill some hung 'umount -f '

Very handy

Monday, November 06, 2006

dns statistics with dnstop

dnstop -t -s eth0

Toggle with 'c' - this appears to be undocumented. Below are some of the flags (from the man page):

s display the source address table

d display the destination address table

t display the breakdown of query types seen

o display the breakdown of opcodes seen

1 show the TLD table

2 show the SLD table

3 show the 3LD table

c show the SLD+source table

# show the 3LD+source table

^R reset the counters

^X exit the program

? help

Sunday, November 05, 2006

Apache server used as a spam proxy via php bug

Whilst working at the datacentre one day, I got a call from the office
to say that the load on a few of our cPanel servers was very high. After
a bit of looking around, I noticed that there were a lot of log entries
from various IPs in Taiwan to port 80, which requested connections to port 25
of another IP:

201.63.4.219 - - [16/Jan/2005:14:03:47 +1100] "CONNECT 215.66.11.47:25 HTTP/1.0" 200 1243

What was happening was that Apache was proxying connections made to port 80
through to mail servers to send spam (presumably so it didn't look like the messages
came from their IP). I know that Apache can be used as a proxy (using mod_proxy), but
this was not enabled in the httpd.conf. After some quick checking via google, the
problem turned out to be a bug in php.

##### php (apparently) has a
##### vulnerability which allows Apache to be used as a
##### proxy without the mod_proxy or mod_proxy_connect
##### modules. To block this, we block 'CONNECT'

<Limit CONNECT>
Order deny,allow
Deny from all
</Limit>
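To get a quick picture of who is doing this, the CONNECT requests can be counted per client straight from the access log. This is a sketch under the assumption that the log uses the format shown in the entry above, where the request method is the sixth whitespace-separated field; connect_clients is a made-up name.

```shell
# Count CONNECT requests per client IP, busiest first.
# Reads the named log files, or stdin if none are given.
connect_clients() {
    awk '$6 == "\"CONNECT" { hits[$1]++ }
         END { for (ip in hits) print hits[ip], ip }' "$@" | sort -rn
}

# e.g.  connect_clients /path/to/access_log | head
```

Each output line is a request count followed by the client address.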

Wednesday, October 25, 2006

serveraid 8i aacraid x86_64

Recently built a new database server with a serveraid 8i raid controller, which is based on an adaptec chipset. Under linux, it uses the aacraid driver. AFAIK, it is not supported by any current Debian Etch AMD-64 boot CDs. A quick google turned up a site which provides tools to do a net boot using a very minimal CD image (http://kmuto.jp/b.cgi/debian/d-i-2615-amd64.htm) which saved me a lot of time (note: must use 'modprobe aacraid' to get it to see the controller). I installed the standard SMP Debian 2.6.17 AMD 64 kernel, which allows it to boot fine. However, this kernel was not suitable for our purposes for a number of reasons, so I wanted to roll my own. But that is proving extremely difficult. When compiling the driver into the kernel, it sees the controller just fine, but does not pick up the disks. Same when using an initrd. I have tried copying the kernel config from the debian kernel and deleting stuff that wasn't needed, like pcmcia, acpi, video for linux, sound card support (this is a server, after all!), but to no avail. I tried patching the source tree with the latest driver source (actually, a rather dodgy patch of just replacing the original .c and .h files with the newer one; it still compiles ok), and updating the firmware. No luck so far. I am thinking that I might try comparing the kernel messages on boot between the working debian kernel and my failed attempts. So far nothing is standing out.

Wednesday, October 11, 2006

sendfile() issue on lustre 1.4.x

The messages below relate to an issue with Lustre and the sendfile() syscall, which Lustre 1.4.x does not support. sendfile() copies data from a file straight to a socket inside the kernel, rather than reading it into a user-space buffer and writing it back out. The messages are caused by proftpd running on this server. You can disable sendfile() by recompiling proftpd with it turned off.

Oct 12 12:46:47 cthulhu kernel: Lustre: 1483:0:(rw.c:1380:ll_readpage()) ino 20592684 page 175 (716800) not covered by a lock (mmap?). check debug logs.
Oct 12 12:46:47 cthulhu kernel: Lustre: 1483:0:(rw.c:1380:ll_readpage()) previously skipped 286 similar messages
Oct 12 12:46:53 cthulhu kernel: Lustre: 1483:0:(rw.c:1380:ll_readpage()) ino 20592684 page 245 (1003520) not covered by a lock (mmap?). check debug logs.
Oct 12 12:46:53 cthulhu kernel: Lustre: 1483:0:(rw.c:1380:ll_readpage()) previously skipped 69 similar messages
Oct 12 13:56:26 cthulhu kernel: Lustre: 3154:0:(rw.c:1380:ll_readpage()) ino 16526558 page 0 (0) not covered by a lock (mmap?). check debug logs.
Oct 12 13:56:26 cthulhu kernel: Lustre: 3154:0:(rw.c:1380:ll_readpage()) previously skipped 15 similar messages
Oct 12 13:56:47 cthulhu kernel: Lustre: 3154:0:(rw.c:1380:ll_readpage()) ino 16526526 page 0 (0) not covered by a lock (mmap?). check debug logs.
Oct 12 13:58:59 cthulhu kernel: Lustre: 3154:0:(rw.c:1380:ll_readpage()) ino 27364595 page 0 (0) not covered by a lock (mmap?). check debug logs.
Oct 12 14:01:24 cthulhu kernel: Lustre: 3154:0:(rw.c:1380:ll_readpage()) ino 13216322 page 0 (0) not covered by a lock (mmap?). check debug logs.
Oct 12 14:07:43 cthulhu kernel: Lustre: 3216:0:(rw.c:1380:ll_readpage()) ino 20459809 page 68 (278528) not covered by a lock (mmap?). check debug logs.
Oct 12 14:08:04 cthulhu kernel: Lustre: 3216:0:(rw.c:1380:ll_readpage()) ino 20459809 page 209 (856064) not covered by a lock (mmap?). check debug logs.
Oct 12 14:08:04 cthulhu kernel: Lustre: 3216:0:(rw.c:1380:ll_readpage()) previously skipped 140 similar messages
Oct 12 14:11:48 cthulhu kernel: Lustre: 3154:0:(rw.c:1380:ll_readpage()) ino 19054807 page 42 (172032) not covered by a lock (mmap?). check debug logs.
Oct 12 14:11:48 cthulhu kernel: Lustre: 3154:0:(rw.c:1380:ll_readpage()) previously skipped 37 similar messages
Oct 12 14:15:08 cthulhu kernel: Lustre: 3154:0:(rw.c:1380:ll_readpage()) ino 19054900 page 30 (122880) not covered by a lock (mmap?). check debug logs

Sunday, October 08, 2006

more sed stuff

from http://www.oracle.com/technology/pub/articles/dulaney_sed.html

$ cat sample_one
one 1
two 1
three 1
one 1
two 1
two 1
three 1
$

Suppose that it would be desirable for "1" to be substituted with "2," but only after the word "two" and not throughout every line. This can be accomplished by specifying that a match is to be found before giving the substitute command:

$ sed '/two/ s/1/2/' sample_one
one 1
two 2
three 1
one 1
two 2
two 2
three 1
$

handy way in shell of moving files with spaces in them

Say you have a bunch of files with spaces in the names, e.g.,:

02 Workinonit.m4a

And you want to move them to names without spaces (e.g., convert them to underscores). Luckily, there's an easy way: use 'read'

Let's say we want to copy them to filenames with an underscore replacing the space. To check that we have it right, the following command will echo what the result would be

find . -type f | while IFS= read -r f ; do echo cp "$f" "`echo "$f" | sed 's/  */_/g'`";done

cp ./18 Don't Cry.m4a ./18_Don't_Cry.m4a

Then just remove the first 'echo':

find . -type f | while IFS= read -r f ; do cp "$f" "`echo "$f" | sed 's/ /_/g'`";done

and you are done :) Interestingly, the site I found this on used sed 's/  */_/g' (i.e., two spaces followed by an asterisk). This works because the pattern means one space followed by zero or more further spaces, so each run of spaces becomes a single underscore. I also found that you can use alternation to pick up other characters, like !,(). Apostrophes are awkward, but the problem is shell quoting rather than sed: inside a single-quoted sed script, a backslash will not escape an apostrophe, nor will two backslashes. Either put the script in double quotes, or close the single quotes, add an escaped apostrophe, and reopen them ('\''). If you want to replace spaces, commas, exclamation marks and brackets in filenames, use:

sed -r 's/ |\(|\)|,|!/_/g'
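The same substitution can be written with a bracket expression, which also makes it easy to include the apostrophe. The sketch below is one way of doing it, not the method from the site mentioned above; sanitize_name is a made-up name. The apostrophe is added by closing the single quotes, writing \' and reopening them.

```shell
# Replace spaces, brackets, commas, exclamation marks and apostrophes
# in a name with underscores. The name is passed as the first argument.
sanitize_name() {
    printf '%s\n' "$1" | sed 's/[ (),!'\'']/_/g'
}

# Dry run over the files in the current directory (no recursion):
#   for f in *; do [ -f "$f" ] && echo mv -- "$f" "$(sanitize_name "$f")"; done
```

For example, sanitize_name "18 Don't Cry.m4a" gives 18_Don_t_Cry.m4a. Remove the 'echo' from the loop once the output looks right.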

I haven't yet worked out how to do all this in perl

Thursday, October 05, 2006

useful debian tools

update-rc.d

e.g., update-rc.d aveserver start 50 2 3 4 5 . stop 30 0 1 6 .

modconf - allows you to add/remove modules from start up

Monday, October 02, 2006

update-rc.d

Debian way of managing sysV init scripts

You can create links or delete them, among other things, with this script. It is better to manage service startup with this rather than, e.g., deleting symlinks manually, as they may get recreated when the service is updated. Thus, if you want to disable a service, use update-rc.d to set this (see man page). There are a variety of other handy scripts for managing and configuring services on Debian under the name of update-*

I/O scheduler for linux

The kernel I/O scheduler (the available schedulers are set at kernel compile config)
There are schedulers optimized for servers and desktops. 'noop' and 'deadline' are best for server environments, whereas 'anticipatory' and 'cfq' are for desktops

If set, it can be seen in /sys/block/<device>/queue/scheduler in later kernels (2.6.??)
Otherwise, it uses whatever default for the vendor kernel (e.g., SUSE enterprise use CFQ for their kernels by default)

You need to set it for each device, e.g., sda, sdb, hda etc

echo "deadline" > /sys/block/sda/queue/scheduler

(sysctl cannot be used for this: it only covers the settings under /proc/sys, and the scheduler lives under /sys.)

cat /sys/block/sda/queue/scheduler
noop anticipatory [deadline] cfq

The one in the square brackets is what is set
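When checking several devices, it helps to pull out just the bracketed name. A minimal sketch follows; active_scheduler is a made-up name, and it assumes the file has the single-line format shown above.

```shell
# Print the scheduler currently in use, i.e. the name inside [ ].
# Usage: active_scheduler /sys/block/sda/queue/scheduler
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}

# Show the active scheduler for every block device:
#   for f in /sys/block/*/queue/scheduler; do
#       echo "$f: $(active_scheduler "$f")"
#   done
```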

It can be set while running in later kernels too by changing that parameter

Otherwise, it can be set at boot time. Debian has a utility called 'sysfsutils' which allows you to set it in /etc/sysfs.conf

sysfsutils - sysfs query tool and boot-time setup

# cat /etc/sysfs.conf | grep cfq
block/sda/queue/scheduler = cfq
block/sdc/queue/scheduler = cfq

# /etc/init.d/sysfsutils restart

Or can be done via parameter at boot:

GRUB:
root=/dev/sda1 noapic elevator=deadline

LILO:
append="elevator=deadline"

From Lustre documentation:

deadline – This is the 'old' scheduler, which works well for servers.

anticipatory I/O scheduler (AS) – It is designed for 'batching' I/O requests. It does not work well for servers and high IO loads.

cfq – It adds multiple scheduling classes. cfq also does not work well for servers and high I/O loads.

noop – This is the 'old' elevator. It works well for servers.

This seems quite different to what RedHat recommend http://www.redhat.com/magazine/008jun05/features/schedulers/

Wednesday, September 27, 2006

backporting with debian

The quick and simple way:

1. Add the repository to your /etc/apt/sources.list, e.g., if you are running 'sarge' and want later packages:

deb http://www.backports.org/debian sarge-backports main contrib non-free

2. apt-get update

3. apt-get -t sarge-backports install

Sunday, September 24, 2006

udpcast for machine replication

http://www.udpcast.linux.lu/

One way of replicating many machines at once, if they are identical hardware, including disk drives etc, is to use udpcast. The idea is that one machine is used as a template for the others. Each machine is booted off a special boot disk (or a system image served over the network) which has some special programs that allow machines to be replicated over the network. One machine acts as a sender and the others are receivers.

I had to build a special kernel as the disk controllers used were not yet in the kernel source tree. It is actually extremely easy to build a boot disk with the instructions given at http://www.udpcast.linux.lu/mkimagedoc.html. For a CD image, it just involves having the kernel image (can be placed anywhere) and the modules (in /lib/modules, if you are using a modular kernel). Then run makeImage -k -c

It will replicate the disks (maybe using dd?) over the network. In my case, I just used a crossover cable as I was just doing one machine (it is really designed to do many machines at once)

Of course, you may need to modify some things, like IP addresses, hostnames etc...

Friday, September 22, 2006

Passing parameters for modules and kernel drivers

As with modules, it is possible to pass parameters to drivers compiled into the kernel at boot time. There is documentation in the kernel source tree under Documentation/kernel-parameters.txt. To find out what parameters a module will accept, use 'modinfo -p' followed by the module name (for a binary module). You can also look at the source code of the driver (if, for instance, your driver is compiled into the kernel).

For instance, if the driver says that you can pass it the option:

adp94xx=attach_HostRAID:1

Then you can create a file in /etc/modprobe.d/ with:

options adp94xx adp94xx=attach_HostRAID:1

If the driver is in the kernel, you alter it to:

adp94xx.adp94xx=attach_HostRAID:1

Assuming you are using GRUB, edit the kernel line of the boot entry and append the parameter above.
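The rewrite from the modprobe.d form to the boot-parameter form is mechanical: prefix each option with the module name and a dot. The sketch below does exactly that and nothing more; modopts_to_cmdline is a made-up name, and it does not check that the driver really accepts the option.

```shell
# Turn "options <module> <opt>..." lines into "<module>.<opt>" boot parameters.
# Reads the named files, or stdin if none are given.
modopts_to_cmdline() {
    awk '$1 == "options" {
             for (i = 3; i <= NF; i++) print $2 "." $i
         }' "$@"
}

# e.g.  modopts_to_cmdline /etc/modprobe.d/*
```

Fed the 'options' line for adp94xx, it prints the module name, a dot, and then the option string unchanged.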

Wednesday, September 20, 2006

adp94xx kernel driver for linux

Had to install Debian Sarge on an IBM xSeries 306m. This machine comes with an Adaptec 9405 SAS controller, which is not yet in the kernel source tree. There are rpms for redhat and SUSE available though, as well as the driver source code, so you can compile it yourself. Steps to add it to the kernel source tree:

-Create a subdir in drivers/scsi called adp94xx
-copy the source files (.h, .c, and Makefile) provided by adaptec into this directory
-add this directory to the top level Makefile in the scsi directory, basing the format on the aic7xxx stuff
-add a line to Kconfig so that menuconfig would pick it up

One question: at what point do you use 'patch' to modify the kernel source tree as opposed to e.g., just sticking in a driver? Is a patch just a way of more conveniently and consistently adding the same thing (as well as distributing it)?

This enabled me to compile it into the kernel, which avoids the necessity of using an initrd to boot the system, since the rootfs will be on a disk attached to that controller. The other alternative is to compile it as a module (haven't yet tried this), or use the binary (rpm) modules supplied by Adaptec. As I was using Debian, I couldn't use the rpms directly, so I converted them into .deb files with Alien, and then extracted them using dpkg -x to get a .ko. One problem I have had with using the driver compiled into the kernel is not being able to get the driver to see the controller when it is configured in RAID mode. The driver defaults to non-RAID mode, which can be switched to RAID mode by sending it a parameter at boot. From reading Documentation/kernel-parameters.txt, and the documentation supplied with the driver, I figured this would be:

adp94xx.adp94xx=attach_HostRAID=1

so I added this at boot by editing the grub boot line. However, it failed to see the controller, reporting:

"Probing AIC-94xx Controller(s)...
AIC-94xx controller(s) attached = 0."

I have got it to work fine when using the driver as a module on a generic Debian 2.6.8 kernel:

modprobe adp94xx adp94xx=attach_HostRAID=1

scsi1 : Adaptec AIC-9405W SAS/SATA Host Adapter
Vendor: IBM-ESXS Model GNA073C33ESTT0Z N Rev: BHOD
Type: Direct-Access
etc
AIC-94xx controller(s) attached = 1.

(You also need to modprobe sd_mod, the SCSI disk driver, so that the disks show up as block devices you can access.)

My next attempt will be to change the line in the source code that makes the default to be off:

From adp94xx_osm.c:

/* By default we do not attach to HostRAID enabled controllers.
* You can turn this on by passing
* adp94xx=attach_HostRAID:1
* to the driver (kernel command line, module parameter line).
*/
static int asd_attach_HostRAID = 0;

to

static int asd_attach_HostRAID = 1;

I'll try this when I have more time.

UPDATE: I have tried the above, using a debian 2.6.8 sarge kernel source tree, and compiled it in to the kernel and as a module. It fails to detect the card when it is configured as RAID in the BIOS. I have tried passing it the option on the command line to set this, despite the default supposedly being '1' (modprobe adp94xx adp94xx=attach_HostRAID=1) when used as a module. The only other option I can think of right now is to install the OS on a USB disk (they are 1RU servers with no space for further internal storage). I have tried putting a single disk in and marking this disk as 'simple storage' in the BIOS of the controller which allows data migration, but this still requires RAID as enabled, and it forgets this if you disable RAID

Monday, September 18, 2006

The following packages cannot be authenticated!

When trying to install packages on an 'Etch' system (upgraded from 'Sarge')

WARNING: The following packages cannot be authenticated!

tic tac toe

Install these packages without verification [y/N]?

To fix:

apt-get install debian-archive-keyring

apt-get update

Saturday, September 09, 2006

parameter substitution in shell

To delete pattern from variable:

Assuming the value of our variable is 'abracadabra'

my_variable=abracadabra

${variable%pattern} - deletes pattern from the END of the variable's value. That is, if the value ends with pattern, it will delete it. If the pattern appears anywhere other than the end, the entire contents of the variable are printed.

echo ${my_variable%ra}

gives:

abracadab

${variable#pattern} - deletes pattern from the BEGINNING of the variable's value

echo ${my_variable#abr}

gives:

acadabra

However, if the value doesn't begin with the pattern, it will just print the variable unchanged (see above)

echo ${my_variable#ra}

gives:

abracadabra

But you can use shell wildcard patterns (the kind used for filename matching, not full regular expressions) within the match.

Thus:

echo ${my_variable#*ra}

will work from the beginning of the value of the variable and delete everything up to and including the first occurrence of 'ra'. A single '#' tells the shell to remove the shortest match, but if you use '##', it removes the longest possible match (greedy match). Thus

echo ${my_variable##*ra}

will print nothing (or a newline), since the variable ends with 'ra'
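A common use of all four operators is taking a path apart. The path below is just an example picked for illustration; the operators themselves are the ones described above.

```shell
logfile=/var/log/exim4/mainlog.1.gz   # example path, chosen for illustration

dir=${logfile%/*}      # shortest match of '/*' removed from the end
base=${logfile##*/}    # longest match of '*/' removed from the start
stem=${base%%.*}       # longest match of '.*' removed from the end
ext=${base#*.}         # shortest match of '*.' removed from the start

echo "$dir"    # /var/log/exim4
echo "$base"   # mainlog.1.gz
echo "$stem"   # mainlog
echo "$ext"    # 1.gz
```

The first two do roughly what dirname and basename do, without starting another process.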

Example usage:

To rename a list of files that end in ".bak" to a name without the ".bak":

for xx in *.bak; do mv -- "$xx" "${xx%.bak}"; done

In the Korn shell, you can also use '?' to match a single character and '[]' to match any one of the characters enclosed in the brackets, e.g., [a-s] will match any letter from a-s, and the negation [!a-s] will match any character not in a-s. A '*' or a '+' in front of a parenthesised pattern will match zero or more, or one or more, occurrences of it, e.g., ${my_variable%%+([a-r])} will print a blank. The '()' are needed around the square brackets. bash supports the same patterns once extended globbing is turned on with 'shopt -s extglob'.

Taken from: UNIX Shell programming, by Kochan & Wood

Thursday, September 07, 2006

cross-compiling between x86 versions on Debian

Say you had a nice, fast Debian 'Sarge' box and you wanted to use it to compile a kernel (or something else) for a Debian 'Woody' box - will this work? What about vice-versa? Could you compile a kernel for a Sarge box on a Woody machine? When cross-compiling for different architectures, there are special things that need to be done (look at these - what do they tell you about the things that go into compiling). There are pitfalls. For instance, if you wanted to compile a 2.4 kernel for your currently running Debian 'Woody' machine on a distro that uses a very recent version of gcc (>= 3.4), it will fail. This is because gcc 3.4 and above will not compile a 2.4 kernel. You need to use an earlier version of gcc. Presumably, vice versa, early versions of gcc would not be able to compile recent kernels either.

Another question on my mind is whether the tools used to create kernel images on later/earlier versions of an OS environment will cause problems for that kernel to be used on earlier/later OS environments. E.g., would gcc, make, bin86, bzip2, etc create a binary kernel image that just could not be used on an environment different to the machine the kernel was compiled on? For example, the version of bzip2 that compresses the kernel image on e.g., a Sarge machine, may not create it in a format that the version of bzip2 on the 'potato' machine you install it on can understand. What about creating symbol tables? And linkers?

What about using recent kernels on old machines? Problems here could be that the kernel code, i.e., scsi.c, adaptec.c, ext2.h, etc may not work with user-space utilities on an old environment (e.g., filesystems such as ext2 compiled into a 2.6 kernel may not be compatible with a Debian 'potato' system, as the code for ext2 has changed since potato was created, along with all of the tools it uses to access ext2 filesystems). Another example is a 2.6 kernel for Woody systems: the module tools on Woody cannot load the modules of a 2.6 system (the 2.6 kernel uses a different module loading system to that of Woody. Sarge has the necessary tools however). Of course, you could dispense with modules altogether.

Think about what the kernel does. It manages resources, provides access to hardware (device drivers) and so on. A quick list:

-device drivers
-filesystems
-support for executable file formats (e.g., ELF, a.out)
-security (e.g., LIDS, SELinux, GRSecurity)
-crypto
-Misc other stuff like networking, firewalls, sysctl etc

Thus, all the user-space stuff that talks to all of this has to be able to talk to the interfaces provided by the kernel.

What about things beside the kernel? e.g., if I compiled 'tar' on a woody system, could I use it on a sarge system? One question here is whether the compiler looks at your OS environment. Dynamic linking would be an issue possibly. It may assume a certain version of libc for instance. Then there is the sort of format the compiler produces.

2.6 kernel under Debian 'woody'

From: http://marc.herbert.free.fr/linux/linux2.6_for_woody.txt

The mechanism for dynamically loading kernel modules has been rewritten between 2.4 and 2.6. As a consequence, the former "modutils" tools (insmod, modprobe,...) are not compatible with 2.6. You need the new debian package "module-init-tools" instead.

More info: http://marc.herbert.free.fr/linux/linux2.6_for_woody.txt

Of course, you could always use a monolithic kernel...

boot blocks, partition tables

I couldn't be bothered summarising this. Here is a good overview:

http://en.wikipedia.org/wiki/Mbr

Inodes

Much of this is drawn from the excellent Linux Tutorials page (http://www.linux-tutorial.info/modules.php?name=Tutorial&pageid=224), and other bits from Wikipedia

Inodes are used in most unix filesystems, although there are differences in implementation. Also, some filesystems (such as ReiserFS) do not use them. In BSD, inodes are called 'vnodes', the 'v' standing for their role as part of the filesystem abstraction layer (VFS). Whether inode or vnode, they all play the same sort of role. The term 'inode' dates from unix pioneer Dennis Ritchie, and he hazards a guess that the term may have once stood for 'index node'.

The number of inodes on a disk is set when the disk filesystem is created (e.g., with mke2fs), and cannot be adjusted (at least for ext2). The list of inodes is stored in a table called the inode table near the start of the disk, along with the superblock, which contains the number of free inodes, as well as the location of the inode table (I am not sure whether, like the superblock, there can be backup copies of the inode table placed at other locations on the disk. I would presume so). One of the purposes of inodes is to store information about files, which in POSIX implementations includes:

* The length of the file in bytes.
* Device ID (this identifies the device containing the file).
* The User ID of the file's owner.
* The Group ID of the file.
* An inode number that identifies the file within the filesystem.
* The file mode, which determines what users can read, write, and execute the file.
* Timestamps telling when the inode itself was last changed (ctime), the file content last modified (mtime), and last accessed (atime).
* A reference count telling how many hard links point to the inode.

The 'stat' system call retrieves a file's inode number and some of the information in the inode.

As users, we mostly deal with files by name. However, filenames just point to inodes, which are referred to by number. The inode number for a file can be shown in a variety of ways, including the '-i' switch to 'ls'. Inodes are not files themselves, but data structures (taking a guess here). They do take up blocks on disk however.

Inodes act as pointers to the actual data blocks that files take up on the disk. They can do this either directly or indirectly. There are a total of 15 references to data blocks stored in each inode (presumably, the number is kept low to keep things efficient and stop the inode getting too large). The first 12 references (under the ext2 implementation) hold the addresses of data blocks directly. Assuming a data block size of 4096 bytes, this would allow a file size of 12 x 4096 bytes = 48k. If all references were direct, the remaining three references would only allow a maximum file size of 60k, so ext2 uses a system of indirect referencing. The next (13th) reference points to a data block that holds block addresses rather than file data; at 4 bytes per address, a 4096-byte block holds 1024 references. The 14th and 15th references are double indirect and triple indirect respectively. This allows files of up to 4TB in size, assuming a 4k block size. However, as the maximum for a 32-bit integer is 4GB, this is the limit on ext2 on 32-bit systems.
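The addressing arithmetic can be checked with a few lines of shell. This is a sketch of the reference scheme only (12 direct references plus single, double and triple indirect blocks, with 4-byte block addresses); it ignores any other limits a real filesystem imposes, and max_file_blocks is a made-up name. It also assumes the shell does 64-bit arithmetic.

```shell
# Number of data blocks addressable through one inode for a given block size.
max_file_blocks() {
    bs=$1
    ptrs=$((bs / 4))    # how many 4-byte addresses fit in one block
    echo $((12 + ptrs + ptrs * ptrs + ptrs * ptrs * ptrs))
}

blocks=$(max_file_blocks 4096)
echo "$blocks blocks of 4096 bytes = $((blocks * 4096)) bytes"
```

With 4096-byte blocks this comes to 1074791436 blocks, or 4402345721856 bytes, which is where the figure of roughly 4TB comes from. Almost all of that is reached through the triple indirect reference.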

One inode per file. There can also be inodes without files.

BSD (vnodes):
* the permissions of the file
* the file link count
* old user and group ids
* the inode number
* the size of the file in bytes
* the last time the file was accessed (atime)
* the last time the file was modified (mtime)
* the last inode change time for the file (ctime)
* direct disk blocks
* indirect disk blocks
* status flags (chflags)
* blocks actually held
* file generation number
* the owner of the file
* the primary group of the owner of the file
Notice that the name of the file is not part of the inode's metadata. The filesystem doesn't care what the name of the file is; it only needs to know what inode number is associated with that file.
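A quick way to see that names and inodes are separate things is to give one file two names with a hard link. This is only a sketch: the file names are invented and everything happens in a throwaway temp directory.

```sh
#!/bin/sh
# Sketch: two names, one inode. File names here are made up for the demo.
tmpdir=$(mktemp -d)
echo "hello" > "$tmpdir/original"

# A hard link adds a second directory entry pointing at the same inode.
ln "$tmpdir/original" "$tmpdir/second-name"

# ls -i prints the inode number in front of the name.
inode_a=$(ls -i "$tmpdir/original" | awk '{print $1}')
inode_b=$(ls -i "$tmpdir/second-name" | awk '{print $1}')

# The second column of ls -l is the link count stored in the inode.
links=$(ls -l "$tmpdir/original" | awk '{print $2}')

echo "original:    inode $inode_a"
echo "second-name: inode $inode_b"
echo "link count:  $links"

rm -rf "$tmpdir"
```

Both names report the same inode number, and the link count in the inode goes up to 2. Neither name is stored in the inode itself.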

Wednesday, September 06, 2006

linux hardware reboot from software

#!/bin/sh

#This script does the same thing as hitting the "reboot" button
#on a machine.

echo 1 > /proc/sys/kernel/sysrq # turns it on
echo s > /proc/sysrq-trigger # sync
echo u > /proc/sysrq-trigger # umount ro
echo s > /proc/sysrq-trigger # sync
echo b > /proc/sysrq-trigger # immediate hardware reboot

the 'replace' command

Got a file that has xxx.xxx.xxx.xxx and want to replace it with yyy.yyy.yyy.yyy? The 'replace' utility (it ships with MySQL) does it:

shell> replace xxx.xxx.xxx.xxx yyy.yyy.yyy.yyy -- myfile.txt

The "--" tells the replace command where the string replacements end and the file name starts.

Or you can feed it the file on standard input, in which case the result goes to standard output and the file itself is left alone:

shell> replace xxx.xxx.xxx.xxx yyy.yyy.yyy.yyy < myfile.txt

Listing the pair in both directions will swap x.x.x.x and y.y.y.y in myfile1.txt:

shell> replace x.x.x.x y.y.y.y y.y.y.y x.x.x.x -- myfile1.txt

And this will swap x and y in both of the files named:

shell> replace x y y x -- myfile1.txt myfile2.txt
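On a box without 'replace', a plain one-way substitution can be sketched with sed instead. The addresses and file contents below are invented for the example, and this is a stand-in rather than the same tool: it does not do the two-way swap.

```sh
#!/bin/sh
# Sketch: one-way string substitution with sed instead of 'replace'.
# The addresses and file contents are made up for the example.
tmpfile=$(mktemp)
printf 'primary 10.1.2.3\nbackup 10.1.2.4\n' > "$tmpfile"

# sed patterns are regular expressions, so the dots are escaped to match
# literal dots; 'replace' treats its arguments as plain strings.
result=$(sed 's/10\.1\.2\.3/192.168.0.7/g' "$tmpfile")
echo "$result"

rm -f "$tmpfile"
```

Like the standard input form of 'replace', this writes the result to standard output and leaves the file alone.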

Fun with 'find'

find . -printf 'Name: %f Owner: %u %s bytes\n'

From: http://marcnapoli.com.au/osi.php

find files younger than 24h

find . -type f -mtime -1

find dirs younger than 60 minutes

find . -type d -mmin -60
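The age tests are easy to try out safely in a temp directory. A minimal sketch, with made-up file names, using 'touch -t' to backdate one file:

```sh
#!/bin/sh
# Sketch: -mmin and -mtime against one fresh file and one backdated file.
# File names are invented for the demo.
tmpdir=$(mktemp -d)
touch "$tmpdir/new.txt"                     # mtime = now
touch -t 200608230000 "$tmpdir/old.txt"     # mtime = 23 Aug 2006, 00:00

last_hour=$(find "$tmpdir" -type f -mmin -60)   # modified under 60 minutes ago
last_day=$(find "$tmpdir" -type f -mtime -1)    # modified under 24 hours ago
older=$(find "$tmpdir" -type f -mtime +0)       # at least one full day old

echo "last hour: $last_hour"
echo "last day:  $last_day"
echo "older:     $older"

rm -rf "$tmpdir"
```

find counts age in whole 24 hour periods and throws away the remainder, so '-mtime -1' means "less than one full day old" and '-mtime +0' means "one full day old or more".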

find all SUID and SGID

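find / -path '/proc' -prune -o -path '/www/db' -prune -o -type f -perm /6000 -fls /root/findallsetuidgid.txt

find / -xdev -type f -perm /u=s,g=s -print

(Newer GNU find has dropped the old '-perm +mode' form; '-perm /mode' means "any of these bits set". The second command originally had '-perm +u=s -o -perm +g=s -print', where the '-print' only applied to the SGID half because of how '-o' binds.)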
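To try the permission test without trawling the whole filesystem, here is a sketch against a temp directory with invented file names. It uses the more portable spelling, '-perm -4000 -o -perm -2000' inside brackets, which should behave the same as '-perm /6000' on GNU find.

```sh
#!/bin/sh
# Sketch: find SUID/SGID files in a throwaway directory.
# File names are made up; the files are empty and harmless.
tmpdir=$(mktemp -d)
touch "$tmpdir/plain" "$tmpdir/suid" "$tmpdir/sgid"
chmod 0755 "$tmpdir/plain"
chmod 4755 "$tmpdir/suid"    # set-user-ID bit
chmod 2755 "$tmpdir/sgid"    # set-group-ID bit

# The brackets make -type f and -print apply to both halves of the -o.
found=$(find "$tmpdir" -type f \( -perm -4000 -o -perm -2000 \) -print | sort)
echo "$found"

rm -rf "$tmpdir"
```

Without the escaped brackets, '-o' splits the expression in two and the action at the end only fires for the right-hand half.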

find all world writables on a system

find / -noleaf -path '/proc' -prune \
  -o -path '/sys' -prune \
  -o -path '/dev' -prune \
  -o -perm -2 ! -type l  ! -type s \
  ! \( -type d -perm -1000 \) -fls /root/findworldwritables.txt

## Searches / while skipping a few dirs. -noleaf is there because not
all mounted filesystems are assumed to be unix filesystems. Symlinks,
sockets, and any directory with the sticky bit set are skipped.

find all files owned by no one in particular

find / -path '/proc' -prune -o \( -nouser -o -nogroup \) -fls /root/findownedbynoone.txt

useful stuff about tar

tar will generally overwrite already existing files when untarring. There is the '-k' switch which tells it to keep old files. It will not erase files that are not in the tarball being extracted though. From the '--help':

  -k, --keep-old-files       don't replace existing files when extracting
      --keep-newer-files     don't replace existing files that are newer
                             than their archive copies
      --overwrite            overwrite existing files when extracting
      --no-overwrite-dir     preserve metadata of existing directories
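The difference between the default and '-k' is easy to see in a scratch directory. A sketch, with invented file names:

```sh
#!/bin/sh
# Sketch: extract over an existing file with and without -k.
# All paths are made up and live in a throwaway temp directory.
work=$(mktemp -d)
mkdir "$work/src" "$work/dest"
echo "from archive" > "$work/src/notes.txt"
tar -cf "$work/a.tar" -C "$work/src" notes.txt

# A file of the same name already exists where we extract.
echo "local edit" > "$work/dest/notes.txt"

# With -k the existing file is left alone. Some tar versions also report
# the clash as an error, hence the '|| true'.
tar -xkf "$work/a.tar" -C "$work/dest" 2>/dev/null || true
kept=$(cat "$work/dest/notes.txt")

# Without -k the archive copy overwrites it.
tar -xf "$work/a.tar" -C "$work/dest"
replaced=$(cat "$work/dest/notes.txt")

echo "with -k:    $kept"
echo "without -k: $replaced"

rm -rf "$work"
```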

When you extract a tarball as root, it will extract the files with the same UID as they were created with. I found this out in an amusing way when attempting to extract a tarball I'd downloaded, and it would fail with the message 'quota exceeded'. This was because the UID of the files in the tarball was the same as that of one of the users on the box, so the extracted files counted against that user's quota. '--no-same-owner' will extract them as your current ID, which is the default for non-root.

Tar across the network, using ssh:

tar -czf - ~/mp3s | ssh 192.168.1.163 tar -xvzf -
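The same pipe can be dry-run locally before putting ssh in the middle. This sketch uses invented directory names and adds '-C', which tells each tar which directory to work from, so the files land where you expect on the receiving side.

```sh
#!/bin/sh
# Sketch: tar-to-tar pipe, local version. Directory and file names are
# made up; with ssh the right-hand side would become something like
#   ssh host tar -xzf - -C /some/dir    (host and /some/dir hypothetical)
work=$(mktemp -d)
mkdir -p "$work/src/mp3s" "$work/dest"
echo "not really music" > "$work/src/mp3s/track01.txt"

# Left side: archive 'mp3s' relative to $work/src and write to stdout.
# Right side: read the archive from stdin and unpack under $work/dest.
tar -czf - -C "$work/src" mp3s | tar -xzf - -C "$work/dest"

copied=$(cat "$work/dest/mp3s/track01.txt")
echo "$copied"

rm -rf "$work"
```

Archiving relative to a directory with '-C' also keeps absolute paths out of the tarball, so nothing depends on the home directory being the same on both machines.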

Sunday, August 27, 2006

kill -9 -1

Kill all processes you can kill:

kill -9 -1

This can really come in handy if you have run a script or something that keeps forking and starts chewing up loads of memory.

Thursday, August 24, 2006

SANs and NASs

SAN and NAS

Storage Area Networks and Network Attached Storage - what is the difference?

There are several differences (I'm basing this on the more typical setups; there are exceptions).

SAN

Example physical setup: a chassis containing a number of disks. Each disk has an FC-AL interface connector, which hotplugs into an FC-AL backplane, which is connected to one or more FC-AL RAID controllers (e.g., the dothill/artecon ones I have used have 2 controllers for redundancy). These are connected via fibre channel cabling to an FC switch, which has connections to FC HBAs in the servers. You don't need the RAID controllers in the disk chassis, as you could have JBODs (I think the dothill ones allow a JBOD mode anyway). It seems a bit unusual to have an HBA connect to a RAID controller, at least compared to usual SCSI practice, which would have the RAID controller in the server. The idea is that you define the RAID level on the chassis, then define LUNs on the array, which are presented as disks to the server (going from memory here; it is 5 years since I set them up and administered them!). Sharing the array between multiple clients, so that each client can only access its own area, is normally done with zoning on the switch and LUN masking on the array.

In a SAN environment, servers see the SAN storage as local disks; that is, access requests for data on a SAN disk are made at the block level, like any other local disk. This is via SCSI commands, as the attached disks are seen as SCSI devices: Fibre Channel carries the SCSI command set (as FCP), which is why the FC HBA drivers in the Linux kernel source tree sit in the SCSI low-level drivers section.

NAS

Network Attached Storage

In a NAS environment (which could use a SAN as its basis), the storage is another host, usually with a full host OS, although the Network Appliance boxes that I have run have a special version of BSD on them, with a restricted command set. The NAS host shares the data at the file level over NFS, CIFS etc.

DAS

Direct Attached Storage

Host has storage directly attached

Wednesday, August 23, 2006

tech stuff

I've decided to create a tech blog, mainly as a repository of stuff I have found that interests me personally and that I can refer back to if I ever need to. If anyone else can benefit from it, so much the better :)