Friday, November 25, 2011

broken ps

ps, w and a bunch of other things were not working (they would just hang uninterruptibly). I checked in /proc: 'ls' worked fine, so I thought I'd cat the cmdline of each process. I did this in a 'for' loop and found that it stopped at one particular PID. As I was able to read that process's /proc directory, I could see that the perms were ok, and that the process owner was someone other than the logged-in user or root. I could not find anything in /sys (not that I would really know what to look for, since that is mostly device stuff). I ran tcpdump in case it had a keylogger sending stuff back (netstat would also hang).

cd /proc

ls | grep -E '^[0-9]+$' > /tmp/cm.txt

for pid in $(cat /tmp/cm.txt); do echo "checking PID $pid..."; tr '\0' ' ' < "/proc/${pid}/cmdline"; echo; done
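The loop above blocks on the first PID whose cmdline cannot be read, and the shell blocks with it. A variant that keeps the shell usable is to do each read in the background and report the ones that have not returned after a short delay. This is only a sketch: find_hung_pids and its two parameters (the proc root and the delay in seconds) are names made up for illustration, and any reader that really is stuck will stay behind as a stuck process of its own.

```shell
#!/bin/sh
# Sketch: list PIDs whose cmdline read is still blocked after a delay.
# find_hung_pids, proc_root and delay are illustrative names, not from the post.
find_hung_pids() {
    proc_root=${1:-/proc}
    delay=${2:-2}
    markers=$(mktemp -d) || return 1
    targets=""
    for dir in "$proc_root"/[0-9]*; do
        [ -e "$dir/cmdline" ] || continue
        target=${dir##*/}
        targets="$targets $target"
        # Read in the background; drop a marker file only once the read returns.
        # All output goes to /dev/null so a stuck reader cannot hold a pipe open.
        ( cat "$dir/cmdline"; : > "$markers/$target" ) >/dev/null 2>&1 &
    done
    sleep "$delay"
    # Any PID without a marker file has a reader that never came back.
    for target in $targets; do
        [ -e "$markers/$target" ] || echo "$target"
    done
    rm -rf "$markers"
}

# Usage: find_hung_pids /proc 2
```

A process that exits between the listing and the read is not reported, because the marker is written whether or not cat succeeds.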

I found that I could not kill the process either. I tried all 64 signals with a for loop.
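A loop along these lines would send every signal in turn. It is a sketch, not the exact loop used here, and try_all_signals is a made-up name. Worth knowing: a process stuck in uninterruptible sleep (state D) does not act on any signal, SIGKILL included, until the kernel call it is blocked in returns, which would fit a process that survives all 64.

```shell
#!/bin/sh
# Sketch: send every signal number from 1 to 64 to one PID.
# try_all_signals is an illustrative name, not the loop from the post.
try_all_signals() {
    [ -n "$1" ] || return 2
    target=$1
    sig=1
    while [ "$sig" -le 64 ]; do
        # Errors (invalid number, no such process) are ignored on purpose.
        kill -"$sig" "$target" 2>/dev/null
        sig=$((sig + 1))
    done
    # Signal 0 sends nothing; it only checks that the PID still exists
    # (a zombie still counts as existing).
    if kill -0 "$target" 2>/dev/null; then
        echo "PID $target still exists"
        return 1
    fi
    echo "PID $target is gone"
}

# Usage: try_all_signals 12345   (12345 is a placeholder PID)
```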

There was nothing in /var/run with a matching PID.

The machine logs loads of 'BUG: Bad page state in process' errors for a variety of processes. These don't seem to result in crashes.
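To see which processes those errors name most often, the kernel log can be tallied per process. A rough sketch follows, assuming the process name comes right after the quoted message text and contains no spaces; count_bad_page_state is a made-up name.

```shell
#!/bin/sh
# Sketch: tally 'Bad page state' reports per process name from log text on stdin.
# count_bad_page_state is an illustrative name; the pattern is the message quoted above.
count_bad_page_state() {
    # Keep only the word following the message, then count duplicates,
    # most frequent first.
    sed -n 's/.*BUG: Bad page state in process \([^ ]*\).*/\1/p' |
        sort | uniq -c | sort -rn
}

# Usage: dmesg | count_bad_page_state
```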

Perhaps there is a region of memory that the process has written to that is unreadable?