experiences with computers and software (under construction)

Because we manage most of our computers ourselves, we have some experience with PCs running Linux and with more powerful computers (SUN, HP, SGI, IBM) with gigabytes of RAM and multiple processors running various Unix variants. I used them for heavy calculations (exact diagonalization of BIG matrices), for which one needs enough CPU power and shared memory. Maybe you will find some interesting hints here.

Memory, Out-of-Memory

Excessive memory consumption and memory leaks seem to me to be the most important problem in software (year 2013). Systems become unstable. I have observed that on HPC clusters, PCs, e-book readers and even on small devices like smart phones, which is very annoying. In overcommit mode (allocating more memory than is physically available is allowed) the system often crashes or hangs, because it may run into an out-of-memory condition and kill the wrong threads (important system threads, drivers etc.). Without overcommitting, applications are often not able to handle a failed malloc. Both cases are very unfortunate. I see only one solution: developers have to test their applications under strong memory limitations (using ulimit) and design them so that they can live in an environment where malloc may fail. I think a guarantee of available memory is more important than a guarantee of CPU time (real time). Good applications allocate only the minimum of required memory and give memory back when it is not needed anymore. Have a look at your system using top -mM -d 60, look also at the VIRT (virtual memory consumption) column, and see for yourself how much memory is wasted. Here is a list of randomly chosen problem applications (a small script for finding the minimal memory limit of a program follows after these examples):

nscd 2.12 (2010)                     rss=2MB  vm=200MB-1200MB BAD  (physmem=2GB) 
thunderbird 3.1.18 (2012) + enigmail rss=58MB vm=400MB  very BAD # 400MB minimum reading mail only! CentOS-5
mutt   1.4.2 (2006)                  rss=2MB  vm=60MB        BAD # (ulimit -v 60000;mutt)
alpine 2.03                          rss=3MB  vm=70MB-136MB  BAD # (ulimit -v 70000;alpine) read ok, OOM(fork) on sent, see /proc/PID/smaps 
(ulimit -v 4990;cat;echo $?)                     # 4990kB CentOS-5   libc-2.5=1.7MB
(ulimit -v 2330;cat;echo $?)                     # 2330kB TinyCore-4 libc-2.13=1MB
Testtool:
(ulimit -v 2330;cat;echo $?)   # max virtual memory size 
(ulimit -d    1;cat;echo $?)   # max data segment size - has no effect!?
# -m has also no effect ... how to configure a memory safe system?

# Minimal test program a.c: main(){return 0;}
gcc -static a.c;(ulimit -v      780;./a.out;echo $?) #  780kB size=550k  brk ... gcc4.1.2
gcc         a.c;(ulimit -v     3800;./a.out;echo $?) # 3800kB size=2k libc-2.5=1.7MB smaps=3MB(libc)
diet gcc    a.c;(ulimit -v       92;./a.out;echo $?) #   92kB size=.4k dietlibc
# java is crazy in memory consumption (it looks like it eats a big portion of available memory)
javac a.java;(ulimit -v   980000;java a;echo $?)                 # 980MB size=.3k java1.6.0 (50%? of 2GB-PC)
javac a.java;(ulimit -v   290000;java -Xmx2m HelloWorld;echo $?) # 290MB size=.3k java1.6.0 (2GB-PC)
javac a.java;(ulimit -v   140000;java -Xmx2m -XX:MaxPermSize=1m ... ) # 140MB   (2GB-PC)
javac a.java;(ulimit -v  1500000;java -Xmx2m -XX:MaxPermSize=1m ... ) # 1.5GB (288GB-PC)
javac a.java;(ulimit -v 37000000;java a;echo $?)                 #  37GB size=.3k java1.7.0 (20%? on 760GB-PC) 
javac a.java;(ulimit -v  4000000;java -Xmx2m HelloWorld;echo $?) #   4GB size=.3k java1.7.0 (760GB-PC)
(ulimit -v 3900000;java -Xmx2m -XX:MaxPermSize=1m ...) # 3.9GB (760GB-PC) Java is bad!

gcc -fopenmp hpc.c;(ulimit -v  22000;a.out)              #   10MB/thread gcc4.1.2+openmp2.5
mpicc        hpc.c;(ulimit -v 155000;mpirun -n 2 a.out)  #  155MB/task   gcc4.1.2+openmpi2.1
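
To find out how small the memory limit for a given program can be, one can bisect the ulimit -v value instead of guessing. Here is a minimal sketch; the command ./a.out and the search range are assumptions, adapt them to your case:

#!/bin/sh
# bisect the minimal "ulimit -v" (in kB) under which CMD still exits with 0
# assumption: CMD is quick and deterministic
CMD=./a.out
lo=100; hi=1000000                 # search range in kB
while [ $((hi-lo)) -gt 10 ]; do
  mid=$(( (lo+hi)/2 ))
  if ( ulimit -v $mid; $CMD ) >/dev/null 2>&1; then
    hi=$mid                        # still works, try a smaller limit
  else
    lo=$mid                        # failed, needs more memory
  fi
done
echo "minimal ulimit -v is about $hi kB"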

OOM example: Moodle (2013)

On Monday, 2013-03-11, there were problems with the server moodle.ovgu.de from 10:00 until about 11:18. Analysis of the problem showed that concurrent use by only 81 users at 09:57, with about 11000 GET requests per minute, drove the moodle server, which has 16GB of main memory, into an out-of-memory condition. The cause was an unsuitable default configuration of Apache, which together with the poorly optimized PHP code of moodle allowed the excessive memory consumption. The configuration (httpd.conf) was therefore changed so that the number of Apache processes, each consuming 60-600MB of memory, is strongly limited (from 256 to 8). With this change the error should not occur anymore. (Please direct questions to URZ-S, tel. 58408.)
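
For illustration, such a limit can be checked and set roughly like this. This is only a sketch; the file path and directive names assume the prefork MPM of an Apache 2.x installation, and the values follow the numbers above:

# show the current process limits (path is an assumption)
grep -E 'ServerLimit|MaxClients' /etc/httpd/conf/httpd.conf
# limiting the worst case to about 8 x 600MB = 4.8GB instead of 256 x 600MB
# means setting something like:
#   ServerLimit 8
#   MaxClients  8
# and reloading the configuration afterwards:
apachectl graceful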

Obsolescence in software (2014-09)

What was that marketing promise again? Java is great, platform independent and secure! But not time independent! Bad luck if you depend on Oracle Java:

 /usr/java/jre1.7.0_55/bin/javaws $JOPT https://...
 Disabling Java as it is too old and likely to be out of date. To reenable use jcontrol utility.
 
Of course jcontrol did not help, but with -verbose the error no longer appeared. Just imagine if this became fashionable in every piece of software.

Why swap? (2013-04)

One might think that no swap is needed with today's main memory sizes. Unfortunately this assumption is only partly correct. Programs often allocate more memory than they need, and especially when starting subprocesses (forking) the virtual memory consumption briefly doubles. If the kernel is then set to the sensible overcommit_memory=2 (no overcommitment allowed), the fork fails due to missing virtual memory. Of course the problem can also be fixed with better programs, but not everyone has the time to dig into the offending program. Here, swap that is never actually used (and may therefore be arbitrarily slow) provides a workaround. The disadvantage is, of course, when that memory really is accessed, which must not happen. The advantage is that with overcommitment forbidden, no unwanted system states with memory shortage for system processes can arise.
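
A minimal sketch of this setup (the swap file path and size are assumptions):

# forbid overcommitment; allow swap plus 90% of RAM to be committed
echo 2  > /proc/sys/vm/overcommit_memory
echo 90 > /proc/sys/vm/overcommit_ratio
# add slow swap that only backs the short doubling of virtual memory at fork time
dd if=/dev/zero of=/var/swapfile bs=1M count=4096    # 4GB, size is an assumption
chmod 600 /var/swapfile
mkswap /var/swapfile
swapon /var/swapfile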

Linux

SuSE 9.1

SuSE 8.0

We are now running SuSE 8.0 on three of our PCs. The only problem after installation was that sendmail was not working (connection refused on port 25). I changed the arguments in /etc/rc.d/sendmail to "-bd" and now it runs without problems. I also recommend using ext2 or ext3 instead of ReiserFS. ReiserFS is probably not stable and can cause problems after power failures (see crashes).

Red Hat 7.3 (first contact to Red Hat)

My first impression is that SuSE is more comfortable to install. I missed the description of each software package. Also, tuning X to more than 85Hz was not possible with Xconfigurator, so I had to edit the modelines in /etc/X11/XF86Config-4 by hand.

Mozilla 1.1

Since August 2002 I have been using Mozilla 1.x instead of Netscape. I can only recommend it.

Qemu-0.6.1 - the PC emulation (Jan05)

Very impressive! Faster than bochs. It handles vmware disks (except 2GB-split files and SCSI images). WinXP can be installed, but it does not boot after installation. I checked the partition entry and got some strange CHS values (64 heads, 63 sectors), which probably confuse the BIOS. I tried different -hdachs options; the behavior of the boot process changes, but there is no way to get it running. I had the same problem with WinNT4. If you know what is going wrong, tell me. The Knoppix console locks up irregularly for unknown reasons.
I used my spinpack package for speed tests and found a factor of 6 to 10 for numerical applications (like a 300MHz machine on a 2600MHz PC).

Qemu-0.7 - the PC emulation (Jul05)

WinXP-Prof-DE runs. After installation it does not work in normal mode and shows error messages that the license cannot be verified. Use safe mode ("Abgesicherter Modus", press F8 very early) and install SP2 from a CD image (also with some errors, but it works). After that XP runs slowly but normally. Use images of the CDs instead of /dev/cdrom, which is really terribly slow (5 hours instead of one for the installation).
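
A typical invocation for this workflow might look like the sketch below. The image names and sizes are assumptions, and the options are given as I remember them from qemu of that era, so check them against your version:

qemu-img create winxp.img 6G                               # disk image, size is an assumption
qemu -m 512 -hda winxp.img -cdrom winxp_cd.iso -boot d     # first boot: install from the CD image
qemu -m 512 -hda winxp.img -cdrom sp2.iso -boot c          # later: boot from disk, SP2 image attached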

server administration

I have administered the following servers (excerpt):

Unfortunately the list is incomplete, but it gives you an overview of what kinds of computers I have administered.

HP 9000 Model 819/K210, system: HP-UX B.10.01

We bought this system in 1995 with one HP-PA7200-120 processor and 640MB RAM. The system ran very stably and the machine was easy to take apart. If I remember right, we had to build a new kernel to use more than 128MB RAM, but that was no problem. Unfortunately we had no cooled room for the machine, and in the summer we sometimes had to shut it off in order not to risk any damage.

Disadvantages were the noisy fan, and later we recognized that the three 2GB disks (SCSI-LVD) ran very hot, around the maximum of the disk specification. Probably that was the reason why the first disk left us in 1998; the second followed in 1999. We tried to put a new SCSI disk (non-HP) into the machine, but it was not detected. So we asked for original HP disks and were supposed to pay 6000 DM for 9GB, unbelievable! We could find no other sources, and replacing all disks by other SCSI (SE or LVD) disks or by a SCSI-IDE adapter failed too, so we decided to give the machine away in 2001. A second machine bought at the same time by colleagues had the same problems. Another problem was the external MOD drive. After six months of running it produced only errors. We suspected dust problems and asked for exchange or cleaning. After further months I got a call from the Netherlands telling us that we were using wrongly formatted disks (with 1024-byte sectors instead of 512-byte sectors), which was supposed to cause an error in the drive after writing a number of disks. I could not believe this: first, there is no logic behind it; second, we got these disks together with the other equipment; and third, another MOD drive (probably not so dusty) of the same type was running well on the same machine. That's service! If there had been no warranty, I would have cleaned it myself. In the end we did not use the MOD drive anymore, and I would never again buy state-of-the-art high-tech drives at high prices if they are not standard technology. A normal CD writer would be the better choice.

IBM model 580

Bought in the '90s (?), 640MB RAM, 66MHz processors, running AIX 4.3. The support was not the best. When we bought a new disk for this machine, we did not get the right screws and adapters, but the disk worked well for years lying on the dusty bottom of the case (good disks). Updating to a new AIX version was always an adventure. "Never change a running ...". In 1997 we got a defective graphics card after a power failure. The cheapest replacement was a Gt4xi graphics card at the price of a new PC. In 2001 we threw the machine away.

Alpha-PCs

Bought in 1998: two identical machines with 1.5GB memory, 164UX boards (67MHz), DEC Alpha 21164A CPUs at 533MHz, 9GB SCSI disks, running Linux. Very nice machines and the cheapest available for this configuration. The only problem was that the IDE adapter is really slow, I do not know why. An additional PCI IDE card (noname CMD-PCI646U2) could only be used in slow modes because of missing drivers (?). But using IDE disks via SCSI-IDE adapters on the SCSI bus was no problem. The inside of the tower gets very warm, but now the machines are in an air-conditioned room and no problems are expected. With older Linux 2.2.x kernels there seemed to be a problem with applications using more than 1GB RAM, but with Linux 2.4.x the machines are 100% stable. After we moved to another building we had problems with auto-sensing of 10Mb/100Mb duplex/half-duplex network; only the 10Mb/half-duplex mode worked well. Luckily we do not need a high-speed network at the moment. Soft-RAID0 was able to increase the 19MB/s disk speed to 28MB/s (two disks). All in all, it was a good deal.

SGI power challenge (not administered by our work group)

Bought in 1999: 8 MIPS R10000 processors at 250MHz, 8GB shared memory, running IRIX64 v6.4, very fast and stable. If I remember right, we had to change one defective board during the last two years (within the warranty period). Using MP pragmas for parallel processing worked badly for complex programs. Sometimes the program was 3 times slower on 8 processors than on one processor; I could not find out why. With pthreads I got a speedup of factor 8 with the same algorithm. So I do not trust the very expensive parallel C compilers and use the more primitive standard libraries for multiprocessing.

Tru64-V5.1 on Alpha GS160, GS1280 and ES45 (Dec04)

I administer three Alpha systems, two of them big machines (one has 128GB memory and 32 EV7 processors, the other 24GB and 16 CPUs) and very fast! Unfortunately the system is not very stable; there are two to four crashes a year. The hardware support is OK, but the software support is bad (talk + audio recording from 2005). You get updates regularly, but don't try to ask HP questions about misbehavior of the Tru64 system. The hotline does not think of forwarding your question/report to the programmers; they only ask for money for tuning support (@HP: I don't want to buy tuning support, I want the bugs I found in your system fixed without paying additional money for it!). Probably they cannot reproduce our problems on their test machines and the effort of analyzing a problem on a 128GB machine is high, but simply playing the ball back is not the right way to treat your customers, is it? So I don't ask anymore and try to solve the problems by myself. That is not easy without tracing like the Linux strace program and without kernel sources. Here is a list of things which cause problems on Tru64-V5.1; maybe it is useful for you to know:

Sun Ultra 1 SBus, UltraSPARC 143MHz, 512MB RAM

We bought two machines in 1997 (one with only 64MB RAM). They are quite OK: easy to open and to look inside. After one year we changed from Solaris to Linux, because it is easier to manage if you already have a lot of experience with Linux but little with Solaris. One bad point is the SunTurboGX graphics card; it could only be used at 1152x900 with 72kHz/76Hz. With a 20-inch high-end SUN monitor you easily get a headache at less than 80Hz; it is a really badly combined set of hardware. Another point is the CPU fan: two of them have died within the last two years. We also had one disk crash, and you need special SUN disks (expensive). The machines are now only used as number crunchers and for guests. A funny thing is that you get a lot of software you never need with every machine, but only one boot disk for about 16 machines. Since September 2002 we have been testing SuSE 7.3 for SPARC, without problems so far.

Pentium dual board 440LX

Bought in 1998 with two 300MHz PII CPUs, 512MB RAM and SCSI. Big mistake! Every PC magazine claimed at that time that one needs SCSI for burning CDs, but this was probably only true for WinXX; SCSI was not necessary for the CD writer to avoid buffer underruns. After using an IDE CD writer under Linux on other, older PCs without any trouble, I prefer the cheaper IDE versions today. This was not the real problem though; the CPUs chosen were bad. The CPUs get so hot that the board beeps when both CPUs are 100% loaded. After a few weeks with lots of crashes the contact to the heat sinks got lost (a heat sink was bent by the heat). With the new heat sinks we got, the problem was not completely solved. The seller could not really solve the problem and switched the BIOS heat warnings off, with moderate success. Nowadays I know that this CPU version was the hottest one; the voltage was increased to 5V to get the CPUs running at 300MHz. Surprisingly the CPUs are not burned out after three years of use. On hot days with both CPUs loaded the board still does its quiet beeping. After such experiences we went back to Celerons with moderate clock rates and 128MB RAM, which are silent, still waiting for PCs with stable boards and more than 1GB RAM to become broadly available to build small PC clusters as a better solution. Not buying the newest product seems a good tactic nowadays. If you need more performance, tune your code!

Linux crashes

In January 2002 we had trouble with two machines running SuSE 6.4 with reiserfs-2.x on the root partition. Both machines showed inconsistencies in the ReiserFS filesystem (all actions took a lot of time) after about 19 months. A new installation was necessary.

Remark: If any entry in /etc/hosts appears twice, sendmail fails to start. It took us more than two hours to find and fix this problem.
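
As a sketch, such duplicates in /etc/hosts can be spotted like this (it simply lists every address or name field that occurs more than once):

# print every address or host name that appears more than once in /etc/hosts
awk '!/^#/ && NF {for(i=1;i<=NF;i++) print $i}' /etc/hosts | sort | uniq -d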

On September 23rd, 2002 we had a power failure. After that, a PC with SuSE 7.3 and reiserfs-3.x.0k-pre9 installed showed non-reproducible errors during numerical calculations (wrong results, unexpected aborts). We ran a ReiserFS check, but it claimed that everything was OK. After a reboot and further tests the compiler gcc also gave non-reproducible "internal compiler errors", which appeared more and more frequently, finally leading to a kernel panic. We installed SuSE 8.0 with ext2 and again strange things happened. So we opened the PC and made a visual check of the hardware. Only the CPU fan (2 years old) was not in its best state; in some cases it did not start to rotate again after being stopped by hand. Had the fan not restarted after the power failure? This would cause too high temperatures and would explain the strange behaviour. Indeed, the PC has been running without problems since we checked the CPU fan, so we are now sure to have located the problem.

Speed tests

Network speed tests

Have you ever tried to measure the speed of your network? A simple command to do that is:

  time dd if=/dev/zero bs=1024k count=1000 | rsh remotehost "cat >/dev/null"
The result is 89s for a 100Mbps ethernet card (1000MB/89s = 11.2MB/s = 90Mbps). Pretty accurate! For a 1000Mbps card the test failed because rsh took 100% of the CPU time. In a second test I started the above command 6 times in parallel and got 55MB/s = 440Mbps on an 8-CPU machine, which shows that a better speed test is needed here. Do you have a simple one?
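One variant with less CPU overhead than rsh is netcat (nc); a sketch, assuming nc is installed on both hosts, TCP port 5001 is free, and your nc version accepts the -l -p syntax:

# on the remote host: discard everything that arrives on TCP port 5001
nc -l -p 5001 >/dev/null
# on the local host: send 1000MB through the network and take the time
time dd if=/dev/zero bs=1024k count=1000 | nc remotehost 5001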

Security

We have had only one compromised system: an old 486 PC used as a printer and floppy disk server (for the SUNs without floppy drives). This PC was used to upload and download files. We noticed it because the PC crashed after the 500MB disk was full. Other problems were connected with sendmail and relaying. It was misused three times to send spam all over the world. We noticed it because the machines could not do anything else and there was a lot of disk activity. Sorry to all victims of the spam. Now we have configured all our machines not to relay email. Since our machines are configured more securely and we use SSH logins, we rarely notice port scans and other attacks.

If you use Windows on your client PC and want to log in to a Unix box, you can use Exceed + ssh (commercial), the cygwin package or Xming to work with graphical applications.
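
With the cygwin or Xming variants, the X11 forwarding of ssh does the rest; a minimal sketch (host and user names are placeholders):

# a local X server (Xming or cygwin X) must already be running
ssh -X user@unixbox.example.org
xterm &     # graphical programs started in this session open on the local X server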

Using pine with an IMAP server via SSL:

 pine -f {IMAP-Server/imap/ssl/user=userid}inbox   # OR
 .pinerc  inbox-path=\
    {sunny.urz.uni-magdeburg.de/imap/ssl/user="username"}inbox
   # inbox-path={imap.web.de/novalidate-cert/user="username"}inbox
 instead of novalidate-cert do:
   - download OvGUMssl.pem to /etc/ssl/certs
   - openssl x509 -noout -fingerprint -in OvGUMssl.pem  # better use SHA1
     # MD5 Fingerprint=72:A0:34:4C:64:18:57:6A:80:9A:89:72:48:92:7F:83
   - openssl x509 -noout -hash -in OvGUMssl.pem # 6cc6a28b
   - ln -s OvGUMssl.pem $(openssl x509 -noout -hash -in OvGUMssl.pem).0
   - openssl verify -CApath /etc/ssl/certs OvGUMssl.pem
   - same with dfn-cert.pem
   - ToDo: check CRL = certification revocation list
 # check the connection: netstat -atn # to hostip:993 ESTABLISHED (imap via SSL)

out-of-memory/out-of-swap (May06)

This happens sometimes on our compute servers, mostly when users do not estimate the memory needs of their programs. Most operating systems slow down but keep working. Tru64-5.1B does the worst thing, killing arbitrary processes (including old daemons running as root), which often amounts to a crash. IRIX64-6.5 killed the user process in all test situations and everything else continued to run fine. ulimit -v can help, but it is not practicable for MPI processes with asymmetric memory consumption on shared memory machines.

Ethernet cluster (Oct2008)

Discovered a firmware bug in the IPMI software of the BMC of a DELL PowerEdge 1950 server which caused Linux system crashes under high load. See the end of that web page for more information.

SiCortex SC5832 + 256GB 32Core SMP (Jan2009)

SC5832, 256GB SMP,

problem: OOM killer kills init, sshd etc. (Mar09)

This happens when a long-running process (days or weeks) eats all the memory. The OOM killer does not kill this process, because long-running processes are considered important. As a workaround, create /etc/skel/.ssh/rc with "ps -o pid --no-heading | xargs renice 10 >/dev/null 2>&1" and copy that to existing user homes. You could also use /etc/ssh/sshrc ("xauth add ..." must be added to the sshrc files, because xauth is not called by sshd if an rc file exists, and X11 tunneling would fail otherwise). The nice value makes killing of system processes less likely. Also set /proc/sys/vm/overcommit_memory to 2 and /proc/sys/vm/overcommit_ratio to 90 or higher. Also disable swap; it makes no sense for HPC, it only creates a long slowdown before the OOM happens. Programs which allocate all of the memory and more are bad.
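
As a sketch, the workaround described above could be set up like this (the nice value and the sysctl values follow the text; when X11 forwarding is used, the xauth handling from the sshd(8) man page must be added to the sshrc file as well):

# renice a user's processes at every ssh login so that the OOM killer
# prefers user processes over system daemons
cat > /etc/ssh/sshrc <<'EOF'
ps -o pid --no-heading | xargs renice 10 >/dev/null 2>&1
EOF
# forbid overcommitment and disable swap
echo 2  > /proc/sys/vm/overcommit_memory
echo 90 > /proc/sys/vm/overcommit_ratio
swapoff -a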

problem: automatically powering down an SGI Altix330 server (Apr09)

The problem is to power off the Altix 330 server after shutdown in case of high room temperatures or power failures (to save UPS power for other machines). shutdown -p -h does not work; the machine stays on and still consumes power. The only way is to connect to the service processor on the service network via telnet and power it off. This can be done automatically by:

(echo "pwr down";sleep 9;echo -e "\x1dquit";sleep 1) | telnet 10.0.0.1

The same technique can be used for the Alpha servers above.

ssh - attacker dt_ssh5 18.01.2010

In January 2010 one of our machines performed ssh attacks on other servers around the world. It was the 141.44.40.29 Linux machine. netstat -atn | wc -l showed about 2500 ssh connections. The ps auxw output looked like this (user name changed):

matze   18773  0.5  0.0   1736   308 ?        S    Jan17  10:42 ./dt_ssh5 200 2 17.79.182.153 2
root    18492  0.1  0.1   8252  2396 ?        Ss   11:10   0:00 sshd: root@pts/0
root    18890  0.0  0.0   4120  1844 pts/0    Ss   11:11   0:00 -bash
matze   25993  0.0  0.0   1736   468 ?        S    11:14   0:00 ./dt_ssh5 200 2 17.79.182.153 2
matze   26023  0.0  0.0   1736   468 ?        S    11:14   0:00 ./dt_ssh5 200 2 17.79.182.153 2
matze   26024  0.0  0.0   1736   468 ?        S    11:14   0:00 ./dt_ssh5 200 2 17.79.182.153 2
The user had chosen a password that was too simple (183000 Google hits for it). The binary was lying in the /tmp path and showed these properties:
ls -l
-rwxr-xr-x   1    1001 users 1379632 2010-01-17 05:05 dt_ssh5
file tmp/dt_ssh5
tmp/dt_ssh5: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.6.4,\
 bad note description size 0x83e58955, bad note name size 0xe8000001, bad note name size 0xc2815a00,\
 bad note description size 0xc7397500, bad note name size 0x89589c00, bad note name size 0xc831589c,\
 statically linked, stripped
md5sum tmp/dt_ssh5
 f0b5fc67c41d567c1f306e88363f139a  tmp/dt_ssh5 
strings -9 dt_ssh5 showed strings belonging to ssh and openssl libraries. Two successful logins in /var/log/messages (name changed):
Dec 21 21:55:41 fermion sshd[23143]: Accepted keyboard-interactive/pam for matze from 58.247.222.163 port 40039 ssh2
Jan 17 05:05:10 fermion sshd[18758]: Accepted keyboard-interactive/pam for matze from 217.79.182.153 port 45300 ssh2
Jan 17 05:05:10 fermion sshd[18761]: subsystem request for sftp
Jan 17 05:05:10 fermion sshd[18761]: channel 0: rcvd big packet 131030, maxpack 32768
Jan 17 05:05:10 fermion sshd[18761]: channel 0: rcvd big packet 112867, maxpack 32768
Jan 17 05:05:10 fermion sshd[18761]: channel 0: rcvd big packet 112838, maxpack 32768
Jan 17 05:05:10 fermion sshd[18761]: channel 0: rcvd big packet 112809, maxpack 32768