Most recent entry: ALOM SSH support
Sun has an interesting new offer: They say that we can apply for a sixty-day trial of one of their T1-powered Niagara systems.
That sounds great! We will see if we can get our hands on one of those.
I just received an email from Sun: They welcome us to their Sun Fire T2000 Try and Buy Program. Now let's see how long it will take to receive hardware...
The machine has arrived. It is a Sun Fire T2000 with an eight-core 1GHz T1 processor and 8GB RAM. Each core can process four hardware threads.
We upgraded the SC (122430-01) and the SCSI-RAID-Controller (122165-01). The SC version is now:

    niagara-sc> showhost
    Host flash versions:
       Reset V1.0.0
       Hypervisor 1.1.0 2005/12/15 11:10
       OBP 4.20.0 2005/12/15 16:48
       MPT SAS FCode Version 1.00.37 (2005.06.13)
       Sun Fire[TM] T2000 POST 4.20.0 2005/12/15 17:19
    niagara-sc> showsc version -v
    Advanced Lights Out Manager CMT v1.1.2
    SC Firmware version: CMT 1.1.1
    SC Bootmon version: CMT 1.1.1
    VBSC 1.1.1
    VBSC firmware built Jan 20 2006, 17:56:19
    SC Bootmon Build Release: 01
    SC bootmon checksum: 1A6E3FF4
    SC Bootmon built Jan 20 2006, 18:08:25
    SC Build Release: 01
    SC firmware checksum: 0856BF03
    SC firmware built Jan 20 2006, 18:08:41
    SC firmware flashupdate UNKNOWN
    SC System Memory Size: 32 MB
    SC NVRAM Version = f
    SC hardware type: 4
    FPGA Version: 4.1.10.7

There were some problems with the SC, even after the firmware upgrade: telnet sessions froze, and the SC didn't respond to ping.
Overall, we found the SC a bit disappointing. Even after the firmware upgrade, strange things happened: the first attempt to boot via the network ended with the machine asking us to file a bug:

    Sun Fire T200, No Keyboard
    Copyright 2005 Sun Microsystems, Inc. All rights reserved.
    OpenBoot 4.20.0, 8184 MB memory available, Serial #67274234.
    Ethernet address 0:14:4f:2:85:fa, Host ID: 840285fa.
    {0} ok show-nets
    a) /pci@7c0/pci@0/pci@2/network@0,1
    b) /pci@7c0/pci@0/pci@2/network@0
    c) /pci@780/pci@0/pci@1/network@0,1
    d) /pci@780/pci@0/pci@1/network@0
    q) NO SELECTION
    Enter Selection, q to quit: d
    /pci@780/pci@0/pci@1/network@0 has been selected.
    Type ^Y ( Control-Y ) to insert it in the command line.
    e.g. ok nvalias mydev ^Y
             for creating devalias mydev for /pci@780/pci@0/pci@1/network@0
    {0} ok boot /pci@780/pci@0/pci@1/network@0 -svV install
    Boot device: /pci@780/pci@0/pci@1/network@0 File and args: -svV install
    100 Mbps full duplex Link up
    Requesting Internet Address for 0:14:4f:2:85:fa
    ERROR: /packages/obp-tftp: Last Trap: Division by Zero
    [Exception handlers interrupted, please file a bug]
    [type 'resume' to attempt a normal recovery]

At this point, the telnet session was gone.
Notice that the machine thinks it's a T200 instead of a T2000. We didn't file a bug for that.
Sun's buzzword CoolThreads is well chosen. This output was captured immediately after powering the T2000 on:

    Sensor            Status  Temp
    ------------------------------
    PDB/T_AMB         OK      15
    MB/T_AMB          OK      15
    MB/CMP0/T_TCORE   OK      24
    MB/CMP0/T_BCORE   OK      24
    IOBD/IOB/TCORE    OK      23
    IOBD/T_AMB        OK      15

And this shows the output five hours later:

    Sensor            Status  Temp
    ------------------------------
    PDB/T_AMB         OK      17
    MB/T_AMB          OK      20
    MB/CMP0/T_TCORE   OK      36
    MB/CMP0/T_BCORE   OK      36
    IOBD/IOB/TCORE    OK      35
    IOBD/T_AMB        OK      22

The T2000 is well suited for resource management experiments. We experimented a bit with zones and the fair share scheduler. The results were convincing:

    # prctl -n zone.cpu-shares -v 5 -r -i zone global
    # prctl -n zone.cpu-shares -v 10 -r -i zone erie
    # prctl -n zone.cpu-shares -v 20 -r -i zone ontario

    ZONEID  NPROC  SIZE   RSS  MEMORY     TIME  CPU  ZONE
         2     68  292M  180M    2.2%  1:18:20  56%  ontario
         1     68  291M  180M    2.2%  0:58:50  29%  erie
         0     90  396M  258M    3.1%  0:38:06  14%  global

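To see why we call these results convincing, here is a small sketch that works out what the fair share scheduler should hand to each zone. It assumes that all three zones were competing for CPU at the time (shares only matter under contention) and that the CPU column in the zone summary above is a percentage of the whole machine. The function name `expected_cpu_percent` is ours, made up for this illustration.

```python
# CPU shares set with prctl above (zone name -> zone.cpu-shares value).
shares = {"global": 5, "erie": 10, "ontario": 20}

# CPU column from the zone summary above, in percent.
observed = {"global": 14, "erie": 29, "ontario": 56}


def expected_cpu_percent(shares):
    """Split 100% of the CPU in proportion to each zone's shares.

    Assumption: every zone is busy enough to use its full entitlement.
    """
    total = sum(shares.values())
    return {zone: 100.0 * value / total for zone, value in shares.items()}


expected = expected_cpu_percent(shares)
for zone in sorted(shares, key=shares.get, reverse=True):
    print(f"{zone:8s} expected {expected[zone]:5.1f}%  observed {observed[zone]:3d}%")
```

With shares of 5, 10 and 20, the entitlements are 1/7, 2/7 and 4/7 of the machine, which is within about a percentage point of what we measured.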
We're approaching the end of our two months with the T2000. Unfortunately, we haven't had enough time to run application benchmarks as planned. There's no doubt that the T2000 performs very well, though.
We won't keep this particular machine. There's one reason why we would prefer a more recently produced T2000: We heard that the internal disk controller will soon be functional, so the extra controller that now takes up one of the PCI-X slots won't be necessary any more.
Here are some very preliminary results from my investigation of the bzip2smp utility. The OS on the machine was build #31 of Solaris Nevada (aka Solaris 11 Early Access).
For comparison, I used the bzip2 that comes with Solaris:
bzip2, a block-sorting file compressor. Version 1.0.2, 30-Dec-2001
(I don't think using 1.0.3 would have made much difference.)
I built bzip2smp (v1.0) with Sun Studio 11 using generic options:
-O -xtarget=ultra -xarch=generic64 -fast
This produced a 32-bit binary. I did not create a 64-bit binary since (a) the Sun-provided bzip2 is 32-bit and (b) the bzip2smp utility was developed and tested under Linux. Such software usually has 64-bit problems, and I did not want to get into those. :-)
I created some big files in /tmp using the sequence:

    tar cvf tar /kernel
    cat tar tar tar > big
    cat big big big big > huge

The file big is 87.25 MB, the file huge is 348.95 MB.
I could have used /dev/random but I was more interested in something obviously compressible.
Everything ran in /tmp, no disk I/O.
I timed the runs using the tcsh builtin, time.
I ran both the regular bzip2 and the bzip2smp in parallel.
But these are rough qualitative results anyway.
I specified the bzip2smp thread count explicitly to avoid having to deal with that strange hyperthreading detection.
Here goes:

    bzip2 -c big > big.mono
    109.24u 0.66s 1:49.92 99.9%
    bzip2smp -p32 < big > big.smp
    118.44u 1.34s 0:16.71 716.8%
    Ratio of wall-clock time: 6.58 speedup

    bzip2 -c huge > huge.mono
    434.89u 2.21s 7:17.14 99.9%
    bzip2smp -p32 < huge > huge.smp
    474.47u 3.70s 1:00.31 792.8%
    Ratio of wall-clock time: 7.25 speedup

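For anyone who wants to redo the arithmetic, this is roughly how the speedup figures come out of the tcsh timings. It is only a sketch: the helper names `wall_seconds` and `speedup` are made up here, and the "share of ideal" column simply assumes that perfect scaling would mean a factor of 32, one per hardware thread.

```python
HARDWARE_THREADS = 32  # 8 cores x 4 hardware threads on this T2000

# Elapsed (wall-clock) times copied from the tcsh `time` output above.
runs = {
    "big": {"bzip2": "1:49.92", "bzip2smp": "0:16.71"},
    "huge": {"bzip2": "7:17.14", "bzip2smp": "1:00.31"},
}


def wall_seconds(elapsed):
    """Convert an elapsed time such as '1:49.92' (m:ss.ss) to seconds."""
    seconds = 0.0
    for part in elapsed.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def speedup(serial, parallel):
    """Ratio of wall-clock times: how many times faster the parallel run was."""
    return wall_seconds(serial) / wall_seconds(parallel)


for name, times in runs.items():
    factor = speedup(times["bzip2"], times["bzip2smp"])
    share = 100 * factor / HARDWARE_THREADS
    print(f"{name}: speedup {factor:.2f}, {share:.0f}% of ideal on {HARDWARE_THREADS} threads")
```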
I diff'ed both resulting files each time to check if they were bit-identical. They were. :-)
Incidentally, compressed size for big was 31.92 MB, huge was 127.82 MB.
BTW I'd have expected smaller size for huge since it is just 12 concatenated copies of the same data...
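A likely explanation for the missing savings: bzip2 compresses its input in independent blocks of at most 900 kB, so a repeat that lies further back than one block cannot be exploited. The sketch below illustrates this with Python's bz2 module rather than the bzip2 1.0.2 binary used above, and with synthetic data instead of the /kernel tarball, so take it as an illustration of the effect, not a reproduction of the measurement.

```python
import bz2
import random

UNIT_SIZE = 1_000_000  # larger than bzip2's maximum block size of 900 kB


def compressed_size(data):
    """Return the size in bytes of data after bzip2 compression at level 9."""
    return len(bz2.compress(data, compresslevel=9))


# Compressible but non-repeating test data: random text over 8 symbols.
rng = random.Random(0)
unit = bytes(rng.choices(b"abcdefgh", k=UNIT_SIZE))

single = compressed_size(unit)
repeated = compressed_size(unit * 4)  # same idea as `cat big big big big`

print(f"one copy:    {single} bytes")
print(f"four copies: {repeated} bytes ({repeated / single:.2f}x one copy)")
```

Four copies come out at roughly four times the size of one, which matches what I saw for big and huge.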
The upshot is that bzip2smp seems to be a big win on SMP systems.
Even if total time does not scale linearly with the number of available hardware threads, the savings in wall-clock time are significant.
I would have expected a bigger speedup factor (the box can do 32 hardware threads) but exploring this will have to wait a bit as we have to send the T2000 back to Sun tomorrow. :-(
The T2000 is on its way back to Sun. Thank you, Sun, for letting us take a close look.
We're looking forward to tomorrow: Denis Sheahan, UltraSPARC T1 performance expert, will be in Munich for a workshop. Denis is the author of a paper in Sun's Blueprint series.
Yesterday's workshop provided great insight and taught us a lot. Thanks, Denis!
Pictures from the workshop have been posted.
More good news: A future version of the ALOM will bring SSH.
The T2000 firmware version 6.2.4 has brought SSH support. Find the README with news about other enhancements at SunSolve.