HP/Convex SPP2000 (S-Class/X-Class)
| Quick Facts | |
|---|---|
| CPU | 4-16 PA-8000 180MHz |
| Caches | 1/1MB L1 |
| RAM | 16GB |
| Drives | 20 SCSI |
| Expansion | 24 PCI |
| Bandwidth | CPU 7.5GB/s Mem 15GB/s I/O 1.9GB/s XBAR 15.3GB/s SCI 3.8GB/s |
| I/O | SCSI Console SCI (CTI) |
Overview
The HP/Convex Exemplar SPP2000 systems are large, scalable PA-RISC compute servers and the direct predecessors of the later HP V-Class (V2200, V2500 et al.).
Originally developed by Convex, the SPP2000 is based on a crossbar architecture: a central internal switching component connects the system resources to each other by forming matrix connections between the devices’ input and output ports.
A single SPP2000 computer can hold up to sixteen 64-bit PA-8000 processors with 16GB of memory in a single Node; the resulting system is called S-Class in HP’s nomenclature. The SPP2000 can form a large-scale system by connecting single Nodes with SCI links (forming rings) into a larger cluster (Wall) of up to 32 nodes/512 processors. The resulting interconnected systems are called X-Class, in effect consisting of several S-Classes. Interconnected X-Classes are ccNUMA (cache-coherent Non-Uniform Memory Access) computers; for a detailed explanation see, for example, the ccNUMA section of the Wikipedia Non-Uniform Memory Access page. Interestingly, the clustering capabilities of the SPP2000’s successor, the V2500, were reduced significantly: in contrast to the 32-node maximum of SPP2000 clusters, V2500s can only be clustered in groups of four.
Like the other Exemplar systems, the SPP2000/S-Class is operated and controlled via so-called teststations: Unix workstations that connect to a central management board in each node, which provides booting, system monitoring and diagnostics, and console connections.
(These teststations were either IBM RS/6000 AIX systems or, later and more commonly, HP 9000 workstations running HP-UX.)
Introduced: 1996-97, with prices at the time of introduction ranging from $189,000 (SPP2000 Node/HP S-Class, four-CPU) to between $720,000 and $3 million (SPP2000 Cluster, HP X-Class).
Internals
CPU
- SPP2000 Node/S-Class: 4-16 PA-8000 180MHz with 1/1MB off-chip I/D L1 cache each
- SPP2000 Cluster/Wall/X-Class: 32-512 PA-8000 180MHz with 1/1MB off-chip I/D L1 cache each
Chipset
The SPP2000 is based on the Exemplar crossbar architecture which connects the CPU and I/O to the system main memory.
- The 8x8 nonblocking crossbar is the central part of the system; it connects the memory to the processor buses and I/O channels. There are eight ports for agents (serving CPUs and I/O; each agent connects to two CPUs and one I/O channel) and eight ports for memory. Each crossbar port has a path width of 64 bits, giving it 960MB/s peak bandwidth. The combined peak raw bandwidth of the crossbar is 15.3GB/s. The crossbar in the original Exemplar design (SPP1x00) was built from GaAs chips, the SPP2000’s in standard CMOS (1.1M transistors).
- Eight Data Mover/Agents attach to the crossbar and provide access for the processors (Runway buses) and I/O controllers (I/O channels) to the memory via the crossbar over a 1.9GB/s datapath (four 32-bit, unidirectional buses from two ports on the Agent connect to two crossbar ports). The I/O channels on the Agent have a maximum bandwidth of 240MB/s. Each Agent has two Runway processor buses (64-bit, bidirectional) which have an aggregate raw bandwidth of 960MB/s.
- Eight PCI controllers connect the 240MB/s I/O channels/PCI buses to the Agents.
- Eight Memory controllers each attach one four-way interleaved memory board to the Hyperplane crossbar. Each Memory controller has a bandwidth of 1.9GB/s. The memory controllers probably also interface with the CTI interconnect.
» View a system-level ASCII illustration of the crossbar architecture.
Buses
- Total crossbar bandwidth 15.3GB/s (intra-crossbar)
- CPU bandwidth 7.5GB/s (CPU-to-Agent, eight Runway 960MB/s buses)
- Memory bandwidth 15GB/s (memory-to-crossbar, sixteen 960MB/s links)
- I/O bandwidth 1.9GB/s (eight 240MB/s channels, I/O channel-to-Agent)
- Eight PCI-32 I/O buses for expansion slots (each 240MB/s)
- Attachments to SCI rings/CTI (Coherent Toroidal Interconnect) via two rings (X-ring and Y-ring), Node-to-Node bandwidth of 3.84GB/s; the rings operate at a clock of 120MHz with a width of 32 bits
- SCSI-2 Ultra main storage I/O bus
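The aggregate figures in this list can be reproduced from the per-link numbers. The following is a minimal sketch of that arithmetic, not vendor data: the function and variable names are illustrative, and the SCI total assumes one 120MHz, 32-bit ring link per memory controller (eight per node), which the text does not state explicitly.

```python
# Per-link peak figures in MB/s, as quoted in the list above.
CROSSBAR_PORT_MB_S = 960
RUNWAY_BUS_MB_S = 960
IO_CHANNEL_MB_S = 240


def aggregate(links: int, per_link_mb_s: int) -> int:
    """Peak aggregate bandwidth in MB/s of `links` identical parallel links."""
    return links * per_link_mb_s


def sci_link_mb_s(clock_mhz: int = 120, width_bits: int = 32) -> int:
    """One ring link: clock (MHz) times width (bytes) gives MB/s."""
    return clock_mhz * width_bits // 8


totals = {
    "crossbar": aggregate(16, CROSSBAR_PORT_MB_S),  # 8 agent + 8 memory ports
    "cpu": aggregate(8, RUNWAY_BUS_MB_S),
    "memory": aggregate(16, CROSSBAR_PORT_MB_S),
    "io": aggregate(8, IO_CHANNEL_MB_S),
    # Assumption (not stated in the text): one ring link per memory
    # controller, i.e. eight links per node.
    "sci": aggregate(8, sci_link_mb_s()),
}

for name, mb_s in totals.items():
    print(f"{name:9s} {mb_s:6d} MB/s  "
          f"{mb_s / 1000:6.2f} GB/s (x1000)  {mb_s / 1024:6.2f} GB/s (x1024)")
```

Under these assumptions the CPU and memory figures (7.5GB/s and 15GB/s) match a divisor of 1024, while the crossbar and SCI figures (15.3GB/s and 3.84GB/s) match a divisor of 1000. This is an observation from the arithmetic only; the sources do not say which convention was intended.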
Memory
- SDRAM DIMMs
- Two to eight memory boards per node
- Memory is up to four-way interleaved per memory board and up to 32-way interleaved per node
- SPP2000 Node/S-Class: 1GB minimum, 16GB maximum
- SPP2000 Wall/X-Class: 512GB maximum (with 32 nodes)
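The 32-way figure follows from eight memory boards with four-way interleaving each. As a rough illustration of what such interleaving means for addresses, here is a hypothetical mapping of a physical address to a board and bank. The actual SPP2000 mapping is not described in this article; the 32-byte line size, the "boards first" ordering, and the `locate` helper are all assumptions made for the sketch.

```python
LINE_BYTES = 32        # assumption: interleave granularity of one cache line
BANKS_PER_BOARD = 4    # four-way interleaving per memory board (from the text)
MAX_BOARDS = 8         # two to eight memory boards per node (from the text)


def locate(addr: int, boards: int = MAX_BOARDS) -> tuple[int, int]:
    """Return (board, bank) for a byte address under a simple modulo scheme.

    Hypothetical layout: consecutive lines rotate across boards first,
    then across the banks within each board.
    """
    if not 2 <= boards <= MAX_BOARDS:
        raise ValueError("a node holds two to eight memory boards")
    ways = boards * BANKS_PER_BOARD       # 32 ways with eight boards
    slot = (addr // LINE_BYTES) % ways    # position within one interleave round
    return slot % boards, slot // boards


# 32 consecutive lines land on 32 distinct (board, bank) pairs.
targets = {locate(line * LINE_BYTES) for line in range(32)}
print(len(targets))
```

The point of the sketch is only that sequential accesses are spread over all boards and banks before any one bank is revisited, which is what lets the memory system approach its aggregate bandwidth on streaming access patterns.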
Expansion
- 24 PCI 32-bit slots on eight PCI 32-bit channels
Drives
- 20 internal Ultra SCSI drives
Clustering
Multiple Exemplar SPP2000/HP S-Class systems can be connected together to form a single large system, a Wall/X-Class.
- Up to 32 single nodes can be clustered together to form a system with up to
  - 512 processors
  - 512GB of RAM
  - 768 PCI slots
  - 640 SCSI drives
- Clustered SPP2000s/X-Class are ccNUMA computers; they are not fully conformant to the PA-RISC 2.0 specification (and thus do not run standard HP-UX).
- Multiple systems are connected via two CTI rings: these links attach to the eight memory controllers of a node. A single system attaches to other single nodes and their respective crossbars with a node-to-node data rate of 3.8GB/s.
- The two rings are called X-ring and Y-ring.
- The links are Convex’s implementation of IEEE SCI, the Coherent Toroidal Interconnect (CTI).
- Each node’s main memory is globally accessible from other nodes on the CTI network (that is, local memory is globally shared).
- A part of each system’s main memory is reserved for cache memory for the CTI network (configured statically at boot time).
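The cluster maxima listed above are simply the per-node maxima multiplied by the node count. A small sketch makes the relationship explicit; the `NodeMax` class and `cluster_max` function are illustrative names, and the per-node values are taken from the Internals sections of this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeMax:
    """Maximum configuration of a single SPP2000 node (S-Class)."""
    cpus: int = 16
    ram_gb: int = 16
    pci_slots: int = 24
    scsi_drives: int = 20


MAX_NODES = 32


def cluster_max(nodes: int, node: NodeMax = NodeMax()) -> dict[str, int]:
    """Scale the per-node maxima to a Wall/X-Class of `nodes` nodes."""
    if not 1 <= nodes <= MAX_NODES:
        raise ValueError(f"node count must be between 1 and {MAX_NODES}")
    return {
        "nodes": nodes,
        "cpus": nodes * node.cpus,
        "ram_gb": nodes * node.ram_gb,
        "pci_slots": nodes * node.pci_slots,
        "scsi_drives": nodes * node.scsi_drives,
    }


print(cluster_max(32))
```

For 32 nodes this yields 512 processors, 512GB of RAM, 768 PCI slots and 640 SCSI drives. Note that the RAM total is the installed maximum; part of each node’s memory is reserved as CTI cache, so the usable amount is lower by a boot-time configured share that is not quantified here.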
External connectors
- 68-pin VHDCI Ultra LVD external SCSI
- Three DB9 male RS232C serial (local console, remote console, general purpose) via a DB25 M cable
- 10/100Mbit Ethernet TP/RJ45
- 10/100Mbit Ethernet TP/RJ45 LAN console
ROM update
There is a firmware update available for the SPP2000, which contains the latest version, 4.2.1.
- PF_CV220421.txt has details about the contents and installation of the patch.
- PF_CV220421.tar.gz contains the patch.
References
Articles
- Exemplar System Architecture, Hewlett-Packard/Convex (January 1997, archive.org mirror, accessed August 2008)
- SPP 2000 Architecture presentation (FTP, Postscript), Beth Richardson (N.d.: NCSA. Google archive accessed August 2008)
- A Comparative Evaluation of Hierarchical Network Architecture of the HP-Convex Exemplar (Postscript), Robert Castaneda, et al. (1997: in Proceedings of IEEE International Conference on Computer Design (ICCD’97)) [there is a mirrored PDF version from citeseer (accessed August 2008)]
Operating systems
- Convex SPP-UX, a heavily modified Mach-based operating system, which looks similar to HP-UX but is a completely different design. The later HP V-Class systems are able to run stock HP-UX (which was modified specially for the V-Class architecture).
Benchmarks
| Model | SPEC95 int | SPEC95 fp | SPEC95 rate, int | SPEC95 rate, fp |
|---|---|---|---|---|
| SPP2000/S-Class/X-Class | 11.8 | 18.7 | 92.5; 2-CPU: 183; 4-CPU: 363; 6-CPU: 539; 8-CPU: 713; 10-CPU: 867; 12-CPU: 1012; 16-CPU: 1307 | 141; 2-CPU: 276; 4-CPU: 524; 6-CPU: 739; 8-CPU: 935; 10-CPU: 1085; 12-CPU: 1220; 16-CPU: 1413 |
Compare these with other results on the Benchmarks page.
Physical dimensions
- Single node: 736×914×889 mm
- Weight of about 250kg