Katana Cluster
Overview
In September 2007, we added an IBM BladeCenter (referred to as the Katana Cluster) to our array of supercomputing systems. This system originally comprised 14 IBM LS21 blade servers, but we have made several additions since then. We currently have 42 blade servers installed, with a total of 192 processors.
We currently have blade servers in the following configurations:
- Six with two quad-core 3.0 GHz Intel Xeon E5450 processors. The eight processor cores on each blade share 16 GB of system memory and 50 GB of local /scratch space. Each of the processor cores has a 32 KB L1 data cache, a 32 KB L1 instruction cache, and a 12 MB L2 cache.
- Twenty-two with two dual-core 2.6 GHz AMD Opteron 2218HE processors. The four processor cores on each blade share 8 GB of system memory and 50 GB of local /scratch space. Each of the processor cores has a 64 KB L1 data cache, a 64 KB L1 instruction cache, and a 1 MB L2 cache.
- Fourteen with two dual-core 2.4 GHz AMD Opteron 2216HE processors. The four processor cores on each blade share 8 GB of system memory and 50 GB of local /scratch space. Each of the processor cores has a 64 KB L1 data cache, a 64 KB L1 instruction cache, and a 1 MB L2 cache.
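As a quick cross-check of the totals quoted in the overview, the blade and core counts can be tallied from the three configurations listed above. This is only an illustrative sketch; the names BLADE_CONFIGS and cluster_totals are made up for this example and are not part of any software on the cluster.

```python
# Blade configurations as listed above:
# (description, number of blades, processor cores per blade)
BLADE_CONFIGS = [
    ("3.0 GHz Intel Xeon E5450", 6, 8),      # two quad-core processors per blade
    ("2.6 GHz AMD Opteron 2218HE", 22, 4),   # two dual-core processors per blade
    ("2.4 GHz AMD Opteron 2216HE", 14, 4),   # two dual-core processors per blade
]


def cluster_totals(configs):
    """Return (total blades, total processor cores) for the given configurations."""
    blades = sum(count for _, count, _ in configs)
    cores = sum(count * cores_per_blade for _, count, cores_per_blade in configs)
    return blades, cores


total_blades, total_cores = cluster_totals(BLADE_CONFIGS)
print(total_blades, total_cores)  # 42 192
```

The result matches the 42 blade servers and 192 processors stated in the overview.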
Each blade server is connected via 1 Gigabit/second Ethernet and, with the exception of the 2.4 GHz blades, 4X InfiniBand. The 4X InfiniBand network operates at 10 Gbps with less than 200 nanoseconds of port latency.
The operating system running on the Katana Cluster is BULinux 5.0, a derivative of Red Hat Enterprise Linux 5.0. The system runs a 64-bit kernel, and both 64-bit and 32-bit libraries are installed, so you can build and execute both 32-bit and 64-bit binaries on all systems in the Katana Cluster.
Users with accounts on the Katana Cluster must use ssh to log in to katana.bu.edu. Passwords are shared across the Scientific Computing Facilities (SCF), so if you already have an account and password on our other systems, you will have the same login and password on the Katana Cluster. If you are a member of the BU community with a BU Kerberos password, you can also use that password to access the Katana or Linux Clusters, but you must still have an SCF account.
Help Information
This page provides only very basic information on the Katana Cluster. For additional information, please follow the sidebar links.
For general questions or to report system problems, please send email to help@katana.bu.edu.
For more information or help in using or porting applications to the Katana Cluster, please contact Kadin Tseng (kadin@bu.edu) or Doug Sondak (sondak@bu.edu).
If you have questions regarding your computer account or resource allocations, please send email to scfacct@bu.edu.
Allocations and Accounting
Our allocations policy is outlined in our SCF Users Information document. Usage on the Katana Cluster is charged at a rate of 1.0 SUs (Service Units) per CPU (processor) hour used on the 2.6 GHz AMD Opteron 2218HE blades, 0.9 SUs per CPU hour on the slightly slower 2.4 GHz AMD Opteron 2216HE blades, and 1.5 SUs per CPU hour on the 3.0 GHz Intel Xeon E5450 blades. These rates reflect the relative performance of each set of nodes compared to the baseline, but now retired, p690 processors.
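To illustrate how these rates translate into a charge, the following minimal sketch estimates the SUs for a job. It assumes that CPU hours are counted as the number of processors multiplied by the wallclock hours used, which is an assumption of this example rather than a statement of the accounting system's exact method. The dictionary keys and the function name su_charge are hypothetical.

```python
# SU rates per CPU hour for each blade type, as quoted above.
# The keys are labels invented for this example.
SU_RATES = {
    "opteron_2.6ghz": 1.0,  # AMD Opteron 2218HE blades
    "opteron_2.4ghz": 0.9,  # slower AMD Opteron blades
    "xeon_3.0ghz": 1.5,     # Intel Xeon E5450 blades
}


def su_charge(blade_type, processors, wallclock_hours):
    """Estimate the SU charge for a job.

    Assumption: CPU hours = processors * wallclock hours.
    """
    cpu_hours = processors * wallclock_hours
    return SU_RATES[blade_type] * cpu_hours


# Example: 8 processors for 10 hours on the Xeon blades.
print(su_charge("xeon_3.0ghz", 8, 10))  # 120.0
```

Under the same assumption, the same 80 CPU hours on the 2.6 GHz Opteron blades would cost 80 SUs, so the faster nodes are only cheaper overall if the job finishes correspondingly sooner.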
File Systems
Users have one shared home directory for the Katana Cluster, Linux Cluster, and IBM Blue Gene.
As with our other systems, home directories on the Katana Cluster are backed up nightly. If you accidentally remove a file, you can request that it be restored by sending email to help@katana.bu.edu. Make sure to specify the name (and full path such as katana:/usr1/scv/aarondf/Temp/myfile) of the file(s) that have been deleted and the date the deletion occurred.
Our file storage section has additional information on what resources are available for storing your files.
Usage Policies and Batch
The login node for the Katana Cluster is katana.bu.edu. Use this machine for compiling and interactive development. General interactive login sessions are only allowed on this machine.
The batch system running on the Katana Cluster is Sun Grid Engine. Users are responsible for setting up a batch script indicating the runtime limit and other runtime parameters. Consult this page for more detailed information on preparing your batch script and running your job. Please refer to the sidebar links for further information on code compilation and batch executions.
Individual jobs have a 24-hour wallclock runtime limit, and a user may not have more than 32 processors allocated at any one time.
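Before submitting, it can help to check a planned job against these two limits. The sketch below is a hypothetical helper, not part of Sun Grid Engine or any tool installed on the cluster; the function name check_request and its parameters are invented for illustration, and the batch system itself remains the authority on what is accepted.

```python
# Limits stated above.
MAX_WALLCLOCK_HOURS = 24      # per-job wallclock runtime limit
MAX_PROCESSORS_PER_USER = 32  # processors one user may hold at any one time


def check_request(requested_processors, requested_hours, processors_in_use=0):
    """Return a list of problems with a planned job request (empty if none).

    processors_in_use is the number of processors the user already has
    allocated to other running jobs.
    """
    problems = []
    if requested_hours > MAX_WALLCLOCK_HOURS:
        problems.append(
            f"runtime of {requested_hours} h exceeds the "
            f"{MAX_WALLCLOCK_HOURS} h wallclock limit"
        )
    total = processors_in_use + requested_processors
    if total > MAX_PROCESSORS_PER_USER:
        problems.append(
            f"{total} processors in total exceeds the "
            f"{MAX_PROCESSORS_PER_USER} processor per-user limit"
        )
    return problems


print(check_request(16, 12))                        # []
print(check_request(16, 30))                        # one problem: runtime
print(check_request(16, 12, processors_in_use=24))  # one problem: processors
```

A job that needs more than 24 hours would have to be split into shorter runs, for example by writing restart files and resubmitting.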