-- CarstenPreuss - 12 Apr 2011
Current Environment
Currently SGE is installed on lxlsftest and is exported via NFS to all involved (submit and batch) nodes:
lxb281-lxb640: all nodes have SGE mounted and are configured as ExecutionHosts. Daemons are only started on the Lenny64 nodes so far. All ExecutionHosts should also be SubmitHosts.
lxir001/sge-submit is the central submit and administration host. Alice desktops will also become submit hosts in the near future.
lxgrid3 is an ExecutionHost and is SGE-enabled by copying the directory /SGE from the master.
The advantage of this setup is that no NFS mounts are needed. The disadvantage concerns LoadSensors and SystemScripts such as prolog, startup and so on, because these scripts must then be synced by other means.
In addition, the qacct command delivers wrong information on this host, because it has no access to the central accounting file.
Maybe a mixed setup with only some small directories/files mounted and the rest local is a good idea...
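
The qacct problem on a non-NFS host can be checked with a small script. The following is only a sketch: it assumes the usual SGE layout in which the accounting file lives at `<sge_root>/<cell>/common/accounting`, and it assumes /SGE as the root and `default` as the cell name when SGE_ROOT and SGE_CELL are not set. The function names are hypothetical.

```python
import os


def accounting_file(sge_root, cell="default"):
    """Return the expected path of the accounting file.

    Assumes the standard layout <sge_root>/<cell>/common/accounting.
    """
    return os.path.join(sge_root, cell, "common", "accounting")


def qacct_usable(sge_root, cell="default"):
    """True if the accounting file exists and is readable on this host."""
    path = accounting_file(sge_root, cell)
    return os.path.isfile(path) and os.access(path, os.R_OK)


if __name__ == "__main__":
    # /SGE and "default" are fallbacks assumed for this sketch.
    root = os.environ.get("SGE_ROOT", "/SGE")
    cell = os.environ.get("SGE_CELL", "default")
    state = "readable" if qacct_usable(root, cell) else "missing or unreadable"
    print(accounting_file(root, cell), "is", state)
```

Note that on a host with a copied /SGE tree the file may well exist but be a stale copy, so a positive result only shows that qacct can run, not that its numbers are current.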
Planned Environment
The new nodes (lxb681-lxb800), which should arrive in May 2011, are planned to be used via SGE only.
They will be installed under Lenny64, in a non-NFS environment. How SGE will be set up there is currently unclear.
These cores will mainly be used by Alice, for both GRID and local jobs, because both depend only on /var/gsi/gridhome and /d respectively /u, LXadmin.Lustre and in some cases /d.
Queue structure
| Queue | Purpose | Access | Priority |
| alicegrid | for GRID jobs | alicesgm/alivo* | low |
| proof | for PoD jobs | Alice | high |
| mpi | for MPI jobs | Theory | high |
| high_mem | for high memory jobs | Theory | mid |
| virtual | for jobs running in virtual nodes | only for dedicated users | mid |
| long | for jobs with long runtime (> 1 day) | all users | low |
The priorities are set up via the POSIX priority (nice value) of each queue and via load/suspend thresholds.
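
How the priority levels of the table could be turned into queue settings is sketched below. The script only prints `qconf -mattr` commands for the queue attributes `priority` (the nice value) and `suspend_thresholds`; it does not run them. The nice values and the load thresholds per level are purely hypothetical placeholders, not the values used on the cluster, and the function name is made up for this example.

```python
# Hypothetical mapping of the priority levels to queue attributes.
# None of these numbers come from the actual configuration.
LEVELS = {
    "high": {"priority": 0, "suspend_thresholds": "NONE"},
    "mid": {"priority": 10, "suspend_thresholds": "np_load_avg=2.0"},
    "low": {"priority": 19, "suspend_thresholds": "np_load_avg=1.5"},
}

# Queue names and priority levels as listed in the table.
QUEUES = [
    ("alicegrid", "low"),
    ("proof", "high"),
    ("mpi", "high"),
    ("high_mem", "mid"),
    ("virtual", "mid"),
    ("long", "low"),
]


def qconf_commands(queues=QUEUES, levels=LEVELS):
    """Build one 'qconf -mattr queue <attr> <value> <queue>' call per attribute."""
    commands = []
    for name, level in queues:
        for attr, value in levels[level].items():
            commands.append(["qconf", "-mattr", "queue", attr, str(value), name])
    return commands


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for cmd in qconf_commands():
        print(" ".join(cmd))
```

Printing the commands first keeps the sketch safe to run on any host; the output can be reviewed and then executed on the administration host if the values fit.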