-- CarstenPreuss - 12 Apr 2011

Current Environment

SGE is currently installed on lxlsftest and exported via NFS to all participating submit and batch nodes:

lxb281-lxb640: all nodes have SGE mounted and are configured as ExecutionHosts. So far, daemons are started only on the Lenny64 nodes. All ExecutionHosts should also be SubmitHosts.
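Registering every ExecutionHost as a SubmitHost could be scripted along the following lines. This is a minimal sketch: the helper names are hypothetical, and it only builds the `qconf -as <host>` command lines (the SGE call that adds a submit host) without running them.

```python
import re

def expand_host_range(first, last):
    """Expand a range such as 'lxb281'..'lxb640' into all host names in between."""
    m_first = re.fullmatch(r"([a-z]+)(\d+)", first)
    m_last = re.fullmatch(r"([a-z]+)(\d+)", last)
    if not m_first or not m_last or m_first.group(1) != m_last.group(1):
        raise ValueError("host names must share a prefix and end in a number")
    prefix = m_first.group(1)
    width = len(m_first.group(2))  # keep zero padding if the names use it
    start, stop = int(m_first.group(2)), int(m_last.group(2))
    return [f"{prefix}{n:0{width}d}" for n in range(start, stop + 1)]

def submit_host_commands(hosts):
    """Build one 'qconf -as <host>' argument list per host (nothing is executed)."""
    return [["qconf", "-as", host] for host in hosts]

hosts = expand_host_range("lxb281", "lxb640")
commands = submit_host_commands(hosts)
print(len(commands), "commands, first:", " ".join(commands[0]))
```

The argument lists could be passed to `subprocess.run` on an administration host once they have been reviewed.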

lxir001/sge-submit is the central submit and administration host. Alice desktops will become submit hosts in the near future, too.

lxgrid3 is an ExecutionHost and is SGE-enabled by copying the directory /SGE from the master. The advantage of this setup is that no NFS mounts are needed. The disadvantage shows up with LoadSensors and system scripts such as prolog, startup and so on, because these scripts must then be synced by other means. In addition, the qacct command reports wrong information, because it has no access to the central accounting file.

A mixed setup, with only a few small directories and files mounted and the rest kept local, may be a good idea.
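For a host that works from a local copy, such as lxgrid3, one way to notice that load sensors or prolog scripts have drifted from the master is to compare checksums of the two trees. The following is only a sketch: the function names are hypothetical, and both directory arguments are placeholders for wherever the master and the copy are reachable.

```python
import hashlib
from pathlib import Path

def file_digests(root):
    """Map each file path (relative to root) to the SHA-256 of its content."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }

def out_of_sync(master_root, copy_root):
    """Return relative paths that are missing from the copy or differ from the master."""
    master = file_digests(master_root)
    copy = file_digests(copy_root)
    return sorted(name for name, digest in master.items() if copy.get(name) != digest)
```

Calling `out_of_sync(master_dir, local_dir)` returns the files that need to be copied again. Files that exist only in the copy are ignored in this sketch.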

Planned Environment

The new nodes (lxb681-lxb800), which should arrive in May 2011, are planned to be used via SGE only.

They will be installed under Lenny64 in a non-NFS environment. How SGE will be set up there is currently unclear.

These cores will mainly be used by Alice, for both GRID and local jobs, because GRID jobs depend only on /var/gsi/gridhome and /d, and local jobs depend only on /u, LXadmin.Lustre and in some cases /d.

Queue structure

| Queue | Purpose | Access | Priority |
| alicegrid | for GRID jobs | alicesgm/alivo* | low |
| proof | for PoD jobs | Alice | high |
| mpi | for MPI jobs | Theory | high |
| high_mem | for high memory jobs | Theory | mid |
| virtual | for jobs running in virtual nodes | only for dedicated users | mid |
| long | for jobs with long runtime (> 1 day) | all users | low |

The priorities are set up via POSIX priority and load/suspend thresholds.
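How the three priority classes might translate into queue attributes can be sketched as follows. The attribute names `qname`, `priority`, `load_thresholds` and `suspend_thresholds` are those of the SGE queue configuration, where `priority` is the nice value (a higher number means a lower scheduling priority). All numeric values below are illustrative assumptions and not the settings actually used.

```python
# Hypothetical values per priority class; the real thresholds are not documented here.
PRIORITY_CLASSES = {
    "high": {"priority": 0, "load_thresholds": "np_load_avg=1.75",
             "suspend_thresholds": "NONE"},
    "mid": {"priority": 10, "load_thresholds": "np_load_avg=1.50",
            "suspend_thresholds": "np_load_avg=2.00"},
    "low": {"priority": 19, "load_thresholds": "np_load_avg=1.25",
            "suspend_thresholds": "np_load_avg=1.75"},
}

# Queue to priority class, taken from the queue table above.
QUEUES = {
    "alicegrid": "low",
    "proof": "high",
    "mpi": "high",
    "high_mem": "mid",
    "virtual": "mid",
    "long": "low",
}

def queue_attributes(queue):
    """Render the priority-related attributes of one queue as 'name value' lines."""
    settings = PRIORITY_CLASSES[QUEUES[queue]]
    lines = [f"qname {queue}"]
    lines += [f"{name} {value}" for name, value in settings.items()]
    return "\n".join(lines)

for name in QUEUES:
    print(queue_attributes(name))
    print()
```

With values like these, jobs in low-priority queues would run at a higher nice value and would be suspended earlier under load than jobs in high-priority queues.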
Topic revision: r5 - 2011-11-07, BastianNeuburger