-- CarstenPreuss - 16 May 2011
Resource based (SGE) versus slot based (LSF) scheduling
In LSF we run 1 job per core. Most of the time this is inefficient, because we mainly run analysis jobs that do not use the CPU fully, so overprovisioning with 1.0 to 1.5 jobs per core would be nice to have.
Advantage:
- -can use resources more efficiently than static slot based scheduling
Disadvantage:
- -needs more memory because more jobs are active
But this should still be better than the current LSF setup with running and suspended jobs.
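
To make the trade-off between overprovisioning and memory concrete, the following minimal sketch computes how many jobs a node would accept for a given jobs-per-core factor and how much memory is left per job. The node size (8 cores, 32 GB) and the function names are assumptions chosen for illustration; they are not values from the actual cluster.

```python
import math


def job_slots(cores: int, factor: float) -> int:
    """Number of jobs allowed on a node for a given jobs-per-core factor."""
    if factor < 1.0:
        raise ValueError("a factor below 1.0 would leave cores idle")
    return math.floor(cores * factor)


def memory_per_job_gb(total_memory_gb: float, cores: int, factor: float) -> float:
    """Memory available to each job if the node is completely filled."""
    return total_memory_gb / job_slots(cores, factor)


if __name__ == "__main__":
    # Hypothetical node: 8 cores, 32 GB RAM (assumed values).
    for factor in (1.0, 1.25, 1.5):
        slots = job_slots(8, factor)
        mem = memory_per_job_gb(32.0, 8, factor)
        print(f"factor {factor}: {slots} jobs, {mem:.2f} GB per job")
```

With these assumed numbers, going from 1.0 to 1.5 jobs per core raises the job count from 8 to 12 while the memory share per job drops from 4 GB to about 2.7 GB, which is the disadvantage noted above.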
- job to core binding
- allocation rules Round Robin or Fill Up (custom strategies should also be possible in the future)
- live monitoring of used resources
- -not the current resource usage
- -but it should be possible to determine whether a job is still doing something or not
- jobs which can't start will go back to Eqw
- -with a short explanation of why the job can't start
- -solve the problem, delete the error marker, and the job will start again
- source code available
- custom load sensors and complex values
- nice suspend mechanism
- load/suspend thresholds
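
The difference between the Round Robin and Fill Up allocation rules listed above can be illustrated with a small simulation. In SGE these correspond to the `$round_robin` and `$fill_up` values of a parallel environment's `allocation_rule`; the code below is only a simplified sketch of the idea, not SGE's implementation, and the host names and slot counts are made up.

```python
from typing import Dict


def round_robin(free_slots: Dict[str, int], requested: int) -> Dict[str, int]:
    """Hand out one slot per host in turn until the request is satisfied."""
    if requested > sum(free_slots.values()):
        raise ValueError("not enough free slots")
    remaining = dict(free_slots)
    granted = {host: 0 for host in free_slots}
    while requested > 0:
        for host in remaining:
            if requested == 0:
                break
            if remaining[host] > 0:
                remaining[host] -= 1
                granted[host] += 1
                requested -= 1
    # Drop hosts that received nothing.
    return {host: n for host, n in granted.items() if n > 0}


def fill_up(free_slots: Dict[str, int], requested: int) -> Dict[str, int]:
    """Take all free slots of one host before moving on to the next."""
    if requested > sum(free_slots.values()):
        raise ValueError("not enough free slots")
    granted: Dict[str, int] = {}
    for host, free in free_slots.items():
        if requested == 0:
            break
        take = min(free, requested)
        if take > 0:
            granted[host] = take
            requested -= take
    return granted


if __name__ == "__main__":
    # Hypothetical hosts; dict order is used as the host order.
    free = {"node1": 4, "node2": 4, "node3": 2}
    print(round_robin(free, 5))  # {'node1': 2, 'node2': 2, 'node3': 1}
    print(fill_up(free, 5))      # {'node1': 4, 'node2': 1}
```

Round Robin spreads a job over as many hosts as possible, while Fill Up packs it onto as few hosts as possible; a custom strategy would be another function with the same signature.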