Fair Share

Fair share is the LSF method for calculating user priorities.

New jobs of every user are scheduled according to this priority. It is calculated from the user's history of batch farm usage:
  • the CPU time
  • the total run time
  • the number of jobs

The priority is recalculated every time one of these quantities changes. Thus a user's priority drops as soon as new jobs actually start running ($job\_slots$) and then decreases gradually over time ($cpu\_time$, $run\_time$).

The priority list can be displayed with bqueues -l queuename.
  • This command shows only a snapshot of the priority distribution: users who have many jobs running at that moment, or whose jobs have already been running for a long time, appear low on the list. The list therefore cannot explain why one user's jobs are running while another user's jobs are still pending, because it shows the priorities after those jobs have started, not the priorities at the time they were dispatched.

Fairshare Priority Formula

 \begin{displaymath} priority = \frac{number\_shares}{cpu\_time \cdot \texttt{CPU\_TIME\_FACTOR}+run\_time \cdot \texttt{RUN\_TIME\_FACTOR}+(1+job\_slots) \cdot \texttt{RUN\_JOB\_FACTOR}} \end{displaymath}


  • $number\_shares$: Number of shares assigned to the user; identical for all users and all queues at GSI.
  • $cpu\_time$: Cumulative CPU time used by the user.
  • $run\_time$: Total run time of running jobs.
  • $job\_slots$: Number of job slots reserved and in use.

A decay factor is applied to the CPU time as well as to the total run time, such that after $\texttt{HIST\_HOURS}$ hours the $cpu\_time$ or the $run\_time$ has dropped to 10% of its original value.
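
As a rough illustration of this decay, the following Python sketch assumes a plain exponential form. The document only fixes the 10% value at $\texttt{HIST\_HOURS}$, so the shape of the curve in between is an assumption, and the function name `decayed` is hypothetical; this is not LSF's actual implementation.

```python
HIST_HOURS = 168.0  # value quoted for GSI in this document


def decayed(value, age_hours, hist_hours=HIST_HOURS):
    """Return what is left of `value` after `age_hours`.

    Assumption: exponential decay, scaled so that exactly 10% of the
    original value remains after `hist_hours`.
    """
    return value * 0.1 ** (age_hours / hist_hours)


# Remaining share of 100 units of usage at a few ages (in hours).
for age in (0.0, 24.0, 84.0, 168.0):
    print(age, decayed(100.0, age))
```

Under this assumption, usage that is one week old (168 hours) still counts with a tenth of its original weight, and usage that is two weeks old with a hundredth.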

The relative importance of these factors for the calculation of the priority is given by the following values:

  • $\texttt{CPU\_TIME\_FACTOR} = 0.7$ (default: 0.7)
  • $\texttt{RUN\_TIME\_FACTOR} = 2.0$ (default: 0.7)
  • $\texttt{RUN\_JOB\_FACTOR} = 10$ (default: 3)
  • $\texttt{HIST\_HOURS} = 168$ (default: 5)
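
To make the effect of these values concrete, here is a minimal Python sketch that evaluates the priority formula with the factors listed above. The function name `fairshare_priority` and the share count of 10 are hypothetical, chosen for illustration only; the unit of $cpu\_time$ and $run\_time$ is not stated here, so the sketch leaves it to the caller.

```python
# Factor values quoted for GSI in this document.
CPU_TIME_FACTOR = 0.7
RUN_TIME_FACTOR = 2.0
RUN_JOB_FACTOR = 10.0


def fairshare_priority(number_shares, cpu_time, run_time, job_slots):
    """Evaluate the fairshare priority formula for one user.

    cpu_time and run_time are the (already decayed) usage values; their
    unit is not stated in this document and is left to the caller.
    """
    denominator = (cpu_time * CPU_TIME_FACTOR
                   + run_time * RUN_TIME_FACTOR
                   + (1 + job_slots) * RUN_JOB_FACTOR)
    return number_shares / denominator


# Hypothetical user with 10 shares (illustrative value only).
print(fairshare_priority(10, cpu_time=0, run_time=0, job_slots=0))  # 1.0
print(fairshare_priority(10, cpu_time=0, run_time=0, job_slots=4))  # 0.2
```

With no recorded usage the denominator reduces to $\texttt{RUN\_JOB\_FACTOR}$, and starting four jobs alone cuts the priority to a fifth before any CPU or run time has accumulated.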

-- HelmutKreiser - 19 Sep 2005 -- ThomasRoth - 26 Oct 2007
Topic revision: r6 - 2007-10-31, ThomasRoth