How are priority and fairshare set up in the cluster?

Fairshare

What are priority and fairshare?

In a shared system like our central cluster, jobs are submitted to a queue and run in a particular order. The order is determined by each job's numerical priority. This priority is computed by a multifactor priority algorithm, which can take into account the size of the job, the length of time it has waited in the queue, the user or group that submitted it, the QOS it was submitted to, and so on. The vast majority of priority in our cluster comes from fairshare.
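As a rough illustration of how a multifactor algorithm can combine these inputs, the sketch below computes a priority as a weighted sum of normalized factors. The factor names, the weights, and the weighted-sum form itself are assumptions chosen for illustration; they are not the cluster's actual configuration.

```python
from dataclasses import dataclass


@dataclass
class JobFactors:
    """Normalized factors, each between 0.0 and 1.0 (hypothetical names)."""
    fairshare: float
    age: float
    job_size: float
    qos: float


# Hypothetical weights. Fairshare is given by far the largest weight,
# mirroring the statement that most priority comes from fairshare.
WEIGHTS = {
    "fairshare": 100_000,
    "age": 1_000,
    "job_size": 500,
    "qos": 1_000,
}


def job_priority(factors: JobFactors) -> int:
    """Combine the factors into one numerical priority (weighted sum)."""
    total = (
        WEIGHTS["fairshare"] * factors.fairshare
        + WEIGHTS["age"] * factors.age
        + WEIGHTS["job_size"] * factors.job_size
        + WEIGHTS["qos"] * factors.qos
    )
    return round(total)


if __name__ == "__main__":
    job = JobFactors(fairshare=0.5, age=1.0, job_size=0.25, qos=0.0)
    print(job_priority(job))
```

With weights like these, a job that has waited as long as possible in the queue still gains far less priority than a modest change in its group's fairshare factor.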

Fairshare is a method of allocating shares of a cluster to different groups and scheduling jobs so that each group gets its "fair share". Usage is measured within a window of time, and a half-life determines how quickly the impact of a job decreases the further back in time it ran.
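The half-life idea can be sketched as exponential decay: usage that is one half-life old counts half as much as usage from today. The 7-day half-life and the CPU-hour units below are assumptions for illustration only; the page does not state the cluster's actual half-life.

```python
def decayed_usage(records, half_life_days):
    """Sum past usage, halving the weight of each record per half-life.

    records: iterable of (age_days, cpu_hours) pairs, where age_days is
    how long ago the usage happened (hypothetical units).
    """
    return sum(
        cpu_hours * 0.5 ** (age_days / half_life_days)
        for age_days, cpu_hours in records
    )


if __name__ == "__main__":
    # Three jobs of 100 CPU-hours each: today, 7 days ago, 14 days ago.
    history = [(0, 100.0), (7, 100.0), (14, 100.0)]
    # With an assumed 7-day half-life they count as 100 + 50 + 25.
    print(decayed_usage(history, half_life_days=7))
```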

What does this mean in practice?

If your group has run a lot of jobs recently and exceeded its "fair share", the priority of jobs submitted by the group will go down. If your group has not used the cluster much within the window, your priority will go up. The scheduler also takes a few other factors into account in the priority calculation, but with much lower impact. In short, the idea is to keep a level playing field for all groups.
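One commonly used way to turn this behavior into a number is a fairshare factor of the form 2^(-usage / share), which equals 1.0 for an idle group, 0.5 for a group that has used exactly its share, and falls toward 0 as the group overuses. This is a sketch under the assumption of that formula; the cluster's scheduler may compute its factor differently, and the function and parameter names are hypothetical.

```python
def fairshare_factor(usage_fraction, share_fraction):
    """Return a factor in (0, 1]; higher means higher priority.

    usage_fraction: the group's portion of recent (decayed) cluster usage.
    share_fraction: the group's portion of all allocated shares.
    Both are fractions between 0 and 1 (assumed normalization).
    """
    if share_fraction <= 0:
        return 0.0  # a group with no shares gets no fairshare priority
    return 2.0 ** (-usage_fraction / share_fraction)


if __name__ == "__main__":
    for used in (0.0, 0.05, 0.1, 0.2):
        # Assume the group holds 10% of the shares.
        print(used, fairshare_factor(used, share_fraction=0.1))
```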


How are shares set up on the central cluster?

Shares of the cluster are determined by group size and investment in the cluster.

Non Investor:
Groups get 1 fairshare "user unit" per user.
 
Example:
10 users
10 x "user unit"
 
Investor:
A Caltech investor group gets 2 fairshare "user units" per user plus 1 fairshare credit per dollar spent.
 
Example:
Investment $100,000
10 users
100,000 + (10 x 2 x "user unit")
 
Note: FY23 User Unit = 2,763 (recalculated yearly)
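
The two examples above can be worked through numerically. The sketch below assumes that a "user unit" is expressed in the same fairshare credits as the per-dollar credit, and that the investor expression reads as investment plus (users x 2 x user unit). The function name is hypothetical.

```python
USER_UNIT_FY23 = 2_763  # from the note above; recalculated yearly


def group_shares(users, investment_dollars=0, user_unit=USER_UNIT_FY23):
    """Estimate a group's fairshare credits under the stated assumptions."""
    if investment_dollars > 0:
        # Investor: 2 user units per user plus 1 credit per dollar spent.
        return investment_dollars + 2 * users * user_unit
    # Non-investor: 1 user unit per user.
    return users * user_unit


if __name__ == "__main__":
    print(group_shares(10))                              # non-investor, 10 users
    print(group_shares(10, investment_dollars=100_000))  # investor, 10 users
```

Under these assumptions, the 10-user non-investor group holds 27,630 credits and the 10-user investor group holds 155,260.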