Allocations and monitoring
Resource allocations
Resource allocations (compute time, data storage) are given to projects based on the appropriate grant, agreement, service contract or EuroHPC call. Users are part of at least one project, and may be linked to several projects.
On MeluXina, projects receive a Project ID (e.g. p212345) which is used:
- for the SLURM account that users are members of (access control, compute time allocations, etc.; see the example after this list)
- for the project data directory/directories (access control, data quotas, etc.)
- in service requests users open on the ServiceDesk
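For instance, the Project ID is the value passed as the Slurm account when submitting a job (p212345 is the example ID from above, and myjob.sh is a hypothetical job script):
(login/compute)$ sbatch -A p212345 myjob.sh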
For compute time allocations, the overall allocation is divided into monthly grants, which depend on the number of days in the month.
As a practical example (a scripted version of this calculation is shown after the list):
- a 1-year project (Jan 1st - Dec 31st 2022) is granted an allocation of 1200 GPU node-hours
- the GPU node-hours are converted to node-minutes: 1200 x 60 = 72000
- this is equivalent to 72000 / 365 ≈ 197.26 node-minutes per day
- in January, the project will be able to use 197.26 x 31 ≈ 6115 GPU node-minutes, or almost 102 GPU node-hours
- in February, this will be 197.26 x 28 ≈ 5523 GPU node-minutes, or just over 92 GPU node-hours
- the project's compute time utilization is set to zero on Feb 1st, and the new monthly allocation is applied
- the same process is applied throughout the lifetime of the project
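For reference, the monthly grant calculation above can be reproduced with a quick command (a minimal sketch using the example values, 1200 GPU node-hours over 365 days; adjust them to your own allocation):
(login/compute)$ awk -v hours=1200 -v days=365 'BEGIN {
    per_day = hours * 60 / days;                                  # node-minutes available per day (~197.26)
    printf "January : %.0f node-minutes (~%.1f node-hours)\n", per_day * 31, per_day * 31 / 60;
    printf "February: %.0f node-minutes (~%.1f node-hours)\n", per_day * 28, per_day * 28 / 60;
}'
Output
January : 6115 node-minutes (~101.9 node-hours)
February: 5523 node-minutes (~92.1 node-hours)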
For data allocations, the overall allocation is granted throughout the lifetime of the project.
Compute time utilisation
Users shall ensure that they utilize their compute time allocation consistently and proportionally throughout the project lifetime. The division of compute time allocations over the project duration and the roll-over process are for now tentative and subject to change.
Remember!
Users shall submit jobs only to partitions to which they have been granted access; jobs submitted to partitions for which no node-hours have been granted will fail to execute.
Myquota
To monitor the usage of your resource allocations, the myquota tool is available on the MeluXina login nodes.
The tool will show you:
- Compute time allocations, for each project (SLURM account) you have access to:
    - in node-hours, for every type of computing resource (CPU, GPU, FPGA and LargeMem nodes)
    - used out of the total available (Max) in the current month
- Data storage allocations, for each project (group account) you have access to:
    - in GiB (Gibibytes), for data size
    - in number of files, for the metadata allocation
    - used at the current moment, out of the maximum (Max) allocated for the project
myquota
By default, myquota will show both compute and data utilization and allocations.
(login/compute)$ myquota
Output
COMPUTE ALLOCATIONS FOR CURRENT MONTH
Project             CPU node-hours        GPU node-hours        FPGA node-hours       LargeMem node-hours
                    Used |  Max | Use%    Used |  Max | Use%    Used |  Max | Use%    Used |  Max | Use%
meluxinaproject        5 |   10 |  50%       0 |  100 |   0%       1 |    4 |  25%       0 |    4 |   0%
DATA ALLOCATIONS
Datapath                          Data GiB                No. Files
                                  Used |  Max | Use%      Used |     Max | Use%
/home/users/myuser                   1 |  100 |   1%     13216 |  100000 |  13%
/project/home/meluxinaproject      146 | 5000 |   2%    437033 | 5000000 |   8%
/project/scratch/meluxinaproject    29 |  100 |  29%     41160 |  100000 |  41%
Myquota can also be used to display just compute time usage and allocations:
(login/compute)$ myquota -t compute
Output
COMPUTE ALLOCATIONS FOR CURRENT MONTH
Project             CPU node-hours        GPU node-hours        FPGA node-hours       LargeMem node-hours
                    Used |  Max | Use%    Used |  Max | Use%    Used |  Max | Use%    Used |  Max | Use%
meluxinaproject        5 |   10 |  50%       0 |  100 |   0%       1 |    4 |  25%       0 |    4 |   0%
Myquota can also be used to display just data storage usage and allocations:
(login/compute)$ myquota -t data
Output
DATA ALLOCATIONS
Datapath                          Data GiB                No. Files
                                  Used |  Max | Use%      Used |     Max | Use%
/home/users/myuser                   1 |  100 |   1%     13216 |  100000 |  13%
/project/home/meluxinaproject      146 | 5000 |   2%    437033 | 5000000 |   8%
/project/scratch/meluxinaproject    29 |  100 |  29%     41160 |  100000 |  41%
You can further configure the output of the myquota tool using multiple options (a combined example is shown after the list):
- --start / --end: set the start and end dates of the usage period in ISO format (YYYY-MM-DD). Can only be used by project coordinators.
- --coord: output the usage of all participants of projects you are a coordinator of.
- --time: specify the output to be in node-hours, node-minutes or node-seconds.
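As an illustration, a project coordinator could combine these options as follows (a sketch only: the keyword expected by --time, here minutes, is an assumption, so check the tool's help output for the accepted values):
(login/compute)$ myquota -t compute --coord --time minutes --start 2022-01-01 --end 2022-01-31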
Warning
Note that you can only run myquota once every 10 seconds. Do not run the tool repeatedly, e.g. using the watch command or similar.
Resource monitoring with SLURM & Lustre native commands
sreport
Compute time usage can also be checked with native Slurm commands:
(login/compute)$ sreport -t hours cluster AccountUtilizationByUser Tree start=2021-06-01 end=2022-01-01
Output
--------------------------------------------------------------------------------
Cluster/Account/User Utilization 2021-06-01T00:00:00 - 2021-12-31T23:59:59 (18493200 secs)
Usage reported in CPU Hours
--------------------------------------------------------------------------------
  Cluster Account              Login     Proper Name            Used   Energy
--------- -------------------- --------- --------------- ----------- --------
 meluxina p200000                                             688392        0
 meluxina p200000              u100000   MeluXina user            137        0
 meluxina p200000              u100000   MeluXina user          67121        0
 meluxina p200000              u100000   MeluXina user         621134        0
 meluxina p20000X                                             367605        0
 meluxina p20000X              u100000   MeluXina user         115266        0
 meluxina p20000X              u100000   MeluXina user         142792        0
 meluxina p20000X              u100000   MeluXina user          32063        0
 meluxina p20000X              u100000   MeluXina user          32153        0
 meluxina p20000X              u100000   MeluXina user          41810        0
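The report can also be restricted to a given project account and user with the standard sreport filters (shown here with the anonymized account and user names from the output above):
(login/compute)$ sreport -t hours cluster AccountUtilizationByUser Tree accounts=p200000 users=u100000 start=2021-06-01 end=2022-01-01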
scontrol
Non-coordinators can use scontrol to check the total compute time usage for projects they are a member of. The returned value is the quota, followed by the consumption in brackets.
(login/compute)$ scontrol show assoc_mgr account=your-project-name flags=assoc | head -n 15 | grep GrpTRESMins | tr ',' '\n'| grep gres/.*n
Output
gres/cpun=82391(10991)
gres/fpgan=0(0)
gres/gpun=11424(5437)
gres/memn=0(0)
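Since GrpTRESMins values are expressed in minutes by Slurm, the same pipeline can be extended with a small awk step to convert them into node-hours (a convenience sketch based on the command and output above):
(login/compute)$ scontrol show assoc_mgr account=your-project-name flags=assoc | head -n 15 | grep GrpTRESMins | tr ',' '\n' | grep gres/.*n | awk -F'[=()]' '{ printf "%s: %.1f of %.1f node-hours used\n", $1, $3/60, $2/60 }'
Output
gres/cpun: 183.2 of 1373.2 node-hours used
gres/fpgan: 0.0 of 0.0 node-hours used
gres/gpun: 90.6 of 190.4 node-hours used
gres/memn: 0.0 of 0.0 node-hours used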
lfs
Data storage usage can also be checked with native Lustre commands, which requires two steps. First, retrieve the Lustre filesystem project ID for your (home) directory:
(login/compute)$ lfs project -d /home/users/$USER
Output
10011 P /home/users/myuser
The command above prints the Lustre-specific project ID of the directory (not to be confused with your MeluXina project number). In the second step, use this ID to retrieve the quotas set for the directory (replace $PROJECTID with the ID returned in the first step, 10011 in this example):
(login/compute)$ lfs quota -h -p $PROJECTID /home/users/$USER
Output
Disk quotas for prj 10011 (pid 10011):
Filesystem used quota limit grace files quota limit grace
/home/users/myuser
299M 0k 100G - 15364 0 100000 -
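The two steps can also be chained into a single command by extracting the Lustre project ID inline (a convenience sketch; it assumes the ID is the first field printed by lfs project -d, as in the output above):
(login/compute)$ lfs quota -h -p $(lfs project -d /home/users/$USER | awk '{print $1}') /home/users/$USER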