- Checking memory usage after the job has finished
 
$> sacct -o jobid,reqnodes,reqcpus,reqmem,maxrss,averss,elapsed -j JOBID
#reqmem : RAM requested via sbatch
#maxrss : maximum RAM used
#averss : average RAM used
- In the following example, the job used about 101 GiB of RAM (MaxRSS = 105823148K).
 
$> sacct -o jobid,reqnodes,reqcpus,reqmem,maxrss,averss,elapsed -j 94079
       JobID ReqNodes  ReqCPUS     ReqMem     MaxRSS     AveRSS    Elapsed
------------ -------- -------- ---------- ---------- ---------- ----------
94079               1        1      125Gn                         00:10:20
94079.batch         1        1      125Gn 105823148K 105823148K   00:10:27
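- The values printed by sacct can be converted to GiB with a few lines of Python. This is a minimal sketch: the helper name sacct_mem_to_gib is hypothetical, and it assumes that the K/M/G/T suffixes are powers of 1024 and that a trailing n or c (as in 125Gn) only marks a per-node or per-CPU request.

```python
# Multipliers expressed in KiB; assumes sacct suffixes are powers of 1024.
UNITS_IN_KIB = {"K": 1, "M": 1024, "G": 1024**2, "T": 1024**3}


def sacct_mem_to_gib(value):
    """Convert a sacct memory string such as '105823148K' or '125Gn' to GiB."""
    value = value.strip()
    if value[-1] in "nc":  # per-node / per-CPU marker on ReqMem
        value = value[:-1]
    number, unit = value[:-1], value[-1].upper()
    if unit not in UNITS_IN_KIB:
        raise ValueError(f"unknown memory unit in {value!r}")
    kib = float(number) * UNITS_IN_KIB[unit]
    return kib / 1024**2


print(round(sacct_mem_to_gib("105823148K"), 1))  # 100.9
print(sacct_mem_to_gib("125Gn"))                 # 125.0
```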
- However, this measurement is not always reliable.
 
- Here is a Python program that we ran as a job. It uses about 3 GB of RAM.
 
- This program runs for 1 minute (with sleep(60)).
 
import psutil
import time
import numpy as np
arr = np.ones((1024, 1024, 1024, 3), dtype=np.uint8)  # 3 * 1024**3 bytes
print(psutil.Process().memory_info().rss / (1024*1024))  # resident memory in MiB
time.sleep(60)  # keep the job alive for 1 minute
- In this case, the figures reported by sacct are correct.
 
$> sacct -o jobid,reqnodes,reqcpus,reqmem,maxrss,averss,elapsed -j 703201
       JobID ReqNodes  ReqCPUS     ReqMem     MaxRSS     AveRSS    Elapsed 
------------ -------- -------- ---------- ---------- ---------- ---------- 
703201              1        1       60Gn                         00:01:06 
703201.batch        1        1       60Gn   3172208K   3172208K   00:01:06 
- If the job is too short, the scheduler does not report the memory actually used correctly.
 
- The program is almost exactly the same. It still uses about 3 GB of RAM but runs for only 10 seconds (with sleep(10)).
 
import psutil
import time
import numpy as np
arr = np.ones((1024, 1024, 1024, 3), dtype=np.uint8)  # 3 * 1024**3 bytes
print(psutil.Process().memory_info().rss / (1024*1024))  # resident memory in MiB
time.sleep(10)  # the job now lasts only about 10 seconds
- This time, however, the figures reported by sacct are wrong.
 
$> sacct -o jobid,reqnodes,reqcpus,reqmem,maxrss,averss,elapsed -j 703202
       JobID ReqNodes  ReqCPUS     ReqMem     MaxRSS     AveRSS    Elapsed 
------------ -------- -------- ---------- ---------- ---------- ---------- 
703202              1        1       60Gn                         00:00:11 
703202.batch        1        1       60Gn      1484K      1484K   00:00:11 
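- When a job is too short for sacct to be trusted, one option is to have the program report its own peak memory just before it exits. The sketch below uses the standard resource module (Unix only). The helper name peak_rss_mib is hypothetical, and the unit handling assumes that ru_maxrss is expressed in KiB on Linux and in bytes on macOS.

```python
import resource
import sys


def peak_rss_mib():
    """Return the peak resident set size of the current process, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Assumption: ru_maxrss is in KiB on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


# Stand-in for the real workload: allocate and fill about 50 MiB.
data = b"x" * (50 * 1024 * 1024)

# Print the peak at the end of the job so that it lands in the job's output file.
print(f"peak RSS: {peak_rss_mib():.0f} MiB")
```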
Python
- To monitor the memory used by a process in Python, we can use the following code:
 
import psutil
# ...
psutil.Process().memory_info().rss / (1024*1024)
- rss is given in bytes, so dividing by 1024*1024 gives a value in MiB (divide by 1024*1024*1024 to get GiB).
 
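- Here is a piece of Python code that samples the memory used by a process proc (assumed here to be a subprocess.Popen object) for as long as it runs.
 
while proc.poll() is None:
    rss = psutil.Process(proc.pid).memory_info().rss  # resident memory in bytes
    try:
        proc.wait(timeout)  # wait up to `timeout` seconds for the process to end
    except subprocess.TimeoutExpired:
        pass  # still running: loop and measure again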
- The timeout parameter is the interval, in seconds, between two measurements.
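- The sampling loop can be wrapped into a small helper that launches a command and keeps the highest value seen. This is a sketch under assumptions, not a reference implementation: the names psutil_rss and monitor_peak_rss are hypothetical, the process is started with subprocess.Popen, and the reader function is passed as a parameter so that the loop can be tried out without psutil installed. Memory peaks that occur between two samples are missed.

```python
import subprocess


def psutil_rss(pid):
    """Return the resident set size of `pid` in bytes, or 0 if it is gone."""
    import psutil  # third-party package, imported only when actually used
    try:
        return psutil.Process(pid).memory_info().rss
    except psutil.Error:  # the process ended between poll() and the measurement
        return 0


def monitor_peak_rss(cmd, interval=1.0, read_rss=psutil_rss):
    """Run `cmd` and sample its RSS every `interval` seconds.

    Returns (highest sampled RSS in bytes, exit code of the command).
    """
    proc = subprocess.Popen(cmd)
    peak = 0
    while proc.poll() is None:
        peak = max(peak, read_rss(proc.pid))
        try:
            proc.wait(timeout=interval)
        except subprocess.TimeoutExpired:
            pass  # still running: take another sample
    return peak, proc.returncode
```

- For example, monitor_peak_rss(["python", "my_script.py"], interval=0.5) would sample a hypothetical script twice per second; dividing the returned peak by 1024*1024 gives MiB.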
 
- Documentation