{{tag>Brouillon Cluster Grid Ressource}}
= Notes on the cluster / grid batch scheduler Slurm
== Slurm
Links:
* http://cascisdi.inra.fr/sites/cascisdi.inra.fr/files/slurm_0.txt
* https://wiki.fysik.dtu.dk/niflheim/SLURM
* https://www.glue.umd.edu/hpcc/help/slurm-vs-moab.html
* https://www.crc.rice.edu/wp-content/uploads/2014/08/Torque-to-SLURM-cheatsheet.pdf
* http://slurm.schedmd.com/rosetta.pdf
* http://www.accre.vanderbilt.edu/wp-content/uploads/2012/04/Slurm.pdf
* https://github.com/accre/SLURM
* http://slurm.schedmd.com/quickstart.html
* http://slurm.schedmd.com/slurm_ug_2011/Basic_Configuration_Usage.pdf
* https://www.unila.edu.br/sites/default/files/files/user_guide_slurm.pdf
* https://computing.llnl.gov/tutorials/slurm/slurm.pdf
* https://computing.llnl.gov/tutorials/bgq/
* https://computing.llnl.gov/linux/slurm/quickstart.html
* https://computing.llnl.gov/linux/slurm/faq.html
* https://rc.fas.harvard.edu/resources/running-jobs/
* http://bap-alap.blogspot.fr/2012_09_01_archive.html
* https://fortylines.com/blog/startingWithSLURM.blog.html
* https://github.com/ciemat-tic/codec/wiki/Slurm-cluster
* http://manx.classiccmp.org/mirror/techpubs.sgi.com/library/manuals/5000/007-5814-001/pdf/007-5814-001.pdf
* http://wildflower.diablonet.net/~scaron/slurmsetup.html
* http://wiki.sc3.uis.edu.co/index.php/Slurm_Installation
* http://eniac.cyi.ac.cy/display/UserDoc/Copy+of+Slurm+notes
* http://www.ibm.com/developerworks/library/l-slurm-utility/index.html
* https://www.lrz.de/services/compute/linux-cluster/batch_parallel/
* http://www.gmpcs.lumat.u-psud.fr/spip.php?rubrique35
* https://services-numeriques.unistra.fr/hpc/applications-disponibles/systeme-de-files-dattente-slurm.html
* http://www.brightcomputing.com/Blog/bid/174099/Slurm-101-Basic-Slurm-Usage-for-Linux-Clusters
* https://dashboard.hpc.unimelb.edu.au/started/
API:
* http://slurm.schedmd.com/slurm_ug_2012/pyslurm.pdf
See also:
* https://aws.amazon.com/fr/batch/use-cases
== To do
MPI with Slurm (a batch-script sketch follows this list of links):
* http://slurm.schedmd.com/mpi_guide.html
* openmpi
* https://www.hpc2n.umu.se/batchsystem/slurm_info
* hwloc-nox (Portable Linux Processor Affinity (PLPA))
* https://computing.llnl.gov/linux/slurm/mpi_guide.html
* https://computing.llnl.gov/tutorials/openMP/ProcessThreadAffinity.pdf
* https://www.open-mpi.org/faq/?category=slurm
* http://stackoverflow.com/questions/31848608/slurms-srun-slower-than-mpirun
* https://www.rc.colorado.edu/support/examples-and-tutorials/parallel-mpi-jobs.html
* http://www.brightcomputing.com/Blog/bid/149455/How-to-run-an-OpenMPI-job-in-Bright-Cluster-Manager-through-Slurm
* http://www.hpc2n.umu.se/node/875
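A minimal sketch of an MPI batch script, assuming OpenMPI built with Slurm support, the ''debug'' partition from the slurm.conf below, and a hypothetical ''mpi_hello'' binary:
#!/bin/bash
#SBATCH --job-name=mpi-test
#SBATCH --partition=debug
#SBATCH --ntasks=2
#SBATCH --output=mpi-test-%j.out
# with MpiDefault=openmpi in slurm.conf, srun launches the MPI ranks directly
srun ./mpi_hello
Submit it with ''sbatch mpi-test.sh''.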
== Install
Since Slurm uses **munge** by default to tie the machines' accounts together, **all machines must have their clocks synchronized**.
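One way to keep the clocks in sync on Debian (assuming the ''ntp'' and ''ntpdate'' packages; any NTP client will do):
apt-get install -y ntp ntpdate
# one-shot resync if a clock has already drifted (see also the Problems section below)
ntpdate -u pool.ntp.org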
Manager:
apt-get install slurm-wlm
Nodes:
apt-get install -y slurmd slurm-wlm-basic-plugins
Manager and nodes:
systemctl enable munge.service
zcat /usr/share/doc/slurm-client/examples/slurm.conf.simple.gz > /etc/slurm-llnl/slurm.conf
Adapt slurm.conf; it can be generated from:
* /usr/share/doc/slurmctld/slurm-wlm-configurator.easy.html
* /usr/share/doc/slurmctld/slurm-wlm-configurator.html
* https://computing.llnl.gov/linux/slurm/configurator.html
Copy the same configuration file to the nodes (the manager and the nodes must have an identical file):
scp -3 vmdeb1:/etc/slurm-llnl/slurm.conf vmdeb2:/etc/slurm-llnl/slurm.conf
scp -3 vmdeb1:/etc/munge/munge.key vmdeb2:/etc/munge/munge.key
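Quick check that the shared munge key (and the clocks) are accepted from one host to another, assuming SSH access between the machines:
munge -n | ssh vmdeb2 unmunge
# the remote side must report STATUS: Success (0)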
List the daemons expected on the current host:
scontrol show daemons
On the master (ControlMachine): slurmctld slurmd \\
On the nodes: slurmd
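To start them and have them come back at boot, a sketch assuming the systemd units shipped by the Debian packages (''slurmctld.service'', ''slurmd.service''):
# on the manager (ControlMachine)
systemctl enable slurmctld.service
systemctl restart slurmctld.service
# on every compute node
systemctl enable slurmd.service
systemctl restart slurmd.service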
''/etc/slurm-llnl/slurm.conf''
# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ControlMachine=vmdeb1
#ControlAddr=127.0.0.1
#
#MailProg=/bin/mail
#MpiDefault=none
MpiDefault=openmpi
MpiParams=ports=12000-12999
#MpiParams=ports=#-#
#ProctrackType=proctrack/pgid
ProctrackType=proctrack/linuxproc
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
SlurmUser=slurm
#SlurmdUser=root
#UsePAM=1
DisableRootJobs=YES
EnforcePartLimits=YES
JobRequeue=0
ReturnToService=1
#TopologyPlugin=topology/tree
# Must be writable by user SlurmUser. The file must be accessible by the primary and backup control machines.
# On NFS share !? See http://manx.classiccmp.org/mirror/techpubs.sgi.com/library/manuals/5000/007-5814-001/pdf/007-5814-001.pdf
StateSaveLocation=/var/lib/slurm-llnl/slurmctld
SwitchType=switch/none
#TaskPlugin=task/none
#TaskPlugin=task/cgroup
TaskPlugin=task/affinity
#
#
# TIMERS
#KillWait=30
#MinJobAge=300
#SlurmctldTimeout=120
#SlurmdTimeout=300
Waittime=0
#
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
#SchedulerPort=7321
SelectType=select/linear
#
#
# LOGGING AND ACCOUNTING
ClusterName=cluster1
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
#SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
#SlurmdDebug=3
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
SlurmSchedLogFile=/var/log/slurm-llnl/slurmSched.log
#JobCompType=jobcomp/filetxt
#JobCompType=jobcomp/mysql
JobCompType=jobcomp/none
JobCompLoc=/var/log/slurm-llnl/jobcomp
#JobCheckpointDir=/var/lib/slurm-llnl/checkpoint
#AccountingStorageType=jobacct_gather/linux
#AccountingStorageType=accounting_storage/filetxt
AccountingStorageType=accounting_storage/slurmdbd
AccountingStoreJobComment=YES
DefaultStorageType=accounting_storage/slurmdbd
#AccountingStorageLoc=/var/log/slurm-llnl/accounting
AccountingStoragePort=6819
AccountingStorageEnforce=associations
#
#
NodeName=vmdeb1
# COMPUTE NODES
NodeName=DEFAULT
PartitionName=DEFAULT MaxTime=INFINITE State=UP
NodeName=vmdeb2 CPUs=1 RealMemory=494 State=UNKNOWN
NodeName=vmdeb3 CPUs=2 RealMemory=494 TmpDisk=8000 State=UNKNOWN
PartitionName=debug Nodes=vmdeb[2-3] Default=YES MaxTime=INFINITE Shared=YES State=UP
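Once the file has been pushed to every machine and the daemons are running, a quick check that the nodes are visible (''scontrol reconfigure'' reloads slurm.conf without restarting the daemons):
scontrol reconfigure
sinfo -Nl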
=== Installing slurmdbd
MySQL is recommended (not all features are available with PostgreSQL, unfortunately).
Here we assume that you already have a MySQL database with an account and the necessary privileges created.
apt-get install slurmdbd
zcat /usr/share/doc/slurmdbd/examples/slurmdbd.conf.simple.gz > /etc/slurm-llnl/slurmdbd.conf
Adapt the slurmdbd.conf file.
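A minimal sketch of what the adapted file might contain, assuming MySQL on localhost with a ''slurm_acct_db'' database and a ''slurm'' MySQL account (names and password are placeholders to adapt):
AuthType=auth/munge
DbdHost=vmdeb1
DbdPort=6819
SlurmUser=slurm
LogFile=/var/log/slurm-llnl/slurmdbd.log
PidFile=/var/run/slurm-llnl/slurmdbd.pid
StorageType=accounting_storage/mysql
StorageHost=localhost
StorageUser=slurm
StoragePass=CHANGEME
StorageLoc=slurm_acct_db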
Then:
service slurmdbd restart
Test it:
sacct
JobID JobName Partition Account AllocCPUS State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
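With AccountingStorageEnforce=associations (see slurm.conf above), jobs are rejected until the cluster, an account and the users exist in the accounting database; a sketch with illustrative account names:
sacctmgr add cluster cluster1
sacctmgr add account lab Description="test account" Organization=lab
sacctmgr add user test Account=lab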
== Problems
munge -n | ssh vmdeb1 unmunge
STATUS: Expired credential (15)
Solution:
ntpdate -u pool.ntp.org
sudo -u slurm -- /usr/sbin/slurmctld -Dcvvvv
/usr/sbin/slurmd -Dcvvvv
-c : Clear: clears the previous state, purges the jobs...
-D : Daemon: run in the foreground (do not detach); logs go to STDOUT
-v : Verbose: verbose mode; add more "v"s to be even more verbose
slurmd -C
Prints the configuration of the current host
Help
The **man** pages
and
command --help
command --usage
Variables:
SQUEUE_STATES=all makes the squeue command display jobs in any state (including jobs in COMPLETED and CANCELLED state).
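For example (equivalent to the ''squeue -t all'' shown further down):
export SQUEUE_STATES=all
squeue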
Commands:
sbatch
salloc
srun
sattach
srun -l --ntasks-per-core=1 --exclusive -n 2 hostname
sinfo --Node
scontrol show partition
scancel --user=test --state=pending
scontrol show config
scontrol show job
scancel -i --user=test
# The Slurm -d singleton argument tells Slurm not to dispatch this job until all previous jobs with the same name have completed.
sbatch -d singleton simple.sh
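# "simple.sh" is not shown in these notes; a minimal sketch of such a script (the #SBATCH values are only illustrative)
cat > simple.sh << 'EOF'
#!/bin/bash
#SBATCH --job-name=simple
#SBATCH --ntasks=1
#SBATCH --output=simple-%j.out
srun hostname
EOF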
scontrol ping
sinfo -R
# Also show finished jobs
squeue -t all
#A/I/O/T = "active(in use)/idle/other/total"
sinfo -l
#
sinfo -Nle -o '%n %C %t'
=== Tips
==== Run an **srun** command without waiting
Normally:
$ srun -N2 -l hostname
srun: job 219 queued and waiting for resources
Solution (as root or as the "SlurmUser" account):
# sinfo --noheader -o %N
vmdeb[2-3]
# srun -N2 --no-allocate -w vmdeb[2-3] hostname
-------
Cancel / terminate a job stuck in the "CG" (completing) state:
scontrol update nodename=node4-blender state=down reason=hung
scontrol update nodename=node4-blender state=idle
You will also have to kill the 'slurmstepd' process left over on the nodes.
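For example, on the node holding the stuck job (assuming no other job step worth keeping runs there):
pkill -9 slurmstepd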
Possible network flow problem: Node => Manager on TCP 6817.
Problem:
"srun: error: Application launch failed: User not found on host"
Solution:
The same user must have the same UID on the nodes and on the manager; apparently this is related to **munge**.
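A quick check, assuming a user named ''test'' and SSH access to a node:
# the two UIDs must match
id -u test
ssh vmdeb2 id -u test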
It may be worth using LDAP.