Wang Haiqing's Cloud Notes

Setting Up the Slurm 21.08.8 Job Scheduling System on CentOS 7 (MPW)


       Slurm is an open-source workload scheduler for Linux and Unix, used by many of the world's supercomputers. Its main functions are:

1. Allocating compute-node resources to users so they can execute work;

2. Providing a framework for starting, executing, and monitoring work (typically parallel jobs) on the allocated set of nodes;

3. Arbitrating resource contention by managing a queue of pending jobs.


Slurm Architecture


Environment Setup

 Role            IP             Hostname  OS          Spec
 Control node    172.18.10.10   master    CentOS 7.9  4 cores / 8 GB
 Compute node 1  172.18.10.20   mpw-1     CentOS 7.9  16 cores / 26 GB
 Compute node 2  172.18.10.49   mpw-2     CentOS 7.9  16 cores / 16 GB
 Compute node 3  172.18.10.51   mpw-3     CentOS 7.9  16 cores / 16 GB
 Compute node 4  172.18.10.93   mpw-4     CentOS 7.9  16 cores / 16 GB
 Compute node 5  172.18.10.98   mpw-5     CentOS 7.9  16 cores / 16 GB


1. Base Environment (run on all machines unless otherwise noted)

Disable the firewall and SELinux

systemctl stop firewalld
systemctl disable firewalld
sed -i -e  's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
setenforce 0


Switch the yum repositories

rm -rf /etc/yum.repos.d/*
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
curl -o /etc/yum.repos.d/epel.repo https://mirrors.aliyun.com/repo/epel-7.repo

yum clean all
yum makecache fast -y


Internal (company) CentOS 7 mirror

rm -rf /etc/yum.repos.d/*
cat >  /etc/yum.repos.d/centos7.repo << EOF
[base]
name=base
baseurl=http://172.18.0.61/centos7/base
enabled=1
gpgcheck=0

[extras]
name=extras
baseurl=http://172.18.0.61/centos7/extras
enabled=1
gpgcheck=0

[updates]
name=updates
baseurl=http://172.18.0.61/centos7/updates
enabled=1
gpgcheck=0

[epel]
name=epel
baseurl=http://172.18.0.61/centos7/epel
enabled=1
gpgcheck=0
EOF

yum clean all
yum makecache fast -y


Set hostnames; each must be unique (run the matching command on its own node)

hostnamectl set-hostname master
hostnamectl set-hostname mpw-1
hostnamectl set-hostname mpw-2
hostnamectl set-hostname mpw-3
hostnamectl set-hostname mpw-4
hostnamectl set-hostname mpw-5


Configure /etc/hosts

cat >>  /etc/hosts << EOF
172.18.10.10 master
172.18.10.20 mpw-1
172.18.10.49 mpw-2
172.18.10.51 mpw-3
172.18.10.93 mpw-4
172.18.10.98 mpw-5
EOF


Install packages

yum -y install net-tools wget vim ntpdate chrony htop glances nfs-utils rpcbind python3


Time synchronization with ntpdate

ntpdate time1.aliyun.com
echo "*/10 * * * * /usr/sbin/ntpdate time1.aliyun.com " >> /var/spool/cron/root
timedatectl set-timezone Asia/Shanghai
hwclock --systohc


Configure passwordless SSH

# Run on the control node
echo y| ssh-keygen -t rsa -P '' -f  ~/.ssh/id_rsa

ssh-copy-id -i ~/.ssh/id_rsa.pub  -o  StrictHostKeyChecking=no root@master
ssh-copy-id -i ~/.ssh/id_rsa.pub  -o  StrictHostKeyChecking=no root@mpw-1
ssh-copy-id -i ~/.ssh/id_rsa.pub  -o  StrictHostKeyChecking=no root@mpw-2
ssh-copy-id -i ~/.ssh/id_rsa.pub  -o  StrictHostKeyChecking=no root@mpw-3
ssh-copy-id -i ~/.ssh/id_rsa.pub  -o  StrictHostKeyChecking=no root@mpw-4
ssh-copy-id -i ~/.ssh/id_rsa.pub  -o  StrictHostKeyChecking=no root@mpw-5
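The six ssh-copy-id invocations above can be collapsed into one loop. A minimal dry-run sketch (it only prints the commands; remove the leading echo to execute them; the node list is the one from the environment table):

```shell
#!/bin/bash
# Node list from the environment table above; adjust to your cluster.
NODES="master mpw-1 mpw-2 mpw-3 mpw-4 mpw-5"

for host in $NODES; do
    # Dry run: remove "echo" to actually copy the key.
    echo ssh-copy-id -i ~/.ssh/id_rsa.pub -o StrictHostKeyChecking=no "root@$host"
done
```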


2. Configure Munge (run on all machines unless otherwise noted)


Create the munge user

The munge user must have the same UID and GID on the control node and every compute node, and Munge must be installed on all nodes.

groupadd -g 1108 munge
useradd -m -c "Munge Uid 'N' Gid Emporium" -d /var/lib/munge -u 1108 -g munge -s /sbin/nologin munge


Seed the entropy pool

# Install
yum install -y rng-tools

# Use /dev/urandom as the entropy source
rngd -r /dev/urandom
 
sed -i 's#^ExecStart.*#ExecStart=/sbin/rngd -f -r /dev/urandom#g'  /usr/lib/systemd/system/rngd.service

systemctl daemon-reload
systemctl start rngd
systemctl enable rngd
systemctl status rngd


Install Munge. Munge is an authentication service that validates the UID and GID of processes on local or remote hosts.

yum install munge munge-libs munge-devel -y


Create the global key: generate the cluster-wide key on the control node

# Run on the control node. Either command below generates a key;
# the dd line simply overwrites the first key with 1024 random bytes.
/usr/sbin/create-munge-key -r
dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
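As a quick check, the key written by dd should be exactly 1024 bytes. A small helper sketch (GNU coreutils stat; the path is the one used above):

```shell
#!/bin/bash
# Print the size in bytes of a file (GNU coreutils stat).
key_size() { stat -c %s "$1"; }

# Usage on the control node:
# [ "$(key_size /etc/munge/munge.key)" -eq 1024 ] && echo "munge.key size OK"
```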


Sync the key to all compute nodes

# Run on the control node
scp -p /etc/munge/munge.key root@master:/etc/munge
scp -p /etc/munge/munge.key root@mpw-1:/etc/munge
scp -p /etc/munge/munge.key root@mpw-2:/etc/munge
scp -p /etc/munge/munge.key root@mpw-3:/etc/munge
scp -p /etc/munge/munge.key root@mpw-4:/etc/munge
scp -p /etc/munge/munge.key root@mpw-5:/etc/munge

# Run on every compute node
chown munge: /etc/munge/munge.key
chmod 400 /etc/munge/munge.key


Start Munge on all nodes

systemctl restart munge
systemctl enable munge
systemctl status munge


Test the Munge service: verify the connection between each compute node and the control node

# Generate and view a credential locally
munge -n

# Decode locally
munge -n | unmunge

# Verify a compute node: decode remotely against the control node
munge -n | ssh master unmunge

# Benchmark Munge credential throughput
remunge
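To run the remote check against every node at once, a dry-run loop sketch (it prints one verification command per node; paste or eval the lines to run them for real):

```shell
#!/bin/bash
# Print the per-node verification commands; node names are from the table above.
NODES="master mpw-1 mpw-2 mpw-3 mpw-4 mpw-5"

for host in $NODES; do
    echo "munge -n | ssh $host unmunge"
done
```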


3. Configure Slurm (run on all machines unless otherwise noted)


Create the slurm user

groupadd -g 1109 slurm
useradd -m -c "Slurm manager" -d /var/lib/slurm -u 1109 -g slurm -s /bin/bash slurm


Install Slurm dependencies

yum install gcc gcc-c++ readline-devel perl-ExtUtils-MakeMaker pam-devel rpm-build mysql-devel http-parser-devel json-c-devel libjwt  libjwt-devel -y


Build and install Slurm

# Download page: https://download.schedmd.com/slurm/

wget https://download.schedmd.com/slurm/slurm-21.08.8-2.tar.bz2

rpmbuild -ta --with mysql --with slurmrestd --with jwt slurm-21.08.8-2.tar.bz2
cd /root/rpmbuild/RPMS/x86_64/
yum localinstall -y slurm-*

The --with slurmrestd option enables the RESTful API (slurmrestd).

 

Configure Slurm on the control node

# Run on the control node
/bin/cp -f /etc/slurm/cgroup.conf.example /etc/slurm/cgroup.conf

cat >  /etc/slurm/slurm.conf << EOF

ClusterName=cluster
# SlurmctldHost=master

ControlMachine=master
ControlAddr=172.18.10.10
 
#
SlurmctldDebug=info
SlurmdDebug=debug3
GresTypes=gpu
 
MpiDefault=none
ProctrackType=proctrack/cgroup
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurm
SlurmUser=slurm
StateSaveLocation=/var/spool/slurm/ctld
SwitchType=switch/none
TaskPlugin=task/affinity,task/cgroup
# Fix Mentioned Error
# TaskPluginParam=Sched
TaskPluginParam=verbose
 
# TIMERS
#InactiveLimit=0
#KillWait=30
#ResumeTimeout=600
MinJobAge=172800
#OverTimeLimit=0
#SlurmctldTimeout=12
#SlurmdTimeout=300
#Waittime=0

# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_tres
SelectTypeParameters=CR_Core

# LOGGING AND ACCOUNTING
AccountingStorageEnforce=associations
AccountingStorageHost=master
AccountingStoragePort=6819
AccountingStorageType=accounting_storage/slurmdbd

# Fix Mentioned Error
# AccountingStoreJobComment=YES
AccountingStoreFlags=job_comment


#JobCompHost=localhost
#JobCompPass=123456
#JobCompPort=3306
#JobCompType=jobcomp/mysql
#JobCompUser=root
#JobAcctGatherFrequency=1
#JobAcctGatherType=jobacct_gather/linux
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdLogFile=/var/log/slurm/slurmd.log

AuthAltTypes=auth/jwt
AuthAltParameters=jwt_key=/var/spool/slurm/ctld/jwt_hs256.key

#
# PARTITION mpw
PartitionName=mpw Nodes=mpw-[1-50] Default=YES Shared=YES OverSubscribe=YES

#   DUMMY
NodeName=mpw-[6-50] CPUs=16 RealMemory=16000 State=FUTURE

#   NODES
NodeName=mpw-1 CPUs=16 RealMemory=16000 Weight=1 State=CLOUD
NodeName=mpw-2 CPUs=16 RealMemory=16000 Weight=1 State=CLOUD
NodeName=mpw-3 CPUs=16 RealMemory=16000 Weight=1 State=CLOUD
NodeName=mpw-4 CPUs=16 RealMemory=16000 Weight=1 State=CLOUD
NodeName=mpw-5 CPUs=16 RealMemory=16000 Weight=1 State=CLOUD

EOF
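Before distributing the file, a quick sanity check that the essential keys are present can save a restart cycle. A minimal sketch (the key list below is an illustrative subset, not slurmctld's full set of requirements):

```shell
#!/bin/bash
# Check that a slurm.conf defines a few essential keys;
# returns nonzero and prints the missing keys otherwise.
check_slurm_conf() {
    local conf=$1 key missing=0
    for key in ClusterName ControlMachine SlurmUser StateSaveLocation PartitionName; do
        grep -q "^${key}=" "$conf" || { echo "missing: $key"; missing=1; }
    done
    return "$missing"
}

# Usage on the control node:
# check_slurm_conf /etc/slurm/slurm.conf && echo "slurm.conf looks sane"
```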


Copy the control node's configuration files to the compute nodes

# Run on the control node
scp /etc/slurm/*.conf mpw-1:/etc/slurm/
scp /etc/slurm/*.conf mpw-2:/etc/slurm/
scp /etc/slurm/*.conf mpw-3:/etc/slurm/
scp /etc/slurm/*.conf mpw-4:/etc/slurm/
scp /etc/slurm/*.conf mpw-5:/etc/slurm/


Create directories and set permissions on the control and compute nodes

mkdir -p /var/spool/slurm
chown slurm: /var/spool/slurm
mkdir -p /var/log/slurm
chown slurm: /var/log/slurm


Configure Slurm accounting on the control node. Accounting records collect information about jobs and job steps. They can be written to a plain text file or to a database, but the text file grows without bound, so the simplest option is to store the records in MySQL.


Install MySQL 5.7 on CentOS 7 with yum (with a custom data directory)


Create the Slurm database user

# mysql5.7
grant all on slurm_acct_db.* to 'slurm'@'%' identified by 'Slurm*1234' with grant option;

# mysql8.0
CREATE USER 'slurm'@'%' identified with mysql_native_password  by 'Slurm*1234';
GRANT ALL ON slurm_acct_db.* TO 'slurm'@'%';
flush privileges;


Configure slurmdbd.conf

# Run on the control node
cp /etc/slurm/slurmdbd.conf.example /etc/slurm/slurmdbd.conf
 
cat >  /etc/slurm/slurmdbd.conf << 'EOF'

AuthType=auth/munge
AuthInfo=/var/run/munge/munge.socket.2
DbdAddr=172.18.10.10
DbdHost=master
SlurmUser=slurm
DebugLevel=verbose
LogFile=/var/log/slurm/slurmdbd.log
PidFile=/var/run/slurmdbd.pid
StorageType=accounting_storage/mysql
StorageHost=172.18.0.191
StorageUser=slurm
StoragePass=Slurm*1234
# Database name; slurmdbd creates it automatically.
# It must match the database granted to the slurm user above.
StorageLoc=slurm_acct_db
StoragePort=3306
AuthAltTypes=auth/jwt
AuthAltParameters=jwt_key=/var/spool/slurm/ctld/jwt_hs256.key

EOF


Set permissions

# Run on the control node
chown slurm: /etc/slurm/slurmdbd.conf
chown slurm: /etc/slurm/slurm.conf


Add the JWT key on the controller (in the StateSaveLocation directory)

mkdir -p /var/spool/slurm/ctld

dd if=/dev/random of=/var/spool/slurm/ctld/jwt_hs256.key bs=32 count=1
chown slurm:slurm /var/spool/slurm/ctld/jwt_hs256.key
chmod 0600 /var/spool/slurm/ctld/jwt_hs256.key
# chown root:root /etc/slurm
chmod 0755 /var/spool/slurm/ctld
chown slurm:slurm /var/spool/slurm/ctld
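The JWT key should end up readable only by the slurm user. A small helper sketch (GNU stat) to verify the mode after the chmod above:

```shell
#!/bin/bash
# Succeed only if the file mode is exactly 0600.
jwt_key_mode_ok() {
    local mode
    mode=$(stat -c %a "$1") || return 1
    [ "$mode" = "600" ]
}

# Usage on the control node:
# jwt_key_mode_ok /var/spool/slurm/ctld/jwt_hs256.key && echo "jwt key mode OK"
```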

 

Start the services

# Start slurmdbd on the control node
systemctl restart slurmdbd
systemctl enable slurmdbd
systemctl status slurmdbd
 
# Start slurmctld on the control node
systemctl restart slurmctld
systemctl enable slurmctld
systemctl status slurmctld
 
# Start slurmd on the compute nodes
systemctl restart slurmd
systemctl enable slurmd
systemctl status slurmd

# If a service fails to start, run it in the foreground for verbose diagnostics
slurmdbd -Dvvv
slurmctld -Dvvv
slurmd -Dvvv


4. Verify the Slurm Cluster

Create a test user

useradd zkxy
echo 123456 | passwd --stdin zkxy


Check the Slurm cluster

# These commands can run on both the control node and the compute nodes

# View the cluster
sinfo
scontrol show partition
scontrol show node

# Submit a job
srun -N2 hostname
scontrol show jobs

# View the queue
squeue -a



Run the Slurm REST API (must not run as root or as the SlurmUser)

cat > /etc/slurm/slurmrestd.conf << 'EOF'
include /etc/slurm/slurm.conf
AuthType=auth/jwt
EOF

chown slurm:slurm /etc/slurm/slurmrestd.conf

su - zkxy
slurmrestd -f /etc/slurm/slurmrestd.conf 0.0.0.0:6688 -a jwt -s openapi/v0.0.36 
slurmrestd -f /etc/slurm/slurmrestd.conf -a rest_auth/jwt -s openapi/v0.0.36 -vvv 0.0.0.0:6688


Create a systemd service

cat > /usr/lib/systemd/system/slurmrestd.service <<EOF
[Unit]
Description=slurmrestd service
After=network.service

[Service]
Type=simple  

User=zkxy
Group=zkxy
WorkingDirectory=/usr/sbin
ExecStart=/usr/sbin/slurmrestd -f /etc/slurm/slurmrestd.conf -a rest_auth/jwt -s openapi/v0.0.36 -vvv 0.0.0.0:6688

Restart=always
 
ProtectSystem=full
PrivateDevices=yes
PrivateTmp=yes
NoNewPrivileges=true

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl stop slurmrestd
systemctl restart slurmrestd
systemctl enable slurmrestd
systemctl status slurmrestd


Obtain a token (default lifespan=1800 seconds; maximum 99999999999)

scontrol token lifespan=999999999 username=zkxy
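scontrol prints the token as a `SLURM_JWT=...` line, which can be captured straight into the environment for later API calls. A sketch (the token string below is a stand-in; replace the assignment with the real scontrol call on the control node):

```shell
#!/bin/bash
# Stand-in for: token_line=$(scontrol token lifespan=3600 username=zkxy)
token_line="SLURM_JWT=eyJhbGciOi.example.token"

# Strip the prefix and export for later calls.
export SLURM_JWT=${token_line#SLURM_JWT=}

# Then call the API (host/port from this guide):
# curl -s -H "X-SLURM-USER-NAME: zkxy" -H "X-SLURM-USER-TOKEN: $SLURM_JWT" \
#      http://master:6688/slurm/v0.0.36/ping
echo "$SLURM_JWT"
```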


If a node is in the DOWN state with Reason=Not responding, and restarting the services does not help, try the following command:

scontrol update NodeName=node01 State=RESUME


Test the API from Python

import requests

# Token obtained from: scontrol token username=zkxy
url = 'http://172.18.0.220:6688/slurm/v0.0.36/ping'

headers = {
    'X-SLURM-USER-NAME': 'zkxy',
    'X-SLURM-USER-TOKEN': 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE2MzQxOTM4NzMsImlhdCI6MTYzNDE5MjA3Mywic3VuIjoid2hxIn0.82HpB4ss96Iw7o9JAzDp8WGRfFWDOCbPzx-J3Y5nK_U',
}

response = requests.get(url, headers=headers)
print(response.text)


The error below means the X-SLURM-USER-TOKEN is invalid:

{
   "meta": {
     "plugin": {
       "type": "openapi\/v0.0.36",
       "name": "REST v0.0.36"
     },
     "Slurm": {
       "version": {
         "major": 21,
         "micro": 7,
         "minor": 8
       },
       "release": "21.08.7"
     }
   },
   "errors": [
     {
       "error": "_op_handler_ping: slurmctld config is unable to load: Unknown error 5005",
       "errno": 5005
     },
     {
       "error": "_op_handler_ping: slurmctld config is missing"
     }
   ]
 }


Slurm REST API reference

https://app.swaggerhub.com/apis/rherrick/slurm-rest_api/0.0.35#/


https://slurm.schedmd.com/SLUG20/REST_API.pdf
https://slurm.schedmd.com/SLUG19/REST_API.pdf




Remote access to the REST interface is occasionally slow, while access from inside the cluster is fast. Forwarding port 6688 through nginx on port 16688 works around this and speeds up remote API calls.

yum install nginx -y

cat > /etc/nginx/conf.d/slurm.conf << 'EOF'
upstream backend {
    server 127.0.0.1:6688; 
}

server {
   listen 16688;  
   server_name localhost;
    location / {
        proxy_pass http://backend; 
    }
} 
EOF

systemctl restart nginx
systemctl enable nginx
systemctl status nginx


References

https://www.cnblogs.com/liu-shaobo/p/13285839.html
https://blog.csdn.net/kongxx/article/details/52550653
https://www.jianshu.com/p/c7cf800656dc
https://www.jianshu.com/p/e560b19dbd3e


JWT references

https://slurm.schedmd.com/jwt.html
https://elwe.rhrk.uni-kl.de/documentation/jwt.html


Slurm user manual (Chinese)

https://docs.slurm.cn/users/




Run a test job

cat > /opt/test.sh << 'EOF'
#!/bin/bash
hostname
EOF

sbatch /opt/test.sh
sbatch --account=root --cpus-per-task=1 --nodes=1 /opt/test.sh
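A slightly fuller batch script, with #SBATCH directives matching the mpw partition defined earlier. Since #SBATCH lines are plain comments to bash, the script can also be executed directly as a local smoke test (the path and option values are illustrative):

```shell
#!/bin/bash
# Write a batch script; #SBATCH lines are read by sbatch but ignored by bash.
cat > /tmp/test-job.sh << 'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --partition=mpw
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --output=/tmp/hello-%j.out
echo "running on $(hostname)"
EOF
chmod +x /tmp/test-job.sh

# Submit with: sbatch /tmp/test-job.sh
# Local check: /tmp/test-job.sh
```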




Last updated: 2022-05-30 15:17:26