Slurm database
This page describes Slurm database configuration on RHEL 7 or 8 (and clones) servers.
Jump to our top-level Slurm page: Slurm batch queueing system
Database documentation
See the accounting page and the Slurm_tutorials with Slurm Database Usage. Please note this statement:
MySQL or MariaDB is the preferred database.
To enable this database support one only needs to have the development package for the database they wish to use on the system.
Slurm uses the InnoDB storage engine in MySQL to make rollback possible.
This must be available on your MySQL installation or rollback will not work.
The following configuration is relevant only for the Database node (which may be the Head/Master node), but not the compute nodes.
Hardware optimization
SchedMD recommends having a separate database server, if possible. It may run on the same server as slurmctld, but this may impact performance.
Furthermore, running the database on a separate server gives you a strong advantage: you can upgrade slurmdbd without affecting slurmctld, and thereby the cluster operations!
You should consider optimizing the database performance by mounting the MariaDB (MySQL) database directory on a dedicated high-speed file system:
/var/lib/mysql
Whether this is required depends on the number and frequency of jobs expected. A high-speed file system could be placed on a separate SSD SAS/SATA disk drive, or even better on a PCIe NVMe disk drive.
Such disks must be qualified for the high-volume random small read/write operations typical of databases; drives using the Non-Volatile Memory Express (NVMe) storage interface standard are preferred for reliability and performance.
A disk size of 200 GB or 400 GB should be sufficient. Consider installing 2 disk drives and run them in a RAID-1 mirrored configuration.
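A minimal sketch of such a setup, assuming a dedicated NVMe device /dev/nvme0n1 and an XFS file system (both are assumptions, adapt to your hardware), with MariaDB installed but stopped:
systemctl stop mariadb
mkfs.xfs /dev/nvme0n1                  # Create a file system on the dedicated drive (assumed device name)
mv /var/lib/mysql /var/lib/mysql.orig  # Preserve any existing database files
mkdir /var/lib/mysql
echo '/dev/nvme0n1 /var/lib/mysql xfs defaults,noatime 0 0' >> /etc/fstab
mount /var/lib/mysql
cp -a /var/lib/mysql.orig/. /var/lib/mysql/
chown -R mysql: /var/lib/mysql
systemctl start mariadb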
Install slurmdbd package
Install the slurm database RPM on the database-only (slurmdbd service) node:
export VER=23.11.8-1 # Use the latest version
dnf install slurm-$VER*rpm slurm-devel-$VER*rpm slurm-slurmdbd-$VER*rpm
Explicitly enable the service:
systemctl enable slurmdbd
Set up MariaDB database
The accounting page has a section named MySQL Configuration which should be studied first. RHEL8 and EL8 clones contain the MariaDB database version 10.3.
Make sure the MariaDB packages were installed before you built the Slurm RPMs:
rpm -q mariadb-server mariadb-devel
rpm -ql slurm-slurmdbd | grep accounting_storage_mysql.so # Must show location of this file
Otherwise you must install MariaDB packages:
dnf install mariadb-server mariadb-devel
and rebuild all RPMs with mysql support as shown in Slurm installation and upgrading:
rpmbuild -ta slurm-$VER.tar.bz2 --with mysql
If you will use Ansible to manage the database, Ansible needs this Python package:
dnf install python3-mysql    # EL8
dnf install python3-PyMySQL  # EL9
Now start the MariaDB service:
systemctl start mariadb
systemctl enable mariadb
systemctl status mariadb
Make sure to configure the MariaDB database’s root password as instructed at first invocation of the mariadb service, or run this command:
/usr/bin/mysql_secure_installation
Select a suitable database password for the slurm user. Now follow the accounting page instructions (using -p to enter the database root password):
# mysql -p
grant all on slurm_acct_db.* TO 'slurm'@'localhost' identified by 'some_pass' with grant option; ### WARNING: change some_pass to your site's password
SHOW GRANTS;
SHOW VARIABLES LIKE 'have_innodb';
create database slurm_acct_db;
quit;
WARNING: Use the slurm database user’s password instead of some_pass!
Optional: If you would like to grant read-only (SELECT) access to the database, set up a readonly user with access from % (meaning any host):
# mysql -p
grant select on slurm_acct_db.* TO 'readonly'@'%' identified by 'some_pass';
For remote hosts you of course have to open the database server’s firewall on port 3306 as described in Firewall between slurmctld and slurmdbd.
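A readonly user on a remote host could then query the database, for example (the hostname dbserver.my.domain is a placeholder for your database server):
mysql -h dbserver.my.domain -u readonly -p slurm_acct_db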
You can verify the database grants for the slurm user:
# mysql -p -u slurm
show grants;
quit;
Regarding InnoDB: MariaDB versions before 10.2 use by default the XtraDB storage engine, a performance-enhanced fork of the InnoDB storage engine; from 10.2 onward the default is InnoDB itself.
This will grant user ‘slurm’ access to do what it needs to do on the local host or the storage host system.
This must be done before slurmdbd will work properly. After you have granted permissions to the user ‘slurm’ in mysql, you can start slurmdbd and the other Slurm daemons.
You can start slurmdbd by typing its pathname /usr/sbin/slurmdbd, and verify that it is running with ps aux | grep slurmdbd.
If slurmdbd is not running, you can use the -v option when starting slurmdbd to get more detailed information.
Starting slurmdbd in the foreground with the -D -vvv options can also help in debugging, so you don’t have to go to the log file to find the problem.
MySQL configuration
The accounting page section Slurm Accounting Configuration Before Build gives some advice about MySQL configuration:
NOTE: Before running slurmdbd for the first time, review the current setting for MySQL’s innodb_buffer_pool_size. Consider setting this value large enough to handle the size of the database. This helps when converting large tables over to the new database schema and when purging old records. Setting innodb_lock_wait_timeout and innodb_log_file_size to larger values than the default is also recommended.
The following is recommended for /etc/my.cnf, but on EL7/EL8 you should create a new file /etc/my.cnf.d/innodb.cnf containing:
[mysqld]
innodb_buffer_pool_size=32768M
innodb_log_file_size=64M
innodb_lock_wait_timeout=900
The innodb_buffer_pool_size could be even larger, like 50%-80% of the server’s RAM size.
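As a sketch, this command prints a suggested value equal to half of the server’s RAM (MemTotal in /proc/meminfo is in kB):
awk '/MemTotal/ {printf "innodb_buffer_pool_size=%dM\n", $2/1024/2}' /proc/meminfo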
To implement this change you have to shut down the database and move/remove logfiles:
systemctl stop mariadb
mv /var/lib/mysql/ib_logfile? /tmp/
systemctl start mariadb
You can check the current setting in MySQL like so:
# mysql -p
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'innodb_log_file_size';
SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';
quit;
See also Bug_2457:
The innodb_buffer_pool_size can have a huge impact - we’d recommend setting this as high as half the RAM available on the slurmdbd server.
Slurm database tables
To view the status of the tables in the slurm_acct_db database:
# mysqlshow -p --status slurm_acct_db
It is possible to display the contents of the slurm_acct_db database:
# mysql -p -u slurm slurm_acct_db
Enter password:
To show tables in the database:
MariaDB [slurm_acct_db]> show tables;
+----------------------------------+
| Tables_in_slurm_acct_db |
+----------------------------------+
| acct_coord_table |
| acct_table |
| clus_res_table |
| cluster_table |
| convert_version_table |
| federation_table |
| niflheim_assoc_table |
| niflheim_assoc_usage_day_table |
| niflheim_assoc_usage_hour_table |
| niflheim_assoc_usage_month_table |
| niflheim_event_table |
| niflheim_job_table |
| niflheim_last_ran_table |
| niflheim_resv_table |
| niflheim_step_table |
| niflheim_suspend_table |
| niflheim_usage_day_table |
| niflheim_usage_hour_table |
| niflheim_usage_month_table |
| niflheim_wckey_table |
| niflheim_wckey_usage_day_table |
| niflheim_wckey_usage_hour_table |
| niflheim_wckey_usage_month_table |
| qos_table |
| res_table |
| table_defs_table |
| tres_table |
| txn_table |
| user_table |
+----------------------------------+
where niflheim refers to the name of our cluster.
The contents of a table can be described like in this example:
MariaDB [slurm_acct_db]> describe user_table;
+---------------+---------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+---------------------+------+-----+---------+-------+
| creation_time | bigint(20) unsigned | NO | | NULL | |
| mod_time | bigint(20) unsigned | NO | | 0 | |
| deleted | tinyint(4) | YES | | 0 | |
| name | tinytext | NO | PRI | NULL | |
| admin_level | smallint(6) | NO | | 1 | |
+---------------+---------------------+------+-----+---------+-------+
An element in the user_table can be printed:
MariaDB [slurm_acct_db]> select * from user_table where name="xxx";
+---------------+------------+---------+------+-------------+
| creation_time | mod_time | deleted | name | admin_level |
+---------------+------------+---------+------+-------------+
| 1477321893 | 1477321893 | 0 | xxx | 1 |
+---------------+------------+---------+------+-------------+
SlurmDBD Configuration
While slurmdbd can work with a flat text file for recording job completions and the like, such a configuration will not allow “associations” between a user and account. A database allows such a configuration.
slurmdbd requires its own configuration file called slurmdbd.conf. Start by copying the example file from the slurmdbd.conf man-page.
The file /etc/slurm/slurmdbd.conf should exist only on the computer where slurmdbd executes, and it should be readable only by the user which runs slurmdbd (e.g. “slurm”).
It must be protected from unauthorized access, since it contains a database login name and password.
See the slurmdbd.conf man-page for a more complete description of the configuration parameters.
NOTICE: The /etc/slurm/slurm.conf file is not needed or used by the slurmdbd server. The only file required is /etc/slurm/slurmdbd.conf.
However, user commands such as sinfo and sacctmgr will need access to slurm.conf, and if /etc/slurm/slurm.conf does not exist, they will use the configless mode (if configured).
Set up files and permissions:
chown slurm: /etc/slurm/slurmdbd.conf
chmod 600 /etc/slurm/slurmdbd.conf
mkdir /var/log/slurm
touch /var/log/slurm/slurmdbd.log
chown slurm: /var/log/slurm/slurmdbd.log
Configure some of the slurmdbd.conf variables:
LogFile=/var/log/slurm/slurmdbd.log
DbdHost=XXXX # Replace by the slurmdbd server hostname (for example, slurmdbd.my.domain)
DbdPort=6819 # The default value
SlurmUser=slurm
StorageHost=localhost
StoragePass=some_pass # The above defined database password, change it for your site!
StorageLoc=slurm_acct_db
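You may verify the database login configured above, for example (substitute your site’s actual password for some_pass):
mysql -u slurm -p'some_pass' -h localhost -e 'show databases;'
The output should include the slurm_acct_db database.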
Setting database purge parameters
A database with very many job records (maybe of the order of millions) can make an upgrade of the Slurm database extremely slow, see the mailing list thread [slurm-users] Extreme long db upgrade 16.05.6 -> 17.11.3.
In order to solve this problem, it is advisable to purge job records from the Slurm database. In slurmdbd.conf you may define a number of purge parameters such as:
PurgeEventAfter
PurgeJobAfter
PurgeResvAfter
PurgeStepAfter
PurgeUsageAfter
The values of these parameters depend on the number of jobs in the database, which differs a lot between sites. There does not seem to be any heuristics for determining good values, so some testing will be required.
The high_throughput page has this advice:
You might also consider setting the Purge options in your slurmdbd.conf to clear out old Data. A typical configuration might look like this:
PurgeEventAfter=12months
PurgeJobAfter=12months
PurgeResvAfter=2months
PurgeStepAfter=2months
PurgeSuspendAfter=1month
PurgeTXNAfter=12months
PurgeUsageAfter=12months
The purge operation is done at the start of each time interval (see bug_4295), which means on the 1st day of the month in this example. Monthly, daily or even hourly purge operations would occur when using different time units for the same interval:
PurgeStepAfter=2months
PurgeStepAfter=60days
PurgeStepAfter=1440hours
A monthly purge operation can be a huge amount of work for a database depending on its size, and you certainly want to cut down the amount of work required during the purges. If you did not use any purges before, it is a good idea to make a series of daily purges starting with a very long interval, for example:
PurgeEventAfter=2000days
PurgeJobAfter=2000days
PurgeResvAfter=2000days
PurgeStepAfter=2000days
PurgeSuspendAfter=2000days
If this works well over a few days, decrease the 2000days purge interval little by little (1800, 1500, etc.) and try again, until after many iterations you have come down to the desired final purge intervals.
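A minimal sketch of one such iteration, assuming all purge intervals are currently set to the same NNNNdays value (the script itself and the 100-day step are assumptions, not part of Slurm):
#!/bin/sh
# Hypothetical helper: lower all purge intervals in slurmdbd.conf by 100 days.
# Run once per day while watching the purge activity in slurmdbd.log.
CONF=/etc/slurm/slurmdbd.conf
# Read the current interval, e.g. 2000 from "PurgeJobAfter=2000days"
current=$(awk -F= '/^PurgeJobAfter/ {print $2+0}' $CONF)
new=$((current - 100))
sed -i "s/=${current}days/=${new}days/g" $CONF
systemctl restart slurmdbd   # slurmdbd must be restarted to reread the file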
Logging of purge events can be configured in slurmdbd.conf using:
DebugLevel=verbose
DebugFlags=DB_ARCHIVE
slurmdbd hostname configuration
The slurmdbd hostname must be configured correctly. The default value may be localhost, meaning that no other hosts can query the slurmdbd service (you may or may not want this limitation).
We recommend setting the slurmdbd hostname explicitly (for example, slurmdbd.my.domain) in these files:
On the slurmdbd server, configure DbdHost in slurmdbd.conf as documented above:
DbdHost=slurmdbd.my.domain
On the slurmctld server, configure AccountingStorageHost in slurm.conf so that slurmctld points to the slurmdbd server’s hostname:
AccountingStorageHost=slurmdbd.my.domain
After restarting the slurmctld and slurmdbd services, verify the setup by:
scontrol show config | grep AccountingStorageHost
If other nodes than the slurmdbd node must be able to connect to the slurmdbd service, you must open the firewall to specific hosts as described in Firewall between slurmctld and slurmdbd.
Setting MaxQueryTimeRange
It may be a good idea to limit normal users from querying the database over too long periods of time. The slurmdbd.conf parameter MaxQueryTimeRange is used for this, for example for a maximum of 60 days:
MaxQueryTimeRange=60-0
Start the slurmdbd service
First try to run slurmdbd manually to see the log:
slurmdbd -D -vvv
Terminate the process by Control-C when the testing is OK.
Start the slurmdbd service:
systemctl enable slurmdbd
systemctl start slurmdbd
systemctl status slurmdbd
If you get this error in /var/log/slurm/slurmdbd.log:
error: Couldn't find the specified plugin name for accounting_storage/mysql looking at all files
then the file /usr/lib64/slurm/accounting_storage_mysql.so is missing, because you forgot to install the mariadb-devel RPM before building the Slurm RPMs.
You must install the mariadb-devel RPM and rebuild and reinstall the Slurm RPMs as shown above.
Backup and restore of database
The Slurm database must be backed up regularly, so that the entire database can be restored to a different location (for disaster recovery or migration).
You may want to read the page Mysqldump with Modern MySQL.
See also the discussion in Bug_10295 about dumping only the slurm_acct_db database, and using --single-transaction for InnoDB tables.
You can show the tables by:
# mysqlshow -p --status slurm_acct_db
For compressing the (large) database dumps, install the gzip, bzip2, and perhaps the lbzip2 package:
dnf install gzip bzip2
dnf install lbzip2 # From EPEL
Backup of MariaDB 10.1 and later
In MariaDB 10.1 and later, Mariabackup is the recommended backup method to use instead of Percona XtraBackup, see this page.
However, the usual mysqldump_for_MariaDB utility still exists for MariaDB.
Backup script with crontab
Make a dump of the slurm_acct_db database using mysqldump, for example using this example script /root/mysqlbackup (Note: Insert the correct root DATABASE-PASSWORD in PWD):
#!/bin/sh
# MySQL Backup Script for slurm_acct_db database
HOST=localhost
BACKUPFILE=/root/mysql_dump.gz
USER=root
PWD='DATABASE-PASSWORD'
DUMP_ARGS="--single-transaction"
DATABASES="-B slurm_acct_db"
/usr/bin/mysqldump --host=$HOST --user=$USER --password=$PWD $DUMP_ARGS $DATABASES | gzip > $BACKUPFILE
Write permission to $BACKUPFILE is required. The script is also available in https://github.com/OleHolmNielsen/Slurm_tools/blob/master/database/.
Make regular database dumps, for example by a crontab job:
# MySQL database backup
30 7 * * * /root/mysqlbackup
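To verify that a dump looks sane, you could for example count the table definitions in it (the count should match the number of tables shown by mysqlshow above):
zcat /root/mysql_dump.gz | grep -c 'CREATE TABLE'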
Backup script with logrotate
It is preferable to make daily database backup dumps and keep a number of backup versions. The logrotate tool is ideal for this purpose.
This logrotate file /etc/logrotate.d/slurm_acct_db_backup
will make 8 backup versions in /var/log/mariadb/
(insert the correct root DATABASE-PASSWORD):
/var/log/mariadb/slurm_acct_db_backup.bz2 {
daily
dateext
dateyesterday
rotate 8
nocompress
create 640 root adm
postrotate
# Dump ONLY the Slurm database slurm_acct_db
# Strongly recommended: --single-transaction
# Use bzip2 for compression.
# Alternatively use lbzip2 from the EPEL repo: lbzip2 --force -n 5
/usr/bin/mysqldump --user=root --password=<DATABASE-PASSWORD> --single-transaction -B slurm_acct_db | bzip2 > /var/log/mariadb/slurm_acct_db_backup.bz2
endscript
}
You must first create an empty backup file:
touch /var/log/mariadb/slurm_acct_db_backup.bz2
The script is also available in https://github.com/OleHolmNielsen/Slurm_tools/tree/master/database.
Using mysqldump --single-transaction is recommended by SchedMD to avoid race conditions when slurmdbd is running while taking the MySQL dump, see https://bugs.schedmd.com/show_bug.cgi?id=10295#c18
Testing the script:
logrotate -dv /etc/logrotate.d/slurm_acct_db_backup
Note: SELinux enforces that logrotate only creates files in the /var/log/ folder and below. If logrotate tries to create files in other locations, it will get permission-denied errors, and these errors will be present in /var/log/audit/audit.log.
See the logrotate_selinux manual page and this Red Hat solution: https://access.redhat.com/solutions/39006
Restore of a database backup
The database contents must be loaded from the backup. To restore a MySQL database see for example How do I restore a MySQL .dump file?.
Make sure slurmdbd is stopped:
systemctl stop slurmdbd
As the root user, read in the backup file created above:
mysql -u root -p < /root/mysql_dump
or if the dump file has been compressed:
zcat /root/mysql_dump.gz | mysql -u root -p
bzcat /root/mysql_dump.bz2 | mysql -u root -p # For bzip2 compressed files
The MariaDB/MySQL password will be asked for. Reading in the database dump may take many minutes depending on the size of the dump file, the storage system speed, and the CPU performance.
Upgrade of MySQL/MariaDB
If you restore a database dump onto a different server running a newer MySQL or MariaDB version there are some extra steps:
Consult the Upgrading_MariaDB page with detailed instructions for upgrading between MariaDB versions or from MySQL.
You should run the mysql_upgrade command whenever major (or even minor) version upgrades are made:
mysql_upgrade -p
It may be necessary to force an upgrade if you have restored a database dump made on an earlier version of MariaDB, say, when migrating from CentOS7/RHEL7 to EL8:
mysql_upgrade -p --force
It may in some cases be necessary to restart the mariadb service or reboot the server after this upgrade.
When migrating a database from CentOS/RHEL 7 (EL7) to RHEL 8 (and EL8 clones) you should read Upgrading from MariaDB 5.5 to MariaDB 10.0 since there are some incompatible changes between 5.5 and 10.
Slurm database modifications required for MariaDB 10.2.1 and above
In MariaDB 10.2.1 and above there are some important changes which have been discussed in bug_13562. Several Slurm database tables must be altered while the slurmdbd is stopped. Please note that EL7 contains MariaDB version 5.5, and EL8 contains MariaDB version 10.3, so this point is important, for example, when upgrading from EL7 to EL8! This has been resolved as of Slurm 22.05.7.
We have discussed the procedure for MariaDB 10.2.1 and above in detail in bug_15168. A future version of Slurm may perform these changes automatically.
The procedure for Slurm database modifications must be followed when MariaDB has been upgraded from older versions than 10.2.1 to 10.2.1 or newer, or when a Slurm database has been restored from a dump from an older MariaDB version.
The following steps should be made:
Make sure slurmdbd is stopped:
systemctl stop slurmdbd
Configure MariaDB 10.3 for Slurm in the usual way (see above in the present page).
Login to the slurm_acct_db database:
# mysql -p -u slurm slurm_acct_db
Enter password:
Drop (delete) the following database table and show tables in the database:
drop table table_defs_table;
show tables;
quit;
Set a high debug level in slurmdbd.conf:
DebugLevel=debug4
Start the slurmdbd service and look for a number of table creation lines in /var/log/slurm/slurmdbd.log such as:
debug4: Table "niflheim_assoc_table" doesn't exist, adding
You can also repeat the show tables; command from above to verify that table_defs_table exists once again.
At the end, set a normal debug level in slurmdbd.conf:
DebugLevel=verbose
and restart slurmdbd:
systemctl restart slurmdbd
Configure database accounting in slurm.conf
Finally, when you have made sure that the slurmdbd service is working correctly, you must configure the Slurm controller’s slurm.conf to use slurmdbd. In slurm.conf you must configure accounting so that the database will be used through the slurmdbd database daemon:
AccountingStorageType=accounting_storage/slurmdbd
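The relevant accounting lines in slurm.conf might then look like this sketch (the hostname follows the example above, and AccountingStoragePort must match the DbdPort configured in slurmdbd.conf):
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=slurmdbd.my.domain
AccountingStoragePort=6819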
Migrate the slurmdbd service to another server
It is recommended to run the slurmdbd database server on a separate host from the slurmctld’s server, see documents in Slurm_publications:
Technical: Field Notes From the Frontlines of Slurm Support, Tim Wickberg, SchedMD (2017) slides on High-Availability.
Technical: Field Notes Mark 2: Random Musings From Under A New Hat, Tim Wickberg, SchedMD (2018) slides on My Preferred Deployment Pattern.
However, many sites run both services successfully on the same server. If you decide to migrate the slurmdbd service to another server, here is a tested procedure which works on a running production cluster.
It is important to understand that the slurmctld service can run without problems even when the slurmdbd database is not responding,
since slurmctld just caches all state information in the StateSaveLocation directory:
$ scontrol show config | grep StateSaveLocation
StateSaveLocation = /var/spool/slurmctld
Therefore we can take down the slurmdbd service and server for a number of minutes or hours without any problems.
The outstanding messages in the StateSaveLocation are currently capped at 3xNodes + MaxJobCount.
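You can look up the values that enter into this cap, for example:
scontrol show config | grep -E 'MaxJobCount|StateSaveLocation'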
Configure a slurmdbd server
Install a new Slurm server as described in Slurm installation and upgrading. You must make sure that these prerequisites are satisfied:
Install the same Slurm version on the new server as on the old server! This ensures that the database migration will be as fast as possible. Any upgrading should be done at a later date according to the instructions in Upgrading Slurm.
Make sure to open the firewall completely as described in Firewall between slurmctld and slurmdbd.
Configure the MariaDB/MySQL and the slurmdbd services as described above.
Testing the database restore
Take a database dump file and restore it into the MariaDB/MySQL database (see above Backup and restore of database). Use the time_command to get an estimate of the time this will take.
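For example (assuming the dump file name used by the backup script above):
time zcat /root/mysql_dump.gz | mysql -u root -p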
Configure the server’s hostname (for example db2) in slurmdbd.conf:
DbdHost=<hostname>
Start the slurmdbd service manually to see if any errors occur:
slurmdbd -D -vvvv
and wait for the output:
slurmdbd: debug2: Everything rolled up
and do a Control-C.
Database migration procedure
Let us denote the slurmdbd servers as:
db1 is the current slurmdbd and MariaDB database server. This could be the same as the slurmctld server, or it could be a dedicated server.
db2 is the designated new slurmdbd and MariaDB database server.
db1: stop slurmdbd
On the db1 server:
Stop and disable slurmdbd and make sure the status is down:
systemctl disable slurmdbd
systemctl stop slurmdbd
systemctl status slurmdbd
Run the MySQL database dump described above in Backup and restore of database.
Copy the database dump to the db2 server. Make a long-term copy of the database dump.
db2: restore database and start slurmdbd
On the db2 server:
Make sure the slurmdbd service is stopped and that no crontab jobs will run database dumps.
Load the database dump from db1 into MariaDB as shown above in Backup and restore of database.
If the MariaDB version on db2 is newer than on db1, then you must remember to perform the MariaDB update actions shown above.
Start the slurmdbd service manually to see if any errors occur:
slurmdbd -D -vvvv
and wait for the output:
slurmdbd: debug2: Everything rolled up
and do a Control-C.
Start and enable slurmdbd and make sure the status is up:
systemctl enable slurmdbd
systemctl start slurmdbd
systemctl status slurmdbd
Now the new slurmdbd service should be up and running on the db2 server in a stable state.
slurmctld server: reconfigure AccountingStorageHost
On the slurmctld server:
Now it’s time to reconfigure slurmctld for the new db2 slurmdbd server.
Stop the slurmctld:
systemctl stop slurmctld
Edit slurm.conf to configure the new slurmdbd server (db2):
AccountingStorageHost=db2
Just for safety, but not required: Make a backup copy of the StateSaveLocation directory /var/spool/slurmctld:
tar czf $HOME/var.spool.slurmctld.tar.gz /var/spool/slurmctld/*
Make sure the contents of the tar-ball file look correct!
Start the slurmctld:
systemctl start slurmctld
and make a reconfig to notify all the slurmd processes:
scontrol reconfig
Check the slurmctld log file, for example:
grep slurmdbd: /var/log/slurm/slurmctld.log
Test that your Slurm cluster’s functionality has now been completely restored (use squeue, sinfo etc.).
db2: Enable database backups
On the db2 server:
Make a crontab job for doing database dumps as in Backup and restore of database.
Make sure the db2 server and the database dumps are backed up daily/regularly to your site’s backup service.