Slurm Task Plugin Programmer Guide

Overview

This document describes Slurm task management plugins and the API that defines them. It is intended as a resource for programmers wishing to write their own Slurm task management plugins.

Slurm task management plugins are Slurm plugins that implement the Slurm task management API described herein. They are typically used to control task affinity (that is, binding tasks to processors). They must conform to the Slurm Plugin API with the following specifications:

const char plugin_type[]
The major type must be "task". The minor type can be any recognizable abbreviation for the type of task management. We recommend, for example:

  • affinity: A plugin that implements task binding to processors. The actual mechanism used for task binding depends on the available infrastructure, as determined by the "configure" program when Slurm is built, and on the value of TaskPluginParam in slurm.conf (the Slurm configuration file).
  • cgroup: Use Linux cgroups for binding tasks to resources.
  • none: A plugin that implements the API without providing any services. This is the default behavior and provides no task binding.

const char plugin_name[]
Some descriptive name for the plugin. There is no requirement with respect to its format.

const uint32_t plugin_version
If specified, identifies the version of Slurm used to build this plugin. Any attempt to load the plugin from a different version of Slurm will result in an error. If not specified, the plugin may be loaded by Slurm commands and daemons from any version; however, this may result in difficult-to-diagnose failures due to changes in the arguments to plugin functions or changes in other Slurm functions used by the plugin.
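
As an illustration of the three identification symbols described above, a minimal sketch might look like the following. The minor type "example" and the descriptive name are hypothetical. A real plugin takes SLURM_VERSION_NUMBER from the Slurm headers; the fallback definition here is only a placeholder so that the fragment compiles on its own, and its value is not a real Slurm version.

```c
#include <stdint.h>

/*
 * Assumption: when built inside the Slurm source tree, SLURM_VERSION_NUMBER
 * is supplied by the Slurm headers. The fallback below is a placeholder for
 * stand-alone compilation only and does not denote a real Slurm release.
 */
#ifndef SLURM_VERSION_NUMBER
#define SLURM_VERSION_NUMBER 0u
#endif

/* Free-form descriptive name; no format is required. */
const char plugin_name[] = "Example task plugin";

/* Major type "task", then a hypothetical minor type "example". */
const char plugin_type[] = "task/example";

/* Ties the plugin to the Slurm version it was built against. */
const uint32_t plugin_version = SLURM_VERSION_NUMBER;
```

Omitting plugin_version would allow the plugin to be loaded by any Slurm version, at the cost of the compatibility risks noted above.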

Data Objects

The implementation must maintain (though not necessarily directly export) an enumerated errno so that Slurm can discover, as far as is practical, the reason for any failed API call. These values must not be used as return values in integer-valued functions in the API. The proper error return value from integer-valued functions is SLURM_ERROR.
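
One possible way to follow this convention is sketched below. The enumeration, the helper names, and the CPU-counting logic are all hypothetical and exist only to show the pattern: the detailed reason is recorded separately, while the function itself returns nothing but SLURM_SUCCESS or SLURM_ERROR. The two macros are local stand-ins for the values that the Slurm headers normally provide.

```c
#include <errno.h>

/* Local stand-ins; a real plugin gets these from the Slurm headers. */
#ifndef SLURM_SUCCESS
#define SLURM_SUCCESS 0
#endif
#ifndef SLURM_ERROR
#define SLURM_ERROR (-1)
#endif

/* Hypothetical plugin-private error codes (illustrative names only). */
enum example_task_errno {
	EXAMPLE_TASK_OK = 0,
	EXAMPLE_TASK_BAD_REQUEST,
	EXAMPLE_TASK_NO_CPUS
};

/* Last failure reason, kept inside the plugin rather than returned. */
static enum example_task_errno example_last_error = EXAMPLE_TASK_OK;

/* Translate a plugin-private code into a readable message. */
const char *example_task_strerror(enum example_task_errno e)
{
	switch (e) {
	case EXAMPLE_TASK_OK:
		return "no error";
	case EXAMPLE_TASK_BAD_REQUEST:
		return "invalid CPU request";
	case EXAMPLE_TASK_NO_CPUS:
		return "not enough CPUs available";
	}
	return "unknown error";
}

/* Hypothetical helper: the return value is only SUCCESS or ERROR. */
int example_reserve_cpus(int requested, int available)
{
	if (requested <= 0) {
		example_last_error = EXAMPLE_TASK_BAD_REQUEST;
		errno = EINVAL;
		return SLURM_ERROR;
	}
	if (requested > available) {
		example_last_error = EXAMPLE_TASK_NO_CPUS;
		errno = EBUSY;
		return SLURM_ERROR;
	}
	example_last_error = EXAMPLE_TASK_OK;
	return SLURM_SUCCESS;
}
```

A caller that sees SLURM_ERROR can then consult errno (or the plugin-private code) to learn why the call failed.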

API Functions

The following functions must appear. Functions which are not implemented should be stubbed.

int init (void)

Description:
Called when the plugin is loaded, before any other functions are called. Put global initialization here.

Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.

void fini (void)

Description:
Called when the plugin is removed. Clear any allocated storage here.

Returns: None.

Note: These init and fini functions are not the same as those described in the dlopen(3) man page. The C run-time system co-opts those symbols for its own initialization. The system _init() is called before the Slurm init(), and the Slurm fini() is called before the system's _fini().
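
A minimal sketch of an init/fini pair, following the signatures given above, is shown here. The global table example_cpu_map and its size are hypothetical; they stand for whatever plugin-wide state an implementation needs. The status macros are local stand-ins for the values from the Slurm headers.

```c
#include <stdlib.h>

/* Local stand-ins; a real plugin gets these from the Slurm headers. */
#ifndef SLURM_SUCCESS
#define SLURM_SUCCESS 0
#endif
#ifndef SLURM_ERROR
#define SLURM_ERROR (-1)
#endif

/* Hypothetical plugin-wide state: one slot per CPU (size is illustrative). */
#define EXAMPLE_CPU_COUNT 64
static int *example_cpu_map = NULL;

/* Called once when the plugin is loaded, before any other function. */
int init(void)
{
	example_cpu_map = calloc(EXAMPLE_CPU_COUNT, sizeof(*example_cpu_map));
	if (!example_cpu_map)
		return SLURM_ERROR;
	return SLURM_SUCCESS;
}

/* Called when the plugin is removed; releases what init() allocated. */
void fini(void)
{
	free(example_cpu_map);
	example_cpu_map = NULL;
}
```

Resetting the pointer in fini() keeps a later init() call, or an accidental second fini(), from touching freed memory.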

int task_p_slurmd_batch_request (batch_job_launch_msg_t *req);

Description: Prepare to launch a batch job. Establish node, socket, and core resource availability for it. Executed by the slurmd daemon as user root.

Argument:
req   (input/output) Batch job launch request specification. See src/common/slurm_protocol_defs.h for the data structure definition.

Returns: SLURM_SUCCESS if successful. On failure, the plugin should return SLURM_ERROR and set the errno to an appropriate value to indicate the reason for failure.

int task_p_slurmd_launch_request (launch_tasks_request_msg_t *req, uint32_t node_id);

Description: Prepare to launch a job. Establish node, socket, and core resource availability for it. Executed by the slurmd daemon as user root.

Arguments:
req   (input/output) Task launch request specification including node, socket, and core specifications. See src/common/slurm_protocol_defs.h for the data structure definition.
node_id   (input) ID of the node on which resources are being acquired (zero origin).

Returns: SLURM_SUCCESS if successful. On failure, the plugin should return SLURM_ERROR and set the errno to an appropriate value to indicate the reason for failure.
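
For a plugin that does not need to act on launch requests, the two slurmd request functions can be stubbed roughly as follows. This is a sketch under stated assumptions: the request types are declared here as opaque stand-ins so that the fragment compiles without the Slurm source tree (a real plugin includes src/common/slurm_protocol_defs.h instead), and the NULL check is an illustrative choice rather than documented Slurm behavior.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; a real plugin gets these from the Slurm headers. */
#ifndef SLURM_SUCCESS
#define SLURM_SUCCESS 0
#endif
#ifndef SLURM_ERROR
#define SLURM_ERROR (-1)
#endif

/* Opaque stand-ins for the real message types (assumption for this sketch). */
typedef struct batch_job_launch_msg batch_job_launch_msg_t;
typedef struct launch_tasks_request_msg launch_tasks_request_msg_t;

/* Stub: accept any non-NULL batch request without modifying it. */
int task_p_slurmd_batch_request(batch_job_launch_msg_t *req)
{
	if (!req) {
		errno = EINVAL;
		return SLURM_ERROR;
	}
	return SLURM_SUCCESS;
}

/* Stub: accept any non-NULL task launch request; node_id is unused. */
int task_p_slurmd_launch_request(launch_tasks_request_msg_t *req,
				 uint32_t node_id)
{
	(void) node_id;
	if (!req) {
		errno = EINVAL;
		return SLURM_ERROR;
	}
	return SLURM_SUCCESS;
}
```

A plugin that does manage resources would inspect and possibly update the request in place, since req is an input/output argument.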

int task_p_slurmd_suspend_job (uint32_t job_id);

Description: Temporarily release resources previously reserved for a job. Executed by the slurmd daemon as user root.

Arguments: job_id   (input) ID of the job which is being suspended.

Returns: SLURM_SUCCESS if successful. On failure, the plugin should return SLURM_ERROR and set the errno to an appropriate value to indicate the reason for failure.

int task_p_slurmd_resume_job (uint32_t job_id);

Description: Reclaim resources which were previously released using the task_p_slurmd_suspend_job function. Executed by the slurmd daemon as user root.

Arguments: job_id   (input) ID of the job which is being resumed.

Returns: SLURM_SUCCESS if successful. On failure, the plugin should return SLURM_ERROR and set the errno to an appropriate value to indicate the reason for failure.
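
The suspend and resume functions form a pair, so a plugin usually needs some record of which jobs are currently suspended. The sketch below uses a small fixed-size table for that purpose. The table, its size, and the choice of errno values are hypothetical, and the comments mark where a real plugin would release or reclaim resources.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; a real plugin gets these from the Slurm headers. */
#ifndef SLURM_SUCCESS
#define SLURM_SUCCESS 0
#endif
#ifndef SLURM_ERROR
#define SLURM_ERROR (-1)
#endif

/* Hypothetical bookkeeping: IDs of jobs whose resources are released. */
#define EXAMPLE_MAX_SUSPENDED 16
static uint32_t example_suspended[EXAMPLE_MAX_SUSPENDED];
static size_t example_suspended_cnt = 0;

/* Return the table index of job_id, or -1 if it is not suspended. */
static int example_find(uint32_t job_id)
{
	for (size_t i = 0; i < example_suspended_cnt; i++) {
		if (example_suspended[i] == job_id)
			return (int) i;
	}
	return -1;
}

int task_p_slurmd_suspend_job(uint32_t job_id)
{
	if (example_find(job_id) >= 0)
		return SLURM_SUCCESS;	/* already suspended */
	if (example_suspended_cnt == EXAMPLE_MAX_SUSPENDED) {
		errno = ENOMEM;
		return SLURM_ERROR;
	}
	/* A real plugin would release the job's resources here. */
	example_suspended[example_suspended_cnt++] = job_id;
	return SLURM_SUCCESS;
}

int task_p_slurmd_resume_job(uint32_t job_id)
{
	int i = example_find(job_id);

	if (i < 0) {
		errno = ENOENT;	/* job was never suspended */
		return SLURM_ERROR;
	}
	/* A real plugin would reclaim the job's resources here. */
	example_suspended[i] = example_suspended[--example_suspended_cnt];
	return SLURM_SUCCESS;
}
```

Treating a repeated suspend as a success and an unmatched resume as an error is a design choice of this sketch, not a requirement stated by the API.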

int task_p_pre_setuid (stepd_step_rec_t *job);

Description: task_p_pre_setuid() is called before the UID is set to that of the user launching the job. Executed by the slurmstepd program as user root.

Arguments: job   (input) pointer to the job to be initiated. See src/slurmd/slurmstepd/slurmstepd_job.h for the data structure definition.

Returns: SLURM_SUCCESS if successful. On failure, the plugin should return SLURM_ERROR and set the errno to an appropriate value to indicate the reason for failure.

int task_p_pre_launch_priv (stepd_step_rec_t *job);

Description: task_p_pre_launch_priv() is called by each forked task just after the fork. Note that no task-specific information is available in the job structure at that time. Executed by the slurmstepd program as user root.

Arguments: job   (input) pointer to the job to be initiated. See src/slurmd/slurmstepd/slurmstepd_job.h for the data structure definition.

Returns: SLURM_SUCCESS if successful. On failure, the plugin should return SLURM_ERROR and set the errno to an appropriate value to indicate the reason for failure.

int task_p_pre_launch (stepd_step_rec_t *job);

Description: task_p_pre_launch() is called prior to the exec of the application task. Executed by the slurmstepd program as the job's owner. It is followed by the TaskProlog program (as configured in slurm.conf) and the --task-prolog program (from the srun command line).

Arguments: job   (input) pointer to the job to be initiated. See src/slurmd/slurmstepd/slurmstepd_job.h for the data structure definition.

Returns: SLURM_SUCCESS if successful. On failure, the plugin should return SLURM_ERROR and set the errno to an appropriate value to indicate the reason for failure.

int task_p_post_term (stepd_step_rec_t *job, slurmd_task_p_info_t *task);

Description: task_p_post_term() is called after termination of the job step. Executed by the slurmstepd program as the job's owner. It is preceded by the --task-epilog program (from the srun command line) and then the TaskEpilog program (as configured in slurm.conf).

Arguments:
job   (input) pointer to the job which has terminated. See src/slurmd/slurmstepd/slurmstepd_job.h for the data structure definition.
task   (input) pointer to the task which has terminated. See src/slurmd/slurmstepd/slurmstepd_job.h for the data structure definition.

Returns: SLURM_SUCCESS if successful. On failure, the plugin should return SLURM_ERROR and set the errno to an appropriate value to indicate the reason for failure.

int task_p_post_step (stepd_step_rec_t *job);

Description: task_p_post_step() is called after termination of all the tasks of the job step. Executed by the slurmstepd program as user root.

Arguments: job   (input) pointer to the job which has terminated. See src/slurmd/slurmstepd/slurmstepd_job.h for the data structure definition.

Returns: SLURM_SUCCESS if successful. On failure, the plugin should return SLURM_ERROR and set the errno to an appropriate value to indicate the reason for failure.
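
To summarize the order in which slurmstepd reaches the functions described above, the following sketch simulates one job step and records each call. It is a simplified model, not Slurm code: the hooks are represented by enum values rather than the real functions, the driver example_run_step() is hypothetical, tasks are processed one after another instead of in forked processes, and running task_p_post_term() once per task is an assumption inferred from its task argument.

```c
#include <stdio.h>

/* Hooks in the order this document describes them. */
enum example_hook {
	HOOK_PRE_SETUID,
	HOOK_PRE_LAUNCH_PRIV,
	HOOK_PRE_LAUNCH,
	HOOK_POST_TERM,
	HOOK_POST_STEP
};

static const char *example_hook_name[] = {
	"task_p_pre_setuid",
	"task_p_pre_launch_priv",
	"task_p_pre_launch",
	"task_p_post_term",
	"task_p_post_step"
};

/* Call log filled in by the simulated driver. */
#define EXAMPLE_LOG_MAX 64
static enum example_hook example_log[EXAMPLE_LOG_MAX];
static int example_log_len = 0;

static void example_call(enum example_hook h)
{
	if (example_log_len < EXAMPLE_LOG_MAX)
		example_log[example_log_len++] = h;
}

/* Hypothetical driver: walks one job step with ntasks tasks. */
void example_run_step(int ntasks)
{
	example_call(HOOK_PRE_SETUID);			/* as root, before setuid */
	for (int t = 0; t < ntasks; t++) {
		example_call(HOOK_PRE_LAUNCH_PRIV);	/* as root, after fork */
		example_call(HOOK_PRE_LAUNCH);		/* as job owner, before exec */
	}
	for (int t = 0; t < ntasks; t++)
		example_call(HOOK_POST_TERM);		/* as job owner, after exit */
	example_call(HOOK_POST_STEP);			/* as root, all tasks done */
}

/* Print the recorded sequence, one hook name per line. */
void example_print_log(void)
{
	for (int i = 0; i < example_log_len; i++)
		printf("%d: %s\n", i, example_hook_name[example_log[i]]);
}
```

Calling example_run_step(2) followed by example_print_log() lists the simulated sequence for a two-task step; in a real step the per-task calls from different tasks may interleave.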

Last modified 17 June 2020