Topology Plugin Programmer Guide

Overview

This document describes Slurm topology plugins and the API that defines them. It is intended as a resource for programmers wishing to write their own Slurm topology plugin.

Slurm topology plugins are Slurm plugins that convey system topology information so that Slurm is able to optimize resource allocations and minimize communication overhead. The plugins must conform to the Slurm Plugin API with the following specifications:

const char plugin_type[]
The major type must be "topology." The minor type specifies the type of topology mechanism. We recommend, for example:

  • 3d_torus — Optimize placement for a three dimensional torus.
  • none — No topology information.
  • tree — Optimize placement based upon a hierarchy of network switches.

const char plugin_name[]
Some descriptive name for the plugin. There is no requirement with respect to its format.

const uint32_t plugin_version
If specified, identifies the version of Slurm used to build this plugin; any attempt to load the plugin from a different version of Slurm will result in an error. If not specified, the plugin may be loaded by Slurm commands and daemons from any version. However, this may result in difficult-to-diagnose failures due to changes in the arguments to plugin functions or changes in other Slurm functions used by the plugin.
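As an illustration of the three symbols described above, the following is a minimal sketch of how a tree-style plugin might declare them. It assumes the usual Slurm convention of writing the type as "major/minor" and of setting plugin_version from the SLURM_VERSION_NUMBER macro supplied by the Slurm headers; the placeholder definition and the descriptive name below are illustrative only.

```c
#include <stdint.h>

/* Assumption: a real plugin gets SLURM_VERSION_NUMBER from the Slurm
 * headers. This placeholder only lets the sketch compile on its own. */
#ifndef SLURM_VERSION_NUMBER
#define SLURM_VERSION_NUMBER 0
#endif

/* Free-form descriptive name; the text here is only an example. */
const char plugin_name[] = "Example tree topology plugin";

/* Major type "topology", minor type "tree". */
const char plugin_type[] = "topology/tree";

/* Ties the plugin to the Slurm version it was built against. */
const uint32_t plugin_version = SLURM_VERSION_NUMBER;
```

Omitting plugin_version would allow the plugin to load under any Slurm version, with the risks described above.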

The actions performed by these plugins vary widely. In the case of 3d_torus, the nodes in the configuration file are re-ordered so that nodes which are nearby in the one-dimensional table are also nearby in logical three-dimensional space. In the case of tree, a table is built to reflect network topology and that table is later used by the select plugin to optimize placement. Note carefully, however, the versioning discussion above.

Data Objects

The implementation must maintain (though not necessarily directly export) an enumerated errno to allow Slurm to discover as practically as possible the reason for any failed API call. Plugin-specific enumerated integer values may be used when appropriate.

These values must not be used as return values in integer-valued functions in the API. The proper error return value from integer-valued functions is SLURM_ERROR. The implementation should endeavor to provide useful and pertinent information by whatever means is practical. Successful API calls are not required to reset any errno to a known value. However, the initial value of any errno, prior to any error condition arising, should be SLURM_SUCCESS.
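The errno rules above might be sketched as follows. All names beginning with example_ or EXAMPLE_ are hypothetical and are not part of the Slurm API; the SLURM_SUCCESS and SLURM_ERROR definitions are stand-ins for the values normally provided by the Slurm headers.

```c
#include <stddef.h>

/* Stand-ins for the Slurm header values so the sketch is self-contained. */
#ifndef SLURM_SUCCESS
#define SLURM_SUCCESS 0
#define SLURM_ERROR  -1
#endif

/* Hypothetical plugin-specific error codes. */
enum {
    EXAMPLE_TOPO_NO_SUCH_NODE = 1000,
    EXAMPLE_TOPO_BAD_CONFIG
};

/* Plugin-private errno: starts out as SLURM_SUCCESS. */
static int example_topo_errno = SLURM_SUCCESS;

/* Hypothetical accessor; the errno need not be exported directly. */
int example_topo_get_errno(void)
{
    return example_topo_errno;
}

/* Hypothetical integer-valued call: it reports failure with SLURM_ERROR
 * and records the detailed reason in the plugin errno. */
int example_topo_check_node(const char *node_name)
{
    if (node_name == NULL || node_name[0] == '\0') {
        example_topo_errno = EXAMPLE_TOPO_NO_SUCH_NODE;
        return SLURM_ERROR;
    }
    /* Success does not reset the errno; that is permitted. */
    return SLURM_SUCCESS;
}
```

Note that the enumerated value is never returned from the function itself; the caller sees only SLURM_ERROR and must query the errno for the reason.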

API Functions

The following functions must appear. Functions which are not implemented should be stubbed.

int init (void)

Description:
Called when the plugin is loaded, before any other functions are called. Put global initialization here.

Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.

void fini (void)

Description:
Called when the plugin is removed. Clear any allocated storage here.

Returns: None.

Note: These init and fini functions are not the same as those described in the dlopen (3) system library. The C run-time system co-opts those symbols for its own initialization. The system _init() is called before the Slurm init(), and the Slurm fini() is called before the system's _fini().
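A minimal sketch of the init and fini pair, following the signatures given above. The example_table storage and its size are hypothetical, and the SLURM_SUCCESS and SLURM_ERROR definitions are stand-ins for the Slurm header values. The sketch uses the standard C allocator for portability; a real plugin may prefer Slurm's own memory helpers.

```c
#include <stdlib.h>

/* Stand-ins for the Slurm header values so the sketch is self-contained. */
#ifndef SLURM_SUCCESS
#define SLURM_SUCCESS 0
#define SLURM_ERROR  -1
#endif

#define EXAMPLE_TABLE_SIZE 64

/* Hypothetical plugin-global storage. */
static int *example_table = NULL;

/* Called once when the plugin is loaded: do global initialization. */
int init(void)
{
    example_table = calloc(EXAMPLE_TABLE_SIZE, sizeof(*example_table));
    if (example_table == NULL)
        return SLURM_ERROR;
    return SLURM_SUCCESS;
}

/* Called when the plugin is removed: release allocated storage. */
void fini(void)
{
    free(example_table);
    example_table = NULL;
}
```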

int topo_build_config (void)

Description: Generate topology information.

Returns: SLURM_SUCCESS on success, or SLURM_ERROR on failure.

bool topo_generate_node_ranking(void)

Description: Determine if this plugin will reorder the node records based upon each job's node rank field.

Returns: true if node reordering is supported, false otherwise.
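Since every API function must appear even when it has nothing to do, a plugin with no topology information to build could stub these two functions roughly as follows. This is a sketch under that assumption, with stand-in definitions for the Slurm return codes.

```c
#include <stdbool.h>

/* Stand-ins for the Slurm header values so the sketch is self-contained. */
#ifndef SLURM_SUCCESS
#define SLURM_SUCCESS 0
#define SLURM_ERROR  -1
#endif

/* Nothing to build in this sketch, so report success immediately. */
int topo_build_config(void)
{
    return SLURM_SUCCESS;
}

/* This sketch does not reorder node records. */
bool topo_generate_node_ranking(void)
{
    return false;
}
```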

int topo_get_node_addr (char* node_name, char** paddr, char** ppatt)

Description: Get the topology address of a given node.

Arguments:
node_name (input) name of the targeted node
paddr (output) returns the topology address of the node and connected switches. If there are multiple switches at some level in the hierarchy, they will be represented using Slurm's hostlist expression (e.g. "s0" and "s1" are reported as "s[0-1]"). Each level in the hierarchy is separated by a period. The last element will always be the node's name (e.g. "s0.s10.nodename").
ppatt (output) returns the pattern of the topology address. Each level in the hierarchy is separated by a period. The final element will always be "node" (e.g. "switch.switch.node").

Returns: SLURM_SUCCESS on success, or SLURM_ERROR on failure.
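To make the address and pattern formats concrete, here is a sketch of topo_get_node_addr for a fixed two-level switch hierarchy. The example_paths table, its node names, and its switch names are hypothetical. The sketch allocates the output strings with malloc and assumes the caller frees them; a real plugin would more likely use Slurm's own allocation helpers and derive the hierarchy from the table built by topo_build_config.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the Slurm header values so the sketch is self-contained. */
#ifndef SLURM_SUCCESS
#define SLURM_SUCCESS 0
#define SLURM_ERROR  -1
#endif

/* Hypothetical static topology: node -> leaf switch -> top switch. */
struct example_node_path {
    const char *node;
    const char *top_switch;
    const char *leaf_switch;
};

static const struct example_node_path example_paths[] = {
    { "tux1", "s0", "s10" },
    { "tux2", "s0", "s11" },
};

int topo_get_node_addr(char *node_name, char **paddr, char **ppatt)
{
    static const char pattern[] = "switch.switch.node";
    size_t n = sizeof(example_paths) / sizeof(example_paths[0]);
    size_t i;

    if (node_name == NULL || paddr == NULL || ppatt == NULL)
        return SLURM_ERROR;

    for (i = 0; i < n; i++) {
        const struct example_node_path *p = &example_paths[i];
        size_t len;

        if (strcmp(p->node, node_name) != 0)
            continue;

        /* Two separating periods plus the terminating NUL. */
        len = strlen(p->top_switch) + strlen(p->leaf_switch) +
              strlen(node_name) + 3;
        *paddr = malloc(len);
        *ppatt = malloc(sizeof(pattern));
        if (*paddr == NULL || *ppatt == NULL) {
            free(*paddr);
            free(*ppatt);
            *paddr = NULL;
            *ppatt = NULL;
            return SLURM_ERROR;
        }
        /* Address: one element per level, node name last. */
        snprintf(*paddr, len, "%s.%s.%s",
                 p->top_switch, p->leaf_switch, node_name);
        /* Pattern: same number of levels, ending in "node". */
        memcpy(*ppatt, pattern, sizeof(pattern));
        return SLURM_SUCCESS;
    }

    /* Unknown node: outputs are left untouched. */
    return SLURM_ERROR;
}
```

With this table, a lookup of "tux1" yields the address "s0.s10.tux1" and the pattern "switch.switch.node"; an unknown node name yields SLURM_ERROR.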

Last modified 27 March 2015