
Information Lifecycle Management (ILM) via the GPFS policy engine

The GPFS policy engine is described in detail in this white paper, and a good presentation-level overview of the policy file is available here. The relevant reference documentation is available from IBM.

This project focuses on the scheduled execution of lifecycle policies that gather and process data about file system objects and issue actions against those objects based on policy.

Applying Policies

At its base, applying a policy to a fileset is done through the mmapplypolicy command. This repo contains wrapper scripts that call that command with a specified policy file on a given fileset; each wrapper offers a different level of functionality aimed at a different group of users in RC. All scripts are stored in src/run-policy.

  • run-mmpol: the main script that calls mmapplypolicy. Generally not invoked on its own
  • submit-pol-job: general wrapper that sets up the Slurm job run-mmpol executes in. Admins can execute a policy run from this level using any policy file they have defined
  • run-submit-pol-job.py: a Python wrapper for submit-pol-job meant specifically for running list policy jobs. This wrapper can be run by specific non-admins who have been given sudo permissions on this file alone. It can only run one of two policies: list-path-external and list-path-dirplus.

The production versions of these scripts are kept in /data/rc/list-gpfs-dirs. Admins can run any of these scripts from anywhere, but non-admins are only granted sudo privileges on the run-submit-pol-job.py file in that directory.

Note: the policy run is directed to specific nodes by way of arguments to mmapplypolicy. The command does not technically run inside the job reservation, so the resource constraints are imperfect; the goal is simply to use the scheduler to ensure the policy run does not conflict with existing resource allocations on the cluster.
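
To illustrate how the pieces fit together, the sketch below shows roughly what a run looks like once the wrapper hands off to Slurm: the job is sized with the same defaults the wrappers use, and run-mmpol then calls mmapplypolicy against the target device. This is an assumption-laden sketch, not a copy of the actual scripts; the fileset, node list, and output prefix are placeholders.

# Hypothetical sketch only: submit the policy runner as a Slurm job sized
# with the wrappers' default resources
sbatch --nodes=1 --cpus-per-task=16 --partition=amd-hdr100,medium \
       --time=24:00:00 --mem-per-cpu=8G run-mmpol

# Inside the job, run-mmpol invokes mmapplypolicy along these lines: -N steers
# the scan onto specific nodes, and -I defer with -f makes the run produce
# list output rather than executing actions (placeholders throughout)
mmapplypolicy /data/user/example-fileset -P list-path-external \
    -N node001,node002 -I defer -f /data/rc/gpfs-policy/data/list-policy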

List Policies (non-admin)

A list policy can be executed with run-submit-pol-job.py using the following command:

sudo run-submit-pol-job.py [-h] [-o OUTDIR] [-f LOG_PREFIX] [--with-dirs]
                           [-N NODES] [-c CORES] [-p PARTITION] [-t TIME]
                           [-m MEM_PER_CPU]
                           device
  • outdir: specifies the directory the output log should be saved to. Defaults to /data/rc/gpfs-policy/data
  • log-prefix: string the name of the policy output will begin with. Metadata containing the policy file name, Slurm job ID, and run time will be appended to this prefix. Defaults to list-policy_<device>. See below for device
    • Note: this is currently non-functional
  • --with-dirs: changes the policy file from list-path-external to list-path-dirplus. The only difference is that directories are included in the policy output.
  • device: the fileset or directory to apply the policy to.

All other arguments are Slurm directives dictating resource requests. The default parameters are as follows:

  • nodes: 1
  • cores: 16
  • partition: amd-hdr100, medium
  • time: 24:00:00
  • mem-per-cpu: 8G

This script was written against the default python3 interpreter on Cheaha (version 3.6.8), so no environment needs to be active. Running it in an environment with a newer Python version may cause unintended errors and effects.
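
As a concrete illustration (the fileset path below is a placeholder), a non-admin run that includes directories in the output and keeps the default output directory and Slurm resources might look like:

# Hypothetical example: list all objects, including directories, under a fileset
sudo /data/rc/list-gpfs-dirs/run-submit-pol-job.py --with-dirs \
     -o /data/rc/gpfs-policy/data \
     /data/user/example-fileset

Omitting --with-dirs would run list-path-external instead, leaving directories out of the output.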

Run Any Policy (admins)

Any defined policy file can be run using submit-pol-job as follows:

sudo ./submit-pol-job [ -h ] [ -o | --outdir ] [ -f | --outfile ] [ -P | --policy ] 
                      [ -N | --nodes ] [ -c | --cores ] [ -p | --partition ] 
                      [ -t | --time ] [ -m | --mem ]
                      device

The only difference here is that a path to the policy file can be specified using -P or --policy. All other arguments are the same and have the same defaults as above.
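
For example, an admin run against a custom policy file (the policy and fileset paths below are placeholders) might look like:

# Hypothetical example: run a custom policy with a shorter walltime,
# keeping the other resource defaults
sudo ./submit-pol-job -P /path/to/custom-policy.pol \
     -o /data/rc/gpfs-policy/data -t 12:00:00 \
     /data/user/example-fileset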

Output