Matthew K Defenderfer authored 526b4b8d
Information Lifecycle Management (ILM) via GPFS policy engine
The GPFS policy engine is well described in this white paper. A good presentation overview of the policy file is here. The relevant documentation is available from IBM.
This project focuses on scheduled execution of lifecycle policies to gather and process data about file system objects and issue actions against those objects based on policy.
Applying Policies
At the base level, a policy is applied to a fileset with the mmapplypolicy command. This repo contains wrapper scripts that call that command with a specified policy file on a given fileset. Each wrapper offers a different level of functionality, meant for a different group of users in RC. All scripts are stored in src/run-policy:
- run-mmpol: the main script that calls mmapplypolicy. Generally not invoked on its own.
- submit-pol-job: general wrapper that sets up the Slurm job run-mmpol executes in. Admins can execute a policy run from this level using any policy file they have defined.
- run-submit-pol-job.py: a Python wrapper for submit-pol-job meant specifically for running list policy jobs. This wrapper can be run by specific non-admins who have been given sudo permissions on this file alone. It can only run one of two policies: list-path-external and list-path-dirplus.
The production versions of these scripts are kept in /data/rc/list-gpfs-dirs. Admins can run any of these scripts from anywhere, but non-admins are only granted sudo privileges on the run-submit-pol-job.py file in that directory.
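The restriction to two list policies can be pictured with a small allow-list check. The sketch below is illustrative only and is not the code in run-submit-pol-job.py: the option name --policy, the positional target argument, the --policy-file flag passed to submit-pol-job, and the example path are all assumptions made for this example.

```python
import argparse

# The two list policies named above; any other value is rejected.
ALLOWED_POLICIES = ("list-path-external", "list-path-dirplus")


def parse_args(argv):
    """Parse wrapper arguments, refusing any policy outside the allow-list."""
    parser = argparse.ArgumentParser(description="Submit a GPFS list policy job")
    parser.add_argument("target", help="directory or fileset path to scan")
    # Hypothetical option name. argparse exits with an error for other values.
    parser.add_argument(
        "--policy", choices=ALLOWED_POLICIES, default=ALLOWED_POLICIES[0]
    )
    return parser.parse_args(argv)


def build_command(args):
    """Return an argv for submit-pol-job (flag names are assumed, not the repo's)."""
    return ["submit-pol-job", "--policy-file", args.policy, args.target]


if __name__ == "__main__":
    # Hypothetical target path, used only to show the resulting command.
    demo = parse_args(["/data/project/example", "--policy", "list-path-dirplus"])
    print(" ".join(build_command(demo)))
```

Keeping the allow-list inside the one file that non-admins may run with sudo means the set of permitted policies cannot be widened from the command line.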
Note: The policy run is directed to specific nodes by way of arguments to mmapplypolicy. The command is technically not run inside the job reservation, so the resource constraints are imperfect. The goal is to use the scheduler to ensure the policy run does not conflict with existing resource allocations on the cluster.
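As a rough illustration of the note above, the following sketch builds an mmapplypolicy command line that names the nodes to use. It relies only on the documented -P (policy file) and -N (node list) options. The function name, the example path, and the host names are hypothetical, and the real run-mmpol script may pass additional or different arguments. Inside a Slurm job, the host list could be obtained with scontrol show hostnames "$SLURM_JOB_NODELIST".

```python
def build_mmapplypolicy_cmd(target, policy_file, nodes):
    """Build an mmapplypolicy argv that directs the scan to the given nodes.

    target: file system device or directory to scan
    policy_file: path to the policy rules file
    nodes: list of host names, e.g. the nodes allocated to the Slurm job
    """
    if not nodes:
        raise ValueError("at least one node is required")
    return [
        "mmapplypolicy", target,
        "-P", policy_file,       # policy rules to apply
        "-N", ",".join(nodes),   # comma-separated nodes that perform the work
    ]


if __name__ == "__main__":
    # Hypothetical path, policy file, and host names, for illustration only.
    cmd = build_mmapplypolicy_cmd(
        "/data/project/example", "list-path-external", ["c0001", "c0002"]
    )
    print(" ".join(cmd))
```

Because mmapplypolicy dispatches its own work to the nodes named with -N, Slurm's limits on the job do not bind that work directly, which is why the note calls the resource constraints imperfect.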
List Policies (non-admin)
A list policy can be executed with run-submit-pol-job.py using the following command: