@@ -7,57 +7,69 @@ The relevant [documentation is available from IBM](https://www.ibm.com/docs/en/s
This project focuses on scheduled execution of lifecycle policies to gather and process data about
file system objects and issue actions against those objects based on policy.
## Running a policy
## Applying Policies
A policy is executed in the context of a Slurm batch job reservation using the `submit-pol-job` script:
At a base level, a policy is applied to filesets with the `mmapplypolicy` command. This repo contains wrapper scripts that call that command with a specified policy file on a given fileset. Each wrapper offers a different level of functionality, meant for a different group of users in RC. All scripts are stored in `src/run-policy`.
- **outdir** - the directory for the output files; should be global to the cluster (e.g. /scratch of the user running the job)
- `run-mmpol`: the main script that calls `mmapplypolicy`. Generally not invoked on its own.
- **policy** - path to the GPFS policy to execute (e.g. in ./policy directory)
- `submit-pol-job`: general wrapper that sets up the Slurm job that `run-mmpol` executes in. Admins can execute a policy run from this level using any policy file they have defined.
- **nodecount** - number of nodes in the cluster that will run the policy
- `run-submit-pol-job.py`: a Python wrapper for `submit-pol-job` meant specifically for running list policy jobs. This wrapper can be run by specific non-admins who have been given `sudo` permissions on this file alone. It can only run one of two policies: `list-path-external` and `list-path-dirplus`.
- **corespernode** - number of cores on each node to reserve
- **ram** - RAM per core, can use "G" for gigabytes
- **partition** - the partition to submit the job to
- **time** - the time in minutes to reserve for the job
Note: the resource reservation is imperfect. The job wrapper calls a script `run-mmpol.sh` which is responsible for executing the `mmapplypolicy` command.
The production versions of these scripts are kept in `/data/rc/list-gpfs-dirs`. Admins can run any of these scripts from anywhere, but non-admins are only granted `sudo` privileges on the `run-submit-pol-job.py` file in that directory.
The command is aligned to run on specific nodes by way of arguments to mmapplypolicy. The command is technically not run inside of the job reservation so the resource constraints are imperfect. The goal is to use the scheduler to ensure the policy run does not conflict with existing resource allocations on the cluster.
Note: The command is aligned to run on specific nodes by way of arguments to mmapplypolicy. The command is technically not run inside of the job reservation so the resource constraints are imperfect. The goal is to use the scheduler to ensure the policy run does not conflict with existing resource allocations on the cluster.
## Running the policy "list-policy-external"
### List Policies (non-admin)
The list-policy-external policy provides an efficient tool to gather file stat data into a URL-encoded
A list policy can be executed with `run-submit-pol-job.py` using the following command:
ASCII text file. The output file can then be processed downstream to create reports on storage
- `outdir`: specifies the directory the output log should be saved to. Defaults to `/data/rc/gpfs-policy/data`
submit-pol-job /path/to/output/dir \
- `log-prefix`: string to begin the name of the policy output with. Metadata containing the policy file name, Slurm job ID, and time run will be appended to this prefix. Defaults to `list-policy_<device>`. See below for `device`
/absolute/path/policy/list-path-external \
- **Note: this is currently non-functional**
4 24 4G partition_name \
- `--with-dirs`: changes the policy file from `list-path-external` to `list-path-dirplus`. The only difference is that directories are included in the policy output.
/path/to/listed/dir \
- `device`: the fileset or directory to apply the policy to.
180
All other arguments are Slurm directives dictating resource requests. The default parameters are as follows:
- `nodes`: 1
- `cores`: 16
- `partition`: `amd-hdr100, medium`
- `time`: `24:00:00`
- `mem-per-cpu`: `8G`
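
As an illustration of how these defaults could be turned into Slurm options, here is a minimal sketch. It is not the code in `run-submit-pol-job.py`. The default values come from the list above, but the mapping to `sbatch` flags is an assumption (in particular, treating `cores` as `--ntasks-per-node`), and `sbatch_args` is a hypothetical helper name.

```python
# Default resource requests, taken from the list above.
DEFAULTS = {
    "nodes": "1",
    "cores": "16",
    "partition": "amd-hdr100,medium",
    "time": "24:00:00",
    "mem-per-cpu": "8G",
}

# ASSUMPTION: how each name maps to an sbatch option. The flags themselves
# are real sbatch options, but the wrapper may map them differently.
SBATCH_FLAGS = {
    "nodes": "--nodes",
    "cores": "--ntasks-per-node",
    "partition": "--partition",
    "time": "--time",
    "mem-per-cpu": "--mem-per-cpu",
}


def sbatch_args(overrides=None):
    """Merge overrides into the defaults and return a list of sbatch options."""
    settings = {**DEFAULTS, **(overrides or {})}
    unknown = set(settings) - set(SBATCH_FLAGS)
    if unknown:
        raise KeyError(f"unknown resource option(s): {sorted(unknown)}")
    return [f"{SBATCH_FLAGS[name]}={settings[name]}" for name in DEFAULTS]


if __name__ == "__main__":
    # Example: keep every default except the time limit.
    print(" ".join(sbatch_args({"time": "02:00:00"})))
```

Keeping the defaults in one dictionary makes it easy to see which values a user has overridden for a given run.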
### Run Any Policies (admins)
Any defined policy file can be run with `submit-pol-job` by running the following:
The only difference here is that a path to the policy file can be specified using `-P` or `--policy`. All other arguments are the same and have the same defaults.
- the `submit-pol-job` script may need a `./` prefix if it is not in your path.
- use absolute paths for all directory arguments to avoid potential confusion
### Output
- make sure the output dir has sufficient space to hold the resulting file listing (it could be hundreds of gigabytes for a large collection of files)
The list-policy-external policy provides an efficient tool to gather file stat data into a URL-encoded
ASCII text file. The output file can then be processed downstream to create reports on storage
patterns and use. Make sure the output dir has sufficient space to hold the resulting file listing. It could be hundreds of gigabytes for a large collection of files.
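
The free-space check can be done before submitting a job. The sketch below uses Python's standard `shutil.disk_usage`. The helper name `has_free_space` and the 500 GB figure are illustrative assumptions, and the value reported for the file system may not reflect any per-fileset or per-user quota that applies to the output directory.

```python
import shutil


def has_free_space(outdir, required_bytes):
    """Return True if the file system holding outdir reports enough free space."""
    return shutil.disk_usage(outdir).free >= required_bytes


if __name__ == "__main__":
    # ASSUMPTION: 500 GB is only an example threshold, not a project requirement.
    needed = 500 * 10**9
    outdir = "."  # replace with the intended output directory
    if not has_free_space(outdir, needed):
        print(f"warning: less than {needed} bytes free under {outdir}")
```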
The Slurm job output file will be local to the directory from which this command is executed. It can be watched to observe progress in the generation of the file list. A listing of hundreds of millions of files may take a couple of hours to generate and consume several hundred gigabytes for the output file.
The output file in `/path/to/output/dir` is named as follows:
#### List Policy Specific Outputs
- a prefix of "list-${SLURM_JOBID}"
- ".list" for the name of the policy rule type of "list"
- a tag for the list name defined in the policy file, "list-gather" for the `list-path-external` policy
The output file contains one line per file object stored under the `/path/to/listed/dir`. No directories or non-file objects are included in this listing. Each entry is a space-separated set of file attributes selected by the SHOW command in the LIST rule. Entries are encoded according to RFC 3986 URI percent encoding. This means all spaces and special characters will be encoded, making it easy to split lines into fields using the space separator.
The raw output file for list policies in `outdir` will be named `list-<jobid>.list.gather-info`.
The output file is an unsorted list of files in uncompressed ASCII. Further processing is desirable to use less space for storage and provide organized collections of data.
The output file contains one line per file object stored under the `device`. No directories or non-file objects are included in this listing unless the `list-path-dirplus` policy is used. Each entry is a space-separated set of file attributes selected by the SHOW command in the LIST rule. Entries are encoded according to RFC 3986 URI percent encoding. This means all spaces and special characters will be encoded, making it easy to split lines into fields using the space separator.
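
Because every field is percent-encoded, a line can be split on spaces first and decoded afterwards. The following is a minimal sketch of that approach using Python's standard `urllib.parse.unquote`. The function name `parse_entry` and the sample line are made up for illustration. The actual number and order of fields depend on the SHOW clause of the policy in use and are not assumed here.

```python
from urllib.parse import unquote


def parse_entry(line):
    """Split one listing line on spaces, then percent-decode each field."""
    return [unquote(field) for field in line.rstrip("\n").split(" ")]


if __name__ == "__main__":
    # Hypothetical two-field line: a size and a path containing a space.
    sample = "4096 /data/my%20file.txt\n"
    print(parse_entry(sample))
```

Splitting before decoding matters: decoding first would turn `%20` back into a space and break the field boundaries.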