Commit 517774cf authored by John-Paul Robinson

Add listcmd param to split script

This lets caller control the list of directories to split without
having to edit the script.

Added array task protections for lazy callers who don't care if
array size aligns with work tasks.
parent 612008ae
1 merge request: !52 Draft: Improvements to post processing workflow to make ad hoc use of scripts easier
@@ -13,7 +13,17 @@
 module load Anaconda3
 conda activate gpfs
-logs=($(find /data/rc/gpfs-policy/data -path "*/list-policy_data-user_list-path-external_slurm-31[35]*/raw/*.gz"))
+# listcmd env var sets the command to enumerate datasets to process
+# supports passing args during sbatch, e.g. listcmd="cat split-list" sbatch <thisscript>
+# note: maxdepth speeds execution of find by avoiding deep dirs
+listcmd=${listcmd:-find /data/rc/gpfs-policy/data -maxdepth 3 -path "*/list-policy_data-user_list-path-external_slurm-31[35]*/raw/*.gz"}
+logs=($($listcmd))
 log=${logs[${SLURM_ARRAY_TASK_ID}]}
-split-log --no-clobber ${log}
\ No newline at end of file
+# for lazy submit. only do work if there is work to do
+if [ ${SLURM_ARRAY_TASK_ID} -lt ${#logs[@]} ]
+then
+echo split-log --no-clobber ${log}
+fi
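
The two mechanisms in this change, a caller-overridable list command and a bounds check on the array task index, can be tried outside Slurm. The following is a minimal sketch under stated assumptions, not the script from this commit: the scratch directory, the file names, and the `pick_task` helper are hypothetical, and a plain argument stands in for `SLURM_ARRAY_TASK_ID`. It also adds `set -f` and `sort`, which the committed script does not use, so that the glob pattern reaches `find` unexpanded and the ordering is deterministic.

```shell
#!/usr/bin/env bash
# Sketch only: scratch files and the pick_task helper are hypothetical.
unset listcmd
workdir=$(mktemp -d)            # assumed to contain no whitespace
trap 'rm -rf "$workdir"' EXIT
touch "$workdir/a.gz" "$workdir/b.gz" "$workdir/c.gz"
printf '%s\n' "$workdir/a.gz" "$workdir/b.gz" > "$workdir/split-list"

# pick_task TASK_ID: print the dataset assigned to that array task,
# or print nothing when the index is past the end of the list.
pick_task() {
    local task_id=$1
    # The default applies only when the caller has not set listcmd.
    local cmd=${listcmd:-find "$workdir" -maxdepth 1 -name "*.gz"}
    local logs
    # $cmd is deliberately unquoted so the string is word-split into a
    # command and its arguments. set -f (an addition in this sketch)
    # keeps the shell from expanding *.gz before find sees it.
    logs=($(set -f; $cmd | sort))
    # Guard: extra array tasks simply have nothing to do.
    if [ "$task_id" -lt "${#logs[@]}" ]; then
        echo "${logs[$task_id]}"
    fi
}

pick_task 0                                        # first file from find
pick_task 5                                        # out of range, no output
listcmd="cat $workdir/split-list" pick_task 1      # caller-supplied list
```

Because the command string is word-split, paths containing whitespace would not survive either the default or an override. Note also that the committed script prefixes the `split-log` call with `echo`, so as written it prints the command for each in-range task rather than running it.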