diff --git a/README.md b/README.md
index 4e6a6c105080e4c561573e2edd8f7465fd1465cd..59783c7c8faa99ebfef070cf98c02b3507edd0b9 100644
--- a/README.md
+++ b/README.md
@@ -24,10 +24,10 @@ Note: The command is aligned to run on specific nodes by way of arguments to mma
 A list policy can be executed using `run-submit-pol-job.py` using the following command:
 
 ``` bash
-run-submit-pol-job.py [-h] [-o OUTDIR] [-f LOG_PREFIX] [--with-dirs]
-                      [-N NODES] [-c CORES] [-p PARTITION] [-t TIME]
-                      [-m MEM_PER_CPU]
-                      device
+sudo run-submit-pol-job.py [-h] [-o OUTDIR] [-f LOG_PREFIX] [--with-dirs]
+                           [-N NODES] [-c CORES] [-p PARTITION] [-t TIME]
+                           [-m MEM_PER_CPU]
+                           device
 ```
 
 - `outdir`: specifies the directory the output log should be saved to. Defaults to `/data/rc/gpfs-policy/data`
@@ -125,6 +125,10 @@ All other options control the array job resources. Default values are as follows
 
 The default resources can parse 5 million line files in approximately 3 minutes so should cover all common use cases.
 
+Policies run on filesets in `/data/user`, `/data/project`, `/home`, or `/scratch` automatically have a "top-level directory" (`tld`) computed for each file and added to the parquet output. The `tld` is the directory immediately under one of those filesets. For example, a file with path `/data/project/datascienceteam/example.txt` will have `tld` set to `datascienceteam`.
+
+Any files in a directory outside those specified filesets will have `tld` set to `None`.
+
 ## Running reports
 
 ### Disk usage by top level directies