  • Notebook to convert policy run output to parquet data sets · 22776c4d
    John-Paul Robinson authored
    This is intended to be run on URL-encoded output lines from a
    GPFS list policy run.  It creates pandas structures that are
    then saved in parquet format for ease of downstream processing.
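
A minimal sketch of the conversion step, for illustration only. The line layout assumed here (three integer fields, optional SHOW text, then ` -- ` and a URL-encoded path) and the names `parse_policy_line` and `lines_to_parquet` are assumptions, not taken from the notebook itself; the actual fields depend on the policy's SHOW clause.

```python
from urllib.parse import unquote


def parse_policy_line(line):
    """Split one list-policy output line into a dict of decoded fields.

    Assumed layout (hypothetical, adjust to the real policy output):
        <inode> <generation> <snapid> <SHOW text...> -- <url-encoded path>
    """
    head, sep, path = line.rstrip("\n").partition(" -- ")
    if not sep:
        raise ValueError(f"no ' -- ' separator in line: {line!r}")
    inode, generation, snapid, *show = head.split()
    return {
        "inode": int(inode),
        "generation": int(generation),
        "snapid": int(snapid),
        "show": " ".join(show),   # raw SHOW text, left unparsed here
        "path": unquote(path),    # undo the URL encoding of the path
    }


def lines_to_parquet(lines, out_path):
    """Build a pandas DataFrame from the parsed lines and save it as parquet."""
    # Imported lazily; DataFrame.to_parquet also needs pyarrow or fastparquet.
    import pandas as pd

    df = pd.DataFrame([parse_policy_line(l) for l in lines if l.strip()])
    df.to_parquet(out_path, index=False)
    return df
```

Keeping the parsing separate from the pandas step makes the line format easy to check on a few sample lines before converting a full policy run.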
    
    Can be run in parallel across many inputs by wrapping it with papermill
    and having an upstream step split the input file.
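
The parallel wrapping could look roughly like the sketch below. It assumes the input has already been split into chunk files, that the notebook is saved as `convert.ipynb`, and that it has a cell tagged `parameters` defining `input_file` and `output_file`; those names, and the helpers `build_jobs`, `run_one` and `run_all`, are hypothetical. `papermill.execute_notebook` is papermill's own entry point for running a notebook with injected parameters.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def build_jobs(chunks, out_dir):
    """Pair each input chunk with an executed-notebook path and a parquet path."""
    out_dir = Path(out_dir)
    return [
        (
            str(chunk),
            str(out_dir / (Path(chunk).name + ".ipynb")),
            str(out_dir / (Path(chunk).name + ".parquet")),
        )
        for chunk in chunks
    ]


def run_one(job, notebook="convert.ipynb"):
    """Execute the notebook once for a single chunk."""
    import papermill as pm  # imported lazily so build_jobs works without it

    chunk, out_notebook, out_parquet = job
    pm.execute_notebook(
        notebook,
        out_notebook,
        # Hypothetical parameter names; they must match the notebook's
        # cell tagged "parameters".
        parameters={"input_file": chunk, "output_file": out_parquet},
    )
    return out_parquet


def run_all(jobs, workers=4):
    """Run one notebook execution per chunk across a pool of processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, jobs))
```

Each chunk gets its own executed notebook copy, which keeps a per-input record of what ran alongside the parquet output.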