Add --no-clobber to hive conversion tools

Added a --no-clobber option to convert_flat_to_hive and hivize. The option operates per tld + acq combination: for each combination in the dataframe, it looks up the hive directory cell that holds data for that tld and acq. If that cell directory already contains any parquet files, all rows with that tld + acq combination are removed from the dataframe. If the dataframe is empty afterward, the tool exits without writing any data. Otherwise, it writes the remaining rows to the hive dataset as usual.
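
The filtering step described above might look roughly like the sketch below. It is illustrative only and not the code from this merge request: the function name `drop_existing_cells`, the use of pandas, and the `tld=<value>/acq=<value>` directory layout are all assumptions.

```python
from pathlib import Path

import pandas as pd


def drop_existing_cells(df: pd.DataFrame, hive_root: Path) -> pd.DataFrame:
    """Return df without rows whose tld+acq cell already holds parquet files.

    Hypothetical helper: assumes cells are laid out as
    <hive_root>/tld=<tld>/acq=<acq>/, which may differ from the real tools.
    """
    keep = pd.Series(True, index=df.index)
    combos = df[["tld", "acq"]].drop_duplicates()
    for tld, acq in combos.itertuples(index=False, name=None):
        cell = hive_root / f"tld={tld}" / f"acq={acq}"
        # Any parquet file in the cell marks the whole combination as present.
        if cell.is_dir() and any(cell.glob("*.parquet")):
            keep &= ~((df["tld"] == tld) & (df["acq"] == acq))
    return df[keep]
```

A caller would then check `remaining.empty` and return early without writing, or pass `remaining` on to the usual hive writer.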
