Patch v0.5.1
This is a somewhat larger patch that fixes, adds, and refactors several things.
Major Changes
- Added an initial testing framework with pytest, with tests for all functions in the `utils` module. Tests for the other modules are left for later.
- Added tld grouping back to the hive pipeline.
- `convert-to-parquet` uses the Rust backend instead of pyarrow for parquet creation. This allows us to export full statistics and use them in parquet scans in other pipelines.
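
To illustrate why full statistics matter for scans, here is a minimal sketch of min/max pruning over row groups. It does not use this project's code or any parquet library: `RowGroupStats` and `row_groups_to_read` are hypothetical names, and the statistics are made-up values standing in for what a parquet file's metadata would provide.

```python
from dataclasses import dataclass


@dataclass
class RowGroupStats:
    """Min/max statistics for one column in one row group (illustrative only)."""
    index: int
    min_value: int
    max_value: int


def row_groups_to_read(stats, lower, upper):
    """Return indices of row groups whose [min, max] range overlaps [lower, upper]."""
    return [
        s.index
        for s in stats
        if s.max_value >= lower and s.min_value <= upper
    ]


# Made-up statistics for three row groups of a single integer column.
stats = [
    RowGroupStats(index=0, min_value=0, max_value=99),
    RowGroupStats(index=1, min_value=100, max_value=199),
    RowGroupStats(index=2, min_value=200, max_value=299),
]

# A filter such as 120 <= value <= 180 only needs row group 1;
# the other row groups can be skipped without reading their data.
print(row_groups_to_read(stats, 120, 180))
```

Without min/max statistics in the file, a reader has no basis for skipping row groups and has to scan all of them.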
Bugfixes
- Naked integers passed as breakpoints to `prep_age_distribution` are now handled properly.
- Naked timedeltas passed to `create_timedelta_breakpoints` are now handled properly.
- Timedelta values less than or equal to zero are now removed in both `create_timedelta_labels` and `create_timedelta_breakpoints`.
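
The behavior described by the timedelta fixes can be sketched roughly as follows. This is not the project's implementation: `normalize_breakpoints` is a hypothetical helper, and it assumes "naked" means a single value passed without a surrounding list.

```python
from datetime import timedelta


def normalize_breakpoints(breakpoints):
    """Hypothetical helper: accept one timedelta or an iterable of timedeltas.

    A naked timedelta is wrapped in a list, and values less than or
    equal to zero are dropped. The input order is preserved.
    """
    if isinstance(breakpoints, timedelta):
        breakpoints = [breakpoints]
    return [bp for bp in breakpoints if bp > timedelta(0)]


# A naked timedelta is treated like a one-element list.
print(normalize_breakpoints(timedelta(days=7)))

# Zero and negative values are removed.
print(normalize_breakpoints([timedelta(0), timedelta(days=-1), timedelta(hours=1)]))
```

The same wrapping idea would apply to a naked integer breakpoint, with `int` checked in place of `timedelta`.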