rockfish compact
Compact and merge Parquet files for storage efficiency.
Overview
The compact command merges multiple small Parquet files from recent days into larger, optimized files and removes data older than the configured retention period. This reduces file count, improves query performance, and manages disk usage.
Usage
rockfish compact [OPTIONS]
Options
| Option | Description | Default |
|---|---|---|
| -d, --data-dir | Parquet data directory | required |
| --sensor | Sensor name for partitioning | — |
| --hive | Use hive-style date partitioning | false |
| --retention | Data retention period | 30d |
| --dry-run | Preview changes without modifying files | false |
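The `--retention` values on this page (`30d`, `90d`) are a day count followed by `d`. As a rough illustration of what such a value means, the Python sketch below turns it into a cutoff date. `retention_cutoff` is a hypothetical helper and not part of rockfish; it handles only the `<N>d` form because that is the only one shown here, and how rockfish treats the boundary day itself is not documented on this page.

```python
import re
from datetime import date, timedelta


def retention_cutoff(retention: str, today: date) -> date:
    """Return the date that lies `retention` days before `today`.

    Hypothetical helper: only the '<N>d' form shown on this page is accepted.
    """
    match = re.fullmatch(r"(\d+)d", retention.strip())
    if match is None:
        raise ValueError(f"unsupported retention value: {retention!r}")
    return today - timedelta(days=int(match.group(1)))


# With the default 30d retention, a run on 2024-03-31 gives a cutoff of 2024-03-01.
print(retention_cutoff("30d", date(2024, 3, 31)))
```

Under this reading, partitions dated before the cutoff are the ones a run would be expected to remove.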
Examples
# Compact and prune with 30-day retention
rockfish compact -d /data --sensor prod-01 --hive --retention 30d
# Preview what would be compacted
rockfish compact -d /data --sensor prod-01 --hive --dry-run
# 90-day retention
rockfish compact -d /data --sensor prod-01 --hive --retention 90d
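For scheduled jobs it can be useful to run the `--dry-run` preview before the real compaction. The Python sketch below shows one way to do that using only the flags listed in the options table. `compact_command` and `preview_then_compact` are hypothetical helper names, and the sketch assumes that `rockfish` is on `PATH` and exits with a non-zero status on failure, which this page does not confirm.

```python
import subprocess


def compact_command(data_dir, sensor, retention="30d", hive=True, dry_run=False):
    """Build the argument list for one `rockfish compact` invocation."""
    argv = ["rockfish", "compact", "-d", data_dir, "--sensor", sensor]
    if hive:
        argv.append("--hive")
    argv += ["--retention", retention]
    if dry_run:
        argv.append("--dry-run")
    return argv


def preview_then_compact(data_dir, sensor, retention="30d"):
    """Run the dry-run preview first, then the real compaction.

    check=True raises CalledProcessError if either step exits non-zero,
    so a failed preview stops the real run (assumed exit-code behavior).
    """
    subprocess.run(compact_command(data_dir, sensor, retention, dry_run=True), check=True)
    subprocess.run(compact_command(data_dir, sensor, retention), check=True)


# Only build and print the command here; nothing is executed.
print(" ".join(compact_command("/data", "prod-01", "90d", dry_run=True)))
```

Passing the arguments as a list avoids shell quoting issues if the data directory or sensor name contains spaces.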
How It Works
- Scans partitioned Parquet directories for small files
- Merges files from the same partition into larger, optimized files
- Removes partitions older than the retention period
- Preserves all data and metadata in the files it merges; only partitions older than the retention period are deleted
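The steps above can be sketched as a small planning function. This is an illustration in Python under stated assumptions, not rockfish's actual implementation: the `ParquetFile` record, the `plan_compaction` name, the 64 MiB "small file" threshold, and the `date=YYYY-MM-DD` paths in the example are all hypothetical. The sketch only decides what to merge and what to prune; it does not read or write Parquet data.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ParquetFile:
    path: str
    day: date          # date of the partition the file belongs to
    size_bytes: int


def plan_compaction(files, cutoff, small_bytes=64 * 1024 * 1024):
    """Return (partitions to prune, small files to merge per partition)."""
    prune = set()
    small_by_day = defaultdict(list)
    for f in files:
        if f.day < cutoff:
            prune.add(f.day)                      # partition is past retention
        elif f.size_bytes < small_bytes:
            small_by_day[f.day].append(f.path)    # candidate for merging
    # Merging is only worthwhile when a partition holds at least two small files.
    merge = {day: sorted(paths) for day, paths in small_by_day.items() if len(paths) > 1}
    return sorted(prune), merge


files = [
    ParquetFile("date=2024-01-15/a.parquet", date(2024, 1, 15), 1_000),
    ParquetFile("date=2024-03-30/a.parquet", date(2024, 3, 30), 2_000_000),
    ParquetFile("date=2024-03-30/b.parquet", date(2024, 3, 30), 3_000_000),
    ParquetFile("date=2024-03-31/a.parquet", date(2024, 3, 31), 500_000_000),
]
prune, merge = plan_compaction(files, cutoff=date(2024, 3, 1))
print("prune:", prune)
print("merge:", merge)
```

In this example the January partition falls before the cutoff and is marked for removal, the two small files from 30 March are grouped for merging, and the single large file from 31 March is left alone.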