MERA/JLD2 File Converter - Multithreaded
Overview
batch_convert_mera is a safe, multithreaded tool to re-save older Mera.jl data files in the current format. It features active safety-margin monitoring, intelligent thread management, and robust error handling for batch conversion of large datasets.
Current Mera (JLD2 0.6, with the bundled JLD2Lz4) reads older LZ4-compressed Mera files directly — loaddata/viewdata work on files written by earlier Mera versions with no extra steps (see Loading older Mera files). Convert when you want to remove reconstruction warnings, standardise a large archive on the current format, or speed up repeated loads of very old files.
Problem Description
JLD2 files created with older Mera/dependency versions can still load, but may print a reconstruction warning such as:
┌ Warning: saved type CodecLz4.LZ4FrameCompressor has field header::TranscodingStreams.Memory,
but workspace type has field header::Vector{UInt8}, and no applicable convert method exists; reconstructing
This comes from internal field-type changes in CodecLz4/TranscodingStreams between versions. The file still reads correctly (Mera reconstructs the type), but the reconstruction can mean:
- Performance Degradation: Slower file loading due to reconstruction overhead
- Data Integrity Concerns: Potential inconsistencies in reconstructed objects
- Memory Inefficiency: Higher memory usage during the reconstruction process
- Workflow Disruption: Constant warning messages during data analysis
Converting once re-writes the file cleanly in the current format and removes the warning.
Solution Architecture
Core Components
- Custom Type Converter: Extends JLD2's rconvert function to handle version mismatches
- Safety Margin Monitor: Real-time system resource monitoring with configurable thresholds
- Intelligent Threading: Dynamic thread count adjustment based on system constraints
- Progress Tracking: Thread-safe progress reporting with current file display
- Memory Management: Aggressive garbage collection and memory usage optimization
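As an illustration of the thread-safe progress tracking listed above, the following is a minimal sketch using only Julia's standard threading tools. The function name `process_with_progress` and its arguments are hypothetical and are not part of Mera; the actual converter may report progress differently.

```julia
# Sketch only: hypothetical helper, not Mera's implementation.
function process_with_progress(items::AbstractVector, work::Function)
    done = Threads.Atomic{Int}(0)   # counter that is safe to update from any thread
    print_lock = ReentrantLock()    # serialise printing so lines do not interleave
    total = length(items)
    Threads.@threads for item in items
        work(item)
        n = Threads.atomic_add!(done, 1) + 1   # atomic_add! returns the previous value
        lock(print_lock) do
            println("[", n, "/", total, "] finished: ", item)
        end
    end
    return done[]
end
```

The atomic counter avoids lost updates when several threads finish at the same time, and the lock keeps each progress line intact.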
Key Features
- Active Safety Monitoring: Continuous memory usage tracking with violation alerts
- Skip Existing Files: Prevents accidental overwriting of previously converted files
- Batch Range Processing: Convert specific output number ranges (e.g., 100-200)
- Configurable Parameters: All safety and performance settings are user-adjustable
- Comprehensive Reporting: Detailed conversion statistics and resource usage metrics
Installation and Dependencies
Required Packages
using Mera
Configuration Parameters
Default Constants
const DEFAULT_SAFETY_MARGIN = 0.8 # Use max 80% of system memory
const DEFAULT_MIN_THREADS = 1 # Minimum thread count
const DEFAULT_MAX_THREADS = 64 # Maximum thread count
Function Parameters
batch_convert_mera()
| Parameter | Type | Default | Description |
|---|---|---|---|
| input_dir | String | Required | Source directory containing old JLD2 files |
| output_dir | String | Required | Destination directory for converted files |
| start_output | Int | Required | Starting output number for conversion range |
| end_output | Int | Required | Ending output number for conversion range |
| requested_threads | Int | Threads.nthreads() | Desired number of conversion threads |
| safety_margin | Float64 | 0.8 | Maximum memory usage threshold (0.0-1.0) |
| min_threads | Int | 1 | Minimum allowable thread count |
| max_threads | Int | 64 | Maximum allowable thread count |
| skip_existing | Bool | true | Skip files that already exist in output directory |
| show_confirmation | Bool | true | Display user confirmation prompt before starting |
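How requested_threads, min_threads and max_threads interact is not spelled out here. One plausible reading is a simple clamp, sketched below. The helper `choose_thread_count` and its `available` keyword are assumptions made for illustration; the real selection logic may also take memory into account.

```julia
# Sketch only: hypothetical helper, the actual selection logic in Mera may differ.
function choose_thread_count(requested::Int; min_threads::Int=1, max_threads::Int=64,
                             available::Int=Threads.nthreads())
    min_threads <= max_threads ||
        throw(ArgumentError("min_threads must not exceed max_threads"))
    # Do not ask for more threads than Julia was started with,
    # then apply the user-supplied bounds.
    return clamp(min(requested, available), min_threads, max_threads)
end
```

For example, `choose_thread_count(16; max_threads=4, available=32)` yields 4 under these assumptions.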
Usage Examples
Basic Conversion
Convert a range of files with default safety settings:
results = batch_convert_mera(
"/data/old_simulations/",
"/data/converted_simulations/",
100, 200
)
Memory-Conscious Conversion
For large files or limited memory systems:
results = batch_convert_mera(
"/data/old_simulations/",
"/data/converted_simulations/",
100, 200;
requested_threads=2,
safety_margin=0.9, # Allow up to 90% memory usage
max_threads=4
)
High-Performance Conversion
For systems with abundant resources:
results = batch_convert_mera(
"/data/old_simulations/",
"/data/converted_simulations/",
100, 200;
requested_threads=16,
safety_margin=0.7, # Allow up to 70% memory usage
max_threads=32,
skip_existing=false # Force re-conversion of existing files
)
Interactive Mode
User-guided conversion with prompts:
interactive_mera_converter(
"/data/old_simulations/",
"/data/converted_simulations/";
safety_margin=0.85
)
Safety Margin Monitoring
How It Works
The safety margin system monitors real-time memory usage and compares it against a configurable threshold:
- Pre-conversion Check: Validates system state before starting
- Per-file Monitoring: Checks memory usage before and after each file load
- Periodic Monitoring: Regular checks every 3 files during batch processing
- Violation Handling: Automatic garbage collection and warning generation
- Final Reporting: Summary of violations and system state
Memory Usage Calculation
memory_usage_percent = (total_memory - available_memory) / total_memory * 100
safety_violation = memory_usage_percent > (safety_margin * 100)
Violation Response
When safety margin violations occur:
- Warning Generation: Immediate alert with current usage percentage
- Garbage Collection: Forced cleanup to free memory
- Brief Pause: 0.1-second delay to allow GC completion
- Violation Counting: Track total violations for reporting
- Progress Logging: Record which files triggered violations
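The usage calculation and the violation response can be put together as in the sketch below. It relies on `Sys.total_memory()` and `Sys.free_memory()` from Julia's standard library; treating free memory as "available" memory is an assumption, and the function names are illustrative rather than Mera's own.

```julia
# Sketch only: illustrative names, not Mera's implementation.
memory_usage_percent(total, available) = (total - available) / total * 100

is_violation(usage_percent, safety_margin) = usage_percent > safety_margin * 100

function check_safety_margin!(violations::Threads.Atomic{Int}, safety_margin::Float64;
                              label::AbstractString="")
    total = Float64(Sys.total_memory())
    available = Float64(Sys.free_memory())  # assumption: free memory stands in for "available"
    usage = memory_usage_percent(total, available)
    if is_violation(usage, safety_margin)
        shown = round(usage; digits=1)
        @warn "Safety margin exceeded" file=label usage_percent=shown
        GC.gc()                             # forced cleanup
        sleep(0.1)                          # brief pause to let the collection finish
        Threads.atomic_add!(violations, 1)  # count the violation for the final report
    end
    return usage
end
```

With the sample figures from this page, `memory_usage_percent(64.0, 58.2)` is about 9.1, well below an 80% limit.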
File Processing Logic
File Discovery and Filtering
The converter expects RAMSES-style filenames:
output_00100.jld2 # Output number: 100
output_00101.jld2 # Output number: 101
output_00102.jld2 # Output number: 102
Files are:
- Discovered: Scan input directory for .jld2 files
- Parsed: Extract output numbers using regex pattern
- Filtered: Select files within specified range
- Sorted: Process in numerical order
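The four steps above can be sketched as follows. The regular expression is an assumption based on the example filenames, and the helper names are hypothetical; the converter's own pattern may be more permissive.

```julia
# Sketch only: assumed filename pattern and hypothetical helper names.
function parse_output_number(filename::AbstractString)
    m = match(r"^output_(\d+)\.jld2$", filename)
    return m === nothing ? nothing : parse(Int, m.captures[1])
end

function select_files(filenames, start_output::Int, end_output::Int)
    selected = Tuple{Int,String}[]
    for name in filenames
        n = parse_output_number(name)
        # Keep only names that match the pattern and fall inside the range.
        if n !== nothing && start_output <= n <= end_output
            push!(selected, (n, String(name)))
        end
    end
    return sort!(selected; by=first)   # process in numerical order
end

# Discovery step: list the directory, then parse, filter and sort.
discover_files(input_dir, start_output, end_output) =
    select_files(readdir(input_dir), start_output, end_output)
```

Separating `select_files` from the directory scan makes the parsing and filtering easy to check without touching the file system.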
Skip Existing Logic
When skip_existing=true (default):
- Check if output file already exists
- If exists, increment skip counter and continue
- If not exists, proceed with conversion
- Report skipped files in final summary
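A minimal sketch of this skip logic, assuming the output file keeps the same name as the input file. The function `partition_existing` is hypothetical and only illustrates the decision, not the conversion itself.

```julia
# Sketch only: hypothetical helper; assumes output files keep the input file name.
function partition_existing(filenames, output_dir::AbstractString; skip_existing::Bool=true)
    to_convert = String[]
    skipped = String[]
    for name in filenames
        if skip_existing && isfile(joinpath(output_dir, name))
            push!(skipped, name)      # already converted: count it and move on
        else
            push!(to_convert, name)   # missing, or re-conversion was requested
        end
    end
    return to_convert, skipped
end
```

Because already converted files are skipped, rerunning the same call after an interruption only picks up the files that are still missing.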
Conversion Process
For each file:
- Safety Check: Verify memory usage within margin
- Type Mapping: Configure JLD2 to handle version mismatches
- Load Operation: Read data with custom type conversion
- Memory Check: Monitor usage after data loading
- Save Operation: Write converted data to output file
- Cleanup: Explicit memory cleanup and garbage collection
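The per-file sequence can be outlined as below. To keep the sketch self-contained, the load and save steps are passed in as functions; `load_fn`, `save_fn` and `check_fn` are placeholders for whatever the converter actually uses, and the type mapping is assumed to be configured inside the load step.

```julia
# Sketch only: load_fn, save_fn and check_fn are placeholders, not Mera functions.
function convert_one(input_path::AbstractString, output_path::AbstractString;
                     load_fn::Function, save_fn::Function,
                     check_fn::Function=() -> nothing)
    check_fn()                    # safety check before loading
    data = load_fn(input_path)    # load (type mapping assumed to happen in load_fn)
    check_fn()                    # memory check after loading
    save_fn(output_path, data)    # write the data in the current format
    data = nothing                # drop the reference so the GC can reclaim it
    GC.gc()                       # explicit cleanup
    return isfile(output_path)    # report whether the output now exists
end
```

Passing the I/O steps as arguments also makes it possible to exercise the control flow with ordinary files before pointing it at real simulation data.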
Error Handling and Recovery
Common Error Scenarios
- Out of Memory Errors
  - Detection: Catch OutOfMemoryError exceptions
  - Response: Immediate error logging and thread termination
  - Recovery: User advised to reduce thread count
- File Access Errors
  - Detection: File permission or corruption issues
  - Response: Log error and continue with next file
  - Recovery: Manual file verification recommended
- Safety Margin Violations
  - Detection: Memory usage exceeds threshold
  - Response: Warning generation and garbage collection
  - Recovery: Automatic with violation tracking
- Type Conversion Failures
  - Detection: JLD2 reconstruction errors
  - Response: Fallback to default compressor objects
  - Recovery: Automatic with warning log
Recovery Strategies
- Partial Failures: Continue processing remaining files
- Memory Pressure: Automatic garbage collection and thread reduction recommendations
- Interrupted Processing: Skip existing files allows resuming partial conversions
- Validation: Post-conversion file existence verification
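The "continue on partial failure, stop on out-of-memory" behaviour described in this section might look like the following single-threaded sketch. The function `convert_all` and the `convert_fn` argument are hypothetical; the real converter runs this logic across several threads.

```julia
# Sketch only: hypothetical, single-threaded illustration of the error handling.
function convert_all(paths, convert_fn::Function)
    succeeded = String[]
    failed = Tuple{String,String}[]   # (path, error message)
    for path in paths
        try
            convert_fn(path)
            push!(succeeded, path)
        catch err
            if err isa OutOfMemoryError
                @error "Out of memory; consider reducing the thread count" path
                push!(failed, (path, "OutOfMemoryError"))
                break                 # stop instead of risking further failures
            end
            # Any other error: log it and carry on with the next file.
            @warn "Conversion failed, continuing with next file" path exception=err
            push!(failed, (path, sprint(showerror, err)))
        end
    end
    return succeeded, failed
end
```

The returned `failed` list gives a starting point for the manual file verification recommended above.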
Sample Output and Interpretation
Successful Conversion with Safety Monitoring
================================================================================
Safe Multithreaded JLD2 Batch Converter with Safety Margin Monitoring
================================================================================
Input directory: /data/simulations/old/
Output directory: /data/simulations/converted/
Output range: 100 to 200
System Memory Information:
Total memory: 64.0 GB
Available memory: 58.2 GB
Current usage: 9.1%
Safety limit: 80.0%
✅ Current memory usage within safety margin
Requested threads: 8
Recommended thread count (with safety margin): 8
Files to be converted (101 total):
- output_00100.jld2 (output 100)
- output_00101.jld2 (output 101)
- output_00102.jld2 (output 102)
... and 98 more files
Files that will be skipped (already exist): 0
Proceed with conversion using 8 threads (safety margin: 80.0%)? (y/n): y
Starting multithreaded conversion with safety margin monitoring...
[67/101] Processing: output_00166.jld2: 66%|████████████████ | 67/101 [04:23<02:15, 1.5it/s]
⚠️ Safety margin exceeded during load of output_00145.jld2 (82.3%)
⚠️ Safety margin exceeded during load of output_00189.jld2 (84.7%)
================================================================================
Conversion Summary with Safety Margin Report
================================================================================
Files processed: 101
Successfully converted: 99
Failed conversions: 2
Skipped files: 0
Safety margin violations: 5
Total conversion time: 421.3 seconds
Average time per file: 4.17 seconds
Threads used: 8
Final memory usage: 15.2%
⚠️ SAFETY MARGIN VIOLATIONS DETECTED!
Consider using fewer threads or processing smaller batches for future conversions.
Conversion complete!
Interpreting Results
- Success Rate: 99/101 files (98% success rate)
- Safety Violations: 5 violations indicate memory pressure
- Performance: 4.17 seconds average per file with 8 threads
- Recommendations: Consider reducing to 6 threads for future batches
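These figures can be recomputed from the dictionary that batch_convert_mera returns. The sketch below assumes the keys "success", "failed", "conversion_time" and "safety_violations"; the helper `summarize` is hypothetical.

```julia
# Sketch only: hypothetical helper operating on the returned results dictionary.
function summarize(results::AbstractDict)
    attempted = results["success"] + results["failed"]
    success_rate = attempted == 0 ? 0.0 : 100 * results["success"] / attempted
    seconds_per_file = attempted == 0 ? 0.0 : results["conversion_time"] / attempted
    return (success_rate=success_rate,
            seconds_per_file=seconds_per_file,
            memory_pressure=results["safety_violations"] > 0)
end
```

With 99 successes, 2 failures and 421.3 seconds in total, this gives a success rate of about 98% and roughly 4.17 seconds per file, matching the sample summary.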
Return Dictionary Structure
results = Dict(
"success" => 99, # Successfully converted files
"failed" => 2, # Failed conversions
"skipped" => 0, # Already existing files skipped
"safety_violations" => 5, # Safety margin violations
"conversion_time" => 421.3, # Total time in seconds
"threads_used" => 8, # Actual threads used
"final_memory_usage_percent" => 15.2 # Final memory usage percentage
)
Troubleshooting Guide
High Memory Usage
Symptoms: Frequent safety margin violations, slow performance
Solutions:
- Reduce requested_threads to 2-4
- Increase safety_margin to 0.9
- Process smaller batches (e.g., 20-50 files at a time)
- Close other memory-intensive applications
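Processing smaller batches, as suggested above, can be scripted by splitting the output range into chunks. In this sketch `convert_in_batches` is a hypothetical wrapper and `convert_range` stands for any function that converts outputs `a` through `b`.

```julia
# Sketch only: hypothetical wrapper that splits an output range into chunks.
function convert_in_batches(convert_range::Function, start_output::Int, end_output::Int;
                            batch_size::Int=25)
    batch_size > 0 || throw(ArgumentError("batch_size must be positive"))
    results = []
    for chunk in Iterators.partition(start_output:end_output, batch_size)
        push!(results, convert_range(first(chunk), last(chunk)))
        GC.gc()   # release memory between batches
    end
    return results
end

# Possible use with the converter (paths are examples):
# convert_in_batches(100, 200; batch_size=25) do a, b
#     batch_convert_mera("/data/old_simulations/", "/data/converted_simulations/", a, b;
#                        show_confirmation=false)
# end
```

Each element of the returned vector is the result of one batch, so the per-batch summaries can be inspected or combined afterwards.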
Poor Performance
Symptoms: Low threading efficiency, long conversion times
Solutions:
- Verify SSD storage usage
- Check network storage configuration
- Adjust safety_margin (e.g., to 0.7) if memory allows
- Monitor system load during conversion
Conversion Failures
Symptoms: High failure rate, type conversion errors
Solutions:
- Verify input file integrity
- Check file permissions
- Update JLD2 and CodecLz4 packages
- Test with single-threaded conversion first
Integration with Mera.jl Workflows
Typical Workflow Integration
- Pre-analysis Conversion: Convert all data files before starting analysis
- Incremental Conversion: Convert new simulation outputs as they're generated
- Archive Maintenance: Batch convert older archived data periodically
- Collaborative Sharing: Provide converted files to team members
Best Practices
- Version Documentation: Keep record of conversion timestamps and software versions
- Backup Strategy: Maintain original files until conversion is verified
- Testing Protocol: Convert small batches first to verify system compatibility
- Resource Planning: Schedule conversions during off-peak system usage
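Before removing any originals, a quick existence check along the lines below can flag outputs that are missing or empty. The helper `missing_outputs` is hypothetical, it assumes converted files keep their input names, and it only checks presence and size, not contents.

```julia
# Sketch only: hypothetical check; verifies presence and non-zero size, not contents.
function missing_outputs(input_dir::AbstractString, output_dir::AbstractString)
    inputs = filter(name -> endswith(name, ".jld2"), readdir(input_dir))
    return filter(inputs) do name
        out = joinpath(output_dir, name)
        !isfile(out) || filesize(out) == 0
    end
end
```

An empty result means every input has a non-empty counterpart; loading a few converted files is still advisable before deleting the originals.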