# Multi-Threading API Reference

High-performance parallel computing with Mera.jl: leveraging multi-core processors for accelerated astrophysical data analysis.

This page is a comprehensive reference for Mera's parallel processing functions and threading control parameters.
## Quick Threading Reference

### Core Functions

```julia
# Threading setup and diagnostics
show_threading_info()                       # Display threading configuration

# Performance benchmarking
benchmark_projection_hydro(gas, [1,2,4,8])  # Test thread performance

# Progress tracking for long operations
tracker = create_progress_tracker(100)      # Create progress tracker
update_progress!(tracker, 50)               # Update to item 50 (50% complete)
complete_progress!(tracker)                 # Mark as finished
```
### Key Parameters

| Parameter | Type | Purpose | Default |
|---|---|---|---|
| `max_threads` | `Int` | Limit concurrent threads | `Threads.nthreads()` |
| `verbose_threads` | `Bool` | Show threading diagnostics | `false` |
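Both parameters are passed as keyword arguments. A minimal sketch of how they combine in a single call (the snapshot number, path, and variable list are illustrative, mirroring the examples later on this page):

```julia
using Mera

info = getinfo(400, "../data")           # illustrative snapshot number and path
gas  = gethydro(info; max_threads=4)     # cap the threads used while loading

# Cap compute threads and print per-thread diagnostics
proj = projection(gas, [:rho, :T]; max_threads=4, verbose_threads=true)
```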
## Threading Information Functions

### show_threading_info()

Display comprehensive threading configuration and recommendations.

Purpose:
- Check the current Julia threading setup
- Get performance recommendations
- Troubleshoot threading issues

Returns: Nothing (prints information)
Example:

```julia
using Mera

show_threading_info()

# Output:
# 🧵 JULIA THREADING INFORMATION
# ===============================
# Available threads: 8
# CPU cores: 8
# ✅ Multi-threading enabled
#
# 🚀 PERFORMANCE RECOMMENDATIONS
# ==============================
# Variable-based parallel processing:
#   • 2+ variables: Automatic variable-based parallelization
#   • Single variable: Optimized sequential processing
#   • Threading scales linearly with variable count
```
When to use:
- Before starting threading-intensive work
- When experiencing performance issues
- To verify Julia was started with threading enabled (a quick programmatic check is sketched below)
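If the report shows only one thread, Julia was not started with threading enabled. The thread count is fixed at startup, so the fix is to restart Julia with `julia -t 8` (or set the `JULIA_NUM_THREADS` environment variable before launching). A minimal sketch of a programmatic check:

```julia
using Base.Threads

# Threads are fixed at startup; this check can only detect the problem, not fix it.
if nthreads() == 1
    @warn "Julia is running single-threaded; restart with `julia -t 8` or set JULIA_NUM_THREADS."
else
    @info "Multi-threading enabled" threads = nthreads() logical_cpus = Sys.CPU_THREADS
end
```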
## Performance Benchmarking

### benchmark_projection_hydro(gas_data, thread_counts, n_runs=10, output_file="")

Benchmark projection performance across different thread counts to find optimal settings.

Arguments:
- `gas_data`: Hydro data object from `gethydro()`
- `thread_counts`: Vector of thread counts to test (e.g., `[1,2,4,8]`)
- `n_runs`: Number of benchmark runs per thread count (default: 10)
- `output_file`: Optional file to save results (default: auto-generated)

Returns: Performance results; also creates a benchmark report
Example:

```julia
using Mera

info = getinfo(400, "../data")
gas = gethydro(info; lmax=10)

# Test different thread counts
thread_counts = [1, 2, 4, 8, 16]
benchmark_projection_hydro(gas, thread_counts, 5)

# Output shows the optimal thread count for your system
```
Performance Analysis:
- Tests projection performance with multiple variables
- Identifies optimal `max_threads` values
- Detects resource bottlenecks and contention
- Provides specific recommendations for your hardware

Interpreting Results:
- Look for the thread count with the best performance-per-thread ratio
- Watch for performance degradation at high thread counts
- Consider I/O-bound vs. compute-bound characteristics, as illustrated in the sketch below
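A quick way to interpret the reported timings is to convert them into speedup and parallel efficiency. A minimal sketch, assuming you have collected the mean wall times per thread count (the `times` values below are placeholders, not real measurements):

```julia
thread_counts = [1, 2, 4, 8, 16]
times = [12.0, 6.4, 3.6, 2.3, 2.1]      # placeholder wall times in seconds

speedup    = times[1] ./ times           # relative to the single-thread run
efficiency = speedup ./ thread_counts    # 1.0 = perfect linear scaling

for (n, s, e) in zip(thread_counts, speedup, efficiency)
    println("$(n) threads: speedup = $(round(s, digits=2)), efficiency = $(round(e, digits=2))")
end
# Efficiency that drops well below ~0.5 usually signals I/O or memory-bandwidth limits.
```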
## Progress Tracking

For long-running multi-threaded operations, with optional Zulip notifications.
### create_progress_tracker(total_items; kwargs...)

Create a progress tracker for monitoring threaded operations.

Arguments:
- `total_items`: Total number of items to process

Keyword Arguments:
- `time_interval`: Seconds between time-based notifications (default: 300)
- `progress_interval`: Percentage step between progress notifications (default: 10)
- `task_name`: Descriptive name for the task (default: "Processing")
- `zulip_channel`: Zulip channel for notifications (default: "progress")
- `zulip_topic`: Zulip topic for notifications (default: "Task Progress")

Returns: Dictionary containing the tracker state
Example:
# Create tracker for 1000 snapshots
tracker = create_progress_tracker(1000;
task_name="Multi-snapshot analysis",
progress_interval=5) # Notify every 5%
### update_progress!(tracker, current_item, custom_message="")

Update the progress tracker with the current status.

Arguments:
- `tracker`: Tracker dictionary from `create_progress_tracker()`
- `current_item`: Current item number being processed
- `custom_message`: Optional custom status message

Returns: Nothing (updates the tracker and may send notifications)

Thread Safety: Safe to call from multiple threads
Example:

```julia
using Base.Threads

@threads for i in 1:1000
    # Process snapshot i
    analyze_snapshot(snapshots[i])

    # Update progress (thread-safe)
    update_progress!(tracker, i, "Processed snapshot $(snapshots[i])")
end
```
### complete_progress!(tracker, final_message=""; include_summary=true)

Mark progress tracking as complete and send the final notification.

Arguments:
- `tracker`: Tracker dictionary
- `final_message`: Custom completion message
- `include_summary`: Include timing summary (default: true)

Returns: Nothing (sends a completion notification)
Example:

```julia
complete_progress!(tracker, "Multi-snapshot analysis completed successfully")
```
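Putting the three calls together, a minimal end-to-end sketch of the tracker lifecycle (the `analyze_snapshot` helper and the `snapshots` collection are placeholders for your own analysis code):

```julia
using Mera
using Base.Threads

snapshots = 1:100                              # placeholder work list
tracker = create_progress_tracker(length(snapshots);
    task_name="Snapshot loop", progress_interval=10)

@threads for i in eachindex(snapshots)
    analyze_snapshot(snapshots[i])             # placeholder per-item work
    update_progress!(tracker, i)               # thread-safe status update
end

complete_progress!(tracker, "Snapshot loop finished")
```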
## Threading Control Parameters

### max_threads Parameter

Controls the maximum number of threads used by Mera functions.

Available in:
- `gethydro()`
- `getgravity()`
- `getparticles()`
- `projection()`
- Most analysis functions
Usage patterns:

```julia
using Base.Threads

# Outer-loop parallelism: limit inner threading
@threads for snapshot in snapshots
    gas = gethydro(info; max_threads=1)          # serial loading inside each task
end

# Inner-kernel parallelism: use all threads
gas = gethydro(info)                             # uses Threads.nthreads() by default

# Mixed: controlled allocation
gas  = gethydro(info; max_threads=4)             # 4 threads for I/O
proj = projection(gas, vars; max_threads=4)      # 4 threads for compute
```
Optimization Guidelines:
- I/O-bound operations: 2-8 threads are often optimal
- CPU-bound operations: match the number of physical cores
- Memory-bound operations: 2-4 threads recommended
- Network storage: fewer threads are usually better (see the sketch below)
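These guidelines can be folded into a small helper that picks a `max_threads` budget from the workload type. A minimal sketch, where the category names and caps are illustrative heuristics rather than part of the Mera API (`info` and `vars` stand for previously loaded objects, as in the usage patterns above):

```julia
using Base.Threads

# Illustrative heuristic caps, not part of the Mera API.
function thread_budget(kind::Symbol)
    caps = Dict(:io => 8, :cpu => Sys.CPU_THREADS,   # CPU_THREADS counts logical threads
                :memory => 4, :network => 2)
    return min(nthreads(), get(caps, kind, nthreads()))
end

gas  = gethydro(info; max_threads=thread_budget(:io))          # I/O-bound loading
proj = projection(gas, vars; max_threads=thread_budget(:cpu))  # compute-bound projection
```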
### verbose_threads Parameter

Enables detailed threading diagnostics and performance metrics.

Available in:
- `projection()` functions
Example:

```julia
# Enable detailed threading diagnostics
proj = projection(gas, [:rho, :T, :vx, :vy];
    verbose_threads=true,
    max_threads=4)

# Output shows:
# - Thread assignment per variable
# - Load balancing information
# - Per-thread performance metrics
# - Memory allocation patterns
```
Diagnostic Output Includes:
- Thread utilization per variable
- Load balancing effectiveness
- Memory allocation per thread
- Execution time breakdown
- Resource contention indicators
## Threading Patterns Reference

### Pattern Selection Guide

```
┌─ Multiple independent tasks?
│   ├─ Yes → Outer-Loop Pattern
│   └─ No ↓
└─ Single large dataset?
    ├─ Yes → Inner-Kernel Pattern
    └─ Complex workflow → Mixed Pattern
```
### Outer-Loop Pattern

Best for: multiple snapshots, parameter studies, independent analyses

```julia
using Base.Threads

@threads for item in work_items
    result = mera_function(item; max_threads=1)   # keep each item's inner work serial
end
```
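A more concrete sketch of the same pattern, assuming a set of snapshot output numbers and a simulation path (the numbers, path, and the per-snapshot body are illustrative):

```julia
using Mera
using Base.Threads

output_numbers = [100, 200, 300, 400]            # illustrative snapshot outputs
results = Vector{Any}(undef, length(output_numbers))

@threads for i in eachindex(output_numbers)
    info = getinfo(output_numbers[i], "../data")
    gas  = gethydro(info; max_threads=1)         # keep inner loading serial
    results[i] = projection(gas, :rho; max_threads=1)
end
```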
### Inner-Kernel Pattern

Best for: a single large dataset, multiple variables, complex analysis

```julia
gas = gethydro(info)                         # full threading for loading
proj = projection(gas, multiple_variables)   # parallel per variable
```
### Mixed Pattern

Best for: controlled resource allocation, complex workflows

```julia
using Base.Threads

task1 = @spawn mera_function1(data; max_threads=N1)
task2 = @spawn mera_function2(data; max_threads=N2)
results = fetch.([task1, task2])
```
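For example, hydro and particle data could be loaded concurrently with separate thread budgets. A sketch assuming an 8-thread session, with both loaders accepting `max_threads` as listed above:

```julia
using Mera
using Base.Threads

# Split an 8-thread session into two concurrent loading tasks (4 threads each).
hydro_task     = @spawn gethydro(info; max_threads=4)
particles_task = @spawn getparticles(info; max_threads=4)

gas       = fetch(hydro_task)
particles = fetch(particles_task)
```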
## Thread Safety Guidelines

### Safe Operations

```julia
using Base.Threads

# ✅ Pre-allocated arrays (each thread writes to a different index)
results = Vector{Float64}(undef, n)
@threads for i in 1:n
    results[i] = compute(i)
end

# ✅ Atomic operations
total = Atomic{Float64}(0.0)
@threads for i in 1:n
    atomic_add!(total, compute(i))
end

# ✅ Task-local accumulators: per-chunk partial sums combined after the tasks finish
# (the Distributed stdlib's @distributed reduction targets worker processes, not threads)
chunks = Iterators.partition(1:n, cld(n, nthreads()))
partials = fetch.([@spawn sum(compute, chunk) for chunk in chunks])
result = sum(partials)
```
### Unsafe Operations

```julia
using Base.Threads

# ❌ Race conditions
total = 0.0
@threads for i in 1:n
    global total += compute(i)    # DANGEROUS: unsynchronized read-modify-write
end

# ❌ Shared mutable state without synchronization
shared_dict = Dict()
@threads for i in 1:n
    shared_dict[i] = compute(i)   # DANGEROUS: Dict is not thread-safe
end
```
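If shared mutable state is unavoidable, a lock restores safety at the cost of serializing the protected section. A minimal sketch using a `ReentrantLock` from Base (the `compute` function is a placeholder workload):

```julia
using Base.Threads

compute(i) = i^2                      # placeholder workload
n = 1_000

shared_dict = Dict{Int, Int}()
dict_lock = ReentrantLock()

@threads for i in 1:n
    value = compute(i)                # do the expensive work outside the lock
    lock(dict_lock) do
        shared_dict[i] = value        # only the insertion is serialized
    end
end
```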
## Performance Optimization

### Memory Management

Pre-allocation:

```julia
using Base.Threads

# Good: allocate the output once, outside the loop
results = Vector{Float64}(undef, n_items)
@threads for i in 1:n_items
    results[i] = expensive_computation(i)
end
```
Garbage Collection:

```julia
# Monitor GC impact
@time threaded_analysis()             # watch the "gc time" percentage

# Reduce allocations with in-place (broadcasted) operations
@. output_array = input1 + input2
```
### Thread Utilization

Check load balancing:

```julia
using Base.Threads

# Uneven workloads: use @spawn (dynamic scheduling) instead of @threads
tasks = [@spawn process_item(item) for item in variable_workload]
results = fetch.(tasks)
```
Resource monitoring:

```julia
# Use verbose_threads to identify bottlenecks
proj = projection(gas, vars; verbose_threads=true)

# Look for thread imbalances or resource contention in the diagnostic output
```
## Troubleshooting

### Common Issues

Poor scaling:
- Check for I/O bottlenecks by reducing `max_threads`
- Monitor memory bandwidth pressure by re-running with fewer threads
- Use `verbose_threads=true` to identify contention
High GC time:
- Pre-allocate arrays instead of growing them with `push!`
- Use in-place operations (the `@.` macro)
- Process data in chunks for large datasets (see the sketch below)
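As a sketch of the chunking idea, a large index range can be processed in fixed-size blocks with one pre-allocated buffer reused across chunks, keeping peak allocations small (the `process_chunk!` helper and the chunk size are illustrative):

```julia
n_total    = 10_000_000
chunk_size = 100_000
buffer     = Vector{Float64}(undef, chunk_size)   # reused for every chunk

for chunk in Iterators.partition(1:n_total, chunk_size)
    buf = view(buffer, 1:length(chunk))           # the last chunk may be shorter
    process_chunk!(buf, chunk)                    # illustrative in-place kernel
end
```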
Race conditions:
- Use atomic operations for simple reductions
- Pre-allocate arrays with fixed indices per thread
- Add proper synchronization for complex shared state
### Debugging Tools

```julia
# Thread-safe debugging output
using Base.Threads: threadid

# A ReentrantLock is preferable to a SpinLock here, since println may block on I/O
debug_lock = ReentrantLock()

function safe_debug(msg)
    lock(debug_lock) do
        println("Thread $(threadid()): $msg")
    end
end
```
For complete implementation examples, see the Multi-Threading Tutorial.