User Guide

This guide covers all aspects of using marEx for marine extreme detection and tracking, from basic concepts to advanced workflows.

Introduction 

marEx is a Python package for efficient identification and tracking of marine extremes (e.g., Marine Heatwaves) in oceanographic data. It provides a complete pipeline from raw data preprocessing to tracked event visualisation.

Key Features 

Flexible Data Support: Both structured (lat/lon grids) and unstructured (irregular meshes) data
Multiple Methods: Various anomaly computation and threshold definition methods
Parallel Processing: Built on Dask for efficient computation on large datasets
Event Tracking: Sophisticated algorithms for tracking coherent events through time
Rich Visualisation: Automatic plotting system with publication-ready figures
HPC Integration: Optimised for supercomputing environments

Installation and Setup 

For complete installation instructions including system requirements, optional dependencies, and HPC environments, see Installation.

Data Preparation 

Input Data Requirements 

marEx works with xarray DataArrays containing oceanographic time series data:

Required Dimensions:

Time dimension: Regular or irregular time steps (daily to monthly)
Spatial dimensions: Either structured (lat, lon) or unstructured (cell/ncells)

Supported Formats:

NetCDF (.nc)
Zarr (.zarr)
Any xarray-compatible format

Data Structure Examples:

For structured/gridded data:

data.dims = ('time', 'lat', 'lon')
# Coordinates: time, lat, lon as dimensions

For unstructured data:

data.dims = ('time', 'ncells')
# Coordinates: time as dimension, lat/lon as coordinates

Data Quality Requirements 

Temporal coverage: Minimum 10 years for robust climatology
Spatial coverage: Consistent grid throughout time series

Data Loading Examples 

Loading NetCDF files:

import xarray as xr

# Single file
ds = xr.open_dataset('sst_data.nc', chunks={})
sst = ds.sst

# Multiple files
ds = xr.open_mfdataset('sst_*.nc', parallel=True, chunks={})
sst = ds.sst

Loading Zarr datasets:

# Local Zarr
ds = xr.open_zarr('data.zarr', chunks={})

# Cloud-optimised Zarr
ds = xr.open_zarr('gs://bucket/data.zarr', chunks={})

Optimising data loading:

# Apply chunking for efficient processing (chunk sizes vary by operation)
sst = sst.chunk({'time': 365, 'lat': 50, 'lon': 100})

# Ensure Dask backing
if not marEx.is_dask_collection(sst.data):
    sst = sst.chunk()

Note: Optimal chunk sizes depend on your dataset size and the operation (preprocessing vs tracking). See the Performance Optimisation section below for detailed chunking strategies.

Core Workflow 

The marEx workflow consists of three main steps. For a conceptual overview and foundations, see Core Concepts.

┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
│  1. Detect      │  →   │  2. Track       │  →   │  3. Visualise   │
│    Extremes     │      │    Events       │      │     & Analyse   │
└─────────────────┘      └─────────────────┘      └─────────────────┘
        ↓                        ↓                        ↓
 preprocess_data()           tracker()                  plotX()
        ↓                        ↓                        ↓
 Binary extreme map        Tracked objects          Maps, animations,
                           with unique IDs           & statistics

Summary:

Preprocessing: Convert raw data to anomalies and identify extremes
Tracking: Track extreme events through time
Visualisation: Analyse and visualise results

Step 1: Preprocessing 

The preprocessing step transforms raw oceanographic data into anomalies and detects extreme event locations.

Basic preprocessing:

import marEx

# Basic preprocessing with default settings
extremes = marEx.preprocess_data(
    sst,
    threshold_percentile=95,
    method_anomaly='shifting_baseline',
    method_extreme='hobday_extreme'
)

Advanced configuration:

# Advanced preprocessing configuration
extremes = marEx.preprocess_data(
    sst,
    # Anomaly computation method
    method_anomaly='detrend_fixed_baseline',  # or 'detrend_harmonic', 'fixed_baseline', 'shifting_baseline'
    detrend_order=[1, 2],
    smooth_days_baseline=21,             # Smoothing for climatology

    # Extreme identification method
    method_extreme='hobday_extreme',     # or 'global_extreme'
    threshold_percentile=95,             # 95th percentile
    window_days_hobday=11,               # Days around each day of year
    window_spatial_hobday=5,             # Spatial window for percentile calculation

    # Output options
    dask_chunks={'time': 25}
)

The resulting xarray dataset extremes will have the following structure & entries:

xarray.Dataset
Dimensions:     (lat, lon, time)
Coordinates:
    lat         (lat)
    lon         (lon)
    time        (time)
Data variables:
    dat_anomaly     (time, lat, lon)        float64     dask.array
    mask            (lat, lon)              bool        dask.array
    extreme_events  (time, lat, lon)        bool        dask.array
    thresholds      (dayofyear, lat, lon)   float64     dask.array

where:

dat_anomaly (time, lat, lon): Anomaly data
extreme_events (time, lat, lon): Binary field locating extreme events (1=event, 0=background)
thresholds (dayofyear, lat, lon): Extreme event thresholds used to determine extreme events
mask (lat, lon): Valid data mask

See ./examples/unstructured data/01_preprocess_extremes.ipynb in Examples Gallery for a detailed example of pre-processing on an unstructured grid.

Step 2: Event Tracking 

The tracking step identifies coherent extreme events and follows them through time.

Basic tracking:

tracked_events = marEx.tracker(
    extremes.extreme_events,
    extremes.mask,
    area_filter_absolute=100   # Remove objects smaller than 100 grid cells
    R_fill=8,                  # Radius for filling gaps (in grid cells)
).run()

Advanced tracking options:

# Initialise advanced tracker
tracker = marEx.tracker(
    extremes.extreme_events,
    extremes.mask,

    # Temporal criteria
    T_fill=4,                    # Fill gaps up to 4 days (to maintain continuous events)

    # Spatial criteria
    R_fill=8,                    # Fill small holes with radius up to 8 grid cells
    area_filter_quartile=0.5,    # Remove smallest 50% of events (alternative to area_filter_absolute)
    cell_areas=grid_areas,       # Optional: physical cell areas for structured grids (m²)

    # Merging criteria
    allow_merging=True,          # Allow merging of events (and keep track of merged IDs & events)
    overlap_threshold=0.5,       # 50% overlap for merging (otherwise events keep independent IDs)
    nn_partitioning=True,        # Use nearest-neighbour partitioning when splitting events
)

# Run tracking and return merging data
tracked_events, merge_events = tracker.run(return_merges=True)

The resulting xarray dataset tracked_events will have the following structure & entries:

xarray.Dataset
Dimensions: (lat, lon, time, ID, component, sibling_ID)
Coordinates:
    lat         (lat)
    lon         (lon)
    time        (time)
    ID          (ID)
Data variables:
    ID_field              (time, lat, lon)        int32       dask.array
    global_ID             (time, ID)              int32       ndarray
    area                  (time, ID)              float32     ndarray
    centroid              (component, time, ID)   float32     ndarray
    presence              (time, ID)              bool        ndarray
    time_start            (ID)                    datetime64  ndarray
    time_end              (ID)                    datetime64  ndarray
    merge_ledger          (time, ID, sibling_ID)  int32       ndarray

where:

ID_field: Field containing the IDs of tracked events (0=background)
global_ID: Unique global ID of each object; global_ID.sel(ID=10) maps event ID 10 to its original ID at each time
area: Area of each event as a function of time (in units of cell counts, or physical units if cell_areas/grid_resolution provided)
centroid: (x, y) centroid coordinates of each event as a function of time
presence: Presence (boolean) of each event at each time (anywhere in space)
time_start: Start time of each event
time_end: End time of each event
merge_ledger: Sibling IDs for merging events (matching ID_field); -1 indicates no merging event occurred

When running with return_merges=True, the resulting xarray dataset merge_events will have the following structure & entries:

xarray.Dataset
Dimensions: (merge_ID, parent_idx, child_idx)
Data variables:
    parent_IDs      (merge_ID, parent_idx)  int32       ndarray
    child_IDs       (merge_ID, child_idx)   int32       ndarray
    overlap_areas   (merge_ID, parent_idx)  int32       ndarray
    merge_time      (merge_ID)              datetime64  ndarray
    n_parents       (merge_ID)              int8        ndarray
    n_children      (merge_ID)              int8        ndarray

where:

parent_IDs: Original parent IDs of each merging event
child_IDs: Original child IDs of each merging event
overlap_areas: Area of overlap between parent and child objects in each merging event
merge_time: Time of each merging event
n_parents: Number of parent objects in each merging event
n_children: Number of child objects in each merging event

See ./examples/unstructured data/02_id_track_events.ipynb in Examples Gallery for a detailed example of identification, tracking, & merging on an unstructured grid.

Step 3: Visualisation 

marEx includes a powerful visualisation system called plotX that automatically detects data types and creates appropriate plots.

Basic visualisation:

# Plot global extreme event frequency
event_frequency = (tracked_events.ID_field > 0).mean("time")

# Configure plot appearance
config = marEx.PlotConfig(
    var_units="MHW Frequency",
    cmap="hot_r",
    cperc=[0, 96],
    grid_labels=True
)

# Create single plot
fig, ax, im = event_frequency.plotX.single_plot(config)

Advanced visualisation:

# Multi-panel visualisation: seasonal extreme event frequency
seasonal_frequency = (tracked_events.ID_field > 0).groupby("time.season").mean(dim="time")

# Configure plot appearance
config = marEx.PlotConfig(
    var_units="MHW Frequency",
    cmap="hot_r",
    cperc=[0, 96],
    grid_labels=True
)

# Create multi-panel plot
fig, ax = seasonal_frequency.plotX.multi_plot(config, col="season", col_wrap=2)

# Create animation of tracked events
ID_field_subset = tracked_events.ID_field.sel(time=slice("2020-01-01", "2022-05-31"))
config = marEx.PlotConfig(plot_IDs=True)
ID_field_subset.plotX.animate(config, plot_dir=plot_dir, file_name="marine_heatwaves")

# Plot consecutive time periods
ID_field_subset = tracked_events.ID_field.sel(time=slice("2021-01-01", "2021-01-06"))
config = marEx.PlotConfig(plot_IDs=True)
fig, ax = ID_field_subset.plotX.multi_plot(config, col="time", col_wrap=3)

Method Selection Guide 

This section provides practical guidance for selecting appropriate methods. For foundations and definitions, see Core Concepts.

Use this decision tree to select the most appropriate anomaly calculation method for your research:

┌─────────────────────────────────────────────────────────────────────┐
│ Anomaly Method Selection Decision Tree                              │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│ Need full time series? ──No──> SHIFTING BASELINE                    │
│         │                       (most accurate, shortens data by    │
│        Yes                       window_year_baseline years)        │
│         │                                                           │
│         ├─> Remove trends? ──No──> FIXED BASELINE                   │
│         │         │                 (keeps trends in anomaly)       │
│         │        Yes                                                │
│         │         │                                                 │
│         │         └──> Need efficiency? ──Yes──> DETREND HARMONIC   │
│         │                      │                  (fast, biased)    │
│         │                     No                                    │
│         │                      │                                    │
│         │                      └──> DETREND FIXED BASELINE          │
│         │                           (accurate detrending)           │
└─────────────────────────────────────────────────────────────────────┘

Anomaly Methods 

Harmonic Detrending (detrend_harmonic):

Detrends with an OLS 6+ coefficient model (mean, annual & semi-annual harmonics, and arbitrary polynomial trends)
Best for: Datasets with linear trends, operational monitoring
Pros: Fast & memory efficient
Cons: Does not capture phenological shifts and non-harmonic seasonal variability. Strongly biases certain statistics.
Use when: Real-time processing

Fixed Baseline (fixed_baseline):

Removes the daily climatology (calculated in a reference_period) – does not remove climate trends
Best for: Simple anomaly calculation without detrending
Pros: Straightforward interpretation, preserves long-term trends
Cons: Does not account for climate change trends, seasonal timing shifts
Use when: Baseline comparison studies, trend-inclusive analysis, public outreach

Detrend Fixed Baseline (detrend_fixed_baseline):

Polynomial detrending followed by fixed daily climatology – keeps full time-series of data, but does not account for trends in the timing of seasonal transitions
Also supports calculating the climatology only within a reference_period (detrending still uses all data)
Best for: Studies requiring detrending but maintaining full temporal data coverage
Pros: Removes long-term trends while preserving seasonal cycles, maintains full time series
Cons: Does not account for changes in seasonal timing or seasonality
Use when: Climate variability studies with trend removal

Shifting Baseline (shifting_baseline):

Removes seasonal & long-term trends using a smoothed rolling climatology
Best for: Climate change studies, non-stationary data
Pros: Captures non-linear trends, adapts to changing climate, and seasonal timing variability
Cons: Computationally expensive, shortens effective time series
Use when: Long-term climate studies, accurate & robust analysis

Extreme Identification Methods 

Global Extreme (global_extreme):

Applies a global-in-time (i.e. constant in time) threshold
N.B.: This method is a hack designed for ocetrac, which when paired with std_normalise=True can approximate hobday_extreme in very specific cases and with a number of caveats. (However, normalising by local STD is again memory-intensive, at which point there is little gained by this approximate method.)
Best for: Simple threshold, constant across seasons
Pros: Simple interpretation, fast computation (without std_normalise)
Cons: Seasonal bias, may miss winter extremes, variability distribution is skewed
Use when: Initial analysis, 0th order comparisons

Hobday Method (hobday_extreme):

Defines a local day-of-year specific threshold within a rolling window (equivalent to the Hobday et al. (2016) definition for simple time-series)
Best for: Long-term scientific climate studies, seasonal studies, biological impacts
Pros: Accounts for seasonal variability, literature standard
Cons: Complex threshold interpretation, computationally intensive
Use when: Ecological studies, seasonal analysis, literature comparison

Spatial Window for Hobday Extreme (`window_spatial_hobday`)

New in v3.0+: The Hobday extreme method now supports optional spatial windowing to create more statistically robust thresholds.

# Traditional Hobday (temporal window only)
extremes_traditional = marEx.preprocess_data(
    sst,
    method_extreme='hobday_extreme',
    window_days_hobday=11,          # 11-day temporal window
    window_spatial_hobday=None      # No spatial window
)

# With spatial windowing (recommended for most applications)
extremes_window = marEx.preprocess_data(
    sst,
    method_extreme='hobday_extreme',
    window_days_hobday=11,          # 11-day temporal window
    window_spatial_hobday=5         # 5×5 spatial window (marEx default; use 1 for strict Hobday et al. 2016 definition for time-series)
)

How Spatial Windowing Works:

Traditional Hobday:                  With Spatial Window (5×5):
Single cell samples                  25 cells × 11 days = 275 samples
(lat, lon) ──> 11 days               ┌───┬───┬───┬───┬───┐
                                     │   │   │   │   │   │
Only temporal pooling                ├───┼───┼───┼───┼───┤
                                     │   │   │   │   │   │
                                     ├───┼───┼─●─┼───┼───┤ Central cell
                                     │   │   │   │   │   │
                                     ├───┼───┼───┼───┼───┤
                                     │   │   │   │   │   │
                                     └───┴───┴───┴───┴───┘
                                     Spatial + temporal pooling

Benefits:

Spatially coherent thresholds: Reduces noise from individual grid cells
More robust statistics: Larger sample size for robust percentile calculation

Limitations:

Structured grids only: Not supported for unstructured (irregular) grids
Requires approximate method: Only works with method_percentile='approximate'

When to use:

Short time-series of data
High percentile thresholds defining the extremes
Noisy SST fields (e.g., satellite with gaps)
Default (window_spatial_hobday=5) works well for most cases

Example: Comparing threshold noise:

# Compare threshold variability
threshold_std_traditional = extremes_traditional.thresholds.std(dim='dayofyear').mean()
threshold_std_smooth = extremes_smooth.thresholds.std(dim='dayofyear').mean()

print(f"Traditional threshold noise: {threshold_std_traditional.compute():.4f}")
print(f"Windowed threshold noise: {threshold_std_window.compute():.4f}")

Threshold Percentiles 

90th percentile: More events, captures moderate extremes
95th percentile: Standard for marine heatwaves, balanced approach
99th percentile: Only most extreme events, rare events focus

Performance Optimisation 

This section provides guidance for optimising marEx performance across different computing environments and dataset sizes.

Memory Management and Chunking 

Chunking Strategy 

Optimal chunking strategies vary by dataset size and operation type:

# Small datasets (< 10GB): Chunk time dimension
sst = sst.chunk({'time': 365, 'lat': -1, 'lon': -1})

# Large datasets (> 100GB): Chunk all dimensions
sst = sst.chunk({'time': 365, 'lat': 180, 'lon': 360})

# For tracking: Use smaller time chunks
sst = sst.chunk({'time': 25, 'lat': -1, 'lon': -1})  # Recommended for tracker

Memory Requirements Estimation 

Operation	Memory (per core)	Optimal Time Chunk	Notes
`preprocess_data`	2-4 GB	365+ days	Depends on method_anomaly
`tracker`	1-2 GB	25-50 days	Keep spatial dims unbounded (-1)
Exact percentiles	8-16 GB	Full time series	Requires careful chunking
Approximate percentiles	1-2 GB	Any	Memory efficient

Exact vs Approximate Percentile Methods 

When to Use Each Method 

APPROXIMATE (default): For large datasets

extremes = marEx.preprocess_data(
    sst,
    method_percentile='approximate',
    precision=0.01,        # ~0.01°C bins
    max_anomaly=5.0,       # Histogram range ±5°C
    dask_chunks={'time': 365}  # Any chunking works
)
# ✓ Memory efficient (1-2 GB/core)
# ✓ Fast parallel computation
# ✓ ~0.01°C precision (sufficient for most studies)

EXACT: For small datasets or high precision needs

extremes_exact = marEx.preprocess_data(
    sst,
    method_percentile='exact',
    dask_chunks={'time': -1, 'lat': 100, 'lon': 100}  # Careful chunking required!
)
# ⚠ High memory (8-16 GB/core), depending on time-series length
# ⚠ Slower computation
# ✓ Mathematically exact percentiles

JAX Acceleration 

When JAX Helps Most 

import marEx

# Check JAX availability
if marEx.has_dependency('jax'):
    print("JAX acceleration active")
    # Best for:
    # - Harmonic detrending (10-50× speedup)
    # - Large spatial operations
    # - GPU/TPU available systems
else:
    print("Install JAX: pip install marEx[full]")
    # Falls back to NumPy (still fast with Numba JIT)

Performance Gains 

Operation	NumPy+Numba	JAX (CPU)	JAX (GPU)
Harmonic detrend	1× (baseline)	15× faster	50× faster
Percentile calc	1×	1× (similar)	2-3× faster
Tracking	1×	1× (similar)	N/A

HPC Cluster Setup 

SLURM Distributed Cluster 

For supercomputers:

import marEx

# Start distributed Dask cluster on SLURM
client = marEx.helper.start_distributed_cluster(
    n_workers=64,           # Number of workers
    cores_per_worker=4,     # Cores per worker
    memory_per_worker='8GB' # Memory per worker
)

# Get dashboard URL for monitoring
marEx.helper.get_cluster_info(client)

# Run analysis
extremes = marEx.preprocess_data(sst, ...)
events = marEx.tracker(extremes.extreme_events, ...).run()

# Clean up
client.close()

Local Cluster 

For workstations:

# Auto-detect system resources
client = marEx.helper.start_local_cluster(
    n_workers=8,            # Leave some cores for system
    threads_per_worker=2,
    memory_limit='16GB'     # Per worker
)

Example Batch Scripts 

The examples/batch jobs/ directory provides ready-to-use scripts designed to be run from a login node on HPC systems with SLURM:

Workflow:

run_detect.py: Launches a distributed Dask job to execute the preprocessing and extreme detection pipeline
run_track.py: Launches a distributed Dask job for event identification and tracking

Both scripts utilise marEx.helper.start_distributed_cluster() for cluster management and are configured via environment variables.

Configuration Environment Variables:

DASK_N_WORKERS: Number of Dask workers to launch (default varies by script: 128 for detect, 32 for track)
DASK_WORKERS_PER_NODE: Workers per SLURM node (default varies by script: 64 for detect, 32 for track)
DASK_RUNTIME: Maximum runtime in minutes (default: 39 for detect, 89 for track)
SLURM_ACCOUNT: SLURM account for billing (default: ‘bk1377’)
Additional for run_track.py: RUN_BASIC_TRACKER, GRID_RESOLUTION, AREA_FILTER, R_FILL, T_FILL, OVERLAP_THRESHOLD

Example Usage:

# Run preprocessing on SLURM cluster
export DASK_N_WORKERS=256
export DASK_RUNTIME=120
export SLURM_ACCOUNT=my_project
python examples/batch\ jobs/run_detect.py

# Run tracking after preprocessing completes
export DASK_N_WORKERS=64
export DASK_RUNTIME=180
python examples/batch\ jobs/run_track.py

Customisation:

These scripts are designed to be copied and modified for your specific:

Data file paths and chunk sizes
Preprocessing parameters (anomaly method, thresholds)
Tracking parameters (spatial filters, merge/split settings)
Cluster resources (workers, memory, runtime)

The scripts handle cluster setup, data processing, and saving results to zarr/netCDF files on the scratch filesystem.

Checkpointing for Large Datasets 

When processing very large datasets on HPC systems, Dask may recompute intermediate results multiple times, especially for complex operations like 2D histogram-based percentile calculations. This can significantly increase computation time and memory usage.

The Problem:

Dask builds a computation graph that tracks all operations. For large preprocessing pipelines, this graph can become deeply nested, causing Dask to recompute expensive intermediate steps (climatologies, anomalies, thresholds) whenever downstream operations need them.

The Solution:

The use_temp_checkpoints=True parameter breaks the Dask computation graph by saving intermediate results to temporary zarr stores and immediately reloading them. This prevents expensive recomputations at the cost of some disk I/O.

How It Works:

Intermediate arrays (anomalies, climatologies, thresholds, extremes) are saved to temporary zarr files
The saved data is immediately reloaded as a fresh Dask array
This breaks the dependency chain in the computation graph
Temporary files are automatically cleaned up after reloading

When to Use:

Large datasets (>100 GB) where preprocessing takes hours
HPC environments with fast scratch storage
When 2D histogram percentile calculations are a bottleneck
When you notice Dask recomputing the same operations multiple times

Example Usage:

import xarray as xr
import marEx

# Load large dataset
sst = xr.open_zarr('large_dataset.zarr', chunks={'time': 30}).sst

# Enable checkpointing to prevent expensive recomputations
extremes = marEx.preprocess_data(
    sst,
    method_anomaly='shifting_baseline',
    method_extreme='hobday_extreme',
    threshold_percentile=95,
    use_temp_checkpoints=True,  # Enable checkpointing
    dask_chunks={'time': 25}
)

Performance Trade-offs:

Benefit: Prevents expensive recomputations, reduces memory pressure
Cost: Requires fast disk I/O for temporary zarr stores
Recommendation: Enable on HPC systems with high-speed scratch storage; test on your specific dataset to measure performance impact

Note: Temporary files are automatically removed after data is reloaded, so no manual cleanup is required.

Grid Types and Coordinate Systems 

Structured/Gridded Data 

Regular latitude-longitude grids are the most common data type:

# Typical structured data
print(sst.dims)         # e.g. ('time', 'lat', 'lon')
print(sst.coords)       # e.g. ('time', 'lat', 'lon')

# marEx automatically detects structured grids
processed = marEx.preprocess_data(sst)

Unstructured Data 

Irregular meshes common in coastal and global ocean models:

# Typical unstructured data (e.g., ICON model)
print(sst.dims)         # e.g. ('time', 'ncells')
print(sst.coords)       # e.g. ('time', 'lat', 'lon') as coordinates, not dims

# Specify grid configuration
marEx.specify_grid(
    grid_type='unstructured',
    fpath_tgrid='grid_info.nc',      # Grid topology file
    fpath_ckdtree='kdtree.pkl'       # Spatial index (optional)
)

Requirements:

Spatial dimension name (ncells, cell, etc.)
Latitude/longitude as coordinate arrays
Optional: Grid topology information for advanced/unstructured features

Advanced Features 

Compound Events 

Analyse events that exceed multiple thresholds or variables:

# Multiple variable analysis
sst_extremes = marEx.preprocess_data(sst, threshold_percentile=95)
salinity_extremes = marEx.preprocess_data(salinity, threshold_percentile=5)  # Low salinity

# Compound events
compound_events = sst_extremes.extreme_events & salinity_extremes.extreme_events

Best Practices and Guidelines 

Research Workflow 

Exploratory Analysis: Start with basic preprocessing to understand data
Method Comparison: Test different methods on subset of data
Quality Control: Validate results thoroughly
Full Processing: Apply chosen method to complete dataset
Validation: Compare with known events and literature

Literature Compliance 

For marine heatwave studies following Hobday et al. (2016):

# Standard MHW definition
mhw_config = {
    'method_anomaly': 'shifting_baseline',
    'method_extreme': 'hobday_extreme',
    'threshold_percentile': 90,
    'window_days_hobday': 11,
    'window_year_baseline': 30,
    'smooth_days_baseline': 11,
    'window_spatial_hobday': 1,  # Hobday et al. (2016) considers only single points
}

Getting Help and Support 

For troubleshooting common issues, diagnostic checklists, and support resources, see Troubleshooting.

Citations and References 

When using marEx in publications, please cite:

marEx package
Marine heatwave methods: Hobday et al. (2016) for standard MHW definition

For more detailed examples and advanced usage, see the Examples Gallery.