marEx.identify_extremes

marEx.identify_extremes(da, method_extreme='hobday_extreme', threshold_percentile=95, dimensions=None, coordinates=None, window_days_hobday=11, window_spatial_hobday=None, method_percentile='approximate', precision=0.01, max_anomaly=5.0, use_temp_checkpoints=False, verbose=None, quiet=None)

Identify extreme events exceeding a percentile threshold using specified method.

Parameters:
  • da (xarray.DataArray) – DataArray containing anomalies

  • method_extreme (str, default='hobday_extreme') – Method for threshold calculation (‘global_extreme’ or ‘hobday_extreme’)

  • threshold_percentile (float, default=95) – Percentile threshold (e.g., 95 for 95th percentile)

  • dimensions (dict, optional) – Mapping of dimensions to names in the data

  • coordinates (dict, optional) – Mapping of coordinates to names in the data

  • window_days_hobday (int, default=11) – Window for day-of-year threshold (hobday_extreme only)

  • window_spatial_hobday (int or None, default=None) – Size of the spatial window (e.g., 3 for a 3x3 window) used for spatial clustering in the day-of-year percentile calculation (hobday_extreme only)

  • method_percentile (str, default='approximate') – Method for percentile computation (‘exact’ or ‘approximate’)

  • precision (float, default=0.01) – Precision for histogram bins in approximate method

  • max_anomaly (float, default=5.0) – Maximum anomaly value for histogram binning

  • use_temp_checkpoints (bool)

  • verbose (bool | None)

  • quiet (bool | None)

Returns:

Tuple of (extremes, thresholds) where extremes is a boolean array identifying extreme events and thresholds contains the threshold values used

Return type:

tuple

Examples

Basic extreme identification with global thresholds:

>>> import xarray as xr
>>> import marEx
>>>
>>> # Load anomaly data (from compute_normalised_anomaly)
>>> anomalies = xr.open_dataset('anomalies.nc', chunks={}).dat_anomaly
>>>
>>> # Identify extreme events using global-in-time 95th percentile
>>> extremes, thresholds = marEx.identify_extremes(
...     anomalies,
...     method_extreme="global_extreme",
...     threshold_percentile=95
... )
>>> print(f"Extreme events shape: {extremes.shape}")
Extreme events shape: (1461, 180, 360)
>>> print(f"Thresholds shape: {thresholds.shape}")
Thresholds shape: (180, 360)
>>> # Count total extreme events
>>> total_extremes = extremes.sum().compute()
>>> print(f"Total extreme events: {total_extremes}")
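
To make the two return values concrete, here is a minimal NumPy-only sketch of what a global-in-time threshold amounts to conceptually: one percentile per grid cell along the time axis, followed by a boolean comparison. It runs on synthetic data and is not marEx's implementation; the array sizes and variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic anomalies laid out as (time, lat, lon); sizes are arbitrary.
anoms = rng.standard_normal((1000, 4, 5))

# One threshold per grid cell: the 95th percentile along the time axis.
thresholds = np.percentile(anoms, 95, axis=0)

# Boolean mask of exceedances; thresholds broadcasts over the time axis.
extremes = anoms > thresholds

print(extremes.shape, thresholds.shape)  # (1000, 4, 5) (4, 5)
```

The mask keeps the full (time, lat, lon) layout while the thresholds drop the time axis, which mirrors the shapes printed in the example above.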

Using day-of-year specific thresholds (cf. Hobday et al. 2016 method):

>>> # More sophisticated threshold calculation
>>> extremes_hobday, thresholds_hobday = marEx.identify_extremes(
...     anomalies,
...     method_extreme="hobday_extreme",
...     threshold_percentile=95,
...     window_days_hobday=11,  # 11-day window around each day-of-year
...     window_spatial_hobday=3  # 3x3 spatial window for clustering percentile calculation
... )
>>> print(f"Hobday thresholds shape: {thresholds_hobday.shape}")
Hobday thresholds shape: (366, 180, 360)
>>> # Compare seasonal variation in thresholds
>>> # Compute the (possibly lazy) means so they can be formatted as numbers
>>> summer_threshold = thresholds_hobday.sel(dayofyear=200).mean().compute()
>>> winter_threshold = thresholds_hobday.sel(dayofyear=50).mean().compute()
>>> print(f"Summer vs Winter thresholds: {summer_threshold:.3f} vs {winter_threshold:.3f}")
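
The idea behind a day-of-year threshold can be sketched for a single grid cell: for each calendar day, pool the samples from all years that fall within a window centred on that day, then take the percentile of the pool. The sketch below is a simplified illustration, not marEx's implementation. It assumes a 365-day calendar without leap days, omits any spatial window, and uses synthetic data; all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_years, n_doy = 20, 365                          # assumption: no leap days
doy = np.tile(np.arange(1, n_doy + 1), n_years)   # day-of-year of each time step
series = rng.standard_normal(doy.size)            # anomalies at one grid cell

window = 11                                       # total window width in days
half = window // 2

thresholds = np.empty(n_doy)
for d in range(1, n_doy + 1):
    # Circular distance (in days) between each sample's day-of-year and d,
    # so that late December and early January are treated as neighbours.
    dist = np.abs(doy - d)
    dist = np.minimum(dist, n_doy - dist)
    # Pool every year's samples inside the window and take their percentile.
    thresholds[d - 1] = np.percentile(series[dist <= half], 95)

# Compare each sample with the threshold for its own day-of-year.
extremes = series > thresholds[doy - 1]
print(thresholds.shape, extremes.shape)  # (365,) (7300,)
```

With 20 years and an 11-day window, each threshold is estimated from 220 pooled samples rather than 20, which is why a window is used at all.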

Comparison of exact vs approximate percentile methods:

>>> # Approximate method (faster, default)
>>> extremes_approx, thresh_approx = marEx.identify_extremes(
...     anomalies, method_percentile="approximate"
... )
>>>
>>> # Exact method (slower & memory intensive)
>>> extremes_exact, thresh_exact = marEx.identify_extremes(
...     anomalies, method_percentile="exact"
... )
>>>
>>> # Compare threshold precision (difference of about 0.005 °C)
>>> threshold_diff = (thresh_exact - thresh_approx).std().compute()
>>> print(f"Threshold difference (exact vs approx): {threshold_diff:.6f}")
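
The `precision` and `max_anomaly` parameters describe histogram bins, so it may help to see how a histogram can approximate a percentile. The following is a generic sketch of that technique, assuming fixed-width bins spanning [-max_anomaly, max_anomaly]. It is not taken from marEx's source, and the helper `approx_percentile` is hypothetical.

```python
import numpy as np

def approx_percentile(values, q, precision=0.01, max_anomaly=5.0):
    """Histogram-based estimate of the q-th percentile (illustrative only)."""
    # Fixed-width bins of size `precision` spanning [-max_anomaly, max_anomaly].
    n_bins = int(round(2 * max_anomaly / precision))
    edges = np.linspace(-max_anomaly, max_anomaly, n_bins + 1)
    # Clip so that out-of-range values land in the outermost bins.
    clipped = np.clip(values, -max_anomaly, max_anomaly)
    counts, _ = np.histogram(clipped, bins=edges)
    cdf = np.cumsum(counts) / counts.sum()
    # First bin whose cumulative fraction reaches q; report its upper edge.
    idx = np.searchsorted(cdf, q / 100.0)
    return edges[idx + 1]

rng = np.random.default_rng(2)
sample = rng.standard_normal(100_000)
exact = np.percentile(sample, 95)
approx = approx_percentile(sample, 95)
print(f"exact={exact:.4f}, approx={approx:.4f}, diff={abs(exact - approx):.4f}")
```

In this sketch the estimate is off by at most roughly one bin width, so a smaller `precision` tightens the result at the cost of more bins, and values beyond `max_anomaly` are only resolved as "in the outermost bin".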

Different percentile thresholds for varying event rarity:

>>> # Conservative threshold - very extreme events only
>>> extremes_98, _ = marEx.identify_extremes(
...     anomalies, threshold_percentile=98
... )
>>>
>>> # Moderate threshold - more frequent events
>>> extremes_90, _ = marEx.identify_extremes(
...     anomalies, threshold_percentile=90
... )
>>>
>>> # Compare event frequency
>>> print(f"98th percentile events: {extremes_98.sum().compute()}")
>>> print(f"90th percentile events: {extremes_90.sum().compute()}")
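
As a rough sanity check on event counts, a threshold at percentile p should flag about (100 - p)% of the samples it was computed from. The sketch below illustrates this with synthetic data and a plain per-cell percentile along time. It is an assumption-laden illustration rather than marEx code; with day-of-year thresholds or the approximate method, the fraction will only be close to this value.

```python
import numpy as np

rng = np.random.default_rng(4)
anoms = rng.standard_normal((2000, 3, 3))   # synthetic (time, lat, lon) anomalies

for p in (90, 98):
    # Per-cell percentile along time, then the share of samples above it.
    thresholds = np.percentile(anoms, p, axis=0)
    frac = (anoms > thresholds).mean()
    print(f"p={p}: fraction flagged = {frac:.3f} (expected about {1 - p / 100:.3f})")
```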

Processing unstructured data:

>>> # ICON ocean model data
>>> icon_anomalies = xr.open_dataset('icon_anomalies.nc', chunks={}).dat_anomaly
>>> extremes_unstructured, thresholds_unstructured = marEx.identify_extremes(
...     icon_anomalies,
...     dimensions={"time": "time", "x": "ncells"},
...     coordinates={"time": "time", "x": "lon", "y": "lat"},
...     threshold_percentile=95
... )
>>> print(f"Unstructured extremes shape: {extremes_unstructured.shape}")

Advanced Hobday method with custom temporal window:

>>> # Longer temporal window for smoother thresholds
>>> extremes_smooth, thresholds_smooth = marEx.identify_extremes(
...     anomalies,
...     method_extreme="hobday_extreme",
...     window_days_hobday=31,  # Longer smoothing window
...     threshold_percentile=95
... )
>>>
>>> # Compare threshold smoothness (note: thresholds_hobday above also used window_spatial_hobday=3)
>>> std_11day = thresholds_hobday.std(dim='dayofyear').mean().compute()
>>> std_31day = thresholds_smooth.std(dim='dayofyear').mean().compute()
>>> print(f"Threshold variability: 11-day={std_11day:.3f}, 31-day={std_31day:.3f}")