marEx.identify_extremes
- marEx.identify_extremes(da, method_extreme='hobday_extreme', threshold_percentile=95, dimensions=None, coordinates=None, window_days_hobday=11, window_spatial_hobday=None, method_percentile='approximate', precision=0.01, max_anomaly=5.0, use_temp_checkpoints=False, verbose=None, quiet=None)
Identify extreme events exceeding a percentile threshold using specified method.
- Parameters:
da (xarray.DataArray) – DataArray containing anomalies
method_extreme (str, default='hobday_extreme') – Method for threshold calculation (‘global_extreme’ or ‘hobday_extreme’)
threshold_percentile (float, default=95) – Percentile threshold (e.g., 95 for 95th percentile)
dimensions (dict, optional) – Mapping of dimensions to names in the data
coordinates (dict, optional) – Mapping of coordinates to names in the data
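window_days_hobday (int, default=11) – Length in days of the window around each day-of-year used to compute the threshold (hobday_extreme only)
window_spatial_hobday (int, default=None) – Size of the spatial window (e.g., 3 for a 3x3 window) used to pool neighbouring cells in the day-of-year percentile calculation (hobday_extreme only)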
method_percentile (str, default='approximate') – Method for percentile computation (‘exact’ or ‘approximate’)
precision (float, default=0.01) – Precision for histogram bins in approximate method
max_anomaly (float, default=5.0) – Maximum anomaly value for histogram binning
use_temp_checkpoints (bool)
verbose (bool | None)
quiet (bool | None)
- Returns:
Tuple of (extremes, thresholds) where extremes is a boolean array identifying extreme events and thresholds contains the threshold values used
- Return type:
tuple
Examples
Basic extreme identification with global thresholds:
>>> import xarray as xr
>>> import marEx
>>>
>>> # Load anomaly data (from compute_normalised_anomaly)
>>> anomalies = xr.open_dataset('anomalies.nc', chunks={}).dat_anomaly
>>>
>>> # Identify extreme events using global-in-time 95th percentile
>>> extremes, thresholds = marEx.identify_extremes(
...     anomalies,
...     method_extreme="global_extreme",
...     threshold_percentile=95
... )
>>> print(f"Extreme events shape: {extremes.shape}")
Extreme events shape: (1461, 180, 360)
>>> print(f"Thresholds shape: {thresholds.shape}")
Thresholds shape: (180, 360)
>>> # Count total extreme events
>>> total_extremes = extremes.sum().compute()
>>> print(f"Total extreme events: {total_extremes}")
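To make the global-in-time approach concrete, the following is a minimal, dependency-free sketch of the idea rather than marEx's implementation: each grid cell gets a single threshold computed over its whole time series, and a time step is flagged when its anomaly exceeds that threshold. The helpers `pct` and `global_extremes`, the toy data, and the strict `>` comparison are all assumptions made for illustration.

```python
import statistics


def pct(values, q):
    """q-th percentile (integer q, 1..99) by linear interpolation between order statistics."""
    return statistics.quantiles(values, n=100, method="inclusive")[q - 1]


def global_extremes(series_by_cell, q=95):
    """Hypothetical helper: one threshold per cell, computed over all time steps."""
    thresholds = {cell: pct(ts, q) for cell, ts in series_by_cell.items()}
    extremes = {
        cell: [value > thresholds[cell] for value in ts]
        for cell, ts in series_by_cell.items()
    }
    return extremes, thresholds


# Toy data: two "grid cells", each with its own time series of anomalies.
toy = {
    "cell_a": [i / 10 for i in range(101)],  # 0.0, 0.1, ..., 10.0
    "cell_b": [-1.0, 0.0, 1.0, 2.0, 3.0],
}
extremes, thresholds = global_extremes(toy, q=95)
n_extreme_a = sum(extremes["cell_a"])  # time steps flagged in cell_a
print(thresholds["cell_a"], n_extreme_a)
```

As in the example above, the thresholds carry no time dimension (one value per cell), while the boolean mask keeps the full time axis.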
Using day-of-year specific thresholds (cf. Hobday et al. 2016 method):
>>> # More sophisticated threshold calculation
>>> extremes_hobday, thresholds_hobday = marEx.identify_extremes(
...     anomalies,
...     method_extreme="hobday_extreme",
...     threshold_percentile=95,
...     window_days_hobday=11,  # 11-day window around each day-of-year
...     window_spatial_hobday=3  # 3x3 spatial window for clustering percentile calculation
... )
>>> print(f"Hobday thresholds shape: {thresholds_hobday.shape}")
Hobday thresholds shape: (366, 180, 360)
>>> # Compare seasonal variation in thresholds
>>> summer_threshold = thresholds_hobday.sel(dayofyear=200).mean()
>>> winter_threshold = thresholds_hobday.sel(dayofyear=50).mean()
>>> print(f"Summer vs Winter thresholds: {summer_threshold:.3f} vs {winter_threshold:.3f}")
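The day-of-year thresholding can be illustrated with a small standalone sketch. It assumes a centred window that wraps around the year boundary and pools every sample inside the window before taking the percentile; whether marEx handles window centring, wrap-around, and leap days exactly this way is not stated in this page. The names `window_days` and `doy_threshold` are hypothetical.

```python
import statistics


def window_days(doy, window=11, n_days=366):
    """1-based days of year in a centred window around `doy`, wrapping at year end."""
    half = window // 2
    return [(doy - 1 + offset) % n_days + 1 for offset in range(-half, half + 1)]


def doy_threshold(samples_by_doy, doy, q=95, window=11):
    """Hypothetical helper: pool all samples inside the window, then take the percentile."""
    pooled = []
    for day in window_days(doy, window):
        pooled.extend(samples_by_doy[day])
    # Linear interpolation between order statistics (integer q, 1..99).
    return statistics.quantiles(pooled, n=100, method="inclusive")[q - 1]


# Toy data for one grid cell: a single sample per day, equal to the day number.
samples = {day: [float(day)] for day in range(1, 367)}
threshold_day_100 = doy_threshold(samples, doy=100)  # pools days 95..105
wrapped = window_days(1)  # window straddles the year boundary
print(threshold_day_100)  # 104.5
print(wrapped)
```

Repeating `doy_threshold` for every day gives one threshold per day-of-year, which is why the thresholds in the example above gain a leading `dayofyear` dimension of length 366.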
Comparison of exact vs approximate percentile methods:
>>> # Approximate method (faster, default)
>>> extremes_approx, thresh_approx = marEx.identify_extremes(
...     anomalies, method_percentile="approximate"
... )
>>>
>>> # Exact method (slower & memory intensive)
>>> extremes_exact, thresh_exact = marEx.identify_extremes(
...     anomalies, method_percentile="exact"
... )
>>>
>>> # Compare threshold precision (~0.005C)
>>> threshold_diff = (thresh_exact - thresh_approx).std().compute()
>>> print(f"Threshold difference (exact vs approx): {threshold_diff:.6f}")
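The role of `precision` and `max_anomaly` can be seen in a generic histogram-based percentile estimate, sketched below. This is an illustration of the technique under stated assumptions, not marEx's code: the bins are assumed to span `[-max_anomaly, max_anomaly)` with width `precision`, out-of-range values are clamped into the edge bins, and the estimate is the centre of the bin where the cumulative count reaches the target. The function name `approx_percentile` is hypothetical.

```python
import statistics


def approx_percentile(values, q, precision=0.01, max_anomaly=5.0):
    """Histogram estimate of the q-th percentile; returns the centre of the target bin."""
    n_bins = round(2 * max_anomaly / precision)  # assumed range: [-max_anomaly, max_anomaly)
    counts = [0] * n_bins
    for value in values:
        idx = int((value + max_anomaly) / precision)
        counts[min(max(idx, 0), n_bins - 1)] += 1  # clamp out-of-range values to edge bins
    target = q / 100.0 * len(values)
    running = 0
    for idx, count in enumerate(counts):
        running += count
        if running >= target:
            return -max_anomaly + (idx + 0.5) * precision
    return max_anomaly


# Toy anomalies on a regular grid between about -2 and 2.
values = [(i + 0.5) / 1000 for i in range(-2000, 2001)]
exact = statistics.quantiles(values, n=100, method="inclusive")[94]  # 95th percentile
approx = approx_percentile(values, 95)
error = abs(approx - exact)
print(f"exact={exact:.4f} approx={approx:.4f} error={error:.4f}")
```

In this sketch only the bin counts need to be kept, not the sorted values, which is the usual reason a histogram estimate is cheaper than an exact percentile. The price is an error of up to about half a bin width on densely sampled data, plus a loss of resolution for anomalies beyond `max_anomaly`.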
Different percentile thresholds for varying event rarity:
>>> # Conservative threshold - very extreme events only
>>> extremes_98, _ = marEx.identify_extremes(
...     anomalies, threshold_percentile=98
... )
>>>
>>> # Moderate threshold - more frequent events
>>> extremes_90, _ = marEx.identify_extremes(
...     anomalies, threshold_percentile=90
... )
>>>
>>> # Compare event frequency
>>> print(f"98th percentile events: {extremes_98.sum().compute()}")
>>> print(f"90th percentile events: {extremes_90.sum().compute()}")
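As a sanity check on the counts printed above, a percentile threshold fixes the expected share of flagged points: roughly (100 - p)% of samples exceed the p-th percentile. The sketch below shows this on toy data with distinct values; the helper `exceedance_count` is hypothetical, and real data with ties or day-of-year thresholds will only approximately follow this rule.

```python
import statistics


def exceedance_count(values, q):
    """Number of samples strictly above the q-th percentile (integer q, 1..99)."""
    threshold = statistics.quantiles(values, n=100, method="inclusive")[q - 1]
    return sum(value > threshold for value in values)


values = list(range(1000))  # 1000 distinct toy "anomalies"
count_90 = exceedance_count(values, 90)
count_98 = exceedance_count(values, 98)
print(count_90, count_98)  # 100 20
```

Here the 90th percentile flags 10% of the samples and the 98th flags 2%, so raising the percentile from 90 to 98 cuts the number of flagged points by a factor of five.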
Processing unstructured data:
>>> # ICON ocean model data
>>> icon_anomalies = xr.open_dataset('icon_anomalies.nc', chunks={}).dat_anomaly
>>> extremes_unstructured, thresholds_unstructured = marEx.identify_extremes(
...     icon_anomalies,
...     dimensions={"time": "time", "x": "ncells"},
...     coordinates={"time": "time", "x": "lon", "y": "lat"},
...     threshold_percentile=95
... )
>>> print(f"Unstructured extremes shape: {extremes_unstructured.shape}")
Advanced Hobday method with custom temporal window:
>>> # Longer temporal window for smoother thresholds
>>> extremes_smooth, thresholds_smooth = marEx.identify_extremes(
...     anomalies,
...     method_extreme="hobday_extreme",
...     window_days_hobday=31,  # Longer smoothing window
...     threshold_percentile=95
... )
>>>
>>> # Compare threshold smoothness
>>> std_11day = thresholds_hobday.std(dim='dayofyear').mean().compute()
>>> std_31day = thresholds_smooth.std(dim='dayofyear').mean().compute()
>>> print(f"Threshold variability: 11-day={std_11day:.3f}, 31-day={std_31day:.3f}")