marEx.detect.rolling_climatology

marEx.detect.rolling_climatology(da, window_year_baseline=15, dimensions=None, coordinates=None, use_temp_checkpoints=False)

Compute a rolling climatology efficiently using flox cohorts. The climatology is built from the previous window_year_baseline years of data and reassembled to match the original data structure. Years without enough preceding data are filled with NaN.

Parameters:

da (xarray.DataArray) – Input data with time coordinate
window_year_baseline (int, default=15) – Number of years to include in each climatology window
dimensions (dict, optional) – Mapping of dimensions to names in the data
coordinates (dict, optional) – Mapping of coordinates to names in the data
use_temp_checkpoints (bool, default=False)

Returns:

Rolling climatology with same shape as input data

Return type:

xarray.DataArray

Examples

Basic rolling climatology computation:

>>> import xarray as xr
>>> import marEx
>>>
>>> # Load 20 years of SST data
>>> sst = xr.open_dataset('sst_data.nc', chunks={}).sst.chunk({'time': 30})
>>>
>>> # Compute 15-year rolling climatology
>>> climatology = marEx.rolling_climatology(sst, window_year_baseline=15)
>>> print(climatology.shape)
(7305, 180, 360)  # Same as input
>>>
>>> # First 15 years will be NaN (insufficient history)
>>> print(f"NaN values in first year: {climatology.isel(time=slice(0, 365)).isnull().all().compute().item()}")
NaN values in first year: True

Shorter window for datasets with limited time span:

>>> # For datasets with only 10 years, use shorter window
>>> short_climatology = marEx.rolling_climatology(
...     sst, window_year_baseline=5
... )
>>> # First 5 years will be NaN instead of 15

Processing unstructured data:

>>> # ICON ocean model data
>>> icon_sst = xr.open_dataset('icon_sst.nc', chunks={}).to.chunk({'time': 25})
>>> icon_climatology = marEx.rolling_climatology(
...     icon_sst,
...     dimensions={"time": "time", "x": "ncells"},
...     coordinates={"time": "time", "x": "lon", "y": "lat"}
... )
>>> print(icon_climatology.sizes)
Frozen({'time': 7305, 'ncells': 83886})

Comparing with fixed climatology:

>>> # Fixed climatology (traditional approach)
>>> fixed_clim = sst.groupby(sst.time.dt.dayofyear).mean()
>>>
>>> # Rolling climatology (adaptive approach)
>>> rolling_clim = marEx.rolling_climatology(sst)
>>>
>>> # Rolling climatology adapts to climate change
>>> clim_2000 = rolling_clim.sel(time='2000').mean()
>>> clim_2020 = rolling_clim.sel(time='2020').mean()
>>> print(f"Climate change signal: {(clim_2020 - clim_2000).compute().item():.3f} °C")

Memory considerations for large datasets:

>>> # Ensure appropriate chunking for memory efficiency
>>> large_sst = sst.chunk({'time': 30, 'lat': 45, 'lon': 90})
>>> large_climatology = marEx.rolling_climatology(large_sst)
>>> # Output maintains input chunking structure
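
Conceptual sketch of the windowing logic:

The block below illustrates the behaviour described in the summary (mean over the previous window_year_baseline years, NaN where history is insufficient) on a plain 1-D daily series. It is a dependency-free sketch and not marEx's implementation, which relies on flox cohorts. The function name rolling_doy_climatology, the days_per_year parameter, and the no-leap 365-day calendar are assumptions made for illustration only.

```python
import math
from statistics import fmean


def rolling_doy_climatology(values, days_per_year=365, window_years=15):
    """Hypothetical reference: rolling day-of-year mean for a 1-D daily series.

    Assumes a no-leap calendar so the series splits cleanly into whole years;
    any trailing partial year is ignored.
    """
    n_years = len(values) // days_per_year
    by_year = [
        values[y * days_per_year:(y + 1) * days_per_year] for y in range(n_years)
    ]

    clim = []
    for y in range(n_years):
        if y < window_years:
            # Not enough preceding years: fill the whole year with NaN.
            clim.extend([math.nan] * days_per_year)
        else:
            # Average each day of year over the previous `window_years` years.
            window = by_year[y - window_years:y]
            clim.extend(fmean(day_values) for day_values in zip(*window))
    return clim


# Synthetic 20-year series: seasonal cycle plus a 0.02-per-year trend.
days = 365
series = [
    10 + 5 * math.sin(2 * math.pi * (t % days) / days) + 0.02 * (t // days)
    for t in range(20 * days)
]

clim = rolling_doy_climatology(series, days_per_year=days, window_years=15)
```

With a 15-year window on this 20-year series, the first 15 years of clim are NaN and the remaining 5 years hold values. Because each window slides forward by one year, the climatology for a given day of year rises with the trend, which is the adaptive behaviour contrasted with the fixed climatology above.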