Tracking Module (`marEx.track`)

The marEx.track module provides sophisticated algorithms for identifying and tracking extreme events through time. It handles complex scenarios including event merging, splitting, and temporal evolution using advanced image processing and graph theory techniques.

Overview

The tracking module implements a comprehensive workflow for converting binary extreme event masks into tracked event objects with unique identifiers, statistical properties, and temporal evolution information. It includes advanced algorithms for handling event lifecycles, including merging and splitting events whilst maintaining unique identities.

Key Features:

Binary Object Tracking: Advanced connected component analysis with dask parallelisation
Merge/Split Handling: Algorithms for event lifecycle management
Morphological Operations: Image processing for event preprocessing and hole filling
Statistical Analysis: Comprehensive event property calculation
Memory Efficient: Optimised for large datasets with intelligent chunking
Grid Agnostic: Works with both structured and unstructured grids

Main Classes

tracker(data_bin, mask, R_fill[, ...])

Tracker identifies and tracks arbitrary binary objects in spatial data through time.

Tracking Algorithm Comparison

The video below demonstrates the critical difference between basic connectivity-based tracking and marEx’s advanced merge/split algorithm:

Left (Basic Method — 3D Connected Components): Uses simple 3D connected component labeling where ANY objects that touch at any point in space-time are permanently merged into the same event. This creates a chain reaction: Event A touches Event B → merged. The merged AB touches Event C → all become one event. Over time, this produces unrealistic basin-spanning “mega-events” that spuriously link dozens of independent physical phenomena. These mega-events are algorithmically-induced artifacts with no coherent physical origin, making their statistics (size, duration, intensity) meaningless for mechanistic analysis.

Right (marEx Advanced Method — Genealogy Tracking): Prevents mega-events using overlap_threshold (requires significant overlap, not just touching) and maintains individual event identities through merge/split events using nearest-neighbor partitioning. Records complete parent/child genealogy in merge_ledger, enabling reconstruction of physically realistic event evolution. Tracked events correspond to actual coherent phenomena, providing mechanistically relevant statistics for scientific analysis.

Detailed Documentation

Tracker Class

class marEx.track.tracker(data_bin, mask, R_fill, area_filter_quartile=None, area_filter_absolute=None, temp_dir=None, T_fill=2, allow_merging=True, nn_partitioning=False, overlap_threshold=0.5, unstructured_grid=False, dimensions=None, coordinates=None, neighbours=None, cell_areas=None, grid_resolution=None, max_iteration=40, checkpoint=None, debug=0, verbose=None, quiet=None, regional_mode=False, coordinate_units=None)[source]

Bases: object

Tracker identifies and tracks arbitrary binary objects in spatial data through time.

The tracker supports both structured (regular grid) and unstructured data, and seamlessly handles splitting & merging of objects. It identifies connected regions in binary data at each time step, and tracks these as evolving events through time.

Main workflow:

Preprocessing: Fill spatiotemporal holes, filter small objects
Object identification: Label connected components at each time
Tracking: Determine object correspondences across time
Optional splitting & merging: Handle complex event evolution

Parameters:

data_bin (xarray.DataArray) – Binary field of extreme points to group, label, and track (True = object, False = background) Must represent and underlying dask array.
mask (xarray.DataArray) – Binary mask indicating valid regions (True = valid, False = invalid)
R_fill (int) – The radius of the kernel used in morphological opening & closing, relating to the largest hole/gap that can be filled. In units of grid cells.
area_filter_quartile (float, optional) – The fraction of the smallest objects to discard, i.e. the quantile defining the smallest area object retained. Quantile must be in (0-1) (e.g., 0.25 removes smallest 25%). Mutually exclusive with area_filter_absolute. Default is 0.5 if neither parameter is provided.
area_filter_absolute (int, optional) – The minimum area (in grid cells) for an object to be retained. Mutually exclusive with area_filter_quartile. Use this for fixed minimum area thresholds (e.g., 10 cells minimum).
temp_dir (str, optional) – Path to temporary directory for storing intermediate results
T_fill (int, default=2) – The permissible temporal gap (in days) between objects for tracking continuity to be maintained (must be even)
allow_merging (bool, default=True) – Allow objects to split and merge across time. Apply splitting & merging criteria, track merge events, and maintain original identities of merged objects across time. N.B.: False reverts to classical ndmeasure.label with simplar time connectivity, i.e. Scannell et al.
nn_partitioning (bool, default=False) – Implement a better partitioning of merged child objects based on closest parent cell. False reverts to using parent centroids to determine partitioning between new child objects, i.e. Di Sun & Bohai Zhang 2023. N.B.: Centroid-based partitioning has major problems with small merging objects suddenly obtaining unrealistically-large (and often disjoint) fractions of the larger object.
overlap_threshold (float, default=0.5) – The fraction of the smaller object’s area that must overlap with the larger object’s area to be considered the same event and continue tracking with the same ID.
unstructured_grid (bool, default=False) – Whether data is on an unstructured grid
dimensions (dict, default={"time": "time", "x": "lon", "y": "lat"}) – Mapping of dimensions to names in the data
coordinates (dict, optional) – Coordinate names for unstructured grids. Should contain ‘x’ and ‘y’ keys for x and y coordinates. May also contain ‘time’ if the time coordinate name is different from the dimension name.
neighbours (xarray.DataArray, optional) – For unstructured grid, indicates connectivity between cells
cell_areas (xarray.DataArray, optional) – For unstructured grid, area of each cell (required). For structured grid, area of each cell (optional). If not provided, defaults to 1.0 for each cell (resulting in cell counts as areas). Note: Overridden by grid_resolution if provided for structured grids.
grid_resolution (float, optional) – Grid resolution in degrees for structured grids only (ignored for unstructured grids). When provided, automatically calculates cell areas using spherical geometry. Overrides any provided cell_areas parameter.
max_iteration (int, default=40) – Maximum number of iterations for merging/splitting algorithm
checkpoint (str, default='None') – Checkpoint strategy (‘save’, ‘load’, or None)
debug (int, default=0) – Debug level (0-2)
verbose (bool, optional) – Enable verbose logging with detailed progress information. If None, uses current global logging configuration.
quiet (bool, optional) – Enable quiet logging with minimal output (warnings and errors only). If None, uses current global logging configuration. Note: quiet takes precedence over verbose if both are True.
regional_mode (bool, default=False) – Enable regional mode for non-global coordinate ranges. When True, coordinate_units must be specified.
coordinate_units (str, optional) – Coordinate units when regional_mode=True. Must be either ‘degrees’ or ‘radians’.

Examples

Basic tracking of marine heatwave events from preprocessed data:

>>> import xarray as xr
>>> import marEx
>>>
>>> # Load preprocessed extreme events data
>>> processed = xr.open_dataset('extreme_events.nc', chunks={})
>>> extreme_events = processed.extreme_events  # Boolean array
>>> mask = processed.mask  # Ocean/land mask
>>>
>>> # Initialise tracker with basic parameters
>>> tracker = marEx.tracker(
...     extreme_events,
...     mask,
...     R_fill=8,                    # Fill holes up to 8 grid cells
...     area_filter_quartile=0.5     # Remove smallest 50% of objects
...     allow_merging=False          # Basic tracking without splitting/merging
... )
>>>
>>> # Run tracking algorithm
>>> events = tracker.run()
>>> print(f"Identified {events.ID.max().compute()} distinct events")
Identified 1247 distinct events

Using automatic grid area calculation from resolution:

>>> # For regular lat/lon grids, automatically calculate physical areas
>>> grid_tracker = marEx.tracker(
...     extreme_events,
...     mask,
...     R_fill=8,
...     area_filter_quartile=0.5,
...     grid_resolution=0.25  # Grid resolution in degrees
... )
>>> # Cell areas are calculated automatically using spherical geometry
>>> grid_events = grid_tracker.run()

Advanced tracking with merging and splitting enabled:

>>> # More sophisticated tracking with temporal gap filling
>>> advanced_tracker = marEx.tracker(
...     extreme_events,
...     mask,
...     R_fill=12,               # Larger spatial gap filling
...     T_fill=4,                # Fill up to 4-day temporal gaps
...     area_filter_quartile=0.25,  # More aggressive size filtering
...     allow_merging=True,      # Enable split/merge detection
...     overlap_threshold=0.3    # Lower threshold for object linking
... )
>>>
>>> events_advanced, merges_log = advanced_tracker.run(return_merges=True)
>>> print(events_advanced.data_vars)
Data variables:
    event           (time, lat, lon)        int32           dask.array<chunksize=(25, 180, 360)>
    event_centroid  (time, lat, lon)        int32           dask.array<chunksize=(25, 180, 360)>
    ID_field        (time, lat, lon)        int32           dask.array<chunksize=(25, 180, 360)>
    global_ID       (time, ID)              int32           dask.array<chunksize=(25, 1247)>
    area            (time, ID)              float32         dask.array<chunksize=(25, 1247)>
    centroid        (component, time, ID)   float64         dask.array<chunksize=(2, 25, 1247)>
    presence        (time, ID)              bool            dask.array<chunksize=(25, 1247)>
    time_start      (ID)                    datetime64[ns]  dask.array<chunksize=(1247,)>
    time_end        (ID)                    datetime64[ns]  dask.array<chunksize=(1247,)>
    merge_ledger    (time, ID, sibling_ID)  int32           dask.array<chunksize=(25, 1247, 10)>

Processing unstructured ocean model data (ICON):

>>> # Load ICON ocean model data with connectivity
>>> icon_data = xr.open_dataset('icon_extremes.nc', chunks={})
>>> icon_extremes = icon_data.extreme_events  # (time, ncells)
>>> icon_mask = icon_data.mask
>>> neighbours = icon_data.neighbours  # Cell connectivity
>>> cell_areas = icon_data.cell_areas  # Physical areas
>>>
>>> # Track events on unstructured grid
>>> unstructured_tracker = marEx.tracker(
...     icon_extremes,
...     icon_mask,
...     R_fill=5,                                   # 5-neighbor radius for gap filling
...     area_filter_quartile=0.6,                   # Remove 60% of smallest events
...     unstructured_grid=True,                     # Enable unstructured mode
...     dimensions={"x": "ncells"},                 # Must specify the name of the spatial dimension
...     coordinates={"x": "lon", "y": "lat"},       # Spatial coordinate names
...     neighbours=neighbours,                      # Required for unstructured
...     cell_areas=cell_areas                       # Required for area calculations
... )
>>> unstructured_events = unstructured_tracker.run()

Memory management and checkpointing for large datasets:

>>> # Use checkpointing for very large datasets
>>> large_tracker = marEx.tracker(
...     extreme_events,
...     mask,
...     R_fill=8,
...     area_filter_quartile=0.5,
...     temp_dir='/scratch/user/tracking_temp',  # Temporary storage
...     checkpoint='save'             # Save intermediate results
... )
>>> # Processing can be resumed if interrupted
>>> large_events = large_tracker.run()

Comparing different filtering strategies:

>>> # Conservative filtering - keep more events
>>> conservative = marEx.tracker(
...     extreme_events, mask, R_fill=5, area_filter_quartile=0.1
... )
>>> conservative_events = conservative.run()
>>>
>>> # Aggressive filtering - focus on largest events
>>> aggressive = marEx.tracker(
...     extreme_events, mask, R_fill=15, area_filter_quartile=0.8
... )
>>> aggressive_events = aggressive.run()
>>>
>>> print(f"Conservative: {conservative_events.ID.max().compute()} events")
>>> print(f"Aggressive: {aggressive_events.ID.max().compute()} events")

Using absolute area filtering instead of percentile-based:

>>> # Filter objects smaller than 25 grid cells
>>> absolute_tracker = marEx.tracker(
...     extreme_events, mask, R_fill=8, area_filter_absolute=25
... )
>>> absolute_events = absolute_tracker.run()
>>>
>>> # Default behavior (area_filter_quartile=0.5) when no parameters provided
>>> default_tracker = marEx.tracker(extreme_events, mask, R_fill=8)
>>> default_events = default_tracker.run()  # Uses quartile=0.5 filtering

Using physical cell areas for structured grids:

>>> # Load data with irregular grid cell areas
>>> grid_areas = xr.open_dataset('grid_areas.nc').cell_area  # (lat, lon) in m²
>>>
>>> # Track events using physical areas instead of cell counts
>>> physical_tracker = marEx.tracker(
...     extreme_events,
...     mask,
...     R_fill=8,
...     area_filter_quartile=0.5,
...     cell_areas=grid_areas  # Physical areas in m²
... )
>>> events = physical_tracker.run()
>>> # Now events.area contains physical areas in m² instead of cell counts

Integration with full marEx workflow:

>>> # Complete workflow from raw data to tracked events
>>> raw_sst = xr.open_dataset('sst_data.nc', chunks={}).sst.chunk({'time': 30})
>>>
>>> # Step 1: Preprocess to identify extremes
>>> processed = marEx.preprocess_data(raw_sst, threshold_percentile=95)
>>>
>>> # Step 2: Track extreme events
>>> tracker = marEx.tracker(
...     processed.extreme_events,
...     processed.mask,
...     R_fill=8,
...     area_filter_quartile=0.5
... )
>>> tracked_events = tracker.run()

Initialise the tracker with parameters and data.

__init__(data_bin, mask, R_fill, area_filter_quartile=None, area_filter_absolute=None, temp_dir=None, T_fill=2, allow_merging=True, nn_partitioning=False, overlap_threshold=0.5, unstructured_grid=False, dimensions=None, coordinates=None, neighbours=None, cell_areas=None, grid_resolution=None, max_iteration=40, checkpoint=None, debug=0, verbose=None, quiet=None, regional_mode=False, coordinate_units=None)[source]

Initialise the tracker with parameters and data.

Parameters:

data_bin (DataArray)
mask (DataArray)
R_fill (int | float)
area_filter_quartile (float | None)
area_filter_absolute (int | None)
temp_dir (str | None)
T_fill (int)
allow_merging (bool)
nn_partitioning (bool)
overlap_threshold (float)
unstructured_grid (bool)
dimensions (Dict[str, str] | None)
coordinates (Dict[str, str] | None)
neighbours (DataArray | None)
cell_areas (DataArray | None)
grid_resolution (float | None)
max_iteration (int)
checkpoint (Literal['save', 'load', 'None'] | None)
debug (int)
verbose (bool | None)
quiet (bool | None)
regional_mode (bool)
coordinate_units (Literal['degrees', 'radians'] | None)

Return type:

None

run(return_merges=False, checkpoint=None)[source]

Run the complete object identification and tracking pipeline.

This method executes the full workflow:

Preprocessing: morphological operations and size filtering
Identification and tracking of objects through time
Computing and attaching statistics to the results

Parameters:

return_merges (bool, default=False) – If True, return merge events dataset alongside the main events
checkpoint (str, optional) – Override the instance checkpoint setting

Returns:

events_ds (xarray.Dataset) – Dataset containing tracked events and their properties
merges_ds (xarray.Dataset, optional) – Dataset with merge event information (only if return_merges=True)

Return type:

Dataset | Tuple[Dataset, Dataset]

run_preprocess(checkpoint=None)[source]

Preprocess binary data to prepare for tracking.

This performs morphological operations to fill holes/gaps in both space and time, then filters small objects according to the area_filter_quartile or area_filter_absolute.

Parameters:

checkpoint (str, optional) – Checkpoint strategy override

Returns:

data_bin_filtered (xarray.DataArray) – Preprocessed binary data
object_stats (tuple) – Statistics about the preprocessing

Return type:

Tuple[DataArray, Tuple[float, int, int, float, float, float]]

run_tracking(data_bin_preprocessed)[source]

Track objects through time to identify events.

Parameters:

data_bin_preprocessed (xarray.DataArray) – Preprocessed binary data

Returns:

events_ds (xarray.Dataset) – Dataset containing tracked events
merges_ds (xarray.Dataset) – Dataset with merge information
N_events_final (int) – Final number of unique events

Return type:

Tuple[Dataset, Dataset, int]

run_stats_attributes(events_ds, merges_ds, object_stats, N_events_final)[source]

Add statistics and attributes to the events dataset.

Parameters:

events_ds (xarray.Dataset) – Dataset containing tracked events
merges_ds (xarray.Dataset) – Dataset with merge information
object_stats (tuple) – Preprocessed object statistics
N_events_final (int) – Final number of events

Returns:

events_ds – Dataset with added statistics and attributes

Return type:

xarray.Dataset

compute_area(data_bin)[source]

Compute the total area of binary data at each time.

Parameters:: data_bin (xarray.DataArray) – Binary data
Returns:: area – Total area at each time (units: pixels for structured grid, matching cell_area for unstructured)
Return type:: xarray.DataArray

fill_holes(data_bin, R_fill=None)[source]

Fill holes and gaps using morphological operations.

This performs closing (dilation followed by erosion) to fill small gaps, then opening (erosion followed by dilation) to remove small isolated objects.

Parameters:

data_bin (xarray.DataArray) – Binary data to process
R_fill (int, optional) – Fill radius override

Returns:

data_bin_filled – Binary data with holes/gaps filled

Return type:

xarray.DataArray

fill_time_gaps(data_bin)[source]

Fill temporal gaps between objects.

Performs binary closing (dilation then erosion) along the time dimension to fill small time gaps between objects.

Parameters:: data_bin (xarray.DataArray) – Binary data to process
Returns:: data_bin_filled – Binary data with temporal gaps filled
Return type:: xarray.DataArray

refresh_dask_graph(data_bin)[source]

Clear and reset the Dask graph via save/load cycle.

This is needed to work around a memory leak bug in Dask where “Unmanaged Memory” builds up within loops.

Parameters:: data_bin (xarray.DataArray) – Data to refresh
Returns:: data_new – Data with fresh Dask graph
Return type:: xarray.DataArray

filter_small_objects(data_bin)[source]

Remove objects smaller than a threshold area.

Parameters:

data_bin (xarray.DataArray) – Binary data to filter

Returns:

data_bin_filtered (xarray.DataArray) – Binary data with small objects removed
area_threshold (float) – Area threshold used for filtering
object_areas (xarray.DataArray) – Areas of all objects pre-filtering
N_objects_prefiltered (int) – Number of objects before filtering
N_objects_filtered (int) – Number of objects after filtering

Return type:

Tuple[DataArray, float, DataArray, int, int]

identify_objects(data_bin, time_connectivity)[source]

Identify connected regions in binary data.

Parameters:

data_bin (xarray.DataArray) – Binary data to identify objects in
time_connectivity (bool) – Whether to connect objects across time

Returns:

object_id_field (xarray.DataArray) – Field of integer IDs for each object
None (NoneType) – Placeholder for compatibility with track_objects
N_objects (int) – Number of objects identified

Return type:

Tuple[DataArray, None, int]

calculate_centroid(binary_mask, original_centroid=None)[source]

Calculate object centroid, handling edge cases for periodic boundaries.

Parameters:

binary_mask (numpy.ndarray) – 2D binary array where True indicates the object (dimensions are (y,x))
original_centroid (tuple, optional) – (y_centroid, x_centroid) from regionprops_table

Returns:

(y_centroid, x_centroid)

Return type:

tuple

calculate_object_properties(object_id_field, properties=None)[source]

Calculate properties of objects from ID field.

Parameters:

object_id_field (xarray.DataArray) – Field containing object IDs
properties (list, optional) – List of properties to calculate (defaults to [‘label’, ‘area’])

Returns:

object_props – Dataset containing calculated properties with ‘ID’ dimension

Return type:

xarray.Dataset

check_overlap_slice(ids_t0, ids_next)[source]

Find overlapping objects between two consecutive time slices.

Parameters:

ids_t0 (numpy.ndarray) – Object IDs at current time
ids_next (numpy.ndarray) – Object IDs at next time

Returns:

Array of shape (n_overlaps, 3) with [id_t0, id_next, overlap_area]

Return type:

numpy.ndarray

find_overlapping_objects(object_id_field)[source]

Find all overlapping objects across time.

Parameters:

object_id_field (xarray.DataArray) – Field containing object IDs

Returns:

overlap_objects_list_unique_filtered – Array of object ID pairs that overlap across time, with overlap area The object in the first column precedes the second column in time. The third column contains:

For structured grid: number of overlapping pixels (int32)

For unstructured grid: total overlapping area in m^2 (float32)

Return type:

(N x 3) numpy.ndarray

enforce_overlap_threshold(overlap_objects_list, object_props)[source]

Filter object pairs based on overlap threshold.

Parameters:

overlap_objects_list ((N x 3) numpy.ndarray) – Array of object ID pairs with overlap area
object_props (xarray.Dataset) – Object properties including area

Returns:

overlap_objects_list_filtered – Filtered array of object ID pairs that meet the overlap threshold

Return type:

(M x 3) numpy.ndarray

consolidate_object_ids(data_t_minus_2, data_t_minus_1, object_props, timestep)[source]

Consolidate object IDs between t-2 and t-1 to ensure consistent tracking.

This identifies objects at t-1 that are actually continuations of objects from t-2 (but got different IDs due to partitioning) and renames them to maintain consistent IDs across timesteps.

Parameters:

data_t_minus_2 (xr.DataArray) – Object field at timestep t-2
data_t_minus_1 (xr.DataArray) – Object field at timestep t-1 (will be modified)
object_props (xr.Dataset) – Object properties dataset (will be modified)
timestep (int) – Current timestep number for logging purposes

Returns:

data_t_minus_1_consolidated (xr.DataArray) – Updated t-1 field with consolidated IDs
object_props_updated (xr.Dataset) – Updated object properties with merged/deleted objects

Return type:

Tuple[DataArray, Dataset]

Notes

Uses self.overlap_threshold for determining consolidation eligibility
Updates object properties by recalculating for consolidated objects
Removes redundant child objects from object_props

compute_id_time_dict(da, child_objects, max_objects, all_objects=True)[source]

Generate lookup table mapping object IDs to their time index.

Parameters:

da (xarray.DataArray) – Field of object IDs
child_objects (list or array) – Object IDs to include in the dictionary
max_objects (int) – Maximum number of objects
all_objects (bool, default=True) – Whether to process all objects or just child_objects

Returns:

time_index_map – Dictionary mapping object IDs to time indices

Return type:

dict

track_objects(data_bin)[source]

Track objects through time to form events.

This is the main tracking method that handles splitting and merging of objects.

Parameters:

data_bin (xarray.DataArray) – Preprocessed binary data: Field of globally unique integer IDs of each element in connected regions. ID = 0 indicates no object.

Returns:

split_merged_events_ds (xarray.Dataset) – Dataset containing tracked events
merge_events (xarray.Dataset) – Dataset with merge information
N_events (int) – Final number of events

Return type:

Tuple[Dataset, Dataset, int]

cluster_rename_objects_and_props(object_id_field_unique, object_props, overlap_objects_list, merge_events)[source]

Cluster the object pairs and relabel to determine final event IDs.

Parameters:

object_id_field_unique (xarray.DataArray) – Field of unique object IDs. IDs must not be repeated across time.
object_props (xarray.Dataset) – Properties of each object that also need to be relabeled.
overlap_objects_list ((N x 2) numpy.ndarray) – Array of object ID pairs that indicate which objects are in the same event. The object in the first column precedes the second column in time.
merge_events (xarray.Dataset) – Information about merge events

Returns:

split_merged_events_ds – Dataset with relabeled events and their properties. ID = 0 indicates no object.

Return type:

xarray.Dataset

split_and_merge_objects(object_id_field_unique, object_props)[source]

Implement object splitting and merging logic.

This identifies and processes cases where objects split or merge over time, creating new object IDs as needed.

Parameters:

object_id_field_unique (xarray.DataArray) – Field of unique object IDs. IDs are required to be monotonically increasing with time.
object_props (xarray.Dataset) – Properties of each object

Returns:

(object_id_field, object_props, overlap_objects_list, merge_events)

Return type:

tuple

split_and_merge_objects_parallel(object_id_field_unique, object_props)[source]

Optimised parallel implementation of object splitting and merging.

This version is specifically designed for unstructured grids with more efficient memory handling and better parallelism than the standard split_and_merge_objects method. It processes data in chunks, handles merging events, and efficiently updates object IDs.

Parameters:

object_id_field_unique (xarray.DataArray) – Field of unique object IDs
object_props (xarray.Dataset) – Properties of each object

Returns:

(object_id_field, object_props, overlap_objects_list, merge_events)

Return type:

tuple

Basic Usage Examples

Simple Event Tracking

import xarray as xr
import marEx

# Load preprocessed extreme events data
extremes_ds = xr.open_dataset('extreme_events.nc')

# Basic tracking with default parameters
event_tracker = marEx.tracker(
    extremes_ds.extreme_events,  # Binary extreme events field
    extremes_ds.mask,            # Land-sea mask
    R_fill=8,                    # Fill holes with radius < 8 cells
    area_filter_absolute=100     # Remove objects smaller than 100 grid cells
)

# Run tracking algorithm
tracked_events = event_tracker.run()

Advanced Tracking Configuration

# Advanced tracking with merge/split handling
event_tracker = marEx.tracker(
    extremes_ds.extreme_events,
    extremes_ds.mask,
    R_fill=8,                    # Fill holes with radius < 8 cells
    area_filter_quartile=0.5,    # Remove smallest 50% of events (alternative to area_filter_absolute)
    T_fill=2,                    # Allow 2-day gaps in tracking
    allow_merging=True,          # Enable merge/split tracking
    overlap_threshold=0.5,       # 50% overlap required for continuity
    nn_partitioning=True,        # Use nearest-neighbor partitioning
    cell_areas=grid_areas        # Optional: physical cell areas (m²)
)

# Run tracking and get merge information
tracked_events, merges_ds = event_tracker.run(return_merges=True)

Unstructured Grid Tracking

# For unstructured grids, specify dimensions
dimensions = {'time': 'time', 'x': 'ncells'}
coordinates = {'time': 'time', 'x': 'lon', 'y': 'lat'}

event_tracker = marEx.tracker(
    extremes_ds.extreme_events,
    extremes_ds.mask,
    R_fill=5,                    # Adjusted for unstructured grid
    area_filter_quartile=0.4,
    dimensions=dimensions,
    coordinates=coordinates
)

tracked_events = event_tracker.run()

Parameter Reference

Required Parameters

data_binxarray.DataArray

Binary data to identify and track objects in (True = event, False = background). Must be Dask-backed and boolean type.

maskxarray.DataArray

Binary mask indicating valid regions (True = valid, False = invalid). Must be boolean type.

R_fillint

Radius for filling holes/gaps in spatial domain (in grid cells). Controls morphological operations to fill small gaps in binary objects.

How R_fill Works:

The R_fill parameter controls a two-step morphological cleaning operation: closing followed by opening. This process makes identified events spatially coherent and removes small, isolated noise.

Physical Rationale:

Closing (Fill Holes): Marine heatwaves are generally coherent spatial phenomena. Small “holes” within a larger event are often artefacts of data gridding or minor fluctuations below the threshold. Closing (dilation → erosion) fills these internal gaps, making the event representation physically realistic.
Opening (Remove Noise): Very small, isolated pixels flagged as events are often statistical noise or sensor errors rather than genuine marine heatwaves. Opening (erosion → dilation) removes these spurious objects, ensuring only spatially significant events are tracked.

Visual Process (using R_fill=1 as example):

    Initial State                  After Closing                  After Opening
┌──────────────────────┐       ┌──────────────────────┐       ┌──────────────────────┐
│   █████              │       │   █████              │       │   █████              │
│  ███████             │       │  ███████             │       │  ███████             │
│ ███ █████            │       │ █████████            │       │ █████████            │
│ ███  ████   █        │       │ █████████   █        │       │ █████████            │
│  ████████            │       │  ████████            │       │  ████████            │
│   ███████            │       │   ███████            │       │   ███████            │
│    █████             │       │    █████             │       │    █████             │
│           █          │       │           █          │       │                      │
└──────────────────────┘       └──────────────────────┘       └──────────────────────┘
│                              │                              │
└─ Main event with             └─ Closing fills internal      └─ Opening removes
   2 small holes inside           holes, making event            isolated noise pixels,
   + 2 isolated noise pixels      more coherent                  leaving clean event

Structuring Element (Kernel):

Operations use a disk-shaped kernel with diameter = 2 * R_fill + 1. For R_fill=2, the kernel diameter is 5 pixels:

┌──────────────┐
│      ██      │     Any hole or isolated object smaller
│   ██ ██ ██   │     than this disk will be filled or removed
│██ ██ ██ ██ ██│
│   ██ ██ ██   │
│      ██      │
└──────────────┘

Example: R_fill=8 uses a (circular) disk kernel with diameter 17 pixels, filling holes and removing objects up to ~8 grid cells in radius.

When to Adjust:

Smaller R_fill (e.g., 3-5): Preserves small-scale features; use for coarse data or fine structures
Larger R_fill (e.g., 10-15): Creates more coherent objects; use for noisy data or large-scale events

area_filter_quartilefloat

Quantile (0-1) for filtering smallest objects. For example, 0.25 removes smallest 25% of objects, 0.5 removes smallest 50%. Mutually exclusive with area_filter_absolute.

area_filter_absoluteint

Minimum area (in grid cells) for an object to be retained. For example, 25 keeps only events with 25 or more grid cells. Mutually exclusive with area_filter_quartile.

Core Tracking Parameters

T_fillint, default=2

Number of timesteps for filling temporal gaps (must be even). Allows events to be tracked across short temporal interruptions.

How T_fill Works:

T_fill = Maximum temporal gap (in days) for event continuity

Timeline with T_fill=2:

Day 0: ████████  ← Event detected
Day 1:            ← Gap (no detection)
Day 2:            ← Gap (no detection)  } T_fill=2 days
Day 3: ████████  ← Event detected again
       │        │
       └────────┴─ SAME event ID (gap ≤ T_fill)

Timeline with T_fill=0 (no gap filling):

Day 0: ████████  ← Event ID: 1
Day 1:            ← Gap
Day 2: ████████  ← NEW event ID: 2 (no gap tolerance)

Example: T_fill=2 allows 2-day gaps while maintaining the same event ID.

When to Adjust:

T_fill=0: No gap filling, strict continuity requirement, or for weekly/monthly sampling
T_fill=2-4: Standard gap filling for daily data (recommended)
T_fill=6-10: More permissive for noisy data

allow_mergingbool, default=True

Allow objects to split and merge across time.

True: Apply splitting & merging criteria, track merge events, and maintain original identities
False: Classical connected component labeling with simple time connectivity

nn_partitioningbool, default=True

Use nearest-neighbor cell-based partitioning for merging events (improved algorithm).

True: Partition merged child objects based on closest parent cell (recommended)
False: Use parent centroids for partitioning (legacy method with known issues)

Why This Matters: When objects merge, their new area must be partitioned and assigned back to the original parent identities. The choice of algorithm for this partitioning is critical.

The centroid method partitions based on proximity to parent centroids. This can be problematic for non-convex or L-shaped features, where the centroid may lie far from the feature’s cells, leading to physically unrealistic partitions.

In contrast, the nn (nearest-neighbor) method assigns each cell to the parent of the closest cell from the previous timestep. This results in a more intuitive and physically-based partition that respects the geometry of the parent features.

Visual Comparison:

Scenario: A large C-shaped feature (A) and a small nearby feature (B) merge into a single large object. We compare how centroid-based vs nearest-neighbor methods partition the merged child back to parent lineages.

1. Initial State (Time t)

Two distinct parent features with centroids marked by ●.

  ┌───┬───┬───┬───┬───┐
5 │ A │ A │ A │ A │ A │
  ├───┼───┼───┼───┼───┤
4 │ A │ A │   │   │   │
  ├───┼───┼───┼───┼───┤
3 │ A │ ● │   │ B ● B │  ← Centroid A at (3, 2), Centroid B at (3, 4.5)
  ├───┼───┼───┼───┼───┤
2 │ A │ A │   │   │   │
  ├───┼───┼───┼───┼───┤
1 │ A │ A │ A │ A │ A │
  └───┴───┴───┴───┴───┘
    1   2   3   4   5

2. Merged State (Time t+1)

The features grow and merge into a single large object marked with ‘#’. Question: how should these 20 cells be attributed to lineages A and B ?

  ┌───┬───┬───┬───┬───┐
5 │   │ # │ # │ # │ # │
  ├───┼───┼───┼───┼───┤
4 │ # │ # │ # │   │   │
  ├───┼───┼───┼───┼───┤
3 │ # │ # │ # │ # │ # │
  ├───┼───┼───┼───┼───┤
2 │ # │ # │ # │   │   │
  ├───┼───┼───┼───┼───┤
1 │   │ # │ # │ # │ # │
  └───┴───┴───┴───┴───┘
    1   2   3   4   5

3. Centroid Partition (nn_partitioning=False) - ❌ PROBLEMATIC

Each cell assigned to nearest parent centroid. The partition line ║ creates a geometric boundary that ignores actual cell topology. Result: B is an odd disjoint object now !

  ┌───┬───┬──╦┬───┬───┐
5 │   │ A │ A║│ B │ B │
  ├───┼───┼──╫┼───┼───┤
4 │ A │ A │ A║│   │   │
  ├───┼───┼──╫┼───┼───┤
3 │ A │ A │ A║│ B │ B │
  ├───┼───┼──╫┼───┼───┤
2 │ A │ A │ A║│   │   │
  ├───┼───┼──╫┼───┼───┤
1 │   │ A │ A║│ B │ B │
  └───┴───┴──╩┴───┴───┘
    1   2   3   4   5

Problem: B is an odd disjoint object with portions unrelated to the original parent. The original C-shaped object means that the centroid misrepresents A’s main body, thus creating unrealistic/nonrepresentative partitions.

**4. Nearest-Neighbour Partition (nn_partitioning=True)

Each cell assigned to parent of nearest parent cell. Natural boundaries respect actual spatial connectivity. Result: B remains contiguous (realistic growth pattern).

  ┌───┬───┬───┬───┬───┐
5 │   │ A │ A │ A │ A │
  ├───┼───┼───┼───┼───┤
4 │ A │ A │ A │   │   │
  ├───┼───┼───┼───┼───┤
3 │ A │ A │ B │ B │ B │
  ├───┼───┼───┼───┼───┤
2 │ A │ A │ A │   │   │
  ├───┼───┼───┼───┼───┤
1 │   │ A │ A │ A │ A │
  └───┴───┴───┴───┴───┘
    1   2   3   4   5

Solution: Growth of the resulting merged object is correctly attributed to A’s lineage based on cell-level proximity, not abstract geometric centroids.

Example: Small Object Problem:

# Centroid method (old, has issues)
tracker_centroid = marEx.tracker(
    extremes, mask,
    R_fill=8,
    allow_merging=True,
    nn_partitioning=False  # Small objects get unrealistic portions
)

# Nearest-neighbor method (new, recommended)
tracker_nn = marEx.tracker(
    extremes, mask,
    R_fill=8,
    allow_merging=True,
    nn_partitioning=True   # Realistic partitioning
)

When to Use:

Always use ``nn_partitioning=True`` (default) for accurate merging/splitting
Only use nn_partitioning=False for comparison with legacy results or Sun et al. (2023) methodology

overlap_thresholdfloat, default=0.5

Minimum fraction of overlap between objects to consider them the same event. Fraction of the smaller object’s area that must overlap with the larger object’s area.

How overlap_threshold Works - Two-Stage Process:

═══════════════════════════════════════════════════════════════════════════════
STAGE 1: OVERLAP CHECKING (Determines IF objects are related)
═══════════════════════════════════════════════════════════════════════════════

Formula: overlap_fraction = overlap_area / min(area_parent, area_child)
Decision: IF overlap_fraction >= overlap_threshold → Objects are LINKED

This stage applies to ALL transitions (1→1, 1→many, many→1, many→many)
This stage is INDEPENDENT of nn_partitioning choice

───────────────────────────────────────────────────────────────────────────────
Example 1: Threshold MET (overlap_threshold = 0.5)
───────────────────────────────────────────────────────────────────────────────

Time t=0:          Time t=1:          Overlap Check:

┌──────────┐       ┌─────────┐        overlap_area = 60 cells
│  Object  │       │ Object  │        min(100, 80) = 80
│    A     │   →   │    C    │
│   100    │       │   80    │        overlap_fraction = 60/80 = 0.75
│  cells   │       │  cells  │
└──────────┘       └─────────┘        0.75 >= 0.5 ✓ → LINKED (same event)

───────────────────────────────────────────────────────────────────────────────
Example 2: Threshold NOT MET (overlap_threshold = 0.5)
───────────────────────────────────────────────────────────────────────────────

Time t=0:          Time t=1:          Overlap Check:

┌──────────┐       ┌─────────┐        overlap_area = 30 cells
│  Object  │       │ Object  │        min(100, 80) = 80
│    A     │   ╳   │    C    │
│   100    │       │   80    │        overlap_fraction = 30/80 = 0.375
│  cells   │       │  cells  │
└──────────┘       └─────────┘        0.375 < 0.5 ✗ → NOT LINKED

                                      Result: A terminates, C starts as NEW event

═══════════════════════════════════════════════════════════════════════════════
STAGE 2: PARTITIONING (Determines HOW to divide merged areas)
═══════════════════════════════════════════════════════════════════════════════

This stage ONLY applies when a MERGE is detected (many parents → one child),
which also requires the overlap criterion (Stage 1) to be met for each parent.
This stage does NOT apply to splits (one parent → many children)
Choice of nn_partitioning (True/False) affects ONLY this stage

───────────────────────────────────────────────────────────────────────────────
Example 3: MERGE Scenario - Partitioning Applies
───────────────────────────────────────────────────────────────────────────────

Time t=0:                Time t=1:

┌─────────┐              ┌──────────────┐
│ Object A│              │   Object C   │
│   100   │──┐           │     120      │
│  cells  │  │           │    cells     │
└─────────┘  │           └──────────────┘
             │  MERGE
┌─────────┐  │
│ Object B│──┘           STAGE 1 - Overlap Checking:
│   40    │              ────────────────────────────
│  cells  │              A→C: 80/min(100,120) = 80/100 = 0.8 >= 0.5 ✓ LINKED
└─────────┘              B→C: 35/min(40,120)  = 35/40  = 0.875 >= 0.5 ✓ LINKED

                         Both A and B linked to C → MERGE DETECTED!

                         STAGE 2 - Partitioning (triggered by merge):
                         ─────────────────────────────────────────────
                         Question: How to divide C's 120 cells between A and B lineages?

                         if nn_partitioning=True:  Use nearest parent CELL
                         if nn_partitioning=False: Use nearest parent CENTROID

                         (See nn_partitioning documentation for partition methods)

───────────────────────────────────────────────────────────────────────────────
Example 4: SPLIT Scenario - No Partitioning Occurs
───────────────────────────────────────────────────────────────────────────────

Time t=0:          Time t=1:

┌─────────┐        ┌─────────┐
│ Object A│───┬───→│ Object B│
│   100   │   │    │   60    │
│  cells  │   │    │  cells  │
└─────────┘   │    └─────────┘
              │
              │    ┌─────────┐
              └───→│ Object C│
                   │   50    │
                   │  cells  │
                   └─────────┘

STAGE 1 - Overlap Checking:
────────────────────────────
A→B: 55/min(100,60) = 55/60 = 0.917 >= 0.5 ✓ LINKED
A→C: 45/min(100,50) = 45/50 = 0.9   >= 0.5 ✓ LINKED

One parent linked to two children → SPLIT DETECTED!

STAGE 2 - Partitioning:
────────────────────────
N.B.: NO PARTITIONING occurs for splits!
All objects (A, B, C) grouped into same event with same ID
Object B & C continue to evolve disjoint, yet will ultimately be labelled with the same event ID

Key Principles:

Overlap checking happens FIRST for all object pairs between consecutive timesteps
Partitioning happens SECOND and ONLY when merges are detected (many→one)
nn_partitioning choice does NOT affect which objects are linked in Stage 1
Splits (one→many) do NOT trigger partitioning - all objects share the same event ID

When to Adjust:

Lower threshold (e.g., 0.3): More continuous tracking, may link distant objects
Higher threshold (e.g., 0.7): More conservative, stricter continuity requirement
Default (0.5): Balanced approach for most applications

Grid Configuration Parameters

dimensionsdict, default={‘time’: ‘time’, ‘x’: ‘lon’, ‘y’: ‘lat’}

Mapping of conceptual dimensions to actual dimension names. For unstructured grids, use: {'time': 'time', 'x': 'ncells'}

coordinatesdict, default={‘time’: ‘time’, ‘x’: ‘lon’, ‘y’: ‘lat’}

Mapping of conceptual coordinates to actual coordinate names. For unstructured grids, use: {'time': 'time', 'x': 'lon', 'y': 'lat'}

grid_resolutionfloat, optional

Grid resolution in degrees for structured grids (ignored for unstructured grids). When provided, automatically calculates physical cell areas using spherical geometry. Overrides any provided cell_areas parameter for structured grids.

# Automatic from grid resolution
tracker = marEx.tracker(
    extremes, mask,
    R_fill=8,
    grid_resolution=0.25  # Automatically calculates spherical areas for 0.25° grid
)

# This is equivalent to manually provide 2D data array of cell areas
tracker = marEx.tracker(
    extremes, mask,
    R_fill=8,
    cell_areas=my_calculated_areas  # Manual calculation required
)

The calculation accounts for latitude-dependent cell areas using spherical geometry, providing accurate physical areas (in km²) for object/event size calculations.

cell_areasxr.DataArray, optional

Physical cell areas for area calculations.

For structured grids: Optional. If not provided, defaults to 1.0 for each cell (i.e. cell counts). N.B.: If neither cell_areas nor grid_resolution is provided, areas are in units of cells/pixels. Note: Overridden by grid_resolution if provided.
For unstructured grids: Required for physical area calculations.

Output Data Structure

The tracking algorithm returns an xarray Dataset with the following structure:

Main Tracking Dataset

# tracked_events Dataset structure:
xarray.Dataset
Dimensions: (lat, lon, time, ID, component, sibling_ID)
Coordinates:
    lat         (lat)
    lon         (lon)
    time        (time)
    ID          (ID)
Data variables:
    ID_field              (time, lat, lon)        int32       # Event ID field
    global_ID             (time, ID)              int32       # Global ID mapping
    area                  (time, ID)              float32     # Event areas
    centroid              (component, time, ID)   float32     # Event centroids
    presence              (time, ID)              bool        # Event presence
    time_start            (ID)                    datetime64  # Start times
    time_end              (ID)                    datetime64  # End times
    merge_ledger          (time, ID, sibling_ID)  int32       # Merge information

Key Variables:

ID_field: Binary field with tracked event IDs
global_ID: Unique ID mapping for each event at each time
area: Spatial area of each event through time (in units of cell counts, or physical units if cell_areas/grid_resolution provided)
centroid: (x,y) centroid coordinates of each event
presence: Boolean indicating event presence at each time
time_start/time_end: Temporal bounds of each event
merge_ledger: Sibling IDs for merging events (-1 = no merge)

Merge Information Dataset

# merges_ds Dataset structure (when return_merges=True):
xarray.Dataset
Dimensions: (merge_ID, parent_idx, child_idx)
Data variables:
    parent_IDs      (merge_ID, parent_idx)  int32       # Parent event IDs
    child_IDs       (merge_ID, child_idx)   int32       # Child event IDs
    overlap_areas   (merge_ID, parent_idx)  int32       # Overlap areas
    merge_time      (merge_ID)              datetime64  # Merge timestamps
    n_parents       (merge_ID)              int8        # Number of parents
    n_children      (merge_ID)              int8        # Number of children

Key Variables:

parent_IDs/child_IDs: Original parent and child IDs in merge events
overlap_areas: Spatial overlap between parent and child objects
merge_time: Timestamp of each merge event
n_parents/n_children: Number of objects involved in each merge

Advanced Usage Examples

Analysing Event Properties

# Run tracking
tracked_events, merges_ds = event_tracker.run(return_merges=True)

# Analyse event durations
event_durations = (tracked_events.time_end - tracked_events.time_start).dt.days
long_events = tracked_events.ID.where(event_durations > 10, drop=True)

# Find events present at specific time
time_slice = tracked_events.sel(time='2020-07-15')
active_events = time_slice.ID.where(time_slice.presence, drop=True)

# Calculate maximum event areas
max_areas = tracked_events.area.max(dim='time')
large_events = tracked_events.ID.where(max_areas > threshold, drop=True)

Merge Event Analysis

# Analyse merge events
if merges_ds is not None:
    # Find complex merge events (multiple parents/children)
    complex_merges = merges_ds.where(
        (merges_ds.n_parents > 1) | (merges_ds.n_children > 1),
        drop=True
    )

    # Analyse merge timing
    merge_times = merges_ds.merge_time.values
    seasonal_merges = merges_ds.groupby(merges_ds.merge_time.dt.season).count()

    # Find parent-child relationships
    for merge_id in complex_merges.merge_ID:
        parents = merges_ds.parent_IDs.sel(merge_ID=merge_id).values
        children = merges_ds.child_IDs.sel(merge_ID=merge_id).values
        print(f"Merge {merge_id}: {parents} → {children}")

Visualisation Integration

# Plot tracked events
config = marEx.PlotConfig(
    title='Tracked Marine Heatwave Events',
    plot_IDs=True,          # Special handling for event IDs
    cmap='tab20'            # Discrete colormap
)

# Plot single timestep
fig, ax, im = tracked_events.ID_field.isel(time=0).plotX.single_plot(config)

# Create animation
movie_path = tracked_events.ID_field.plotX.animate(
    config,
    plot_dir='./animations',
    file_name='tracked_events'
)

Performance Optimisation

Chunking Strategy

# Optimal chunking for tracking
chunk_size = {'time': 25, 'lat': -1, 'lon': -1}  # Keep spatial dims together
extremes_ds = xr.open_dataset('extremes.nc', chunks=chunk_size)

# For very large datasets
event_tracker = marEx.tracker(
    extremes_ds.extreme_events,
    extremes_ds.mask,
    R_fill=8,
    area_filter_quartile=0.5,
    # Performance optimizations
    dask_chunks=chunk_size
)

Memory Management

# For memory-constrained environments
event_tracker = marEx.tracker(
    extremes_ds.extreme_events,
    extremes_ds.mask,
    R_fill=6,                    # Smaller fill radius
    area_filter_quartile=0.75,   # Remove more small events
    T_fill=1,                    # Reduce temporal gap filling
    allow_merging=False          # Disable merge tracking for speed
)

Algorithm Details

Tracking Workflow

The tracking algorithm follows these key steps:

Binary Object Identification * Connected component labeling on each time slice * Morphological operations (opening/closing) for noise reduction * Size filtering based on area quartiles
Temporal Matching * Overlap-based matching between consecutive time steps * Handles one-to-one, one-to-many, and many-to-one relationships * Gap filling for short-duration interruptions
Event Lifecycle Management * Track initialisation, continuation, merging, and splitting * Unique ID assignment and genealogy tracking * Statistical property calculation
Post-processing * Merge event documentation * Output dataset construction

Merge/Split Algorithm

The advanced merge/split tracking implements:

# Improved merge partitioning logic:
# - Partition child objects based on nearest parent cell
# - Maintain original event identities across merges
# - Track genealogy of splitting and merging events
# - Use overlap threshold to determine event continuity

Error Handling

Common Issues and Solutions

Memory Errors:

# Solution: Reduce chunk sizes and increase filtering
event_tracker = marEx.tracker(
    data_bin, mask,
    R_fill=6,                  # Smaller fill radius
    area_filter_quartile=0.8,  # Remove more small events
    T_fill=1,                  # Reduce temporal gap filling
    dask_chunks={'time': 15}   # Smaller time chunks
)

Performance Issues:

# Solution: Optimise parameters for your data
event_tracker = marEx.tracker(
    data_bin, mask,
    R_fill=4,                  # Reduce morphological operations
    area_filter_quartile=0.75, # Remove more small events
    allow_merging=False        # Disable merge tracking
)

Tracking Quality Issues:

# Solution: Tune overlap and temporal parameters
event_tracker = marEx.tracker(
    data_bin, mask,
    R_fill=8,
    area_filter_quartile=0.5,
    overlap_threshold=0.3,     # Lower overlap requirement
    T_fill=4,                  # Allow longer temporal gaps
    allow_merging=True         # Enable merge tracking
)

Integration with Preprocessing

Complete Workflow Example

import xarray as xr
import marEx

# Step 1: Preprocess data
sst = xr.open_dataset('sst_daily.nc', chunks={'time': 365}).sst

extremes_ds = marEx.preprocess_data(
    sst,
    method_anomaly='shifting_baseline',
    method_extreme='hobday_extreme',
    threshold_percentile=95,
    window_year_baseline=15,
    smooth_days_baseline=21,
    window_days_hobday=11,
    dask_chunks={'time': 25}
)

# Step 2: Track events
event_tracker = marEx.tracker(
    extremes_ds.extreme_events,
    extremes_ds.mask,
    R_fill=8,
    area_filter_quartile=0.5,
    T_fill=2,
    allow_merging=True,
    overlap_threshold=0.5,
    nn_partitioning=True
)

# Step 3: Run tracking
tracked_events, merges_ds = event_tracker.run(return_merges=True)

# Step 4: Save results
tracked_events.to_netcdf('tracked_events.nc')
if merges_ds is not None:
    merges_ds.to_netcdf('merge_events.nc')

Tracking Module (marEx.track)

Overview

Main Classes

Tracking Algorithm Comparison

Detailed Documentation

Tracker Class

Basic Usage Examples

Simple Event Tracking

Advanced Tracking Configuration

Unstructured Grid Tracking

Parameter Reference

Required Parameters

Core Tracking Parameters

Grid Configuration Parameters

Output Data Structure

Main Tracking Dataset

Merge Information Dataset

Advanced Usage Examples

Analysing Event Properties

Merge Event Analysis

Visualisation Integration

Performance Optimisation

Chunking Strategy

Memory Management

Algorithm Details

Tracking Workflow

Merge/Split Algorithm

Error Handling

Common Issues and Solutions

Integration with Preprocessing

Complete Workflow Example

See Also

Tracking Module (`marEx.track`)