models.numba_optimized
======================

Numba-Optimized Kernels
-----------------------

This module provides Numba-accelerated kernels for the predator-prey cellular automaton, including update kernels and spatial analysis functions.

Classes
-------
PPKernel
    Wrapper for predator-prey update kernels with pre-allocated buffers.

Cluster Analysis
----------------
```python
measure_cluster_sizes_fast  # Fast cluster size measurement (sizes only).
detect_clusters_fast        # Full cluster detection with labels.
get_cluster_stats_fast      # Comprehensive cluster statistics.
```
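All three routines share the same stack-based flood fill with periodic (toroidal) boundaries that the Numba kernels below implement. A minimal pure-Python sketch of that connectivity logic (the name `cluster_sizes` is illustrative, not part of this module):

```python
import numpy as np

def cluster_sizes(grid, species, moore=True):
    """Sizes of connected clusters of `species` with periodic boundaries."""
    rows, cols = grid.shape
    if moore:
        shifts = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                  (0, 1), (1, -1), (1, 0), (1, 1)]
    else:
        shifts = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    visited = np.zeros((rows, cols), dtype=bool)
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != species or visited[r, c]:
                continue
            # Stack-based flood fill, as in the Numba kernels
            stack = [(r, c)]
            visited[r, c] = True
            size = 0
            while stack:
                cr, cc = stack.pop()
                size += 1
                for dr, dc in shifts:
                    nr, nc = (cr + dr) % rows, (cc + dc) % cols
                    if not visited[nr, nc] and grid[nr, nc] == species:
                        visited[nr, nc] = True
                        stack.append((nr, nc))
            sizes.append(size)
    return sizes

# Two prey clusters; one wraps across the periodic boundary
g = np.zeros((4, 4), dtype=np.int32)
g[0, 0] = g[3, 0] = 1   # vertically adjacent through the torus edge
g[1, 2] = 1
print(sorted(cluster_sizes(g, species=1)))  # [1, 2]
```

The Moore/von Neumann switch matters for diagonal contacts: two diagonally touching cells form one cluster under Moore connectivity but two under von Neumann.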
Pair Correlation Functions
--------------------------
```python
compute_pcf_periodic_fast  # PCF for two position sets with periodic boundaries.
compute_all_pcfs_fast      # Compute prey-prey, pred-pred, and prey-pred PCFs.
```
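Both PCF helpers measure distances under the minimum image convention on a torus: a displacement larger than half the domain length is wrapped to the nearer edge. A self-contained sketch of that distance rule (`periodic_dist` is an illustrative name, not module API):

```python
import numpy as np

def periodic_dist(p, q, L):
    """Shortest distance between p and q on an L x L torus
    (minimum image convention, as used by the PCF kernels)."""
    d = np.abs(np.asarray(p, dtype=float) - np.asarray(q, dtype=float))
    d = np.where(d > L / 2, L - d, d)   # wrap across the nearer edge
    return float(np.sqrt(np.sum(d * d)))

# On a 10x10 torus, points near opposite corners are actually close
print(periodic_dist((0.5, 0.5), (9.5, 9.5), L=10.0))  # 1.4142135623730951 (= sqrt(2))
```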
Utilities
---------
```python
set_numba_seed         # Seed Numba's internal RNG.
warmup_numba_kernels   # Pre-compile kernels to avoid first-run latency.
```
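`set_numba_seed` exists because Numba keeps its own RNG state: calling `np.random.seed()` from ordinary Python does not affect JIT-compiled functions, so the seed must be set from inside a JIT context. The module also degrades gracefully when Numba is missing, via a no-op `njit` fallback. A self-contained sketch of both ideas (the name `seeded_draw` is illustrative):

```python
import numpy as np

# Mirror of this module's import guard: without Numba, @njit(cache=True)
# becomes a no-op decorator and the same code runs in pure Python.
try:
    from numba import njit
except ImportError:
    def njit(*args, **kwargs):
        def decorator(func):
            return func
        return decorator

@njit(cache=True)
def seeded_draw(seed):
    # Inside a JIT context this seeds Numba's own RNG state;
    # in the pure-Python fallback it seeds NumPy's global RNG.
    np.random.seed(seed)
    return np.random.random()

print(seeded_draw(42) == seeded_draw(42))  # True: reproducible either way
```

Note that the first call to a JIT-compiled function pays compilation latency, which is what `warmup_numba_kernels` is meant to absorb up front.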
Example
-------
```python
from models.numba_optimized import (
    PPKernel,
    get_cluster_stats_fast,
    compute_all_pcfs_fast,
)

# Cluster analysis
stats = get_cluster_stats_fast(grid, species=1)
print(f"Largest cluster: {stats['largest']}")

# PCF computation
pcfs = compute_all_pcfs_fast(grid, max_distance=20.0)
prey_prey_dist, prey_prey_gr, _ = pcfs['prey_prey']
```
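The update kernels behind `PPKernel.update` follow a gather → shuffle → visit pattern: collect all occupied cells, shuffle them to remove directional bias, then apply stochastic rules to each in random order. A stripped-down sketch with a toy death rule only (`async_step` is an illustrative stand-in; the real kernels add reproduction, predation, and evolution, and use a pre-allocated buffer plus an in-place Fisher-Yates shuffle):

```python
import numpy as np

rng = np.random.default_rng(0)

def async_step(grid, p_death=0.1):
    """One asynchronous pass over the grid (toy rule: prey die w.p. p_death)."""
    occupied = np.argwhere(grid != 0)       # gather occupied cells
    perm = rng.permutation(len(occupied))   # stand-in for the Fisher-Yates pass
    for r, c in occupied[perm]:
        if grid[r, c] == 1 and rng.random() < p_death:
            grid[r, c] = 0                  # update in place, visible to later cells
    return grid

g = np.ones((8, 8), dtype=np.int32)
async_step(g, p_death=1.0)
print(int(g.sum()))  # 0: every prey cell died this pass
```

Because updates are applied in place, a cell processed late in the pass already sees the deaths and births of cells processed earlier; the shuffle ensures that ordering carries no spatial bias.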
#!/usr/bin/env python3
"""
Numba-Optimized Kernels
=======================

This module provides Numba-accelerated kernels for the predator-prey
cellular automaton, including update kernels and spatial analysis functions.

Classes
-------
PPKernel
    Wrapper for predator-prey update kernels with pre-allocated buffers.

Cluster Analysis
----------------
```python
measure_cluster_sizes_fast  # Fast cluster size measurement (sizes only).
detect_clusters_fast        # Full cluster detection with labels.
get_cluster_stats_fast      # Comprehensive cluster statistics.
```

Pair Correlation Functions
--------------------------
```python
compute_pcf_periodic_fast  # PCF for two position sets with periodic boundaries.
compute_all_pcfs_fast      # Compute prey-prey, pred-pred, and prey-pred PCFs.
```

Utilities
---------
```python
set_numba_seed         # Seed Numba's internal RNG.
warmup_numba_kernels   # Pre-compile kernels to avoid first-run latency.
```

Example
-------
```python
from models.numba_optimized import (
    PPKernel,
    get_cluster_stats_fast,
    compute_all_pcfs_fast,
)

# Cluster analysis
stats = get_cluster_stats_fast(grid, species=1)
print(f"Largest cluster: {stats['largest']}")

# PCF computation
pcfs = compute_all_pcfs_fast(grid, max_distance=20.0)
prey_prey_dist, prey_prey_gr, _ = pcfs['prey_prey']
```
"""

import numpy as np
from typing import Tuple, Dict, Optional

try:
    from numba import njit, prange

    NUMBA_AVAILABLE = True
except ImportError:
    NUMBA_AVAILABLE = False

    def njit(*args, **kwargs):
        def decorator(func):
            return func

        return decorator

    def prange(*args):
        return range(*args)


# ============================================================================
# RNG SEEDING
# ============================================================================


@njit(cache=True)
def set_numba_seed(seed: int) -> None:
    """
    Seed Numba's internal random number generator from within a JIT context.

    This function ensures that Numba's independent random number generator
    is synchronized with the provided seed, enabling reproducibility for
    JIT-compiled functions that use NumPy's random operations.

    Parameters
    ----------
    seed : int
        The integer value used to initialize the random number generator.

    Returns
    -------
    None

    Notes
    -----
    Because Numba maintains its own internal state for random number
    generation, calling `np.random.seed()` in standard Python code will not
    affect JIT-compiled functions. This helper must be called to bridge
    that gap.
    """
    np.random.seed(seed)


# ============================================================================
# PREDATOR-PREY KERNELS
# ============================================================================


@njit(cache=True)
def _pp_async_kernel_random(
    grid: np.ndarray,
    prey_death_arr: np.ndarray,
    p_birth_val: float,
    p_death_val: float,
    pred_birth_val: float,
    pred_death_val: float,
    dr_arr: np.ndarray,
    dc_arr: np.ndarray,
    evolve_sd: float,
    evolve_min: float,
    evolve_max: float,
    evolution_stopped: bool,
    occupied_buffer: np.ndarray,
) -> np.ndarray:
    """
    Asynchronous predator-prey update kernel with random neighbor selection.

    This Numba-accelerated kernel performs an asynchronous update of the
    simulation grid. It identifies all occupied cells, shuffles them to
    ensure unbiased processing, and applies stochastic rules for prey
    mortality, prey reproduction (with optional parameter evolution),
    predator mortality, and predation.

    Parameters
    ----------
    grid : np.ndarray
        2D integer array representing the simulation grid (0: Empty, 1: Prey, 2: Predator).
    prey_death_arr : np.ndarray
        2D float array storing the individual prey death rates for evolution tracking.
    p_birth_val : float
        Base probability of prey reproduction into an adjacent empty cell.
    p_death_val : float
        Base probability of prey death (though individual rates in `prey_death_arr` are used).
    pred_birth_val : float
        Probability of a predator reproducing after consuming prey.
    pred_death_val : float
        Probability of a predator dying.
    dr_arr : np.ndarray
        Array of row offsets defining the neighborhood.
    dc_arr : np.ndarray
        Array of column offsets defining the neighborhood.
    evolve_sd : float
        Standard deviation of the mutation applied to the prey death rate during reproduction.
    evolve_min : float
        Lower bound for the evolved prey death rate.
    evolve_max : float
        Upper bound for the evolved prey death rate.
    evolution_stopped : bool
        If True, offspring inherit the parent's death rate without mutation.
    occupied_buffer : np.ndarray
        Pre-allocated 2D array used to store and shuffle coordinates of occupied cells.

    Returns
    -------
    grid : np.ndarray
        The updated simulation grid.

    Notes
    -----
    The kernel uses periodic boundary conditions. The Fisher-Yates shuffle on
    `occupied_buffer` ensures that the asynchronous updates do not introduce
    directional bias.
    """
    rows, cols = grid.shape
    n_shifts = len(dr_arr)

    # Collect occupied cells
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != 0:
                occupied_buffer[count, 0] = r
                occupied_buffer[count, 1] = c
                count += 1

    # Fisher-Yates shuffle
    for i in range(count - 1, 0, -1):
        j = np.random.randint(0, i + 1)
        occupied_buffer[i, 0], occupied_buffer[j, 0] = (
            occupied_buffer[j, 0],
            occupied_buffer[i, 0],
        )
        occupied_buffer[i, 1], occupied_buffer[j, 1] = (
            occupied_buffer[j, 1],
            occupied_buffer[i, 1],
        )

    # Process each occupied cell
    for i in range(count):
        r = occupied_buffer[i, 0]
        c = occupied_buffer[i, 1]

        state = grid[r, c]
        if state == 0:
            continue

        # Random neighbor selection
        nbi = np.random.randint(0, n_shifts)
        nr = (r + dr_arr[nbi]) % rows
        nc = (c + dc_arr[nbi]) % cols

        if state == 1:  # PREY
            if np.random.random() < prey_death_arr[r, c]:
                grid[r, c] = 0
                prey_death_arr[r, c] = np.nan
            elif grid[nr, nc] == 0:
                if np.random.random() < p_birth_val:
                    grid[nr, nc] = 1
                    parent_val = prey_death_arr[r, c]
                    if not evolution_stopped:
                        child_val = parent_val + np.random.normal(0, evolve_sd)
                        if child_val < evolve_min:
                            child_val = evolve_min
                        if child_val > evolve_max:
                            child_val = evolve_max
                        prey_death_arr[nr, nc] = child_val
                    else:
                        prey_death_arr[nr, nc] = parent_val

        elif state == 2:  # PREDATOR
            if np.random.random() < pred_death_val:
                grid[r, c] = 0
            elif grid[nr, nc] == 1:
                if np.random.random() < pred_birth_val:
                    grid[nr, nc] = 2
                    prey_death_arr[nr, nc] = np.nan

    return grid


@njit(cache=True)
def _pp_async_kernel_directed(
    grid: np.ndarray,
    prey_death_arr: np.ndarray,
    p_birth_val: float,
    p_death_val: float,
    pred_birth_val: float,
    pred_death_val: float,
    dr_arr: np.ndarray,
    dc_arr: np.ndarray,
    evolve_sd: float,
    evolve_min: float,
    evolve_max: float,
    evolution_stopped: bool,
    occupied_buffer: np.ndarray,
) -> np.ndarray:
    """
    Asynchronous predator-prey update kernel with directed behavior.

    This kernel implements "intelligent" species behavior: prey actively search
    for empty spaces to reproduce, and predators actively search for nearby
    prey to hunt. A two-pass approach is used to stochastically select a
    valid target from the neighborhood without heap allocation.

    Parameters
    ----------
    grid : np.ndarray
        2D integer array representing the simulation grid (0: Empty, 1: Prey, 2: Predator).
    prey_death_arr : np.ndarray
        2D float array storing individual prey mortality rates for evolution.
    p_birth_val : float
        Probability of a prey reproduction attempt.
    p_death_val : float
        Base probability of prey mortality.
    pred_birth_val : float
        Probability of a predator reproduction attempt (hunting success).
    pred_death_val : float
        Probability of predator mortality.
    dr_arr : np.ndarray
        Row offsets defining the spatial neighborhood (e.g., Moore or von Neumann).
    dc_arr : np.ndarray
        Column offsets defining the spatial neighborhood.
    evolve_sd : float
        Standard deviation for mutations in prey death rates.
    evolve_min : float
        Minimum allowable value for evolved prey death rates.
    evolve_max : float
        Maximum allowable value for evolved prey death rates.
    evolution_stopped : bool
        If True, prevents mutation during prey reproduction.
    occupied_buffer : np.ndarray
        Pre-allocated array for storing and shuffling active cell coordinates.

    Returns
    -------
    grid : np.ndarray
        The updated simulation grid.

    Notes
    -----
    The directed behavior significantly changes the system dynamics compared to
    random neighbor selection, often leading to different critical thresholds
    and spatial patterning. Periodic boundary conditions are applied.
    """
    rows, cols = grid.shape
    n_shifts = len(dr_arr)

    # Collect occupied cells
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != 0:
                occupied_buffer[count, 0] = r
                occupied_buffer[count, 1] = c
                count += 1

    # Fisher-Yates shuffle
    for i in range(count - 1, 0, -1):
        j = np.random.randint(0, i + 1)
        occupied_buffer[i, 0], occupied_buffer[j, 0] = (
            occupied_buffer[j, 0],
            occupied_buffer[i, 0],
        )
        occupied_buffer[i, 1], occupied_buffer[j, 1] = (
            occupied_buffer[j, 1],
            occupied_buffer[i, 1],
        )

    # Process each occupied cell
    for i in range(count):
        r = occupied_buffer[i, 0]
        c = occupied_buffer[i, 1]

        state = grid[r, c]
        if state == 0:
            continue

        if state == 1:  # PREY - directed reproduction into empty cells
            # Check for death first
            if np.random.random() < prey_death_arr[r, c]:
                grid[r, c] = 0
                prey_death_arr[r, c] = np.nan
                continue

            # Attempt reproduction with directed selection
            if np.random.random() < p_birth_val:
                # Pass 1: Count empty neighbors
                empty_count = 0
                for k in range(n_shifts):
                    check_r = (r + dr_arr[k]) % rows
                    check_c = (c + dc_arr[k]) % cols
                    if grid[check_r, check_c] == 0:
                        empty_count += 1

                # Pass 2: Select random empty neighbor
                if empty_count > 0:
                    target_idx = np.random.randint(0, empty_count)
                    found = 0
                    nr, nc = r, c  # Initialize (will be overwritten)
                    for k in range(n_shifts):
                        check_r = (r + dr_arr[k]) % rows
                        check_c = (c + dc_arr[k]) % cols
                        if grid[check_r, check_c] == 0:
                            if found == target_idx:
                                nr, nc = check_r, check_c
                                break
                            found += 1

                    # Reproduce into selected empty cell
                    grid[nr, nc] = 1
                    parent_val = prey_death_arr[r, c]
                    if not evolution_stopped:
                        child_val = parent_val + np.random.normal(0, evolve_sd)
                        if child_val < evolve_min:
                            child_val = evolve_min
                        if child_val > evolve_max:
                            child_val = evolve_max
                        prey_death_arr[nr, nc] = child_val
                    else:
                        prey_death_arr[nr, nc] = parent_val

        elif state == 2:  # PREDATOR - directed hunting
            # Check for death first
            if np.random.random() < pred_death_val:
                grid[r, c] = 0
                continue

            # Attempt hunting with directed selection
            if np.random.random() < pred_birth_val:
                # Pass 1: Count prey neighbors
                prey_count = 0
                for k in range(n_shifts):
                    check_r = (r + dr_arr[k]) % rows
                    check_c = (c + dc_arr[k]) % cols
                    if grid[check_r, check_c] == 1:
                        prey_count += 1

                # Pass 2: Select random prey neighbor
                if prey_count > 0:
                    target_idx = np.random.randint(0, prey_count)
                    found = 0
                    nr, nc = r, c  # Initialize (will be overwritten)
                    for k in range(n_shifts):
                        check_r = (r + dr_arr[k]) % rows
                        check_c = (c + dc_arr[k]) % cols
                        if grid[check_r, check_c] == 1:
                            if found == target_idx:
                                nr, nc = check_r, check_c
                                break
                            found += 1

                    # Hunt: prey cell becomes predator
                    grid[nr, nc] = 2
                    prey_death_arr[nr, nc] = np.nan

    return grid


class PPKernel:
    """
    Wrapper for predator-prey kernel with pre-allocated buffers.

    This class manages the spatial configuration and memory buffers required
    for the Numba-accelerated update kernels. By pre-allocating the
    `occupied_buffer`, it avoids expensive memory allocations during the
    simulation loop.

    Parameters
    ----------
    rows : int
        Number of rows in the simulation grid.
    cols : int
        Number of columns in the simulation grid.
    neighborhood : {'moore', 'von_neumann'}, optional
        The neighborhood type determining adjacent cells. 'moore' includes
        diagonals (8 neighbors), 'von_neumann' does not (4 neighbors).
        Default is 'moore'.
    directed_hunting : bool, optional
        If True, uses the directed behavior kernel where species search for
        targets. If False, uses random neighbor selection. Default is False.

    Attributes
    ----------
    rows : int
        Grid row count.
    cols : int
        Grid column count.
    directed_hunting : bool
        Toggle for intelligent behavior logic.
    """

    def __init__(
        self,
        rows: int,
        cols: int,
        neighborhood: str = "moore",
        directed_hunting: bool = False,
    ):
        self.rows = rows
        self.cols = cols
        self.directed_hunting = directed_hunting
        self._occupied_buffer = np.empty((rows * cols, 2), dtype=np.int32)

        if neighborhood == "moore":
            self._dr = np.array([-1, -1, -1, 0, 0, 1, 1, 1], dtype=np.int32)
            self._dc = np.array([-1, 0, 1, -1, 1, -1, 0, 1], dtype=np.int32)
        else:  # von Neumann
            self._dr = np.array([-1, 1, 0, 0], dtype=np.int32)
            self._dc = np.array([0, 0, -1, 1], dtype=np.int32)

    def update(
        self,
        grid: np.ndarray,
        prey_death_arr: np.ndarray,
        prey_birth: float,
        prey_death: float,
        pred_birth: float,
        pred_death: float,
        evolve_sd: float = 0.1,
        evolve_min: float = 0.001,
        evolve_max: float = 0.1,
        evolution_stopped: bool = True,
    ) -> np.ndarray:
        """
        Execute a single asynchronous update step using the configured kernel.

        Parameters
        ----------
        grid : np.ndarray
            The current 2D simulation grid.
        prey_death_arr : np.ndarray
            2D array of individual prey mortality rates.
        prey_birth : float
            Prey reproduction probability.
        prey_death : float
            Base prey mortality probability.
        pred_birth : float
            Predator reproduction (hunting success) probability.
        pred_death : float
            Predator mortality probability.
        evolve_sd : float, optional
            Mutation standard deviation (default 0.1).
        evolve_min : float, optional
            Minimum evolved death rate (default 0.001).
        evolve_max : float, optional
            Maximum evolved death rate (default 0.1).
        evolution_stopped : bool, optional
            Whether to disable mutation during this step (default True).

        Returns
        -------
        np.ndarray
            The updated grid after one full asynchronous pass.
        """
        if self.directed_hunting:
            return _pp_async_kernel_directed(
                grid,
                prey_death_arr,
                prey_birth,
                prey_death,
                pred_birth,
                pred_death,
                self._dr,
                self._dc,
                evolve_sd,
                evolve_min,
                evolve_max,
                evolution_stopped,
                self._occupied_buffer,
            )
        else:
            return _pp_async_kernel_random(
                grid,
                prey_death_arr,
                prey_birth,
                prey_death,
                pred_birth,
                pred_death,
                self._dr,
                self._dc,
                evolve_sd,
                evolve_min,
                evolve_max,
                evolution_stopped,
                self._occupied_buffer,
            )


# ============================================================================
# CLUSTER DETECTION (ENHANCED)
# ============================================================================


@njit(cache=True)
def _flood_fill(
    grid: np.ndarray,
    visited: np.ndarray,
    start_r: int,
    start_c: int,
    target: int,
    rows: int,
    cols: int,
    moore: bool,
) -> int:
    """
    Perform a stack-based flood fill to measure the size of a connected cluster.

    This Numba-accelerated function identifies all contiguous cells of a
    specific target value starting from a given coordinate. It supports
    both Moore and von Neumann neighborhoods and implements periodic
    boundary conditions (toroidal topology).

    Parameters
    ----------
    grid : np.ndarray
        2D integer array representing the simulation environment.
    visited : np.ndarray
        2D boolean array tracked across calls to avoid re-processing cells.
    start_r : int
        Starting row index for the flood fill.
    start_c : int
        Starting column index for the flood fill.
    target : int
        The cell value (e.g., 1 for Prey, 2 for Predator) to include in the cluster.
    rows : int
        Total number of rows in the grid.
    cols : int
        Total number of columns in the grid.
    moore : bool
        If True, use a Moore neighborhood (8 neighbors). If False, use a
        von Neumann neighborhood (4 neighbors).

    Returns
    -------
    size : int
        The total number of connected cells belonging to the cluster.

    Notes
    -----
    The function uses a manual stack implementation to avoid recursion limit
    issues and is optimized for use within JIT-compiled loops.
    """
    max_stack = rows * cols
    stack_r = np.empty(max_stack, dtype=np.int32)
    stack_c = np.empty(max_stack, dtype=np.int32)
    stack_ptr = 0

    stack_r[stack_ptr] = start_r
    stack_c[stack_ptr] = start_c
    stack_ptr += 1
    visited[start_r, start_c] = True

    size = 0

    if moore:
        dr = np.array([-1, -1, -1, 0, 0, 1, 1, 1], dtype=np.int32)
        dc = np.array([-1, 0, 1, -1, 1, -1, 0, 1], dtype=np.int32)
        n_neighbors = 8
    else:
        dr = np.array([-1, 1, 0, 0], dtype=np.int32)
        dc = np.array([0, 0, -1, 1], dtype=np.int32)
        n_neighbors = 4

    while stack_ptr > 0:
        stack_ptr -= 1
        r = stack_r[stack_ptr]
        c = stack_c[stack_ptr]
        size += 1

        for k in range(n_neighbors):
            nr = (r + dr[k]) % rows
            nc = (c + dc[k]) % cols

            if not visited[nr, nc] and grid[nr, nc] == target:
                visited[nr, nc] = True
                stack_r[stack_ptr] = nr
                stack_c[stack_ptr] = nc
                stack_ptr += 1

    return size


@njit(cache=True)
def _measure_clusters(grid: np.ndarray, species: int, moore: bool = True) -> np.ndarray:
    """
    Identify and measure the sizes of all connected clusters for a specific species.

    This function scans the entire grid and initiates a flood-fill algorithm
    whenever an unvisited cell of the target species is encountered. It
    returns an array containing the size (cell count) of each identified cluster.

    Parameters
    ----------
    grid : np.ndarray
        2D integer array representing the simulation environment.
    species : int
        The target species identifier (e.g., 1 for Prey, 2 for Predator).
    moore : bool, optional
        Determines the connectivity logic. If True, uses the Moore neighborhood
        (8 neighbors); if False, uses the von Neumann neighborhood (4 neighbors).
        Default is True.

    Returns
    -------
    cluster_sizes : np.ndarray
        A 1D array of integers where each element represents the size of
        one connected cluster.

    Notes
    -----
    This function is Numba-optimized and utilizes an internal `visited` mask
    to ensure each cell is processed only once, maintaining $O(N)$
    complexity relative to the number of cells.
    """
    rows, cols = grid.shape
    visited = np.zeros((rows, cols), dtype=np.bool_)

    max_clusters = rows * cols
    sizes = np.empty(max_clusters, dtype=np.int32)
    n_clusters = 0

    for r in range(rows):
        for c in range(cols):
            if grid[r, c] == species and not visited[r, c]:
                size = _flood_fill(grid, visited, r, c, species, rows, cols, moore)
                sizes[n_clusters] = size
                n_clusters += 1

    return sizes[:n_clusters]


@njit(cache=True)
def _detect_clusters_numba(
    grid: np.ndarray,
    species: int,
    moore: bool,
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Full cluster detection returning labels and sizes (Numba-accelerated).
    Returns:
        labels: 2D int32 array where each cell contains its cluster ID (0 = non-target)
        sizes: 1D int32 array of cluster sizes (index i = size of cluster i+1)
    """
    rows, cols = grid.shape
    labels = np.zeros((rows, cols), dtype=np.int32)

    if moore:
        dr = np.array([-1, -1, -1, 0, 0, 1, 1, 1], dtype=np.int32)
        dc = np.array([-1, 0, 1, -1, 1, -1, 0, 1], dtype=np.int32)
        n_neighbors = 8
    else:
        dr = np.array([-1, 1, 0, 0], dtype=np.int32)
        dc = np.array([0, 0, -1, 1], dtype=np.int32)
        n_neighbors = 4

    max_clusters = rows * cols
    sizes = np.empty(max_clusters, dtype=np.int32)
    n_clusters = 0
    current_label = 1

    max_stack = rows * cols
    stack_r = np.empty(max_stack, dtype=np.int32)
    stack_c = np.empty(max_stack, dtype=np.int32)

    for start_r in range(rows):
        for start_c in range(cols):
            if grid[start_r, start_c] != species or labels[start_r, start_c] != 0:
                continue

            stack_ptr = 0
            stack_r[stack_ptr] = start_r
            stack_c[stack_ptr] = start_c
            stack_ptr += 1
            labels[start_r, start_c] = current_label
            size = 0

            while stack_ptr > 0:
                stack_ptr -= 1
                r = stack_r[stack_ptr]
                c = stack_c[stack_ptr]
                size += 1

                for k in range(n_neighbors):
                    nr = (r + dr[k]) % rows
                    nc = (c + dc[k]) % cols

                    if grid[nr, nc] == species and labels[nr, nc] == 0:
                        labels[nr, nc] = current_label
                        stack_r[stack_ptr] = nr
                        stack_c[stack_ptr] = nc
                        stack_ptr += 1

            sizes[n_clusters] = size
            n_clusters += 1
            current_label += 1

    return labels, sizes[:n_clusters]


# ============================================================================
# PUBLIC API - CLUSTER DETECTION
# ============================================================================


def measure_cluster_sizes_fast(
    grid: np.ndarray,
    species: int,
    neighborhood: str = "moore",
) -> np.ndarray:
    """
    Measure cluster sizes for a specific species using Numba-accelerated flood fill.

    This function provides a high-performance interface for calculating cluster
    size statistics without the overhead of generating a full label map. It is
    optimized for large-scale simulation analysis where only distribution
    metrics (e.g., mean size, max size) are required.

    Parameters
    ----------
    grid : np.ndarray
        A 2D array representing the simulation environment.
    species : int
        The target species identifier (e.g., 1 for Prey, 2 for Predator).
    neighborhood : {'moore', 'neumann'}, optional
        The connectivity rule. 'moore' uses 8-way connectivity (including diagonals);
        'neumann' uses 4-way connectivity. Default is 'moore'.

    Returns
    -------
    cluster_sizes : np.ndarray
        A 1D array of integers, where each element is the cell count of an
        identified cluster.

    Notes
    -----
    The input grid is cast to `int32` to ensure compatibility with the
    underlying JIT-compiled `_measure_clusters` kernel.

    Examples
    --------
    >>> sizes = measure_cluster_sizes_fast(grid, species=1, neighborhood='moore')
    >>> if sizes.size > 0:
    ...     print(f"Largest cluster: {sizes.max()}")
    """
    grid_int = np.asarray(grid, dtype=np.int32)
    moore = neighborhood == "moore"
    return _measure_clusters(grid_int, np.int32(species), moore)


def detect_clusters_fast(
    grid: np.ndarray,
    species: int,
    neighborhood: str = "moore",
) -> Tuple[np.ndarray, Dict[int, int]]:
    """
    Perform full cluster detection with labels using Numba acceleration.

    This function returns a label array for spatial analysis and a dictionary
    of cluster sizes. It is significantly faster than standard Python or
    SciPy equivalents for large simulation grids.

    Parameters
    ----------
    grid : np.ndarray
        A 2D array representing the simulation environment.
    species : int
        The target species identifier (e.g., 1 for Prey, 2 for Predator).
    neighborhood : {'moore', 'neumann'}, optional
        The connectivity rule. 'moore' uses 8-way connectivity; 'neumann'
        uses 4-way connectivity. Default is 'moore'.

    Returns
    -------
    labels : np.ndarray
        A 2D int32 array where each cell contains its unique cluster ID.
        Cells not belonging to the target species are 0.
    sizes : dict
        A dictionary mapping cluster IDs to their respective cell counts.

    Notes
    -----
    The underlying Numba kernel uses a stack-based flood fill to avoid
    recursion limits and handles periodic boundary conditions.

    Examples
    --------
    >>> labels, sizes = detect_clusters_fast(grid, species=1)
    >>> if sizes:
    ...     largest_id = max(sizes, key=sizes.get)
    ...     print(f"Cluster {largest_id} size: {sizes[largest_id]}")
    """
    grid_int = np.asarray(grid, dtype=np.int32)
    moore = neighborhood == "moore"
    labels, sizes_arr = _detect_clusters_numba(grid_int, np.int32(species), moore)
    sizes_dict = {i + 1: int(sizes_arr[i]) for i in range(len(sizes_arr))}
    return labels, sizes_dict


def get_cluster_stats_fast(
    grid: np.ndarray,
    species: int,
    neighborhood: str = "moore",
) -> Dict:
    """
    Compute comprehensive cluster statistics for a species using Numba acceleration.

    This function integrates cluster detection and labeling to provide a
    full suite of spatial metrics. It calculates the cluster size distribution
    and the largest cluster fraction, which often serves as an order
    parameter in percolation theory and Phase 1-3 analyses.

    Parameters
    ----------
    grid : np.ndarray
        A 2D array representing the simulation environment.
    species : int
        The target species identifier (e.g., 1 for Prey, 2 for Predator).
    neighborhood : {'moore', 'neumann'}, optional
        The connectivity rule. 'moore' uses 8-way connectivity; 'neumann'
        uses 4-way connectivity. Default is 'moore'.

    Returns
    -------
    stats : dict
        A dictionary containing:
        - 'n_clusters': Total count of isolated clusters.
        - 'sizes': Sorted array (descending) of all cluster sizes.
        - 'largest': Size of the single largest cluster.
        - 'largest_fraction': Size of the largest cluster divided by
          the total population of the species.
        - 'mean_size': Average size of all clusters.
        - 'size_distribution': Frequency mapping of {size: count}.
        - 'labels': 2D array of unique cluster IDs.
        - 'size_dict': Mapping of {label_id: size}.

    Examples
    --------
    >>> stats = get_cluster_stats_fast(grid, species=1)
    >>> print(f"Found {stats['n_clusters']} prey clusters.")
    >>> print(f"Order parameter: {stats['largest_fraction']:.3f}")
    """
    labels, size_dict = detect_clusters_fast(grid, species, neighborhood)

    if len(size_dict) == 0:
        return {
            "n_clusters": 0,
            "sizes": np.array([], dtype=np.int32),
            "largest": 0,
            "largest_fraction": 0.0,
            "mean_size": 0.0,
            "size_distribution": {},
            "labels": labels,
            "size_dict": size_dict,
        }

    sizes = np.array(list(size_dict.values()), dtype=np.int32)
    sizes_sorted = np.sort(sizes)[::-1]
    total_pop = int(np.sum(sizes))
    largest = int(sizes_sorted[0])

    size_dist = {}
    for s in sizes:
        s_int = int(s)
        size_dist[s_int] = size_dist.get(s_int, 0) + 1

    return {
        "n_clusters": len(size_dict),
        "sizes": sizes_sorted,
        "largest": largest,
        "largest_fraction": float(largest) / total_pop if total_pop > 0 else 0.0,
        "mean_size": float(np.mean(sizes)),
        "size_distribution": size_dist,
        "labels": labels,
        "size_dict": size_dict,
    }


# ============================================================================
# PCF COMPUTATION (Cell-list accelerated)
# ============================================================================


@njit(cache=True)
def _build_cell_list(
    positions: np.ndarray,
    n_cells: int,
    L_row: float,
    L_col: float,
) -> Tuple[np.ndarray, np.ndarray, np.ndarray, float, float]:
    """
    Build a cell list for spatial hashing to accelerate neighbor lookups.

    This Numba-optimized function partitions a set of coordinates into a
    grid of cells. It uses a three-pass approach to calculate cell occupancy,
    compute starting offsets for each cell in a flat index array, and finally
    populate that array with position indices.

    Parameters
    ----------
    positions : np.ndarray
        An (N, 2) float array of coordinates within the simulation domain.
    n_cells : int
        The number of cells along one dimension of the square grid.
    L_row : float
        The total height (row extent) of the simulation domain.
    L_col : float
        The total width (column extent) of the simulation domain.

    Returns
    -------
    indices : np.ndarray
        A 1D array of original position indices, reordered so that indices
        belonging to the same cell are contiguous.
    offsets : np.ndarray
        A 2D array where `offsets[r, c]` is the starting index in the
        `indices` array for cell (r, c).
    cell_counts : np.ndarray
        A 2D array where `cell_counts[r, c]` is the number of points
        contained in cell (r, c).
    cell_size_r : float
        The calculated height of an individual cell.
    cell_size_c : float
        The calculated width of an individual cell.

    Notes
    -----
    This implementation assumes periodic boundary conditions via the
    modulo operator during coordinate-to-cell mapping. It is designed to
    eliminate heap allocations within the main simulation loop by using
    Numba's efficient array handling.
    """
    n_pos = len(positions)
    cell_size_r = L_row / n_cells
    cell_size_c = L_col / n_cells

    cell_counts = np.zeros((n_cells, n_cells), dtype=np.int32)
    for i in range(n_pos):
        cr = int(positions[i, 0] / cell_size_r) % n_cells
        cc = int(positions[i, 1] / cell_size_c) % n_cells
        cell_counts[cr, cc] += 1

    offsets = np.zeros((n_cells, n_cells), dtype=np.int32)
    running = 0
    for cr in range(n_cells):
        for cc in range(n_cells):
            offsets[cr, cc] = running
            running += cell_counts[cr, cc]

    indices = np.empty(n_pos, dtype=np.int32)
    fill_counts = np.zeros((n_cells, n_cells), dtype=np.int32)
    for i in range(n_pos):
        cr = int(positions[i, 0] / cell_size_r) % n_cells
        cc = int(positions[i, 1] / cell_size_c) % n_cells
        idx = offsets[cr, cc] + fill_counts[cr, cc]
        indices[idx] = i
        fill_counts[cr, cc] += 1

    return indices, offsets, cell_counts, cell_size_r, cell_size_c


@njit(cache=True)
def _periodic_dist_sq(
    r1: float,
    c1: float,
    r2: float,
    c2: float,
    L_row: float,
    L_col: float,
) -> float:
    """
    Calculate the squared Euclidean distance between two points with periodic boundary conditions.

    This Numba-optimized function accounts for toroidal topology by finding the
    shortest path between coordinates across the grid edges. Using the squared
    distance avoids the computational expense of a square root operation,
    making it ideal for high-frequency spatial queries.

    Parameters
    ----------
    r1 : float
        Row coordinate of the first point.
    c1 : float
        Column coordinate of the first point.
    r2 : float
        Row coordinate of the second point.
    c2 : float
        Column coordinate of the second point.
    L_row : float
        Total height (row extent) of the periodic domain.
    L_col : float
        Total width (column extent) of the periodic domain.

    Returns
    -------
    dist_sq : float
        The squared shortest distance between the two points.

    Notes
    -----
    The function applies the minimum image convention, ensuring that the
    distance never exceeds half the domain length in any dimension.
    """
    dr = abs(r1 - r2)
    dc = abs(c1 - c2)
    if dr > L_row * 0.5:
        dr = L_row - dr
    if dc > L_col * 0.5:
        dc = L_col - dc
    return dr * dr + dc * dc


@njit(parallel=True, cache=True)
def _pcf_cell_list(
    pos_i: np.ndarray,
    pos_j: np.ndarray,
    indices_j: np.ndarray,
    offsets_j: np.ndarray,
    counts_j: np.ndarray,
    cell_size_r: float,
    cell_size_c: float,
    L_row: float,
    L_col: float,
    max_distance: float,
    n_bins: int,
    self_correlation: bool,
    n_cells: int,
) -> np.ndarray:
    """
    Compute a Pair Correlation Function (PCF) histogram using spatial cell lists.

    This Numba-accelerated parallel kernel calculates distances between two sets
    of points (pos_i and pos_j). It uses a cell list (spatial hashing) to
    restrict distance calculations to neighboring cells within the maximum
    specified distance, significantly improving performance from $O(N^2)$
    to $O(N)$.

    Parameters
    ----------
    pos_i : np.ndarray
        (N, 2) float array of coordinates for the primary species.
    pos_j : np.ndarray
        (M, 2) float array of coordinates for the secondary species.
    indices_j : np.ndarray
        Flattened indices of pos_j sorted by cell, produced by `_build_cell_list`.
    offsets_j : np.ndarray
        2D array of starting offsets for each cell in `indices_j`.
    counts_j : np.ndarray
        2D array of particle counts within each cell for species J.
    cell_size_r : float
        Height of a single spatial cell.
    cell_size_c : float
        Width of a single spatial cell.
    L_row : float
        Total height of the periodic domain.
    L_col : float
        Total width of the periodic domain.
    max_distance : float
        Maximum radial distance (r) to consider for the correlation.
    n_bins : int
        Number of bins in the distance histogram.
    self_correlation : bool
        If True, assumes species I and J are the same and avoids double-counting
        or self-interaction.
    n_cells : int
        Number of cells per dimension in the spatial hash grid.

    Returns
    -------
    hist : np.ndarray
        A 1D array of length `n_bins` containing the counts of pairs found
        at each radial distance.

    Notes
    -----
    The kernel uses `prange` for parallel execution across points in `pos_i`.
    Local histograms are used per thread to prevent race conditions during
    reduction. Periodic boundary conditions are handled via `_periodic_dist_sq`.
    """
    n_i = len(pos_i)
    bin_width = max_distance / n_bins
    max_dist_sq = max_distance * max_distance
    cells_to_check = int(np.ceil(max_distance / min(cell_size_r, cell_size_c))) + 1

    # Clamp the scan window to the grid size so that, on coarse cell grids
    # (max_distance larger than a quarter of the domain), wrapped neighbor
    # cells are never visited twice, which would double-count pairs.
    n_scan = 2 * cells_to_check + 1
    if n_scan > n_cells:
        n_scan = n_cells

    hist = np.zeros(n_bins, dtype=np.int64)

    for i in prange(n_i):
        local_hist = np.zeros(n_bins, dtype=np.int64)
        r1, c1 = pos_i[i, 0], pos_i[i, 1]

        cell_r = int(r1 / cell_size_r) % n_cells
        cell_c = int(c1 / cell_size_c) % n_cells

        for sr in range(n_scan):
            for sc in range(n_scan):
                ncr = (cell_r - cells_to_check + sr) % n_cells
                ncc = (cell_c - cells_to_check + sc) % n_cells

                start = offsets_j[ncr, ncc]
                end = start + counts_j[ncr, ncc]

                for idx in range(start, end):
                    j = indices_j[idx]

                    if self_correlation and j <= i:
                        continue

                    r2, c2 = pos_j[j, 0], pos_j[j, 1]
                    d_sq = _periodic_dist_sq(r1, c1, r2, c2, L_row, L_col)

                    if 0 < d_sq < max_dist_sq:
                        d = np.sqrt(d_sq)
                        bin_idx = int(d / bin_width)
                        if bin_idx >= n_bins:
                            bin_idx = n_bins - 1
                        local_hist[bin_idx] += 1

1182 for b in range(n_bins): 1183 hist[b] += local_hist[b] 1184 1185 if self_correlation: 1186 for b in range(n_bins): 1187 hist[b] *= 2 1188 1189 return hist 1190 1191 1192def compute_pcf_periodic_fast( 1193 positions_i: np.ndarray, 1194 positions_j: np.ndarray, 1195 grid_shape: Tuple[int, int], 1196 max_distance: float, 1197 n_bins: int = 50, 1198 self_correlation: bool = False, 1199) -> Tuple[np.ndarray, np.ndarray, int]: 1200 """ 1201 Compute the Pair Correlation Function (PCF) using cell-list acceleration. 1202 1203 This high-level function coordinates the spatial hashing and histogram 1204 calculation to determine the $g(r)$ function. It normalizes the resulting 1205 histogram by the expected number of pairs in an ideal gas of the same 1206 density, accounting for the toroidal area of each radial bin. 1207 1208 Parameters 1209 ---------- 1210 positions_i : np.ndarray 1211 (N, 2) array of coordinates for species I. 1212 positions_j : np.ndarray 1213 (M, 2) array of coordinates for species J. 1214 grid_shape : tuple of int 1215 The (rows, cols) dimensions of the simulation grid. 1216 max_distance : float 1217 The maximum radius to calculate correlations for. 1218 n_bins : int, optional 1219 Number of bins for the radial distribution (default 50). 1220 self_correlation : bool, optional 1221 Set to True if computing the correlation of a species with itself 1222 to avoid self-counting (default False). 1223 1224 Returns 1225 ------- 1226 bin_centers : np.ndarray 1227 The central radial distance for each histogram bin. 1228 pcf : np.ndarray 1229 The normalized $g(r)$ values. A value of 1.0 indicates no spatial 1230 correlation; > 1.0 indicates clustering; < 1.0 indicates repulsion. 1231 total_pairs : int 1232 The total count of pairs found within the `max_distance`. 
1233 1234 Notes 1235 ----- 1236 The function dynamically determines the optimal number of cells for the 1237 spatial hash based on the `max_distance` and grid dimensions to maintain 1238 linear time complexity. 1239 """ 1240 rows, cols = grid_shape 1241 L_row, L_col = float(rows), float(cols) 1242 area = L_row * L_col 1243 1244 bin_width = max_distance / n_bins 1245 bin_centers = np.linspace(bin_width / 2, max_distance - bin_width / 2, n_bins) 1246 1247 if len(positions_i) == 0 or len(positions_j) == 0: 1248 return bin_centers, np.ones(n_bins), 0 1249 1250 n_cells = max(4, int(min(rows, cols) / max_distance)) 1251 1252 pos_i = np.ascontiguousarray(positions_i, dtype=np.float64) 1253 pos_j = np.ascontiguousarray(positions_j, dtype=np.float64) 1254 1255 indices_j, offsets_j, counts_j, cell_size_r, cell_size_c = _build_cell_list( 1256 pos_j, n_cells, L_row, L_col 1257 ) 1258 1259 hist = _pcf_cell_list( 1260 pos_i, 1261 pos_j, 1262 indices_j, 1263 offsets_j, 1264 counts_j, 1265 cell_size_r, 1266 cell_size_c, 1267 L_row, 1268 L_col, 1269 max_distance, 1270 n_bins, 1271 self_correlation, 1272 n_cells, 1273 ) 1274 1275 n_i, n_j = len(positions_i), len(positions_j) 1276 if self_correlation: 1277 density_product = n_i * (n_i - 1) / (area * area) 1278 else: 1279 density_product = n_i * n_j / (area * area) 1280 1281 expected = np.zeros(n_bins) 1282 for i in range(n_bins): 1283 r = bin_centers[i] 1284 annulus_area = 2 * np.pi * r * bin_width 1285 expected[i] = density_product * annulus_area * area 1286 1287 pcf = np.ones(n_bins) 1288 mask = expected > 1.0 1289 pcf[mask] = hist[mask] / expected[mask] 1290 1291 return bin_centers, pcf, int(np.sum(hist)) 1292 1293 1294def compute_all_pcfs_fast( 1295 grid: np.ndarray, 1296 max_distance: Optional[float] = None, 1297 n_bins: int = 50, 1298) -> Dict[str, Tuple[np.ndarray, np.ndarray, int]]: 1299 """ 1300 Compute all three species Pair Correlation Functions (PCFs) using cell-list acceleration. 
1301 1302 This function calculates the spatial auto-correlations (Prey-Prey, 1303 Predator-Predator) and the cross-correlation (Prey-Predator) for a given 1304 simulation grid. It identifies particle positions and leverages 1305 Numba-accelerated cell lists to handle the computations efficiently. 1306 1307 Parameters 1308 ---------- 1309 grid : np.ndarray 1310 2D integer array where 1 represents prey and 2 represents predators. 1311 max_distance : float, optional 1312 The maximum radial distance for the correlation. Defaults to 1/4 1313 of the minimum grid dimension if not provided. 1314 n_bins : int, optional 1315 Number of distance bins for the histogram. Default is 50. 1316 1317 Returns 1318 ------- 1319 results : dict 1320 A dictionary with keys 'prey_prey', 'pred_pred', and 'prey_pred'. 1321 Each value is a tuple containing: 1322 - bin_centers (np.ndarray): Radial distances. 1323 - pcf_values (np.ndarray): Normalized g(r) values. 1324 - pair_count (int): Total number of pairs found. 1325 1326 Notes 1327 ----- 1328 The PCF provides insight into the spatial organization of the system. 1329 g(r) > 1 at short distances indicates aggregation (clustering), 1330 while g(r) < 1 indicates exclusion or repulsion. 
1331 """ 1332 rows, cols = grid.shape 1333 if max_distance is None: 1334 max_distance = min(rows, cols) / 4.0 1335 1336 prey_pos = np.argwhere(grid == 1) 1337 pred_pos = np.argwhere(grid == 2) 1338 1339 results = {} 1340 1341 dist, pcf, n = compute_pcf_periodic_fast( 1342 prey_pos, 1343 prey_pos, 1344 (rows, cols), 1345 max_distance, 1346 n_bins, 1347 self_correlation=True, 1348 ) 1349 results["prey_prey"] = (dist, pcf, n) 1350 1351 dist, pcf, n = compute_pcf_periodic_fast( 1352 pred_pos, 1353 pred_pos, 1354 (rows, cols), 1355 max_distance, 1356 n_bins, 1357 self_correlation=True, 1358 ) 1359 results["pred_pred"] = (dist, pcf, n) 1360 1361 dist, pcf, n = compute_pcf_periodic_fast( 1362 prey_pos, 1363 pred_pos, 1364 (rows, cols), 1365 max_distance, 1366 n_bins, 1367 self_correlation=False, 1368 ) 1369 results["prey_pred"] = (dist, pcf, n) 1370 1371 return results 1372 1373 1374# ============================================================================ 1375# WARMUP & BENCHMARKS 1376# ============================================================================ 1377 1378 1379def warmup_numba_kernels(grid_size: int = 100, directed_hunting: bool = False): 1380 """ 1381 Pre-compile all Numba-accelerated kernels to avoid first-run latency. 1382 1383 This function executes a single step of the simulation and each analysis 1384 routine on a dummy grid. Because Numba uses Just-In-Time (JIT) compilation, 1385 the first call to a decorated function incurs a compilation overhead. 1386 Running this warmup ensures that subsequent experimental runs are timed 1387 accurately and perform at full speed. 1388 1389 Parameters 1390 ---------- 1391 grid_size : int, optional 1392 The side length of the dummy grid used for warmup (default 100). 1393 directed_hunting : bool, optional 1394 If True, also warms up the directed behavior update kernel (default False). 
1395 1396 Returns 1397 ------- 1398 None 1399 1400 Notes 1401 ----- 1402 This function checks for `NUMBA_AVAILABLE` before execution. It warms up 1403 the `PPKernel` (random and optionally directed), as well as the 1404 spatial analysis functions (`compute_all_pcfs_fast`, `detect_clusters_fast`, etc.). 1405 """ 1406 if not NUMBA_AVAILABLE: 1407 return 1408 1409 set_numba_seed(0) 1410 1411 grid = np.zeros((grid_size, grid_size), dtype=np.int32) 1412 grid[::3, ::3] = 1 1413 grid[::5, ::5] = 2 1414 1415 prey_death_arr = np.full((grid_size, grid_size), 0.05, dtype=np.float64) 1416 prey_death_arr[grid != 1] = np.nan 1417 1418 # Always warmup random kernel 1419 kernel_random = PPKernel(grid_size, grid_size, directed_hunting=False) 1420 kernel_random.update(grid.copy(), prey_death_arr.copy(), 0.2, 0.05, 0.2, 0.1) 1421 1422 # Warmup directed kernel if requested 1423 if directed_hunting: 1424 kernel_directed = PPKernel(grid_size, grid_size, directed_hunting=True) 1425 kernel_directed.update(grid.copy(), prey_death_arr.copy(), 0.2, 0.05, 0.2, 0.1) 1426 1427 # Warmup analysis functions 1428 _ = compute_all_pcfs_fast(grid, max_distance=20.0, n_bins=20) 1429 _ = measure_cluster_sizes_fast(grid, 1) 1430 _ = detect_clusters_fast(grid, 1) 1431 _ = get_cluster_stats_fast(grid, 1) 1432 1433 1434def benchmark_kernels(grid_size: int = 100, n_runs: int = 20): 1435 """ 1436 Benchmark the execution performance of random vs. directed update kernels. 1437 1438 This utility measures the average time per simulation step for both the 1439 stochastic (random neighbor) and heuristic (directed hunting/reproduction) 1440 update strategies. It accounts for the computational overhead introduced 1441 by the "intelligent" search logic used in directed mode. 1442 1443 Parameters 1444 ---------- 1445 grid_size : int, optional 1446 The side length of the square simulation grid (default 100). 1447 n_runs : int, optional 1448 The number of iterations to perform for averaging performance (default 20). 
1449 1450 Returns 1451 ------- 1452 t_random : float 1453 Average time per step for the random kernel in milliseconds. 1454 t_directed : float 1455 Average time per step for the directed kernel in milliseconds. 1456 1457 Notes 1458 ----- 1459 The function ensures a fair comparison by: 1460 1. Using a fixed seed for reproducible initial grid states. 1461 2. Warming up Numba kernels before timing to exclude JIT compilation latency. 1462 3. Copying the grid and death arrays for each iteration to maintain 1463 consistent population densities throughout the benchmark. 1464 """ 1465 import time 1466 1467 print("=" * 60) 1468 print(f"KERNEL BENCHMARK ({grid_size}x{grid_size}, {n_runs} runs)") 1469 print(f"Numba available: {NUMBA_AVAILABLE}") 1470 print("=" * 60) 1471 1472 np.random.seed(42) 1473 grid = np.zeros((grid_size, grid_size), dtype=np.int32) 1474 n_prey = int(grid_size * grid_size * 0.30) 1475 n_pred = int(grid_size * grid_size * 0.15) 1476 positions = np.random.permutation(grid_size * grid_size) 1477 for pos in positions[:n_prey]: 1478 grid[pos // grid_size, pos % grid_size] = 1 1479 for pos in positions[n_prey : n_prey + n_pred]: 1480 grid[pos // grid_size, pos % grid_size] = 2 1481 1482 prey_death_arr = np.full((grid_size, grid_size), 0.05, dtype=np.float64) 1483 prey_death_arr[grid != 1] = np.nan 1484 1485 print(f"Initial: {np.sum(grid == 1)} prey, {np.sum(grid == 2)} predators") 1486 1487 # Warmup both kernels 1488 warmup_numba_kernels(grid_size, directed_hunting=True) 1489 1490 # Benchmark random kernel 1491 kernel_random = PPKernel(grid_size, grid_size, directed_hunting=False) 1492 t0 = time.perf_counter() 1493 for _ in range(n_runs): 1494 test_grid = grid.copy() 1495 test_arr = prey_death_arr.copy() 1496 kernel_random.update(test_grid, test_arr, 0.2, 0.05, 0.2, 0.1) 1497 t_random = (time.perf_counter() - t0) / n_runs * 1000 1498 1499 # Benchmark directed kernel 1500 kernel_directed = PPKernel(grid_size, grid_size, directed_hunting=True) 1501 t0 = 
time.perf_counter() 1502 for _ in range(n_runs): 1503 test_grid = grid.copy() 1504 test_arr = prey_death_arr.copy() 1505 kernel_directed.update(test_grid, test_arr, 0.2, 0.05, 0.2, 0.1) 1506 t_directed = (time.perf_counter() - t0) / n_runs * 1000 1507 1508 print(f"\nRandom kernel: {t_random:.2f} ms/step") 1509 print(f"Directed kernel: {t_directed:.2f} ms/step") 1510 print( 1511 f"Overhead: {t_directed - t_random:.2f} ms (+{100*(t_directed/t_random - 1):.1f}%)" 1512 ) 1513 1514 return t_random, t_directed 1515 1516 1517def benchmark_cluster_detection(grid_size: int = 100, n_runs: int = 20): 1518 """ 1519 Benchmark the performance of different cluster detection and analysis routines. 1520 1521 This function evaluates three levels of spatial analysis: 1522 1. Size measurement only (fastest, no label map). 1523 2. Full detection (returns label map and size dictionary). 1524 3. Comprehensive statistics (calculates distributions, means, and order parameters). 1525 1526 Parameters 1527 ---------- 1528 grid_size : int, optional 1529 Side length of the square grid for benchmarking (default 100). 1530 n_runs : int, optional 1531 Number of iterations to average for performance results (default 20). 1532 1533 Returns 1534 ------- 1535 stats : dict 1536 The result dictionary from the final comprehensive statistics run. 1537 1538 Notes 1539 ----- 1540 The benchmark uses a fixed prey density of 30% to ensure a representative 1541 distribution of clusters. It pre-warms the Numba kernels to ensure that 1542 the measurements reflect execution speed rather than compilation time. 
1543 """ 1544 import time 1545 1546 print("=" * 60) 1547 print(f"CLUSTER DETECTION BENCHMARK ({grid_size}x{grid_size})") 1548 print(f"Numba available: {NUMBA_AVAILABLE}") 1549 print("=" * 60) 1550 1551 np.random.seed(42) 1552 grid = np.zeros((grid_size, grid_size), dtype=np.int32) 1553 n_prey = int(grid_size * grid_size * 0.30) 1554 positions = np.random.permutation(grid_size * grid_size)[:n_prey] 1555 for pos in positions: 1556 grid[pos // grid_size, pos % grid_size] = 1 1557 1558 print(f"Prey cells: {np.sum(grid == 1)}") 1559 1560 # Warmup 1561 _ = measure_cluster_sizes_fast(grid, 1) 1562 _ = detect_clusters_fast(grid, 1) 1563 _ = get_cluster_stats_fast(grid, 1) 1564 1565 # Benchmark sizes only 1566 t0 = time.perf_counter() 1567 for _ in range(n_runs): 1568 sizes = measure_cluster_sizes_fast(grid, 1) 1569 t_sizes = (time.perf_counter() - t0) / n_runs * 1000 1570 print(f"\nmeasure_cluster_sizes_fast: {t_sizes:.2f} ms ({len(sizes)} clusters)") 1571 1572 # Benchmark full detection 1573 t0 = time.perf_counter() 1574 for _ in range(n_runs): 1575 labels, size_dict = detect_clusters_fast(grid, 1) 1576 t_detect = (time.perf_counter() - t0) / n_runs * 1000 1577 print(f"detect_clusters_fast: {t_detect:.2f} ms ({len(size_dict)} clusters)") 1578 1579 # Benchmark full stats 1580 t0 = time.perf_counter() 1581 for _ in range(n_runs): 1582 stats = get_cluster_stats_fast(grid, 1) 1583 t_stats = (time.perf_counter() - t0) / n_runs * 1000 1584 print(f"get_cluster_stats_fast: {t_stats:.2f} ms") 1585 1586 print( 1587 f"\nOverhead for labels: {t_detect - t_sizes:.2f} ms (+{100*(t_detect/t_sizes - 1):.0f}%)" 1588 ) 1589 1590 return stats 1591 1592 1593if __name__ == "__main__": 1594 print("\n" + "=" * 60) 1595 print("NUMBA-OPTIMIZED PP MODULE - BENCHMARKS") 1596 print("=" * 60 + "\n") 1597 1598 # Run kernel benchmarks 1599 benchmark_kernels(100) 1600 1601 print("\n") 1602 1603 # Run cluster benchmarks 1604 stats = benchmark_cluster_detection(100) 1605 print( 1606 f"\nSample stats: 
largest={stats['largest']}, " 1607 f"largest_fraction={stats['largest_fraction']:.3f}, " 1608 f"n_clusters={stats['n_clusters']}" 1609 )
81@njit(cache=True) 82def set_numba_seed(seed: int) -> None: 83 """ 84 Seed Numba's internal random number generator from within a JIT context. 85 86 This function ensures that Numba's independent random number generator 87 is synchronized with the provided seed, enabling reproducibility for 88 jit-compiled functions that use NumPy's random operations. 89 90 Parameters 91 ---------- 92 seed : int 93 The integer value used to initialize the random number generator. 94 95 Returns 96 ------- 97 None 98 99 Notes 100 ----- 101 Because Numba maintains its own internal state for random number 102 generation, calling `np.random.seed()` in standard Python code will not 103 affect jit-compiled functions. This helper must be called to bridge 104 that gap. 105 """ 106 np.random.seed(seed)
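When Numba is missing, the no-op `njit` fallback defined in the import guard at the top of the module makes `set_numba_seed` equivalent to a plain `np.random.seed` call; either way, reseeding must reproduce the same stream. A quick check of that contract, using a hypothetical stand-in for the seeded draw:

```python
import numpy as np


def draws_after_seed(seed, n=3):
    # Stand-in for set_numba_seed followed by random draws; under the no-op
    # njit fallback this is exactly what the module executes.
    np.random.seed(seed)
    return np.random.random(n)


a = draws_after_seed(123)
b = draws_after_seed(123)
```

Under Numba, the same call must happen *inside* a jitted function (as `set_numba_seed` does) for the JIT-side generator state to be affected.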
424class PPKernel: 425 """ 426 Wrapper for predator-prey kernel with pre-allocated buffers. 427 428 This class manages the spatial configuration and memory buffers required 429 for the Numba-accelerated update kernels. By pre-allocating the 430 `occupied_buffer`, it avoids expensive memory allocations during the 431 simulation loop. 432 433 Parameters 434 ---------- 435 rows : int 436 Number of rows in the simulation grid. 437 cols : int 438 Number of columns in the simulation grid. 439 neighborhood : {'moore', 'von_neumann'}, optional 440 The neighborhood type determining adjacent cells. 'moore' includes 441 diagonals (8 neighbors), 'von_neumann' does not (4 neighbors). 442 Default is 'moore'. 443 directed_hunting : bool, optional 444 If True, uses the directed behavior kernel where species search for 445 targets. If False, uses random neighbor selection. Default is False. 446 447 Attributes 448 ---------- 449 rows : int 450 Grid row count. 451 cols : int 452 Grid column count. 453 directed_hunting : bool 454 Toggle for intelligent behavior logic. 
455 """ 456 457 def __init__( 458 self, 459 rows: int, 460 cols: int, 461 neighborhood: str = "moore", 462 directed_hunting: bool = False, 463 ): 464 self.rows = rows 465 self.cols = cols 466 self.directed_hunting = directed_hunting 467 self._occupied_buffer = np.empty((rows * cols, 2), dtype=np.int32) 468 469 if neighborhood == "moore": 470 self._dr = np.array([-1, -1, -1, 0, 0, 1, 1, 1], dtype=np.int32) 471 self._dc = np.array([-1, 0, 1, -1, 1, -1, 0, 1], dtype=np.int32) 472 else: # von Neumann 473 self._dr = np.array([-1, 1, 0, 0], dtype=np.int32) 474 self._dc = np.array([0, 0, -1, 1], dtype=np.int32) 475 476 def update( 477 self, 478 grid: np.ndarray, 479 prey_death_arr: np.ndarray, 480 prey_birth: float, 481 prey_death: float, 482 pred_birth: float, 483 pred_death: float, 484 evolve_sd: float = 0.1, 485 evolve_min: float = 0.001, 486 evolve_max: float = 0.1, 487 evolution_stopped: bool = True, 488 ) -> np.ndarray: 489 """ 490 Execute a single asynchronous update step using the configured kernel. 491 492 Parameters 493 ---------- 494 grid : np.ndarray 495 The current 2D simulation grid. 496 prey_death_arr : np.ndarray 497 2D array of individual prey mortality rates. 498 prey_birth : float 499 Prey reproduction probability. 500 prey_death : float 501 Base prey mortality probability. 502 pred_birth : float 503 Predator reproduction (hunting success) probability. 504 pred_death : float 505 Predator mortality probability. 506 evolve_sd : float, optional 507 Mutation standard deviation (default 0.1). 508 evolve_min : float, optional 509 Minimum evolved death rate (default 0.001). 510 evolve_max : float, optional 511 Maximum evolved death rate (default 0.1). 512 evolution_stopped : bool, optional 513 Whether to disable mutation during this step (default True). 514 515 Returns 516 ------- 517 np.ndarray 518 The updated grid after one full asynchronous pass. 
519 """ 520 if self.directed_hunting: 521 return _pp_async_kernel_directed( 522 grid, 523 prey_death_arr, 524 prey_birth, 525 prey_death, 526 pred_birth, 527 pred_death, 528 self._dr, 529 self._dc, 530 evolve_sd, 531 evolve_min, 532 evolve_max, 533 evolution_stopped, 534 self._occupied_buffer, 535 ) 536 else: 537 return _pp_async_kernel_random( 538 grid, 539 prey_death_arr, 540 prey_birth, 541 prey_death, 542 pred_birth, 543 pred_death, 544 self._dr, 545 self._dc, 546 evolve_sd, 547 evolve_min, 548 evolve_max, 549 evolution_stopped, 550 self._occupied_buffer, 551 )
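The offset tables built in `__init__` fix the neighborhood geometry. A small helper (hypothetical, not part of the class) using the same tables makes the neighbor counts and the periodic wrap-around easy to verify:

```python
import numpy as np


def neighbors(r, c, rows, cols, neighborhood="moore"):
    """Periodic neighbor coordinates, mirroring PPKernel's offset tables."""
    if neighborhood == "moore":
        dr = np.array([-1, -1, -1, 0, 0, 1, 1, 1], dtype=np.int32)
        dc = np.array([-1, 0, 1, -1, 1, -1, 0, 1], dtype=np.int32)
    else:  # von Neumann
        dr = np.array([-1, 1, 0, 0], dtype=np.int32)
        dc = np.array([0, 0, -1, 1], dtype=np.int32)
    # Modulo arithmetic wraps coordinates across the toroidal grid edges.
    return [((r + a) % rows, (c + b) % cols) for a, b in zip(dr, dc)]


moore = neighbors(0, 0, 10, 10, "moore")
vn = neighbors(0, 0, 10, 10, "von_neumann")
```

Note that the corner cell (0, 0) picks up (9, 9) as a Moore neighbor: the kernels treat the grid as a torus, never as a bounded box.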
771def measure_cluster_sizes_fast( 772 grid: np.ndarray, 773 species: int, 774 neighborhood: str = "moore", 775) -> np.ndarray: 776 """ 777 Measure cluster sizes for a specific species using Numba-accelerated flood fill. 778 779 This function provides a high-performance interface for calculating cluster 780 size statistics without the overhead of generating a full label map. It is 781 optimized for large-scale simulation analysis where only distribution 782 metrics (e.g., mean size, max size) are required. 783 784 Parameters 785 ---------- 786 grid : np.ndarray 787 A 2D array representing the simulation environment. 788 species : int 789 The target species identifier (e.g., 1 for Prey, 2 for Predator). 790 neighborhood : {'moore', 'neumann'}, optional 791 The connectivity rule. 'moore' uses 8-way connectivity (including diagonals); 792 'neumann' uses 4-way connectivity. Default is 'moore'. 793 794 Returns 795 ------- 796 cluster_sizes : np.ndarray 797 A 1D array of integers, where each element is the cell count of an 798 identified cluster. 799 800 Notes 801 ----- 802 The input grid is cast to `int32` to ensure compatibility with the 803 underlying JIT-compiled `_measure_clusters` kernel. 804 805 Examples 806 -------- 807 >>> sizes = measure_cluster_sizes_fast(grid, species=1, neighborhood='moore') 808 >>> if sizes.size > 0: 809 ... print(f"Largest cluster: {sizes.max()}") 810 """ 811 grid_int = np.asarray(grid, dtype=np.int32) 812 moore = neighborhood == "moore" 813 return _measure_clusters(grid_int, np.int32(species), moore)
816def detect_clusters_fast( 817 grid: np.ndarray, 818 species: int, 819 neighborhood: str = "moore", 820) -> Tuple[np.ndarray, Dict[int, int]]: 821 """ 822 Perform full cluster detection with labels using Numba acceleration. 823 824 This function returns a label array for spatial analysis and a dictionary 825 of cluster sizes. It is significantly faster than standard Python or 826 SciPy equivalents for large simulation grids. 827 828 Parameters 829 ---------- 830 grid : np.ndarray 831 A 2D array representing the simulation environment. 832 species : int 833 The target species identifier (e.g., 1 for Prey, 2 for Predator). 834 neighborhood : {'moore', 'neumann'}, optional 835 The connectivity rule. 'moore' uses 8-way connectivity; 'neumann' 836 uses 4-way connectivity. Default is 'moore'. 837 838 Returns 839 ------- 840 labels : np.ndarray 841 A 2D int32 array where each cell contains its unique cluster ID. 842 Cells not belonging to the target species are 0. 843 sizes : dict 844 A dictionary mapping cluster IDs to their respective cell counts. 845 846 Notes 847 ----- 848 The underlying Numba kernel uses a stack-based flood fill to avoid 849 recursion limits and handles periodic boundary conditions. 850 851 Examples 852 -------- 853 >>> labels, sizes = detect_clusters_fast(grid, species=1) 854 >>> if sizes: 855 ... largest_id = max(sizes, key=sizes.get) 856 ... print(f"Cluster {largest_id} size: {sizes[largest_id]}") 857 """ 858 grid_int = np.asarray(grid, dtype=np.int32) 859 moore = neighborhood == "moore" 860 labels, sizes_arr = _detect_clusters_numba(grid_int, np.int32(species), moore) 861 sizes_dict = {i + 1: int(sizes_arr[i]) for i in range(len(sizes_arr))} 862 return labels, sizes_dict
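The sizes dictionary returned alongside the label map is fully determined by the labels themselves, so the pairing can be sanity-checked with `np.unique` — an illustrative consistency check, not module code:

```python
import numpy as np


def sizes_from_labels(labels):
    """Recover {cluster_id: size} from a label map where 0 is background."""
    ids, counts = np.unique(labels[labels > 0], return_counts=True)
    return {int(i): int(n) for i, n in zip(ids, counts)}


labels = np.array([[1, 1, 0],
                   [0, 2, 2],
                   [0, 0, 2]])
```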
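The stack-based, periodic flood fill that the docstring's Notes describe can be sketched as a plain-Python reference. This is an illustrative, un-jitted equivalent, not the module's actual `_detect_clusters_numba` kernel; the function name `flood_fill_labels` is hypothetical:

```python
import numpy as np

def flood_fill_labels(grid, species, moore=True):
    """Label connected clusters of `species` on a toroidal grid.

    Pure-Python reference sketch of the stack-based flood fill; the
    module's real kernel is Numba-compiled and much faster.
    """
    rows, cols = grid.shape
    labels = np.zeros((rows, cols), dtype=np.int32)
    if moore:  # 8-way connectivity
        offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                   (0, 1), (1, -1), (1, 0), (1, 1)]
    else:      # von Neumann, 4-way connectivity
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    sizes = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != species or labels[r, c] != 0:
                continue
            next_label += 1
            labels[r, c] = next_label
            stack = [(r, c)]  # explicit stack avoids recursion limits
            size = 0
            while stack:
                cr, cc = stack.pop()
                size += 1
                for dr, dc in offsets:
                    # modular indexing implements periodic boundaries
                    nr, nc = (cr + dr) % rows, (cc + dc) % cols
                    if grid[nr, nc] == species and labels[nr, nc] == 0:
                        labels[nr, nc] = next_label
                        stack.append((nr, nc))
            sizes[next_label] = size
    return labels, sizes
```

Note how the periodic wrap makes opposite corners of the grid Moore neighbors, which pure SciPy `ndimage.label` (non-periodic by default) would count as separate clusters.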
```python
def get_cluster_stats_fast(
    grid: np.ndarray,
    species: int,
    neighborhood: str = "moore",
) -> Dict:
    """
    Compute comprehensive cluster statistics for a species using Numba acceleration.

    This function integrates cluster detection and labeling to provide a
    full suite of spatial metrics. It calculates the cluster size distribution
    and the largest cluster fraction, which often serves as an order
    parameter in percolation theory and Phase 1-3 analyses.

    Parameters
    ----------
    grid : np.ndarray
        A 2D array representing the simulation environment.
    species : int
        The target species identifier (e.g., 1 for Prey, 2 for Predator).
    neighborhood : {'moore', 'neumann'}, optional
        The connectivity rule. 'moore' uses 8-way connectivity; 'neumann'
        uses 4-way connectivity. Default is 'moore'.

    Returns
    -------
    stats : dict
        A dictionary containing:
        - 'n_clusters': Total count of isolated clusters.
        - 'sizes': Sorted array (descending) of all cluster sizes.
        - 'largest': Size of the single largest cluster.
        - 'largest_fraction': Size of the largest cluster divided by
          the total population of the species.
        - 'mean_size': Average size of all clusters.
        - 'size_distribution': Frequency mapping of {size: count}.
        - 'labels': 2D array of unique cluster IDs.
        - 'size_dict': Mapping of {label_id: size}.

    Examples
    --------
    >>> stats = get_cluster_stats_fast(grid, species=1)
    >>> print(f"Found {stats['n_clusters']} prey clusters.")
    >>> print(f"Order parameter: {stats['largest_fraction']:.3f}")
    """
    labels, size_dict = detect_clusters_fast(grid, species, neighborhood)

    if len(size_dict) == 0:
        return {
            "n_clusters": 0,
            "sizes": np.array([], dtype=np.int32),
            "largest": 0,
            "largest_fraction": 0.0,
            "mean_size": 0.0,
            "size_distribution": {},
            "labels": labels,
            "size_dict": size_dict,
        }

    sizes = np.array(list(size_dict.values()), dtype=np.int32)
    sizes_sorted = np.sort(sizes)[::-1]
    total_pop = int(np.sum(sizes))
    largest = int(sizes_sorted[0])

    size_dist = {}
    for s in sizes:
        s_int = int(s)
        size_dist[s_int] = size_dist.get(s_int, 0) + 1

    return {
        "n_clusters": len(size_dict),
        "sizes": sizes_sorted,
        "largest": largest,
        "largest_fraction": float(largest) / total_pop if total_pop > 0 else 0.0,
        "mean_size": float(np.mean(sizes)),
        "size_distribution": size_dist,
        "labels": labels,
        "size_dict": size_dict,
    }
```
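The summary fields in `get_cluster_stats_fast` are all derived from the `{label: size}` mapping alone. A minimal sketch of that derivation, including the largest-cluster-fraction order parameter (the helper name `summarize_sizes` is hypothetical):

```python
import numpy as np

def summarize_sizes(size_dict):
    """Derive summary statistics from a {label: size} cluster mapping.

    Illustrative mirror of the arithmetic inside get_cluster_stats_fast.
    """
    if not size_dict:
        return {"n_clusters": 0, "largest": 0,
                "largest_fraction": 0.0, "mean_size": 0.0}
    sizes = np.array(list(size_dict.values()), dtype=np.int32)
    total = int(sizes.sum())       # total population of the species
    largest = int(sizes.max())
    return {
        "n_clusters": len(size_dict),
        "largest": largest,
        # order parameter in [0, 1]: fraction of the population
        # belonging to the single largest cluster
        "largest_fraction": largest / total,
        "mean_size": float(sizes.mean()),
    }
```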
```python
def compute_pcf_periodic_fast(
    positions_i: np.ndarray,
    positions_j: np.ndarray,
    grid_shape: Tuple[int, int],
    max_distance: float,
    n_bins: int = 50,
    self_correlation: bool = False,
) -> Tuple[np.ndarray, np.ndarray, int]:
    """
    Compute the Pair Correlation Function (PCF) using cell-list acceleration.

    This high-level function coordinates the spatial hashing and histogram
    calculation to determine the $g(r)$ function. It normalizes the resulting
    histogram by the expected number of pairs in an ideal gas of the same
    density, accounting for the toroidal area of each radial bin.

    Parameters
    ----------
    positions_i : np.ndarray
        (N, 2) array of coordinates for species I.
    positions_j : np.ndarray
        (M, 2) array of coordinates for species J.
    grid_shape : tuple of int
        The (rows, cols) dimensions of the simulation grid.
    max_distance : float
        The maximum radius to calculate correlations for.
    n_bins : int, optional
        Number of bins for the radial distribution (default 50).
    self_correlation : bool, optional
        Set to True if computing the correlation of a species with itself
        to avoid self-counting (default False).

    Returns
    -------
    bin_centers : np.ndarray
        The central radial distance for each histogram bin.
    pcf : np.ndarray
        The normalized $g(r)$ values. A value of 1.0 indicates no spatial
        correlation; > 1.0 indicates clustering; < 1.0 indicates repulsion.
    total_pairs : int
        The total count of pairs found within the `max_distance`.

    Notes
    -----
    The function dynamically determines the optimal number of cells for the
    spatial hash based on the `max_distance` and grid dimensions to maintain
    linear time complexity.
    """
    rows, cols = grid_shape
    L_row, L_col = float(rows), float(cols)
    area = L_row * L_col

    bin_width = max_distance / n_bins
    bin_centers = np.linspace(bin_width / 2, max_distance - bin_width / 2, n_bins)

    if len(positions_i) == 0 or len(positions_j) == 0:
        return bin_centers, np.ones(n_bins), 0

    n_cells = max(4, int(min(rows, cols) / max_distance))

    pos_i = np.ascontiguousarray(positions_i, dtype=np.float64)
    pos_j = np.ascontiguousarray(positions_j, dtype=np.float64)

    indices_j, offsets_j, counts_j, cell_size_r, cell_size_c = _build_cell_list(
        pos_j, n_cells, L_row, L_col
    )

    hist = _pcf_cell_list(
        pos_i,
        pos_j,
        indices_j,
        offsets_j,
        counts_j,
        cell_size_r,
        cell_size_c,
        L_row,
        L_col,
        max_distance,
        n_bins,
        self_correlation,
        n_cells,
    )

    n_i, n_j = len(positions_i), len(positions_j)
    if self_correlation:
        density_product = n_i * (n_i - 1) / (area * area)
    else:
        density_product = n_i * n_j / (area * area)

    expected = np.zeros(n_bins)
    for i in range(n_bins):
        r = bin_centers[i]
        annulus_area = 2 * np.pi * r * bin_width
        expected[i] = density_product * annulus_area * area

    pcf = np.ones(n_bins)
    mask = expected > 1.0
    pcf[mask] = hist[mask] / expected[mask]

    return bin_centers, pcf, int(np.sum(hist))
```
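The distances behind the PCF are toroidal: each pair is measured under the minimum-image convention before being binned. A brute-force O(N*M) reference for the raw histogram that the cell-list kernel produces (an illustrative sketch; `pcf_histogram_bruteforce` is a hypothetical name, and the real kernel is Numba-compiled):

```python
import numpy as np

def pcf_histogram_bruteforce(pos_i, pos_j, L_row, L_col, max_distance,
                             n_bins, self_correlation=False):
    """Raw pair-count histogram with minimum-image periodic distances.

    Brute-force reference for the cell-list kernel; returns the
    unnormalized histogram from which g(r) is computed.
    """
    hist = np.zeros(n_bins, dtype=np.int64)
    bin_width = max_distance / n_bins
    for a, (r1, c1) in enumerate(pos_i):
        for b, (r2, c2) in enumerate(pos_j):
            if self_correlation and a == b:
                continue  # skip self-pairs when I == J
            dr = abs(r1 - r2)
            dc = abs(c1 - c2)
            dr = min(dr, L_row - dr)  # minimum image: wrap around the torus
            dc = min(dc, L_col - dc)
            d = np.hypot(dr, dc)
            if d < max_distance:
                hist[int(d / bin_width)] += 1
    return hist
```

Dividing this histogram by the ideal-gas expectation per bin (density product times annulus area times total area, as in the function body above) yields g(r).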
```python
def compute_all_pcfs_fast(
    grid: np.ndarray,
    max_distance: Optional[float] = None,
    n_bins: int = 50,
) -> Dict[str, Tuple[np.ndarray, np.ndarray, int]]:
    """
    Compute all three species Pair Correlation Functions (PCFs) using cell-list acceleration.

    This function calculates the spatial auto-correlations (Prey-Prey,
    Predator-Predator) and the cross-correlation (Prey-Predator) for a given
    simulation grid. It identifies particle positions and leverages
    Numba-accelerated cell lists to handle the computations efficiently.

    Parameters
    ----------
    grid : np.ndarray
        2D integer array where 1 represents prey and 2 represents predators.
    max_distance : float, optional
        The maximum radial distance for the correlation. Defaults to 1/4
        of the minimum grid dimension if not provided.
    n_bins : int, optional
        Number of distance bins for the histogram. Default is 50.

    Returns
    -------
    results : dict
        A dictionary with keys 'prey_prey', 'pred_pred', and 'prey_pred'.
        Each value is a tuple containing:
        - bin_centers (np.ndarray): Radial distances.
        - pcf_values (np.ndarray): Normalized g(r) values.
        - pair_count (int): Total number of pairs found.

    Notes
    -----
    The PCF provides insight into the spatial organization of the system.
    g(r) > 1 at short distances indicates aggregation (clustering),
    while g(r) < 1 indicates exclusion or repulsion.
    """
    rows, cols = grid.shape
    if max_distance is None:
        max_distance = min(rows, cols) / 4.0

    prey_pos = np.argwhere(grid == 1)
    pred_pos = np.argwhere(grid == 2)

    results = {}

    dist, pcf, n = compute_pcf_periodic_fast(
        prey_pos,
        prey_pos,
        (rows, cols),
        max_distance,
        n_bins,
        self_correlation=True,
    )
    results["prey_prey"] = (dist, pcf, n)

    dist, pcf, n = compute_pcf_periodic_fast(
        pred_pos,
        pred_pos,
        (rows, cols),
        max_distance,
        n_bins,
        self_correlation=True,
    )
    results["pred_pred"] = (dist, pcf, n)

    dist, pcf, n = compute_pcf_periodic_fast(
        prey_pos,
        pred_pos,
        (rows, cols),
        max_distance,
        n_bins,
        self_correlation=False,
    )
    results["prey_pred"] = (dist, pcf, n)

    return results
```
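The position extraction step relies on the module's grid encoding (0 = empty, 1 = prey, 2 = predator) and `np.argwhere`, which scans in row-major order and returns an (N, 2) array of (row, col) coordinates. A tiny demonstration:

```python
import numpy as np

# Grid encoding used throughout the module: 0 = empty, 1 = prey, 2 = predator.
grid = np.array([[0, 1, 0],
                 [2, 0, 1],
                 [0, 2, 0]], dtype=np.int32)

prey_pos = np.argwhere(grid == 1)  # (N, 2) array of (row, col) coordinates
pred_pos = np.argwhere(grid == 2)

print(prey_pos.tolist())  # [[0, 1], [1, 2]]
print(pred_pos.tolist())  # [[1, 0], [2, 1]]
```

These arrays are exactly what `compute_pcf_periodic_fast` expects for `positions_i` and `positions_j`.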
```python
def warmup_numba_kernels(grid_size: int = 100, directed_hunting: bool = False):
    """
    Pre-compile all Numba-accelerated kernels to avoid first-run latency.

    This function executes a single step of the simulation and each analysis
    routine on a dummy grid. Because Numba uses Just-In-Time (JIT) compilation,
    the first call to a decorated function incurs a compilation overhead.
    Running this warmup ensures that subsequent experimental runs are timed
    accurately and perform at full speed.

    Parameters
    ----------
    grid_size : int, optional
        The side length of the dummy grid used for warmup (default 100).
    directed_hunting : bool, optional
        If True, also warms up the directed behavior update kernel (default False).

    Returns
    -------
    None

    Notes
    -----
    This function checks for `NUMBA_AVAILABLE` before execution. It warms up
    the `PPKernel` (random and optionally directed), as well as the
    spatial analysis functions (`compute_all_pcfs_fast`, `detect_clusters_fast`, etc.).
    """
    if not NUMBA_AVAILABLE:
        return

    set_numba_seed(0)

    grid = np.zeros((grid_size, grid_size), dtype=np.int32)
    grid[::3, ::3] = 1
    grid[::5, ::5] = 2

    prey_death_arr = np.full((grid_size, grid_size), 0.05, dtype=np.float64)
    prey_death_arr[grid != 1] = np.nan

    # Always warmup random kernel
    kernel_random = PPKernel(grid_size, grid_size, directed_hunting=False)
    kernel_random.update(grid.copy(), prey_death_arr.copy(), 0.2, 0.05, 0.2, 0.1)

    # Warmup directed kernel if requested
    if directed_hunting:
        kernel_directed = PPKernel(grid_size, grid_size, directed_hunting=True)
        kernel_directed.update(grid.copy(), prey_death_arr.copy(), 0.2, 0.05, 0.2, 0.1)

    # Warmup analysis functions
    _ = compute_all_pcfs_fast(grid, max_distance=20.0, n_bins=20)
    _ = measure_cluster_sizes_fast(grid, 1)
    _ = detect_clusters_fast(grid, 1)
    _ = get_cluster_stats_fast(grid, 1)
```
```python
def benchmark_kernels(grid_size: int = 100, n_runs: int = 20):
    """
    Benchmark the execution performance of random vs. directed update kernels.

    This utility measures the average time per simulation step for both the
    stochastic (random neighbor) and heuristic (directed hunting/reproduction)
    update strategies. It accounts for the computational overhead introduced
    by the "intelligent" search logic used in directed mode.

    Parameters
    ----------
    grid_size : int, optional
        The side length of the square simulation grid (default 100).
    n_runs : int, optional
        The number of iterations to perform for averaging performance (default 20).

    Returns
    -------
    t_random : float
        Average time per step for the random kernel in milliseconds.
    t_directed : float
        Average time per step for the directed kernel in milliseconds.

    Notes
    -----
    The function ensures a fair comparison by:
    1. Using a fixed seed for reproducible initial grid states.
    2. Warming up Numba kernels before timing to exclude JIT compilation latency.
    3. Copying the grid and death arrays for each iteration to maintain
       consistent population densities throughout the benchmark.
    """
    import time

    print("=" * 60)
    print(f"KERNEL BENCHMARK ({grid_size}x{grid_size}, {n_runs} runs)")
    print(f"Numba available: {NUMBA_AVAILABLE}")
    print("=" * 60)

    np.random.seed(42)
    grid = np.zeros((grid_size, grid_size), dtype=np.int32)
    n_prey = int(grid_size * grid_size * 0.30)
    n_pred = int(grid_size * grid_size * 0.15)
    positions = np.random.permutation(grid_size * grid_size)
    for pos in positions[:n_prey]:
        grid[pos // grid_size, pos % grid_size] = 1
    for pos in positions[n_prey : n_prey + n_pred]:
        grid[pos // grid_size, pos % grid_size] = 2

    prey_death_arr = np.full((grid_size, grid_size), 0.05, dtype=np.float64)
    prey_death_arr[grid != 1] = np.nan

    print(f"Initial: {np.sum(grid == 1)} prey, {np.sum(grid == 2)} predators")

    # Warmup both kernels
    warmup_numba_kernels(grid_size, directed_hunting=True)

    # Benchmark random kernel
    kernel_random = PPKernel(grid_size, grid_size, directed_hunting=False)
    t0 = time.perf_counter()
    for _ in range(n_runs):
        test_grid = grid.copy()
        test_arr = prey_death_arr.copy()
        kernel_random.update(test_grid, test_arr, 0.2, 0.05, 0.2, 0.1)
    t_random = (time.perf_counter() - t0) / n_runs * 1000

    # Benchmark directed kernel
    kernel_directed = PPKernel(grid_size, grid_size, directed_hunting=True)
    t0 = time.perf_counter()
    for _ in range(n_runs):
        test_grid = grid.copy()
        test_arr = prey_death_arr.copy()
        kernel_directed.update(test_grid, test_arr, 0.2, 0.05, 0.2, 0.1)
    t_directed = (time.perf_counter() - t0) / n_runs * 1000

    print(f"\nRandom kernel: {t_random:.2f} ms/step")
    print(f"Directed kernel: {t_directed:.2f} ms/step")
    print(
        f"Overhead: {t_directed - t_random:.2f} ms (+{100*(t_directed/t_random - 1):.1f}%)"
    )

    return t_random, t_directed
```
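The benchmark seeds its grid cell-by-cell from a shuffled index array. The same non-overlapping placement can be done with vectorized fancy indexing; a sketch (the helper name is hypothetical, and it uses `np.random.default_rng` rather than the legacy `np.random.seed`, so exact cell placement differs from the benchmark's):

```python
import numpy as np

def init_benchmark_grid(grid_size=100, prey_frac=0.30, pred_frac=0.15, seed=42):
    """Seed a grid with non-overlapping prey (1) and predators (2).

    Vectorized equivalent of the benchmark's per-cell loop: a single
    permutation guarantees prey and predators never share a cell.
    """
    rng = np.random.default_rng(seed)
    flat = np.zeros(grid_size * grid_size, dtype=np.int32)
    positions = rng.permutation(grid_size * grid_size)
    n_prey = int(grid_size * grid_size * prey_frac)
    n_pred = int(grid_size * grid_size * pred_frac)
    flat[positions[:n_prey]] = 1                 # prey
    flat[positions[n_prey:n_prey + n_pred]] = 2  # predators
    return flat.reshape(grid_size, grid_size)
```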
```python
def benchmark_cluster_detection(grid_size: int = 100, n_runs: int = 20):
    """
    Benchmark the performance of different cluster detection and analysis routines.

    This function evaluates three levels of spatial analysis:
    1. Size measurement only (fastest, no label map).
    2. Full detection (returns label map and size dictionary).
    3. Comprehensive statistics (calculates distributions, means, and order parameters).

    Parameters
    ----------
    grid_size : int, optional
        Side length of the square grid for benchmarking (default 100).
    n_runs : int, optional
        Number of iterations to average for performance results (default 20).

    Returns
    -------
    stats : dict
        The result dictionary from the final comprehensive statistics run.

    Notes
    -----
    The benchmark uses a fixed prey density of 30% to ensure a representative
    distribution of clusters. It pre-warms the Numba kernels to ensure that
    the measurements reflect execution speed rather than compilation time.
    """
    import time

    print("=" * 60)
    print(f"CLUSTER DETECTION BENCHMARK ({grid_size}x{grid_size})")
    print(f"Numba available: {NUMBA_AVAILABLE}")
    print("=" * 60)

    np.random.seed(42)
    grid = np.zeros((grid_size, grid_size), dtype=np.int32)
    n_prey = int(grid_size * grid_size * 0.30)
    positions = np.random.permutation(grid_size * grid_size)[:n_prey]
    for pos in positions:
        grid[pos // grid_size, pos % grid_size] = 1

    print(f"Prey cells: {np.sum(grid == 1)}")

    # Warmup
    _ = measure_cluster_sizes_fast(grid, 1)
    _ = detect_clusters_fast(grid, 1)
    _ = get_cluster_stats_fast(grid, 1)

    # Benchmark sizes only
    t0 = time.perf_counter()
    for _ in range(n_runs):
        sizes = measure_cluster_sizes_fast(grid, 1)
    t_sizes = (time.perf_counter() - t0) / n_runs * 1000
    print(f"\nmeasure_cluster_sizes_fast: {t_sizes:.2f} ms ({len(sizes)} clusters)")

    # Benchmark full detection
    t0 = time.perf_counter()
    for _ in range(n_runs):
        labels, size_dict = detect_clusters_fast(grid, 1)
    t_detect = (time.perf_counter() - t0) / n_runs * 1000
    print(f"detect_clusters_fast: {t_detect:.2f} ms ({len(size_dict)} clusters)")

    # Benchmark full stats
    t0 = time.perf_counter()
    for _ in range(n_runs):
        stats = get_cluster_stats_fast(grid, 1)
    t_stats = (time.perf_counter() - t0) / n_runs * 1000
    print(f"get_cluster_stats_fast: {t_stats:.2f} ms")

    print(
        f"\nOverhead for labels: {t_detect - t_sizes:.2f} ms (+{100*(t_detect/t_sizes - 1):.0f}%)"
    )

    return stats
```