models.numba_optimized
======================

Numba-Optimized Kernels
-----------------------

This module provides Numba-accelerated kernels for the predator-prey cellular automaton, including update kernels and spatial analysis functions.

Classes
-------
PPKernel
    Wrapper for predator-prey update kernels with pre-allocated buffers.

Cluster Analysis
----------------
```python
measure_cluster_sizes_fast  # Fast cluster size measurement (sizes only).
detect_clusters_fast        # Full cluster detection with labels.
get_cluster_stats_fast      # Comprehensive cluster statistics.
```
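All three routines share the same stack-based flood fill with periodic (toroidal) boundaries that the Numba kernels below implement. A minimal pure-Python sketch of that connectivity logic (the name `cluster_sizes` is illustrative, not part of this module):

```python
import numpy as np

def cluster_sizes(grid, species, moore=True):
    """Sizes of connected clusters of `species` with periodic boundaries."""
    rows, cols = grid.shape
    if moore:
        shifts = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                  (0, 1), (1, -1), (1, 0), (1, 1)]
    else:
        shifts = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    visited = np.zeros((rows, cols), dtype=bool)
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != species or visited[r, c]:
                continue
            # Stack-based flood fill, as in the Numba kernels
            stack = [(r, c)]
            visited[r, c] = True
            size = 0
            while stack:
                cr, cc = stack.pop()
                size += 1
                for dr, dc in shifts:
                    nr, nc = (cr + dr) % rows, (cc + dc) % cols
                    if not visited[nr, nc] and grid[nr, nc] == species:
                        visited[nr, nc] = True
                        stack.append((nr, nc))
            sizes.append(size)
    return sizes

# Two prey clusters; one wraps across the periodic boundary
g = np.zeros((4, 4), dtype=np.int32)
g[0, 0] = g[3, 0] = 1   # vertically adjacent through the torus edge
g[1, 2] = 1
print(sorted(cluster_sizes(g, species=1)))  # [1, 2]
```

The Moore/von Neumann switch matters for diagonal contacts: two diagonally touching cells form one cluster under Moore connectivity but two under von Neumann.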
Pair Correlation Functions
--------------------------
```python
compute_pcf_periodic_fast  # PCF for two position sets with periodic boundaries.
compute_all_pcfs_fast      # Compute prey-prey, pred-pred, and prey-pred PCFs.
```
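Both PCF helpers measure distances under the minimum image convention on a torus: a displacement larger than half the domain length is wrapped to the nearer edge. A self-contained sketch of that distance rule (`periodic_dist` is an illustrative name, not module API):

```python
import numpy as np

def periodic_dist(p, q, L):
    """Shortest distance between p and q on an L x L torus
    (minimum image convention, as used by the PCF kernels)."""
    d = np.abs(np.asarray(p, dtype=float) - np.asarray(q, dtype=float))
    d = np.where(d > L / 2, L - d, d)   # wrap across the nearer edge
    return float(np.sqrt(np.sum(d * d)))

# On a 10x10 torus, points near opposite corners are actually close
print(periodic_dist((0.5, 0.5), (9.5, 9.5), L=10.0))  # 1.4142135623730951 (= sqrt(2))
```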
Utilities
---------
```python
set_numba_seed         # Seed Numba's internal RNG.
warmup_numba_kernels   # Pre-compile kernels to avoid first-run latency.
```
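`set_numba_seed` exists because Numba keeps its own RNG state: calling `np.random.seed()` from ordinary Python does not affect JIT-compiled functions, so the seed must be set from inside a JIT context. The module also degrades gracefully when Numba is missing, via a no-op `njit` fallback. A self-contained sketch of both ideas (the name `seeded_draw` is illustrative):

```python
import numpy as np

# Mirror of this module's import guard: without Numba, @njit(cache=True)
# becomes a no-op decorator and the same code runs in pure Python.
try:
    from numba import njit
except ImportError:
    def njit(*args, **kwargs):
        def decorator(func):
            return func
        return decorator

@njit(cache=True)
def seeded_draw(seed):
    # Inside a JIT context this seeds Numba's own RNG state;
    # in the pure-Python fallback it seeds NumPy's global RNG.
    np.random.seed(seed)
    return np.random.random()

print(seeded_draw(42) == seeded_draw(42))  # True: reproducible either way
```

Note that the first call to a JIT-compiled function pays compilation latency, which is what `warmup_numba_kernels` is meant to absorb up front.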
Example
-------
```python
from models.numba_optimized import (
    PPKernel,
    get_cluster_stats_fast,
    compute_all_pcfs_fast,
)

# Cluster analysis
stats = get_cluster_stats_fast(grid, species=1)
print(f"Largest cluster: {stats['largest']}")

# PCF computation
pcfs = compute_all_pcfs_fast(grid, max_distance=20.0)
prey_prey_dist, prey_prey_gr, _ = pcfs['prey_prey']
```
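The update kernels behind `PPKernel.update` follow a gather → shuffle → visit pattern: collect all occupied cells, shuffle them to remove directional bias, then apply stochastic rules to each in random order. A stripped-down sketch with a toy death rule only (`async_step` is an illustrative stand-in; the real kernels add reproduction, predation, and evolution, and use a pre-allocated buffer plus an in-place Fisher-Yates shuffle):

```python
import numpy as np

rng = np.random.default_rng(0)

def async_step(grid, p_death=0.1):
    """One asynchronous pass over the grid (toy rule: prey die w.p. p_death)."""
    occupied = np.argwhere(grid != 0)       # gather occupied cells
    perm = rng.permutation(len(occupied))   # stand-in for the Fisher-Yates pass
    for r, c in occupied[perm]:
        if grid[r, c] == 1 and rng.random() < p_death:
            grid[r, c] = 0                  # update in place, visible to later cells
    return grid

g = np.ones((8, 8), dtype=np.int32)
async_step(g, p_death=1.0)
print(int(g.sum()))  # 0: every prey cell died this pass
```

Because updates are applied in place, a cell processed late in the pass already sees the deaths and births of cells processed earlier; the shuffle ensures that ordering carries no spatial bias.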
#!/usr/bin/env python3
"""
Numba-Optimized Kernels
=======================

This module provides Numba-accelerated kernels for the predator-prey
cellular automaton, including update kernels and spatial analysis functions.

Classes
-------
PPKernel
    Wrapper for predator-prey update kernels with pre-allocated buffers.

Cluster Analysis
----------------
```python
measure_cluster_sizes_fast  # Fast cluster size measurement (sizes only).
detect_clusters_fast        # Full cluster detection with labels.
get_cluster_stats_fast      # Comprehensive cluster statistics.
```

Pair Correlation Functions
--------------------------
```python
compute_pcf_periodic_fast  # PCF for two position sets with periodic boundaries.
compute_all_pcfs_fast      # Compute prey-prey, pred-pred, and prey-pred PCFs.
```

Utilities
---------
```python
set_numba_seed         # Seed Numba's internal RNG.
warmup_numba_kernels   # Pre-compile kernels to avoid first-run latency.
```

Example
-------
```python
from models.numba_optimized import (
    PPKernel,
    get_cluster_stats_fast,
    compute_all_pcfs_fast,
)

# Cluster analysis
stats = get_cluster_stats_fast(grid, species=1)
print(f"Largest cluster: {stats['largest']}")

# PCF computation
pcfs = compute_all_pcfs_fast(grid, max_distance=20.0)
prey_prey_dist, prey_prey_gr, _ = pcfs['prey_prey']
```
"""

import numpy as np
from typing import Tuple, Dict, Optional

try:
    from numba import njit, prange

    NUMBA_AVAILABLE = True
except ImportError:
    NUMBA_AVAILABLE = False

    def njit(*args, **kwargs):
        def decorator(func):
            return func

        return decorator

    def prange(*args):
        return range(*args)


# ============================================================================
# RNG SEEDING
# ============================================================================


@njit(cache=True)
def set_numba_seed(seed: int) -> None:
    """
    Seed Numba's internal random number generator from within a JIT context.

    This function ensures that Numba's independent random number generator
    is synchronized with the provided seed, enabling reproducibility for
    JIT-compiled functions that use NumPy's random operations.

    Parameters
    ----------
    seed : int
        The integer value used to initialize the random number generator.

    Returns
    -------
    None

    Notes
    -----
    Because Numba maintains its own internal state for random number
    generation, calling `np.random.seed()` in standard Python code will not
    affect JIT-compiled functions. This helper must be called to bridge
    that gap.
    """
    np.random.seed(seed)


# ============================================================================
# PREDATOR-PREY KERNELS
# ============================================================================


@njit(cache=True)
def _pp_async_kernel_random(
    grid: np.ndarray,
    prey_death_arr: np.ndarray,
    p_birth_val: float,
    p_death_val: float,
    pred_birth_val: float,
    pred_death_val: float,
    dr_arr: np.ndarray,
    dc_arr: np.ndarray,
    evolve_sd: float,
    evolve_min: float,
    evolve_max: float,
    evolution_stopped: bool,
    occupied_buffer: np.ndarray,
) -> np.ndarray:
    """
    Asynchronous predator-prey update kernel with random neighbor selection.

    This Numba-accelerated kernel performs an asynchronous update of the
    simulation grid. It identifies all occupied cells, shuffles them to
    ensure unbiased processing, and applies stochastic rules for prey
    mortality, prey reproduction (with optional parameter evolution),
    predator mortality, and predation.

    Parameters
    ----------
    grid : np.ndarray
        2D integer array representing the simulation grid (0: Empty, 1: Prey, 2: Predator).
    prey_death_arr : np.ndarray
        2D float array storing the individual prey death rates for evolution tracking.
    p_birth_val : float
        Base probability of prey reproduction into an adjacent empty cell.
    p_death_val : float
        Base probability of prey death (though individual rates in `prey_death_arr` are used).
    pred_birth_val : float
        Probability of a predator reproducing after consuming prey.
    pred_death_val : float
        Probability of a predator dying.
    dr_arr : np.ndarray
        Array of row offsets defining the neighborhood.
    dc_arr : np.ndarray
        Array of column offsets defining the neighborhood.
    evolve_sd : float
        Standard deviation of the mutation applied to the prey death rate during reproduction.
    evolve_min : float
        Lower bound for the evolved prey death rate.
    evolve_max : float
        Upper bound for the evolved prey death rate.
    evolution_stopped : bool
        If True, offspring inherit the parent's death rate without mutation.
    occupied_buffer : np.ndarray
        Pre-allocated 2D array used to store and shuffle coordinates of occupied cells.

    Returns
    -------
    grid : np.ndarray
        The updated simulation grid.

    Notes
    -----
    The kernel uses periodic boundary conditions. The Fisher-Yates shuffle on
    `occupied_buffer` ensures that the asynchronous updates do not introduce
    directional bias.
    """
    rows, cols = grid.shape
    n_shifts = len(dr_arr)

    # Collect occupied cells
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != 0:
                occupied_buffer[count, 0] = r
                occupied_buffer[count, 1] = c
                count += 1

    # Fisher-Yates shuffle
    for i in range(count - 1, 0, -1):
        j = np.random.randint(0, i + 1)
        occupied_buffer[i, 0], occupied_buffer[j, 0] = (
            occupied_buffer[j, 0],
            occupied_buffer[i, 0],
        )
        occupied_buffer[i, 1], occupied_buffer[j, 1] = (
            occupied_buffer[j, 1],
            occupied_buffer[i, 1],
        )

    # Process each occupied cell
    for i in range(count):
        r = occupied_buffer[i, 0]
        c = occupied_buffer[i, 1]

        state = grid[r, c]
        if state == 0:
            continue

        # Random neighbor selection
        nbi = np.random.randint(0, n_shifts)
        nr = (r + dr_arr[nbi]) % rows
        nc = (c + dc_arr[nbi]) % cols

        if state == 1:  # PREY
            if np.random.random() < prey_death_arr[r, c]:
                grid[r, c] = 0
                prey_death_arr[r, c] = np.nan
            elif grid[nr, nc] == 0:
                if np.random.random() < p_birth_val:
                    grid[nr, nc] = 1
                    parent_val = prey_death_arr[r, c]
                    if not evolution_stopped:
                        child_val = parent_val + np.random.normal(0, evolve_sd)
                        if child_val < evolve_min:
                            child_val = evolve_min
                        if child_val > evolve_max:
                            child_val = evolve_max
                        prey_death_arr[nr, nc] = child_val
                    else:
                        prey_death_arr[nr, nc] = parent_val

        elif state == 2:  # PREDATOR
            if np.random.random() < pred_death_val:
                grid[r, c] = 0
            elif grid[nr, nc] == 1:
                if np.random.random() < pred_birth_val:
                    grid[nr, nc] = 2
                    prey_death_arr[nr, nc] = np.nan

    return grid


@njit(cache=True)
def _pp_async_kernel_directed(
    grid: np.ndarray,
    prey_death_arr: np.ndarray,
    p_birth_val: float,
    p_death_val: float,
    pred_birth_val: float,
    pred_death_val: float,
    dr_arr: np.ndarray,
    dc_arr: np.ndarray,
    evolve_sd: float,
    evolve_min: float,
    evolve_max: float,
    evolution_stopped: bool,
    occupied_buffer: np.ndarray,
) -> np.ndarray:
    """
    Asynchronous predator-prey update kernel with directed behavior.

    This kernel implements "intelligent" species behavior: prey actively search
    for empty spaces to reproduce, and predators actively search for nearby
    prey to hunt. A two-pass approach is used to stochastically select a
    valid target from the neighborhood without heap allocation.

    Parameters
    ----------
    grid : np.ndarray
        2D integer array representing the simulation grid (0: Empty, 1: Prey, 2: Predator).
    prey_death_arr : np.ndarray
        2D float array storing individual prey mortality rates for evolution.
    p_birth_val : float
        Probability of a prey reproduction attempt.
    p_death_val : float
        Base probability of prey mortality.
    pred_birth_val : float
        Probability of a predator reproduction attempt (hunting success).
    pred_death_val : float
        Probability of predator mortality.
    dr_arr : np.ndarray
        Row offsets defining the spatial neighborhood (e.g., Moore or von Neumann).
    dc_arr : np.ndarray
        Column offsets defining the spatial neighborhood.
    evolve_sd : float
        Standard deviation for mutations in prey death rates.
    evolve_min : float
        Minimum allowable value for evolved prey death rates.
    evolve_max : float
        Maximum allowable value for evolved prey death rates.
    evolution_stopped : bool
        If True, prevents mutation during prey reproduction.
    occupied_buffer : np.ndarray
        Pre-allocated array for storing and shuffling active cell coordinates.

    Returns
    -------
    grid : np.ndarray
        The updated simulation grid.

    Notes
    -----
    The directed behavior significantly changes the system dynamics compared to
    random neighbor selection, often leading to different critical thresholds
    and spatial patterning. Periodic boundary conditions are applied.
    """
    rows, cols = grid.shape
    n_shifts = len(dr_arr)

    # Collect occupied cells
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != 0:
                occupied_buffer[count, 0] = r
                occupied_buffer[count, 1] = c
                count += 1

    # Fisher-Yates shuffle
    for i in range(count - 1, 0, -1):
        j = np.random.randint(0, i + 1)
        occupied_buffer[i, 0], occupied_buffer[j, 0] = (
            occupied_buffer[j, 0],
            occupied_buffer[i, 0],
        )
        occupied_buffer[i, 1], occupied_buffer[j, 1] = (
            occupied_buffer[j, 1],
            occupied_buffer[i, 1],
        )

    # Process each occupied cell
    for i in range(count):
        r = occupied_buffer[i, 0]
        c = occupied_buffer[i, 1]

        state = grid[r, c]
        if state == 0:
            continue

        if state == 1:  # PREY - directed reproduction into empty cells
            # Check for death first
            if np.random.random() < prey_death_arr[r, c]:
                grid[r, c] = 0
                prey_death_arr[r, c] = np.nan
                continue

            # Attempt reproduction with directed selection
            if np.random.random() < p_birth_val:
                # Pass 1: Count empty neighbors
                empty_count = 0
                for k in range(n_shifts):
                    check_r = (r + dr_arr[k]) % rows
                    check_c = (c + dc_arr[k]) % cols
                    if grid[check_r, check_c] == 0:
                        empty_count += 1

                # Pass 2: Select random empty neighbor
                if empty_count > 0:
                    target_idx = np.random.randint(0, empty_count)
                    found = 0
                    nr, nc = r, c  # Initialize (will be overwritten)
                    for k in range(n_shifts):
                        check_r = (r + dr_arr[k]) % rows
                        check_c = (c + dc_arr[k]) % cols
                        if grid[check_r, check_c] == 0:
                            if found == target_idx:
                                nr, nc = check_r, check_c
                                break
                            found += 1

                    # Reproduce into selected empty cell
                    grid[nr, nc] = 1
                    parent_val = prey_death_arr[r, c]
                    if not evolution_stopped:
                        child_val = parent_val + np.random.normal(0, evolve_sd)
                        if child_val < evolve_min:
                            child_val = evolve_min
                        if child_val > evolve_max:
                            child_val = evolve_max
                        prey_death_arr[nr, nc] = child_val
                    else:
                        prey_death_arr[nr, nc] = parent_val

        elif state == 2:  # PREDATOR - directed hunting
            # Check for death first
            if np.random.random() < pred_death_val:
                grid[r, c] = 0
                continue

            # Attempt hunting with directed selection
            if np.random.random() < pred_birth_val:
                # Pass 1: Count prey neighbors
                prey_count = 0
                for k in range(n_shifts):
                    check_r = (r + dr_arr[k]) % rows
                    check_c = (c + dc_arr[k]) % cols
                    if grid[check_r, check_c] == 1:
                        prey_count += 1

                # Pass 2: Select random prey neighbor
                if prey_count > 0:
                    target_idx = np.random.randint(0, prey_count)
                    found = 0
                    nr, nc = r, c  # Initialize (will be overwritten)
                    for k in range(n_shifts):
                        check_r = (r + dr_arr[k]) % rows
                        check_c = (c + dc_arr[k]) % cols
                        if grid[check_r, check_c] == 1:
                            if found == target_idx:
                                nr, nc = check_r, check_c
                                break
                            found += 1

                    # Hunt: prey cell becomes predator
                    grid[nr, nc] = 2
                    prey_death_arr[nr, nc] = np.nan

    return grid


class PPKernel:
    """
    Wrapper for predator-prey kernel with pre-allocated buffers.

    This class manages the spatial configuration and memory buffers required
    for the Numba-accelerated update kernels. By pre-allocating the
    `occupied_buffer`, it avoids expensive memory allocations during the
    simulation loop.

    Parameters
    ----------
    rows : int
        Number of rows in the simulation grid.
    cols : int
        Number of columns in the simulation grid.
    neighborhood : {'moore', 'von_neumann'}, optional
        The neighborhood type determining adjacent cells. 'moore' includes
        diagonals (8 neighbors), 'von_neumann' does not (4 neighbors).
        Default is 'moore'.
    directed_hunting : bool, optional
        If True, uses the directed behavior kernel where species search for
        targets. If False, uses random neighbor selection. Default is False.

    Attributes
    ----------
    rows : int
        Grid row count.
    cols : int
        Grid column count.
    directed_hunting : bool
        Toggle for intelligent behavior logic.
    """

    def __init__(
        self,
        rows: int,
        cols: int,
        neighborhood: str = "moore",
        directed_hunting: bool = False,
    ):
        self.rows = rows
        self.cols = cols
        self.directed_hunting = directed_hunting
        self._occupied_buffer = np.empty((rows * cols, 2), dtype=np.int32)

        if neighborhood == "moore":
            self._dr = np.array([-1, -1, -1, 0, 0, 1, 1, 1], dtype=np.int32)
            self._dc = np.array([-1, 0, 1, -1, 1, -1, 0, 1], dtype=np.int32)
        else:  # von Neumann
            self._dr = np.array([-1, 1, 0, 0], dtype=np.int32)
            self._dc = np.array([0, 0, -1, 1], dtype=np.int32)

    def update(
        self,
        grid: np.ndarray,
        prey_death_arr: np.ndarray,
        prey_birth: float,
        prey_death: float,
        pred_birth: float,
        pred_death: float,
        evolve_sd: float = 0.1,
        evolve_min: float = 0.001,
        evolve_max: float = 0.1,
        evolution_stopped: bool = True,
    ) -> np.ndarray:
        """
        Execute a single asynchronous update step using the configured kernel.

        Parameters
        ----------
        grid : np.ndarray
            The current 2D simulation grid.
        prey_death_arr : np.ndarray
            2D array of individual prey mortality rates.
        prey_birth : float
            Prey reproduction probability.
        prey_death : float
            Base prey mortality probability.
        pred_birth : float
            Predator reproduction (hunting success) probability.
        pred_death : float
            Predator mortality probability.
        evolve_sd : float, optional
            Mutation standard deviation (default 0.1).
        evolve_min : float, optional
            Minimum evolved death rate (default 0.001).
        evolve_max : float, optional
            Maximum evolved death rate (default 0.1).
        evolution_stopped : bool, optional
            Whether to disable mutation during this step (default True).

        Returns
        -------
        np.ndarray
            The updated grid after one full asynchronous pass.
        """
        if self.directed_hunting:
            return _pp_async_kernel_directed(
                grid,
                prey_death_arr,
                prey_birth,
                prey_death,
                pred_birth,
                pred_death,
                self._dr,
                self._dc,
                evolve_sd,
                evolve_min,
                evolve_max,
                evolution_stopped,
                self._occupied_buffer,
            )
        else:
            return _pp_async_kernel_random(
                grid,
                prey_death_arr,
                prey_birth,
                prey_death,
                pred_birth,
                pred_death,
                self._dr,
                self._dc,
                evolve_sd,
                evolve_min,
                evolve_max,
                evolution_stopped,
                self._occupied_buffer,
            )


# ============================================================================
# CLUSTER DETECTION (ENHANCED)
# ============================================================================


@njit(cache=True)
def _flood_fill(
    grid: np.ndarray,
    visited: np.ndarray,
    start_r: int,
    start_c: int,
    target: int,
    rows: int,
    cols: int,
    moore: bool,
) -> int:
    """
    Perform a stack-based flood fill to measure the size of a connected cluster.

    This Numba-accelerated function identifies all contiguous cells of a
    specific target value starting from a given coordinate. It supports
    both Moore and von Neumann neighborhoods and implements periodic
    boundary conditions (toroidal topology).

    Parameters
    ----------
    grid : np.ndarray
        2D integer array representing the simulation environment.
    visited : np.ndarray
        2D boolean array tracked across calls to avoid re-processing cells.
    start_r : int
        Starting row index for the flood fill.
    start_c : int
        Starting column index for the flood fill.
    target : int
        The cell value (e.g., 1 for Prey, 2 for Predator) to include in the cluster.
    rows : int
        Total number of rows in the grid.
    cols : int
        Total number of columns in the grid.
    moore : bool
        If True, use a Moore neighborhood (8 neighbors). If False, use a
        von Neumann neighborhood (4 neighbors).

    Returns
    -------
    size : int
        The total number of connected cells belonging to the cluster.

    Notes
    -----
    The function uses a manual stack implementation to avoid recursion limit
    issues and is optimized for use within JIT-compiled loops.
    """
    max_stack = rows * cols
    stack_r = np.empty(max_stack, dtype=np.int32)
    stack_c = np.empty(max_stack, dtype=np.int32)
    stack_ptr = 0

    stack_r[stack_ptr] = start_r
    stack_c[stack_ptr] = start_c
    stack_ptr += 1
    visited[start_r, start_c] = True

    size = 0

    if moore:
        dr = np.array([-1, -1, -1, 0, 0, 1, 1, 1], dtype=np.int32)
        dc = np.array([-1, 0, 1, -1, 1, -1, 0, 1], dtype=np.int32)
        n_neighbors = 8
    else:
        dr = np.array([-1, 1, 0, 0], dtype=np.int32)
        dc = np.array([0, 0, -1, 1], dtype=np.int32)
        n_neighbors = 4

    while stack_ptr > 0:
        stack_ptr -= 1
        r = stack_r[stack_ptr]
        c = stack_c[stack_ptr]
        size += 1

        for k in range(n_neighbors):
            nr = (r + dr[k]) % rows
            nc = (c + dc[k]) % cols

            if not visited[nr, nc] and grid[nr, nc] == target:
                visited[nr, nc] = True
                stack_r[stack_ptr] = nr
                stack_c[stack_ptr] = nc
                stack_ptr += 1

    return size


@njit(cache=True)
def _measure_clusters(grid: np.ndarray, species: int, moore: bool = True) -> np.ndarray:
    """
    Identify and measure the sizes of all connected clusters for a specific species.

    This function scans the entire grid and initiates a flood-fill algorithm
    whenever an unvisited cell of the target species is encountered. It
    returns an array containing the size (cell count) of each identified cluster.

    Parameters
    ----------
    grid : np.ndarray
        2D integer array representing the simulation environment.
    species : int
        The target species identifier (e.g., 1 for Prey, 2 for Predator).
    moore : bool, optional
        Determines the connectivity logic. If True, uses the Moore neighborhood
        (8 neighbors); if False, uses the von Neumann neighborhood (4 neighbors).
        Default is True.

    Returns
    -------
    cluster_sizes : np.ndarray
        A 1D array of integers where each element represents the size of
        one connected cluster.

    Notes
    -----
    This function is Numba-optimized and utilizes an internal `visited` mask
    to ensure each cell is processed only once, maintaining $O(N)$
    complexity relative to the number of cells.
    """
    rows, cols = grid.shape
    visited = np.zeros((rows, cols), dtype=np.bool_)

    max_clusters = rows * cols
    sizes = np.empty(max_clusters, dtype=np.int32)
    n_clusters = 0

    for r in range(rows):
        for c in range(cols):
            if grid[r, c] == species and not visited[r, c]:
                size = _flood_fill(grid, visited, r, c, species, rows, cols, moore)
                sizes[n_clusters] = size
                n_clusters += 1

    return sizes[:n_clusters]


@njit(cache=True)
def _detect_clusters_numba(
    grid: np.ndarray,
    species: int,
    moore: bool,
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Full cluster detection returning labels and sizes (Numba-accelerated).
    Returns:
        labels: 2D int32 array where each cell contains its cluster ID (0 = non-target)
        sizes: 1D int32 array of cluster sizes (index i = size of cluster i+1)
    """
    rows, cols = grid.shape
    labels = np.zeros((rows, cols), dtype=np.int32)

    if moore:
        dr = np.array([-1, -1, -1, 0, 0, 1, 1, 1], dtype=np.int32)
        dc = np.array([-1, 0, 1, -1, 1, -1, 0, 1], dtype=np.int32)
        n_neighbors = 8
    else:
        dr = np.array([-1, 1, 0, 0], dtype=np.int32)
        dc = np.array([0, 0, -1, 1], dtype=np.int32)
        n_neighbors = 4

    max_clusters = rows * cols
    sizes = np.empty(max_clusters, dtype=np.int32)
    n_clusters = 0
    current_label = 1

    max_stack = rows * cols
    stack_r = np.empty(max_stack, dtype=np.int32)
    stack_c = np.empty(max_stack, dtype=np.int32)

    for start_r in range(rows):
        for start_c in range(cols):
            if grid[start_r, start_c] != species or labels[start_r, start_c] != 0:
                continue

            stack_ptr = 0
            stack_r[stack_ptr] = start_r
            stack_c[stack_ptr] = start_c
            stack_ptr += 1
            labels[start_r, start_c] = current_label
            size = 0

            while stack_ptr > 0:
                stack_ptr -= 1
                r = stack_r[stack_ptr]
                c = stack_c[stack_ptr]
                size += 1

                for k in range(n_neighbors):
                    nr = (r + dr[k]) % rows
                    nc = (c + dc[k]) % cols

                    if grid[nr, nc] == species and labels[nr, nc] == 0:
                        labels[nr, nc] = current_label
                        stack_r[stack_ptr] = nr
                        stack_c[stack_ptr] = nc
                        stack_ptr += 1

            sizes[n_clusters] = size
            n_clusters += 1
            current_label += 1

    return labels, sizes[:n_clusters]


# ============================================================================
# PUBLIC API - CLUSTER DETECTION
# ============================================================================


def measure_cluster_sizes_fast(
    grid: np.ndarray,
    species: int,
    neighborhood: str = "moore",
) -> np.ndarray:
    """
    Measure cluster sizes for a specific species using Numba-accelerated flood fill.

    This function provides a high-performance interface for calculating cluster
    size statistics without the overhead of generating a full label map. It is
    optimized for large-scale simulation analysis where only distribution
    metrics (e.g., mean size, max size) are required.

    Parameters
    ----------
    grid : np.ndarray
        A 2D array representing the simulation environment.
    species : int
        The target species identifier (e.g., 1 for Prey, 2 for Predator).
    neighborhood : {'moore', 'neumann'}, optional
        The connectivity rule. 'moore' uses 8-way connectivity (including diagonals);
        'neumann' uses 4-way connectivity. Default is 'moore'.

    Returns
    -------
    cluster_sizes : np.ndarray
        A 1D array of integers, where each element is the cell count of an
        identified cluster.

    Notes
    -----
    The input grid is cast to `int32` to ensure compatibility with the
    underlying JIT-compiled `_measure_clusters` kernel.

    Examples
    --------
    >>> sizes = measure_cluster_sizes_fast(grid, species=1, neighborhood='moore')
    >>> if sizes.size > 0:
    ...     print(f"Largest cluster: {sizes.max()}")
    """
    grid_int = np.asarray(grid, dtype=np.int32)
    moore = neighborhood == "moore"
    return _measure_clusters(grid_int, np.int32(species), moore)


def detect_clusters_fast(
    grid: np.ndarray,
    species: int,
    neighborhood: str = "moore",
) -> Tuple[np.ndarray, Dict[int, int]]:
    """
    Perform full cluster detection with labels using Numba acceleration.

    This function returns a label array for spatial analysis and a dictionary
    of cluster sizes. It is significantly faster than standard Python or
    SciPy equivalents for large simulation grids.

    Parameters
    ----------
    grid : np.ndarray
        A 2D array representing the simulation environment.
    species : int
        The target species identifier (e.g., 1 for Prey, 2 for Predator).
    neighborhood : {'moore', 'neumann'}, optional
        The connectivity rule. 'moore' uses 8-way connectivity; 'neumann'
        uses 4-way connectivity. Default is 'moore'.

    Returns
    -------
    labels : np.ndarray
        A 2D int32 array where each cell contains its unique cluster ID.
        Cells not belonging to the target species are 0.
    sizes : dict
        A dictionary mapping cluster IDs to their respective cell counts.

    Notes
    -----
    The underlying Numba kernel uses a stack-based flood fill to avoid
    recursion limits and handles periodic boundary conditions.

    Examples
    --------
    >>> labels, sizes = detect_clusters_fast(grid, species=1)
    >>> if sizes:
    ...     largest_id = max(sizes, key=sizes.get)
    ...     print(f"Cluster {largest_id} size: {sizes[largest_id]}")
    """
    grid_int = np.asarray(grid, dtype=np.int32)
    moore = neighborhood == "moore"
    labels, sizes_arr = _detect_clusters_numba(grid_int, np.int32(species), moore)
    sizes_dict = {i + 1: int(sizes_arr[i]) for i in range(len(sizes_arr))}
    return labels, sizes_dict


def get_cluster_stats_fast(
    grid: np.ndarray,
    species: int,
    neighborhood: str = "moore",
) -> Dict:
    """
    Compute comprehensive cluster statistics for a species using Numba acceleration.

    This function integrates cluster detection and labeling to provide a
    full suite of spatial metrics. It calculates the cluster size distribution
    and the largest cluster fraction, which often serves as an order
    parameter in percolation theory and Phase 1-3 analyses.

    Parameters
    ----------
    grid : np.ndarray
        A 2D array representing the simulation environment.
    species : int
        The target species identifier (e.g., 1 for Prey, 2 for Predator).
    neighborhood : {'moore', 'neumann'}, optional
        The connectivity rule. 'moore' uses 8-way connectivity; 'neumann'
        uses 4-way connectivity. Default is 'moore'.

    Returns
    -------
    stats : dict
        A dictionary containing:
        - 'n_clusters': Total count of isolated clusters.
        - 'sizes': Sorted array (descending) of all cluster sizes.
        - 'largest': Size of the single largest cluster.
        - 'largest_fraction': Size of the largest cluster divided by
          the total population of the species.
        - 'mean_size': Average size of all clusters.
        - 'size_distribution': Frequency mapping of {size: count}.
        - 'labels': 2D array of unique cluster IDs.
        - 'size_dict': Mapping of {label_id: size}.

    Examples
    --------
    >>> stats = get_cluster_stats_fast(grid, species=1)
    >>> print(f"Found {stats['n_clusters']} prey clusters.")
    >>> print(f"Order parameter: {stats['largest_fraction']:.3f}")
    """
    labels, size_dict = detect_clusters_fast(grid, species, neighborhood)

    if len(size_dict) == 0:
        return {
            "n_clusters": 0,
            "sizes": np.array([], dtype=np.int32),
            "largest": 0,
            "largest_fraction": 0.0,
            "mean_size": 0.0,
            "size_distribution": {},
            "labels": labels,
            "size_dict": size_dict,
        }

    sizes = np.array(list(size_dict.values()), dtype=np.int32)
    sizes_sorted = np.sort(sizes)[::-1]
    total_pop = int(np.sum(sizes))
    largest = int(sizes_sorted[0])

    size_dist = {}
    for s in sizes:
        s_int = int(s)
        size_dist[s_int] = size_dist.get(s_int, 0) + 1

    return {
        "n_clusters": len(size_dict),
        "sizes": sizes_sorted,
        "largest": largest,
        "largest_fraction": float(largest) / total_pop if total_pop > 0 else 0.0,
        "mean_size": float(np.mean(sizes)),
        "size_distribution": size_dist,
        "labels": labels,
        "size_dict": size_dict,
    }


# ============================================================================
# PCF COMPUTATION (Cell-list accelerated)
# ============================================================================


@njit(cache=True)
def _build_cell_list(
    positions: np.ndarray,
    n_cells: int,
    L_row: float,
    L_col: float,
) -> Tuple[np.ndarray, np.ndarray, np.ndarray, float, float]:
    """
    Build a cell list for spatial hashing to accelerate neighbor lookups.

    This Numba-optimized function partitions a set of coordinates into a
    grid of cells. It uses a three-pass approach to calculate cell occupancy,
    compute starting offsets for each cell in a flat index array, and finally
    populate that array with position indices.

    Parameters
    ----------
    positions : np.ndarray
        An (N, 2) float array of coordinates within the simulation domain.
    n_cells : int
        The number of cells along one dimension of the square grid.
    L_row : float
        The total height (row extent) of the simulation domain.
    L_col : float
        The total width (column extent) of the simulation domain.

    Returns
    -------
    indices : np.ndarray
        A 1D array of original position indices, reordered so that indices
        belonging to the same cell are contiguous.
    offsets : np.ndarray
        A 2D array where `offsets[r, c]` is the starting index in the
        `indices` array for cell (r, c).
    cell_counts : np.ndarray
        A 2D array where `cell_counts[r, c]` is the number of points
        contained in cell (r, c).
    cell_size_r : float
        The calculated height of an individual cell.
    cell_size_c : float
        The calculated width of an individual cell.

    Notes
    -----
    This implementation assumes periodic boundary conditions via the
    modulo operator during coordinate-to-cell mapping. It is designed to
    eliminate heap allocations within the main simulation loop by using
    Numba's efficient array handling.
    """
    n_pos = len(positions)
    cell_size_r = L_row / n_cells
    cell_size_c = L_col / n_cells

    cell_counts = np.zeros((n_cells, n_cells), dtype=np.int32)
    for i in range(n_pos):
        cr = int(positions[i, 0] / cell_size_r) % n_cells
        cc = int(positions[i, 1] / cell_size_c) % n_cells
        cell_counts[cr, cc] += 1

    offsets = np.zeros((n_cells, n_cells), dtype=np.int32)
    running = 0
    for cr in range(n_cells):
        for cc in range(n_cells):
            offsets[cr, cc] = running
            running += cell_counts[cr, cc]

    indices = np.empty(n_pos, dtype=np.int32)
    fill_counts = np.zeros((n_cells, n_cells), dtype=np.int32)
    for i in range(n_pos):
        cr = int(positions[i, 0] / cell_size_r) % n_cells
        cc = int(positions[i, 1] / cell_size_c) % n_cells
        idx = offsets[cr, cc] + fill_counts[cr, cc]
        indices[idx] = i
        fill_counts[cr, cc] += 1

    return indices, offsets, cell_counts, cell_size_r, cell_size_c


@njit(cache=True)
def _periodic_dist_sq(
    r1: float,
    c1: float,
    r2: float,
    c2: float,
    L_row: float,
    L_col: float,
) -> float:
    """
    Calculate the squared Euclidean distance between two points with periodic boundary conditions.

    This Numba-optimized function accounts for toroidal topology by finding the
    shortest path between coordinates across the grid edges. Using the squared
    distance avoids the computational expense of a square root operation,
    making it ideal for high-frequency spatial queries.

    Parameters
    ----------
    r1 : float
        Row coordinate of the first point.
    c1 : float
        Column coordinate of the first point.
    r2 : float
        Row coordinate of the second point.
    c2 : float
        Column coordinate of the second point.
    L_row : float
        Total height (row extent) of the periodic domain.
    L_col : float
        Total width (column extent) of the periodic domain.

    Returns
    -------
    dist_sq : float
        The squared shortest distance between the two points.

    Notes
    -----
    The function applies the minimum image convention, ensuring that the
    distance never exceeds half the domain length in any dimension.
    """
    dr = abs(r1 - r2)
    dc = abs(c1 - c2)
    if dr > L_row * 0.5:
        dr = L_row - dr
    if dc > L_col * 0.5:
        dc = L_col - dc
    return dr * dr + dc * dc


@njit(parallel=True, cache=True)
def _pcf_cell_list(
    pos_i: np.ndarray,
    pos_j: np.ndarray,
    indices_j: np.ndarray,
    offsets_j: np.ndarray,
    counts_j: np.ndarray,
    cell_size_r: float,
    cell_size_c: float,
    L_row: float,
    L_col: float,
    max_distance: float,
    n_bins: int,
    self_correlation: bool,
    n_cells: int,
) -> np.ndarray:
    """
    Compute a Pair Correlation Function (PCF) histogram using spatial cell lists.

    This Numba-accelerated parallel kernel calculates distances between two sets
    of points (pos_i and pos_j). It uses a cell list (spatial hashing) to
    restrict distance calculations to neighboring cells within the maximum
    specified distance, significantly improving performance from $O(N^2)$
    to $O(N)$.

    Parameters
    ----------
    pos_i : np.ndarray
        (N, 2) float array of coordinates for the primary species.
    pos_j : np.ndarray
        (M, 2) float array of coordinates for the secondary species.
    indices_j : np.ndarray
        Flattened indices of pos_j sorted by cell, produced by `_build_cell_list`.
    offsets_j : np.ndarray
        2D array of starting offsets for each cell in `indices_j`.
    counts_j : np.ndarray
        2D array of particle counts within each cell for species J.
    cell_size_r : float
        Height of a single spatial cell.
    cell_size_c : float
        Width of a single spatial cell.
    L_row : float
        Total height of the periodic domain.
    L_col : float
        Total width of the periodic domain.
    max_distance : float
        Maximum radial distance (r) to consider for the correlation.
    n_bins : int
        Number of bins in the distance histogram.
    self_correlation : bool
        If True, assumes species I and J are the same and avoids double-counting
        or self-interaction.
    n_cells : int
        Number of cells per dimension in the spatial hash grid.

    Returns
    -------
    hist : np.ndarray
        A 1D array of length `n_bins` containing the counts of pairs found
        at each radial distance.

    Notes
    -----
    The kernel uses `prange` for parallel execution across points in `pos_i`.
    Local histograms are used per thread to prevent race conditions during
    reduction. Periodic boundary conditions are handled via `_periodic_dist_sq`.
    """
    n_i = len(pos_i)
    bin_width = max_distance / n_bins
    max_dist_sq = max_distance * max_distance
    cells_to_check = int(np.ceil(max_distance / min(cell_size_r, cell_size_c))) + 1

    # Clamp the scan window to the grid size so that, on coarse cell grids
    # (max_distance larger than a quarter of the domain), wrapped neighbor
    # cells are never visited twice, which would double-count pairs.
    n_scan = 2 * cells_to_check + 1
    if n_scan > n_cells:
        n_scan = n_cells

    hist = np.zeros(n_bins, dtype=np.int64)

    for i in prange(n_i):
        local_hist = np.zeros(n_bins, dtype=np.int64)
        r1, c1 = pos_i[i, 0], pos_i[i, 1]

        cell_r = int(r1 / cell_size_r) % n_cells
        cell_c = int(c1 / cell_size_c) % n_cells

        for sr in range(n_scan):
            for sc in range(n_scan):
                ncr = (cell_r - cells_to_check + sr) % n_cells
                ncc = (cell_c - cells_to_check + sc) % n_cells

                start = offsets_j[ncr, ncc]
                end = start + counts_j[ncr, ncc]

                for idx in range(start, end):
                    j = indices_j[idx]

                    if self_correlation and j <= i:
                        continue

                    r2, c2 = pos_j[j, 0], pos_j[j, 1]
                    d_sq = _periodic_dist_sq(r1, c1, r2, c2, L_row, L_col)

                    if 0 < d_sq < max_dist_sq:
                        d = np.sqrt(d_sq)
                        bin_idx = int(d / bin_width)
                        if bin_idx >= n_bins:
                            bin_idx = n_bins - 1
                        local_hist[bin_idx] += 1

1182 for b in range(n_bins): 1183 hist[b] += local_hist[b] 1184 1185 if self_correlation: 1186 for b in range(n_bins): 1187 hist[b] *= 2 1188 1189 return hist 1190 1191 1192def compute_pcf_periodic_fast( 1193 positions_i: np.ndarray, 1194 positions_j: np.ndarray, 1195 grid_shape: Tuple[int, int], 1196 max_distance: float, 1197 n_bins: int = 50, 1198 self_correlation: bool = False, 1199) -> Tuple[np.ndarray, np.ndarray, int]: 1200 """ 1201 Compute the Pair Correlation Function (PCF) using cell-list acceleration. 1202 1203 This high-level function coordinates the spatial hashing and histogram 1204 calculation to determine the $g(r)$ function. It normalizes the resulting 1205 histogram by the expected number of pairs in an ideal gas of the same 1206 density, accounting for the toroidal area of each radial bin. 1207 1208 Parameters 1209 ---------- 1210 positions_i : np.ndarray 1211 (N, 2) array of coordinates for species I. 1212 positions_j : np.ndarray 1213 (M, 2) array of coordinates for species J. 1214 grid_shape : tuple of int 1215 The (rows, cols) dimensions of the simulation grid. 1216 max_distance : float 1217 The maximum radius to calculate correlations for. 1218 n_bins : int, optional 1219 Number of bins for the radial distribution (default 50). 1220 self_correlation : bool, optional 1221 Set to True if computing the correlation of a species with itself 1222 to avoid self-counting (default False). 1223 1224 Returns 1225 ------- 1226 bin_centers : np.ndarray 1227 The central radial distance for each histogram bin. 1228 pcf : np.ndarray 1229 The normalized $g(r)$ values. A value of 1.0 indicates no spatial 1230 correlation; > 1.0 indicates clustering; < 1.0 indicates repulsion. 1231 total_pairs : int 1232 The total count of pairs found within the `max_distance`. 
1233 1234 Notes 1235 ----- 1236 The function dynamically determines the optimal number of cells for the 1237 spatial hash based on the `max_distance` and grid dimensions to maintain 1238 linear time complexity. 1239 """ 1240 rows, cols = grid_shape 1241 L_row, L_col = float(rows), float(cols) 1242 area = L_row * L_col 1243 1244 bin_width = max_distance / n_bins 1245 bin_centers = np.linspace(bin_width / 2, max_distance - bin_width / 2, n_bins) 1246 1247 if len(positions_i) == 0 or len(positions_j) == 0: 1248 return bin_centers, np.ones(n_bins), 0 1249 1250 n_cells = max(4, int(min(rows, cols) / max_distance)) 1251 1252 pos_i = np.ascontiguousarray(positions_i, dtype=np.float64) 1253 pos_j = np.ascontiguousarray(positions_j, dtype=np.float64) 1254 1255 indices_j, offsets_j, counts_j, cell_size_r, cell_size_c = _build_cell_list( 1256 pos_j, n_cells, L_row, L_col 1257 ) 1258 1259 hist = _pcf_cell_list( 1260 pos_i, 1261 pos_j, 1262 indices_j, 1263 offsets_j, 1264 counts_j, 1265 cell_size_r, 1266 cell_size_c, 1267 L_row, 1268 L_col, 1269 max_distance, 1270 n_bins, 1271 self_correlation, 1272 n_cells, 1273 ) 1274 1275 n_i, n_j = len(positions_i), len(positions_j) 1276 if self_correlation: 1277 density_product = n_i * (n_i - 1) / (area * area) 1278 else: 1279 density_product = n_i * n_j / (area * area) 1280 1281 expected = np.zeros(n_bins) 1282 for i in range(n_bins): 1283 r = bin_centers[i] 1284 annulus_area = 2 * np.pi * r * bin_width 1285 expected[i] = density_product * annulus_area * area 1286 1287 pcf = np.ones(n_bins) 1288 mask = expected > 1.0 1289 pcf[mask] = hist[mask] / expected[mask] 1290 1291 return bin_centers, pcf, int(np.sum(hist)) 1292 1293 1294def compute_all_pcfs_fast( 1295 grid: np.ndarray, 1296 max_distance: Optional[float] = None, 1297 n_bins: int = 50, 1298) -> Dict[str, Tuple[np.ndarray, np.ndarray, int]]: 1299 """ 1300 Compute all three species Pair Correlation Functions (PCFs) using cell-list acceleration. 
1301 1302 This function calculates the spatial auto-correlations (Prey-Prey, 1303 Predator-Predator) and the cross-correlation (Prey-Predator) for a given 1304 simulation grid. It identifies particle positions and leverages 1305 Numba-accelerated cell lists to handle the computations efficiently. 1306 1307 Parameters 1308 ---------- 1309 grid : np.ndarray 1310 2D integer array where 1 represents prey and 2 represents predators. 1311 max_distance : float, optional 1312 The maximum radial distance for the correlation. Defaults to 1/4 1313 of the minimum grid dimension if not provided. 1314 n_bins : int, optional 1315 Number of distance bins for the histogram. Default is 50. 1316 1317 Returns 1318 ------- 1319 results : dict 1320 A dictionary with keys 'prey_prey', 'pred_pred', and 'prey_pred'. 1321 Each value is a tuple containing: 1322 - bin_centers (np.ndarray): Radial distances. 1323 - pcf_values (np.ndarray): Normalized g(r) values. 1324 - pair_count (int): Total number of pairs found. 1325 1326 Notes 1327 ----- 1328 The PCF provides insight into the spatial organization of the system. 1329 g(r) > 1 at short distances indicates aggregation (clustering), 1330 while g(r) < 1 indicates exclusion or repulsion. 
1331 """ 1332 rows, cols = grid.shape 1333 if max_distance is None: 1334 max_distance = min(rows, cols) / 4.0 1335 1336 prey_pos = np.argwhere(grid == 1) 1337 pred_pos = np.argwhere(grid == 2) 1338 1339 results = {} 1340 1341 dist, pcf, n = compute_pcf_periodic_fast( 1342 prey_pos, 1343 prey_pos, 1344 (rows, cols), 1345 max_distance, 1346 n_bins, 1347 self_correlation=True, 1348 ) 1349 results["prey_prey"] = (dist, pcf, n) 1350 1351 dist, pcf, n = compute_pcf_periodic_fast( 1352 pred_pos, 1353 pred_pos, 1354 (rows, cols), 1355 max_distance, 1356 n_bins, 1357 self_correlation=True, 1358 ) 1359 results["pred_pred"] = (dist, pcf, n) 1360 1361 dist, pcf, n = compute_pcf_periodic_fast( 1362 prey_pos, 1363 pred_pos, 1364 (rows, cols), 1365 max_distance, 1366 n_bins, 1367 self_correlation=False, 1368 ) 1369 results["prey_pred"] = (dist, pcf, n) 1370 1371 return results 1372 1373 1374# ============================================================================ 1375# WARMUP & BENCHMARKS 1376# ============================================================================ 1377 1378 1379def warmup_numba_kernels(grid_size: int = 100, directed_hunting: bool = False): 1380 """ 1381 Pre-compile all Numba-accelerated kernels to avoid first-run latency. 1382 1383 This function executes a single step of the simulation and each analysis 1384 routine on a dummy grid. Because Numba uses Just-In-Time (JIT) compilation, 1385 the first call to a decorated function incurs a compilation overhead. 1386 Running this warmup ensures that subsequent experimental runs are timed 1387 accurately and perform at full speed. 1388 1389 Parameters 1390 ---------- 1391 grid_size : int, optional 1392 The side length of the dummy grid used for warmup (default 100). 1393 directed_hunting : bool, optional 1394 If True, also warms up the directed behavior update kernel (default False). 
1395 1396 Returns 1397 ------- 1398 None 1399 1400 Notes 1401 ----- 1402 This function checks for `NUMBA_AVAILABLE` before execution. It warms up 1403 the `PPKernel` (random and optionally directed), as well as the 1404 spatial analysis functions (`compute_all_pcfs_fast`, `detect_clusters_fast`, etc.). 1405 """ 1406 if not NUMBA_AVAILABLE: 1407 return 1408 1409 set_numba_seed(0) 1410 1411 grid = np.zeros((grid_size, grid_size), dtype=np.int32) 1412 grid[::3, ::3] = 1 1413 grid[::5, ::5] = 2 1414 1415 prey_death_arr = np.full((grid_size, grid_size), 0.05, dtype=np.float64) 1416 prey_death_arr[grid != 1] = np.nan 1417 1418 # Always warmup random kernel 1419 kernel_random = PPKernel(grid_size, grid_size, directed_hunting=False) 1420 kernel_random.update(grid.copy(), prey_death_arr.copy(), 0.2, 0.05, 0.2, 0.1) 1421 1422 # Warmup directed kernel if requested 1423 if directed_hunting: 1424 kernel_directed = PPKernel(grid_size, grid_size, directed_hunting=True) 1425 kernel_directed.update(grid.copy(), prey_death_arr.copy(), 0.2, 0.05, 0.2, 0.1) 1426 1427 # Warmup analysis functions 1428 _ = compute_all_pcfs_fast(grid, max_distance=20.0, n_bins=20) 1429 _ = measure_cluster_sizes_fast(grid, 1) 1430 _ = detect_clusters_fast(grid, 1) 1431 _ = get_cluster_stats_fast(grid, 1) 1432 1433 1434def benchmark_kernels(grid_size: int = 100, n_runs: int = 20): 1435 """ 1436 Benchmark the execution performance of random vs. directed update kernels. 1437 1438 This utility measures the average time per simulation step for both the 1439 stochastic (random neighbor) and heuristic (directed hunting/reproduction) 1440 update strategies. It accounts for the computational overhead introduced 1441 by the "intelligent" search logic used in directed mode. 1442 1443 Parameters 1444 ---------- 1445 grid_size : int, optional 1446 The side length of the square simulation grid (default 100). 1447 n_runs : int, optional 1448 The number of iterations to perform for averaging performance (default 20). 
1449 1450 Returns 1451 ------- 1452 t_random : float 1453 Average time per step for the random kernel in milliseconds. 1454 t_directed : float 1455 Average time per step for the directed kernel in milliseconds. 1456 1457 Notes 1458 ----- 1459 The function ensures a fair comparison by: 1460 1. Using a fixed seed for reproducible initial grid states. 1461 2. Warming up Numba kernels before timing to exclude JIT compilation latency. 1462 3. Copying the grid and death arrays for each iteration to maintain 1463 consistent population densities throughout the benchmark. 1464 """ 1465 import time 1466 1467 print("=" * 60) 1468 print(f"KERNEL BENCHMARK ({grid_size}x{grid_size}, {n_runs} runs)") 1469 print(f"Numba available: {NUMBA_AVAILABLE}") 1470 print("=" * 60) 1471 1472 np.random.seed(42) 1473 grid = np.zeros((grid_size, grid_size), dtype=np.int32) 1474 n_prey = int(grid_size * grid_size * 0.30) 1475 n_pred = int(grid_size * grid_size * 0.15) 1476 positions = np.random.permutation(grid_size * grid_size) 1477 for pos in positions[:n_prey]: 1478 grid[pos // grid_size, pos % grid_size] = 1 1479 for pos in positions[n_prey : n_prey + n_pred]: 1480 grid[pos // grid_size, pos % grid_size] = 2 1481 1482 prey_death_arr = np.full((grid_size, grid_size), 0.05, dtype=np.float64) 1483 prey_death_arr[grid != 1] = np.nan 1484 1485 print(f"Initial: {np.sum(grid == 1)} prey, {np.sum(grid == 2)} predators") 1486 1487 # Warmup both kernels 1488 warmup_numba_kernels(grid_size, directed_hunting=True) 1489 1490 # Benchmark random kernel 1491 kernel_random = PPKernel(grid_size, grid_size, directed_hunting=False) 1492 t0 = time.perf_counter() 1493 for _ in range(n_runs): 1494 test_grid = grid.copy() 1495 test_arr = prey_death_arr.copy() 1496 kernel_random.update(test_grid, test_arr, 0.2, 0.05, 0.2, 0.1) 1497 t_random = (time.perf_counter() - t0) / n_runs * 1000 1498 1499 # Benchmark directed kernel 1500 kernel_directed = PPKernel(grid_size, grid_size, directed_hunting=True) 1501 t0 = 
time.perf_counter() 1502 for _ in range(n_runs): 1503 test_grid = grid.copy() 1504 test_arr = prey_death_arr.copy() 1505 kernel_directed.update(test_grid, test_arr, 0.2, 0.05, 0.2, 0.1) 1506 t_directed = (time.perf_counter() - t0) / n_runs * 1000 1507 1508 print(f"\nRandom kernel: {t_random:.2f} ms/step") 1509 print(f"Directed kernel: {t_directed:.2f} ms/step") 1510 print( 1511 f"Overhead: {t_directed - t_random:.2f} ms (+{100*(t_directed/t_random - 1):.1f}%)" 1512 ) 1513 1514 return t_random, t_directed 1515 1516 1517def benchmark_cluster_detection(grid_size: int = 100, n_runs: int = 20): 1518 """ 1519 Benchmark the performance of different cluster detection and analysis routines. 1520 1521 This function evaluates three levels of spatial analysis: 1522 1. Size measurement only (fastest, no label map). 1523 2. Full detection (returns label map and size dictionary). 1524 3. Comprehensive statistics (calculates distributions, means, and order parameters). 1525 1526 Parameters 1527 ---------- 1528 grid_size : int, optional 1529 Side length of the square grid for benchmarking (default 100). 1530 n_runs : int, optional 1531 Number of iterations to average for performance results (default 20). 1532 1533 Returns 1534 ------- 1535 stats : dict 1536 The result dictionary from the final comprehensive statistics run. 1537 1538 Notes 1539 ----- 1540 The benchmark uses a fixed prey density of 30% to ensure a representative 1541 distribution of clusters. It pre-warms the Numba kernels to ensure that 1542 the measurements reflect execution speed rather than compilation time. 
1543 """ 1544 import time 1545 1546 print("=" * 60) 1547 print(f"CLUSTER DETECTION BENCHMARK ({grid_size}x{grid_size})") 1548 print(f"Numba available: {NUMBA_AVAILABLE}") 1549 print("=" * 60) 1550 1551 np.random.seed(42) 1552 grid = np.zeros((grid_size, grid_size), dtype=np.int32) 1553 n_prey = int(grid_size * grid_size * 0.30) 1554 positions = np.random.permutation(grid_size * grid_size)[:n_prey] 1555 for pos in positions: 1556 grid[pos // grid_size, pos % grid_size] = 1 1557 1558 print(f"Prey cells: {np.sum(grid == 1)}") 1559 1560 # Warmup 1561 _ = measure_cluster_sizes_fast(grid, 1) 1562 _ = detect_clusters_fast(grid, 1) 1563 _ = get_cluster_stats_fast(grid, 1) 1564 1565 # Benchmark sizes only 1566 t0 = time.perf_counter() 1567 for _ in range(n_runs): 1568 sizes = measure_cluster_sizes_fast(grid, 1) 1569 t_sizes = (time.perf_counter() - t0) / n_runs * 1000 1570 print(f"\nmeasure_cluster_sizes_fast: {t_sizes:.2f} ms ({len(sizes)} clusters)") 1571 1572 # Benchmark full detection 1573 t0 = time.perf_counter() 1574 for _ in range(n_runs): 1575 labels, size_dict = detect_clusters_fast(grid, 1) 1576 t_detect = (time.perf_counter() - t0) / n_runs * 1000 1577 print(f"detect_clusters_fast: {t_detect:.2f} ms ({len(size_dict)} clusters)") 1578 1579 # Benchmark full stats 1580 t0 = time.perf_counter() 1581 for _ in range(n_runs): 1582 stats = get_cluster_stats_fast(grid, 1) 1583 t_stats = (time.perf_counter() - t0) / n_runs * 1000 1584 print(f"get_cluster_stats_fast: {t_stats:.2f} ms") 1585 1586 print( 1587 f"\nOverhead for labels: {t_detect - t_sizes:.2f} ms (+{100*(t_detect/t_sizes - 1):.0f}%)" 1588 ) 1589 1590 return stats 1591 1592 1593if __name__ == "__main__": 1594 print("\n" + "=" * 60) 1595 print("NUMBA-OPTIMIZED PP MODULE - BENCHMARKS") 1596 print("=" * 60 + "\n") 1597 1598 # Run kernel benchmarks 1599 benchmark_kernels(100) 1600 1601 print("\n") 1602 1603 # Run cluster benchmarks 1604 stats = benchmark_cluster_detection(100) 1605 print( 1606 f"\nSample stats: 
largest={stats['largest']}, " 1607 f"largest_fraction={stats['largest_fraction']:.3f}, " 1608 f"n_clusters={stats['n_clusters']}" 1609 )
81@njit(cache=True) 82def set_numba_seed(seed: int) -> None: 83 """ 84 Seed Numba's internal random number generator from within a JIT context. 85 86 This function ensures that Numba's independent random number generator 87 is synchronized with the provided seed, enabling reproducibility for 88 jit-compiled functions that use NumPy's random operations. 89 90 Parameters 91 ---------- 92 seed : int 93 The integer value used to initialize the random number generator. 94 95 Returns 96 ------- 97 None 98 99 Notes 100 ----- 101 Because Numba maintains its own internal state for random number 102 generation, calling `np.random.seed()` in standard Python code will not 103 affect jit-compiled functions. This helper must be called to bridge 104 that gap. 105 """ 106 np.random.seed(seed)
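When Numba is missing, the no-op `njit` fallback defined in the import guard at the top of the module makes `set_numba_seed` equivalent to a plain `np.random.seed` call; either way, reseeding must reproduce the same stream. A quick check of that contract, using a hypothetical stand-in for the seeded draw:

```python
import numpy as np


def draws_after_seed(seed, n=3):
    # Stand-in for set_numba_seed followed by random draws; under the no-op
    # njit fallback this is exactly what the module executes.
    np.random.seed(seed)
    return np.random.random(n)


a = draws_after_seed(123)
b = draws_after_seed(123)
```

Under Numba, the same call must happen *inside* a jitted function (as `set_numba_seed` does) for the JIT-side generator state to be affected.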
424class PPKernel: 425 """ 426 Wrapper for predator-prey kernel with pre-allocated buffers. 427 428 This class manages the spatial configuration and memory buffers required 429 for the Numba-accelerated update kernels. By pre-allocating the 430 `occupied_buffer`, it avoids expensive memory allocations during the 431 simulation loop. 432 433 Parameters 434 ---------- 435 rows : int 436 Number of rows in the simulation grid. 437 cols : int 438 Number of columns in the simulation grid. 439 neighborhood : {'moore', 'von_neumann'}, optional 440 The neighborhood type determining adjacent cells. 'moore' includes 441 diagonals (8 neighbors), 'von_neumann' does not (4 neighbors). 442 Default is 'moore'. 443 directed_hunting : bool, optional 444 If True, uses the directed behavior kernel where species search for 445 targets. If False, uses random neighbor selection. Default is False. 446 447 Attributes 448 ---------- 449 rows : int 450 Grid row count. 451 cols : int 452 Grid column count. 453 directed_hunting : bool 454 Toggle for intelligent behavior logic. 
455 """ 456 457 def __init__( 458 self, 459 rows: int, 460 cols: int, 461 neighborhood: str = "moore", 462 directed_hunting: bool = False, 463 ): 464 self.rows = rows 465 self.cols = cols 466 self.directed_hunting = directed_hunting 467 self._occupied_buffer = np.empty((rows * cols, 2), dtype=np.int32) 468 469 if neighborhood == "moore": 470 self._dr = np.array([-1, -1, -1, 0, 0, 1, 1, 1], dtype=np.int32) 471 self._dc = np.array([-1, 0, 1, -1, 1, -1, 0, 1], dtype=np.int32) 472 else: # von Neumann 473 self._dr = np.array([-1, 1, 0, 0], dtype=np.int32) 474 self._dc = np.array([0, 0, -1, 1], dtype=np.int32) 475 476 def update( 477 self, 478 grid: np.ndarray, 479 prey_death_arr: np.ndarray, 480 prey_birth: float, 481 prey_death: float, 482 pred_birth: float, 483 pred_death: float, 484 evolve_sd: float = 0.1, 485 evolve_min: float = 0.001, 486 evolve_max: float = 0.1, 487 evolution_stopped: bool = True, 488 ) -> np.ndarray: 489 """ 490 Execute a single asynchronous update step using the configured kernel. 491 492 Parameters 493 ---------- 494 grid : np.ndarray 495 The current 2D simulation grid. 496 prey_death_arr : np.ndarray 497 2D array of individual prey mortality rates. 498 prey_birth : float 499 Prey reproduction probability. 500 prey_death : float 501 Base prey mortality probability. 502 pred_birth : float 503 Predator reproduction (hunting success) probability. 504 pred_death : float 505 Predator mortality probability. 506 evolve_sd : float, optional 507 Mutation standard deviation (default 0.1). 508 evolve_min : float, optional 509 Minimum evolved death rate (default 0.001). 510 evolve_max : float, optional 511 Maximum evolved death rate (default 0.1). 512 evolution_stopped : bool, optional 513 Whether to disable mutation during this step (default True). 514 515 Returns 516 ------- 517 np.ndarray 518 The updated grid after one full asynchronous pass. 
519 """ 520 if self.directed_hunting: 521 return _pp_async_kernel_directed( 522 grid, 523 prey_death_arr, 524 prey_birth, 525 prey_death, 526 pred_birth, 527 pred_death, 528 self._dr, 529 self._dc, 530 evolve_sd, 531 evolve_min, 532 evolve_max, 533 evolution_stopped, 534 self._occupied_buffer, 535 ) 536 else: 537 return _pp_async_kernel_random( 538 grid, 539 prey_death_arr, 540 prey_birth, 541 prey_death, 542 pred_birth, 543 pred_death, 544 self._dr, 545 self._dc, 546 evolve_sd, 547 evolve_min, 548 evolve_max, 549 evolution_stopped, 550 self._occupied_buffer, 551 )
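The offset tables built in `__init__` fix the neighborhood geometry. A small helper (hypothetical, not part of the class) using the same tables makes the neighbor counts and the periodic wrap-around easy to verify:

```python
import numpy as np


def neighbors(r, c, rows, cols, neighborhood="moore"):
    """Periodic neighbor coordinates, mirroring PPKernel's offset tables."""
    if neighborhood == "moore":
        dr = np.array([-1, -1, -1, 0, 0, 1, 1, 1], dtype=np.int32)
        dc = np.array([-1, 0, 1, -1, 1, -1, 0, 1], dtype=np.int32)
    else:  # von Neumann
        dr = np.array([-1, 1, 0, 0], dtype=np.int32)
        dc = np.array([0, 0, -1, 1], dtype=np.int32)
    # Modulo arithmetic wraps coordinates across the toroidal grid edges.
    return [((r + a) % rows, (c + b) % cols) for a, b in zip(dr, dc)]


moore = neighbors(0, 0, 10, 10, "moore")
vn = neighbors(0, 0, 10, 10, "von_neumann")
```

Note that the corner cell (0, 0) picks up (9, 9) as a Moore neighbor: the kernels treat the grid as a torus, never as a bounded box.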
771def measure_cluster_sizes_fast( 772 grid: np.ndarray, 773 species: int, 774 neighborhood: str = "moore", 775) -> np.ndarray: 776 """ 777 Measure cluster sizes for a specific species using Numba-accelerated flood fill. 778 779 This function provides a high-performance interface for calculating cluster 780 size statistics without the overhead of generating a full label map. It is 781 optimized for large-scale simulation analysis where only distribution 782 metrics (e.g., mean size, max size) are required. 783 784 Parameters 785 ---------- 786 grid : np.ndarray 787 A 2D array representing the simulation environment. 788 species : int 789 The target species identifier (e.g., 1 for Prey, 2 for Predator). 790 neighborhood : {'moore', 'neumann'}, optional 791 The connectivity rule. 'moore' uses 8-way connectivity (including diagonals); 792 'neumann' uses 4-way connectivity. Default is 'moore'. 793 794 Returns 795 ------- 796 cluster_sizes : np.ndarray 797 A 1D array of integers, where each element is the cell count of an 798 identified cluster. 799 800 Notes 801 ----- 802 The input grid is cast to `int32` to ensure compatibility with the 803 underlying JIT-compiled `_measure_clusters` kernel. 804 805 Examples 806 -------- 807 >>> sizes = measure_cluster_sizes_fast(grid, species=1, neighborhood='moore') 808 >>> if sizes.size > 0: 809 ... print(f"Largest cluster: {sizes.max()}") 810 """ 811 grid_int = np.asarray(grid, dtype=np.int32) 812 moore = neighborhood == "moore" 813 return _measure_clusters(grid_int, np.int32(species), moore)
816def detect_clusters_fast( 817 grid: np.ndarray, 818 species: int, 819 neighborhood: str = "moore", 820) -> Tuple[np.ndarray, Dict[int, int]]: 821 """ 822 Perform full cluster detection with labels using Numba acceleration. 823 824 This function returns a label array for spatial analysis and a dictionary 825 of cluster sizes. It is significantly faster than standard Python or 826 SciPy equivalents for large simulation grids. 827 828 Parameters 829 ---------- 830 grid : np.ndarray 831 A 2D array representing the simulation environment. 832 species : int 833 The target species identifier (e.g., 1 for Prey, 2 for Predator). 834 neighborhood : {'moore', 'neumann'}, optional 835 The connectivity rule. 'moore' uses 8-way connectivity; 'neumann' 836 uses 4-way connectivity. Default is 'moore'. 837 838 Returns 839 ------- 840 labels : np.ndarray 841 A 2D int32 array where each cell contains its unique cluster ID. 842 Cells not belonging to the target species are 0. 843 sizes : dict 844 A dictionary mapping cluster IDs to their respective cell counts. 845 846 Notes 847 ----- 848 The underlying Numba kernel uses a stack-based flood fill to avoid 849 recursion limits and handles periodic boundary conditions. 850 851 Examples 852 -------- 853 >>> labels, sizes = detect_clusters_fast(grid, species=1) 854 >>> if sizes: 855 ... largest_id = max(sizes, key=sizes.get) 856 ... print(f"Cluster {largest_id} size: {sizes[largest_id]}") 857 """ 858 grid_int = np.asarray(grid, dtype=np.int32) 859 moore = neighborhood == "moore" 860 labels, sizes_arr = _detect_clusters_numba(grid_int, np.int32(species), moore) 861 sizes_dict = {i + 1: int(sizes_arr[i]) for i in range(len(sizes_arr))} 862 return labels, sizes_dict
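The sizes dictionary returned alongside the label map is fully determined by the labels themselves, so the pairing can be sanity-checked with `np.unique` — an illustrative consistency check, not module code:

```python
import numpy as np


def sizes_from_labels(labels):
    """Recover {cluster_id: size} from a label map where 0 is background."""
    ids, counts = np.unique(labels[labels > 0], return_counts=True)
    return {int(i): int(n) for i, n in zip(ids, counts)}


labels = np.array([[1, 1, 0],
                   [0, 2, 2],
                   [0, 0, 2]])
```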
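The stack-based, periodic flood fill that the docstring's Notes describe can be sketched as a plain-Python reference. This is an illustrative, un-jitted equivalent, not the module's actual `_detect_clusters_numba` kernel; the function name `flood_fill_labels` is hypothetical:

```python
import numpy as np

def flood_fill_labels(grid, species, moore=True):
    """Label connected clusters of `species` on a toroidal grid.

    Pure-Python reference sketch of the stack-based flood fill; the
    module's real kernel is Numba-compiled and much faster.
    """
    rows, cols = grid.shape
    labels = np.zeros((rows, cols), dtype=np.int32)
    if moore:  # 8-way connectivity
        offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                   (0, 1), (1, -1), (1, 0), (1, 1)]
    else:      # von Neumann, 4-way connectivity
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    sizes = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != species or labels[r, c] != 0:
                continue
            next_label += 1
            labels[r, c] = next_label
            stack = [(r, c)]  # explicit stack avoids recursion limits
            size = 0
            while stack:
                cr, cc = stack.pop()
                size += 1
                for dr, dc in offsets:
                    # modular indexing implements periodic boundaries
                    nr, nc = (cr + dr) % rows, (cc + dc) % cols
                    if grid[nr, nc] == species and labels[nr, nc] == 0:
                        labels[nr, nc] = next_label
                        stack.append((nr, nc))
            sizes[next_label] = size
    return labels, sizes
```

Note how the periodic wrap makes opposite corners of the grid Moore neighbors, which pure SciPy `ndimage.label` (non-periodic by default) would count as separate clusters.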
```python
def get_cluster_stats_fast(
    grid: np.ndarray,
    species: int,
    neighborhood: str = "moore",
) -> Dict:
    """
    Compute comprehensive cluster statistics for a species using Numba acceleration.

    This function integrates cluster detection and labeling to provide a
    full suite of spatial metrics. It calculates the cluster size distribution
    and the largest cluster fraction, which often serves as an order
    parameter in percolation theory and Phase 1-3 analyses.

    Parameters
    ----------
    grid : np.ndarray
        A 2D array representing the simulation environment.
    species : int
        The target species identifier (e.g., 1 for Prey, 2 for Predator).
    neighborhood : {'moore', 'neumann'}, optional
        The connectivity rule. 'moore' uses 8-way connectivity; 'neumann'
        uses 4-way connectivity. Default is 'moore'.

    Returns
    -------
    stats : dict
        A dictionary containing:
        - 'n_clusters': Total count of isolated clusters.
        - 'sizes': Sorted array (descending) of all cluster sizes.
        - 'largest': Size of the single largest cluster.
        - 'largest_fraction': Size of the largest cluster divided by
          the total population of the species.
        - 'mean_size': Average size of all clusters.
        - 'size_distribution': Frequency mapping of {size: count}.
        - 'labels': 2D array of unique cluster IDs.
        - 'size_dict': Mapping of {label_id: size}.

    Examples
    --------
    >>> stats = get_cluster_stats_fast(grid, species=1)
    >>> print(f"Found {stats['n_clusters']} prey clusters.")
    >>> print(f"Order parameter: {stats['largest_fraction']:.3f}")
    """
    labels, size_dict = detect_clusters_fast(grid, species, neighborhood)

    if len(size_dict) == 0:
        return {
            "n_clusters": 0,
            "sizes": np.array([], dtype=np.int32),
            "largest": 0,
            "largest_fraction": 0.0,
            "mean_size": 0.0,
            "size_distribution": {},
            "labels": labels,
            "size_dict": size_dict,
        }

    sizes = np.array(list(size_dict.values()), dtype=np.int32)
    sizes_sorted = np.sort(sizes)[::-1]
    total_pop = int(np.sum(sizes))
    largest = int(sizes_sorted[0])

    size_dist = {}
    for s in sizes:
        s_int = int(s)
        size_dist[s_int] = size_dist.get(s_int, 0) + 1

    return {
        "n_clusters": len(size_dict),
        "sizes": sizes_sorted,
        "largest": largest,
        "largest_fraction": float(largest) / total_pop if total_pop > 0 else 0.0,
        "mean_size": float(np.mean(sizes)),
        "size_distribution": size_dist,
        "labels": labels,
        "size_dict": size_dict,
    }
```
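The summary fields in `get_cluster_stats_fast` are all derived from the `{label: size}` mapping alone. A minimal sketch of that derivation, including the largest-cluster-fraction order parameter (the helper name `summarize_sizes` is hypothetical):

```python
import numpy as np

def summarize_sizes(size_dict):
    """Derive summary statistics from a {label: size} cluster mapping.

    Illustrative mirror of the arithmetic inside get_cluster_stats_fast.
    """
    if not size_dict:
        return {"n_clusters": 0, "largest": 0,
                "largest_fraction": 0.0, "mean_size": 0.0}
    sizes = np.array(list(size_dict.values()), dtype=np.int32)
    total = int(sizes.sum())       # total population of the species
    largest = int(sizes.max())
    return {
        "n_clusters": len(size_dict),
        "largest": largest,
        # order parameter in [0, 1]: fraction of the population
        # belonging to the single largest cluster
        "largest_fraction": largest / total,
        "mean_size": float(sizes.mean()),
    }
```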
```python
def compute_pcf_periodic_fast(
    positions_i: np.ndarray,
    positions_j: np.ndarray,
    grid_shape: Tuple[int, int],
    max_distance: float,
    n_bins: int = 50,
    self_correlation: bool = False,
) -> Tuple[np.ndarray, np.ndarray, int]:
    """
    Compute the Pair Correlation Function (PCF) using cell-list acceleration.

    This high-level function coordinates the spatial hashing and histogram
    calculation to determine the $g(r)$ function. It normalizes the resulting
    histogram by the expected number of pairs in an ideal gas of the same
    density, accounting for the toroidal area of each radial bin.

    Parameters
    ----------
    positions_i : np.ndarray
        (N, 2) array of coordinates for species I.
    positions_j : np.ndarray
        (M, 2) array of coordinates for species J.
    grid_shape : tuple of int
        The (rows, cols) dimensions of the simulation grid.
    max_distance : float
        The maximum radius to calculate correlations for.
    n_bins : int, optional
        Number of bins for the radial distribution (default 50).
    self_correlation : bool, optional
        Set to True if computing the correlation of a species with itself
        to avoid self-counting (default False).

    Returns
    -------
    bin_centers : np.ndarray
        The central radial distance for each histogram bin.
    pcf : np.ndarray
        The normalized $g(r)$ values. A value of 1.0 indicates no spatial
        correlation; > 1.0 indicates clustering; < 1.0 indicates repulsion.
    total_pairs : int
        The total count of pairs found within the `max_distance`.

    Notes
    -----
    The function dynamically determines the optimal number of cells for the
    spatial hash based on the `max_distance` and grid dimensions to maintain
    linear time complexity.
    """
    rows, cols = grid_shape
    L_row, L_col = float(rows), float(cols)
    area = L_row * L_col

    bin_width = max_distance / n_bins
    bin_centers = np.linspace(bin_width / 2, max_distance - bin_width / 2, n_bins)

    if len(positions_i) == 0 or len(positions_j) == 0:
        return bin_centers, np.ones(n_bins), 0

    n_cells = max(4, int(min(rows, cols) / max_distance))

    pos_i = np.ascontiguousarray(positions_i, dtype=np.float64)
    pos_j = np.ascontiguousarray(positions_j, dtype=np.float64)

    indices_j, offsets_j, counts_j, cell_size_r, cell_size_c = _build_cell_list(
        pos_j, n_cells, L_row, L_col
    )

    hist = _pcf_cell_list(
        pos_i,
        pos_j,
        indices_j,
        offsets_j,
        counts_j,
        cell_size_r,
        cell_size_c,
        L_row,
        L_col,
        max_distance,
        n_bins,
        self_correlation,
        n_cells,
    )

    n_i, n_j = len(positions_i), len(positions_j)
    if self_correlation:
        density_product = n_i * (n_i - 1) / (area * area)
    else:
        density_product = n_i * n_j / (area * area)

    expected = np.zeros(n_bins)
    for i in range(n_bins):
        r = bin_centers[i]
        annulus_area = 2 * np.pi * r * bin_width
        expected[i] = density_product * annulus_area * area

    pcf = np.ones(n_bins)
    mask = expected > 1.0
    pcf[mask] = hist[mask] / expected[mask]

    return bin_centers, pcf, int(np.sum(hist))
```
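The distances behind the PCF are toroidal: each pair is measured under the minimum-image convention before being binned. A brute-force O(N*M) reference for the raw histogram that the cell-list kernel produces (an illustrative sketch; `pcf_histogram_bruteforce` is a hypothetical name, and the real kernel is Numba-compiled):

```python
import numpy as np

def pcf_histogram_bruteforce(pos_i, pos_j, L_row, L_col, max_distance,
                             n_bins, self_correlation=False):
    """Raw pair-count histogram with minimum-image periodic distances.

    Brute-force reference for the cell-list kernel; returns the
    unnormalized histogram from which g(r) is computed.
    """
    hist = np.zeros(n_bins, dtype=np.int64)
    bin_width = max_distance / n_bins
    for a, (r1, c1) in enumerate(pos_i):
        for b, (r2, c2) in enumerate(pos_j):
            if self_correlation and a == b:
                continue  # skip self-pairs when I == J
            dr = abs(r1 - r2)
            dc = abs(c1 - c2)
            dr = min(dr, L_row - dr)  # minimum image: wrap around the torus
            dc = min(dc, L_col - dc)
            d = np.hypot(dr, dc)
            if d < max_distance:
                hist[int(d / bin_width)] += 1
    return hist
```

Dividing this histogram by the ideal-gas expectation per bin (density product times annulus area times total area, as in the function body above) yields g(r).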
```python
def compute_all_pcfs_fast(
    grid: np.ndarray,
    max_distance: Optional[float] = None,
    n_bins: int = 50,
) -> Dict[str, Tuple[np.ndarray, np.ndarray, int]]:
    """
    Compute all three species Pair Correlation Functions (PCFs) using cell-list acceleration.

    This function calculates the spatial auto-correlations (Prey-Prey,
    Predator-Predator) and the cross-correlation (Prey-Predator) for a given
    simulation grid. It identifies particle positions and leverages
    Numba-accelerated cell lists to handle the computations efficiently.

    Parameters
    ----------
    grid : np.ndarray
        2D integer array where 1 represents prey and 2 represents predators.
    max_distance : float, optional
        The maximum radial distance for the correlation. Defaults to 1/4
        of the minimum grid dimension if not provided.
    n_bins : int, optional
        Number of distance bins for the histogram. Default is 50.

    Returns
    -------
    results : dict
        A dictionary with keys 'prey_prey', 'pred_pred', and 'prey_pred'.
        Each value is a tuple containing:
        - bin_centers (np.ndarray): Radial distances.
        - pcf_values (np.ndarray): Normalized g(r) values.
        - pair_count (int): Total number of pairs found.

    Notes
    -----
    The PCF provides insight into the spatial organization of the system.
    g(r) > 1 at short distances indicates aggregation (clustering),
    while g(r) < 1 indicates exclusion or repulsion.
    """
    rows, cols = grid.shape
    if max_distance is None:
        max_distance = min(rows, cols) / 4.0

    prey_pos = np.argwhere(grid == 1)
    pred_pos = np.argwhere(grid == 2)

    results = {}

    dist, pcf, n = compute_pcf_periodic_fast(
        prey_pos,
        prey_pos,
        (rows, cols),
        max_distance,
        n_bins,
        self_correlation=True,
    )
    results["prey_prey"] = (dist, pcf, n)

    dist, pcf, n = compute_pcf_periodic_fast(
        pred_pos,
        pred_pos,
        (rows, cols),
        max_distance,
        n_bins,
        self_correlation=True,
    )
    results["pred_pred"] = (dist, pcf, n)

    dist, pcf, n = compute_pcf_periodic_fast(
        prey_pos,
        pred_pos,
        (rows, cols),
        max_distance,
        n_bins,
        self_correlation=False,
    )
    results["prey_pred"] = (dist, pcf, n)

    return results
```
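The position extraction step relies on the module's grid encoding (0 = empty, 1 = prey, 2 = predator) and `np.argwhere`, which scans in row-major order and returns an (N, 2) array of (row, col) coordinates. A tiny demonstration:

```python
import numpy as np

# Grid encoding used throughout the module: 0 = empty, 1 = prey, 2 = predator.
grid = np.array([[0, 1, 0],
                 [2, 0, 1],
                 [0, 2, 0]], dtype=np.int32)

prey_pos = np.argwhere(grid == 1)  # (N, 2) array of (row, col) coordinates
pred_pos = np.argwhere(grid == 2)

print(prey_pos.tolist())  # [[0, 1], [1, 2]]
print(pred_pos.tolist())  # [[1, 0], [2, 1]]
```

These arrays are exactly what `compute_pcf_periodic_fast` expects for `positions_i` and `positions_j`.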
```python
def warmup_numba_kernels(grid_size: int = 100, directed_hunting: bool = False):
    """
    Pre-compile all Numba-accelerated kernels to avoid first-run latency.

    This function executes a single step of the simulation and each analysis
    routine on a dummy grid. Because Numba uses Just-In-Time (JIT) compilation,
    the first call to a decorated function incurs a compilation overhead.
    Running this warmup ensures that subsequent experimental runs are timed
    accurately and perform at full speed.

    Parameters
    ----------
    grid_size : int, optional
        The side length of the dummy grid used for warmup (default 100).
    directed_hunting : bool, optional
        If True, also warms up the directed behavior update kernel (default False).

    Returns
    -------
    None

    Notes
    -----
    This function checks for `NUMBA_AVAILABLE` before execution. It warms up
    the `PPKernel` (random and optionally directed), as well as the
    spatial analysis functions (`compute_all_pcfs_fast`, `detect_clusters_fast`, etc.).
    """
    if not NUMBA_AVAILABLE:
        return

    set_numba_seed(0)

    grid = np.zeros((grid_size, grid_size), dtype=np.int32)
    grid[::3, ::3] = 1
    grid[::5, ::5] = 2

    prey_death_arr = np.full((grid_size, grid_size), 0.05, dtype=np.float64)
    prey_death_arr[grid != 1] = np.nan

    # Always warmup random kernel
    kernel_random = PPKernel(grid_size, grid_size, directed_hunting=False)
    kernel_random.update(grid.copy(), prey_death_arr.copy(), 0.2, 0.05, 0.2, 0.1)

    # Warmup directed kernel if requested
    if directed_hunting:
        kernel_directed = PPKernel(grid_size, grid_size, directed_hunting=True)
        kernel_directed.update(grid.copy(), prey_death_arr.copy(), 0.2, 0.05, 0.2, 0.1)

    # Warmup analysis functions
    _ = compute_all_pcfs_fast(grid, max_distance=20.0, n_bins=20)
    _ = measure_cluster_sizes_fast(grid, 1)
    _ = detect_clusters_fast(grid, 1)
    _ = get_cluster_stats_fast(grid, 1)
```
```python
def benchmark_kernels(grid_size: int = 100, n_runs: int = 20):
    """
    Benchmark the execution performance of random vs. directed update kernels.

    This utility measures the average time per simulation step for both the
    stochastic (random neighbor) and heuristic (directed hunting/reproduction)
    update strategies. It accounts for the computational overhead introduced
    by the "intelligent" search logic used in directed mode.

    Parameters
    ----------
    grid_size : int, optional
        The side length of the square simulation grid (default 100).
    n_runs : int, optional
        The number of iterations to perform for averaging performance (default 20).

    Returns
    -------
    t_random : float
        Average time per step for the random kernel in milliseconds.
    t_directed : float
        Average time per step for the directed kernel in milliseconds.

    Notes
    -----
    The function ensures a fair comparison by:
    1. Using a fixed seed for reproducible initial grid states.
    2. Warming up Numba kernels before timing to exclude JIT compilation latency.
    3. Copying the grid and death arrays for each iteration to maintain
       consistent population densities throughout the benchmark.
    """
    import time

    print("=" * 60)
    print(f"KERNEL BENCHMARK ({grid_size}x{grid_size}, {n_runs} runs)")
    print(f"Numba available: {NUMBA_AVAILABLE}")
    print("=" * 60)

    np.random.seed(42)
    grid = np.zeros((grid_size, grid_size), dtype=np.int32)
    n_prey = int(grid_size * grid_size * 0.30)
    n_pred = int(grid_size * grid_size * 0.15)
    positions = np.random.permutation(grid_size * grid_size)
    for pos in positions[:n_prey]:
        grid[pos // grid_size, pos % grid_size] = 1
    for pos in positions[n_prey : n_prey + n_pred]:
        grid[pos // grid_size, pos % grid_size] = 2

    prey_death_arr = np.full((grid_size, grid_size), 0.05, dtype=np.float64)
    prey_death_arr[grid != 1] = np.nan

    print(f"Initial: {np.sum(grid == 1)} prey, {np.sum(grid == 2)} predators")

    # Warmup both kernels
    warmup_numba_kernels(grid_size, directed_hunting=True)

    # Benchmark random kernel
    kernel_random = PPKernel(grid_size, grid_size, directed_hunting=False)
    t0 = time.perf_counter()
    for _ in range(n_runs):
        test_grid = grid.copy()
        test_arr = prey_death_arr.copy()
        kernel_random.update(test_grid, test_arr, 0.2, 0.05, 0.2, 0.1)
    t_random = (time.perf_counter() - t0) / n_runs * 1000

    # Benchmark directed kernel
    kernel_directed = PPKernel(grid_size, grid_size, directed_hunting=True)
    t0 = time.perf_counter()
    for _ in range(n_runs):
        test_grid = grid.copy()
        test_arr = prey_death_arr.copy()
        kernel_directed.update(test_grid, test_arr, 0.2, 0.05, 0.2, 0.1)
    t_directed = (time.perf_counter() - t0) / n_runs * 1000

    print(f"\nRandom kernel: {t_random:.2f} ms/step")
    print(f"Directed kernel: {t_directed:.2f} ms/step")
    print(
        f"Overhead: {t_directed - t_random:.2f} ms (+{100*(t_directed/t_random - 1):.1f}%)"
    )

    return t_random, t_directed
```
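The benchmark seeds its grid cell-by-cell from a shuffled index array. The same non-overlapping placement can be done with vectorized fancy indexing; a sketch (the helper name is hypothetical, and it uses `np.random.default_rng` rather than the legacy `np.random.seed`, so exact cell placement differs from the benchmark's):

```python
import numpy as np

def init_benchmark_grid(grid_size=100, prey_frac=0.30, pred_frac=0.15, seed=42):
    """Seed a grid with non-overlapping prey (1) and predators (2).

    Vectorized equivalent of the benchmark's per-cell loop: a single
    permutation guarantees prey and predators never share a cell.
    """
    rng = np.random.default_rng(seed)
    flat = np.zeros(grid_size * grid_size, dtype=np.int32)
    positions = rng.permutation(grid_size * grid_size)
    n_prey = int(grid_size * grid_size * prey_frac)
    n_pred = int(grid_size * grid_size * pred_frac)
    flat[positions[:n_prey]] = 1                 # prey
    flat[positions[n_prey:n_prey + n_pred]] = 2  # predators
    return flat.reshape(grid_size, grid_size)
```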
```python
def benchmark_cluster_detection(grid_size: int = 100, n_runs: int = 20):
    """
    Benchmark the performance of different cluster detection and analysis routines.

    This function evaluates three levels of spatial analysis:
    1. Size measurement only (fastest, no label map).
    2. Full detection (returns label map and size dictionary).
    3. Comprehensive statistics (calculates distributions, means, and order parameters).

    Parameters
    ----------
    grid_size : int, optional
        Side length of the square grid for benchmarking (default 100).
    n_runs : int, optional
        Number of iterations to average for performance results (default 20).

    Returns
    -------
    stats : dict
        The result dictionary from the final comprehensive statistics run.

    Notes
    -----
    The benchmark uses a fixed prey density of 30% to ensure a representative
    distribution of clusters. It pre-warms the Numba kernels to ensure that
    the measurements reflect execution speed rather than compilation time.
    """
    import time

    print("=" * 60)
    print(f"CLUSTER DETECTION BENCHMARK ({grid_size}x{grid_size})")
    print(f"Numba available: {NUMBA_AVAILABLE}")
    print("=" * 60)

    np.random.seed(42)
    grid = np.zeros((grid_size, grid_size), dtype=np.int32)
    n_prey = int(grid_size * grid_size * 0.30)
    positions = np.random.permutation(grid_size * grid_size)[:n_prey]
    for pos in positions:
        grid[pos // grid_size, pos % grid_size] = 1

    print(f"Prey cells: {np.sum(grid == 1)}")

    # Warmup
    _ = measure_cluster_sizes_fast(grid, 1)
    _ = detect_clusters_fast(grid, 1)
    _ = get_cluster_stats_fast(grid, 1)

    # Benchmark sizes only
    t0 = time.perf_counter()
    for _ in range(n_runs):
        sizes = measure_cluster_sizes_fast(grid, 1)
    t_sizes = (time.perf_counter() - t0) / n_runs * 1000
    print(f"\nmeasure_cluster_sizes_fast: {t_sizes:.2f} ms ({len(sizes)} clusters)")

    # Benchmark full detection
    t0 = time.perf_counter()
    for _ in range(n_runs):
        labels, size_dict = detect_clusters_fast(grid, 1)
    t_detect = (time.perf_counter() - t0) / n_runs * 1000
    print(f"detect_clusters_fast: {t_detect:.2f} ms ({len(size_dict)} clusters)")

    # Benchmark full stats
    t0 = time.perf_counter()
    for _ in range(n_runs):
        stats = get_cluster_stats_fast(grid, 1)
    t_stats = (time.perf_counter() - t0) / n_runs * 1000
    print(f"get_cluster_stats_fast: {t_stats:.2f} ms")

    print(
        f"\nOverhead for labels: {t_detect - t_sizes:.2f} ms (+{100*(t_detect/t_sizes - 1):.0f}%)"
    )

    return stats
```