#!/usr/bin/env python3
"""
Predator-Prey Hydra Effect Experiments
======================================

HPC-ready experiment runner for investigating the Hydra effect in
predator-prey cellular automata.

Experimental Phases
-------------------
- **Phase 1**: Parameter sweep to find critical point (bifurcation + cluster analysis)
- **Phase 2**: Self-organization analysis (evolution toward criticality)
- **Phase 3**: Finite-size scaling at critical point
- **Phase 4**: Sensitivity analysis across parameter regimes
- **Phase 5**: Model extensions (directed hunting comparison)

Functions
---------
```python
run_single_simulation # Execute one simulation run and collect metrics.
run_phase1, run_phase2, run_phase3, run_phase4, run_phase5  # Phase-specific experiment runners.
```
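The cluster analysis in Phase 1 summarizes spatial structure through the
"largest cluster fraction": the size of a species' largest connected
component divided by its total population. The actual computation is done by
`get_cluster_stats_fast` (imported from `models.numba_optimized`); the sketch
below is only an illustrative stand-in that assumes 4-neighbour connectivity,
which may differ from the project's implementation:

```python
import numpy as np
from collections import deque

def largest_cluster_fraction(grid: np.ndarray, species: int) -> float:
    """Largest connected component of `species` cells over the species total.

    Illustrative only: assumes 4-neighbour (von Neumann) connectivity and
    non-periodic boundaries; returns NaN if the species is absent.
    """
    mask = grid == species
    total = int(mask.sum())
    if total == 0:
        return float("nan")
    seen = np.zeros_like(mask, dtype=bool)
    best = 0
    rows, cols = mask.shape
    for r, c in zip(*np.nonzero(mask)):
        if seen[r, c]:
            continue
        # BFS flood fill over the 4-connected neighbours of (r, c)
        size, queue = 0, deque([(r, c)])
        seen[r, c] = True
        while queue:
            y, x = queue.popleft()
            size += 1
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < rows and 0 <= nx < cols and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    queue.append((ny, nx))
        best = max(best, size)
    return best / total

# Toy grid: one 3-cell prey cluster plus one isolated prey cell
g = np.array([[1, 1, 0],
              [0, 1, 0],
              [0, 0, 1]])
largest_cluster_fraction(g, 1)  # -> 0.75
```

A fraction near 1 means one spanning cluster dominates; near 0 means the
population is fragmented into many small clusters.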

Utilities
---------
```python
generate_unique_seed # Deterministic seed generation from parameters.
count_populations # Count species populations on grid.
get_evolved_stats # Statistics for evolved parameters.
average_pcfs # Average multiple PCF measurements.
save_results_jsonl, load_results_jsonl, save_results_npz # I/O functions for experiment results.
```
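Seeding follows one scheme throughout: `generate_unique_seed` hashes a
sorted-JSON rendering of the job parameters plus the replicate index with
SHA-256 and keeps the first 8 hex digits, yielding a stable 32-bit seed per
(parameters, replicate) pair. The same scheme, reproduced standalone with
only the standard library:

```python
import hashlib
import json

def unique_seed(params: dict, rep: int) -> int:
    """Deterministic 32-bit seed: SHA-256 of sorted-JSON params + rep index."""
    identifier = json.dumps(params, sort_keys=True) + f"_{rep}"
    return int(hashlib.sha256(identifier.encode()).hexdigest()[:8], 16)

params = {"pd": 0.05, "pb": 0.2}
unique_seed(params, 0) == unique_seed({"pb": 0.2, "pd": 0.05}, 0)  # True: key order irrelevant
unique_seed(params, 0) != unique_seed(params, 1)                   # True: replicates differ
unique_seed(params, 0) < 2**32                                     # True: 8 hex digits
```

Because the seed depends only on the job description, re-running a sweep
reproduces the same per-job seeds without any shared RNG state across workers.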

Command Line Usage
------------------
```bash
python experiments.py --phase 1                    # Run phase 1
python experiments.py --phase 1 --dry-run          # Estimate runtime
python experiments.py --phase all                  # Run all phases
python experiments.py --phase 1 --output results/  # Custom output
```

Programmatic Usage
------------------
```python
from experiments import run_single_simulation, run_phase1
from models.config import PHASE1_CONFIG

# Single simulation
result = run_single_simulation(
    prey_birth=0.2,
    prey_death=0.05,
    predator_birth=0.8,
    predator_death=0.1,
    grid_size=100,
    seed=42,
    cfg=PHASE1_CONFIG,
)

# Full phase (writes to output directory)
import logging
from pathlib import Path
results = run_phase1(PHASE1_CONFIG, Path("results/"), logging.getLogger())
```
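Downstream analysis lives outside this module, but the per-run dicts written
by `run_phase1` carry what is needed to look for the Hydra effect (predator
abundance rising with prey mortality). A hypothetical post-processing sketch,
grouping results by `prey_death` and averaging `pred_mean` over surviving
replicates; the `fake` dicts below are synthetic stand-ins for
`load_results_jsonl` output, and `mean_pred_by_prey_death` is not part of
this module:

```python
from collections import defaultdict

def mean_pred_by_prey_death(results):
    """Average predator abundance per prey_death value, over surviving replicates."""
    groups = defaultdict(list)
    for res in results:
        if res["pred_survived"]:  # keys as produced by run_single_simulation
            groups[res["prey_death"]].append(res["pred_mean"])
    return {pd: sum(v) / len(v) for pd, v in sorted(groups.items())}

# Synthetic stand-ins: two replicates at each of two prey_death values
fake = [
    {"prey_death": 0.05, "pred_mean": 900.0, "pred_survived": True},
    {"prey_death": 0.05, "pred_mean": 1100.0, "pred_survived": True},
    {"prey_death": 0.10, "pred_mean": 1400.0, "pred_survived": True},
    {"prey_death": 0.10, "pred_mean": 1200.0, "pred_survived": True},
]
mean_pred_by_prey_death(fake)  # {0.05: 1000.0, 0.1: 1300.0}
```

In real sweep output, a predator mean that increases with prey mortality over
some range would be the Hydra signature (the numbers above are fabricated for
illustration only).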
"""

import argparse
import hashlib
import json
import logging
import os
import sys
import time
from dataclasses import asdict
from pathlib import Path
from typing import Dict, List, Tuple, Optional
import warnings

import numpy as np
from tqdm import tqdm

warnings.filterwarnings("ignore")

# Project imports
project_root = str(Path(__file__).parent.parent)
if project_root not in sys.path:
    sys.path.insert(0, project_root)

from models.config import Config, get_phase_config, PHASE_CONFIGS

# Numba imports
try:
    from models.numba_optimized import (
        compute_all_pcfs_fast,
        get_cluster_stats_fast,
        warmup_numba_kernels,
        set_numba_seed,
        NUMBA_AVAILABLE,
    )

    USE_NUMBA = NUMBA_AVAILABLE
except ImportError:
    USE_NUMBA = False

    def warmup_numba_kernels(size, **kwargs):
        pass

    def set_numba_seed(seed):
        pass

    # Fallback stubs: cluster and PCF analysis genuinely require the
    # numba-optimized module, so fail with a clear error instead of a NameError.
    def get_cluster_stats_fast(grid, species):
        raise RuntimeError("models.numba_optimized is required for cluster analysis")

    def compute_all_pcfs_fast(grid, max_distance, n_bins):
        raise RuntimeError("models.numba_optimized is required for PCF computation")


# =============================================================================
# Utility Functions
# =============================================================================


def generate_unique_seed(params: dict, rep: int) -> int:
    """
    Create a deterministic seed from a dictionary of parameters and a repetition index.

    This function serializes the input dictionary into a sorted JSON string,
    appends the repetition count, and hashes the resulting string using SHA-256.
    The first 8 characters of the hex digest are then converted to an integer
    to provide a stable, unique seed for random number generators.

    Parameters
    ----------
    params : dict
        A dictionary of configuration parameters. Keys are sorted to ensure
        determinism regardless of insertion order.
    rep : int
        The repetition or iteration index, used to ensure different seeds
        are generated for the same parameter set across multiple runs.

    Returns
    -------
    int
        A unique integer seed derived from the input parameters.

    Examples
    --------
    >>> params = {'learning_rate': 0.01, 'batch_size': 32}
    >>> generate_unique_seed(params, 1)
    3432571217
    >>> generate_unique_seed(params, 2)
    3960013583
    """
    identifier = json.dumps(params, sort_keys=True) + f"_{rep}"
    return int(hashlib.sha256(identifier.encode()).hexdigest()[:8], 16)


def count_populations(grid: np.ndarray) -> Tuple[int, int, int]:
    """
    Count the number of empty, prey, and predator cells in the simulation grid.

    Parameters
    ----------
    grid : np.ndarray
        A 2D NumPy array representing the simulation environment, where:
        - 0: Empty cell
        - 1: Prey
        - 2: Predator

    Returns
    -------
    empty_count : int
        Total number of cells with a value of 0.
    prey_count : int
        Total number of cells with a value of 1.
    predator_count : int
        Total number of cells with a value of 2.

    Examples
    --------
    >>> grid = np.array([[0, 1], [2, 1]])
    >>> count_populations(grid)
    (1, 2, 1)
    """
    return int(np.sum(grid == 0)), int(np.sum(grid == 1)), int(np.sum(grid == 2))


def get_evolved_stats(model, param: str) -> Dict:
    """
    Get statistics of an evolved parameter from the model.

    This function retrieves parameter values from the model's internal storage,
    filters out NaN values, and calculates basic descriptive statistics.

    Parameters
    ----------
    model : object
        The simulation model instance containing a `cell_params` attribute
        with a `.get()` method.
    param : str
        The name of the parameter to calculate statistics for.

    Returns
    -------
    stats : dict
        A dictionary containing the following keys:
        - 'mean': Arithmetic mean of valid values.
        - 'std': Standard deviation of valid values.
        - 'min': Minimum valid value.
        - 'max': Maximum valid value.
        - 'n': Count of non-NaN values.
        If no valid data is found, all stats return NaN and n returns 0.

    Examples
    --------
    >>> stats = get_evolved_stats(my_model, "speed")
    >>> print(stats['mean'])
    1.25
    """
    arr = model.cell_params.get(param)
    if arr is None:
        return {"mean": np.nan, "std": np.nan, "min": np.nan, "max": np.nan, "n": 0}
    valid = arr[~np.isnan(arr)]
    if len(valid) == 0:
        return {"mean": np.nan, "std": np.nan, "min": np.nan, "max": np.nan, "n": 0}
    return {
        "mean": float(np.mean(valid)),
        "std": float(np.std(valid)),
        "min": float(np.min(valid)),
        "max": float(np.max(valid)),
        "n": len(valid),
    }


def average_pcfs(
    pcf_list: List[Tuple[np.ndarray, np.ndarray, int]],
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """
    Average multiple Pair Correlation Function (PCF) measurements and calculate standard error.

    Parameters
    ----------
    pcf_list : list of tuple
        A list where each element is a tuple containing:
        - distances (np.ndarray): The radial distances (r).
        - pcf_values (np.ndarray): The correlation values g(r).
        - count (int): Metadata or weight (not used in current calculation).

    Returns
    -------
    distances : np.ndarray
        The radial distances from the first entry in the list.
    pcf_mean : np.ndarray
        The element-wise mean of the PCF values across all measurements.
    pcf_se : np.ndarray
        The standard error of the mean for the PCF values.

    Examples
    --------
    >>> data = [(np.array([0, 1]), np.array([1.0, 2.0]), 10),
    ...         (np.array([0, 1]), np.array([1.2, 1.8]), 12)]
    >>> dist, mean, se = average_pcfs(data)
    >>> mean
    array([1.1, 1.9])
    """
    if len(pcf_list) == 0:
        return np.array([]), np.array([]), np.array([])

    distances = pcf_list[0][0]
    pcfs = np.array([p[1] for p in pcf_list])

    pcf_mean = np.mean(pcfs, axis=0)
    pcf_se = np.std(pcfs, axis=0) / np.sqrt(len(pcfs))

    return distances, pcf_mean, pcf_se


def save_results_jsonl(results: List[Dict], output_path: Path):
    """
    Save a list of dictionaries to a file in JSON Lines (JSONL) format.

    Each dictionary in the list is serialized into a single JSON string and
    written as a new line. Non-serializable objects are converted to strings
    using the default string representation.

    Parameters
    ----------
    results : list of dict
        The collection of result dictionaries to be saved.
    output_path : Path
        The file system path (pathlib.Path) where the JSONL file will be created.

    Returns
    -------
    None

    Notes
    -----
    The file is opened in 'w' (write) mode, which will overwrite any existing
    content at the specified path.

    Examples
    --------
    >>> data = [{"id": 1, "score": 0.95}, {"id": 2, "score": 0.88}]
    >>> save_results_jsonl(data, Path("results.jsonl"))
    """
    with open(output_path, "w", encoding="utf-8") as f:
        for result in results:
            f.write(json.dumps(result, default=str) + "\n")


def save_results_npz(results: List[Dict], output_path: Path):
    """
    Save simulation results to a compressed NumPy (.npz) binary file.

    This function flattens a list of result dictionaries into a single
    dictionary of NumPy arrays, prefixing keys with the run index to
    maintain data separation. The resulting file is compressed to
    reduce storage space.

    Parameters
    ----------
    results : list of dict
        A list where each dictionary contains key-value pairs of
        simulation data (e.g., arrays, lists, or scalars).
    output_path : Path
        The file system path (pathlib.Path) where the compressed
        NPZ file will be saved.

    Returns
    -------
    None

    Notes
    -----
    The keys in the saved file follow the format 'run_{index}_{original_key}'.
    Values are automatically converted to NumPy arrays if they are not
    already.

    Examples
    --------
    >>> results = [{"energy": [1, 2]}, {"energy": [3, 4]}]
    >>> save_results_npz(results, Path("output.npz"))
    """
    data = {}
    for i, res in enumerate(results):
        for key, val in res.items():
            data[f"run_{i}_{key}"] = np.array(val)
    np.savez_compressed(output_path, **data)


def load_results_jsonl(input_path: Path) -> List[Dict]:
    """
    Load simulation results from a JSON Lines (JSONL) formatted file.

    This function reads a file line-by-line, parsing each line as an
    independent JSON object and aggregating them into a list of dictionaries.

    Parameters
    ----------
    input_path : Path
        The file system path (pathlib.Path) to the JSONL file.

    Returns
    -------
    results : list of dict
        A list of dictionaries reconstructed from the file content.

    Raises
    ------
    FileNotFoundError
        If the specified input path does not exist.
    json.JSONDecodeError
        If a line in the file is not valid JSON.

    Examples
    --------
    >>> data = load_results_jsonl(Path("results.jsonl"))
    >>> len(data)
    2
    """
    results = []
    with open(input_path, "r", encoding="utf-8") as f:
        for line in f:
            results.append(json.loads(line.strip()))
    return results


# =============================================================================
# Simulation Functionality
# =============================================================================


def run_single_simulation(
    prey_birth: float,
    prey_death: float,
    predator_birth: float,
    predator_death: float,
    grid_size: int,
    seed: int,
    cfg: Config,
    with_evolution: bool = False,
    compute_pcf: Optional[bool] = None,
) -> Dict:
    """
    Run a single Predator-Prey (PP) simulation and collect comprehensive metrics.

    This function initializes a Cellular Automata model, executes a warmup phase
    to reach steady state, and then performs a measurement phase to track
    population dynamics, spatial clustering, and evolutionary changes.

    Parameters
    ----------
    prey_birth : float
        The probability or rate of prey reproduction.
    prey_death : float
        The base probability or rate of prey mortality.
    predator_birth : float
        The probability or rate of predator reproduction upon consuming prey.
    predator_death : float
        The probability or rate of predator mortality.
    grid_size : int
        The side length of the square simulation grid.
    seed : int
        Random seed for ensuring reproducibility of the simulation run.
    cfg : Config
        A configuration object containing simulation hyperparameters (densities,
        sampling rates, timing, etc.).
    with_evolution : bool, optional
        If True, enables the evolution of the 'prey_death' parameter within
        the model (default is False).
    compute_pcf : bool, optional
        Explicit toggle for Pair Correlation Function calculation. If None,
        PCFs are computed when `cfg.collect_pcf` is set, sampled at rate
        `cfg.pcf_sample_rate` (default is None).

    Returns
    -------
    result : dict
        A dictionary containing simulation results including:
        - Input parameters and survival flags.
        - Population mean and standard deviation for both species.
        - Cluster statistics (number of clusters, sizes, largest fractions).
        - Evolutionary statistics (mean, std, min, max, and final values).
        - PCF data and spatial indices (segregation and clustering).
        - Optional time series for populations and evolved parameters.

    Notes
    -----
    The function relies on several external utilities: `count_populations`,
    `get_evolved_stats`, `get_cluster_stats_fast`, `compute_all_pcfs_fast`,
    and `average_pcfs`.
    """

    from models.CA import PP

    if USE_NUMBA:
        set_numba_seed(seed)

    if compute_pcf is None:
        compute_pcf = cfg.collect_pcf and (np.random.random() < cfg.pcf_sample_rate)

    # Initialize model
    model = PP(
        rows=grid_size,
        cols=grid_size,
        densities=cfg.densities,
        neighborhood="moore",  # NOTE: Default neighborhood
        params={
            "prey_birth": prey_birth,
            "prey_death": prey_death,
            "predator_death": predator_death,
            "predator_birth": predator_birth,
        },
        seed=seed,
        directed_hunting=cfg.directed_hunting,
    )

    if with_evolution:
        model.evolve(
            "prey_death",
            sd=cfg.evolve_sd,
            min_val=cfg.evolve_min,
            max_val=cfg.evolve_max,
        )

    # Scale timing with grid size
    warmup_steps = cfg.get_warmup_steps(grid_size)
    measurement_steps = cfg.get_measurement_steps(grid_size)

    # Warmup phase
    for _ in range(warmup_steps):
        model.update()

    # Measurement phase: start collecting metrics
    prey_pops, pred_pops = [], []  # Prey populations and predator populations
    evolved_means, evolved_stds = [], []  # Evolution stats over time
    cluster_sizes_prey, cluster_sizes_pred = [], []  # Cluster sizes
    largest_fractions_prey, largest_fractions_pred = (
        [],
        [],
    )  # Largest cluster fractions = size of largest cluster / total population
    pcf_samples = {"prey_prey": [], "pred_pred": [], "prey_pred": []}

    # Determine minimum count for analysis
    min_count = int(cfg.min_density_for_analysis * (grid_size**2))

    for step in range(measurement_steps):
        model.update()

        _, prey, pred = count_populations(model.grid)
        prey_pops.append(prey)
        pred_pops.append(pred)

        # Track evolution
        if with_evolution:
            stats = get_evolved_stats(model, "prey_death")
            evolved_means.append(stats["mean"])
            evolved_stds.append(stats["std"])

        # Cluster analysis (at end of measurement)
        if step == measurement_steps - 1:
            prey_survived = prey_pops[-1] > min_count
            pred_survived = pred_pops[-1] > (min_count // 4)

            if prey_survived:
                prey_stats = get_cluster_stats_fast(model.grid, 1)
                cluster_sizes_prey = prey_stats["sizes"].tolist()
                largest_fractions_prey.append(prey_stats["largest_fraction"])

            if pred_survived:
                pred_stats = get_cluster_stats_fast(model.grid, 2)
                cluster_sizes_pred = pred_stats["sizes"].tolist()
                largest_fractions_pred.append(pred_stats["largest_fraction"])

            # PCF requires both
            if compute_pcf and prey_survived and pred_survived:
                max_dist = min(grid_size / 2, cfg.pcf_max_distance)
                pcf_data = compute_all_pcfs_fast(model.grid, max_dist, cfg.pcf_n_bins)
                pcf_samples["prey_prey"].append(pcf_data["prey_prey"])
                pcf_samples["pred_pred"].append(pcf_data["pred_pred"])
                pcf_samples["prey_pred"].append(pcf_data["prey_pred"])

    # Compile results
    result = {
        # Parameters
        "prey_birth": prey_birth,
        "prey_death": prey_death,
        "predator_birth": predator_birth,
        "predator_death": predator_death,
        "grid_size": grid_size,
        "with_evolution": with_evolution,
        "seed": seed,
        # Population dynamics
        "prey_mean": float(np.mean(prey_pops)),
        "prey_std": float(np.std(prey_pops)),
        "pred_mean": float(np.mean(pred_pops)),
        "pred_std": float(np.std(pred_pops)),
        "prey_survived": prey_pops[-1] > min_count,
        "pred_survived": pred_pops[-1] > (min_count // 4),
        # Cluster statistics
        "prey_n_clusters": len(cluster_sizes_prey),
        "pred_n_clusters": len(cluster_sizes_pred),
        "prey_cluster_sizes": cluster_sizes_prey,
        "pred_cluster_sizes": cluster_sizes_pred,
        # Order parameters
        "prey_largest_fraction": (
            float(np.mean(largest_fractions_prey)) if largest_fractions_prey else np.nan
        ),
        "pred_largest_fraction": (
            float(np.mean(largest_fractions_pred)) if largest_fractions_pred else np.nan
        ),
    }

    # Time series (if requested)
    if cfg.save_timeseries:
        subsample = cfg.timeseries_subsample
        result["prey_timeseries"] = prey_pops[
            ::subsample
        ]  # NOTE: Sample temporal data every 'subsample' steps
        result["pred_timeseries"] = pred_pops[::subsample]

    # Evolution statistics
    if with_evolution and evolved_means:
        valid_means = [v for v in evolved_means if not np.isnan(v)]
        result["evolved_prey_death_mean"] = (
            float(np.mean(valid_means)) if valid_means else np.nan
        )
        result["evolved_prey_death_std"] = (
            float(np.mean([v for v in evolved_stds if not np.isnan(v)]))
            if evolved_stds
            else np.nan
        )
        result["evolved_prey_death_final"] = valid_means[-1] if valid_means else np.nan
        result["evolved_prey_death_min"] = (
            float(np.min(valid_means)) if valid_means else np.nan
        )
        result["evolved_prey_death_max"] = (
            float(np.max(valid_means)) if valid_means else np.nan
        )
        result["evolve_sd"] = cfg.evolve_sd

        if cfg.save_timeseries:
            result["evolved_prey_death_timeseries"] = evolved_means[
                :: cfg.timeseries_subsample
            ]

    # PCF statistics
    if pcf_samples["prey_prey"]:
        dist, pcf_rr, _ = average_pcfs(pcf_samples["prey_prey"])
        _, pcf_cc, _ = average_pcfs(pcf_samples["pred_pred"])
        _, pcf_cr, _ = average_pcfs(pcf_samples["prey_pred"])

        result["pcf_distances"] = dist.tolist()
        result["pcf_prey_prey"] = pcf_rr.tolist()
        result["pcf_pred_pred"] = pcf_cc.tolist()
        result["pcf_prey_pred"] = pcf_cr.tolist()

        # Short-range indices
        short_mask = dist < 3.0
        if np.any(short_mask):
            result["segregation_index"] = float(np.mean(pcf_cr[short_mask]))
            result["prey_clustering_index"] = float(np.mean(pcf_rr[short_mask]))
            result["pred_clustering_index"] = float(np.mean(pcf_cc[short_mask]))

    return result


# =============================================================================
# Experiment Phases
# =============================================================================


def run_phase1(cfg: Config, output_dir: Path, logger: logging.Logger) -> List[Dict]:
    """
    Execute Phase 1 of the simulation: a parameter sweep to identify critical points.

    This function performs a 1D sweep across varying prey mortality rates while
    keeping other parameters fixed. It utilizes parallel execution via joblib
    and saves results incrementally to a JSONL file to ensure data integrity
    during long-running batches.

    Parameters
    ----------
    cfg : Config
        Configuration object containing simulation hyperparameters, sweep
        ranges, and execution settings (n_jobs, grid_size, etc.).
    output_dir : Path
        Directory where result files (JSONL) and metadata (JSON) will be stored.
    logger : logging.Logger
        Logger instance for tracking simulation progress and recording
        operational metadata.

    Returns
    -------
    all_results : list of dict
        A list of dictionaries containing the metrics collected from every
        individual simulation run in the sweep.

    Notes
    -----
    The function performs the following steps:
    1. Pre-warms Numba kernels for performance.
    2. Generates a deterministic set of simulation jobs using unique seeds.
    3. Executes simulations in parallel using a generator for memory efficiency.
    4. Records metadata including a timestamp and a serialized snapshot of
       the configuration.
    """
    from joblib import Parallel, delayed

    warmup_numba_kernels(cfg.grid_size, directed_hunting=cfg.directed_hunting)

    prey_deaths = cfg.get_prey_deaths()

    # Build job list
    jobs = []
    # Sweep through prey_death only (prey_birth is fixed)
    for pd in prey_deaths:
        for rep in range(cfg.n_replicates):
            params = {"pd": pd}

            seed = generate_unique_seed(params, rep)
            jobs.append(
                (
                    cfg.prey_birth,
                    pd,
                    cfg.predator_birth,
                    cfg.predator_death,
                    cfg.grid_size,
                    seed,
                    cfg,
                    False,
                )
            )

    logger.info(f"Phase 1: {len(jobs):,} simulations")
    logger.info(
        f"  Grid: {cfg.n_prey_death} prey_death values × {cfg.n_replicates} reps (prey_birth={cfg.prey_birth})"
    )
    # Run with incremental saving
    output_jsonl = output_dir / "phase1_results.jsonl"
    all_results = []

    with open(output_jsonl, "w", encoding="utf-8") as f:
        executor = Parallel(n_jobs=cfg.n_jobs, return_as="generator")
        tasks = (delayed(run_single_simulation)(*job) for job in jobs)

        for result in tqdm(executor(tasks), total=len(jobs), desc="Phase 1"):
            f.write(json.dumps(result, default=str) + "\n")
            f.flush()
            all_results.append(result)

    # Save metadata
    meta = {
        "phase": 1,
        "description": "Parameter sweep for critical point",
        "n_sims": len(all_results),
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
        "config": asdict(cfg),
    }
    with open(output_dir / "phase1_metadata.json", "w") as f:
        json.dump(meta, f, indent=2, default=str)

    logger.info(f"Phase 1 complete. Results: {output_jsonl}")
    return all_results


def run_phase2(cfg: Config, output_dir: Path, logger: logging.Logger) -> List[Dict]:
    """
    Execute Phase 2 of the simulation: self-organization and criticality analysis.

    This phase tests the Self-Organized Criticality (SOC) hypothesis by
    initializing simulations at different points in the parameter space and
    observing whether evolutionary pressure drives the system toward a
    common critical point, regardless of initial prey mortality rates.

    Parameters
    ----------
    cfg : Config
        Configuration object containing simulation hyperparameters, evolution
        settings, and execution constraints.
    output_dir : Path
        Directory where result files (JSONL) and metadata (JSON) will be stored.
    logger : logging.Logger
        Logger instance for tracking progress and evolutionary convergence.

    Returns
    -------
    all_results : list of dict
        A list of dictionaries containing metrics from the evolutionary
        simulation runs.

    Notes
    -----
    The function captures:
    1. Convergence of 'prey_death' across multiple replicates.
    2. Final steady-state population distributions.
    3. Incremental saving of results to prevent data loss.
    """
    from joblib import Parallel, delayed

    warmup_numba_kernels(cfg.grid_size, directed_hunting=cfg.directed_hunting)

    # Fixed prey_birth value for this phase
    pb = 0.2
    # Vary initial prey_death
 759    initial_prey_deaths = np.linspace(
 760        cfg.prey_death_range[0], cfg.prey_death_range[1], cfg.n_prey_death
 761    )
 762
 763    jobs = []
 764    for initial_pd in initial_prey_deaths:
 765        for rep in range(cfg.n_replicates):
 766            params = {"pb": pb, "initial_pd": initial_pd, "phase": 2}
 767            seed = generate_unique_seed(params, rep)
 768            jobs.append(
 769                (
 770                    pb,
 771                    initial_pd,
 772                    cfg.predator_birth,
 773                    cfg.predator_death,
 774                    cfg.grid_size,
 775                    seed,
 776                    cfg,
 777                    True,
 778                )
 779            )
 780
 781    logger.info(f"Phase 2: {len(jobs):,} simulations")
 782    logger.info(f"  prey_birth value: {pb}")
 783    logger.info(f"  initial prey_death values: {len(initial_prey_deaths)}")
 784    logger.info(f"  Replicates: {cfg.n_replicates}")
 785
 786    output_jsonl = output_dir / "phase2_results.jsonl"
 787    all_results = []
 788
 789    with open(output_jsonl, "w", encoding="utf-8") as f:
 790        executor = Parallel(n_jobs=cfg.n_jobs, return_as="generator")
 791        tasks = (delayed(run_single_simulation)(*job) for job in jobs)
 792
 793        for result in tqdm(executor(tasks), total=len(jobs), desc="Phase 2"):
 794            f.write(json.dumps(result, default=str) + "\n")
 795            f.flush()
 796            all_results.append(result)
 797
 798    meta = {
 799        "phase": 2,
 800        "description": "Self-organization toward criticality",
 801        "n_sims": len(all_results),
 802        "initial_prey_deaths": initial_prey_deaths.tolist(),
 803        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
 804    }
 805    with open(output_dir / "phase2_metadata.json", "w") as f:
 806        json.dump(meta, f, indent=2, default=str)
 807
 808    logger.info(f"Phase 2 complete. Results: {output_jsonl}")
 809    return all_results
 810
 811
def run_phase3(cfg: Config, output_dir: Path, logger: logging.Logger) -> List[Dict]:
    """
    Phase 3: Finite-size scaling at the critical point.

    - Multiple grid sizes at (critical_prey_birth, critical_prey_death)
    - Analyze cluster size cutoffs vs L
    """
    from joblib import Parallel, delayed

    # NOTE: Tuned to the critical point found in Phase 1
    pb = cfg.critical_prey_birth
    pd = cfg.critical_prey_death

    logger.info(f"Phase 3: FSS at critical point (pb={pb}, pd={pd})")

    for L in cfg.grid_sizes:
        warmup_numba_kernels(L, directed_hunting=cfg.directed_hunting)

    jobs = []
    for L in cfg.grid_sizes:  # Sweep through grid sizes
        for rep in range(cfg.n_replicates):
            params = {"L": L, "phase": 3}
            seed = generate_unique_seed(params, rep)
            jobs.append(
                (pb, pd, cfg.predator_birth, cfg.predator_death, L, seed, cfg, False)
            )

    logger.info(f"  Grid sizes: {cfg.grid_sizes}")
    logger.info(f"  Total simulations: {len(jobs):,}")

    output_jsonl = output_dir / "phase3_results.jsonl"
    all_results = []

    with open(output_jsonl, "w", encoding="utf-8") as f:
        executor = Parallel(n_jobs=cfg.n_jobs, return_as="generator")
        tasks = (delayed(run_single_simulation)(*job) for job in jobs)

        for result in tqdm(executor(tasks), total=len(jobs), desc="Phase 3"):
            f.write(json.dumps(result, default=str) + "\n")
            f.flush()
            all_results.append(result)

    # Post-run metadata: postprocessing will fit cluster cutoffs vs L
    meta = {
        "phase": 3,
        "description": "Finite-size scaling",
        "critical_point": {"prey_birth": pb, "prey_death": pd},
        "grid_sizes": cfg.grid_sizes,
        "n_sims": len(all_results),
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
    }
    with open(output_dir / "phase3_metadata.json", "w") as f:
        json.dump(meta, f, indent=2, default=str)

    logger.info(f"Phase 3 complete. Results: {output_jsonl}")
    return all_results
 868
 869
def run_phase4(cfg: Config, output_dir: Path, logger: logging.Logger) -> List[Dict]:
    """
    Phase 4: Global sensitivity analysis via a full 4D parameter sweep.

    This phase varies all four demographic rates (prey birth/death and
    predator birth/death) on a regular grid, with undirected (random-walk)
    predators, to map how the Hydra effect and critical behavior depend on
    the parameter regime.

    Parameters
    ----------
    cfg : Config
        Configuration object containing the fixed grid size, replicate count,
        and parallel execution settings.
    output_dir : Path
        Directory where result files (JSONL) and sweep metadata (JSON) will be
        stored.
    logger : logging.Logger
        Logger instance for tracking progress through the sweep.

    Returns
    -------
    all_results : list of dict
        A list of dictionaries containing metrics for every simulation in the
        4D parameter grid.

    Notes
    -----
    The function performs the following:
    1. Builds a Cartesian product of 11 x 10 x 11 x 11 parameter combinations.
    2. Generates uniquely seeded parallel jobs for each combination and replicate.
    3. Saves results incrementally so partially completed sweeps remain usable.
    """
    from joblib import Parallel, delayed
    import itertools

    warmup_numba_kernels(cfg.grid_size, directed_hunting=cfg.directed_hunting)

    # Define sweep values
    prey_death_values = np.linspace(0.05, 0.95, 10)  # 10 values for prey_death
    other_param_values = np.linspace(0.0, 1.0, 11)  # 11 values for the rest

    # Logging
    logger.info("Phase 4: Full 4D Parameter Sweep")
    logger.info("  prey_death: 10 values from 0.05 to 0.95")
    logger.info("  prey_birth, pred_birth, pred_death: 11 values each from 0 to 1")
    logger.info(f"  Grid Size: {cfg.grid_size}")
    logger.info(f"  Replicates: {cfg.n_replicates}")

    # Build parameter grid
    param_grid = itertools.product(
        other_param_values,  # prey_birth (11 values)
        prey_death_values,  # prey_death (10 values)
        other_param_values,  # predator_birth (11 values)
        other_param_values,  # predator_death (11 values)
    )

    jobs = []

    for pb, pd, pred_b, pred_d in param_grid:
        for rep in range(cfg.n_replicates):
            params_id = {
                "pb": pb,
                "pd": pd,
                "pred_b": pred_b,
                "pred_d": pred_d,
                "rep": rep,
            }
            seed = generate_unique_seed(params_id, rep)

            jobs.append(
                (
                    pb,  # prey_birth
                    pd,  # prey_death
                    pred_b,  # predator_birth
                    pred_d,  # predator_death
                    cfg.grid_size,
                    seed,
                    cfg,
                    False,
                )
            )

    logger.info(
        f"  Total simulations: {len(jobs):,}"
    )  # 11 * 10 * 11 * 11 * n_reps = 13,310 * n_reps

    output_jsonl = output_dir / "phase4_results.jsonl"
    all_results = []

    with open(output_jsonl, "w", encoding="utf-8") as f:
        executor = Parallel(n_jobs=cfg.n_jobs, return_as="generator")
        tasks = (delayed(run_single_simulation)(*job) for job in jobs)

        for result in tqdm(executor(tasks), total=len(jobs), desc="Phase 4 (4D Sweep)"):
            f.write(json.dumps(result, default=str) + "\n")
            f.flush()
            all_results.append(result)

    # Save Metadata
    meta = {
        "phase": 4,
        "description": "Global 4D Sensitivity Analysis",
        "prey_death_values": prey_death_values.tolist(),
        "other_param_values": other_param_values.tolist(),
        "parameters_varied": [
            "prey_birth",
            "prey_death",
            "predator_birth",
            "predator_death",
        ],
        "n_sims": len(all_results),
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
        "config": asdict(cfg),
    }
    with open(output_dir / "phase4_metadata.json", "w") as f:
        json.dump(meta, f, indent=2, default=str)

    logger.info(f"Phase 4 complete. Results: {output_jsonl}")
    return all_results
 991
 992
def run_phase5(cfg: Config, output_dir: Path, logger: logging.Logger) -> List[Dict]:
    """
    Execute Phase 5 of the simulation: Global 4D parameter sweep with directed hunting.

    This phase performs a comprehensive sensitivity analysis by varying four key
    parameters (prey birth/death and predator birth/death) while directed
    hunting is enabled. The results allow for a direct comparison with Phase 4
    to determine how predator search behavior shifts the system's critical
    thresholds and stability.

    Parameters
    ----------
    cfg : Config
        Configuration object containing simulation hyperparameters, parallel
        execution settings, and the fixed grid size for this phase.
    output_dir : Path
        Directory where the result JSONL file and execution metadata will
        be stored.
    logger : logging.Logger
        Logger instance for tracking the progress of the high-volume
        simulation batch.

    Returns
    -------
    all_results : list of dict
        A list of dictionaries containing metrics for every simulation in
        the 4D parameter grid.

    Notes
    -----
    The function utilizes a Cartesian product of parameter ranges to build a
    job list of over 13,000 unique parameter sets (multiplied by replicates).
    Seeds are uniquely generated to distinguish these runs from other phases
    even if parameter values overlap.
    """
    from joblib import Parallel, delayed
    import itertools

    warmup_numba_kernels(cfg.grid_size, directed_hunting=cfg.directed_hunting)

    # Define sweep values (same as Phase 4)
    prey_death_values = np.linspace(0.05, 0.95, 10)  # 10 values for prey_death
    other_param_values = np.linspace(0.0, 1.0, 11)  # 11 values for the rest

    # Logging
    logger.info("Phase 5: Full 4D Parameter Sweep (Directed Hunting)")
    logger.info("  prey_death: 10 values from 0.05 to 0.95")
    logger.info("  prey_birth, pred_birth, pred_death: 11 values each from 0 to 1")
    logger.info(f"  Grid Size: {cfg.grid_size}")
    logger.info(f"  Replicates: {cfg.n_replicates}")
    logger.info(f"  Directed Hunting: {cfg.directed_hunting}")

    # Build parameter grid
    param_grid = itertools.product(
        other_param_values,  # prey_birth (11 values)
        prey_death_values,  # prey_death (10 values)
        other_param_values,  # predator_birth (11 values)
        other_param_values,  # predator_death (11 values)
    )

    jobs = []

    for pb, pd, pred_b, pred_d in param_grid:
        for rep in range(cfg.n_replicates):
            # Include phase identifier to ensure different seeds from Phase 4
            params_id = {
                "pb": pb,
                "pd": pd,
                "pred_b": pred_b,
                "pred_d": pred_d,
                "phase": 5,
                "rep": rep,
            }
            seed = generate_unique_seed(params_id, rep)

            jobs.append(
                (
                    pb,  # prey_birth
                    pd,  # prey_death
                    pred_b,  # predator_birth
                    pred_d,  # predator_death
                    cfg.grid_size,
                    seed,
                    cfg,
                    False,
                )
            )

    logger.info(
        f"  Total simulations: {len(jobs):,}"
    )  # 11 * 10 * 11 * 11 * n_reps = 13,310 * n_reps

    output_jsonl = output_dir / "phase5_results.jsonl"
    all_results = []

    with open(output_jsonl, "w", encoding="utf-8") as f:
        executor = Parallel(n_jobs=cfg.n_jobs, return_as="generator")
        tasks = (delayed(run_single_simulation)(*job) for job in jobs)

        for result in tqdm(
            executor(tasks), total=len(jobs), desc="Phase 5 (4D Sweep + Directed)"
        ):
            f.write(json.dumps(result, default=str) + "\n")
            f.flush()
            all_results.append(result)

    # Save Metadata
    meta = {
        "phase": 5,
        "description": "Global 4D Sensitivity Analysis with Directed Hunting",
        "prey_death_values": prey_death_values.tolist(),
        "other_param_values": other_param_values.tolist(),
        "parameters_varied": [
            "prey_birth",
            "prey_death",
            "predator_birth",
            "predator_death",
        ],
        "directed_hunting": cfg.directed_hunting,
        "n_sims": len(all_results),
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
        "config": asdict(cfg),
    }
    with open(output_dir / "phase5_metadata.json", "w") as f:
        json.dump(meta, f, indent=2, default=str)

    logger.info(f"Phase 5 complete. Results: {output_jsonl}")
    return all_results
1121
1122
# =============================================================================
# Main
# =============================================================================

PHASE_RUNNERS = {
    1: run_phase1,
    2: run_phase2,
    3: run_phase3,
    4: run_phase4,
    5: run_phase5,
}
1134
1135
def main():
    """
    Orchestrate the predator-prey experimental suite across multiple phases.

    This entry point handles command-line arguments, sets up logging and output
    directories, and executes the requested simulation phases (1-5). It
    supports parallel execution, dry runs for runtime estimation, and
    automated configuration persistence.

    Notes
    -----
    The script dynamically retrieves phase-specific configurations using
    `get_phase_config` and dispatches execution to the corresponding runner
    in the `PHASE_RUNNERS` mapping.
    """
    parser = argparse.ArgumentParser(
        description="Predator-Prey Hydra Effect Experiments",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Phases:
  1  Parameter sweep to find critical point
  2  Self-organization (evolution toward criticality)
  3  Finite-size scaling at critical point
  4  Sensitivity analysis across parameter regimes
  5  Model extensions (directed hunting comparison)
        """,
    )
    parser.add_argument(
        "--phase", type=str, required=True, help="Phase to run: 1-5 or 'all'"
    )
    parser.add_argument(
        "--output",
        type=Path,
        default=Path("results"),
        help="Output directory (default: results)",
    )
    parser.add_argument(
        "--cores", type=int, default=-1, help="Number of cores (-1 for all)"
    )
    parser.add_argument(
        "--dry-run", action="store_true", help="Estimate runtime without running"
    )
    args = parser.parse_args()

    # Parse phase argument
    if args.phase.lower() == "all":
        phases = list(PHASE_RUNNERS.keys())
    else:
        try:
            phases = [int(args.phase)]
        except ValueError:
            print(f"Invalid phase: {args.phase}. Use 1-5 or 'all'")
            sys.exit(1)
        if phases[0] not in PHASE_RUNNERS:
            print(f"Invalid phase: {args.phase}. Use 1-5 or 'all'")
            sys.exit(1)
1189
    # Setup output directory
    args.output.mkdir(parents=True, exist_ok=True)

    # Setup logging
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s [%(levelname)s] %(message)s",
        handlers=[
            logging.FileHandler(args.output / "experiments.log"),
            logging.StreamHandler(),
        ],
    )
    logger = logging.getLogger(__name__)

    # Header
    logger.info("=" * 60)
    logger.info("PREDATOR-PREY HYDRA EFFECT EXPERIMENTS")
    logger.info("=" * 60)
    logger.info(f"Phases: {phases}")
    logger.info(f"Output: {args.output}")
    logger.info(f"Cores: {args.cores}")
    logger.info(f"Numba: {'ENABLED' if USE_NUMBA else 'DISABLED'}")

    # Process each phase
    for phase in phases:
        cfg = get_phase_config(phase)
        cfg.n_jobs = (
            args.cores
            if args.cores > 0
            else int(os.environ.get("SLURM_CPUS_PER_TASK", -1))
        )

        logger.info("")
        logger.info(f"{'='*60}")
        logger.info(f"PHASE {phase}")
        logger.info(f"{'='*60}")

        n_cores = cfg.n_jobs if cfg.n_jobs > 0 else os.cpu_count()
        logger.info(f"Estimated: {cfg.estimate_runtime(n_cores)}")

        if args.dry_run:
            logger.info("Dry run - skipping execution")
            continue

        # Save config
        with open(args.output / f"phase{phase}_config.json", "w") as f:
            json.dump(asdict(cfg), f, indent=2, default=str)

        # Run phase
        start_time = time.time()
        runner = PHASE_RUNNERS[phase]
        runner(cfg, args.output, logger)
        elapsed = time.time() - start_time

        logger.info(f"Phase {phase} runtime: {elapsed/60:.1f} minutes")

    logger.info("")
    logger.info("=" * 60)
    logger.info("EXPERIMENTS COMPLETE")
    logger.info("=" * 60)


if __name__ == "__main__":
    main()
def generate_unique_seed(params: dict, rep: int) -> int:
    """
    Create a deterministic seed from a dictionary of parameters and a repetition index.

    This function serializes the input dictionary into a sorted JSON string,
    appends the repetition count, and hashes the resulting string using SHA-256.
    The first 8 characters of the hex digest are then converted to an integer
    to provide a stable, unique seed for random number generators.

    Parameters
    ----------
    params : dict
        A dictionary of configuration parameters. Keys are sorted to ensure
        determinism regardless of insertion order.
    rep : int
        The repetition or iteration index, used to ensure different seeds
        are generated for the same parameter set across multiple runs.

    Returns
    -------
    int
        A unique integer seed derived from the input parameters.

    Examples
    --------
    >>> params = {'learning_rate': 0.01, 'batch_size': 32}
    >>> generate_unique_seed(params, 1)
    3432571217
    >>> generate_unique_seed(params, 2)
    3960013583
    """
    identifier = json.dumps(params, sort_keys=True) + f"_{rep}"
    return int(hashlib.sha256(identifier.encode()).hexdigest()[:8], 16)
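The seed derivation is pure standard library, so it can be sanity-checked in isolation. The sketch below re-inlines the function body from the listing above and verifies determinism, replicate separation, and key-order independence (the parameter values are illustrative):

```python
import hashlib
import json


def generate_unique_seed(params: dict, rep: int) -> int:
    """Deterministic seed: sorted-JSON params + replicate index, SHA-256 truncated."""
    identifier = json.dumps(params, sort_keys=True) + f"_{rep}"
    return int(hashlib.sha256(identifier.encode()).hexdigest()[:8], 16)


params = {"pb": 0.2, "initial_pd": 0.05, "phase": 2}
s0 = generate_unique_seed(params, 0)
s1 = generate_unique_seed(params, 1)

assert s0 == generate_unique_seed(params, 0)  # stable across calls and processes
assert s0 != s1                               # replicates diverge
assert 0 <= s0 < 2**32                        # 8 hex digits fit a 32-bit RNG seed
# Key insertion order does not matter, because keys are sorted before hashing:
assert s0 == generate_unique_seed({"phase": 2, "initial_pd": 0.05, "pb": 0.2}, 0)
```

Truncating to 8 hex digits keeps the seed within the 32-bit range that most RNG APIs accept.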

def count_populations(grid: np.ndarray) -> Tuple[int, int, int]:
    """
    Count the number of empty, prey, and predator cells in the simulation grid.

    Parameters
    ----------
    grid : np.ndarray
        A 2D NumPy array representing the simulation environment, where:
        - 0: Empty cell
        - 1: Prey
        - 2: Predator

    Returns
    -------
    empty_count : int
        Total number of cells with a value of 0.
    prey_count : int
        Total number of cells with a value of 1.
    predator_count : int
        Total number of cells with a value of 2.

    Examples
    --------
    >>> grid = np.array([[0, 1], [2, 1]])
    >>> count_populations(grid)
    (1, 2, 1)
    """
    return int(np.sum(grid == 0)), int(np.sum(grid == 1)), int(np.sum(grid == 2))
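Because every cell holds exactly one of the three states, the three counts always partition the grid. A minimal standalone check, re-inlining the one-liner from the listing above:

```python
import numpy as np


def count_populations(grid: np.ndarray):
    # Cell encoding: 0 = empty, 1 = prey, 2 = predator (as in the listing above)
    return int(np.sum(grid == 0)), int(np.sum(grid == 1)), int(np.sum(grid == 2))


grid = np.array([[0, 1], [2, 1]])
empty, prey, predators = count_populations(grid)

assert (empty, prey, predators) == (1, 2, 1)
assert empty + prey + predators == grid.size  # the three states partition the grid
```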

def get_evolved_stats(model, param: str) -> Dict:
    """
    Get statistics of an evolved parameter from the model.

    This function retrieves parameter values from the model's internal storage,
    filters out NaN values, and calculates basic descriptive statistics.

    Parameters
    ----------
    model : object
        The simulation model instance containing a `cell_params` attribute
        with a `.get()` method.
    param : str
        The name of the parameter to calculate statistics for.

    Returns
    -------
    stats : dict
        A dictionary containing the following keys:
        - 'mean': Arithmetic mean of valid values.
        - 'std': Standard deviation of valid values.
        - 'min': Minimum valid value.
        - 'max': Maximum valid value.
        - 'n': Count of non-NaN values.
        If no valid data is found, all stats return NaN and n returns 0.

    Examples
    --------
    >>> stats = get_evolved_stats(my_model, "speed")
    >>> print(stats['mean'])
    1.25
    """
    arr = model.cell_params.get(param)
    if arr is None:
        return {"mean": np.nan, "std": np.nan, "min": np.nan, "max": np.nan, "n": 0}
    valid = arr[~np.isnan(arr)]
    if len(valid) == 0:
        return {"mean": np.nan, "std": np.nan, "min": np.nan, "max": np.nan, "n": 0}
    return {
        "mean": float(np.mean(valid)),
        "std": float(np.std(valid)),
        "min": float(np.min(valid)),
        "max": float(np.max(valid)),
        "n": len(valid),
    }

def average_pcfs(
    pcf_list: List[Tuple[np.ndarray, np.ndarray, int]],
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """
    Average multiple Pair Correlation Function (PCF) measurements and calculate standard error.

    Parameters
    ----------
    pcf_list : list of tuple
        A list where each element is a tuple containing:
        - distances (np.ndarray): The radial distances (r).
        - pcf_values (np.ndarray): The correlation values g(r).
        - count (int): Metadata or weight (not used in current calculation).

    Returns
    -------
    distances : np.ndarray
        The radial distances from the first entry in the list.
    pcf_mean : np.ndarray
        The element-wise mean of the PCF values across all measurements.
    pcf_se : np.ndarray
        The standard error of the mean for the PCF values.

    Examples
    --------
    >>> data = [(np.array([0, 1]), np.array([1.0, 2.0]), 10),
    ...         (np.array([0, 1]), np.array([1.2, 1.8]), 12)]
    >>> dist, mean, se = average_pcfs(data)
    >>> mean
    array([1.1, 1.9])
    """
    if len(pcf_list) == 0:
        return np.array([]), np.array([]), np.array([])

    distances = pcf_list[0][0]
    pcfs = np.array([p[1] for p in pcf_list])

    pcf_mean = np.mean(pcfs, axis=0)
    pcf_se = np.std(pcfs, axis=0) / np.sqrt(len(pcfs))

    return distances, pcf_mean, pcf_se
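The averaging reduces to an element-wise mean and standard error over the stacked g(r) arrays. The docstring example can be verified numerically with the function re-inlined from the listing above:

```python
import numpy as np


def average_pcfs(pcf_list):
    # Each entry is (distances, g(r), count); count is carried but unused, as noted above
    if len(pcf_list) == 0:
        return np.array([]), np.array([]), np.array([])
    distances = pcf_list[0][0]
    pcfs = np.array([p[1] for p in pcf_list])
    return distances, np.mean(pcfs, axis=0), np.std(pcfs, axis=0) / np.sqrt(len(pcfs))


data = [
    (np.array([0, 1]), np.array([1.0, 2.0]), 10),
    (np.array([0, 1]), np.array([1.2, 1.8]), 12),
]
dist, mean, se = average_pcfs(data)

assert np.allclose(mean, [1.1, 1.9])
assert np.allclose(se, 0.1 / np.sqrt(2))  # population std is 0.1 in both radial bins
```

Note that `np.std` defaults to the population standard deviation (`ddof=0`), so the reported standard error is slightly smaller than the sample-based estimate for few replicates.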

def save_results_jsonl(results: List[Dict], output_path: Path):
    """
    Save a list of dictionaries to a file in JSON Lines (JSONL) format.

    Each dictionary in the list is serialized into a single JSON string and
    written as a new line. Non-serializable objects are converted to strings
    using the default string representation.

    Parameters
    ----------
    results : list of dict
        The collection of result dictionaries to be saved.
    output_path : Path
        The file system path (pathlib.Path) where the JSONL file will be created.

    Returns
    -------
    None

    Notes
    -----
    The file is opened in 'w' (write) mode, which will overwrite any existing
    content at the specified path.

    Examples
    --------
    >>> data = [{"id": 1, "score": 0.95}, {"id": 2, "score": 0.88}]
    >>> save_results_jsonl(data, Path("results.jsonl"))
    """
    with open(output_path, "w", encoding="utf-8") as f:
        for result in results:
            f.write(json.dumps(result, default=str) + "\n")

def save_results_npz(results: List[Dict], output_path: Path):
    """
    Save simulation results to a compressed NumPy (.npz) binary file.

    This function flattens a list of result dictionaries into a single
    dictionary of NumPy arrays, prefixing keys with the run index to
    maintain data separation. The resulting file is compressed to
    reduce storage space.

    Parameters
    ----------
    results : list of dict
        A list where each dictionary contains key-value pairs of
        simulation data (e.g., arrays, lists, or scalars).
    output_path : Path
        The file system path (pathlib.Path) where the compressed
        NPZ file will be saved.

    Returns
    -------
    None

    Notes
    -----
    The keys in the saved file follow the format 'run_{index}_{original_key}'.
    Values are automatically converted to NumPy arrays if they are not
    already.

    Examples
    --------
    >>> results = [{"energy": [1, 2]}, {"energy": [3, 4]}]
    >>> save_results_npz(results, Path("output.npz"))
    """
    data = {}
    for i, res in enumerate(results):
        for key, val in res.items():
            data[f"run_{i}_{key}"] = np.array(val)
    np.savez_compressed(output_path, **data)

def load_results_jsonl(input_path: Path) -> List[Dict]:
    """
    Load simulation results from a JSON Lines (JSONL) formatted file.

    This function reads a file line-by-line, parsing each line as an
    independent JSON object and aggregating them into a list of dictionaries.

    Parameters
    ----------
    input_path : Path
        The file system path (pathlib.Path) to the JSONL file.

    Returns
    -------
    results : list of dict
        A list of dictionaries reconstructed from the file content.

    Raises
    ------
    FileNotFoundError
        If the specified input path does not exist.
    json.JSONDecodeError
        If a line in the file is not valid JSON.

    Examples
    --------
    >>> data = load_results_jsonl(Path("results.jsonl"))
    >>> len(data)
    2
    """
    results = []
    with open(input_path, "r", encoding="utf-8") as f:
        for line in f:
            results.append(json.loads(line.strip()))
    return results
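For JSON-native values, `save_results_jsonl` and `load_results_jsonl` are inverses of each other. A roundtrip check against a temporary file, with both helpers re-inlined from the listings above:

```python
import json
import tempfile
from pathlib import Path


def save_results_jsonl(results, output_path: Path):
    # One JSON object per line; non-serializable values fall back to str()
    with open(output_path, "w", encoding="utf-8") as f:
        for result in results:
            f.write(json.dumps(result, default=str) + "\n")


def load_results_jsonl(input_path: Path):
    # Parse each line independently and aggregate into a list
    with open(input_path, "r", encoding="utf-8") as f:
        return [json.loads(line.strip()) for line in f]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "results.jsonl"
    records = [{"id": 1, "score": 0.95}, {"id": 2, "score": 0.88}]
    save_results_jsonl(records, path)
    roundtrip = load_results_jsonl(path)

assert roundtrip == records  # lossless for JSON-native types
```

The roundtrip is only lossless for JSON-native types; anything serialized via `default=str` (e.g. `Path` or `numpy` scalars) comes back as a plain string.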

def run_single_simulation( prey_birth: float, prey_death: float, predator_birth: float, predator_death: float, grid_size: int, seed: int, cfg: models.config.Config, with_evolution: bool = False, compute_pcf: Optional[bool] = None) -> Dict:
388def run_single_simulation(
389    prey_birth: float,
390    prey_death: float,
391    predator_birth: float,
392    predator_death: float,
393    grid_size: int,
394    seed: int,
395    cfg: Config,
396    with_evolution: bool = False,
397    compute_pcf: Optional[bool] = None,
398) -> Dict:
399    """
400    Run a single Predator-Prey (PP) simulation and collect comprehensive metrics.
401
402    This function initializes a Cellular Automata model, executes a warmup phase
403    to reach steady state, and then performs a measurement phase to track
404    population dynamics, spatial clustering, and evolutionary changes.
405
406    Parameters
407    ----------
408    prey_birth : float
409        The probability or rate of prey reproduction.
410    prey_death : float
411        The base probability or rate of prey mortality.
412    predator_birth : float
413        The probability or rate of predator reproduction upon consuming prey.
414    predator_death : float
415        The probability or rate of predator mortality.
416    grid_size : int
417        The side length of the square simulation grid.
418    seed : int
419        Random seed for ensuring reproducibility of the simulation run.
420    cfg : Config
421        A configuration object containing simulation hyperparameters (densities,
422        sampling rates, timing, etc.).
423    with_evolution : bool, optional
424        If True, enables the evolution of the 'prey_death' parameter within
425        the model (default is False).
426    compute_pcf : bool, optional
427        Explicit toggle for Pair Correlation Function calculation. If None,
428        it is determined by `cfg.pcf_sample_rate` (default is None).
429
430    Returns
431    -------
432    result : dict
433        A dictionary containing simulation results including:
434        - Input parameters and survival flags.
435        - Population mean and standard deviation for both species.
436        - Cluster statistics (number of clusters, sizes, largest fractions).
437        - Evolutionary statistics (mean, std, min, max, and final values).
438        - PCF data and spatial indices (segregation and clustering).
439        - Optional time series for populations and evolved parameters.
440
441    Notes
442    -----
443    The function relies on several external utilities: `count_populations`,
444    `get_evolved_stats`, `get_cluster_stats_fast`, `compute_all_pcfs_fast`,
445    and `average_pcfs`.
446    """
447
448    from models.CA import PP
449
450    if USE_NUMBA:
451        set_numba_seed(seed)
452
453    if compute_pcf is None:
454        compute_pcf = cfg.collect_pcf and (np.random.random() < cfg.pcf_sample_rate)
455
456    # Initialize model
457    model = PP(
458        rows=grid_size,
459        cols=grid_size,
460        densities=cfg.densities,
461        neighborhood="moore",  # NOTE: Default neighborhood
462        params={
463            "prey_birth": prey_birth,
464            "prey_death": prey_death,
465            "predator_death": predator_death,
466            "predator_birth": predator_birth,
467        },
468        seed=seed,
469        directed_hunting=cfg.directed_hunting,
470    )
471
472    if with_evolution:
473        model.evolve(
474            "prey_death",
475            sd=cfg.evolve_sd,
476            min_val=cfg.evolve_min,
477            max_val=cfg.evolve_max,
478        )
479
480    # Scale timing with grid size
481    warmup_steps = cfg.get_warmup_steps(grid_size)
482    measurement_steps = cfg.get_measurement_steps(grid_size)
483
484    # Warmup phase
485    for _ in range(warmup_steps):
486        model.update()
487
488    # Measurement phase: start collecting our metrics
489    prey_pops, pred_pops = [], []  # Prey populations and predator populations
490    evolved_means, evolved_stds = [], []  # Evolution stats over time
491    cluster_sizes_prey, cluster_sizes_pred = [], []  # Cluster sizes
492    largest_fractions_prey, largest_fractions_pred = (
493        [],
494        [],
495    )  # Largest cluster fractions = size of largest cluster / total population
496    pcf_samples = {"prey_prey": [], "pred_pred": [], "prey_pred": []}
497
498    # Determine minimum count for analysis
499    min_count = int(cfg.min_density_for_analysis * (grid_size**2))
500
501    for step in range(measurement_steps):
502        model.update()
503
504        _, prey, pred = count_populations(model.grid)
505        prey_pops.append(prey)
506        pred_pops.append(pred)
507
508        # Track evolution
509        if with_evolution:
510            stats = get_evolved_stats(model, "prey_death")
511            evolved_means.append(stats["mean"])
512            evolved_stds.append(stats["std"])
513
514        # Cluster analysis (at end of measurement)
515        if step == measurement_steps - 1:
516            prey_survived = prey_pops[-1] > min_count
517            pred_survived = pred_pops[-1] > (min_count // 4)
518
519            if prey_survived:
520                prey_stats = get_cluster_stats_fast(model.grid, 1)
521                cluster_sizes_prey = prey_stats["sizes"].tolist()
522                largest_fractions_prey.append(prey_stats["largest_fraction"])
523
524            if pred_survived:
525                pred_stats = get_cluster_stats_fast(model.grid, 2)
526                cluster_sizes_pred = pred_stats["sizes"].tolist()
527                largest_fractions_pred.append(pred_stats["largest_fraction"])
528
529            # PCF requires both
530            if compute_pcf and prey_survived and pred_survived:
531                max_dist = min(grid_size / 2, cfg.pcf_max_distance)
532                pcf_data = compute_all_pcfs_fast(model.grid, max_dist, cfg.pcf_n_bins)
533                pcf_samples["prey_prey"].append(pcf_data["prey_prey"])
534                pcf_samples["pred_pred"].append(pcf_data["pred_pred"])
535                pcf_samples["prey_pred"].append(pcf_data["prey_pred"])
536
537    # Compile results
538    result = {
539        # Parameters
540        "prey_birth": prey_birth,
541        "prey_death": prey_death,
542        "predator_birth": predator_birth,
543        "predator_death": predator_death,
544        "grid_size": grid_size,
545        "with_evolution": with_evolution,
546        "seed": seed,
547        # Population dynamics
548        "prey_mean": float(np.mean(prey_pops)),
549        "prey_std": float(np.std(prey_pops)),
550        "pred_mean": float(np.mean(pred_pops)),
551        "pred_std": float(np.std(pred_pops)),
552        "prey_survived": prey_pops[-1] > min_count,
553        "pred_survived": pred_pops[-1] > (min_count // 4),
554        # Cluster statistics
555        "prey_n_clusters": len(cluster_sizes_prey),
556        "pred_n_clusters": len(cluster_sizes_pred),
557        "prey_cluster_sizes": cluster_sizes_prey,
558        "pred_cluster_sizes": cluster_sizes_pred,
559        # Order parameters
560        "prey_largest_fraction": (
561            float(np.mean(largest_fractions_prey)) if largest_fractions_prey else np.nan
562        ),
563        "pred_largest_fraction": (
564            float(np.mean(largest_fractions_pred)) if largest_fractions_pred else np.nan
565        ),
566    }
567
568    # Time series (if requested)
569    if cfg.save_timeseries:
570        subsample = cfg.timeseries_subsample
571        result["prey_timeseries"] = prey_pops[
572            ::subsample
573        ]  # NOTE: Sample temporal data every 'subsample' steps
574        result["pred_timeseries"] = pred_pops[::subsample]
575
576    # Evolution statistics
577    if with_evolution and evolved_means:
578        valid_means = [v for v in evolved_means if not np.isnan(v)]
579        result["evolved_prey_death_mean"] = (
580            float(np.mean(valid_means)) if valid_means else np.nan
581        )
582        result["evolved_prey_death_std"] = (
583            float(np.mean([v for v in evolved_stds if not np.isnan(v)]))
584            if evolved_stds
585            else np.nan
586        )
587        result["evolved_prey_death_final"] = valid_means[-1] if valid_means else np.nan
588        result["evolved_prey_death_min"] = (
589            float(np.min(valid_means)) if valid_means else np.nan
590        )
591        result["evolved_prey_death_max"] = (
592            float(np.max(valid_means)) if valid_means else np.nan
593        )
594        result["evolve_sd"] = cfg.evolve_sd
595
596        if cfg.save_timeseries:
597            result["evolved_prey_death_timeseries"] = evolved_means[
598                :: cfg.timeseries_subsample
599            ]
600
601    # PCF statistics
602    if pcf_samples["prey_prey"]:
603        dist, pcf_rr, _ = average_pcfs(pcf_samples["prey_prey"])
604        _, pcf_cc, _ = average_pcfs(pcf_samples["pred_pred"])
605        _, pcf_cr, _ = average_pcfs(pcf_samples["prey_pred"])
606
607        result["pcf_distances"] = dist.tolist()
608        result["pcf_prey_prey"] = pcf_rr.tolist()
609        result["pcf_pred_pred"] = pcf_cc.tolist()
610        result["pcf_prey_pred"] = pcf_cr.tolist()
611
612        # Short-range indices
613        short_mask = dist < 3.0
614        if np.any(short_mask):
615            result["segregation_index"] = float(np.mean(pcf_cr[short_mask]))
616            result["prey_clustering_index"] = float(np.mean(pcf_rr[short_mask]))
617            result["pred_clustering_index"] = float(np.mean(pcf_cc[short_mask]))
618
619    return result
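The survival gates above turn a density fraction into a count threshold, with predators held to a 4x laxer cutoff. A minimal standalone sketch of that decision (the function name and the 0.01 density default are illustrative, not taken from `Config`):

```python
def survival_flags(prey_final, pred_final, grid_size, min_density=0.01):
    """Mirror the listing's survival test: prey must exceed a minimum
    density-derived count; predators use a quarter of that threshold."""
    min_count = int(min_density * grid_size**2)
    return prey_final > min_count, pred_final > (min_count // 4)

# 100x100 grid, min_density 0.01 -> min_count = 100, predator cutoff = 25
print(survival_flags(150, 30, 100))  # (True, True)
print(survival_flags(80, 10, 100))   # (False, False)
```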

627def run_phase1(cfg: Config, output_dir: Path, logger: logging.Logger) -> List[Dict]:
628    """
629    Execute Phase 1 of the simulation: a parameter sweep to identify critical points.
630
631    This function performs a 1D sweep across varying prey mortality rates while
632    keeping other parameters fixed. It utilizes parallel execution via joblib
633    and saves results incrementally to a JSONL file to ensure data integrity
634    during long-running batches.
635
636    Parameters
637    ----------
638    cfg : Config
639        Configuration object containing simulation hyperparameters, sweep
640        ranges, and execution settings (n_jobs, grid_size, etc.).
641    output_dir : Path
642        Directory where result files (JSONL) and metadata (JSON) will be stored.
643    logger : logging.Logger
644        Logger instance for tracking simulation progress and recording
645        operational metadata.
646
647    Returns
648    -------
649    all_results : list of dict
650        A list of dictionaries containing the metrics collected from every
651        individual simulation run in the sweep.
652
653    Notes
654    -----
655    The function performs the following steps:
656    1. Pre-warms Numba kernels for performance.
657    2. Generates a deterministic set of simulation jobs using unique seeds.
658    3. Executes simulations in parallel using a generator for memory efficiency.
659    4. Records metadata including a timestamp and a serialized snapshot of
660       the configuration.
661    """
662    from joblib import Parallel, delayed
663
664    warmup_numba_kernels(cfg.grid_size, directed_hunting=cfg.directed_hunting)
665
666    prey_deaths = cfg.get_prey_deaths()
667
668    # Build job list
669    jobs = []
670    # Sweep through prey_death only (prey_birth is fixed)
671    for pd in prey_deaths:
672        for rep in range(cfg.n_replicates):
673            params = {"pd": pd}
674
675            seed = generate_unique_seed(params, rep)
676            jobs.append(
677                (
678                    cfg.prey_birth,
679                    pd,
680                    cfg.predator_birth,
681                    cfg.predator_death,
682                    cfg.grid_size,
683                    seed,
684                    cfg,
685                    False,
686                )
687            )
688
689    logger.info(f"Phase 1: {len(jobs):,} simulations")
690    logger.info(
691        f"  Grid: {cfg.n_prey_death} prey_death values × {cfg.n_replicates} reps (prey_birth={cfg.prey_birth})"
692    )
693    # Run with incremental saving
694    output_jsonl = output_dir / "phase1_results.jsonl"
695    all_results = []
696
697    with open(output_jsonl, "w", encoding="utf-8") as f:
698        executor = Parallel(n_jobs=cfg.n_jobs, return_as="generator")
699        tasks = (delayed(run_single_simulation)(*job) for job in jobs)
700
701        for result in tqdm(executor(tasks), total=len(jobs), desc="Phase 1"):
702            f.write(json.dumps(result, default=str) + "\n")
703            f.flush()
704            all_results.append(result)
705
706    # Save metadata
707    meta = {
708        "phase": 1,
709        "description": "Parameter sweep for critical point",
710        "n_sims": len(all_results),
711        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
712        "config": asdict(cfg),
713    }
714    with open(output_dir / "phase1_metadata.json", "w") as f:
715        json.dump(meta, f, indent=2, default=str)
716
717    logger.info(f"Phase 1 complete. Results: {output_jsonl}")
718    return all_results
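Every phase runner uses the same crash-tolerant I/O pattern: stream results, write one JSON line per completed run, and flush immediately so an interrupted batch keeps everything finished so far. A serial sketch with a toy stand-in for `run_single_simulation`:

```python
import json
import tempfile
from pathlib import Path

def fake_simulation(pd_value, rep):
    # Toy stand-in for run_single_simulation: returns a small result dict.
    return {"prey_death": pd_value, "rep": rep, "prey_mean": 500.0 * (1 - pd_value)}

jobs = [(pd_value, rep) for pd_value in (0.1, 0.2) for rep in range(3)]

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "phase1_results.jsonl"
    results = []
    with open(out, "w", encoding="utf-8") as f:
        for job in jobs:
            result = fake_simulation(*job)
            f.write(json.dumps(result, default=str) + "\n")
            f.flush()                 # this line is on disk even if we crash next
            results.append(result)
    with open(out, encoding="utf-8") as f:
        n_lines = sum(1 for _ in f)

print(len(results), n_lines)  # 6 6
```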

721def run_phase2(cfg: Config, output_dir: Path, logger: logging.Logger) -> List[Dict]:
722    """
723    Execute Phase 2 of the simulation: self-organization and criticality analysis.
724
725    This phase tests the Self-Organized Criticality (SOC) hypothesis by
726    initializing simulations at different points in the parameter space and
727    observing whether evolutionary pressure drives the system toward a
728    common critical point, regardless of initial prey mortality rates.
729
730    Parameters
731    ----------
732    cfg : Config
733        Configuration object containing simulation hyperparameters, evolution
734        settings, and execution constraints.
735    output_dir : Path
736        Directory where result files (JSONL) and metadata (JSON) will be stored.
737    logger : logging.Logger
738        Logger instance for tracking progress and evolutionary convergence.
739
740    Returns
741    -------
742    all_results : list of dict
743        A list of dictionaries containing metrics from the evolutionary
744        simulation runs.
745
746    Notes
747    -----
748    The function captures:
749    1. Convergence of 'prey_death' across multiple replicates.
750    2. Final steady-state population distributions.
751    3. Incremental saving of results to prevent data loss.
752    """
753    from joblib import Parallel, delayed
754
755    warmup_numba_kernels(cfg.grid_size, directed_hunting=cfg.directed_hunting)
756
757    # Fixed prey_birth value for this phase
758    pb = 0.2
759    # Vary initial prey_death
760    initial_prey_deaths = np.linspace(
761        cfg.prey_death_range[0], cfg.prey_death_range[1], cfg.n_prey_death
762    )
763
764    jobs = []
765    for initial_pd in initial_prey_deaths:
766        for rep in range(cfg.n_replicates):
767            params = {"pb": pb, "initial_pd": initial_pd, "phase": 2}
768            seed = generate_unique_seed(params, rep)
769            jobs.append(
770                (
771                    pb,
772                    initial_pd,
773                    cfg.predator_birth,
774                    cfg.predator_death,
775                    cfg.grid_size,
776                    seed,
777                    cfg,
778                    True,
779                )
780            )
781
782    logger.info(f"Phase 2: {len(jobs):,} simulations")
783    logger.info(f"  prey_birth value: {pb}")
784    logger.info(f"  initial prey_death values: {len(initial_prey_deaths)}")
785    logger.info(f"  Replicates: {cfg.n_replicates}")
786
787    output_jsonl = output_dir / "phase2_results.jsonl"
788    all_results = []
789
790    with open(output_jsonl, "w", encoding="utf-8") as f:
791        executor = Parallel(n_jobs=cfg.n_jobs, return_as="generator")
792        tasks = (delayed(run_single_simulation)(*job) for job in jobs)
793
794        for result in tqdm(executor(tasks), total=len(jobs), desc="Phase 2"):
795            f.write(json.dumps(result, default=str) + "\n")
796            f.flush()
797            all_results.append(result)
798
799    meta = {
800        "phase": 2,
801        "description": "Self-organization toward criticality",
802        "n_sims": len(all_results),
803        "initial_prey_deaths": initial_prey_deaths.tolist(),
804        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
805    }
806    with open(output_dir / "phase2_metadata.json", "w") as f:
807        json.dump(meta, f, indent=2, default=str)
808
809    logger.info(f"Phase 2 complete. Results: {output_jsonl}")
810    return all_results
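`generate_unique_seed` itself is not shown in this listing; one plausible way to derive deterministic, phase-distinct seeds from a parameter dict plus replicate index is to hash a canonical serialization (the name and scheme below are hypothetical, not the module's implementation):

```python
import hashlib
import json

def unique_seed(params: dict, rep: int) -> int:
    """Hypothetical stand-in for generate_unique_seed: hash the sorted
    parameter dict plus the replicate index into a 32-bit seed."""
    key = json.dumps({**params, "rep": rep}, sort_keys=True)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest[:8], 16)  # 8 hex chars -> fits 32-bit NumPy RNG seeds

a = unique_seed({"pb": 0.2, "initial_pd": 0.1, "phase": 2}, rep=0)
b = unique_seed({"pb": 0.2, "initial_pd": 0.1, "phase": 2}, rep=0)
c = unique_seed({"pb": 0.2, "initial_pd": 0.1, "phase": 2}, rep=1)
print(a == b, a != c)  # True True
```

Hashing sorted keys makes the seed independent of dict insertion order, and including a `"phase"` entry keeps seeds distinct across phases even when parameter values coincide.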

813def run_phase3(cfg: Config, output_dir: Path, logger: logging.Logger) -> List[Dict]:
814    """
815    Phase 3: Finite-size scaling at critical point.
816
817    - Multiple grid sizes at (critical_prey_birth, critical_prey_death)
818    - Analyze cluster size cutoffs vs L
819    """
820    from joblib import Parallel, delayed
821
822    # NOTE: Tuned to critical points from phase 1
823    pb = cfg.critical_prey_birth
824    pd = cfg.critical_prey_death
825
826    logger.info(f"Phase 3: FSS at critical point (pb={pb}, pd={pd})")
827
828    for L in cfg.grid_sizes:
829        warmup_numba_kernels(L, directed_hunting=cfg.directed_hunting)
830
831    jobs = []
832    for L in cfg.grid_sizes:  # Sweep through grid sizes
833        for rep in range(cfg.n_replicates):
834            params = {"L": L, "phase": 3}
835            seed = generate_unique_seed(params, rep)
836            jobs.append(
837                (pb, pd, cfg.predator_birth, cfg.predator_death, L, seed, cfg, False)
838            )
839
840    logger.info(f"  Grid sizes: {cfg.grid_sizes}")
841    logger.info(f"  Total simulations: {len(jobs):,}")
842
843    output_jsonl = output_dir / "phase3_results.jsonl"
844    all_results = []
845
846    with open(output_jsonl, "w", encoding="utf-8") as f:
847        executor = Parallel(n_jobs=cfg.n_jobs, return_as="generator")
848        tasks = (delayed(run_single_simulation)(*job) for job in jobs)
849
850        for result in tqdm(executor(tasks), total=len(jobs), desc="Phase 3"):
851            f.write(json.dumps(result, default=str) + "\n")
852            f.flush()
853            all_results.append(result)
854
855    # Post-run metadata: postprocessing will fit cluster cutoffs vs L
856    meta = {
857        "phase": 3,
858        "description": "Finite-size scaling",
859        "critical_point": {"prey_birth": pb, "prey_death": pd},
860        "grid_sizes": cfg.grid_sizes,
861        "n_sims": len(all_results),
862        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
863    }
864    with open(output_dir / "phase3_metadata.json", "w") as f:
865        json.dump(meta, f, indent=2, default=str)
866
867    logger.info(f"Phase 3 complete. Results: {output_jsonl}")
868    return all_results
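As the metadata comment notes, postprocessing fits cluster-size cutoffs against L. A sketch of such a log-log fit on synthetic data (the power-law form s_cut ∝ L^D and the exponent 1.9 are illustrative, not measured values):

```python
import numpy as np

# Synthetic cutoffs following s_cut ~ L^D with D = 1.9 (illustrative).
L = np.array([64, 128, 256, 512], dtype=float)
s_cut = 0.5 * L**1.9

# Slope of log(s_cut) vs log(L) recovers the scaling exponent D.
D, log_prefactor = np.polyfit(np.log(L), np.log(s_cut), 1)
print(round(D, 3))  # 1.9
```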

871def run_phase4(cfg: Config, output_dir: Path, logger: logging.Logger) -> List[Dict]:
872    """
873    Execute Phase 4 of the simulation: global 4D parameter sweep (sensitivity analysis).
874
875    This phase varies all four demographic rates (prey birth/death and
876    predator birth/death) on a fixed grid to map how the system's
877    stability and critical behavior depend on the parameter regime,
878    complementing the 1D sweep of Phase 1.
879
880    Parameters
881    ----------
882    cfg : Config
883        Configuration object containing the fixed grid size, replicate
884        count, and execution settings (n_jobs, etc.).
885    output_dir : Path
886        Directory where result files (JSONL) and sweep metadata (JSON)
887        will be stored.
888    logger : logging.Logger
889        Logger instance for tracking progress across the sweep.
890
891    Returns
892    -------
893    all_results : list of dict
894        A list of dictionaries containing metrics for each parameter
895        combination and replicate.
896
897    Notes
898    -----
899    The function performs the following:
900    1. Builds a Cartesian product of 10 prey_death values and 11 values
901       for each of the other three parameters (13,310 combinations).
902    2. Generates parallel jobs with deterministic per-combination seeds.
903    3. Saves results incrementally to a JSONL file during the batch.
904    """
905    from joblib import Parallel, delayed
906    import itertools
907
908    warmup_numba_kernels(cfg.grid_size, directed_hunting=cfg.directed_hunting)
909
910    # Define sweep values
911    prey_death_values = np.linspace(0.05, 0.95, 10)  # 10 values for prey_death
912    other_param_values = np.linspace(0.0, 1.0, 11)  # 11 values for the rest
913
914    # Logging
915    logger.info(f"Phase 4: Full 4D Parameter Sweep")
916    logger.info(f"  prey_death: 10 values from 0.05 to 0.95")
917    logger.info(f"  prey_birth, pred_birth, pred_death: 11 values each from 0 to 1")
918    logger.info(f"  Grid Size: {cfg.grid_size}")
919    logger.info(f"  Replicates: {cfg.n_replicates}")
920
921    # Build parameter grid
922    param_grid = itertools.product(
923        other_param_values,  # prey_birth (11 values)
924        prey_death_values,  # prey_death (10 values)
925        other_param_values,  # predator_birth (11 values)
926        other_param_values,  # predator_death (11 values)
927    )
928
929    jobs = []
930
931    for pb, pd, pred_b, pred_d in param_grid:
932        for rep in range(cfg.n_replicates):
933            params_id = {
934                "pb": pb,
935                "pd": pd,
936                "pred_b": pred_b,
937                "pred_d": pred_d,
938                "rep": rep,
939            }
940            seed = generate_unique_seed(params_id, rep)
941
942            jobs.append(
943                (
944                    pb,  # prey_birth
945                    pd,  # prey_death
946                    pred_b,  # predator_birth
947                    pred_d,  # predator_death
948                    cfg.grid_size,
949                    seed,
950                    cfg,
951                    False,
952                )
953            )
954
955    logger.info(
956        f"  Total simulations: {len(jobs):,}"
957    )  # 11 * 10 * 11 * 11 * n_reps = 13,310 * n_reps
958
959    output_jsonl = output_dir / "phase4_results.jsonl"
960    all_results = []
961
962    with open(output_jsonl, "w", encoding="utf-8") as f:
963        executor = Parallel(n_jobs=cfg.n_jobs, return_as="generator")
964        tasks = (delayed(run_single_simulation)(*job) for job in jobs)
965
966        for result in tqdm(executor(tasks), total=len(jobs), desc="Phase 4 (4D Sweep)"):
967            f.write(json.dumps(result, default=str) + "\n")
968            f.flush()
969            all_results.append(result)
970
971    # Save Metadata
972    meta = {
973        "phase": 4,
974        "description": "Global 4D Sensitivity Analysis",
975        "prey_death_values": prey_death_values.tolist(),
976        "other_param_values": other_param_values.tolist(),
977        "parameters_varied": [
978            "prey_birth",
979            "prey_death",
980            "predator_birth",
981            "predator_death",
982        ],
983        "n_sims": len(all_results),
984        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
985        "config": asdict(cfg),
986    }
987    with open(output_dir / "phase4_metadata.json", "w") as f:
988        json.dump(meta, f, indent=2, default=str)
989
990    logger.info(f"Phase 4 complete. Results: {output_jsonl}")
991    return all_results
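The grid size asserted in the comment (11 × 10 × 11 × 11 = 13,310 combinations) can be sanity-checked before committing cluster hours. A sketch reproducing that arithmetic (the replicate count is illustrative; the real run uses `cfg.n_replicates`):

```python
import itertools
import numpy as np

prey_death_values = np.linspace(0.05, 0.95, 10)   # 10 values
other_param_values = np.linspace(0.0, 1.0, 11)    # 11 values

grid = list(itertools.product(
    other_param_values,   # prey_birth
    prey_death_values,    # prey_death
    other_param_values,   # predator_birth
    other_param_values,   # predator_death
))

n_replicates = 5  # illustrative stand-in for cfg.n_replicates
print(len(grid), len(grid) * n_replicates)  # 13310 66550
```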

 994def run_phase5(cfg: Config, output_dir: Path, logger: logging.Logger) -> List[Dict]:
 995    """
 996    Execute Phase 5 of the simulation: Global 4D parameter sweep with directed hunting.
 997
 998    This phase performs a comprehensive sensitivity analysis by varying four key
 999    parameters (prey birth/death and predator birth/death) while directed
1000    hunting is enabled. The results allow for a direct comparison with Phase 4
1001    to determine how predator search behavior shifts the system's critical
1002    thresholds and stability.
1003
1004    Parameters
1005    ----------
1006    cfg : Config
1007        Configuration object containing simulation hyperparameters, parallel
1008        execution settings, and the fixed grid size for this phase.
1009    output_dir : Path
1010        Directory where the result JSONL file and execution metadata will
1011        be stored.
1012    logger : logging.Logger
1013        Logger instance for tracking the progress of the high-volume
1014        simulation batch.
1015
1016    Returns
1017    -------
1018    all_results : list of dict
1019        A list of dictionaries containing metrics for every simulation in
1020        the 4D parameter grid.
1021
1022    Notes
1023    -----
1024    The function utilizes a Cartesian product of parameter ranges to build a
1025    job list of over 13,000 unique parameter sets (multiplied by replicates).
1026    Seeds are uniquely generated to distinguish these runs from other phases
1027    even if parameter values overlap.
1028    """
1029    from joblib import Parallel, delayed
1030    import itertools
1031
1032    warmup_numba_kernels(cfg.grid_size, directed_hunting=cfg.directed_hunting)
1033
1034    # Define sweep values (same as Phase 4)
1035    prey_death_values = np.linspace(0.05, 0.95, 10)  # 10 values for prey_death
1036    other_param_values = np.linspace(0.0, 1.0, 11)  # 11 values for the rest
1037
1038    # Logging
1039    logger.info(f"Phase 5: Full 4D Parameter Sweep (Directed Hunting)")
1040    logger.info(f"  prey_death: 10 values from 0.05 to 0.95")
1041    logger.info(f"  prey_birth, pred_birth, pred_death: 11 values each from 0 to 1")
1042    logger.info(f"  Grid Size: {cfg.grid_size}")
1043    logger.info(f"  Replicates: {cfg.n_replicates}")
1044    logger.info(f"  Directed Hunting: {cfg.directed_hunting}")
1045
1046    # Build parameter grid
1047    param_grid = itertools.product(
1048        other_param_values,  # prey_birth (11 values)
1049        prey_death_values,  # prey_death (10 values)
1050        other_param_values,  # predator_birth (11 values)
1051        other_param_values,  # predator_death (11 values)
1052    )
1053
1054    jobs = []
1055
1056    for pb, pd, pred_b, pred_d in param_grid:
1057        for rep in range(cfg.n_replicates):
1058            # Include phase identifier to ensure different seeds from Phase 4
1059            params_id = {
1060                "pb": pb,
1061                "pd": pd,
1062                "pred_b": pred_b,
1063                "pred_d": pred_d,
1064                "phase": 5,
1065                "rep": rep,
1066            }
1067            seed = generate_unique_seed(params_id, rep)
1068
1069            jobs.append(
1070                (
1071                    pb,  # prey_birth
1072                    pd,  # prey_death
1073                    pred_b,  # predator_birth
1074                    pred_d,  # predator_death
1075                    cfg.grid_size,
1076                    seed,
1077                    cfg,
1078                    False,
1079                )
1080            )
1081
1082    logger.info(
1083        f"  Total simulations: {len(jobs):,}"
1084    )  # 11 * 10 * 11 * 11 * n_reps = 13,310 * n_reps
1085
1086    output_jsonl = output_dir / "phase5_results.jsonl"
1087    all_results = []
1088
1089    with open(output_jsonl, "w", encoding="utf-8") as f:
1090        executor = Parallel(n_jobs=cfg.n_jobs, return_as="generator")
1091        tasks = (delayed(run_single_simulation)(*job) for job in jobs)
1092
1093        for result in tqdm(
1094            executor(tasks), total=len(jobs), desc="Phase 5 (4D Sweep + Directed)"
1095        ):
1096            f.write(json.dumps(result, default=str) + "\n")
1097            f.flush()
1098            all_results.append(result)
1099
1100    # Save Metadata
1101    meta = {
1102        "phase": 5,
1103        "description": "Global 4D Sensitivity Analysis with Directed Hunting",
1104        "prey_death_values": prey_death_values.tolist(),
1105        "other_param_values": other_param_values.tolist(),
1106        "parameters_varied": [
1107            "prey_birth",
1108            "prey_death",
1109            "predator_birth",
1110            "predator_death",
1111        ],
1112        "directed_hunting": cfg.directed_hunting,
1113        "n_sims": len(all_results),
1114        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
1115        "config": asdict(cfg),
1116    }
1117    with open(output_dir / "phase5_metadata.json", "w") as f:
1118        json.dump(meta, f, indent=2, default=str)
1119
1120    logger.info(f"Phase 5 complete. Results: {output_jsonl}")
1121    return all_results
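The job count quoted in the docstring follows directly from the Cartesian product over the four sweep axes. A quick standalone check of the combinatorics (plain Python stand-ins for the `np.linspace` values, so numpy is not required):

```python
import itertools

# Sweep values mirroring the Phase 5 grid
prey_death_values = [0.05 + 0.1 * i for i in range(10)]  # 10 values, 0.05..0.95
other_param_values = [i / 10 for i in range(11)]         # 11 values, 0.0..1.0

grid = list(itertools.product(
    other_param_values,   # prey_birth
    prey_death_values,    # prey_death
    other_param_values,   # predator_birth
    other_param_values,   # predator_death
))
print(len(grid))  # 11 * 10 * 11 * 11 = 13310
```

Each of these 13,310 parameter sets is then replicated `cfg.n_replicates` times when the job list is built.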

1137def main():
1138    """
1139    Orchestrate the predator-prey experimental suite across multiple phases.
1140
1141    This entry point handles command-line arguments, sets up logging and output
1142    directories, and executes the requested simulation phases (1-5). It
1143    supports parallel execution, dry runs for runtime estimation, and
1144    automated configuration persistence.
1145
1146    Notes
1147    -----
1148    The script dynamically retrieves phase-specific configurations using
1149    `get_phase_config` and dispatches execution to the corresponding runner
1150    in the `PHASE_RUNNERS` mapping.
1151    """
1152    parser = argparse.ArgumentParser(
1153        description="Predator-Prey Hydra Effect Experiments",
1154        formatter_class=argparse.RawDescriptionHelpFormatter,
1155        epilog="""
1156Phases:
1157  1  Parameter sweep to find critical point
1158  2  Self-organization (evolution toward criticality)
1159  3  Finite-size scaling at critical point
1160  4  Sensitivity analysis across parameter regimes
1161  5  Model extensions (directed hunting comparison)
1162        """,
1163    )
1164    parser.add_argument(
1165        "--phase", type=str, required=True, help="Phase to run: 1-5 or 'all'"
1166    )
1167    parser.add_argument(
1168        "--output",
1169        type=Path,
1170        default=Path("results"),
1171        help="Output directory (default: results)",
1172    )
1173    parser.add_argument(
1174        "--cores", type=int, default=-1, help="Number of cores (-1 for all)"
1175    )
1176    parser.add_argument(
1177        "--dry-run", action="store_true", help="Estimate runtime without running"
1178    )
1179    args = parser.parse_args()
1180
1181    # Parse phase argument
1182    if args.phase.lower() == "all":
1183        phases = list(PHASE_RUNNERS.keys())
1184    else:
1185        try:
1186            phases = [int(args.phase)]
1187        except ValueError:
1188            print(f"Invalid phase: {args.phase}. Use 1-5 or 'all'")
1189            sys.exit(1)
1190
1191    # Setup output directory
1192    args.output.mkdir(parents=True, exist_ok=True)
1193
1194    # Setup logging
1195    logging.basicConfig(
1196        level=logging.INFO,
1197        format="%(asctime)s [%(levelname)s] %(message)s",
1198        handlers=[
1199            logging.FileHandler(args.output / "experiments.log"),
1200            logging.StreamHandler(),
1201        ],
1202    )
1203    logger = logging.getLogger(__name__)
1204
1205    # Header
1206    logger.info("=" * 60)
1207    logger.info("PREDATOR-PREY HYDRA EFFECT EXPERIMENTS")
1208    logger.info("=" * 60)
1209    logger.info(f"Phases: {phases}")
1210    logger.info(f"Output: {args.output}")
1211    logger.info(f"Cores: {args.cores}")
1212    logger.info(f"Numba: {'ENABLED' if USE_NUMBA else 'DISABLED'}")
1213
1214    # Process each phase
1215    for phase in phases:
1216        cfg = get_phase_config(phase)
1217        cfg.n_jobs = (
1218            args.cores
1219            if args.cores > 0
1220            else int(os.environ.get("SLURM_CPUS_PER_TASK", -1))
1221        )
1222
1223        logger.info("")
1224        logger.info(f"{'='*60}")
1225        logger.info(f"PHASE {phase}")
1226        logger.info(f"{'='*60}")
1227
1228        n_cores = cfg.n_jobs if cfg.n_jobs > 0 else os.cpu_count()
1229        logger.info(f"Estimated: {cfg.estimate_runtime(n_cores)}")
1230
1231        if args.dry_run:
1232            logger.info("Dry run - skipping execution")
1233            continue
1234
1235        # Save config
1236        with open(args.output / f"phase{phase}_config.json", "w") as f:
1237            json.dump(asdict(cfg), f, indent=2, default=str)
1238
1239        # Run phase
1240        start_time = time.time()
1241        runner = PHASE_RUNNERS[phase]
1242        runner(cfg, args.output, logger)
1243        elapsed = time.time() - start_time
1244
1245        logger.info(f"Phase {phase} runtime: {elapsed/60:.1f} minutes")
1246
1247    logger.info("")
1248    logger.info("=" * 60)
1249    logger.info("EXPERIMENTS COMPLETE")
1250    logger.info("=" * 60)
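Phase 5 embeds a phase tag in the seed parameters so that runs sharing parameter values with Phase 4 still draw distinct random seeds. A minimal sketch of how a deterministic helper like `generate_unique_seed` could behave (a hypothetical stand-in; the actual implementation may differ):

```python
import hashlib
import json

def unique_seed_sketch(params: dict, rep: int) -> int:
    """Hypothetical stand-in for generate_unique_seed: hash the parameter
    dict (including the phase tag) plus the replicate index into a stable
    32-bit integer, so identical parameters in different phases still get
    distinct seeds."""
    payload = json.dumps({**params, "rep": rep}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return int(digest[:8], 16)  # 32-bit seed for the simulation RNG

# Same inputs -> same seed; a different phase tag -> a different seed
a = unique_seed_sketch({"pb": 0.2, "pd": 0.05, "phase": 5}, rep=0)
b = unique_seed_sketch({"pb": 0.2, "pd": 0.05, "phase": 4}, rep=0)
assert a == unique_seed_sketch({"pb": 0.2, "pd": 0.05, "phase": 5}, rep=0)
assert a != b
```

Hashing a canonical JSON serialization (with `sort_keys=True`) keeps the seed reproducible across runs and machines, unlike Python's builtin `hash`, which is salted per process.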
