Monte Carlo Benchmarking Engine
High-performance SIMD Monte Carlo engine (AVX2/NEON) with custom memory allocators and perf logging.
 
run_perf.sh File Reference

Dockerized perf benchmarker for Monte Carlo simulation engine.


Namespaces

namespace  scripts
 
namespace  scripts.run_perf
 

Variables

str scripts.run_perf.ROOT_DIR = "$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
 
 scripts.run_perf.PYTHONPATH
 
int scripts.run_perf.DEFAULT_TRIALS = 100000000
 
tuple scripts.run_perf.ALL_METHODS = ("Sequential" "Heap" "Pool" "SIMD")
 
str scripts.run_perf.SCRIPT_DIR = "$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 
 scripts.run_perf.BATCHID = $(uuidgen | cut -d'-' -f1)
 
str scripts.run_perf.BUILD_PATH = "./build/montecarlo"
 
 scripts.run_perf.GLOBAL_TIMESTAMP = $(date "+%Y-%m-%d_%H-%M-%S")
 
str scripts.run_perf.ARG1 = "$1"
 
str scripts.run_perf.ARG2 = "$2"
 
 scripts.run_perf.ARG3
 
 scripts.run_perf.INSERT_DB = true
 
int scripts.run_perf.TRIALS = $DEFAULT_TRIALS
 
tuple scripts.run_perf.METHODS = ("${ALL_METHODS[@]}")
 
str scripts.run_perf.LOG_DIR = "db/logs/batch_${BATCHID}_${GLOBAL_TIMESTAMP}"
 
str scripts.run_perf.PERF_EVENTS = ""
 
 scripts.run_perf.METHOD_TIMESTAMP = $(date +"%Y-%m-%d %H:%M:%S")
 
str scripts.run_perf.LOG_PATH = "$LOG_DIR/perf_${METHOD}_${METHOD_TIMESTAMP}.csv"
 
str scripts.run_perf.PERF_PARQUET = "$LOG_DIR/perf_results_${METHOD}_${METHOD_TIMESTAMP}_${BATCHID}.parquet"
 
 scripts.run_perf.START_NS = $(date +%s%N)
 
 scripts.run_perf.END = $(date +%s%N)
 
 scripts.run_perf.WALL_NS = $((END - START_NS))
 
 scripts.run_perf.WALL_S = $(awk "BEGIN {printf \"%.6f\", $WALL_NS / 1000000000}")
 

Detailed Description

Dockerized perf benchmarker for Monte Carlo simulation engine.

Benchmarks the simulation methods (SIMD, Pool, Heap, etc.) with perf stat and writes the resulting system performance metrics to structured logs (CSV, Parquet).
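
In essence this wraps the benchmark binary in a perf stat call and captures the counters in CSV form. The sketch below illustrates that pattern only; the event list is illustrative (the script's PERF_EVENTS may differ), and the argument order passed to ./build/montecarlo is an assumption.

  # Sketch only: benchmark one method under perf stat with CSV output.
  # Event list is illustrative; argument order to the binary is assumed.
  METHOD="SIMD"
  TRIALS=100000000
  perf stat -x, -o "perf_${METHOD}.csv" \
      -e cycles,instructions,cache-references,cache-misses,branches,branch-misses \
      ./build/montecarlo "$METHOD" "$TRIALS"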

=== Metrics Logged ===

Core Execution

  • cycles: Total CPU clock cycles consumed
  • instr: Total instructions executed
  • ipc: Instructions per cycle (efficiency metric)
  • cycles_per_trial: Avg CPU cycles used per simulation trial

Time

  • wall_time_s: Real-world elapsed time (sec)
  • wall_time_ns: Real-world elapsed time (nanoseconds, high-precision)
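
The wall-clock figures come from nanosecond timestamps taken around the run, mirroring the START_NS, END, WALL_NS, and WALL_S variables listed above; only the command under test in this sketch is assumed.

  # Nanosecond wall-clock timing around a benchmark run.
  START_NS=$(date +%s%N)
  ./build/montecarlo "$METHOD" "$TRIALS"    # command under test (assumed arguments)
  END=$(date +%s%N)
  WALL_NS=$((END - START_NS))
  WALL_S=$(awk "BEGIN {printf \"%.6f\", $WALL_NS / 1000000000}")
  echo "wall_time_ns=$WALL_NS wall_time_s=$WALL_S"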

Cache Accesses

  • l1_loads: L1 data cache load attempts
  • l1_misses: L1 cache load misses (ideal <5%)
  • l2_loads: L2 cache load attempts (may not be supported)
  • l2_misses: L2 cache load misses (may not be supported)
  • l3_loads: Last-level (L3) cache loads (may not be supported)
  • l3_misses: L3 misses (fallback to RAM = slow, may not be supported)
  • cache_loads: Aggregated cache loads
  • cache_miss: Aggregated cache misses
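
To put the <5% L1 guideline in concrete terms, the miss rate can be computed from two counters. This is a hedged sketch: it assumes the host exposes L1-dcache-loads / L1-dcache-load-misses and that perf stat -x, places the value in column 1 and the event name in column 3.

  # Compute an L1 miss rate from perf stat CSV output (perf writes counters to stderr).
  perf stat -x, -e L1-dcache-loads,L1-dcache-load-misses \
      ./build/montecarlo SIMD 1000000 2> l1.csv
  L1_LOADS=$(awk -F, '$3 == "L1-dcache-loads" {print $1}' l1.csv)
  L1_MISSES=$(awk -F, '$3 == "L1-dcache-load-misses" {print $1}' l1.csv)
  awk "BEGIN {printf \"l1_miss_rate=%.2f%%\n\", 100 * $L1_MISSES / $L1_LOADS}"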

Memory Paging

  • tlb_loads: TLB accesses (address translation)
  • tlb_misses: TLB misses (page walk penalty)

Branch Prediction

  • branch_instr: Total branch instructions
  • branch_misses: Branch mispredictions (pipeline flushes)

Derived

  • miss_per_trial: Cache+TLB misses per trial (normalized locality metric)
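
The derived columns are simple arithmetic over the raw counters. The sketch below uses placeholder values purely to show the formulas; in the script the counters come from the parsed perf output.

  # Derived-metric arithmetic; the counter values below are placeholders.
  TRIALS=100000000
  CYCLES=420000000000
  INSTR=910000000000
  CACHE_MISS=1500000000
  TLB_MISSES=20000000
  awk "BEGIN {
      printf \"ipc=%.3f\n\", $INSTR / $CYCLES;
      printf \"cycles_per_trial=%.2f\n\", $CYCLES / $TRIALS;
      printf \"miss_per_trial=%.4f\n\", ($CACHE_MISS + $TLB_MISSES) / $TRIALS;
  }"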

=== Compatibility Notes (L2/L3 Caveats) ===

Some performance counters are not consistently available across systems:

  • Intel: L2/L3 counters usually available as L2-dcache-*, LLC-*
  • AMD: L2/L3 counters may be missing (Zen 3/4); use perf list to verify
  • Virtual machines and containers may block PMU access

Always validate support using:

  $ perf list | grep -i l2   # check which L2 events are exposed
  $ lscpu                    # check the CPU microarchitecture
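
A hedged sketch of such a check, falling back to the aggregate cache events when per-level counters are absent; the event names follow the Intel-style spellings mentioned above and may not exist on a given machine.

  # Probe for L2/LLC counters before requesting them; otherwise fall back
  # to the aggregate cache events. Event names are Intel-style examples.
  if perf list 2>/dev/null | grep -q "L2-dcache-loads"; then
      CACHE_EVENTS="L2-dcache-loads,LLC-loads,LLC-load-misses"
  else
      CACHE_EVENTS="cache-references,cache-misses"
  fi
  perf stat -e cycles,instructions,"$CACHE_EVENTS" ./build/montecarlo SIMD 1000000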

=== ClickHouse Integration ===

By default, results are inserted into ClickHouse at the end of each batch run. To enable this:

  • You must first run make init (first-time setup) or make up (to start services)
  • Requires Docker and a running ClickHouse instance

To skip ClickHouse insertion (e.g., CI, dry runs):

  ./run_perf.sh 50000000 SIMD insert_db=false
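
If insertion was skipped, a finished batch can still be loaded by hand later. A sketch only, assuming a hypothetical perf.results table and a clickhouse-client reachable on the PATH (or inside the ClickHouse container when running dockerized):

  # Manually load a combined Parquet file into ClickHouse.
  # perf.results is a hypothetical table name; adjust to the project's schema.
  BATCH_DIR="db/logs/batch_ab12cd34_2024-01-01_00-00-00"   # example batch directory
  cat "$BATCH_DIR"/perf_results_all_*.parquet \
      | clickhouse-client --query "INSERT INTO perf.results FORMAT Parquet"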

=== Usage ===

  ./run_perf.sh                                # Run all methods with default trials, insert to DB
  ./run_perf.sh 50000000                       # All methods, custom trials
  ./run_perf.sh SIMD                           # Single method, default trials
  ./run_perf.sh 50000000 Pool                  # Custom trials and single method
  ./run_perf.sh 50000000 SIMD insert_db=false  # Run without inserting to ClickHouse
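
The flexible argument order above suggests classifying arguments by shape rather than by position. The sketch below is based on the ARG1/ARG2/ARG3, TRIALS, METHODS, and INSERT_DB variables listed earlier, not the script's literal code.

  # Classify each argument by shape: all digits → trial count,
  # insert_db=false → skip DB insert, anything else → a single method name.
  TRIALS=$DEFAULT_TRIALS
  METHODS=("${ALL_METHODS[@]}")
  INSERT_DB=true
  for ARG in "$@"; do
      if [[ "$ARG" =~ ^[0-9]+$ ]]; then
          TRIALS="$ARG"
      elif [[ "$ARG" == "insert_db=false" ]]; then
          INSERT_DB=false
      else
          METHODS=("$ARG")
      fi
  done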

=== Output Files ===

  db/logs/batch_<BATCHID>_<TIMESTAMP>/perf_<METHOD>_<TIMESTAMP>.csv
      → Raw perf stat output

  db/logs/batch_<BATCHID>_<TIMESTAMP>/perf_results_<METHOD>_<TIMESTAMP>_<BATCHID>.parquet
      → Parsed structured metrics for that method

  db/logs/batch_<BATCHID>_<TIMESTAMP>/perf_results_all_<BATCHID>.parquet
      → Combined metrics across all methods (for analysis or dashboarding)

Definition in file run_perf.sh.