Monte Carlo Benchmarking Engine
High-performance SIMD Monte Carlo engine (AVX2/NEON) with custom memory allocators and perf logging.
 
schema.py File Reference

Defines the canonical schema used across ETL, validation, and ClickHouse ingestion.


Namespaces

namespace  pipeline
 
namespace  pipeline.schema
 

Variables

dict pipeline.schema.SCHEMA
 

Detailed Description

Defines the canonical schema used across ETL, validation, and ClickHouse ingestion.

Description
This file defines the global SCHEMA dictionary that maps column names to Polars dtypes and nullability flags. It is used for:
  • Casting CSV inputs via safe_vector_cast()
  • Enforcing field consistency across benchmarks
  • Generating ClickHouse CREATE TABLE statements
Format
SCHEMA = { "Column Name": (Polars DataType, is_nullable: bool), ... }
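As a rough illustration of this format and of how such a mapping could drive ClickHouse DDL generation, the sketch below uses hypothetical column names. To stay dependency-free it writes Polars dtypes as name strings, whereas the real SCHEMA holds Polars dtype objects. The type mapping and the helper name create_table_sql are assumptions for illustration, not taken from schema.py.

```python
# Hypothetical entries in the documented shape: name -> (dtype, is_nullable).
# Dtypes are written as strings here; the real SCHEMA holds Polars dtypes.
EXAMPLE_SCHEMA = {
    "Timestamp": ("Datetime(ms)", False),
    "Method": ("Utf8", False),
    "Wall Time (ms)": ("Float64", False),
    "L2 Miss Rate (%)": ("Float64", True),
}

# Assumed Polars-to-ClickHouse type mapping (illustrative only).
CLICKHOUSE_TYPES = {
    "Datetime(ms)": "DateTime64(3)",  # 3 = millisecond precision
    "Utf8": "String",
    "Float64": "Float64",
    "Int64": "Int64",
}


def create_table_sql(table, schema, order_by):
    """Build a CREATE TABLE statement from a (dtype, nullable) schema."""
    columns = []
    for name, (dtype, nullable) in schema.items():
        ch_type = CLICKHOUSE_TYPES[dtype]
        if nullable:
            ch_type = f"Nullable({ch_type})"
        # Backticks allow spaces and symbols in column names.
        columns.append(f"    `{name}` {ch_type}")
    body = ",\n".join(columns)
    return (
        f"CREATE TABLE {table} (\n{body}\n) "
        f"ENGINE = MergeTree ORDER BY `{order_by}`"
    )


print(create_table_sql("benchmarks", EXAMPLE_SCHEMA, "Timestamp"))
```

Keeping nullability next to the dtype in one tuple means the Nullable(...) wrapper can be derived from the same source of truth that drives casting, so the two cannot drift apart.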
Design Notes
  • All timestamps use millisecond-resolution Datetime
  • Percent fields are stored as Float64 values on a 0–100 scale (not as 0–1 fractions)
  • L2/L3-related fields are nullable by default, since this data may not be available on all CPUs
  • Field names match CSV headers and ClickHouse columns exactly
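The nullability and percent-range notes above lend themselves to a simple record check. The following is a minimal sketch under stated assumptions: the column names, the "(%)" suffix convention for spotting percent fields, and the helper validate_row are all hypothetical and do not come from schema.py.

```python
# Hypothetical schema fragment; dtypes are shown as names for brevity.
EXAMPLE_SCHEMA = {
    "Cycles": ("Int64", False),
    "Miss Rate (%)": ("Float64", False),
    "L3 Misses": ("Int64", True),
}


def validate_row(row, schema):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for name, (_dtype, nullable) in schema.items():
        if name not in row:
            problems.append(f"missing column: {name}")
            continue
        value = row[name]
        if value is None:
            if not nullable:
                problems.append(f"null in non-nullable column: {name}")
            continue
        # Assumption: percent columns are recognizable by a "(%)" suffix.
        if name.endswith("(%)") and not 0.0 <= value <= 100.0:
            problems.append(f"percent out of range in {name}: {value}")
    return problems


ok_row = {"Cycles": 1200, "Miss Rate (%)": 12.5, "L3 Misses": None}
bad_row = {"Cycles": None, "Miss Rate (%)": 140.0}
print(validate_row(ok_row, EXAMPLE_SCHEMA))
print(validate_row(bad_row, EXAMPLE_SCHEMA))
```

Here a null in a nullable column such as the hypothetical "L3 Misses" passes, while a null in a non-nullable column, an out-of-range percent value, or an absent column is reported.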

Definition in file schema.py.