backtesting walk-forward financial-analysis global-adjustment ieso demand-data machine-learning xgboost

IESO Coincident Peak Prediction

ML-driven prediction of Ontario's top-5 system demand hours for Class A Global Adjustment optimization

Python, pandas, matplotlib, seaborn, numpy, requests, openpyxl, scikit-learn, xgboost, lightgbm, shap, joblib

Key Findings

  • Regression-first approach avoids the extreme class imbalance problem (5 peaks out of 8,760 hours)
  • XGBoost consistently captures 3–5 of 5 coincident peaks across backtested base periods
  • Model-guided curtailment reduces false alarm days by 70–80% vs. naive temperature heuristic
  • For a 1.5 MW facility, predicted GA savings range from $200,000–$350,000 annually
  • Humidex and cooling degree hours outperform raw temperature as predictive features
  • SHAP feature importance aligns with engineering domain knowledge

Scope

A 5-notebook machine learning project that predicts Ontario's top-5 coincident peak demand hours each base period, the hours that set Global Adjustment (GA) cost allocation for Class A electricity customers. The project covers the full ML lifecycle: data acquisition and EDA, feature engineering, model training and selection, walk-forward backtesting with financial valuation, and operational deployment design. The target customer is a 1–2 MW industrial or commercial facility paying Class A GA rates, where correctly predicting and curtailing during even 3 of 5 peak hours can reduce annual GA charges by $150,000–$400,000. The analytical framework generalizes to any Industrial Conservation Initiative (ICI) customer large enough for Class A designation (typically more than 1 MW of average monthly peak demand).

Data

15 base periods (2010–2025) of IESO hourly Ontario demand data combined from local ICI files and IESO public CSVs — approximately 130,000 hourly observations. Open-Meteo historical weather data for the Toronto region (hourly temperature, humidity, wind speed, solar radiation) covering the same period. Features are engineered from both sources: humidex, daily cooling degree hours, demand momentum (previous day's maximum), calendar features (day of week, month, holiday flags), and peak context variables (current threshold, peaks identified so far in the base period). The dataset exhibits extreme class imbalance — only 5 positive hours out of roughly 8,760 per base period.
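The two weather-derived features named above can be sketched in a few lines. This is an illustrative, dependency-free version and not the project's notebook code: the Magnus coefficients used for saturation vapour pressure, the 18 °C base temperature, and the function names are all assumptions made for the example.

```python
import math


def vapour_pressure_hpa(temp_c, rel_humidity_pct):
    """Actual vapour pressure in hPa from air temperature and relative humidity.

    Uses a Magnus-type approximation for saturation vapour pressure
    (coefficients are an assumption of this sketch).
    """
    saturation = 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))
    return saturation * rel_humidity_pct / 100.0


def humidex(temp_c, rel_humidity_pct):
    """Humidex = air temperature + 5/9 * (vapour pressure in hPa - 10)."""
    e = vapour_pressure_hpa(temp_c, rel_humidity_pct)
    return temp_c + (5.0 / 9.0) * (e - 10.0)


def cooling_degree_hours(hourly_temps_c, base_c=18.0):
    """Sum of hourly degrees above the base temperature over one day.

    The 18 C base is a common convention, assumed here.
    """
    return sum(max(t - base_c, 0.0) for t in hourly_temps_c)
```

At the same air temperature, a humid hour scores a higher humidex than a dry one, which is the effect the feature is meant to capture for cooling load.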

Analytical Approach

The project uses a regression-first strategy: instead of treating peak detection as a classification problem (which suffers from a roughly 1,750:1 class imbalance, 5 peak hours in about 8,760), an XGBoost regression model predicts daily maximum Ontario demand, and a threshold-based alert system converts predictions into actionable RED/YELLOW/GREEN risk levels. Feature engineering emphasizes domain knowledge: humidex captures the humidity-driven cooling load that raw temperature misses, cooling degree hours integrate thermal stress over the day, and demand momentum captures grid-level inertia. Model selection compares gradient boosting (XGBoost, LightGBM) against linear baselines and classification approaches, with SHAP analysis confirming that feature importance aligns with engineering intuition. Walk-forward backtesting across 5+ base periods validates out-of-sample performance, and a confusion-matrix framework quantifies the asymmetric cost of missed peaks vs. false alarms. The operational design specifies a REST API architecture with daily automated predictions from live IESO feeds and weather forecasts.
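A minimal sketch of how a daily-maximum forecast might be turned into a risk tier. The threshold rule (fifth-highest daily peak seen so far in the base period), the 3% YELLOW band, the fallback value used early in the period, and the function names are assumptions for illustration; the project's calibrated thresholds are not given here.

```python
def current_threshold(daily_peaks_mw, fallback_mw, top_k=5):
    """Demand level a new day must beat to enter the running top-k.

    Before top_k days have been observed, fall back to a prior value
    (hypothetical, e.g. taken from earlier base periods).
    """
    if len(daily_peaks_mw) < top_k:
        return fallback_mw
    return sorted(daily_peaks_mw, reverse=True)[top_k - 1]


def alert_level(predicted_peak_mw, threshold_mw, yellow_band=0.03):
    """Map a predicted daily maximum to RED / YELLOW / GREEN.

    RED: forecast meets or exceeds the threshold.
    YELLOW: forecast is within yellow_band (as a fraction) below it.
    GREEN: anything lower.
    """
    if predicted_peak_mw >= threshold_mw:
        return "RED"
    if predicted_peak_mw >= threshold_mw * (1.0 - yellow_band):
        return "YELLOW"
    return "GREEN"


# Example with made-up demand values (MW).
peaks_so_far = [21000.0, 22500.0, 20500.0, 23000.0, 21800.0, 22000.0]
threshold = current_threshold(peaks_so_far, fallback_mw=21500.0)
tomorrow = alert_level(21500.0, threshold)
```

Widening `yellow_band` trades more precautionary days for a lower chance of missing a peak that the forecast slightly underestimates.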

Outcome

The XGBoost regression model consistently identifies 3–5 of 5 coincident peak hours across backtested base periods, with RED alert precision high enough to keep false alarm days manageable (typically 10–15 curtailment days per summer vs. 60+ for a naive temperature heuristic). For a 1.5 MW Class A facility, model-guided curtailment during predicted peaks reduces annual GA charges by an estimated $200,000–$350,000 depending on base period volatility and curtailment depth. The regression approach outperforms direct classification because it outputs a continuous demand forecast whose margin over the current peak threshold can be graded, enabling risk-tiered alerts rather than binary yes/no decisions. SHAP analysis confirms temperature-derived features dominate importance, followed by demand momentum and calendar variables. The walk-forward validation demonstrates that the model generalizes across base periods with different peak timing patterns, and the operational design supports annual retraining after each base period closes.
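The capture and false-alarm counts reported above can be computed per base period roughly as follows. This sketch scores at day level, assuming a peak hour counts as captured when its calendar day was flagged for curtailment; the dollar inputs to `net_value` are placeholders supplied by the caller, and the dates in the example are invented.

```python
from datetime import date


def score_base_period(actual_peak_days, flagged_days):
    """Compare the days containing the actual peaks with the flagged days."""
    actual, flagged = set(actual_peak_days), set(flagged_days)
    captured = len(actual & flagged)
    return {
        "captured": captured,
        "missed": len(actual) - captured,
        "false_alarm_days": len(flagged - actual),
        "curtailment_days": len(flagged),
    }


def net_value(score, value_per_captured_peak, cost_per_curtailment_day):
    """Asymmetric payoff: each captured peak is worth far more than the
    cost of one curtailment day (both figures are caller-supplied)."""
    return (score["captured"] * value_per_captured_peak
            - score["curtailment_days"] * cost_per_curtailment_day)


# Invented example: 5 actual peak days, 7 flagged days, 4 overlap.
actual = [date(2023, 7, 5), date(2023, 7, 6), date(2023, 7, 27),
          date(2023, 9, 5), date(2023, 9, 6)]
flagged = [date(2023, 6, 2), date(2023, 7, 5), date(2023, 7, 6),
           date(2023, 7, 27), date(2023, 8, 14), date(2023, 9, 5),
           date(2023, 9, 7)]
score = score_base_period(actual, flagged)
```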

Ontario’s Global Adjustment mechanism allocates billions of dollars in electricity system costs based on a facility’s consumption during just 5 hours per year, the coincident peak hours. For Class A customers, the financial stakes of missing even one peak are enormous: a single missed hour can increase annual GA charges by $50,000–$100,000 for a mid-size industrial facility.
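The arithmetic behind these stakes can be sketched with a peak demand factor: the facility's share of Ontario demand summed over the five coincident peak hours, applied to the system-wide GA cost. All numbers below are placeholders chosen for illustration and are not IESO figures, and the function names are hypothetical.

```python
def peak_demand_factor(facility_mw_at_peaks, ontario_mw_at_peaks):
    """Facility's share of Ontario demand summed over the peak hours."""
    if len(facility_mw_at_peaks) != len(ontario_mw_at_peaks):
        raise ValueError("need one facility reading per peak hour")
    return sum(facility_mw_at_peaks) / sum(ontario_mw_at_peaks)


def ga_charge(pdf, system_ga_dollars):
    """GA cost allocated to the facility for a given system GA total."""
    return pdf * system_ga_dollars


# Placeholder inputs (assumptions, not real data).
ONTARIO_MW = [22000.0] * 5           # system demand in each peak hour
SYSTEM_GA = 8_000_000_000.0          # annual system GA cost in dollars
baseline = [1.5] * 5                 # facility runs at 1.5 MW in all 5 hours
curtailed = [0.5, 0.5, 0.5, 1.5, 1.5]  # drops to 0.5 MW in 3 of 5 hours

saving = (ga_charge(peak_demand_factor(baseline, ONTARIO_MW), SYSTEM_GA)
          - ga_charge(peak_demand_factor(curtailed, ONTARIO_MW), SYSTEM_GA))
```

Because the factor is linear in the facility's demand during those hours, each additional megawatt-hour curtailed in a true peak hour reduces the charge by the same amount.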

This 5-notebook project builds a complete ML prediction system from raw IESO demand data through to operational deployment. The approach deliberately avoids treating peak detection as a classification problem — with only 5 positive examples per year, classifiers either miss peaks or generate unacceptable false alarm rates. Instead, a regression model predicts daily maximum demand, and domain-calibrated thresholds convert predictions into actionable curtailment signals. The walk-forward backtesting framework validates that the model generalizes across base periods with fundamentally different peak timing patterns, and the financial valuation demonstrates clear ROI for facilities in the 1–2 MW range.
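A walk-forward split over base periods can be sketched as an expanding window: train on every period before the test period, then step forward by one. The period labels and the `min_train` value are assumptions for illustration; the project's actual split configuration is not stated.

```python
def walk_forward_splits(base_periods, min_train=5):
    """Yield (training_periods, test_period) pairs in chronological order.

    Each test period is evaluated by a model that has only seen earlier
    base periods, so no future information leaks into training.
    """
    for i in range(min_train, len(base_periods)):
        yield base_periods[:i], base_periods[i]


# Hypothetical labels for 15 base periods, "2010-11" through "2024-25".
periods = [f"{y}-{(y + 1) % 100:02d}" for y in range(2010, 2025)]
splits = list(walk_forward_splits(periods, min_train=10))
```

With 15 periods and `min_train=10`, this produces five out-of-sample test periods, each scored by a model retrained on all earlier data.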