Exploratory Data Analysis

Abriliam Consulting — Industrial Energy Management

Before building any models, we need to understand the shape and character of the data. This notebook examines the chiller plant dataset through summary statistics, time-series trends, and correlation analysis to surface anomalies and guide the diagnostic investigation.

Missing values in each column:
oat_C                   0
wb_C                    0
occ                     0
tons                    0
chw_sup_C               0
chw_ret_C               0
chw_dT_C                0
chw_flow_m3h            0
cw_sup_C                0
cw_ret_C                0
cw_dT_C                 0
cw_flow_m3h             0
approach_C              0
dp_kpa                  0
chiller_kw              0
tower_fan_kw            0
chw_pump_kw             0
cw_pump_kw              0
plant_kw                0
kw_per_ton              0
plant_kw_per_ton        0
tower_fan_kw_per_ton    0
pumping_kw_per_ton      0
dtype: int64

Summary statistics:
             oat_C         wb_C          occ         tons    chw_sup_C  \
count  1344.000000  1344.000000  1344.000000  1344.000000  1344.000000   
mean     21.971768    18.469857     0.437281   202.310180     6.497118   
std       5.537674     5.223145     0.236582    72.427236     0.199888   
min       9.528023     7.674085     0.003374    40.000000     5.814391   
25%      17.243689    13.835159     0.298208   147.201633     6.359207   
50%      22.019321    18.588505     0.368380   214.134704     6.508027   
75%      26.550146    23.203806     0.601501   257.748715     6.637654   
max      34.455590    29.209971     0.985438   360.631196     7.078945   

         chw_ret_C     chw_dT_C  chw_flow_m3h     cw_sup_C     cw_ret_C  ...  \
count  1344.000000  1344.000000   1344.000000  1344.000000  1344.000000  ...   
mean     11.698814     5.201697    123.286156    23.223926    27.521415  ...   
std       0.977320     0.958199     53.350509     5.367137     5.622375  ...   
min       9.626562     3.216000     18.801814    10.437609    14.091599  ...   
25%      10.794124     4.256318     82.148731    18.504226    22.558544  ...   
50%      11.957734     5.557457    122.019542    23.301704    27.724563  ...   
75%      12.561173     6.052536    158.086490    27.887926    32.345008  ...   
max      13.841881     7.081364    312.989726    35.000361    40.469019  ...   

            dp_kpa   chiller_kw  tower_fan_kw  chw_pump_kw   cw_pump_kw  \
count  1344.000000  1344.000000   1344.000000  1344.000000  1344.000000   
mean    175.668842   645.271362     25.011770    43.988172     8.599194   
std      19.673575   190.714987      7.080368    21.491271     2.602534   
min     134.258563   202.598553      8.978128     6.000000     5.000000   
25%     158.402368   488.184334     19.638425    27.582625     6.388679   
50%     171.764386   676.740724     23.809398    41.524940     8.381974   
75%     193.622700   807.096917     30.098654    59.688591    10.689970   
max     219.801919   900.000000     46.654823    85.000000    16.127471   

          plant_kw   kw_per_ton  plant_kw_per_ton  tower_fan_kw_per_ton  \
count  1344.000000  1344.000000       1344.000000           1344.000000   
mean    722.870498     3.331991          3.740465              0.147641   
std     211.832906     0.534258          0.623603              0.087723   
min     228.816352     2.495624          2.775560              0.032155   
25%     549.172073     3.046748          3.399471              0.090503   
50%     757.822436     3.191399          3.583410              0.121378   
75%     899.344803     3.405071          3.836126              0.170387   
max    1041.126945     6.857364          7.590985              0.633392   

       pumping_kw_per_ton  
count         1344.000000  
mean             0.260833  
std              0.064297  
min              0.143180  
25%              0.203011  
50%              0.242365  
75%              0.317086  
max              0.560377  

[8 rows x 23 columns]
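
Output like the two blocks above typically comes from a per-column missing-value count followed by `describe()`. The sketch below is a minimal illustration, assuming the data sits in a pandas DataFrame with a datetime index. The frame built here is synthetic stand-in data with an arbitrary start date; only the column names `tons` and `chiller_kw` are taken from the dataset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the chiller plant data (values are illustrative only;
# the start date is arbitrary and the real loading step is not shown here).
rng = np.random.default_rng(0)
n = 48
index = pd.Timestamp("2024-06-01") + pd.to_timedelta(np.arange(n), unit="h")
df = pd.DataFrame(
    {
        "tons": rng.uniform(40, 360, n),
        "chiller_kw": rng.uniform(200, 900, n),
    },
    index=index,
)

missing = df.isna().sum()  # number of missing values in each column
summary = df.describe()    # count, mean, std, min, quartiles, max

print("Missing values in each column:")
print(missing)
print("\nSummary statistics:")
print(summary)
```

A column with gaps would show a nonzero entry in `missing` and a `count` below the row total in `summary`.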

Data Quality Check

No missing values across all 23 columns — the dataset is complete. Key observations from the summary statistics:

Time-Series Overview

The triple-axis time-series plot reveals several important patterns:

Moving Average Analysis

The 240-hour (10-day) simple moving average of kW/ton shows a clear upward drift starting in early July. A 10-day window averages out day-to-day weather and load variability, so a rise that persists after smoothing indicates that something changed in plant operations: the plant is becoming less efficient. Weekend periods (shaded) consistently show higher kW/ton because of the part-load penalty at reduced load.
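
A simple moving average of this kind can be computed with a pandas rolling window. The sketch below uses a synthetic hourly series with an artificial drift, not the plant data. The `min_periods=1` setting on the long window is an assumption: it lets the 240-hour average start from the first row instead of leaving the first 239 rows empty. The `is_weekend` column is a hypothetical helper for shading or comparing weekend periods.

```python
import numpy as np
import pandas as pd

# Synthetic hourly kW/ton series with a slow upward drift (illustrative only).
n = 24 * 56  # 1344 hourly samples, the same row count as the dataset
index = pd.Timestamp("2024-06-01") + pd.to_timedelta(np.arange(n), unit="h")
rng = np.random.default_rng(0)
values = 3.0 + 0.0003 * np.arange(n) + rng.normal(0, 0.3, n)
df = pd.DataFrame({"kw_per_ton": values}, index=index)

# Short window: the default min_periods equals the window length,
# so the first (window - 1) rows are NaN.
df["kw_per_ton_5_sma"] = df["kw_per_ton"].rolling(window=5).mean()

# Long window: min_periods=1 (an assumption) averages whatever samples are
# available, so no rows are lost at the start of the record.
df["kw_per_ton_240_sma"] = (
    df["kw_per_ton"].rolling(window=240, min_periods=1).mean()
)

# Hypothetical weekend flag (Monday=0 ... Saturday=5, Sunday=6).
df["is_weekend"] = df.index.dayofweek >= 5

print(df[["kw_per_ton", "kw_per_ton_240_sma"]].tail())
```

Comparing `df.groupby("is_weekend")["kw_per_ton"].mean()` is one quick way to put a number on the weekend effect.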

Condenser Water Flow vs Efficiency

Condenser water flow and chiller efficiency are correlated: higher CW flow corresponds to higher loads and generally better (lower) kW/ton. Spline smoothing makes the underlying trend visible without the hourly noise. Both metrics show seasonal variation driven by outdoor conditions.
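
The relationship described above can be quantified with a Pearson correlation coefficient. The sketch below runs on synthetic data built so that load drives CW flow up and kW/ton down; it does not reproduce the plant's numbers. A centered rolling mean stands in for the spline smoother here, which is a deliberate simplification and not the method used for the plot.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (illustrative only): a daily load cycle drives
# condenser water flow up and kW/ton down.
rng = np.random.default_rng(1)
n = 500
t = np.arange(n)
tons = 200 + 120 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 15, n)
df = pd.DataFrame(
    {
        "cw_flow_m3h": 0.8 * tons + rng.normal(0, 10, n),
        "kw_per_ton": 4.0 - 0.004 * tons + rng.normal(0, 0.2, n),
    }
)

# Pearson correlation: a negative value means higher flow goes with
# lower (better) kW/ton.
r = df["cw_flow_m3h"].corr(df["kw_per_ton"])

# Centered 5-sample rolling mean as a simple smoother
# (a stand-in for spline smoothing).
smooth = df.rolling(window=5, center=True, min_periods=1).mean()

print(f"Pearson r between CW flow and kW/ton: {r:.2f}")
```

Because lower kW/ton means better efficiency, "higher flow, better efficiency" shows up as a negative coefficient.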

Summary statistics with moving-average columns:
             oat_C         wb_C          occ         tons    chw_sup_C  \
count  1344.000000  1344.000000  1344.000000  1344.000000  1344.000000   
mean     21.971768    18.469857     0.437281   202.310180     6.497118   
std       5.537674     5.223145     0.236582    72.427236     0.199888   
min       9.528023     7.674085     0.003374    40.000000     5.814391   
25%      17.243689    13.835159     0.298208   147.201633     6.359207   
50%      22.019321    18.588505     0.368380   214.134704     6.508027   
75%      26.550146    23.203806     0.601501   257.748715     6.637654   
max      34.455590    29.209971     0.985438   360.631196     7.078945   

         chw_ret_C     chw_dT_C  chw_flow_m3h     cw_sup_C     cw_ret_C  ...  \
count  1344.000000  1344.000000   1344.000000  1344.000000  1344.000000  ...   
mean     11.698814     5.201697    123.286156    23.223926    27.521415  ...   
std       0.977320     0.958199     53.350509     5.367137     5.622375  ...   
min       9.626562     3.216000     18.801814    10.437609    14.091599  ...   
25%      10.794124     4.256318     82.148731    18.504226    22.558544  ...   
50%      11.957734     5.557457    122.019542    23.301704    27.724563  ...   
75%      12.561173     6.052536    158.086490    27.887926    32.345008  ...   
max      13.841881     7.081364    312.989726    35.000361    40.469019  ...   

        cw_pump_kw     plant_kw   kw_per_ton  plant_kw_per_ton  tower_fan_kw_per_ton  \
count  1344.000000  1344.000000  1344.000000       1344.000000           1344.000000   
mean      8.599194   722.870498     3.331991          3.740465              0.147641   
std       2.602534   211.832906     0.534258          0.623603              0.087723   
min       5.000000   228.816352     2.495624          2.775560              0.032155   
25%       6.388679   549.172073     3.046748          3.399471              0.090503   
50%       8.381974   757.822436     3.191399          3.583410              0.121378   
75%      10.689970   899.344803     3.405071          3.836126              0.170387   
max      16.127471  1041.126945     6.857364          7.590985              0.633392   

       pumping_kw_per_ton  kw_per_ton_15_sma  kw_per_ton_5_sma  kw_per_ton_24_sma  kw_per_ton_240_sma  
count         1344.000000        1330.000000       1340.000000        1344.000000         1344.000000  
mean             0.260833           3.325178          3.329938           3.332719            3.367520  
std              0.064297           0.351797          0.436612           0.325486            0.173593  
min              0.143180           2.936343          2.760202           3.037449            3.125798  
25%              0.203011           3.106304          3.074437           3.105068            3.246344  
50%              0.242365           3.195772          3.199272           3.197062            3.385068  
75%              0.317086           3.375653          3.407882           3.449104            3.413820  
max              0.560377           4.729038          5.944731           4.878574            4.878574  

[8 rows x 27 columns]