When the Dog Shivers
Modeling variables improves daily estimates of gas demand.
holidays and weekends. However, for residential and small commercial customers, the reductions are small: 18,487 Dth/day on holidays, 10,215 on Fridays, 18,019 on Saturdays, and 7,650 on Sundays. Including holidays and weekends in the model reduces the RMSE to 29,679 Dth/day. These reductions apply to gas days, which in the Eastern time zone run from 10 a.m. to 10 a.m. the following day; for example, the Friday gas day ends at 10 a.m. Saturday.
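The gas-day convention can be sketched in a few lines of Python. This is only an illustration of the 10 a.m. to 10 a.m. rule described above; the function name gas_day and the constant names are hypothetical, and timestamps are assumed to be already expressed in Eastern local time.

```python
from datetime import datetime, timedelta

# Assumption from the text: the gas day starts at 10 a.m. Eastern time.
GAS_DAY_START_HOUR = 10
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def gas_day(local_time: datetime) -> str:
    """Return the weekday name of the gas day containing an Eastern-time timestamp."""
    # Shifting back 10 hours maps each 10 a.m.-to-10 a.m. window onto one calendar day.
    shifted = local_time - timedelta(hours=GAS_DAY_START_HOUR)
    return WEEKDAYS[shifted.weekday()]

# January 3, 2009 was a Saturday.
print(gas_day(datetime(2009, 1, 3, 9, 59)))   # Friday
print(gas_day(datetime(2009, 1, 3, 10, 0)))   # Saturday
```

A timestamp of 9:59 a.m. Saturday therefore still belongs to the Friday gas day, which is why weekend effects show up in the Friday coefficient.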
Model With Variables
When the model includes all explanatory variables, the fitted demands, plotted against temperature, are fuzzy (see Figure 9), in contrast to Figure 1, where the fitted demands are crisp. Days with a given temperature have a variety of wind speeds, prior-day temperatures, etc., which produces this scatter. Figure 10 shows the residuals, most of which have a magnitude less than 100,000 Dth/day. The RMSE of 29,679 Dth/day represents a 52-percent reduction from the 61,843 Dth/day RMSE of the original model, which included HDD as the only explanatory variable.
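The 52-percent figure follows directly from the two RMSE values quoted above. A short check, with variable names chosen here for illustration:

```python
# RMSE values as reported in the text, in Dth/day.
rmse_hdd_only = 61_843   # original model, HDD as the only explanatory variable
rmse_full = 29_679       # model with all explanatory variables

reduction = (rmse_hdd_only - rmse_full) / rmse_hdd_only
print(f"{reduction:.1%}")  # 52.0%
```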
Figure 11 shows the values of the coefficients. The values of Base, Tran, the intercept and the sine amplitude have changed from those reported earlier, the result of adding successive variables to the model.
This approach to modeling effectively reduced model error in Columbia Gas of Ohio’s demand curves, reducing the company’s daily balancing and storage capacity requirements.
Author’s Technical Note: Analysts usually use linear regression to determine the best-fit model. However, the Columbia model is non-linear in four of its parameters (Base, Tran, Bet, and Alp), which requires non-linear regression. Columbia used PROC NLIN in the Statistical Analysis System (SAS).
The following example illustrates the non-linearity of Base. Assume a model with Base = 60 degrees F and two groups of customers:
Group 1: Demand = 100,000 + 25,000 * max(0, 60 degrees F - Temp), and
Group 2: Demand = 10,000 + 3,000 * max(0, 60 degrees F - Temp).
If all the coefficients were linear, they would simply add, and the combined group would have demand as follows:
Combined Group: Demand = 110,000 + 28,000 * max(0, 120 degrees F - Temp).
The combined group would indeed have an intercept of 110,000 Dth and a temperature response of 28,000 Dth/degree F, but it would have an HDD Base of 60 degrees F, not 120 degrees F. The model is non-linear in the Base.
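The two-group example can be checked numerically. The sketch below uses the coefficients from the example; the function names are hypothetical and chosen only for this illustration.

```python
def hdd_demand(temp_f, intercept, slope, base_f):
    """Demand in Dth for a simple HDD model: intercept + slope * max(0, Base - Temp)."""
    return intercept + slope * max(0.0, base_f - temp_f)

def group1(temp_f):
    return hdd_demand(temp_f, 100_000, 25_000, 60)

def group2(temp_f):
    return hdd_demand(temp_f, 10_000, 3_000, 60)

def combined_base_60(temp_f):
    # Intercepts and slopes add; the Base stays at 60 degrees F.
    return hdd_demand(temp_f, 110_000, 28_000, 60)

def combined_base_120(temp_f):
    # What one would get if Base also added like a linear coefficient.
    return hdd_demand(temp_f, 110_000, 28_000, 120)

for temp_f in (30, 50, 70):
    actual = group1(temp_f) + group2(temp_f)
    print(temp_f, actual, combined_base_60(temp_f), combined_base_120(temp_f))
```

At every temperature the sum of the two groups matches the Base = 60 version and not the Base = 120 version, which is the sense in which Base cannot be treated as a linear coefficient.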
Columbia used non-linear regression to determine Base and the other non-linear parameters. Alternatively, one could use linear regression with trial and error, trying Base = 65 degrees F, 64 degrees F, 63 degrees F, etc.
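The trial-and-error alternative can be sketched as a grid search: fix a candidate Base, fit the remaining coefficients by ordinary least squares, and keep the Base with the smallest sum of squared errors. The example below is a minimal illustration using NumPy and synthetic data generated from the combined-group coefficients above. It is not Columbia's SAS procedure, it covers only the HDD term rather than the full model, and the noise level and all names are assumptions.

```python
import numpy as np

# Synthetic data (an assumption for illustration): one year of daily mean
# temperatures and demand generated with Base = 60 degrees F plus noise.
rng = np.random.default_rng(0)
temps = rng.uniform(0.0, 90.0, size=365)
demand = (110_000
          + 28_000 * np.maximum(0.0, 60.0 - temps)
          + rng.normal(0.0, 2_000.0, size=temps.size))

def fit_for_base(base_f):
    """Fit intercept and slope by linear least squares for one fixed Base."""
    hdd = np.maximum(0.0, base_f - temps)
    design = np.column_stack([np.ones_like(hdd), hdd])
    coef, *_ = np.linalg.lstsq(design, demand, rcond=None)
    sse = float(np.sum((demand - design @ coef) ** 2))
    return sse, coef

# Try Base = 65, 64, 63, ... down to 55 degrees F and keep the best fit.
results = {int(b): fit_for_base(float(b)) for b in np.arange(65, 54, -1)}
best_base = min(results, key=lambda b: results[b][0])
best_sse, (intercept, slope) = results[best_base]
print(best_base, round(float(intercept)), round(float(slope)))
```

On this synthetic data the search should settle on the 60 degrees F used to generate it. With real data, a finer grid or a non-linear solver would be needed to estimate Base to better than one degree.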
1. G.P. Williams and L.W. Gold, Ground Temperatures, Canadian Building Digest, July 1, 1976.
2. David Meredith, Penn State Institutes of Energy and the Environment, Intro to Building Environmental Systems, Chapter 10.