Thursday, September 14, 2017

A Decision Tree Method for Building Energy Demand Modeling

Summary and Critique by Claude Bingham



Energy consumption has been identified as a major factor in long-term building impact. Additionally, newer buildings have consumed more and more energy over time. To that end, the researchers in this project wanted to construct an accurate predictive model that would be able to estimate future energy use of buildings.

They chose decision tree methodology to create the predictive models. Regression methods were noted to be too complex for users with limited mathematical training; the researchers saw neural networks as 'black boxes,' difficult to replicate for some of the same reasons as regression. Building simulations cannot accurately predict building occupant behavior patterns and therefore can only estimate what a building's energy consumption could be under static, assumed conditions. Decision trees, however, are relatively simple, can handle both numerical and categorical data, and do not require much computation.

Decision trees use a flowchart-like structure to show hierarchy, status, and category of data. In this study, for example, a decision tree depicted the outside temperature, whether a room was occupied, and whether the air conditioning was on as a result of those two factors. Based on the number of recorded occurrences of each possible variable state, energy use can be approximated for an individual room.
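As a rough illustration of that room-level example, the flowchart can be written as nested conditions and the occurrences of each outcome counted. This is only a sketch: the observations, the 28 °C split point, and the 1.2 kW power draw are invented for illustration and do not come from the paper.

```python
from collections import Counter

# Hypothetical hourly observations: (outdoor temperature in deg C, room occupied?)
observations = [
    (31.0, True), (29.5, True), (33.0, False), (24.0, True),
    (22.5, False), (30.5, True), (27.0, False), (32.0, True),
]

TEMP_THRESHOLD_C = 28.0  # assumed split point, not taken from the paper
AC_POWER_KW = 1.2        # assumed air-conditioner power draw


def ac_state(temp_c, occupied):
    """Walk the tree: test temperature first, then occupancy."""
    if temp_c <= TEMP_THRESHOLD_C:
        return "AC_OFF"
    if not occupied:
        return "AC_OFF"
    return "AC_ON"


# Count how often each leaf (air-conditioning state) is reached.
leaf_counts = Counter(ac_state(t, occ) for t, occ in observations)

# Each observation stands for one hour, so energy = hours on * power.
estimated_kwh = leaf_counts["AC_ON"] * AC_POWER_KW
print(leaf_counts, estimated_kwh)
```

The point of the sketch is that the tree only sorts observations into leaves; the energy estimate comes from how many observations land in each leaf.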

To verify the actual ability of such a model to create reliably accurate predictions, the research team used the C4.5 decision tree algorithm with the open-source WEKA data-mining software. This pairing was chosen for its flexibility and its ability to handle multiple types of data. The constructed model was then tested by comparing its predictions against known values. In this research study, the model was constructed to include six categorical variables and four numerical variables based on data collected from 80 buildings in six districts in Japan. The resulting value was set to be either 'HIGH' or 'LOW' energy use intensity.
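C4.5 chooses which variable to split on using the gain ratio: the information gain of a split divided by the entropy of the split itself. The sketch below shows that calculation for categorical attributes only. It is not WEKA's code, and the attribute names (`house_type`, `heating`) and the four sample rows are hypothetical, not data from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` on a categorical `attribute`."""
    total = len(rows)
    # Group the target labels by the value of the attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalizes attributes with many small branches.
    split_info = -sum(len(g) / total * math.log2(len(g) / total)
                      for g in groups.values())
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical buildings labeled with HIGH or LOW energy use intensity.
buildings = [
    {"house_type": "detached", "heating": "electric", "eui": "HIGH"},
    {"house_type": "detached", "heating": "gas", "eui": "HIGH"},
    {"house_type": "apartment", "heating": "electric", "eui": "LOW"},
    {"house_type": "apartment", "heating": "gas", "eui": "LOW"},
]

for name in ("house_type", "heating"):
    print(name, gain_ratio(buildings, name, "eui"))
```

In this toy data `house_type` separates the two classes perfectly while `heating` tells the model nothing, so a C4.5-style learner would place `house_type` at the root.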

The test model was able to correctly predict 92% of expected cases. The researchers noted that the confidence interval was 80%, too low to be consistently reliable, and that the model was misattributing variables at times. This was likely due to the size of the data set and limited variable hierarchy (also tied to data set size and variety).


This research benefited greatly from examining reasons for and against using various methodologies for predictive studies. The experiment was well explained and well executed, with one exception: the sample size for the test data was too small. While the results were reasonably accurate and passably reliable, they suggest that decision trees do not scale down well to smaller data samples. This methodology appears to work well with large data sets, but not smaller ones.


  1. This is an interesting article that exemplifies one of the major weaknesses of decision trees: they do not work well with smaller data sets. This does raise the hypothetical question of whether they can be used with a higher confidence interval on a smaller data set with a vast hierarchy, or vice versa.

    1. I believe so. The smaller you can make the decision steps, the more likely it is the result is accurate. That takes a lot more time and money than is usually efficient though. Also, some things just do not break down evenly into smaller increments.

  2. I agree with your point, Claude. Decision trees are more likely to be accurate when there are fewer decisions and outcomes in a tree. It is unrealistic to plan for all contingencies that may arise as a result of a decision. This could result in an unrealistic decision tree guiding you towards a bad decision. Given that decision trees are primarily based on expectations, I believe decision tree mapping should be a continuous, ongoing method that accounts for any changes in expectations or the external environment. Thus, I also believe decision tree mapping should be used cautiously when trying to forecast events far into the future. Decision trees are likely to be more accurate in the short term, given the complexity of events and our dynamic expectations.
