Demand Forecasting With Multiple Regression Course Notes
William Swart is Professor of Marketing and Supply Chain Management at East Carolina University. He holds a Ph.D. in Operations Research and an M.S. in Industrial and Systems Engineering from the Georgia Institute of Technology and a B.S. in Industrial Engineering with Honors from Clemson University. Dr. Swart's experience is divided between industry and academia. In academia, he served as Provost and Vice Chancellor for Academic Affairs at East Carolina University, Dean of Engineering at New Jersey Institute of Technology and Old Dominion University, Associate Dean of Business and Economics at California State University, and Chairman of the Department of Industrial Engineering and Management Systems at the University of Central Florida. In industry, he served as Vice President for Operations Systems and Vice President for Management Information Systems at Burger King Corporation. As Dean of Engineering and Technology at New Jersey Institute of Technology, he supervised the development of a number of green engineering initiatives, including the establishment of the Multi-lifecycle Engineering Research Center. Dr. Swart remains active as a strategic consultant for industry. His professional achievements have been honored by the Institute of Industrial Engineers with the 1994 Operations Research Practice Award. He was awarded the Achievement in Operations Research Medal from the Institute for Operations Research and the Management Sciences (INFORMS) and has been named an Edelman Laureate for twice having been a finalist in the prestigious Edelman Competition for the best Operations Research application in the world. Professor Swart has been professionally active in Latin America, Europe, Asia, and the Middle East and is fluent in four languages. He has over 100 publications and has been Principal Investigator for grants and contracts in excess of $10 million.
Course Outline
An accurate forecast of future demand is an absolute requirement for planning production without creating wasteful overages or shortages and hence constitutes a cornerstone of successful green engineering. This module introduces multiple regression from a user's perspective and shows step by step how you can create a statistically robust forecasting formula based on variables that you believe play a role in determining the demand for your product. The appendix of the module shows, hands-on and step by step, how someone with limited statistical and spreadsheet knowledge can implement the steps shown using Microsoft Excel and end up with a working forecasting system.
Course Transcript
Background
We developed this module partly because a recent eyefortransport survey indicated that forecasting is an area where most respondents still feel that they have room for improvement. Only 22% of retail and consumer product supply chain executives rated their forecasting capabilities as either good or excellent, while 30% rated their forecasting as less than satisfactory or very poor. Put another way, 78% of respondents would not rate their forecasting capabilities as anything better than merely satisfactory.
Module Rationale
Consequently, the rationale for this module being part of the IEEE green production management series is that an accurate prediction of future demand is a requirement in order to plan production without creating wasteful overages or shortages and hence constitutes a cornerstone of successful green engineering.
Learning Objectives
The learning objective of this module is that you will learn to use multiple regression for forecasting. As part of that, you will learn to postulate causal (independent) variables, to use the output from multiple regression software products to develop a robust forecasting model and, finally, to make point and interval forecasts.
that were produced by, let's say, different approaches, then we apply various measures to the forecasts to quantify the goodness of each. Of course, we would then want to select the best.
Measure of Goodness
The most common measure of forecast goodness is the sum of the squared differences between the actual and forecasted demand (the residuals) over the test period, as in the example shown. Finding the best forecast is thus a mathematical optimization problem. For causal forecasting, minimizing the sum of squares is accomplished via regression. For time series forecasting, other measures of goodness and methods of optimization may be used.
Example 1a
Here we have an example. Suppose that we have data over 19 time periods. If we wanted to see which of two methods is the better forecasting method, we would first select a test period. The test period should encompass a number of periods, typically the number of periods ahead that you would want to forecast. Then we would pretend that we did not have the actual demand for the test period and use the first 14 periods, in this case, to come up with a forecast for periods 15, 16, 17, 18 and 19. If we had two forecasting methods, then let's say that these are the results that we would have.
Example 1b
These results we could show on a graph and we can see that the blue line is the actual data; we can see that the red line is the forecast obtained by method one and the green line is the forecast obtained from method two during this particular test period.
Example 1c
In order to determine whether method one or method two was the better forecasting method, we would quantify the difference between the forecasts obtained by each method and the actual demand. So we would calculate the method one residual for period 15, which is the difference between the demand in period 15 and the method one forecast for period 15. We would also calculate the method two residual for period 15, which is the demand in period 15 minus the method two forecast for period 15. Then we would square those residuals and repeat the calculation for each of the periods in our test period, which runs from period 15 through 19. We would then add those squared residuals. We would find that method one gives us a sum of squared residuals of 26,091.66. Method two would give us a sum of squared residuals of 132,302.95.
Consequently, we would draw the conclusion that method one is the best forecasting method for the data in this example.
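The comparison just described can be sketched in a few lines of Python. The demand and forecast figures below are hypothetical placeholders, not the values from the course example; only the procedure (compute each residual, square it, add the squares, prefer the smaller total) follows the transcript.

```python
def sum_squared_residuals(actual, forecast):
    """Sum of squared differences between actual and forecasted demand."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast))


# Hypothetical test-period data for periods 15-19 (not from the course slides).
actual = [520.0, 540.0, 575.0, 560.0, 600.0]
method_one = [510.0, 548.0, 570.0, 566.0, 590.0]
method_two = [480.0, 500.0, 610.0, 520.0, 650.0]

sse_one = sum_squared_residuals(actual, method_one)
sse_two = sum_squared_residuals(actual, method_two)

# The method with the smaller sum of squared residuals is preferred.
best = "method one" if sse_one < sse_two else "method two"
print(f"SSE method one: {sse_one:.2f}")
print(f"SSE method two: {sse_two:.2f}")
print(f"Better on this test period: {best}")
```

Squaring the residuals keeps positive and negative errors from cancelling and penalizes large misses more heavily than small ones.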
Multiple Regression
We next focus our attention on the main topic of this module, multiple regression. Multiple regression requires a sample of data that includes the values of a dependent variable, which we'll always refer to as Y, and a number of corresponding independent variables, X1, X2 through XM, where M is an arbitrary number. M could be as small as one; if you have only one independent variable, then you have what is mostly referred to as simple regression. If you have more than one independent variable, then you have what is normally referred to as multiple regression. The independent variables, incidentally, are variables that you have reason to believe would impact the value of your dependent variable Y. Once you have selected your independent variables and the dependent variable, you postulate that you will be able to obtain an equation by which to forecast the value of Y. The forecasted value of Y is usually not the same as the actual value of Y, and consequently we refer to the forecasted value of Y as Y*. We hypothesize that Y* is equal to an intercept, which we call $\beta_0$, plus a slope for each of the independent variables multiplied by that independent variable:

$$Y^* = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_M X_M$$

The $\beta$s (betas) are what we refer to as the regression coefficients. Because we picked Y and the Xs, we have values for those from our data. The unknowns in the equation are the $\beta$s, and we are going to utilize multiple regression to find the values of $\beta_0, \beta_1, \beta_2, \ldots, \beta_M$ that give us values for Y* that minimize the sum of the squared residuals. In other words, we want to minimize the sum of $(Y - Y^*)^2$ over all observations. In multiple regression, the test period is not a small period; we utilize the entire data set as the test period.
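As a rough illustration of what regression software does internally, the sketch below finds the coefficients that minimize the sum of squared residuals by solving the normal equations with the standard library only. The function name `fit_least_squares` and the data are hypothetical; a real project would more likely rely on a statistics package or a spreadsheet, as the appendix does.

```python
def fit_least_squares(rows, y, intercept=True):
    """Return coefficients minimizing the sum of squared residuals.

    rows: list of [x1, x2, ..., xM] observations; y: list of actual values.
    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    """
    X = [([1.0] if intercept else []) + [float(v) for v in r] for r in rows]
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    for col in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical data generated exactly from Y = 2 + 3*X1 + 0.5*X2.
rows = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8], [6, 4]]
y = [6.0, 8.5, 13.5, 15.5, 21.0, 22.0]

b0, b1, b2 = fit_least_squares(rows, y)
print(f"Y* = {b0:.3f} + {b1:.3f}*X1 + {b2:.3f}*X2")
```

Passing `intercept=False` fits the same model with the intercept forced to zero, which is the variant needed when the intercept is removed from the model.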
A Numerical Example
To illustrate how one goes about doing multiple regression, we are going to take a numerical example, which is a modification of an example given in the book by Makridakis and Wheelwright that is listed in the references. In our example, we have 14 years of data and we are trying to predict the sales for a company that we call the Carolina Plate Glass Company. This is a fictitious company, but we assume that its executives have gotten together and found that their principal customers are the automobile producers and the builders. They therefore have reason to believe that if they know the production of automobiles for a particular year and the building contracts awarded in that year, then they should be able to predict their sales fairly well.
which data is entered and the types of statistical analysis they provide above and beyond basic multiple regression. They differ in the amount and type of output analysis, including graphics, that they provide. However, all multiple regression software packages provide similar basic multiple regression output, and we'll refer to that as BMR, basic multiple regression output.
BMR Output
Now, the basic multiple regression output that is provided by all software packages encompasses regression statistics; something that is typically referred to as ANOVA, or analysis of variance, output; residual output, in other words information regarding the actual values and the predicted values; and the correlation matrix.
percentage of the variability. Again, this is no reason to stop, we simply continue understanding that we just wish that we could do better.
Applying Test 1
Applying the test is very simple. We go to the first part of the output, the regression statistics, look for where it says adjusted R², and look at that number. In our case that number is greater than 0.6. Consequently, we say that test one is okay. We always recommend typing the result of each test somewhere in the regression output.
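For readers who want to see where the adjusted R² number comes from, here is a minimal sketch. It assumes the common textbook definition, adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1) for a model with an intercept and k independent variables; the transcript itself only reads the value from the software output. The data are hypothetical, and the 0.6 threshold is the one used in this module.

```python
def adjusted_r_squared(actual, predicted, num_independent_vars):
    """Adjusted R-squared for a model with an intercept (textbook formula)."""
    n = len(actual)
    mean_y = sum(actual) / n
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained
    sst = sum((a - mean_y) ** 2 for a in actual)                # total
    r_squared = 1 - sse / sst
    return 1 - (1 - r_squared) * (n - 1) / (n - num_independent_vars - 1)


# Hypothetical actual and predicted values (placeholders, not course data).
actual = [10, 12, 15, 19, 24, 30]
predicted = [11, 12, 14, 20, 23, 30]

adj_r2 = adjusted_r_squared(actual, predicted, num_independent_vars=2)
test_one_ok = adj_r2 > 0.6
print(f"Adjusted R^2 = {adj_r2:.4f}; test one {'okay' if test_one_ok else 'not okay'}")
```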
Applying Test 2
Applying test two, just like applying test one is a simple process. We now go to the ANOVA, the analysis of variance tables that are given to us. I have marked in red where the F statistic is located. In our case, the F statistic has a value of 60.41, which definitely is greater than five. Consequently, we pronounce test two as being satisfied.
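The F statistic reported in the ANOVA table can be reproduced by hand. The sketch below assumes the usual definition, F = (SSR / k) / (SSE / (n - k - 1)), for a least-squares model with an intercept and k independent variables; this formula is not spelled out in the transcript. The data are hypothetical stand-ins for fitted values, and the threshold of five is the one used in this module.

```python
def f_statistic(actual, predicted, num_independent_vars):
    """F statistic: explained variance per variable over residual variance."""
    n = len(actual)
    k = num_independent_vars
    mean_y = sum(actual) / n
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual
    sst = sum((a - mean_y) ** 2 for a in actual)                # total
    ssr = sst - sse                                             # regression
    return (ssr / k) / (sse / (n - k - 1))


# Hypothetical actual and predicted values (placeholders, not course data).
actual = [10, 12, 15, 19, 24, 30]
predicted = [11, 12, 14, 20, 23, 30]

f_value = f_statistic(actual, predicted, num_independent_vars=2)
test_two_ok = f_value > 5
print(f"F = {f_value:.2f}; test two {'satisfied' if test_two_ok else 'not satisfied'}")
```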
coefficient is greater than 0.70. If two variables previously thought to be independent are correlated, then one of the two is not only redundant in predicting the dependent variable, but it also compounds any error associated with those variables. Test 3 is passed if ALL the correlation coefficients between the INDEPENDENT variables have an absolute value equal to or less than 0.70. If there is only one pair of independent variables whose correlation coefficient has an absolute value greater than 0.70, then we discard from the regression model the independent variable in that pair that has the smaller correlation coefficient, in absolute value, with the dependent variable.
Applying Test 3
For our example, your correlation output would be what you see at the top left-hand corner of this particular spreadsheet. Below that, we explain the meaning of all of that information, because the key point is that the test involves checking the correlation coefficients between independent variables. We only have two independent variables, X1 and X2, and the correlation coefficient between X1 and X2 is given to us in green. In this case it's 0.030414. The other information is needed only if we have to remove a variable, which we don't in this particular case. You see here that the red number is the correlation coefficient between variable X1, automobile production, and sales, and the black number is the correlation coefficient between building contracts awarded and sales. It is important to remember that when we want to use independent variables to predict a dependent variable, it is good to have high correlation between each independent variable and the dependent variable, but not between the independent variables.
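The rule for test three can be sketched as follows. The helper names (`pearson`, `correlated_pairs`, `variable_to_drop`) and all data are hypothetical; in particular, `housing_starts` is an invented third variable added only so that one pair exceeds the 0.70 threshold and the discard rule has something to act on.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlated_pairs(independent, threshold=0.70):
    """Pairs of independent variables with |r| above the threshold."""
    flagged = []
    for a, b in combinations(independent, 2):
        if abs(pearson(independent[a], independent[b])) > threshold:
            flagged.append((a, b))
    return flagged


def variable_to_drop(pair, independent, dependent):
    """Of a flagged pair, the variable least correlated with the dependent one."""
    return min(pair, key=lambda name: abs(pearson(independent[name], dependent)))


# Hypothetical data (placeholders, not the course data set).
independent = {
    "automobile_production": [1, 2, 3, 4, 5, 6],
    "building_contracts": [2, 1, 5, 3, 8, 4],
    "housing_starts": [2.1, 0.9, 5.2, 3.1, 7.9, 4.2],
}
sales = [6.0, 8.5, 13.5, 15.5, 21.0, 22.0]

flagged = correlated_pairs(independent)
for pair in flagged:
    print(pair, "-> drop", variable_to_drop(pair, independent, sales))
```

An empty `flagged` list corresponds to test three being passed.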
Applying Test 4
In applying test four, remember that in our example we already applied test one, test two and test three, and they were all satisfied. Now we are applying test four. The T statistic is indicated in the column in red. We find that there's a T statistic that is less than two in absolute value, but it's associated with the intercept. In other words, the intercept is not significantly different from zero and, of course, if something is not significantly different from zero, then we might as well call it zero. Test four is not okay because of that. Consequently, we must remove the intercept from the model. Note that there is no independent variable associated with the intercept; nevertheless, we must remove the intercept.
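A minimal sketch of the check in test four follows. It assumes the T statistic is the coefficient divided by its standard error, which is how regression output normally reports it; the coefficient and standard error values are hypothetical placeholders, not the numbers from the course output.

```python
def insignificant_terms(coefficient_table, threshold=2.0):
    """Names of terms whose T statistic is below the threshold in absolute value."""
    return [name
            for name, (coefficient, std_error) in coefficient_table.items()
            if abs(coefficient / std_error) < threshold]


# Hypothetical (coefficient, standard error) pairs for each term.
coefficient_table = {
    "Intercept": (20.0, 100.0),
    "X1 automobile production": (36.0, 10.0),
    "X2 building contracts": (11.0, 1.0),
}

to_remove = insignificant_terms(coefficient_table)
print("Test four okay" if not to_remove else f"Remove and rerun: {to_remove}")
```

After removing a term, the regression has to be run again, since the remaining coefficients change once a term is dropped.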
residuals, which is given as part of the regression output, should appear to be Gaussian. If it does not, then it means that there are other independent variables that have not been identified or that the data exhibits other correlation. When this happens, you may want to consult someone who has considerable statistical expertise regarding how to resolve this problem. However, if all of the other tests are passed and you are stuck and can't go anywhere else, then we suggest that you proceed with caution, understanding that there is still something out there that is not random and that is impacting the value of your sales.
Applying Test 5
What we have found here is that the distribution of residuals almost seems to follow a Gaussian (bell-shaped, or normal) distribution, but it seems to be bimodal. The second class is higher than the classes before and after it. A bimodal distribution is one that we are always concerned about, because it's not bell-shaped, and it is usually an indication that you may have some other correlation, something else in play. To remove that, so that all your residuals are random and not due to some other cause, you may want to consult someone who has greater statistical expertise, but we don't have that. We look at this and say, well, we know there's something going on here, but we'll just go on and proceed with our analysis.
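The visual check in test five can be approximated in code by binning the residuals into equal-width classes and counting how many classes stand higher than their neighbors. This is only a rough heuristic sketch, not a formal normality test, and the residual values below are hypothetical; five classes are used to match the example.

```python
def histogram_counts(residuals, num_classes=5):
    """Count residuals in equal-width classes spanning min to max."""
    low, high = min(residuals), max(residuals)
    width = (high - low) / num_classes
    counts = [0] * num_classes
    for r in residuals:
        # The maximum value falls on the last upper limit, so clamp the index.
        index = min(int((r - low) / width), num_classes - 1)
        counts[index] += 1
    return counts


def count_peaks(counts):
    """Number of classes strictly higher than each adjacent class."""
    peaks = 0
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else -1
        right = counts[i + 1] if i < len(counts) - 1 else -1
        if c > left and c > right:
            peaks += 1
    return peaks


# Hypothetical residuals for 14 periods (placeholders, not course data).
residuals = [-50, -28, -22, -18, -12, -4, 6, 12, 15, 20, 24, 28, 35, 50]

counts = histogram_counts(residuals)
peaks = count_peaks(counts)
print(counts, "bell-shaped" if peaks == 1 else "more than one peak: investigate")
```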
Making a Forecast
To make a forecast for a future period, we have to remember that we hypothesized a relationship that says that our forecasted value is an intercept, plus the slope for automobile production times automobile production, plus the slope for building contracts awarded times building contracts awarded. We've indicated before that the best values for $\beta_0$, $\beta_1$, and $\beta_2$ are obtained by regression, so what we do is look at our last regression output, which in our case was the "regression, no INT" (no intercept) output.
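Once the coefficients are read from the final regression output, a point forecast is simple arithmetic. In the sketch below the coefficients and the anticipated values of the independent variables are hypothetical placeholders; the intercept defaults to zero to mirror the no-intercept model that the example ends up with.

```python
def point_forecast(coefficients, values, intercept=0.0):
    """Y* = intercept + sum of each coefficient times its independent variable."""
    return intercept + sum(b * x for b, x in zip(coefficients, values))


# Hypothetical slopes for automobile production and building contracts awarded.
coefficients = [36.0, 11.0]
# Hypothetical anticipated values of X1 and X2 for the year being forecast.
next_year = [7.5, 30.0]

forecast = point_forecast(coefficients, next_year)
print(f"Forecasted sales: {forecast:.1f}")
```

Note that the forecast is only as good as the anticipated values of the independent variables supplied for the future period.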
Appendix
This module provides an appendix containing instructions for doing everything that we have shown in this module using the multiple regression tool in the Excel data analysis add-in. This is not a recommendation that you should choose this specific tool for multiple regression. There are many fine products, but we thought it would be useful to illustrate what might be involved in regression by providing this appendix.
class number one we have an upper limit of -56 and some, our second class we have an upper limit of -29 and some, all the way up to the five intervals or classes that we wanted.
Then from the Excel Insert tab we click Line and then we click one of the options; I usually click the first option. Doing this creates the actual versus predicted Y graph. The purpose of this graph is to give a visual of how the estimates for Y obtained from the regression compare to the actual values.
Course Summary
In this tutorial we reviewed the importance of demand forecasting and the steps involved in using multiple regression to create a demand forecast. We used the data analysis add-in for Microsoft Excel to demonstrate how to create a demand forecast using multiple regression.
Glossary
Residual
The difference between actual demand and forecasted demand
Sum of Squares
The sum of the squared residuals over all data periods. It is the measure of forecast goodness in multiple regression.
Multiple Regression
A regression technique that uses more than one independent variable to obtain an equation for forecasting a dependent variable, with the coefficients chosen to minimize the sum of squared residuals.
Regression Coefficients
The coefficients associated with each term of a regression model, obtained from the regression analysis so as to minimize the sum of squared residuals.
Robust Model
A regression model that is statistically significant (passes all 5 tests for robustness)
Adjusted R Square
Indicates the percentage of variability of the dependent variable explained by the regression model.
Multicollinearity
The existence of significant correlation between previously hypothesized independent variables.
T test
A test used in multiple regression to determine if the regression coefficients are significantly different from zero.
Expected forecast
The value of the dependent variable as computed from the regression equation.
References
M.S. Nixon, T. Tan, R. Chellappa, "Human Identification Based on Gait," Springer, 2006, ISBN 978-0-387-24424-2.
M.S. Nixon, J.N. Carter, "Automatic Recognition by Gait," Proceedings of the IEEE, vol. 94, no. 11, pp. 2013-2024, Nov. 2006.
A. Kale, A. Sundaresan, A.N. Rajagopalan, N.P. Cuntoor, A.K. Roy-Chowdhury, V. Kruger, R. Chellappa, "Identification of humans using gait," IEEE Transactions on Image Processing, vol. 13, no. 9, pp. 1163-1173, Sept. 2004.
J. Han, B. Bhanu, "Individual recognition using gait energy image," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 2, pp. 316-322, Feb. 20.
S. Sarkar, Z. Liu, "Gait Recognition," in Handbook of Biometrics, Springer, 2008.
Z. Liu, S. Sarkar, "Improved Gait Recognition by Gait Dynamics Normalization," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 6, pp. 863-876, June 2006.
Z. Liu, S. Sarkar, "Effect of Silhouette Quality on Hard Problems in Gait Recognition," IEEE Transactions on Systems, Man, and Cybernetics-Part B, vol. 35, no. 2, pp. 170-183, Apr. 2005.
S. Sarkar, P. Jonathon Phillips, Z. Liu, I. Robledo, P. Grother, K. Bowyer, "The Human ID Gait Challenge Problem: Data Sets, Performance, and Analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 2, pp. 162-177, Feb. 2005.
H. Vajaria, T. Islam, P. Mohanty, S. Sarkar, R. Sankar, R. Kasturi, "Evaluation and analysis of a face and voice outdoor multi-biometric system," Pattern Recognition Letters, vol. 28, no. 12, pp. 1572-1580, Sept. 2007.
Z. Liu, S. Sarkar, "Outdoor recognition at a distance by fusing gait and face," Image and Vision Computing, vol. 25, no. 6, pp. 817-832, June 2007.