Under severe multicollinearity, OLS gives unreliable estimates. Among the more recent methods of estimation under severe multicollinearity are the Maximum Entropy Leuven (MEL) estimators. The details of the method are available in
"Estimation under Multicollinearity: Application of Restricted Liu and Maximum Entropy Estimators to the Portland Cement Dataset" and
"Multicollinearity and Modular Maximum Entropy Leuven Estimator".
Given data on y (the dependent variable) and X = (x1, x2, ..., xm), and the regression model y = a1x1 + a2x2 + ... + amxm + u, one may use the FORTRAN program RMEL.FOR. The program may be typed into a plain-text file (rmel.for) using a text editor such as Microsoft's EDIT.COM. The fixed-form rules of FORTRAN must be followed. A C (comment) marker always goes in the first column. Statement numbers must lie within columns 1 to 5. The sixth column is reserved for marking the continuation of the previous statement and is otherwise left blank. Statements occupy columns 7 to 72. If a statement is too long to end on or before column 72, it must be continued on the next line with a * or a one-digit nonzero number in the sixth column.
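The column rules above can be checked mechanically before compiling. The following is a minimal sketch in Python, not part of RMEL.FOR; the function name `classify_fixed_form_line` and its return labels are hypothetical, chosen only for illustration.

```python
def classify_fixed_form_line(line: str) -> str:
    """Classify one line of fixed-form FORTRAN source by its column layout."""
    line = line.rstrip("\n")
    if not line.strip():
        return "blank"
    # Column 1: C marks a comment (FORTRAN 77 also accepts * here).
    if line[0] in "Cc*":
        return "comment"
    # Text past column 72 is traditionally ignored by the compiler,
    # so it is flagged here rather than silently lost.
    if len(line) > 72:
        return "overflow"
    # Columns 1 to 5: blank, or a numeric statement label.
    label = line[:5].strip()
    if label and not label.isdigit():
        return "bad label"
    # Column 6: any character other than blank or zero marks a continuation.
    if line[5:6] not in ("", " ", "0"):
        return "continuation"
    return "statement"


sample = [
    "C     COMPUTE THE FITTED VALUE",
    "      YHAT = A(1)*X(1) + A(2)*X(2)",
    "     *     + A(3)*X(3)",
    "   10 CONTINUE",
]
for source_line in sample:
    print(classify_fixed_form_line(source_line), "|", source_line)
```

A line reported as "overflow" should be split and continued on the next line, as described above.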
The FORTRAN source code of the program may be downloaded from the website (click here) or, alternatively, from another website (click here). This program carries out the optimization by the random walk method. Another program, which optimizes the maximum entropy objective function by the Differential Evolution method of global optimization, may also be downloaded (click here for download).
The program may be compiled with a suitable FORTRAN compiler (e.g. the Microsoft Fortran compiler or the free FORCE Fortran 77 compiler). Compilation yields RMEL.EXE.
How to run the program (RMEL.EXE):
- Store your data in a file (to be used as the input file), say DATS, using EDIT.COM or any other editor that saves plain ASCII text.
- The data must be laid out in the following manner:
y    | X1   | X2   | X3   | ... | Xm
21.7 | 12.5 | 3.9  | 3.8  | ... | 1.0
21.2 | 11.8 | 13.7 | 1.7  | ... | 1.0
36.9 | 5.1  | 17.7 | 6.5  | ... | 1.0
...  | ...  | ...  | ...  | ... | ...
38.0 | 3.0  | 9.1  | 11.2 | ... | 1.0
In the table above, the first column is for y, the second for X1, and so on; the last column (all ones) is for the intercept. If your model has no intercept, omit the last (unit) column.
M is the number of independent variables, 4 in the above example (including the last column). More precisely, M is the number of parameters (regression coefficients, including the intercept if any) to be estimated.
N is the number of observations in the sample.
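The layout described above can also be produced by a short script instead of a text editor. The sketch below is in Python and is not part of RMEL; the function name `write_rmel_data` is hypothetical, and the use of blank-separated values with four decimals is an assumption, since the exact delimiter the program expects is not stated here. It uses only the four rows shown in the layout above.

```python
import os
import tempfile


def write_rmel_data(path, y, x_columns, intercept=True):
    """Write y and the regressors as one row per observation; return (N, M)."""
    n = len(y)
    if any(len(col) != n for col in x_columns):
        raise ValueError("every regressor needs one value per observation")
    rows = []
    for i in range(n):
        row = [y[i]] + [col[i] for col in x_columns]
        if intercept:
            row.append(1.0)  # unit column for the intercept, placed last
        # Assumption: blank-separated values are acceptable to the program.
        rows.append(" ".join(f"{value:.4f}" for value in row))
    with open(path, "w", encoding="ascii") as fh:
        fh.write("\n".join(rows) + "\n")
    # M counts the parameters to estimate, including the intercept if present.
    m = len(x_columns) + (1 if intercept else 0)
    return n, m


# Only the rows printed in the layout above; the elided rows are not reproduced.
y = [21.7, 21.2, 36.9, 38.0]
x_columns = [
    [12.5, 11.8, 5.1, 3.0],   # X1
    [3.9, 13.7, 17.7, 9.1],   # X2
    [3.8, 1.7, 6.5, 11.2],    # X3
]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "DATS")
    n, m = write_rmel_data(path, y, x_columns)
    with open(path, encoding="ascii") as fh:
        first_line = fh.readline().rstrip("\n")

print("N =", n, "M =", m)
print(first_line)
```

The returned pair is what the program later asks for as N and M; passing `intercept=False` leaves out the unit column and reduces M by one.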
- Run the program by typing RMEL (in DOS mode) and pressing the Enter key.
- The program will ask for the input file name. Type the file name within single quotes, e.g. 'DATS', and press the Enter key.
- The program will ask for the name of the output file in which the results will be stored. Type the file name within single quotes, e.g. 'RESULTS', and press the Enter key.
- The program will ask for a metric value (1 or 2), which governs the normalization of the residuals. If you suspect that the errors in Y contain large outliers, a metric value of 1 is the better choice. Otherwise a metric value of 2 is adequate.
- The program will ask for a second metric value. If you enter 1, it will use the absolute norm for the regression parameters; this gives the MMEL estimator of Mishra. If you enter 2, it will use the Euclidean norm; this gives the MEL estimator of Paris. MMEL performs better than the MEL estimator.
- The program will ask for the MEL1 value, which is either 0 or 1. A choice of 0 maximizes entropy in the regression parameters only. A choice of 1 maximizes entropy in the coefficients as well as the residuals. You may try both and compare the results.
- The program will ask for the number of observations (N) and the number of variables (M).
- The program will ask for the seed for random numbers and the number of iterations. For the seed, an odd five-digit value such as 23411 or 11211 may be given. The value must not exceed 32001. The number of iterations may be somewhere between 30 and 100.
- Results are stored in the output file named by you. Open the file for reading the results.
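The answers to the prompts above can be collected and checked before the run. The sketch below is in Python and is not part of RMEL; the function name `build_rmel_answers` and its parameter names are hypothetical. It lists the answers in the order of the prompts described above, one value per line, which is an assumption about how the program reads them.

```python
def build_rmel_answers(input_file, output_file, residual_metric,
                       parameter_metric, mel1, n_obs, n_params,
                       seed, iterations):
    """Return the prompt answers, in order, after basic range checks."""
    if residual_metric not in (1, 2):
        raise ValueError("residual metric must be 1 or 2")
    if parameter_metric not in (1, 2):
        raise ValueError("parameter metric must be 1 (MMEL) or 2 (MEL)")
    if mel1 not in (0, 1):
        raise ValueError("MEL1 must be 0 or 1")
    if n_obs < 1 or n_params < 1:
        raise ValueError("N and M must be positive")
    # The text recommends an odd seed and requires it not to exceed 32001.
    if seed < 1 or seed > 32001 or seed % 2 == 0:
        raise ValueError("seed must be odd and must not exceed 32001")
    if iterations < 1:
        raise ValueError("number of iterations must be positive")
    return [
        f"'{input_file}'",    # file names go within single quotes
        f"'{output_file}'",
        str(residual_metric),
        str(parameter_metric),
        str(mel1),
        str(n_obs),
        str(n_params),
        str(seed),
        str(iterations),
    ]


# Illustrative values only: N = 20 is hypothetical, M = 4 as in the example.
answers = build_rmel_answers("DATS", "RESULTS", 2, 1, 1, 20, 4, 23411, 50)
print("\n".join(answers))
```

If the compiled program reads its prompts from standard input, which is an assumption, these lines could be saved to a text file and redirected into RMEL rather than typed by hand.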
Reference:
Krishnamurthy, E. V. & Sen, S. K. (1976). Computer-Based Numerical Algorithms. Affiliated East-West Press, New Delhi.