public class MillerUpdatingRegression extends Object implements UpdatingMultipleLinearRegression
This class is a concrete implementation of the UpdatingMultipleLinearRegression interface.
The algorithm is described in:
Algorithm AS 274: Least Squares Routines to Supplement Those of Gentleman
Author(s): Alan J. Miller
Source: Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 41, No. 2 (1992), pp. 458-478
Published by: Blackwell Publishing for the Royal Statistical Society
Stable URL: http://www.jstor.org/stable/2347583
This method for multiple regression forms the solution to the OLS problem by updating the QR decomposition as described by Gentleman.
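To make the idea of updating a QR decomposition one observation at a time concrete, here is a minimal, self-contained sketch. It is not the code of this class: the name IncrementalQrSketch and all of its members are hypothetical, and it uses ordinary Givens rotations rather than the square-root-free variant associated with Gentleman and AS 274. The intercept is assumed to be supplied by the caller as a leading 1.0 in each row.

```java
import java.util.Arrays;

/** Illustrative only: incremental least squares via row-wise Givens rotations. */
class IncrementalQrSketch {
    private final int p;         // number of columns; pass a leading 1.0 per row for an intercept
    private final double[][] r;  // upper-triangular factor R
    private final double[] qty;  // first p entries of Q^T y
    private double rss;          // residual sum of squares accumulated so far

    IncrementalQrSketch(int p) {
        this.p = p;
        this.r = new double[p][p];
        this.qty = new double[p];
    }

    /** Folds one observation into R and Q^T y; the row itself is never stored. */
    void add(double[] row, double y) {
        double[] x = Arrays.copyOf(row, p); // working copy, rotated away column by column
        for (int i = 0; i < p; i++) {
            if (x[i] == 0.0) {
                continue; // nothing to eliminate in this column
            }
            double h = Math.hypot(r[i][i], x[i]);
            double c = r[i][i] / h;
            double s = x[i] / h;
            r[i][i] = h;
            for (int j = i + 1; j < p; j++) {
                double t = r[i][j];
                r[i][j] = c * t + s * x[j];
                x[j] = c * x[j] - s * t;
            }
            double q = qty[i];
            qty[i] = c * q + s * y;
            y = c * y - s * q;
        }
        rss += y * y; // whatever remains of y after the rotations is residual
    }

    /** Solves R * beta = Q^T y by back-substitution; assumes R has full rank. */
    double[] coefficients() {
        double[] beta = new double[p];
        for (int i = p - 1; i >= 0; i--) {
            double sum = qty[i];
            for (int j = i + 1; j < p; j++) {
                sum -= r[i][j] * beta[j];
            }
            beta[i] = sum / r[i][i];
        }
        return beta;
    }

    double residualSumOfSquares() {
        return rss;
    }

    public static void main(String[] args) {
        IncrementalQrSketch ols = new IncrementalQrSketch(2);
        for (double x : new double[] {0, 1, 2, 3}) {
            ols.add(new double[] {1.0, x}, 1.0 + 2.0 * x); // points on the line y = 1 + 2x
        }
        System.out.println(Arrays.toString(ols.coefficients()));
    }
}
```

Because only R, Q^T y and the running residual sum of squares are kept, memory use depends on the number of regressors and not on the number of observations, which is the practical appeal of an updating scheme.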
Constructor and Description |
---|
MillerUpdatingRegression(int numberOfVariables, boolean includeConstant) Primary constructor for the MillerUpdatingRegression. |
MillerUpdatingRegression(int numberOfVariables, boolean includeConstant, double errorTolerance) This is the augmented constructor for the MillerUpdatingRegression class. |
Modifier and Type | Method and Description |
---|---|
void | addObservation(double[] x, double y) Adds an observation to the regression model. |
void | addObservations(double[][] x, double[] y) Adds multiple observations to the model. |
void | clear() As the name suggests, clear wipes the internals and reorders everything in the canonical order. |
double | getDiagonalOfHatMatrix(double[] row_data) Gets the diagonal of the hat matrix, also known as the leverage matrix. |
long | getN() Gets the number of observations added to the regression model. |
int[] | getOrderOfRegressors() Gets the order of the regressors, useful if some type of reordering has been called. |
double[] | getPartialCorrelations(int in) In the original algorithm only the partial correlations of the regressors are returned to the user. |
boolean | hasIntercept() A getter method which determines whether a constant is included. |
RegressionResults | regress() Conducts a regression on the data in the model, using all regressors. |
RegressionResults | regress(int numberOfRegressors) Conducts a regression on the data in the model, using a subset of regressors. |
RegressionResults | regress(int[] variablesToInclude) Conducts a regression on the data in the model, using the regressors in the array. Calling this method will change the internal order of the regressors, and care is required in interpreting the hat matrix. |
public MillerUpdatingRegression(int numberOfVariables, boolean includeConstant, double errorTolerance) throws ModelSpecificationException
Parameters:
numberOfVariables - number of regressors to expect, not including constant
includeConstant - include a constant automatically
errorTolerance - zero tolerance, how machine zero is determined
Throws:
ModelSpecificationException - if numberOfVariables is less than 1
public MillerUpdatingRegression(int numberOfVariables, boolean includeConstant) throws ModelSpecificationException
Parameters:
numberOfVariables - maximum number of potential regressors
includeConstant - include a constant automatically
Throws:
ModelSpecificationException - if numberOfVariables is less than 1
public boolean hasIntercept()
Specified by:
hasIntercept in interface UpdatingMultipleLinearRegression
public long getN()
Specified by:
getN in interface UpdatingMultipleLinearRegression
public void addObservation(double[] x, double y) throws ModelSpecificationException
Specified by:
addObservation in interface UpdatingMultipleLinearRegression
Parameters:
x - the array with regressor values
y - the value of dependent variable given these regressors
Throws:
ModelSpecificationException - if the length of x does not equal the number of independent variables in the model
public void addObservations(double[][] x, double[] y) throws ModelSpecificationException
Specified by:
addObservations in interface UpdatingMultipleLinearRegression
Parameters:
x - observations on the regressors
y - observations on the regressand
Throws:
ModelSpecificationException - if x is not rectangular, does not match the length of y or does not contain sufficient data to estimate the model
public void clear()
Specified by:
clear in interface UpdatingMultipleLinearRegression
public double[] getPartialCorrelations(int in)
corr = {
  corrxx - lower triangular
  corrxy - bottom row of the matrix
}
Replaces subroutines PCORR and COR of: ALGORITHM AS274 APPL. STATIST. (1992) VOL.41, NO. 2
Calculate partial correlations after the variables in rows 1, 2, ..., IN have been forced into the regression. If IN = 1, and the first row of R represents a constant in the model, then the usual simple correlations are returned.
If IN = 0, the value returned in array CORMAT for the correlation of variables Xi & Xj is:
sum ( Xi.Xj ) / Sqrt ( sum (Xi^2) . sum (Xj^2) )
On return, array CORMAT contains the upper triangle of the matrix of partial correlations stored by rows, excluding the 1's on the diagonal. e.g. if IN = 2, the consecutive elements returned are: (3,4) (3,5) ... (3,ncol), (4,5) (4,6) ... (4,ncol), etc. Array YCORR stores the partial correlations with the Y-variable starting with YCORR(IN+1) = partial correlation with the variable in position (IN+1).
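The IN = 0 formula and the row-wise packing of the upper triangle can be sketched in a few lines. This is an illustration of the description above, not the library's code: the class PartialCorrelationLayoutSketch and its two methods are hypothetical names, the pairs use the 1-based positions of the original Fortran description, and the layout shown is the CORMAT ordering described in the text, which need not match the layout of the array returned by the Java method.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: the IN = 0 correlation and the packed CORMAT ordering. */
class PartialCorrelationLayoutSketch {

    /** Uncentred correlation: sum(Xi.Xj) / sqrt(sum(Xi^2) . sum(Xj^2)). */
    static double uncentredCorrelation(double[] xi, double[] xj) {
        double sij = 0.0;
        double sii = 0.0;
        double sjj = 0.0;
        for (int k = 0; k < xi.length; k++) {
            sij += xi[k] * xj[k];
            sii += xi[k] * xi[k];
            sjj += xj[k] * xj[k];
        }
        return sij / Math.sqrt(sii * sjj);
    }

    /**
     * Lists the 1-based (row, column) pairs of the upper triangle, stored by rows
     * and excluding the diagonal, for the variables after the first "in" positions.
     */
    static List<int[]> packedPairs(int in, int ncol) {
        List<int[]> pairs = new ArrayList<>();
        for (int row = in + 1; row < ncol; row++) {
            for (int col = row + 1; col <= ncol; col++) {
                pairs.add(new int[] {row, col});
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (int[] pair : packedPairs(2, 5)) {
            System.out.print("(" + pair[0] + "," + pair[1] + ") ");
        }
        System.out.println();
    }
}
```

With in = 2 and ncol = 5 the pairs come out in the order (3,4), (3,5), (4,5), which is the pattern given in the text.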
Parameters:
in - how many of the regressors to include (either in canonical order, or in the current reordered state)
public double getDiagonalOfHatMatrix(double[] row_data)
Parameters:
row_data - the observation for which the diagonal of the hat matrix is returned
public int[] getOrderOfRegressors()
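For readers unfamiliar with the quantity returned by getDiagonalOfHatMatrix, the sketch below shows how a leverage value can be obtained from a triangular factor. It rests on the standard identity h = x' (X'X)^-1 x = ||z||^2, where R'z = x and R'R = X'X. The class LeverageSketch and its method are hypothetical, and the sketch assumes a plain full-rank upper-triangular R, whereas the class documented here works on its own internal factorisation and regressor ordering.

```java
/** Illustrative only: leverage of one observation from an upper-triangular factor R. */
class LeverageSketch {

    /**
     * Computes h = x' (X'X)^-1 x by solving R' z = x with forward substitution
     * and returning z'z. Assumes R is upper triangular with a non-zero diagonal.
     */
    static double leverage(double[][] r, double[] x) {
        int p = x.length;
        double[] z = new double[p];
        double h = 0.0;
        for (int i = 0; i < p; i++) {
            double sum = x[i];
            for (int k = 0; k < i; k++) {
                sum -= r[k][i] * z[k]; // R' is lower triangular: (R')[i][k] = R[k][i]
            }
            z[i] = sum / r[i][i];
            h += z[i] * z[i];
        }
        return h;
    }

    public static void main(String[] args) {
        // R for the design rows (1,0), (1,1), (1,2): R'R = [[3,3],[3,5]] = X'X
        double[][] r = {{Math.sqrt(3.0), Math.sqrt(3.0)}, {0.0, Math.sqrt(2.0)}};
        System.out.println(leverage(r, new double[] {1.0, 1.0}));
    }
}
```

A useful sanity check is that the leverages of all observations in the design sum to the number of fitted parameters.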
public RegressionResults regress() throws ModelSpecificationException
Specified by:
regress in interface UpdatingMultipleLinearRegression
Throws:
ModelSpecificationException - thrown if number of observations is less than the number of variables
public RegressionResults regress(int numberOfRegressors) throws ModelSpecificationException
Parameters:
numberOfRegressors - how many of the regressors to include (either in canonical order, or in the current reordered state)
Throws:
ModelSpecificationException - thrown if number of observations is less than the number of variables or number of regressors requested is greater than the regressors in the model
public RegressionResults regress(int[] variablesToInclude) throws ModelSpecificationException
Specified by:
regress in interface UpdatingMultipleLinearRegression
Parameters:
variablesToInclude - array of variables to include in regression
Throws:
ModelSpecificationException - thrown if number of observations is less than the number of variables, the number of regressors requested is greater than the regressors in the model, or a regressor index in the regressor array does not exist
Copyright © 2003–2016 The Apache Software Foundation. All rights reserved.