Package weka.classifiers.evaluation
Class RegressionAnalysis
java.lang.Object
weka.classifiers.evaluation.RegressionAnalysis
Analyzes a linear regression model by applying Student's t-test to each
coefficient. Also calculates the R^2 value and the F-test value.
More information: http://en.wikipedia.org/wiki/Student's_t-test
http://en.wikipedia.org/wiki/Linear_regression
http://en.wikipedia.org/wiki/Ordinary_least_squares
- Version:
- $Revision: $
- Author:
- Chris Meyer: cmeyer@udel.edu University of Delaware, Newark, DE, USA CISC 612: Design extension implementation
-
Constructor Summary
RegressionAnalysis()
Method Summary
Modifier and Type | Method | Description
static double | calculateAdjRSquared(double rsq, int n, int k) | Returns the adjusted R-squared value for a linear regression model.
static double | calculateFStat(double rsq, int n, int k) | Returns the F-statistic for a linear regression model.
static double | calculateRSquared(Instances data, double ssr) | Returns the R-squared value for a linear regression model, where sum of squared residuals is already calculated.
static double | calculateSSR(Instances data, Attribute chosen, double slope, double intercept) | Returns the sum of squared residuals of the simple linear regression model: y = a + bx.
static double[] | calculateStdErrorOfCoef(Instances data, boolean[] selected, double ssr, int n, int k) | Returns an array of the standard errors of the coefficients in a multiple linear regression.
static double[] | calculateStdErrorOfCoef(Instances data, Attribute chosen, double slope, double intercept, int df) | Returns the standard errors of slope and intercept for a simple linear regression model: y = a + bx.
static double[] | calculateTStats(double[] coef, double[] stderror, int k) | Returns an array of the t-statistic of each coefficient in a multiple linear regression model.
String | getRevision() | Returns the revision string.
-
Constructor Details
-
RegressionAnalysis
public RegressionAnalysis()
-
-
Method Details
-
calculateSSR
public static double calculateSSR(Instances data, Attribute chosen, double slope, double intercept) throws Exception
Returns the sum of squared residuals of the simple linear regression model: y = a + bx.
- Parameters:
  data - (the data set)
  chosen - (chosen x-attribute)
  slope - (slope determined by simple linear regression model)
  intercept - (intercept determined by simple linear regression model)
- Returns:
  sum of squared residuals
- Throws:
  Exception - if there is a missing class value in data
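As an illustration of the quantity described above, the following is a minimal standalone sketch that computes the sum of squared residuals for y = a + bx on plain arrays. The class name `SsrSketch` and the array-based signature are hypothetical and are not part of Weka; the NaN check is only a stand-in for the missing class value check and assumes missing values are encoded as NaN.

```java
// Standalone sketch; NOT the Weka implementation.
final class SsrSketch {

    /** Sum of squared residuals for the model y = intercept + slope * x. */
    static double ssr(double[] x, double[] y, double slope, double intercept) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            // Assumption: a missing class value is encoded as NaN.
            if (Double.isNaN(y[i])) {
                throw new IllegalArgumentException("missing class value at index " + i);
            }
            double residual = y[i] - (intercept + slope * x[i]);
            sum += residual * residual;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {3, 5, 7, 10};
        // Predictions are 3, 5, 7, 9, so only the last point has a residual (1).
        System.out.println(ssr(x, y, 2.0, 1.0)); // prints 1.0
    }
}
```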
-
calculateRSquared
public static double calculateRSquared(Instances data, double ssr) throws Exception
Returns the R-squared value for a linear regression model, where sum of squared residuals is already calculated. This works for either a simple or a multiple linear regression model.
- Parameters:
  data - (the data set)
  ssr - (sum of squared residuals)
- Returns:
  R^2 value
- Throws:
  Exception - if there is a missing class value in data
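The relationship between R^2 and an already computed sum of squared residuals can be sketched with the textbook definition R^2 = 1 - SSR / SST, where SST is the total sum of squares of the class values around their mean. This is a standalone sketch under that assumption; the class `RSquaredSketch` and its array-based signature are hypothetical, and the actual Weka method may differ in detail.

```java
// Standalone sketch; NOT the Weka implementation.
final class RSquaredSketch {

    /** R^2 = 1 - SSR / SST, with SST taken around the mean of y. */
    static double rSquared(double[] y, double ssr) {
        double mean = 0.0;
        for (double v : y) {
            mean += v;
        }
        mean /= y.length;

        double sst = 0.0; // total sum of squares
        for (double v : y) {
            sst += (v - mean) * (v - mean);
        }
        return 1.0 - ssr / sst;
    }

    public static void main(String[] args) {
        double[] y = {1, 2, 3};
        // Mean is 2, SST is 2, so SSR = 0.5 gives 1 - 0.25.
        System.out.println(rSquared(y, 0.5)); // prints 0.75
    }
}
```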
-
calculateAdjRSquared
public static double calculateAdjRSquared(double rsq, int n, int k)
Returns the adjusted R-squared value for a linear regression model. This works for either a simple or a multiple linear regression model.
- Parameters:
  rsq - (the model's R-squared value)
  n - (the number of instances in the data)
  k - (the number of coefficients in the model: k>=2)
- Returns:
  the adjusted R-squared value
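A minimal sketch of the adjustment, assuming the common formula 1 - (1 - R^2)(n - 1) / (n - k) with k counting all coefficients including the constant. The class `AdjRSquaredSketch` is hypothetical, and the formula is stated here as an assumption about what the method computes rather than a quotation of the Weka source.

```java
// Standalone sketch; NOT the Weka implementation.
final class AdjRSquaredSketch {

    /** Adjusted R^2, where k counts every coefficient including the constant. */
    static double adjRSquared(double rsq, int n, int k) {
        if (n <= k) {
            throw new IllegalArgumentException("need more instances than coefficients");
        }
        // (1 - rsq) is a double, so the division below is floating point.
        return 1.0 - (1.0 - rsq) * (n - 1) / (n - k);
    }

    public static void main(String[] args) {
        // 1 - 0.25 * 10 / 8
        System.out.println(adjRSquared(0.75, 11, 3)); // prints 0.6875
    }
}
```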
-
calculateFStat
public static double calculateFStat(double rsq, int n, int k)
Returns the F-statistic for a linear regression model.
- Parameters:
  rsq - (the model's R-squared value)
  n - (the number of instances in the data)
  k - (the number of coefficients in the model: k>=2)
- Returns:
  F-statistic
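The F-statistic can be derived from R^2 alone. The sketch below assumes the standard overall F-test form F = (R^2 / (k - 1)) / ((1 - R^2) / (n - k)), with k including the constant; the class `FStatSketch` is hypothetical and the formula is an assumption about the method, not a copy of its source.

```java
// Standalone sketch; NOT the Weka implementation.
final class FStatSketch {

    /** Overall F-statistic computed from R^2, n instances and k coefficients. */
    static double fStat(double rsq, int n, int k) {
        if (k < 2 || n <= k) {
            throw new IllegalArgumentException("requires k >= 2 and n > k");
        }
        double explained = rsq / (k - 1);           // per model degree of freedom
        double unexplained = (1.0 - rsq) / (n - k); // per residual degree of freedom
        return explained / unexplained;
    }

    public static void main(String[] args) {
        // (0.75 / 2) / (0.25 / 8)
        System.out.println(fStat(0.75, 11, 3)); // prints 12.0
    }
}
```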
-
calculateStdErrorOfCoef
public static double[] calculateStdErrorOfCoef(Instances data, Attribute chosen, double slope, double intercept, int df) throws Exception
Returns the standard errors of slope and intercept for a simple linear regression model: y = a + bx. The first element is the standard error of the slope, the second element is the standard error of the intercept.
- Parameters:
  data - (the data set)
  chosen - (chosen x-attribute)
  slope - (slope determined by simple linear regression model)
  intercept - (intercept determined by simple linear regression model)
  df - (number of instances - 2)
- Returns:
  array of standard errors of slope and intercept
- Throws:
  Exception - if there is a missing class value in data
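For a simple regression, the two standard errors can be sketched with the textbook ordinary least squares formulas: with MSE = SSR / df and Sxx the sum of squared deviations of x from its mean, se(slope) = sqrt(MSE / Sxx) and se(intercept) = sqrt(MSE * (1/n + mean(x)^2 / Sxx)). The class `SimpleStdErrorSketch` and its array-based signature are hypothetical; the sketch follows the documented element order (slope first, intercept second) but is not taken from the Weka source.

```java
// Standalone sketch; NOT the Weka implementation.
final class SimpleStdErrorSketch {

    /** Returns {se(slope), se(intercept)} for y = intercept + slope * x. */
    static double[] stdErrors(double[] x, double[] y,
                              double slope, double intercept, int df) {
        int n = x.length;
        if (n != y.length || df <= 0) {
            throw new IllegalArgumentException("mismatched input or non-positive df");
        }
        double meanX = 0.0;
        for (double v : x) {
            meanX += v;
        }
        meanX /= n;

        double sxx = 0.0; // sum of squared deviations of x
        double ssr = 0.0; // sum of squared residuals
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxx += dx * dx;
            double residual = y[i] - (intercept + slope * x[i]);
            ssr += residual * residual;
        }
        double mse = ssr / df; // residual variance estimate
        double seSlope = Math.sqrt(mse / sxx);
        double seIntercept = Math.sqrt(mse * (1.0 / n + meanX * meanX / sxx));
        return new double[] {seSlope, seIntercept};
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {3, 5, 7, 10};
        // Least squares fit of this data: slope 2.3, intercept 0.5; df = n - 2.
        double[] se = stdErrors(x, y, 2.3, 0.5, x.length - 2);
        System.out.println(se[0] + " " + se[1]);
    }
}
```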
-
calculateStdErrorOfCoef
public static double[] calculateStdErrorOfCoef(Instances data, boolean[] selected, double ssr, int n, int k) throws Exception
Returns an array of the standard errors of the coefficients in a multiple linear regression. The last element in the array is the standard error of the constant coefficient. The standard error array is used to calculate the t-statistics.
- Parameters:
  data - (the data set)
  selected - (flags indicating variables used in the regression)
  ssr - (sum of squared residuals)
  n - (number of instances)
  k - (number of coefficients; includes constant)
- Returns:
  array of standard errors of coefficients
- Throws:
  Exception - if there is a missing class value in data
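For the multiple regression case, a common way to obtain these standard errors is se_j = sqrt(MSE * [(X'X)^-1]_jj), with MSE = ssr / (n - k) and X the design matrix whose last column is all ones. The sketch below follows that textbook formula and is an assumption about the approach, not the Weka source. The class `CoefStdErrorSketch` is hypothetical, and it takes the already selected predictor columns as a plain `double[][]` instead of `Instances` and a `boolean[]` mask.

```java
// Standalone sketch; NOT the Weka implementation.
final class CoefStdErrorSketch {

    /**
     * x has one row per instance and k - 1 predictor columns. A constant
     * column of ones is appended, so the last result is the constant's error.
     */
    static double[] stdErrors(double[][] x, double ssr, int n, int k) {
        double[][] xtx = new double[k][k]; // X'X for [predictors | 1]
        for (int r = 0; r < n; r++) {
            double[] row = new double[k];
            System.arraycopy(x[r], 0, row, 0, k - 1);
            row[k - 1] = 1.0;
            for (int i = 0; i < k; i++) {
                for (int j = 0; j < k; j++) {
                    xtx[i][j] += row[i] * row[j];
                }
            }
        }
        double[][] inv = invert(xtx);
        double mse = ssr / (n - k);
        double[] se = new double[k];
        for (int i = 0; i < k; i++) {
            se[i] = Math.sqrt(mse * inv[i][i]);
        }
        return se;
    }

    /** Gauss-Jordan inversion with partial pivoting. */
    static double[][] invert(double[][] a) {
        int m = a.length;
        double[][] aug = new double[m][2 * m]; // [A | I]
        for (int i = 0; i < m; i++) {
            System.arraycopy(a[i], 0, aug[i], 0, m);
            aug[i][m + i] = 1.0;
        }
        for (int col = 0; col < m; col++) {
            int pivot = col;
            for (int r = col + 1; r < m; r++) {
                if (Math.abs(aug[r][col]) > Math.abs(aug[pivot][col])) {
                    pivot = r;
                }
            }
            if (Math.abs(aug[pivot][col]) < 1e-12) {
                throw new ArithmeticException("X'X is singular");
            }
            double[] tmp = aug[col];
            aug[col] = aug[pivot];
            aug[pivot] = tmp;
            double p = aug[col][col];
            for (int j = 0; j < 2 * m; j++) {
                aug[col][j] /= p;
            }
            for (int r = 0; r < m; r++) {
                if (r == col) {
                    continue;
                }
                double f = aug[r][col];
                for (int j = 0; j < 2 * m; j++) {
                    aug[r][j] -= f * aug[col][j];
                }
            }
        }
        double[][] inv = new double[m][m];
        for (int i = 0; i < m; i++) {
            System.arraycopy(aug[i], m, inv[i], 0, m);
        }
        return inv;
    }

    public static void main(String[] args) {
        double[][] x = {{1}, {2}, {3}, {4}};
        // ssr = 0.30 is the residual sum for y = {3, 5, 7, 10} fitted by 0.5 + 2.3x.
        double[] se = stdErrors(x, 0.30, 4, 2);
        System.out.println(se[0] + " " + se[1]);
    }
}
```

With a single predictor this reduces to the simple regression formulas: the first element is the slope's standard error and the last is the constant's.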
-
calculateTStats
public static double[] calculateTStats(double[] coef, double[] stderror, int k)
Returns an array of the t-statistic of each coefficient in a multiple linear regression model.
- Parameters:
  coef - (array holding the value of each coefficient)
  stderror - (array holding each coefficient's standard error)
  k - (number of coefficients, includes constant)
- Returns:
  array of t-statistics of coefficients
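Each t-statistic is the coefficient divided by its standard error. The following standalone sketch illustrates that element-wise division; the class `TStatsSketch` is hypothetical and only mirrors the documented parameter list.

```java
import java.util.Arrays;

// Standalone sketch; NOT the Weka implementation.
final class TStatsSketch {

    /** t[i] = coef[i] / stderror[i] for the first k coefficients. */
    static double[] tStats(double[] coef, double[] stderror, int k) {
        if (coef.length < k || stderror.length < k) {
            throw new IllegalArgumentException("arrays must hold at least k values");
        }
        double[] t = new double[k];
        for (int i = 0; i < k; i++) {
            t[i] = coef[i] / stderror[i];
        }
        return t;
    }

    public static void main(String[] args) {
        double[] coef = {3.0, 0.5};      // illustrative values; constant last
        double[] stderror = {1.5, 0.25}; // illustrative values
        System.out.println(Arrays.toString(tStats(coef, stderror, 2))); // prints [2.0, 2.0]
    }
}
```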
-
getRevision
public String getRevision()
Returns the revision string.
- Returns:
  the revision
-