Package weka.core
Class ContingencyTables
java.lang.Object
weka.core.ContingencyTables
- All Implemented Interfaces:
RevisionHandler
Class implementing some statistical routines for contingency tables.
- Version:
- $Revision: 10057 $
- Author:
- Eibe Frank (eibe@cs.waikato.ac.nz)
-
Field Summary
-
Constructor Summary
-
Method Summary
Modifier and Type, Method, Description
- static double chiSquared(double[][] matrix, boolean yates): Returns chi-squared probability for a given matrix.
- static double chiVal(double[][] matrix, boolean useYates): Computes chi-squared statistic for a contingency table.
- static boolean cochransCriterion(double[][] matrix): Tests if Cochran's criterion is fulfilled for the given contingency table.
- static double CramersV(double[][] matrix): Computes Cramer's V for a contingency table.
- static double entropy(double[] array): Computes the entropy of the given array.
- static double entropyConditionedOnColumns(double[][] matrix): Computes conditional entropy of the rows given the columns.
- static double entropyConditionedOnRows(double[][] matrix): Computes conditional entropy of the columns given the rows.
- static double entropyConditionedOnRows(double[][] train, double[][] test, double numClasses): Computes conditional entropy of the columns given the rows of the test matrix with respect to the train matrix.
- static double entropyOverColumns(double[][] matrix): Computes the columns' entropy for the given contingency table.
- static double entropyOverRows(double[][] matrix): Computes the rows' entropy for the given contingency table.
- static double gainRatio(double[][] matrix): Computes gain ratio for contingency table (split on rows).
- String getRevision(): Returns the revision string.
- static double lnFunc(double num): Helper method for computing entropy.
- static double log2MultipleHypergeometric(double[][] matrix): Returns negative base 2 logarithm of multiple hypergeometric probability for a contingency table.
- static void main(String[]): Main method for testing this class.
- static double[][] reduceMatrix(double[][] matrix): Reduces a matrix by deleting all zero rows and columns.
- static double symmetricalUncertainty(double[][] matrix): Calculates the symmetrical uncertainty for base 2.
- static double tauVal(double[][] matrix): Computes Goodman and Kruskal's tau-value for a contingency table.
-
Field Details
-
log2
public static final double log2
The natural logarithm of 2
-
-
Constructor Details
-
ContingencyTables
public ContingencyTables()
-
-
Method Details
-
chiSquared
public static double chiSquared(double[][] matrix, boolean yates)
Returns chi-squared probability for a given matrix.
- Parameters:
matrix - the contingency table
yates - is Yates' correction to be used?
- Returns:
- the chi-squared probability
-
chiVal
public static double chiVal(double[][] matrix, boolean useYates)
Computes chi-squared statistic for a contingency table.
- Parameters:
matrix - the contingency table
useYates - is Yates' correction to be used?
- Returns:
- the value of the chi-squared statistic
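As an illustration of what such a statistic looks like, the following is a minimal sketch of the textbook Pearson chi-squared computation with an optional Yates continuity correction. It is not the Weka source: the class name ChiSquaredSketch and the method name chiSquaredStatistic are hypothetical, and the handling of empty tables and of Yates' correction on tables larger than 2x2 is an assumption that may differ from chiVal.

```java
final class ChiSquaredSketch {

    /** Pearson chi-squared statistic; Yates' correction is applied only if requested. */
    static double chiSquaredStatistic(double[][] table, boolean yates) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowTotals = new double[rows];
        double[] colTotals = new double[cols];
        double n = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowTotals[i] += table[i][j];
                colTotals[j] += table[i][j];
                n += table[i][j];
            }
        }
        if (n <= 0.0) {
            return 0.0; // assumption: an empty table has statistic 0
        }
        double chi = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double expected = rowTotals[i] * colTotals[j] / n;
                if (expected <= 0.0) {
                    continue; // cells in all-zero rows or columns contribute nothing
                }
                double diff = Math.abs(table[i][j] - expected);
                if (yates) {
                    diff = Math.max(0.0, diff - 0.5); // continuity correction
                }
                chi += diff * diff / expected;
            }
        }
        return chi;
    }

    public static void main(String[] args) {
        double[][] table = {{10, 20}, {20, 10}};
        System.out.println(chiSquaredStatistic(table, false));
        System.out.println(chiSquaredStatistic(table, true));
    }
}
```

For the 2x2 table in main, every expected count is 15, so the uncorrected statistic is 4 * 25 / 15 and the corrected one is 4 * 20.25 / 15.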
-
cochransCriterion
public static boolean cochransCriterion(double[][] matrix)
Tests if Cochran's criterion is fulfilled for the given contingency table. Rows and columns with all zeros are not considered relevant.
- Parameters:
matrix - the contingency table to be tested
- Returns:
- true if the contingency table fulfills the criterion, false if not
-
CramersV
public static double CramersV(double[][] matrix)
Computes Cramer's V for a contingency table.
- Parameters:
matrix - the contingency table
- Returns:
- Cramer's V
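Cramer's V is conventionally defined as sqrt(chi2 / (n * min(r - 1, c - 1))), where chi2 is the uncorrected chi-squared statistic, n the table total, and r and c the numbers of rows and columns. The sketch below follows that textbook definition. The names CramersVSketch and cramersV are hypothetical, and returning 0 for an empty table or a table with a single row or column is an assumption, not documented behavior of CramersV.

```java
final class CramersVSketch {

    /** Cramer's V from the uncorrected Pearson chi-squared statistic. */
    static double cramersV(double[][] table) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowTotals = new double[rows];
        double[] colTotals = new double[cols];
        double n = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowTotals[i] += table[i][j];
                colTotals[j] += table[i][j];
                n += table[i][j];
            }
        }
        int minDim = Math.min(rows - 1, cols - 1);
        if (n <= 0.0 || minDim == 0) {
            return 0.0; // assumption: no association can be measured
        }
        double chi = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double expected = rowTotals[i] * colTotals[j] / n;
                if (expected > 0.0) {
                    double diff = table[i][j] - expected;
                    chi += diff * diff / expected;
                }
            }
        }
        return Math.sqrt(chi / (n * minDim));
    }

    public static void main(String[] args) {
        System.out.println(cramersV(new double[][] {{5, 0}, {0, 5}}));
        System.out.println(cramersV(new double[][] {{10, 20}, {20, 10}}));
    }
}
```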
-
entropy
public static double entropy(double[] array)
Computes the entropy of the given array.
- Parameters:
array - the array
- Returns:
- the entropy
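A minimal sketch of an entropy computation over an array, assuming the array holds non-negative counts and the result is wanted in bits (base 2), which matches the class's log2 field but is not stated explicitly for this method. EntropySketch is a hypothetical class, not part of Weka.

```java
final class EntropySketch {

    /** Shannon entropy, in bits, of the distribution obtained by normalizing the counts. */
    static double entropy(double[] counts) {
        double total = 0.0;
        for (double c : counts) {
            total += c;
        }
        if (total <= 0.0) {
            return 0.0; // assumption: an empty distribution has zero entropy
        }
        double h = 0.0;
        for (double c : counts) {
            if (c > 0.0) { // 0 * log(0) is treated as 0
                double p = c / total;
                h -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(entropy(new double[] {1, 1}));
        System.out.println(entropy(new double[] {1, 1, 1, 1}));
    }
}
```

Two equally frequent symbols give 1 bit and four give 2 bits, while an array with a single non-zero count gives 0.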
-
entropyConditionedOnColumns
public static double entropyConditionedOnColumns(double[][] matrix)
Computes conditional entropy of the rows given the columns.
- Parameters:
matrix - the contingency table
- Returns:
- the conditional entropy of the rows given the columns
-
entropyConditionedOnRows
public static double entropyConditionedOnRows(double[][] matrix)
Computes conditional entropy of the columns given the rows.
- Parameters:
matrix - the contingency table
- Returns:
- the conditional entropy of the columns given the rows
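The conditional entropy of the columns given the rows can be read as the entropy of each row's distribution, weighted by that row's share of the table total. The sketch below computes it in bits under that reading. ConditionalEntropySketch is a hypothetical class, and normalizing by the table total is an assumption about the intended result. Conditioning on columns instead is the same computation applied to the transposed table.

```java
final class ConditionalEntropySketch {

    /** H(column | row) in bits: row entropies weighted by each row's share of the total. */
    static double entropyConditionedOnRows(double[][] table) {
        double total = 0.0;
        double weighted = 0.0; // accumulates -sum n_ij * log2(n_ij / n_i.)
        for (double[] row : table) {
            double rowSum = 0.0;
            for (double cell : row) {
                rowSum += cell;
            }
            total += rowSum;
            for (double cell : row) {
                if (cell > 0.0) {
                    weighted -= cell * Math.log(cell / rowSum) / Math.log(2.0);
                }
            }
        }
        return total > 0.0 ? weighted / total : 0.0;
    }

    public static void main(String[] args) {
        // Rows determine the column completely, so no uncertainty remains.
        System.out.println(entropyConditionedOnRows(new double[][] {{2, 0}, {0, 2}}));
        // Rows carry no information about the column.
        System.out.println(entropyConditionedOnRows(new double[][] {{1, 1}, {1, 1}}));
    }
}
```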
-
entropyConditionedOnRows
public static double entropyConditionedOnRows(double[][] train, double[][] test, double numClasses)
Computes conditional entropy of the columns given the rows of the test matrix with respect to the train matrix. Uses a Laplace prior. Does NOT normalize the entropy.
- Parameters:
train - the train matrix
test - the test matrix
numClasses - the number of symbols for Laplace
- Returns:
- the entropy
-
entropyOverRows
public static double entropyOverRows(double[][] matrix)
Computes the rows' entropy for the given contingency table.
- Parameters:
matrix - the contingency table
- Returns:
- the rows' entropy
-
entropyOverColumns
public static double entropyOverColumns(double[][] matrix)
Computes the columns' entropy for the given contingency table.
- Parameters:
matrix - the contingency table
- Returns:
- the columns' entropy
-
gainRatio
public static double gainRatio(double[][] matrix)
Computes gain ratio for contingency table (split on rows). Returns Double.MAX_VALUE if the split entropy is 0.
- Parameters:
matrix - the contingency table
- Returns:
- the gain ratio
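Gain ratio is usually defined as information gain divided by the entropy of the split. With a split on rows, that is (H(columns) - H(columns | rows)) / H(rows). The sketch below follows this textbook definition and the documented Double.MAX_VALUE convention for a zero split entropy. The class GainRatioSketch, its helper names, and the small tolerance used for the zero check are assumptions, not the Weka implementation.

```java
final class GainRatioSketch {

    private static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /** Shannon entropy in bits of a vector of non-negative counts. */
    static double entropy(double[] counts) {
        double total = 0.0;
        for (double c : counts) {
            total += c;
        }
        double h = 0.0;
        for (double c : counts) {
            if (c > 0.0) {
                h -= (c / total) * log2(c / total);
            }
        }
        return h;
    }

    /** (H(columns) - H(columns | rows)) / H(rows), with rows acting as the split. */
    static double gainRatio(double[][] table) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowTotals = new double[rows];
        double[] colTotals = new double[cols];
        double n = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowTotals[i] += table[i][j];
                colTotals[j] += table[i][j];
                n += table[i][j];
            }
        }
        double splitEntropy = entropy(rowTotals);
        if (splitEntropy <= 1e-12) {
            return Double.MAX_VALUE; // documented convention for a zero split entropy
        }
        double conditional = 0.0; // H(columns | rows); n > 0 is guaranteed here
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                if (table[i][j] > 0.0) {
                    conditional -= table[i][j] * log2(table[i][j] / rowTotals[i]);
                }
            }
        }
        conditional /= n;
        return (entropy(colTotals) - conditional) / splitEntropy;
    }

    public static void main(String[] args) {
        System.out.println(gainRatio(new double[][] {{2, 0}, {0, 2}}));
        System.out.println(gainRatio(new double[][] {{1, 1}, {1, 1}}));
    }
}
```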
-
log2MultipleHypergeometric
public static double log2MultipleHypergeometric(double[][] matrix)
Returns negative base 2 logarithm of multiple hypergeometric probability for a contingency table.
- Parameters:
matrix - the contingency table
- Returns:
- the negative base 2 logarithm of the multiple hypergeometric probability of the contingency table
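For a table with fixed margins, the multiple hypergeometric probability is (product of row-total factorials * product of column-total factorials) / (n! * product of cell factorials). The sketch below returns the negative base 2 logarithm of that quantity. It assumes all cells are non-negative whole numbers and sums logarithms directly rather than using a log-gamma routine. HypergeometricSketch and its method names are hypothetical.

```java
final class HypergeometricSketch {

    /** log2(k!) by direct summation; assumes k is a non-negative whole number. */
    static double log2Factorial(double k) {
        double sum = 0.0;
        long limit = Math.round(k);
        for (long i = 2; i <= limit; i++) {
            sum += Math.log(i);
        }
        return sum / Math.log(2.0);
    }

    /** -log2 of the multiple hypergeometric probability of the table given its margins. */
    static double negLog2Probability(double[][] table) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowTotals = new double[rows];
        double[] colTotals = new double[cols];
        double n = 0.0;
        double log2Prob = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowTotals[i] += table[i][j];
                colTotals[j] += table[i][j];
                n += table[i][j];
                log2Prob -= log2Factorial(table[i][j]); // cell factorials in the denominator
            }
        }
        log2Prob -= log2Factorial(n); // n! in the denominator
        for (double r : rowTotals) {
            log2Prob += log2Factorial(r);
        }
        for (double c : colTotals) {
            log2Prob += log2Factorial(c);
        }
        return -log2Prob;
    }

    public static void main(String[] args) {
        System.out.println(negLog2Probability(new double[][] {{1, 0}, {0, 1}}));
        System.out.println(negLog2Probability(new double[][] {{1, 1}, {1, 1}}));
    }
}
```

For {{1, 0}, {0, 1}} the probability is 1/2, so the result is 1. For {{1, 1}, {1, 1}} it is 16/24, so the result is log2(1.5).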
-
reduceMatrix
public static double[][] reduceMatrix(double[][] matrix)
Reduces a matrix by deleting all zero rows and columns.
- Parameters:
matrix - the matrix to be reduced
- Returns:
- the matrix with all zero rows and columns deleted
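The reduction can be sketched as two passes: mark every row and column that contains a non-zero cell, then copy only the marked cells into a smaller matrix. The sketch assumes a rectangular matrix of non-negative counts, so "all zero" and "sums to zero" coincide. ReduceMatrixSketch is a hypothetical class, and reduceMatrix itself may use a different zero test.

```java
final class ReduceMatrixSketch {

    /** Returns a copy of the matrix without rows and columns that contain only zeros. */
    static double[][] reduce(double[][] matrix) {
        int rows = matrix.length;
        int cols = rows == 0 ? 0 : matrix[0].length;
        boolean[] keepRow = new boolean[rows];
        boolean[] keepCol = new boolean[cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                if (matrix[i][j] != 0.0) {
                    keepRow[i] = true;
                    keepCol[j] = true;
                }
            }
        }
        int keptRows = 0;
        for (boolean keep : keepRow) {
            if (keep) keptRows++;
        }
        int keptCols = 0;
        for (boolean keep : keepCol) {
            if (keep) keptCols++;
        }
        double[][] reduced = new double[keptRows][keptCols];
        int r = 0;
        for (int i = 0; i < rows; i++) {
            if (!keepRow[i]) {
                continue;
            }
            int c = 0;
            for (int j = 0; j < cols; j++) {
                if (keepCol[j]) {
                    reduced[r][c++] = matrix[i][j];
                }
            }
            r++;
        }
        return reduced;
    }

    public static void main(String[] args) {
        double[][] reduced = reduce(new double[][] {{0, 0, 0}, {1, 0, 2}, {0, 0, 3}});
        System.out.println(java.util.Arrays.deepToString(reduced));
    }
}
```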
-
symmetricalUncertainty
public static double symmetricalUncertainty(double[][] matrix)
Calculates the symmetrical uncertainty for base 2.
- Parameters:
matrix - the contingency table
- Returns:
- the calculated symmetrical uncertainty
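Symmetrical uncertainty is commonly defined as 2 * (H(rows) + H(columns) - H(rows, columns)) / (H(rows) + H(columns)), which ranges from 0 (independence) to 1 (each variable determines the other). The sketch below implements that definition in bits. SymmetricalUncertaintySketch is a hypothetical class, and returning 0 when both marginal entropies are 0 is an assumption.

```java
final class SymmetricalUncertaintySketch {

    /** Shannon entropy in bits of a vector of non-negative counts. */
    static double entropy(double[] counts) {
        double total = 0.0;
        for (double c : counts) {
            total += c;
        }
        double h = 0.0;
        for (double c : counts) {
            if (c > 0.0) {
                double p = c / total;
                h -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return h;
    }

    static double symmetricalUncertainty(double[][] table) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowTotals = new double[rows];
        double[] colTotals = new double[cols];
        double[] cells = new double[rows * cols]; // flattened table for the joint entropy
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowTotals[i] += table[i][j];
                colTotals[j] += table[i][j];
                cells[i * cols + j] = table[i][j];
            }
        }
        double marginalSum = entropy(rowTotals) + entropy(colTotals);
        if (marginalSum <= 1e-12) {
            return 0.0; // assumption: undefined ratio is reported as 0
        }
        double mutualInformation = marginalSum - entropy(cells);
        return 2.0 * mutualInformation / marginalSum;
    }

    public static void main(String[] args) {
        System.out.println(symmetricalUncertainty(new double[][] {{2, 0}, {0, 2}}));
        System.out.println(symmetricalUncertainty(new double[][] {{1, 1}, {1, 1}}));
    }
}
```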
-
tauVal
public static double tauVal(double[][] matrix)
Computes Goodman and Kruskal's tau-value for a contingency table.
- Parameters:
matrix - the contingency table
- Returns:
- Goodman and Kruskal's tau-value
-
lnFunc
public static double lnFunc(double num)
Helper method for computing entropy.
-
getRevision
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Returns:
- the revision
-
main
Main method for testing this class.
-