Package weka.attributeSelection
Class PrincipalComponents
java.lang.Object
weka.attributeSelection.ASEvaluation
weka.attributeSelection.UnsupervisedAttributeEvaluator
weka.attributeSelection.PrincipalComponents
- All Implemented Interfaces:
Serializable, AttributeEvaluator, AttributeTransformer, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler
public class PrincipalComponents
extends UnsupervisedAttributeEvaluator
implements AttributeTransformer, OptionHandler
Performs a principal components analysis and
transformation of the data. Use in conjunction with a Ranker search.
Dimensionality reduction is accomplished by choosing enough eigenvectors to
account for a given proportion of the variance in the original data (default
0.95, i.e. 95%). Attribute noise can be filtered by transforming to the PC space,
eliminating some of the worst eigenvectors (those accounting for the least
variance), and then transforming back to the original space.
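The variance-based retention rule described above can be illustrated with a small standalone sketch. This is not Weka's code: the class and method names (`VarianceRetentionSketch`, `componentsToRetain`) are hypothetical, and it assumes the eigenvalues have already been computed.

```java
import java.util.Arrays;

// Illustrative sketch only; names are hypothetical and this is not Weka's implementation.
class VarianceRetentionSketch {

    /**
     * Returns how many of the largest eigenvalues are needed so that their
     * share of the total variance reaches the requested proportion.
     */
    static int componentsToRetain(double[] eigenvalues, double varianceCovered) {
        double[] sorted = eigenvalues.clone();
        Arrays.sort(sorted); // ascending order; we walk it from the end

        double total = 0.0;
        for (double v : sorted) {
            total += v;
        }

        double cumulative = 0.0;
        int kept = 0;
        for (int i = sorted.length - 1; i >= 0; i--) {
            cumulative += sorted[i] / total;
            kept++;
            if (cumulative >= varianceCovered) {
                break;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed example eigenvalues; their sum is 4.0.
        double[] eigenvalues = {0.1, 2.5, 0.4, 1.0};
        System.out.println(componentsToRetain(eigenvalues, 0.95)); // prints 3
    }
}
```

With these example eigenvalues, the two largest cover 87.5% of the variance, so a third component is needed to reach the 0.95 default.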
Valid options are:
-C Center (rather than standardize) the data and compute PCA using the covariance (rather than the correlation) matrix.
-R Retain enough PC attributes to account for this proportion of variance in the original data. (default = 0.95)
-O Transform through the PC space and back to the original space.
-A Maximum number of attributes to include in transformed attribute names. (-1 = include all)
- Version:
- $Revision: 15519 $
- Author:
- Mark Hall (mhall@cs.waikato.ac.nz), Gabi Schmidberger (gabi@cs.waikato.ac.nz)
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
void | buildEvaluator(Instances data) | Initializes principal components and performs the analysis.
String | centerDataTipText() | Returns the tip text for this property.
Instance | convertInstance(Instance instance) | Transform an instance in original (unnormalized) format.
double | evaluateAttribute(int att) | Evaluates the merit of a transformed attribute.
Capabilities | getCapabilities() | Returns the capabilities of this evaluator.
boolean | getCenterData() | Get whether to center (rather than standardize) the data.
double[][] | getCorrelationMatrix() | Return the correlation/covariance matrix.
double[] | getEigenValues() | Return the eigenvalues corresponding to the eigenvectors.
Instances | getFilteredInputFormat() | Return the header of the training data after all filtering, i.e. missing values and nominal to binary.
int | getMaximumAttributeNames() | Gets maximum number of attributes to include in transformed attribute names.
String[] | getOptions() | Gets the current settings of PrincipalComponents.
String | getRevision() | Returns the revision string.
boolean | getTransformBackToOriginal() | Gets whether the data is to be transformed back to the original space.
double[][] | getUnsortedEigenVectors() | Return the unsorted eigenvectors.
double | getVarianceCovered() | Gets the proportion of total variance to account for when retaining principal components.
String | globalInfo() | Returns a string describing this attribute transformer.
void | initializeAndComputeMatrix(Instances data) | Initializes the evaluator, filters the input data and computes the correlation/covariance matrix.
Enumeration<Option> | listOptions() | Returns an enumeration describing the available options.
static void | main(String[] argv) | Main method for testing this class.
static String | matrixToString(double[][] matrix) | Return a matrix as a String.
String | maximumAttributeNamesTipText() | Returns the tip text for this property.
void | setCenterData(boolean center) | Set whether to center (rather than standardize) the data.
void | setMaximumAttributeNames(int m) | Sets maximum number of attributes to include in transformed attribute names.
void | setOptions(String[] options) | Parses a given list of options.
void | setTransformBackToOriginal(boolean b) | Sets whether the data should be transformed back to the original space.
void | setVarianceCovered(double vc) | Sets the amount of variance to account for when retaining principal components.
String | toString() | Returns a description of this attribute transformer.
String | transformBackToOriginalTipText() | Returns the tip text for this property.
Instances | transformedData(Instances data) | Gets the transformed training data.
Instances | transformedHeader() | Returns just the header for the transformed data (i.e. an empty set of instances).
String | varianceCoveredTipText() | Returns the tip text for this property.
Methods inherited from class weka.attributeSelection.ASEvaluation
clean, doNotCheckCapabilitiesTipText, forName, getDoNotCheckCapabilities, makeCopies, postExecution, postProcess, preExecution, run, runEvaluator, setDoNotCheckCapabilities
-
Constructor Details
-
PrincipalComponents
public PrincipalComponents()
-
-
Method Details
-
globalInfo
Returns a string describing this attribute transformer.
- Returns:
- a description of the evaluator suitable for displaying in the explorer/experimenter gui
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class ASEvaluation
- Returns:
- an enumeration of all the available options.
-
setOptions
Parses a given list of options. Valid options are:
-C Center (rather than standardize) the data and compute PCA using the covariance (rather than the correlation) matrix.
-R Retain enough PC attributes to account for this proportion of variance in the original data. (default = 0.95)
-O Transform through the PC space and back to the original space.
-A Maximum number of attributes to include in transformed attribute names. (-1 = include all)
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class ASEvaluation
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
-
centerDataTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setCenterData
public void setCenterData(boolean center)
Set whether to center (rather than standardize) the data. If set to true then PCA is computed from the covariance rather than the correlation matrix.
- Parameters:
center - true if the data is to be centered rather than standardized
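To make the difference between centering and standardizing concrete, here is a minimal standalone sketch. It is an illustration, not Weka's code: the class name `CenterVsStandardizeSketch` and its `matrix` method are hypothetical, and the use of the sample (n - 1) denominator is an assumption of this example.

```java
import java.util.Arrays;

// Illustrative sketch only; names are hypothetical and this is not Weka's implementation.
class CenterVsStandardizeSketch {

    /**
     * Rows of data are instances, columns are numeric attributes.
     * standardize == false: center only, which yields the covariance matrix.
     * standardize == true: center and scale to unit variance, which yields the correlation matrix.
     * Assumption: sample statistics with an (n - 1) denominator.
     */
    static double[][] matrix(double[][] data, boolean standardize) {
        int n = data.length;
        int m = data[0].length;

        // Column means.
        double[] mean = new double[m];
        for (double[] row : data) {
            for (int j = 0; j < m; j++) {
                mean[j] += row[j];
            }
        }
        for (int j = 0; j < m; j++) {
            mean[j] /= n;
        }

        // Column standard deviations.
        double[] sd = new double[m];
        for (double[] row : data) {
            for (int j = 0; j < m; j++) {
                double d = row[j] - mean[j];
                sd[j] += d * d;
            }
        }
        for (int j = 0; j < m; j++) {
            sd[j] = Math.sqrt(sd[j] / (n - 1));
        }

        // Cross-products of the centered (and optionally scaled) columns.
        double[][] result = new double[m][m];
        for (double[] row : data) {
            for (int j = 0; j < m; j++) {
                for (int k = 0; k < m; k++) {
                    double a = row[j] - mean[j];
                    double b = row[k] - mean[k];
                    if (standardize) {
                        a /= sd[j];
                        b /= sd[k];
                    }
                    result[j][k] += a * b;
                }
            }
        }
        for (int j = 0; j < m; j++) {
            for (int k = 0; k < m; k++) {
                result[j][k] /= (n - 1);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The second attribute is the first one on a 10x larger scale.
        double[][] data = {{1, 10}, {2, 20}, {3, 30}};
        System.out.println(Arrays.deepToString(matrix(data, false)));
        System.out.println(Arrays.deepToString(matrix(data, true)));
    }
}
```

In the covariance matrix the large-scale attribute dominates (variance 100 versus 1), whereas the correlation matrix treats both attributes equally. This is the practical reason to choose one setting over the other.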
-
getCenterData
public boolean getCenterData()
Get whether to center (rather than standardize) the data. If true then PCA is computed from the covariance rather than the correlation matrix.
- Returns:
- true if the data is to be centered rather than standardized.
-
varianceCoveredTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setVarianceCovered
public void setVarianceCovered(double vc)
Sets the amount of variance to account for when retaining principal components.
- Parameters:
vc - the proportion of total variance to account for
-
getVarianceCovered
public double getVarianceCovered()
Gets the proportion of total variance to account for when retaining principal components.
- Returns:
- the proportion of variance to account for
-
maximumAttributeNamesTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setMaximumAttributeNames
public void setMaximumAttributeNames(int m)
Sets maximum number of attributes to include in transformed attribute names.
- Parameters:
m - the maximum number of attributes
-
getMaximumAttributeNames
public int getMaximumAttributeNames()
Gets maximum number of attributes to include in transformed attribute names.
- Returns:
- the maximum number of attributes
-
transformBackToOriginalTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setTransformBackToOriginal
public void setTransformBackToOriginal(boolean b)
Sets whether the data should be transformed back to the original space.
- Parameters:
b - true if the data should be transformed back to the original space
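The idea of going through the PC space and back can be sketched as a projection onto the retained eigenvectors followed by a reconstruction. The example below is a simplified illustration, not Weka's code: `ProjectAndBackSketch` and `filter` are hypothetical names, the eigenvectors are supplied by hand rather than computed, and the data is assumed to be centered only (as with the -C option), so no rescaling step is shown.

```java
import java.util.Arrays;

// Illustrative sketch only; names are hypothetical and this is not Weka's implementation.
class ProjectAndBackSketch {

    /**
     * Projects x onto the first 'keep' eigenvectors and maps the result back
     * to the original attribute space.
     * Assumptions: eigenvectors[i] is a unit-length eigenvector, the vectors
     * are mutually orthogonal and ordered by decreasing eigenvalue.
     */
    static double[] filter(double[] x, double[] mean, double[][] eigenvectors, int keep) {
        int m = x.length;
        double[] result = mean.clone(); // start from the mean, then add back each retained component
        for (int i = 0; i < keep; i++) {
            // Score of x on the i-th principal component.
            double score = 0.0;
            for (int j = 0; j < m; j++) {
                score += (x[j] - mean[j]) * eigenvectors[i][j];
            }
            // Contribution of that component in the original space.
            for (int j = 0; j < m; j++) {
                result[j] += score * eigenvectors[i][j];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double r = Math.sqrt(0.5);
        double[][] eigenvectors = {{r, r}, {r, -r}}; // assumed, hand-picked orthonormal basis
        double[] mean = {0.0, 0.0};
        double[] x = {3.0, 1.0};
        // Keeping one component discards the part of x along the second eigenvector.
        System.out.println(Arrays.toString(filter(x, mean, eigenvectors, 1)));
        // Keeping every component reproduces x up to rounding error.
        System.out.println(Arrays.toString(filter(x, mean, eigenvectors, 2)));
    }
}
```

Dropping the trailing component moves the point (3, 1) to approximately (2, 2), its projection onto the first eigenvector. This is the sense in which discarding low-variance directions filters attribute noise.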
-
getTransformBackToOriginal
public boolean getTransformBackToOriginal()
Gets whether the data is to be transformed back to the original space.
- Returns:
- true if the data is to be transformed back to the original space
-
getOptions
Gets the current settings of PrincipalComponents.
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class ASEvaluation
- Returns:
- an array of strings suitable for passing to setOptions()
-
getCapabilities
Returns the capabilities of this evaluator.
- Specified by:
getCapabilities in interface CapabilitiesHandler
- Overrides:
getCapabilities in class ASEvaluation
- Returns:
- the capabilities of this evaluator
-
buildEvaluator
Initializes principal components and performs the analysis.
- Specified by:
buildEvaluator in class ASEvaluation
- Parameters:
data - the instances to analyse/transform
- Throws:
Exception - if analysis fails
-
initializeAndComputeMatrix
Initializes the evaluator, filters the input data and computes the correlation/covariance matrix.
- Parameters:
data - the instances to analyse
- Throws:
Exception - if a problem occurs
-
transformedHeader
Returns just the header for the transformed data (i.e. an empty set of instances). This is so that AttributeSelection can determine the structure of the transformed data without actually having to get all the transformed data through transformedData().
- Specified by:
transformedHeader in interface AttributeTransformer
- Returns:
- the header of the transformed data.
- Throws:
Exception - if the header of the transformed data can't be determined.
-
getFilteredInputFormat
Return the header of the training data after all filtering, i.e. missing values and nominal to binary.
- Returns:
- the header of the training data after all filtering.
-
getCorrelationMatrix
public double[][] getCorrelationMatrix()
Return the correlation/covariance matrix.
- Returns:
- the correlation or covariance matrix
-
getUnsortedEigenVectors
public double[][] getUnsortedEigenVectors()
Return the unsorted eigenvectors.
- Returns:
- the unsorted eigenvectors
-
getEigenValues
public double[] getEigenValues()
Return the eigenvalues corresponding to the eigenvectors.
- Returns:
- the eigenvalues
-
transformedData
Gets the transformed training data.
- Specified by:
transformedData in interface AttributeTransformer
- Returns:
- the transformed training data
- Throws:
Exception - if transformed data can't be returned
-
evaluateAttribute
Evaluates the merit of a transformed attribute. This is defined to be 1 minus the cumulative variance explained. Merit can't be meaningfully evaluated if the data is to be transformed back to the original space.
- Specified by:
evaluateAttribute in interface AttributeEvaluator
- Parameters:
att - the attribute to be evaluated
- Returns:
- the merit of a transformed attribute
- Throws:
Exception - if attribute can't be evaluated
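The merit definition above (1 minus the cumulative variance explained) can be worked through with a short standalone sketch. It is an illustration under stated assumptions, not Weka's code: `MeritSketch` and `merit` are hypothetical names, the eigenvalues are assumed to be sorted in decreasing order, and the cumulative sum is assumed to include the evaluated attribute itself.

```java
// Illustrative sketch only; names are hypothetical and this is not Weka's implementation.
class MeritSketch {

    /**
     * Merit of the transformed attribute at zero-based index att.
     * Assumptions: sortedEigenvalues is in decreasing order and the cumulative
     * variance includes the component at index att.
     */
    static double merit(double[] sortedEigenvalues, int att) {
        double total = 0.0;
        for (double v : sortedEigenvalues) {
            total += v;
        }
        double cumulative = 0.0;
        for (int i = 0; i <= att; i++) {
            cumulative += sortedEigenvalues[i];
        }
        return 1.0 - cumulative / total;
    }

    public static void main(String[] args) {
        // Assumed example eigenvalues; their sum is 4.0.
        double[] eigenvalues = {2.5, 1.0, 0.4, 0.1};
        for (int att = 0; att < eigenvalues.length; att++) {
            System.out.println("component " + att + ": merit " + merit(eigenvalues, att));
        }
    }
}
```

Under these assumptions the merit falls from 0.375 for the first component toward 0 for the last, so the value reflects how much variance remains unexplained once that component and all earlier ones are kept.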
-
toString
Returns a description of this attribute transformer.
-
matrixToString
Return a matrix as a String.
- Parameters:
matrix - the matrix to be described as a string
- Returns:
- a String describing a matrix
-
convertInstance
Transform an instance in original (unnormalized) format. Convert back to the original space if requested.
- Specified by:
convertInstance in interface AttributeTransformer
- Parameters:
instance - an instance in the original (unnormalized) format
- Returns:
- a transformed instance
- Throws:
Exception - if instance can't be transformed
-
getRevision
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Overrides:
getRevision in class ASEvaluation
- Returns:
- the revision
-
main
Main method for testing this class.
- Parameters:
argv - should contain the command line arguments to the evaluator/transformer (see AttributeSelection)
-