Package weka.attributeSelection
Class CfsSubsetEval
java.lang.Object
weka.attributeSelection.ASEvaluation
weka.attributeSelection.CfsSubsetEval
- All Implemented Interfaces:
- Serializable, SubsetEvaluator, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler, TechnicalInformationHandler, ThreadSafe
public class CfsSubsetEval
extends ASEvaluation
implements SubsetEvaluator, ThreadSafe, OptionHandler, TechnicalInformationHandler
CfsSubsetEval :
Evaluates the worth of a subset of attributes by considering the individual predictive ability of each feature along with the degree of redundancy between them.
Subsets of features that are highly correlated with the class while having low intercorrelation are preferred.
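The preference stated above is commonly summarized by the merit heuristic from the thesis cited below. As a hedged sketch (the symbols $k$, $\overline{r_{cf}}$ and $\overline{r_{ff}}$ are notation introduced here for illustration, and the computation performed by this class may differ in detail), the merit of a subset $S$ containing $k$ features can be written as:

```latex
\mathrm{Merit}_S = \frac{k\,\overline{r_{cf}}}{\sqrt{k + k(k-1)\,\overline{r_{ff}}}}
```

Here $\overline{r_{cf}}$ is the mean feature-class correlation and $\overline{r_{ff}}$ is the mean feature-feature intercorrelation within $S$. Higher class correlation raises the merit, while higher intercorrelation (redundancy) lowers it.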
For more information see:
M. A. Hall (1998). Correlation-based Feature Subset Selection for Machine Learning. Hamilton, New Zealand. BibTeX:
@phdthesis{Hall1998,
   address = {Hamilton, New Zealand},
   author = {M. A. Hall},
   school = {University of Waikato},
   title = {Correlation-based Feature Subset Selection for Machine Learning},
   year = {1998}
}

Valid options are:
-M Treat missing values as a separate value.
-L Don't include locally predictive attributes.
-Z Precompute the full correlation matrix at the outset, rather than compute correlations lazily (as needed) during the search. Use this in conjunction with parallel processing in order to speed up a backward search.
-P <int> The size of the thread pool, for example, the number of cores in the CPU. (default 1)
-E <int> The number of threads to use, which should be >= size of thread pool. (default 1)
-D Output debugging info.
- Version:
- $Revision: 15519 $
- Author:
- Mark Hall (mhall@cs.waikato.ac.nz)
-
Constructor Summary
CfsSubsetEval(): Constructor
-
Method Summary
Modifier and Type, Method, Description
void buildEvaluator(Instances data): Generates an attribute evaluator.
void clean(): Tells the evaluator that the attribute selection process is complete.
String debugTipText(): Returns the tip text for this property.
double evaluateSubset(BitSet subset): Evaluates a subset of attributes.
Capabilities getCapabilities(): Returns the capabilities of this evaluator.
boolean getDebug(): Get whether to output debugging info.
boolean getLocallyPredictive(): Return true if including locally predictive attributes.
boolean getMissingSeparate(): Return true if missing is treated as a separate value.
int getNumThreads(): Gets the number of threads.
String[] getOptions(): Gets the current settings of CfsSubsetEval.
int getPoolSize(): Gets the size of the thread pool.
boolean getPreComputeCorrelationMatrix(): Get whether to pre-compute the full correlation matrix at the outset, rather than computing individual correlations lazily (as needed) during the search.
String getRevision(): Returns the revision string.
TechnicalInformation getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
String globalInfo(): Returns a string describing this attribute evaluator.
Enumeration<Option> listOptions(): Returns an enumeration describing the available options.
String locallyPredictiveTipText(): Returns the tip text for this property.
static void main(String[] args): Main method for testing this class.
String missingSeparateTipText(): Returns the tip text for this property.
String numThreadsTipText(): Returns a string describing the option.
String poolSizeTipText(): Returns a string describing the option.
int[] postProcess(int[] attributeSet): Calls locallyPredictive in order to include locally predictive attributes (if requested).
String preComputeCorrelationMatrixTipText(): Returns a string describing the option.
void setDebug(boolean d): Set whether to output debugging info.
void setLocallyPredictive(boolean b): Include locally predictive attributes.
void setMissingSeparate(boolean b): Treat missing as a separate value.
void setNumThreads(int nT): Sets the number of threads.
void setOptions(String[] options): Parses and sets a given list of options.
void setPoolSize(int nT): Sets the size of the thread pool.
void setPreComputeCorrelationMatrix(boolean p): Set whether to pre-compute the full correlation matrix at the outset, rather than computing individual correlations lazily (as needed) during the search.
String toString(): Returns a string describing CFS.
Methods inherited from class weka.attributeSelection.ASEvaluation
doNotCheckCapabilitiesTipText, forName, getDoNotCheckCapabilities, makeCopies, postExecution, preExecution, run, runEvaluator, setDoNotCheckCapabilities
-
Constructor Details
-
CfsSubsetEval
public CfsSubsetEval()
Constructor
-
-
Method Details
-
globalInfo
Returns a string describing this attribute evaluator.
- Returns:
- a description of the evaluator suitable for displaying in the explorer/experimenter gui
-
getTechnicalInformation
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
- getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Overrides:
- listOptions in class ASEvaluation
- Returns:
- an enumeration of all the available options.
-
setOptions
Parses and sets a given list of options. Valid options are:
-M Treat missing values as a separate value.
-L Don't include locally predictive attributes.
-Z Precompute the full correlation matrix at the outset, rather than compute correlations lazily (as needed) during the search. Use this in conjunction with parallel processing in order to speed up a backward search.
-P <int> The size of the thread pool, for example, the number of cores in the CPU. (default 1)
-E <int> The number of threads to use, which should be >= size of thread pool. (default 1)
-D Output debugging info.
- Specified by:
- setOptions in interface OptionHandler
- Overrides:
- setOptions in class ASEvaluation
- Parameters:
options
- the list of options as an array of strings
- Throws:
Exception
- if an option is not supported
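
As an illustration of the array format that setOptions(String[]) expects, the following minimal sketch assembles the flags listed above into a String[]. The class CfsOptionsSketch and its buildOptions helper are hypothetical names introduced here, not part of Weka, and the sketch deliberately avoids any Weka dependency; it only shows that each flag and each flag value is a separate array element.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Weka) that assembles a CfsSubsetEval option array. */
final class CfsOptionsSketch {

    /**
     * Builds an option array from the flags documented for setOptions.
     * Flags and their values are separate elements, e.g. {"-P", "4"}.
     */
    static String[] buildOptions(boolean missingSeparate,
                                 boolean excludeLocallyPredictive,
                                 boolean precomputeMatrix,
                                 int poolSize,
                                 int numThreads,
                                 boolean debug) {
        List<String> opts = new ArrayList<>();
        if (missingSeparate) {
            opts.add("-M"); // treat missing values as a separate value
        }
        if (excludeLocallyPredictive) {
            opts.add("-L"); // don't include locally predictive attributes
        }
        if (precomputeMatrix) {
            opts.add("-Z"); // precompute the full correlation matrix
        }
        opts.add("-P");     // size of the thread pool
        opts.add(Integer.toString(poolSize));
        opts.add("-E");     // number of threads to use
        opts.add(Integer.toString(numThreads));
        if (debug) {
            opts.add("-D"); // output debugging info
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(false, false, true, 4, 4, false);
        System.out.println(String.join(" ", options)); // prints "-Z -P 4 -E 4"
    }
}
```

An array built this way could then be handed to setOptions on an evaluator instance. Note that -L is a negative flag: passing it excludes locally predictive attributes.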
-
preComputeCorrelationMatrixTipText
- Returns:
- a string to describe the option
-
setPreComputeCorrelationMatrix
public void setPreComputeCorrelationMatrix(boolean p)
Set whether to pre-compute the full correlation matrix at the outset, rather than computing individual correlations lazily (as needed) during the search.
- Parameters:
p
- true if the correlation matrix is to be pre-computed at the outset
-
getPreComputeCorrelationMatrix
public boolean getPreComputeCorrelationMatrix()
Get whether to pre-compute the full correlation matrix at the outset, rather than computing individual correlations lazily (as needed) during the search.
- Returns:
- true if the correlation matrix is to be pre-computed at the outset
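
To make the idea of pre-computing with parallel processing concrete, here is a minimal, self-contained sketch of filling a full correlation matrix up front with a fixed-size thread pool. It is not Weka's implementation: the class CorrelationMatrixSketch, the row-striping scheme, and the use of Pearson correlation as the pairwise measure are all assumptions made for illustration. The parameters poolSize and numThreads loosely mirror the -P and -E options (pool size and number of tasks).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative sketch only; not how Weka computes its correlation matrix. */
final class CorrelationMatrixSketch {

    /** Pearson correlation of two equal-length columns; 0 if either column is constant. */
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double mx = 0, my = 0;
        for (int i = 0; i < n; i++) {
            mx += x[i];
            my += y[i];
        }
        mx /= n;
        my /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - mx;
            double dy = y[i] - my;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return (sxx == 0 || syy == 0) ? 0.0 : sxy / Math.sqrt(sxx * syy);
    }

    /** Fills the whole matrix eagerly, splitting rows across numThreads tasks. */
    static double[][] precompute(double[][] columns, int poolSize, int numThreads) {
        int m = columns.length;
        double[][] corr = new double[m][m];
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int t = 0; t < numThreads; t++) {
                final int offset = t;
                pending.add(pool.submit(() -> {
                    // Task t owns rows t, t + numThreads, ... so no two tasks write the same cell.
                    for (int i = offset; i < m; i += numThreads) {
                        for (int j = 0; j <= i; j++) {
                            corr[i][j] = (i == j) ? 1.0 : pearson(columns[i], columns[j]);
                        }
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion; also makes the writes visible to this thread
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("correlation task failed", e);
        } finally {
            pool.shutdown();
        }
        // Mirror the lower triangle into the upper triangle.
        for (int i = 0; i < m; i++) {
            for (int j = i + 1; j < m; j++) {
                corr[i][j] = corr[j][i];
            }
        }
        return corr;
    }

    public static void main(String[] args) {
        double[][] columns = { {1, 2, 3}, {2, 4, 6}, {3, 2, 1} };
        double[][] corr = precompute(columns, 2, 4);
        System.out.println(corr[0][1] + " " + corr[0][2]);
    }
}
```

The lazy alternative would compute an entry only when a search first asks for it. Eager computation pays the full pairwise cost once, which is worthwhile when the search is likely to touch most pairs anyway, as in a backward search that starts from the full attribute set.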
-
numThreadsTipText
- Returns:
- a string to describe the option
-
getNumThreads
public int getNumThreads()
Gets the number of threads.
-
setNumThreads
public void setNumThreads(int nT)
Sets the number of threads.
-
poolSizeTipText
- Returns:
- a string to describe the option
-
getPoolSize
public int getPoolSize()
Gets the size of the thread pool.
-
setPoolSize
public void setPoolSize(int nT)
Sets the size of the thread pool.
-
locallyPredictiveTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setLocallyPredictive
public void setLocallyPredictive(boolean b)
Include locally predictive attributes.
- Parameters:
b
- true or false
-
getLocallyPredictive
public boolean getLocallyPredictive()
Return true if including locally predictive attributes.
- Returns:
- true if locally predictive attributes are to be used
-
missingSeparateTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setMissingSeparate
public void setMissingSeparate(boolean b)
Treat missing as a separate value.
- Parameters:
b
- true or false
-
getMissingSeparate
public boolean getMissingSeparate()
Return true if missing is treated as a separate value.
- Returns:
- true if missing is to be treated as a separate value
-
setDebug
public void setDebug(boolean d)
Set whether to output debugging info.
- Parameters:
d
- true if debugging info is to be output
-
getDebug
public boolean getDebug()
Get whether to output debugging info.
- Returns:
- true if debugging info is to be output
-
debugTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getOptions
Gets the current settings of CfsSubsetEval.
- Specified by:
- getOptions in interface OptionHandler
- Overrides:
- getOptions in class ASEvaluation
- Returns:
- an array of strings suitable for passing to setOptions()
-
getCapabilities
Returns the capabilities of this evaluator.
- Specified by:
- getCapabilities in interface CapabilitiesHandler
- Overrides:
- getCapabilities in class ASEvaluation
- Returns:
- the capabilities of this evaluator
-
buildEvaluator
Generates an attribute evaluator. Has to initialize all fields of the evaluator that are not being set via options. CFS also discretises attributes (if necessary) and initializes the correlation matrix.
- Specified by:
- buildEvaluator in class ASEvaluation
- Parameters:
data
- set of instances serving as training data
- Throws:
Exception
- if the evaluator has not been generated successfully
-
evaluateSubset
Evaluates a subset of attributes.
- Specified by:
- evaluateSubset in interface SubsetEvaluator
- Parameters:
subset
- a bitset representing the attribute subset to be evaluated
- Returns:
- the merit
- Throws:
Exception
- if the subset could not be evaluated
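
To illustrate what a merit for a BitSet-encoded subset could look like, the sketch below applies the merit heuristic from the thesis cited in the class description, k * mean(r_cf) / sqrt(k + k(k-1) * mean(r_ff)), to precomputed correlations. The class CfsMeritSketch, its merit method, and the classCorr and featCorr arrays are hypothetical; correlations are assumed to lie in [0, 1], and the actual evaluateSubset implementation may differ in detail.

```java
import java.util.BitSet;

/** Illustrative sketch of a CFS-style merit; not the code behind evaluateSubset. */
final class CfsMeritSketch {

    /**
     * @param subset    bit i is set if attribute i is in the subset
     * @param classCorr classCorr[i] = correlation of attribute i with the class (assumed in [0, 1])
     * @param featCorr  featCorr[i][j] = correlation between attributes i and j (assumed in [0, 1])
     * @return the merit of the subset, or 0 for an empty subset
     */
    static double merit(BitSet subset, double[] classCorr, double[][] featCorr) {
        int k = subset.cardinality();
        if (k == 0) {
            return 0.0;
        }
        double sumCf = 0.0;
        double sumFf = 0.0;
        for (int i = subset.nextSetBit(0); i >= 0; i = subset.nextSetBit(i + 1)) {
            sumCf += classCorr[i];
            // Each unordered pair (i, j) with i < j is visited once.
            for (int j = subset.nextSetBit(i + 1); j >= 0; j = subset.nextSetBit(j + 1)) {
                sumFf += featCorr[i][j];
            }
        }
        double meanCf = sumCf / k;
        double pairs = k * (k - 1) / 2.0;
        double meanFf = (k > 1) ? sumFf / pairs : 0.0;
        return k * meanCf / Math.sqrt(k + k * (k - 1) * meanFf);
    }

    public static void main(String[] args) {
        double[] classCorr = {0.8, 0.6, 0.1};
        double[][] featCorr = {
            {1.0, 0.5, 0.0},
            {0.5, 1.0, 0.0},
            {0.0, 0.0, 1.0}
        };
        BitSet subset = new BitSet(3);
        subset.set(0);
        subset.set(1);
        System.out.println(merit(subset, classCorr, featCorr));
    }
}
```

For a single attribute the expression reduces to that attribute's class correlation, which is a quick sanity check on the formula.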
-
toString
Returns a string describing CFS.
-
postProcess
Calls locallyPredictive in order to include locally predictive attributes (if requested).
- Overrides:
- postProcess in class ASEvaluation
- Parameters:
attributeSet
- the set of attributes found by the search
- Returns:
- a possibly ranked list of postprocessed attributes
- Throws:
Exception
- if postprocessing fails for some reason
-
clean
public void clean()
Description copied from class: ASEvaluation
Tells the evaluator that the attribute selection process is complete. It can then clean up data structures and references to training data as necessary in order to save memory.
- Overrides:
- clean in class ASEvaluation
-
getRevision
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Overrides:
- getRevision in class ASEvaluation
- Returns:
- the revision
-
main
Main method for testing this class.
- Parameters:
args
- the options
-