Package weka.attributeSelection
Class WrapperSubsetEval
java.lang.Object
weka.attributeSelection.ASEvaluation
weka.attributeSelection.WrapperSubsetEval
- All Implemented Interfaces:
Serializable, SubsetEvaluator, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler, TechnicalInformationHandler
public class WrapperSubsetEval
extends ASEvaluation
implements SubsetEvaluator, OptionHandler, TechnicalInformationHandler
WrapperSubsetEval:
Evaluates attribute sets by using a learning scheme. Cross validation is used to estimate the accuracy of the learning scheme for a set of attributes.
For more information see:
Ron Kohavi, George H. John (1997). Wrappers for feature subset selection. Artificial Intelligence. 97(1-2):273-324.
BibTeX:
@article{Kohavi1997,
  author = {Ron Kohavi and George H. John},
  journal = {Artificial Intelligence},
  note = {Special issue on relevance},
  number = {1-2},
  pages = {273-324},
  title = {Wrappers for feature subset selection},
  volume = {97},
  year = {1997},
  ISSN = {0004-3702}
}
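The wrapper idea described above can be illustrated with a small, self-contained sketch. It is not Weka code: the class WrapperSketch, the 1-nearest-neighbour learner, and the "instance i goes to fold i % folds" assignment are all assumptions chosen to keep the example short. It scores an attribute subset by the cross-validated accuracy of a learner that sees only the selected attributes.

```java
import java.util.BitSet;

// Hypothetical sketch of the wrapper approach; all names are illustrative, not Weka's.
final class WrapperSketch {
    /** Squared Euclidean distance restricted to the attributes set in subset. */
    static double distance(double[] a, double[] b, BitSet subset) {
        double sum = 0.0;
        for (int j = subset.nextSetBit(0); j >= 0; j = subset.nextSetBit(j + 1)) {
            double d = a[j] - b[j];
            sum += d * d;
        }
        return sum;
    }

    /** Cross-validated accuracy of a 1-NN learner that uses only the chosen attributes. */
    static double crossValidatedAccuracy(double[][] x, int[] y, BitSet subset, int folds) {
        int correct = 0;
        for (int test = 0; test < x.length; test++) {
            int nearest = -1;
            double best = Double.POSITIVE_INFINITY;
            for (int train = 0; train < x.length; train++) {
                // Assumption: instance i belongs to fold i % folds; the test
                // instance's whole fold (including itself) is held out.
                if (train % folds == test % folds) {
                    continue;
                }
                double d = distance(x[test], x[train], subset);
                if (d < best) {
                    best = d;
                    nearest = train;
                }
            }
            if (nearest >= 0 && y[nearest] == y[test]) {
                correct++;
            }
        }
        return (double) correct / x.length;
    }

    public static void main(String[] args) {
        // Attribute 0 separates the two classes; attribute 1 does not.
        double[][] x = {
            {0.0, 5.0}, {0.1, 1.0}, {0.2, 3.0},
            {1.0, 5.1}, {1.1, 1.1}, {1.2, 3.1}
        };
        int[] y = {0, 0, 0, 1, 1, 1};
        for (int attribute = 0; attribute < 2; attribute++) {
            BitSet subset = new BitSet();
            subset.set(attribute);
            System.out.println(subset + " -> " + crossValidatedAccuracy(x, y, subset, 3));
        }
    }
}
```

A search method would call such a scoring function repeatedly for candidate subsets and keep the best one; the choice of learner determines which subsets look good.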
Valid options are:
-B <base learner> Class name of the base learner to use for accuracy estimation. Place any classifier options LAST on the command line, following a "--". E.g.: -B weka.classifiers.bayes.NaiveBayes ... -- -K (default: weka.classifiers.rules.ZeroR)
-F <num> Number of cross-validation folds to use for estimating accuracy. (default: 5)
-R <seed> Seed for cross-validation accuracy estimation. (default: 1)
-T <num> Threshold by which to execute another cross-validation (standard deviation, expressed as a percentage of the mean). (default: 0.01 (1%))
-E <acc | rmse | mae | f-meas | auc | auprc> Performance evaluation measure to use for selecting attributes. (default: accuracy for a discrete class and RMSE for a numeric class)
-IRclass <label | index> Optional class value (label or 1-based index) to use in conjunction with IR statistics (f-meas, auc or auprc). Omitting this option uses the class-weighted average.
Options specific to scheme weka.classifiers.rules.ZeroR:
-D If set, classifier is run in debug mode and may output additional info to the console
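The "--" convention in the option list can be sketched in plain Java. The helper splitAtDoubleDash and the class OptionSplitSketch below are hypothetical and not part of Weka; they only illustrate how an argument array separates into evaluator options and base-learner options.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka: splits a command line at the first "--".
final class OptionSplitSketch {
    /** Returns a two-element array: {evaluatorOptions, classifierOptions}. */
    static String[][] splitAtDoubleDash(String[] args) {
        int marker = Arrays.asList(args).indexOf("--");
        if (marker < 0) {
            // No "--": everything belongs to the evaluator.
            return new String[][] { args.clone(), new String[0] };
        }
        String[] evaluator = Arrays.copyOfRange(args, 0, marker);
        String[] classifier = Arrays.copyOfRange(args, marker + 1, args.length);
        return new String[][] { evaluator, classifier };
    }

    public static void main(String[] unused) {
        String[] args = {
            "-B", "weka.classifiers.bayes.NaiveBayes",
            "-F", "5", "-T", "0.01", "-R", "1",
            "--", "-K"
        };
        String[][] parts = splitAtDoubleDash(args);
        System.out.println("evaluator:  " + Arrays.toString(parts[0]));
        System.out.println("classifier: " + Arrays.toString(parts[1]));
    }
}
```

Everything after the "--" (here the single flag -K) is meant for the base learner named by -B, which is why classifier options must come last.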
- Version:
- $Revision: 15519 $
- Author:
- Mark Hall (mhall@cs.waikato.ac.nz)
-
Field Summary
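Modifier and Type  Field  Description
static final int EVAL_ACCURACY
static final int EVAL_AUC
static final int EVAL_AUPRC
static final int EVAL_CORRELATION
static final int EVAL_DEFAULT
static final int EVAL_FMEASURE
static final int EVAL_MAE
static final int EVAL_PLUGIN
static final int EVAL_RMSE
static final Tag[] TAGS_EVALUATION
Holds all tags for metrics
-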
Constructor Summary
-
Method Summary
Modifier and Type  Method  Description
void buildEvaluator(Instances data)
Generates an attribute evaluator.
String classifierTipText()
Returns the tip text for this property.
void clean()
Tells the evaluator that the attribute selection process is complete.
double evaluateSubset(BitSet subset)
Evaluates a subset of attributes.
String evaluationMeasureTipText()
Returns the tip text for this property.
String foldsTipText()
Returns the tip text for this property.
Capabilities getCapabilities()
Returns the capabilities of this evaluator.
Classifier getClassifier()
Get the classifier used as the base learner.
SelectedTag getEvaluationMeasure()
Gets the currently set performance evaluation measure used for selecting attributes.
int getFolds()
Get the number of folds used for accuracy estimation.
String getIRClassValue()
Get the class value (label or index) to use with IR metric evaluation of subsets.
String[] getOptions()
Gets the current settings of WrapperSubsetEval.
String getRevision()
Returns the revision string.
int getSeed()
Get the random number seed used for cross validation.
TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
double getThreshold()
Get the value of the threshold.
String globalInfo()
Returns a string describing this attribute evaluator.
String IRClassValueTipText()
Returns the tip text for this property.
Enumeration<Option> listOptions()
Returns an enumeration describing the available options.
static void main(String[] args)
Main method for testing this class.
String seedTipText()
Returns the tip text for this property.
void setClassifier(Classifier newClassifier)
Set the classifier to use for accuracy estimation.
void setEvaluationMeasure(SelectedTag newMethod)
Sets the performance evaluation measure to use for selecting attributes.
void setFolds(int f)
Set the number of folds to use for accuracy estimation.
void setIRClassValue(String val)
Set the class value (label or index) to use with IR metric evaluation of subsets.
void setOptions(String[] options)
Parses a given list of options.
void setSeed(int s)
Set the seed to use for cross validation.
void setThreshold(double t)
Set the value of the threshold for repeating cross validation.
String thresholdTipText()
Returns the tip text for this property.
String toString()
Returns a string describing the wrapper.
Methods inherited from class weka.attributeSelection.ASEvaluation
doNotCheckCapabilitiesTipText, forName, getDoNotCheckCapabilities, makeCopies, postExecution, postProcess, preExecution, run, runEvaluator, setDoNotCheckCapabilities
-
Field Details
-
EVAL_DEFAULT
public static final int EVAL_DEFAULT
-
EVAL_ACCURACY
public static final int EVAL_ACCURACY
-
EVAL_RMSE
public static final int EVAL_RMSE
-
EVAL_MAE
public static final int EVAL_MAE
-
EVAL_FMEASURE
public static final int EVAL_FMEASURE
-
EVAL_AUC
public static final int EVAL_AUC
-
EVAL_AUPRC
public static final int EVAL_AUPRC
-
EVAL_CORRELATION
public static final int EVAL_CORRELATION
-
EVAL_PLUGIN
public static final int EVAL_PLUGIN
-
TAGS_EVALUATION
public static final Tag[] TAGS_EVALUATION
Holds all tags for metrics
-
-
Constructor Details
-
WrapperSubsetEval
public WrapperSubsetEval()
Constructor. Calls resetOptions to set default options.
-
-
Method Details
-
globalInfo
Returns a string describing this attribute evaluator.
- Returns:
- a description of the evaluator suitable for displaying in the Explorer/Experimenter GUI
-
getTechnicalInformation
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
getTechnicalInformation
in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
listOptions
in interface OptionHandler
- Overrides:
listOptions
in class ASEvaluation
- Returns:
- an enumeration of all the available options
-
setOptions
Parses a given list of options.
Valid options are:
-B <base learner> Class name of the base learner to use for accuracy estimation. Place any classifier options LAST on the command line, following a "--". E.g.: -B weka.classifiers.bayes.NaiveBayes ... -- -K (default: weka.classifiers.rules.ZeroR)
-F <num> Number of cross-validation folds to use for estimating accuracy. (default: 5)
-R <seed> Seed for cross-validation accuracy estimation. (default: 1)
-T <num> Threshold by which to execute another cross-validation (standard deviation, expressed as a percentage of the mean). (default: 0.01 (1%))
-E <acc | rmse | mae | f-meas | auc | auprc> Performance evaluation measure to use for selecting attributes. (default: accuracy for a discrete class and RMSE for a numeric class)
-IRclass <label | index> Optional class value (label or 1-based index) to use in conjunction with IR statistics (f-meas, auc or auprc). Omitting this option uses the class-weighted average.
Options specific to scheme weka.classifiers.rules.ZeroR:
-D If set, the classifier is run in debug mode and may output additional info to the console
- Specified by:
setOptions
in interface OptionHandler
- Overrides:
setOptions
in class ASEvaluation
- Parameters:
options
- the list of options as an array of strings
- Throws:
Exception
- if an option is not supported
-
setIRClassValue
Set the class value (label or index) to use with IR metric evaluation of subsets. Leaving this unset will result in the class-weighted average for the IR metric being used.
- Parameters:
val
- the class label or 1-based index of the class label to use when evaluating subsets with an IR metric
-
getIRClassValue
Get the class value (label or index) to use with IR metric evaluation of subsets. Leaving this unset will result in the class-weighted average for the IR metric being used.
- Returns:
- the class label or 1-based index of the class label to use when evaluating subsets with an IR metric
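Because the value may be either a label or a 1-based index, a caller has to decide how to interpret the string. The sketch below shows one possible resolution order (number first, then label); the class IRClassValueSketch, the method resolve, and that ordering are assumptions made for illustration, not a description of Weka's internals.

```java
import java.util.List;

// Hypothetical sketch: maps a "label or 1-based index" string to a 0-based class index.
final class IRClassValueSketch {
    /** Returns the 0-based class index, or -1 if val is unset or matches nothing. */
    static int resolve(String val, List<String> classLabels) {
        if (val == null || val.trim().isEmpty()) {
            return -1; // unset: the caller would fall back to the class-weighted average
        }
        String trimmed = val.trim();
        try {
            // Assumption: a value that parses as an in-range integer is a 1-based index.
            int oneBased = Integer.parseInt(trimmed);
            if (oneBased >= 1 && oneBased <= classLabels.size()) {
                return oneBased - 1;
            }
        } catch (NumberFormatException notANumber) {
            // Not numeric: fall through and treat the value as a label.
        }
        return classLabels.indexOf(trimmed);
    }
}
```

With class labels "yes" and "no", both resolve("2", labels) and resolve("no", labels) select the second class.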
-
IRClassValueTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the Explorer/Experimenter GUI
-
evaluationMeasureTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the Explorer/Experimenter GUI
-
getEvaluationMeasure
Gets the currently set performance evaluation measure used for selecting attributes.
- Returns:
- the performance evaluation measure
-
setEvaluationMeasure
Sets the performance evaluation measure to use for selecting attributes.
- Parameters:
newMethod
- the new performance evaluation metric to use
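The documented default (accuracy for a discrete class, RMSE for a numeric class) can be expressed as a small decision function. The enum Measure and the class MeasureDefaultSketch are hypothetical stand-ins for illustration; they do not mirror the EVAL_* constants or the SelectedTag mechanism.

```java
// Hypothetical sketch of the documented default rule; not Weka's tag mechanism.
final class MeasureDefaultSketch {
    enum Measure { ACC, RMSE, MAE, F_MEAS, AUC, AUPRC }

    /** Picks the measure to use; null means "no explicit choice was made". */
    static Measure resolve(Measure requested, boolean classIsDiscrete) {
        if (requested != null) {
            return requested;
        }
        return classIsDiscrete ? Measure.ACC : Measure.RMSE;
    }

    /** Error measures are better when smaller; the others are better when larger. */
    static boolean lowerIsBetter(Measure measure) {
        return measure == Measure.RMSE || measure == Measure.MAE;
    }
}
```

The lowerIsBetter distinction matters to a search method, which needs a single direction of improvement when it compares subsets.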
-
thresholdTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the Explorer/Experimenter GUI
-
setThreshold
public void setThreshold(double t)
Set the value of the threshold for repeating cross validation.
- Parameters:
t
- the value of the threshold
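The threshold is described in the option list as a standard deviation expressed as a percentage of the mean. A minimal sketch of such a test follows; the class RepeatThresholdSketch, the use of the sample standard deviation, and the handling of the single-run and zero-mean cases are assumptions, not Weka's exact logic.

```java
// Hypothetical sketch, not Weka's source: decides whether another
// cross-validation run is warranted given the results collected so far.
final class RepeatThresholdSketch {
    static boolean shouldRepeat(double[] results, double threshold) {
        if (results.length < 2) {
            return true; // assumption: a single run says nothing about spread
        }
        double mean = 0.0;
        for (double r : results) {
            mean += r;
        }
        mean /= results.length;
        if (mean == 0.0) {
            return false; // assumption: avoid dividing by a zero mean
        }
        double sumSquares = 0.0;
        for (double r : results) {
            sumSquares += (r - mean) * (r - mean);
        }
        double stdDev = Math.sqrt(sumSquares / (results.length - 1));
        // Repeat while the relative spread exceeds the threshold (0.01 means 1%).
        return stdDev / Math.abs(mean) > threshold;
    }
}
```

A smaller threshold therefore asks for more stable estimates at the cost of additional cross-validation runs.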
-
getThreshold
public double getThreshold()
Get the value of the threshold.
- Returns:
- the threshold as a double
-
foldsTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the Explorer/Experimenter GUI
-
setFolds
public void setFolds(int f)
Set the number of folds to use for accuracy estimation.
- Parameters:
f
- the number of folds
-
getFolds
public int getFolds()
Get the number of folds used for accuracy estimation.
- Returns:
- the number of folds
-
seedTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the Explorer/Experimenter GUI
-
setSeed
public void setSeed(int s)
Set the seed to use for cross validation.
- Parameters:
s
- the seed
-
getSeed
public int getSeed()
Get the random number seed used for cross validation.
- Returns:
- the seed
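The folds and seed settings together determine how instances are partitioned. The sketch below shows the general idea of a seeded, reproducible fold assignment using only java.util; the class SeededFoldsSketch and its shuffle-then-deal scheme are illustrative assumptions and need not match how Weka randomizes and stratifies data.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of a reproducible fold assignment; not Weka's implementation.
final class SeededFoldsSketch {
    /** Returns foldOf, where foldOf[i] is the fold index assigned to instance i. */
    static int[] assignFolds(int numInstances, int folds, long seed) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < numInstances; i++) {
            order.add(i);
        }
        // The same seed always produces the same shuffled order.
        Collections.shuffle(order, new Random(seed));
        int[] foldOf = new int[numInstances];
        for (int position = 0; position < order.size(); position++) {
            // Deal shuffled instances round-robin so fold sizes differ by at most one.
            foldOf[order.get(position)] = position % folds;
        }
        return foldOf;
    }
}
```

Fixing the seed makes subset scores comparable across runs, since every subset is evaluated on the same partition.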
-
classifierTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the Explorer/Experimenter GUI
-
setClassifier
Set the classifier to use for accuracy estimation.
- Parameters:
newClassifier
- the Classifier to use
-
getClassifier
Get the classifier used as the base learner.
- Returns:
- the classifier used as the base learner
-
getOptions
Gets the current settings of WrapperSubsetEval.
- Specified by:
getOptions
in interface OptionHandler
- Overrides:
getOptions
in class ASEvaluation
- Returns:
- an array of strings suitable for passing to setOptions()
-
getCapabilities
Returns the capabilities of this evaluator.
- Specified by:
getCapabilities
in interface CapabilitiesHandler
- Overrides:
getCapabilities
in class ASEvaluation
- Returns:
- the capabilities of this evaluator
-
buildEvaluator
Generates an attribute evaluator. Has to initialize all fields of the evaluator that are not being set via options.
- Specified by:
buildEvaluator
in class ASEvaluation
- Parameters:
data
- set of instances serving as training data
- Throws:
Exception
- if the evaluator has not been generated successfully
-
evaluateSubset
Evaluates a subset of attributes.
- Specified by:
evaluateSubset
in interface SubsetEvaluator
- Parameters:
subset
- a bitset representing the attribute subset to be evaluated
- Returns:
- the error rate
- Throws:
Exception
- if the subset could not be evaluated
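The subset parameter is a java.util.BitSet in which each set bit marks a selected attribute. The helper class below is a hypothetical illustration of building and reading such a bitset; it assumes 0-based attribute positions and uses only the standard library.

```java
import java.util.BitSet;

// Hypothetical helpers for working with attribute subsets encoded as a BitSet.
final class SubsetBitSetSketch {
    /** Builds a subset from attribute positions (assumed to be 0-based). */
    static BitSet subsetOf(int... attributeIndices) {
        BitSet subset = new BitSet();
        for (int index : attributeIndices) {
            subset.set(index);
        }
        return subset;
    }

    /** Lists the selected attribute positions in ascending order. */
    static int[] selectedIndices(BitSet subset) {
        return subset.stream().toArray();
    }
}
```

A bitset built this way, for example subsetOf(0, 3, 4), has the shape that evaluateSubset expects as its argument.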
-
toString
Returns a string describing the wrapper.
-
getRevision
Returns the revision string.
- Specified by:
getRevision
in interface RevisionHandler
- Overrides:
getRevision
in class ASEvaluation
- Returns:
- the revision
-
clean
public void clean()
Description copied from class: ASEvaluation
Tells the evaluator that the attribute selection process is complete. It can then clean up data structures and references to training data as necessary in order to save memory.
- Overrides:
clean
in class ASEvaluation
-
main
Main method for testing this class.
- Parameters:
args
- the options
-