Package weka.attributeSelection
Class ClassifierSubsetEval
java.lang.Object
weka.attributeSelection.ASEvaluation
weka.attributeSelection.HoldOutSubsetEvaluator
weka.attributeSelection.ClassifierSubsetEval
- All Implemented Interfaces:
Serializable
,ErrorBasedMeritEvaluator
,SubsetEvaluator
,CapabilitiesHandler
,CapabilitiesIgnorer
,CommandlineRunnable
,OptionHandler
,RevisionHandler
public class ClassifierSubsetEval
extends HoldOutSubsetEvaluator
implements OptionHandler, ErrorBasedMeritEvaluator
Classifier subset evaluator:
Evaluates attribute subsets on training data or a separate hold out testing set. Uses a classifier to estimate the 'merit' of a set of attributes.
Valid options are:
-B <classifier> Class name of the classifier to use for accuracy estimation. Place any classifier options LAST on the command line, following a "--", e.g.: -B weka.classifiers.bayes.NaiveBayes ... -- -K (default: weka.classifiers.rules.ZeroR)
-T Use the training data to estimate accuracy.
-H <filename> Name of the hold out/test set to estimate accuracy on.
-percentage-split Perform a percentage split on the training data. Use in conjunction with -T.
-P Split percentage to use (default = 90).
-S Random seed for percentage split (default = 1).
-E <DEFAULT|ACC|RMSE|MAE|F-MEAS|AUC|AUPRC|CORR-COEFF> Performance evaluation measure to use for selecting attributes. (Default = default: accuracy for discrete class and rmse for numeric class)
-IRclass <label | index> Optional class value (label or 1-based index) to use in conjunction with IR statistics (f-meas, auc or auprc). Omitting this option will use the class-weighted average.
Options specific to scheme weka.classifiers.rules.ZeroR:
-output-debug-info If set, classifier is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, classifier capabilities are not checked before classifier is built (use with caution).
-num-decimal-places The number of decimal places for the output of numbers in the model (default 2).
-batch-size The desired batch size for batch prediction (default 100).
- Version:
- $Revision: 10332 $
- Author:
- Mark Hall (mhall@cs.waikato.ac.nz)
Field Summary
static final int EVAL_DEFAULT
static final int EVAL_ACCURACY
static final int EVAL_RMSE
static final int EVAL_MAE
static final int EVAL_FMEASURE
static final int EVAL_AUC
static final int EVAL_AUPRC
static final int EVAL_CORRELATION
static final int EVAL_PLUGIN
static final Tag[] TAGS_EVALUATION
    Holds all tags for metrics
-
Constructor Summary
ClassifierSubsetEval()
-
Method Summary
void buildEvaluator(Instances data)
    Generates an attribute evaluator.
String classifierTipText()
    Returns the tip text for this property
double evaluateSubset(BitSet subset)
    Evaluates a subset of attributes
double evaluateSubset(BitSet subset, Instance holdOut, boolean retrain)
    Evaluates a subset of attributes with respect to a single instance.
double evaluateSubset(BitSet subset, Instances holdOut)
    Evaluates a subset of attributes with respect to a set of instances.
String evaluationMeasureTipText()
    Returns the tip text for this property
Capabilities getCapabilities()
    Returns the capabilities of this evaluator.
Classifier getClassifier()
    Get the classifier used as the base learner.
SelectedTag getEvaluationMeasure()
    Gets the currently set performance evaluation measure used for selecting attributes for the decision table
File getHoldOutFile()
    Gets the file that holds hold out/test instances.
String getIRClassValue()
    Get the class value (label or index) to use with IR metric evaluation of subsets.
String[] getOptions()
    Gets the current settings of ClassifierSubsetEval
String getRevision()
    Returns the revision string.
int getSeed()
    Get the random seed used to randomize the data before performing a percentage split
String getSplitPercent()
    Get the split percentage to use
boolean getUsePercentageSplit()
    Get whether to perform a percentage split on the training data for evaluation
boolean getUseTraining()
    Get if training data is to be used instead of hold out/test data
String globalInfo()
    Returns a string describing this attribute evaluator
String holdOutFileTipText()
    Returns the tip text for this property
String IRClassValueTipText()
    Returns the tip text for this property
Enumeration<Option> listOptions()
    Returns an enumeration describing the available options.
static void main(String[] args)
    Main method for testing this class.
String seedTipText()
    Returns the tip text for this property
void setClassifier(Classifier newClassifier)
    Set the classifier to use for accuracy estimation
void setEvaluationMeasure(SelectedTag newMethod)
    Sets the performance evaluation measure to use for selecting attributes for the decision table
void setHoldOutFile(File h)
    Set the file that contains hold out/test instances
void setIRClassValue(String val)
    Set the class value (label or index) to use with IR metric evaluation of subsets.
void setOptions(String[] options)
    Parses a given list of options.
void setSeed(int s)
    Set the random seed used to randomize the data before performing a percentage split
void setSplitPercent(String sp)
    Set the split percentage to use
void setUsePercentageSplit(boolean p)
    Set whether to perform a percentage split on the training data for evaluation
void setUseTraining(boolean t)
    Set if training data is to be used instead of hold out/test data
String splitPercentTipText()
    Returns the tip text for this property
String toString()
    Returns a string describing ClassifierSubsetEval
String usePercentageSplitTipText()
    Returns the tip text for this property
String useTrainingTipText()
    Returns the tip text for this property
Methods inherited from class weka.attributeSelection.ASEvaluation
clean, doNotCheckCapabilitiesTipText, forName, getDoNotCheckCapabilities, makeCopies, postExecution, postProcess, preExecution, run, runEvaluator, setDoNotCheckCapabilities
-
Field Details
-
EVAL_DEFAULT
public static final int EVAL_DEFAULT
-
EVAL_ACCURACY
public static final int EVAL_ACCURACY
-
EVAL_RMSE
public static final int EVAL_RMSE
-
EVAL_MAE
public static final int EVAL_MAE
-
EVAL_FMEASURE
public static final int EVAL_FMEASURE
-
EVAL_AUC
public static final int EVAL_AUC
-
EVAL_AUPRC
public static final int EVAL_AUPRC
-
EVAL_CORRELATION
public static final int EVAL_CORRELATION
-
EVAL_PLUGIN
public static final int EVAL_PLUGIN
-
TAGS_EVALUATION
Holds all tags for metrics
-
-
Constructor Details
-
ClassifierSubsetEval
public ClassifierSubsetEval()
-
-
Method Details
-
globalInfo
Returns a string describing this attribute evaluator
- Returns:
- a description of the evaluator suitable for displaying in the explorer/experimenter gui
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class ASEvaluation
- Returns:
- an enumeration of all the available options.
-
setOptions
Parses a given list of options. Valid options are:
-B <classifier> Class name of the classifier to use for accuracy estimation. Place any classifier options LAST on the command line, following a "--", e.g.: -B weka.classifiers.bayes.NaiveBayes ... -- -K (default: weka.classifiers.rules.ZeroR)
-T Use the training data to estimate accuracy.
-H <filename> Name of the hold out/test set to estimate accuracy on.
-percentage-split Perform a percentage split on the training data. Use in conjunction with -T.
-P Split percentage to use (default = 90).
-S Random seed for percentage split (default = 1).
-E <DEFAULT|ACC|RMSE|MAE|F-MEAS|AUC|AUPRC|CORR-COEFF> Performance evaluation measure to use for selecting attributes. (Default = default: accuracy for discrete class and rmse for numeric class)
-IRclass <label | index> Optional class value (label or 1-based index) to use in conjunction with IR statistics (f-meas, auc or auprc). Omitting this option will use the class-weighted average.
Options specific to scheme weka.classifiers.rules.ZeroR:
-output-debug-info If set, classifier is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, classifier capabilities are not checked before classifier is built (use with caution).
-num-decimal-places The number of decimal places for the output of numbers in the model (default 2).
-batch-size The desired batch size for batch prediction (default 100).
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class ASEvaluation
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
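The ordering rule for setOptions (evaluator options first, then "--", then the classifier's own options) can be sketched in plain Java. This is a minimal illustration and not part of Weka: the class OptionArrayDemo and the helper buildOptions are hypothetical names, and the -K flag is simply the one shown in the NaiveBayes example of the option list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OptionArrayDemo {

    // Hypothetical helper: puts "-B <classifier>" and the evaluator options
    // first, then "--", then the classifier-specific options LAST.
    static String[] buildOptions(String classifier,
                                 List<String> evaluatorOpts,
                                 List<String> classifierOpts) {
        List<String> opts = new ArrayList<>();
        opts.add("-B");
        opts.add(classifier);
        opts.addAll(evaluatorOpts);
        if (!classifierOpts.isEmpty()) {
            opts.add("--"); // everything after this belongs to the classifier
            opts.addAll(classifierOpts);
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] opts = buildOptions(
                "weka.classifiers.bayes.NaiveBayes",
                Arrays.asList("-T", "-percentage-split", "-P", "66", "-S", "1"),
                Arrays.asList("-K"));
        System.out.println(String.join(" ", opts));
    }
}
```

With Weka on the classpath, an array built this way is the kind of value that setOptions(String[] options) expects.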
-
seedTipText
Returns the tip text for this property
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setSeed
public void setSeed(int s)
Set the random seed used to randomize the data before performing a percentage split
- Parameters:
s - the seed to use
-
getSeed
public int getSeed()
Get the random seed used to randomize the data before performing a percentage split
- Returns:
- the seed to use
-
usePercentageSplitTipText
Returns the tip text for this property
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setUsePercentageSplit
public void setUsePercentageSplit(boolean p)
Set whether to perform a percentage split on the training data for evaluation
- Parameters:
p - true if a percentage split is to be performed
-
getUsePercentageSplit
public boolean getUsePercentageSplit()
Get whether to perform a percentage split on the training data for evaluation
- Returns:
- true if a percentage split is to be performed
-
splitPercentTipText
Returns the tip text for this property
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setSplitPercent
Set the split percentage to use
- Parameters:
sp - the split percentage to use
-
getSplitPercent
Get the split percentage to use
- Returns:
- the split percentage to use
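The seed and split percentage together determine a reproducible train/test partition. The sketch below shows one way such a seeded percentage split could work, using only the Java standard library. It is an assumption-laden illustration, not Weka's implementation: PercentageSplitDemo and split are hypothetical names, and Weka's own shuffling and rounding may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class PercentageSplitDemo {

    // Hypothetical helper: shuffles instance indices 0..n-1 with the given
    // seed, then returns [trainIndices, testIndices] split at `percent`.
    static List<List<Integer>> split(int n, double percent, long seed) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be in [0, 100]");
        }
        List<Integer> idx = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            idx.add(i);
        }
        Collections.shuffle(idx, new Random(seed)); // same seed, same order
        int trainSize = (int) Math.round(n * percent / 100.0);
        List<List<Integer>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(idx.subList(0, trainSize)));
        parts.add(new ArrayList<>(idx.subList(trainSize, n)));
        return parts;
    }

    public static void main(String[] args) {
        // Defaults from the option list: 90 percent for training, seed 1.
        List<List<Integer>> parts = split(10, 90, 1L);
        System.out.println("train size = " + parts.get(0).size());
        System.out.println("test size  = " + parts.get(1).size());
    }
}
```

Fixing the seed keeps the partition identical across runs, which is what makes merit scores for different subsets comparable.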
-
setIRClassValue
Set the class value (label or index) to use with IR metric evaluation of subsets. Leaving this unset will result in the class weighted average for the IR metric being used.
- Parameters:
val - the class label or 1-based index of the class label to use when evaluating subsets with an IR metric
-
getIRClassValue
Get the class value (label or index) to use with IR metric evaluation of subsets. Leaving this unset will result in the class weighted average for the IR metric being used.
- Returns:
- the class label or 1-based index of the class label to use when evaluating subsets with an IR metric
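Because the IR class value may be either a label or a 1-based index, a caller has to decide how to interpret the string. The sketch below shows one plausible resolution order (index first, then label). The class IRClassValueDemo, the helper resolve, the resolution order, and the use of -1 for "unset or not found" are all assumptions for illustration, not Weka's documented behaviour.

```java
import java.util.Arrays;
import java.util.List;

class IRClassValueDemo {

    // Hypothetical helper: returns the zero-based class index for `val`,
    // or -1 when `val` is unset or cannot be resolved.
    static int resolve(String val, List<String> labels) {
        if (val == null || val.trim().isEmpty()) {
            return -1; // unset: caller would fall back to the weighted average
        }
        String v = val.trim();
        try {
            int oneBased = Integer.parseInt(v); // try as a 1-based index
            return (oneBased >= 1 && oneBased <= labels.size()) ? oneBased - 1 : -1;
        } catch (NumberFormatException e) {
            return labels.indexOf(v); // otherwise treat it as a class label
        }
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("yes", "no");
        System.out.println(resolve("2", labels));
        System.out.println(resolve("no", labels));
        System.out.println(resolve("", labels));
    }
}
```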
-
IRClassValueTipText
Returns the tip text for this property
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
evaluationMeasureTipText
Returns the tip text for this property
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getEvaluationMeasure
Gets the currently set performance evaluation measure used for selecting attributes for the decision table
- Returns:
- the performance evaluation measure
-
setEvaluationMeasure
Sets the performance evaluation measure to use for selecting attributes for the decision table
- Parameters:
newMethod - the new performance evaluation metric to use
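The -E option accepts the tokens DEFAULT, ACC, RMSE, MAE, F-MEAS, AUC, AUPRC and CORR-COEFF, with DEFAULT meaning accuracy for a discrete class and RMSE for a numeric class. The sketch below models that default rule in plain Java. The enum Measure and the helper resolve are hypothetical; Weka itself represents the choice as a SelectedTag, and this code does not touch that API.

```java
import java.util.Locale;

class EvalMeasureDemo {

    // Hypothetical enum mirroring the -E tokens from the option list.
    enum Measure { ACC, RMSE, MAE, F_MEAS, AUC, AUPRC, CORR_COEFF }

    // Resolves a -E token; DEFAULT depends on whether the class is numeric.
    // Throws IllegalArgumentException for an unknown token.
    static Measure resolve(String token, boolean numericClass) {
        String t = token.trim().toUpperCase(Locale.ROOT);
        if (t.equals("DEFAULT")) {
            return numericClass ? Measure.RMSE : Measure.ACC;
        }
        return Measure.valueOf(t.replace('-', '_'));
    }

    public static void main(String[] args) {
        System.out.println(resolve("DEFAULT", false));
        System.out.println(resolve("DEFAULT", true));
        System.out.println(resolve("F-MEAS", false));
    }
}
```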
-
classifierTipText
Returns the tip text for this property
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setClassifier
Set the classifier to use for accuracy estimation
- Parameters:
newClassifier - the Classifier to use.
-
getClassifier
Get the classifier used as the base learner.
- Returns:
- the classifier used as the base learner
-
holdOutFileTipText
Returns the tip text for this property
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getHoldOutFile
Gets the file that holds hold out/test instances.
- Returns:
- File that contains hold out instances
-
setHoldOutFile
Set the file that contains hold out/test instances
- Parameters:
h - the hold out file
-
useTrainingTipText
Returns the tip text for this property
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getUseTraining
public boolean getUseTraining()
Get if training data is to be used instead of hold out/test data
- Returns:
- true if training data is to be used instead of hold out data
-
setUseTraining
public void setUseTraining(boolean t)
Set if training data is to be used instead of hold out/test data
- Parameters:
t - true if training data is to be used instead of hold out data
-
getOptions
Gets the current settings of ClassifierSubsetEval
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class ASEvaluation
- Returns:
- an array of strings suitable for passing to setOptions()
-
getCapabilities
Returns the capabilities of this evaluator.
- Specified by:
getCapabilities in interface CapabilitiesHandler
- Overrides:
getCapabilities in class ASEvaluation
- Returns:
- the capabilities of this evaluator
-
buildEvaluator
Generates an attribute evaluator. Has to initialize all fields of the evaluator that are not being set via options.
- Specified by:
buildEvaluator in class ASEvaluation
- Parameters:
data - set of instances serving as training data
- Throws:
Exception - if the evaluator has not been generated successfully
-
evaluateSubset
Evaluates a subset of attributes
- Specified by:
evaluateSubset in interface SubsetEvaluator
- Parameters:
subset - a bitset representing the attribute subset to be evaluated
- Returns:
- the error rate
- Throws:
Exception - if the subset could not be evaluated
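The subset argument is a java.util.BitSet with one bit per attribute. The sketch below builds such a bitset with the standard library only. It assumes zero-based attribute positions, and the class SubsetBitSetDemo and helper subsetOf are hypothetical names introduced for illustration.

```java
import java.util.BitSet;

class SubsetBitSetDemo {

    // Hypothetical helper: sets one bit per selected attribute position.
    // Assumes positions are zero-based and less than numAttributes.
    static BitSet subsetOf(int numAttributes, int... selected) {
        BitSet subset = new BitSet(numAttributes);
        for (int i : selected) {
            if (i < 0 || i >= numAttributes) {
                throw new IllegalArgumentException("attribute index out of range: " + i);
            }
            subset.set(i);
        }
        return subset;
    }

    public static void main(String[] args) {
        BitSet subset = subsetOf(5, 0, 2, 3);
        System.out.println(subset);               // prints {0, 2, 3}
        System.out.println(subset.cardinality()); // prints 3
    }
}
```

With Weka on the classpath, a bitset like this is the kind of value passed to evaluateSubset(BitSet subset) after buildEvaluator has been called.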
-
evaluateSubset
Evaluates a subset of attributes with respect to a set of instances. Calling this function overrides any test/hold out instances set from setHoldOutFile.
- Specified by:
evaluateSubset in class HoldOutSubsetEvaluator
- Parameters:
subset - a bitset representing the attribute subset to be evaluated
holdOut - a set of instances (possibly separate and distinct from those used to build/train the evaluator) with which to evaluate the merit of the subset
- Returns:
- the "merit" of the subset on the holdOut data
- Throws:
Exception - if the subset cannot be evaluated
-
evaluateSubset
Evaluates a subset of attributes with respect to a single instance. Calling this function overrides any hold out/test instances set through setHoldOutFile.
- Specified by:
evaluateSubset in class HoldOutSubsetEvaluator
- Parameters:
subset - a bitset representing the attribute subset to be evaluated
holdOut - a single instance (possibly not one of those used to build/train the evaluator) with which to evaluate the merit of the subset
retrain - true if the classifier should be retrained with respect to the new subset before testing on the holdOut instance.
- Returns:
- the "merit" of the subset on the holdOut instance
- Throws:
Exception - if the subset cannot be evaluated
-
toString
Returns a string describing ClassifierSubsetEval
-
getRevision
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Overrides:
getRevision in class ASEvaluation
- Returns:
- the revision
-
main
Main method for testing this class.
- Parameters:
args - the options
-