java.lang.Object

weka.attributeSelection.ASEvaluation

weka.attributeSelection.WrapperSubsetEval

All Implemented Interfaces: Serializable, SubsetEvaluator, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler, TechnicalInformationHandler

public class WrapperSubsetEval extends ASEvaluation implements SubsetEvaluator, OptionHandler, TechnicalInformationHandler

WrapperSubsetEval:

Evaluates attribute sets by using a learning scheme. Cross validation is used to estimate the accuracy of the learning scheme for a set of attributes.

For more information see:

Ron Kohavi, George H. John (1997). Wrappers for feature subset selection. Artificial Intelligence. 97(1-2):273-324.

BibTeX:

 @article{Kohavi1997,
    author = {Ron Kohavi and George H. John},
    journal = {Artificial Intelligence},
    note = {Special issue on relevance},
    number = {1-2},
    pages = {273-324},
    title = {Wrappers for feature subset selection},
    volume = {97},
    year = {1997},
    ISSN = {0004-3702}
 }

Valid options are:

 -B <base learner>
  class name of base learner to use for accuracy estimation.
  Place any classifier options LAST on the command line
  following a "--", e.g.:
   -B weka.classifiers.bayes.NaiveBayes ... -- -K
  (default: weka.classifiers.rules.ZeroR)

 -F <num>
  number of cross validation folds to use for estimating accuracy.
  (default=5)

 -R <seed>
  Seed for cross validation accuracy estimation.
  (default = 1)

 -T <num>
  threshold by which to execute another cross validation
  (standard deviation---expressed as a percentage of the mean).
  (default: 0.01 (1%))

 -E <acc | rmse | mae | f-meas | auc | auprc>
  Performance evaluation measure to use for selecting attributes.
  (Default = accuracy for discrete class and rmse for numeric class)

 -IRclass <label | index>
  Optional class value (label or 1-based index) to use in conjunction with
  IR statistics (f-meas, auc or auprc). Omitting this option will use
  the class-weighted average.

 Options specific to scheme weka.classifiers.rules.ZeroR:

 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
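
The option layout above (evaluator options first, then "--", then base-learner options) can be illustrated with a minimal sketch that uses only the Java standard library. The class name WrapperOptionsSketch and the helper buildOptions are hypothetical and are not part of Weka; the sketch only assembles a string array in the shape that setOptions(String[]) is documented to accept and does not call Weka itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WrapperOptionsSketch {

    /**
     * Hypothetical helper (not part of Weka): places the evaluator options
     * first, then "--", then any options meant for the base learner.
     */
    static String[] buildOptions(String baseLearner, int folds, int seed,
                                 double threshold, String... learnerOptions) {
        List<String> opts = new ArrayList<>();
        opts.add("-B");
        opts.add(baseLearner);
        opts.add("-F");
        opts.add(Integer.toString(folds));
        opts.add("-R");
        opts.add(Integer.toString(seed));
        opts.add("-T");
        opts.add(Double.toString(threshold));
        // Base-learner options go LAST, after the "--" separator.
        if (learnerOptions.length > 0) {
            opts.add("--");
            opts.addAll(Arrays.asList(learnerOptions));
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(
                "weka.classifiers.bayes.NaiveBayes", 5, 1, 0.01, "-K");
        System.out.println(String.join(" ", options));
    }
}
```

The values used here (5 folds, seed 1, threshold 0.01) are the defaults listed above; NaiveBayes with -K is the example given for -B.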

Version:

$Revision: 15519 $

Author:

Mark Hall (mhall@cs.waikato.ac.nz)

See Also:

Serialized Form

Field Summary

Fields

| Modifier and Type | Field | Description |
|---|---|---|
| static final int | EVAL_ACCURACY | |
| static final int | EVAL_AUC | |
| static final int | EVAL_AUPRC | |
| static final int | EVAL_CORRELATION | |
| static final int | EVAL_DEFAULT | |
| static final int | EVAL_FMEASURE | |
| static final int | EVAL_MAE | |
| static final int | EVAL_PLUGIN | |
| static final int | EVAL_RMSE | |
| static final Tag[] | TAGS_EVALUATION | Holds all tags for metrics |

Constructor Summary

Constructors

| Constructor | Description |
|---|---|
| WrapperSubsetEval() | Constructor. |

Method Summary

| Modifier and Type | Method | Description |
|---|---|---|
| void | buildEvaluator(Instances data) | Generates an attribute evaluator. |
| String | classifierTipText() | Returns the tip text for this property |
| void | clean() | Tells the evaluator that the attribute selection process is complete. |
| double | evaluateSubset(BitSet subset) | Evaluates a subset of attributes |
| String | evaluationMeasureTipText() | Returns the tip text for this property |
| String | foldsTipText() | Returns the tip text for this property |
| Capabilities | getCapabilities() | Returns the capabilities of this evaluator. |
| Classifier | getClassifier() | Get the classifier used as the base learner. |
| SelectedTag | getEvaluationMeasure() | Gets the currently set performance evaluation measure used for selecting attributes |
| int | getFolds() | Get the number of folds used for accuracy estimation |
| String | getIRClassValue() | Get the class value (label or index) to use with IR metric evaluation of subsets. |
| String[] | getOptions() | Gets the current settings of WrapperSubsetEval. |
| String | getRevision() | Returns the revision string. |
| int | getSeed() | Get the random number seed used for cross validation |
| TechnicalInformation | getTechnicalInformation() | Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
| double | getThreshold() | Get the value of the threshold |
| String | globalInfo() | Returns a string describing this attribute evaluator |
| String | IRClassValueTipText() | Returns the tip text for this property |
| Enumeration<Option> | listOptions() | Returns an enumeration describing the available options. |
| static void | main(String[] args) | Main method for testing this class. |
| String | seedTipText() | Returns the tip text for this property |
| void | setClassifier(Classifier newClassifier) | Set the classifier to use for accuracy estimation |
| void | setEvaluationMeasure(SelectedTag newMethod) | Sets the performance evaluation measure to use for selecting attributes |
| void | setFolds(int f) | Set the number of folds to use for accuracy estimation |
| void | setIRClassValue(String val) | Set the class value (label or index) to use with IR metric evaluation of subsets. |
| void | setOptions(String[] options) | Parses a given list of options. |
| void | setSeed(int s) | Set the seed to use for cross validation |
| void | setThreshold(double t) | Set the value of the threshold for repeating cross validation |
| String | thresholdTipText() | Returns the tip text for this property |
| String | toString() | Returns a string describing the wrapper |

Methods inherited from class weka.attributeSelection.ASEvaluation
doNotCheckCapabilitiesTipText, forName, getDoNotCheckCapabilities, makeCopies, postExecution, postProcess, preExecution, run, runEvaluator, setDoNotCheckCapabilities

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait

Field Details
- EVAL_DEFAULT
  
  public static final int EVAL_DEFAULT
  See Also:
  
  Constant Field Values
- EVAL_ACCURACY
  
  public static final int EVAL_ACCURACY
  See Also:
  
  Constant Field Values
- EVAL_RMSE
  
  public static final int EVAL_RMSE
  See Also:
  
  Constant Field Values
- EVAL_MAE
  
  public static final int EVAL_MAE
  See Also:
  
  Constant Field Values
- EVAL_FMEASURE
  
  public static final int EVAL_FMEASURE
  See Also:
  
  Constant Field Values
- EVAL_AUC
  
  public static final int EVAL_AUC
  See Also:
  
  Constant Field Values
- EVAL_AUPRC
  
  public static final int EVAL_AUPRC
  See Also:
  
  Constant Field Values
- EVAL_CORRELATION
  
  public static final int EVAL_CORRELATION
  See Also:
  
  Constant Field Values
- EVAL_PLUGIN
  
  public static final int EVAL_PLUGIN
  See Also:
  
  Constant Field Values
- TAGS_EVALUATION
  
  public static final Tag[] TAGS_EVALUATION
  
  Holds all tags for metrics
Constructor Details
- WrapperSubsetEval
  
  public WrapperSubsetEval()
  
  Constructor. Calls resetOptions to set default options.
Method Details
- globalInfo
  
  public String globalInfo()
  
  Returns a string describing this attribute evaluator
  
  Returns:
  
  a description of the evaluator suitable for displaying in the explorer/experimenter gui
- getTechnicalInformation
  
  public TechnicalInformation getTechnicalInformation()
  
  Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
  
  Specified by:
  
  getTechnicalInformation in interface TechnicalInformationHandler
  
  Returns:
  
  the technical information about this class
- listOptions
  
  public Enumeration<Option> listOptions()
  
  Returns an enumeration describing the available options.
  
  Specified by:
  
  listOptions in interface OptionHandler
  
  Overrides:
  
  listOptions in class ASEvaluation
  
  Returns:
  
  an enumeration of all the available options.
- setOptions
  
  public void setOptions(String[] options) throws Exception
  Parses a given list of options.
  Valid options are:
  
   -B <base learner>
    class name of base learner to use for accuracy estimation.
    Place any classifier options LAST on the command line
    following a "--", e.g.:
     -B weka.classifiers.bayes.NaiveBayes ... -- -K
    (default: weka.classifiers.rules.ZeroR)
  
   -F <num>
    number of cross validation folds to use for estimating accuracy.
    (default=5)
  
   -R <seed>
    Seed for cross validation accuracy estimation.
    (default = 1)
  
   -T <num>
    threshold by which to execute another cross validation
    (standard deviation---expressed as a percentage of the mean).
    (default: 0.01 (1%))
  
   -E <acc | rmse | mae | f-meas | auc | auprc>
    Performance evaluation measure to use for selecting attributes.
    (Default = accuracy for discrete class and rmse for numeric class)
  
   -IRclass <label | index>
    Optional class value (label or 1-based index) to use in conjunction with
    IR statistics (f-meas, auc or auprc). Omitting this option will use
    the class-weighted average.
  
   Options specific to scheme weka.classifiers.rules.ZeroR:
  
   -D
    If set, classifier is run in debug mode and
    may output additional info to the console
  
  Specified by:
  
  setOptions in interface OptionHandler
  
  Overrides:
  
  setOptions in class ASEvaluation
  
  Parameters:
  
  options - the list of options as an array of strings
  
  Throws:
  
  Exception - if an option is not supported
- setIRClassValue
  
  public void setIRClassValue(String val)
  
  Set the class value (label or index) to use with IR metric evaluation of subsets. Leaving this unset will result in the class weighted average for the IR metric being used.
  
  Parameters:
  
  val - the class label or 1-based index of the class label to use when evaluating subsets with an IR metric
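
Because the value may be either a label or a 1-based index, a caller has to decide how to interpret the string. The following is a minimal sketch of one possible interpretation, using only the Java standard library. The class IRClassValueSketch and the helper resolveClassIndex are hypothetical, and trying the numeric reading before the label reading is an assumption made for illustration, not behaviour confirmed by this page.

```java
import java.util.Arrays;
import java.util.List;

public class IRClassValueSketch {

    /**
     * Hypothetical helper (not part of Weka): returns the 0-based class index
     * for a value given as a 1-based index ("2") or as a label ("yes").
     * Returns -1 when the value is unset, standing in for "use the
     * class-weighted average".
     */
    static int resolveClassIndex(String value, List<String> classLabels) {
        if (value == null || value.trim().isEmpty()) {
            return -1;
        }
        String v = value.trim();
        try {
            // Assumption: a numeric string is read as a 1-based index first.
            int oneBased = Integer.parseInt(v);
            if (oneBased < 1 || oneBased > classLabels.size()) {
                throw new IllegalArgumentException("Class index out of range: " + v);
            }
            return oneBased - 1;
        } catch (NumberFormatException e) {
            // Not a number, so treat the value as a class label.
            int index = classLabels.indexOf(v);
            if (index < 0) {
                throw new IllegalArgumentException("Unknown class label: " + v);
            }
            return index;
        }
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("no", "yes");
        System.out.println(resolveClassIndex("2", labels));   // 1-based index 2
        System.out.println(resolveClassIndex("yes", labels)); // label lookup
        System.out.println(resolveClassIndex("", labels));    // unset
    }
}
```

Under this assumption a class label that happens to look like a number would be read as an index, so such labels are ambiguous in this sketch.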
- getIRClassValue
  
  public String getIRClassValue()
  
  Get the class value (label or index) to use with IR metric evaluation of subsets. Leaving this unset will result in the class weighted average for the IR metric being used.
  
  Returns:
  
  the class label or 1-based index of the class label to use when evaluating subsets with an IR metric
- IRClassValueTipText
  
  public String IRClassValueTipText()
  
  Returns the tip text for this property
  
  Returns:
  
  tip text for this property suitable for displaying in the explorer/experimenter gui
- evaluationMeasureTipText
  
  public String evaluationMeasureTipText()
  
  Returns the tip text for this property
  
  Returns:
  
  tip text for this property suitable for displaying in the explorer/experimenter gui
- getEvaluationMeasure
  
  public SelectedTag getEvaluationMeasure()
  
  Gets the currently set performance evaluation measure used for selecting attributes
  
  Returns:
  
  the performance evaluation measure
- setEvaluationMeasure
  
  public void setEvaluationMeasure(SelectedTag newMethod)
  
  Sets the performance evaluation measure to use for selecting attributes
  
  Parameters:
  
  newMethod - the new performance evaluation metric to use
- thresholdTipText
  
  public String thresholdTipText()
  
  Returns the tip text for this property
  
  Returns:
  
  tip text for this property suitable for displaying in the explorer/experimenter gui
- setThreshold
  
  public void setThreshold(double t)
  
  Set the value of the threshold for repeating cross validation
  
  Parameters:
  
  t - the value of the threshold
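
The -T option describes the threshold as a standard deviation expressed as a fraction of the mean. A minimal sketch of a rule with that shape follows, using only the Java standard library. The class RepeatThresholdSketch and the method shouldRepeat are hypothetical, and the exact rule the evaluator applies (including how many repetitions it allows) is not stated on this page, so this is an illustration of the idea rather than the actual implementation.

```java
public class RepeatThresholdSketch {

    /**
     * Hypothetical rule (not taken from Weka): ask for another cross validation
     * when the standard deviation of the estimates collected so far, divided
     * by their mean, exceeds the threshold.
     */
    static boolean shouldRepeat(double[] estimates, double threshold) {
        if (estimates.length == 0) {
            throw new IllegalArgumentException("At least one estimate is required");
        }
        double mean = 0;
        for (double e : estimates) {
            mean += e;
        }
        mean /= estimates.length;

        double variance = 0;
        for (double e : estimates) {
            variance += (e - mean) * (e - mean);
        }
        variance /= estimates.length;
        double stdDev = Math.sqrt(variance);

        // Guard against division by zero when every estimate is 0.
        if (mean == 0) {
            return false;
        }
        return (stdDev / Math.abs(mean)) > threshold;
    }

    public static void main(String[] args) {
        // Two runs that differ by two percentage points: relative spread above 1%.
        System.out.println(shouldRepeat(new double[] {0.80, 0.82}, 0.01));
        // Two nearly identical runs: relative spread well below 1%.
        System.out.println(shouldRepeat(new double[] {0.80, 0.801}, 0.01));
    }
}
```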
- getThreshold
  
  public double getThreshold()
  
  Get the value of the threshold
  
  Returns:
  
  the threshold as a double
- foldsTipText
  
  public String foldsTipText()
  
  Returns the tip text for this property
  
  Returns:
  
  tip text for this property suitable for displaying in the explorer/experimenter gui
- setFolds
  
  public void setFolds(int f)
  
  Set the number of folds to use for accuracy estimation
  
  Parameters:
  
  f - the number of folds
- getFolds
  
  public int getFolds()
  
  Get the number of folds used for accuracy estimation
  
  Returns:
  
  the number of folds
- seedTipText
  
  public String seedTipText()
  
  Returns the tip text for this property
  
  Returns:
  
  tip text for this property suitable for displaying in the explorer/experimenter gui
- setSeed
  
  public void setSeed(int s)
  
  Set the seed to use for cross validation
  
  Parameters:
  
  s - the seed
- getSeed
  
  public int getSeed()
  
  Get the random number seed used for cross validation
  
  Returns:
  
  the seed
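
The folds and seed settings together determine how instances are split for cross validation: the same seed gives the same split. The sketch below shows that idea with a seeded shuffle, using only the Java standard library. The class SeededFoldsSketch and the method assignFolds are hypothetical, and the split is a plain unstratified round-robin chosen for brevity; it is not a description of how Weka partitions data.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SeededFoldsSketch {

    /**
     * Hypothetical helper (not part of Weka): shuffles the instance indices
     * with a seeded generator and assigns them to folds round-robin.
     * Returns, for each instance index, the fold it belongs to.
     */
    static int[] assignFolds(int numInstances, int folds, int seed) {
        if (folds < 2 || folds > numInstances) {
            throw new IllegalArgumentException("folds must be between 2 and numInstances");
        }
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < numInstances; i++) {
            order.add(i);
        }
        // The seed makes the shuffle, and therefore the split, reproducible.
        Collections.shuffle(order, new Random(seed));

        int[] foldOf = new int[numInstances];
        for (int position = 0; position < numInstances; position++) {
            foldOf[order.get(position)] = position % folds;
        }
        return foldOf;
    }

    public static void main(String[] args) {
        int[] first = assignFolds(10, 5, 1);
        int[] second = assignFolds(10, 5, 1);
        // Same seed, same assignment.
        System.out.println(java.util.Arrays.equals(first, second));
    }
}
```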
- classifierTipText
  
  public String classifierTipText()
  
  Returns the tip text for this property
  
  Returns:
  
  tip text for this property suitable for displaying in the explorer/experimenter gui
- setClassifier
  
  public void setClassifier(Classifier newClassifier)
  
  Set the classifier to use for accuracy estimation
  
  Parameters:
  
  newClassifier - the Classifier to use.
- getClassifier
  
  public Classifier getClassifier()
  
  Get the classifier used as the base learner.
  
  Returns:
  
  the classifier used as the base learner
- getOptions
  
  public String[] getOptions()
  
  Gets the current settings of WrapperSubsetEval.
  
  Specified by:
  
  getOptions in interface OptionHandler
  
  Overrides:
  
  getOptions in class ASEvaluation
  
  Returns:
  
  an array of strings suitable for passing to setOptions()
- getCapabilities
  
  public Capabilities getCapabilities()
  
  Returns the capabilities of this evaluator.
  Specified by:
  
  getCapabilities in interface CapabilitiesHandler
  
  Overrides:
  
  getCapabilities in class ASEvaluation
  
  Returns:
  
  the capabilities of this evaluator
  
  See Also:
  
  Capabilities
- buildEvaluator
  
  public void buildEvaluator(Instances data) throws Exception
  
  Generates an attribute evaluator. Has to initialize all fields of the evaluator that are not being set via options.
  
  Specified by:
  
  buildEvaluator in class ASEvaluation
  
  Parameters:
  
  data - set of instances serving as training data
  
  Throws:
  
  Exception - if the evaluator has not been generated successfully
- evaluateSubset
  
  public double evaluateSubset(BitSet subset) throws Exception
  
  Evaluates a subset of attributes
  
  Specified by:
  
  evaluateSubset in interface SubsetEvaluator
  
  Parameters:
  
  subset - a bitset representing the attribute subset to be evaluated
  
  Returns:
  
  the error rate
  
  Throws:
  
  Exception - if the subset could not be evaluated
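
The subset parameter is a java.util.BitSet. A minimal sketch of building one follows, using only the Java standard library and assuming that bit i stands for the attribute at 0-based index i. The class SubsetBitSetSketch and the helper subsetOf are hypothetical; the sketch does not call evaluateSubset.

```java
import java.util.BitSet;

public class SubsetBitSetSketch {

    /**
     * Hypothetical helper (not part of Weka): sets one bit per selected
     * attribute, assuming bit i stands for the attribute at 0-based index i.
     */
    static BitSet subsetOf(int numAttributes, int... selected) {
        BitSet subset = new BitSet(numAttributes);
        for (int index : selected) {
            if (index < 0 || index >= numAttributes) {
                throw new IllegalArgumentException("Attribute index out of range: " + index);
            }
            subset.set(index);
        }
        return subset;
    }

    public static void main(String[] args) {
        // Select the first and fourth of five attributes.
        BitSet subset = subsetOf(5, 0, 3);
        System.out.println(subset);               // prints {0, 3}
        System.out.println(subset.cardinality()); // prints 2
    }
}
```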
- toString
  
  public String toString()
  
  Returns a string describing the wrapper
  
  Overrides:
  
  toString in class Object
  
  Returns:
  
  the description as a string
- getRevision
  
  public String getRevision()
  
  Returns the revision string.
  
  Specified by:
  
  getRevision in interface RevisionHandler
  
  Overrides:
  
  getRevision in class ASEvaluation
  
  Returns:
  
  the revision
- clean
  
  public void clean()
  
  Description copied from class: ASEvaluation
  
  Tells the evaluator that the attribute selection process is complete. It can then clean up data structures and references to training data as necessary to save memory.
  
  Overrides:
  
  clean in class ASEvaluation
- main
  
  public static void main(String[] args)
  
  Main method for testing this class.
  
  Parameters:
  
  args - the options
