Class WrapperSubsetEval

java.lang.Object
weka.attributeSelection.ASEvaluation
weka.attributeSelection.WrapperSubsetEval
All Implemented Interfaces:
Serializable, SubsetEvaluator, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler, TechnicalInformationHandler

public class WrapperSubsetEval extends ASEvaluation implements SubsetEvaluator, OptionHandler, TechnicalInformationHandler
WrapperSubsetEval:

Evaluates attribute sets by using a learning scheme. Cross validation is used to estimate the accuracy of the learning scheme for a set of attributes.

For more information see:

Ron Kohavi, George H. John (1997). Wrappers for feature subset selection. Artificial Intelligence. 97(1-2):273-324.

BibTeX:
 @article{Kohavi1997,
    author = {Ron Kohavi and George H. John},
    journal = {Artificial Intelligence},
    note = {Special issue on relevance},
    number = {1-2},
    pages = {273-324},
    title = {Wrappers for feature subset selection},
    volume = {97},
    year = {1997},
    ISSN = {0004-3702}
 }
 

Valid options are:

 -B <base learner>
  class name of base learner to use for accuracy estimation.
  Place any classifier options LAST on the command line
  following a "--". e.g.:
   -B weka.classifiers.bayes.NaiveBayes ... -- -K
  (default: weka.classifiers.rules.ZeroR)
 
 -F <num>
  number of cross validation folds to use for estimating accuracy.
  (default=5)
 
 -R <seed>
  Seed for cross validation accuracy estimation.
  (default = 1)
 
 -T <num>
  threshold above which another cross validation is run
  (standard deviation, expressed as a percentage of the mean).
  (default: 0.01 (1%))
 
 -E <acc | rmse | mae | f-meas | auc | auprc>
  Performance evaluation measure to use for selecting attributes.
  (Default = accuracy for discrete class and rmse for numeric class)
 
 -IRclass <label | index>
  Optional class value (label or 1-based index) to use in conjunction with
  IR statistics (f-meas, auc or auprc). Omitting this option will use
  the class-weighted average.
 
 Options specific to scheme weka.classifiers.rules.ZeroR:
 
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 
Version:
$Revision: 15519 $
Author:
Mark Hall (mhall@cs.waikato.ac.nz)
  • Constructor Details

    • WrapperSubsetEval

      public WrapperSubsetEval()
      Constructor. Calls resetOptions to set default options.
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing this attribute evaluator
      Returns:
      a description of the evaluator suitable for displaying in the explorer/experimenter gui
    • getTechnicalInformation

      public TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      Specified by:
      getTechnicalInformation in interface TechnicalInformationHandler
      Returns:
      the technical information about this class
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class ASEvaluation
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception

      Parses a given list of options.

      Valid options are:
       -B <base learner>
        class name of base learner to use for accuracy estimation.
        Place any classifier options LAST on the command line
        following a "--". e.g.:
         -B weka.classifiers.bayes.NaiveBayes ... -- -K
        (default: weka.classifiers.rules.ZeroR)
       
       -F <num>
        number of cross validation folds to use for estimating accuracy.
        (default=5)
       
       -R <seed>
        Seed for cross validation accuracy estimation.
        (default = 1)
       
       -T <num>
        threshold above which another cross validation is run
        (standard deviation, expressed as a percentage of the mean).
        (default: 0.01 (1%))
       
       -E <acc | rmse | mae | f-meas | auc | auprc>
        Performance evaluation measure to use for selecting attributes.
        (Default = accuracy for discrete class and rmse for numeric class)
       
       -IRclass <label | index>
        Optional class value (label or 1-based index) to use in conjunction with
        IR statistics (f-meas, auc or auprc). Omitting this option will use
        the class-weighted average.
       
       Options specific to scheme weka.classifiers.rules.ZeroR:
       
       -D
        If set, classifier is run in debug mode and
        may output additional info to the console
       
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class ASEvaluation
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
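The option list above can be assembled programmatically before it is passed to setOptions(String[]). The following is a minimal sketch using only the Java standard library; the class WrapperOptionsSketch and its buildOptions helper are hypothetical names introduced for illustration and are not part of Weka. It shows the one ordering rule the documentation states: options for the base learner go last, after a "--" separator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Weka: it only builds the String[]
// described in the option list (-B, -F, -R, -T, -E, then "--" and
// the base learner's own options).
class WrapperOptionsSketch {

    static String[] buildOptions(String baseLearner, int folds, int seed,
                                 double threshold, String measure,
                                 String... classifierOptions) {
        List<String> opts = new ArrayList<>();
        opts.add("-B");
        opts.add(baseLearner);
        opts.add("-F");
        opts.add(Integer.toString(folds));
        opts.add("-R");
        opts.add(Integer.toString(seed));
        opts.add("-T");
        opts.add(Double.toString(threshold));
        opts.add("-E");
        opts.add(measure);
        // Classifier-specific options must come last, after "--".
        if (classifierOptions.length > 0) {
            opts.add("--");
            opts.addAll(Arrays.asList(classifierOptions));
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(
            "weka.classifiers.bayes.NaiveBayes", 5, 1, 0.01, "acc", "-K");
        // Prints the options in command-line order.
        System.out.println(String.join(" ", options));
    }
}
```

The resulting array has the same shape as the documented command-line example and as the array returned by getOptions(), so it is the kind of value that would be handed to setOptions(String[]).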
    • setIRClassValue

      public void setIRClassValue(String val)
      Set the class value (label or 1-based index) to use with IR metric evaluation of subsets. Leaving this unset will result in the class-weighted average for the IR metric being used.
      Parameters:
      val - the class label or 1-based index of the class label to use when evaluating subsets with an IR metric
    • getIRClassValue

      public String getIRClassValue()
      Get the class value (label or 1-based index) to use with IR metric evaluation of subsets. If this is unset, the class-weighted average for the IR metric is used.
      Returns:
      the class label or 1-based index of the class label to use when evaluating subsets with an IR metric
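The "label or 1-based index" convention for the IR class value can be illustrated with a small standalone sketch. The class IRClassValueSketch and its resolve method are hypothetical and written for this example only. The lookup order (label first, then numeric index) is an assumption, since the documentation does not state which interpretation takes precedence.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Weka code: maps a user-supplied class value
// (a label or a 1-based index) to a 0-based position in the label list.
class IRClassValueSketch {

    /** Returns the 0-based class position, or -1 if the value is unset or unknown. */
    static int resolve(String value, List<String> classLabels) {
        if (value == null || value.trim().isEmpty()) {
            return -1; // unset: the caller would use the class-weighted average
        }
        String v = value.trim();
        // Assumption: a matching label takes precedence over a numeric index.
        int byLabel = classLabels.indexOf(v);
        if (byLabel >= 0) {
            return byLabel;
        }
        try {
            int oneBased = Integer.parseInt(v);
            if (oneBased >= 1 && oneBased <= classLabels.size()) {
                return oneBased - 1; // convert 1-based to 0-based
            }
        } catch (NumberFormatException e) {
            // Neither a known label nor a number: fall through.
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("no", "yes");
        System.out.println(resolve("yes", labels));
        System.out.println(resolve("1", labels));
    }
}
```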
    • IRClassValueTipText

      public String IRClassValueTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • evaluationMeasureTipText

      public String evaluationMeasureTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getEvaluationMeasure

      public SelectedTag getEvaluationMeasure()
      Gets the currently set performance evaluation measure used for selecting attributes
      Returns:
      the performance evaluation measure
    • setEvaluationMeasure

      public void setEvaluationMeasure(SelectedTag newMethod)
      Sets the performance evaluation measure to use for selecting attributes
      Parameters:
      newMethod - the new performance evaluation metric to use
    • thresholdTipText

      public String thresholdTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setThreshold

      public void setThreshold(double t)
      Set the value of the threshold for repeating cross validation
      Parameters:
      t - the value of the threshold
    • getThreshold

      public double getThreshold()
      Get the value of the threshold
      Returns:
      the threshold as a double
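The threshold is described in the option list as a standard deviation expressed as a fraction of the mean (0.01 means 1%). The sketch below shows one way such a check could be computed over the scores of repeated cross-validation runs. It is an illustration under stated assumptions, not the class's actual implementation: the class RepeatThresholdSketch, the method shouldRepeat, the use of the sample standard deviation, and the handling of a single run or a zero mean are all choices made for this example.

```java
// Hypothetical sketch, not Weka code: decides whether another
// cross-validation run is warranted, given the scores so far.
class RepeatThresholdSketch {

    static boolean shouldRepeat(double[] scores, double threshold) {
        if (scores.length < 2) {
            return true; // assumption: a single run gives no spread estimate
        }
        double mean = 0.0;
        for (double s : scores) {
            mean += s;
        }
        mean /= scores.length;

        double sumSquares = 0.0;
        for (double s : scores) {
            sumSquares += (s - mean) * (s - mean);
        }
        // Sample standard deviation (n - 1 in the denominator).
        double stdDev = Math.sqrt(sumSquares / (scores.length - 1));

        if (mean == 0.0) {
            return stdDev > 0.0; // assumption: avoid dividing by zero
        }
        // Repeat when the relative spread exceeds the threshold.
        return stdDev / Math.abs(mean) > threshold;
    }

    public static void main(String[] args) {
        System.out.println(shouldRepeat(new double[] {0.10, 0.10}, 0.01));
        System.out.println(shouldRepeat(new double[] {0.10, 0.20}, 0.01));
    }
}
```

With the default threshold of 0.01, two identical scores have zero relative spread and do not trigger a repeat, while scores of 0.10 and 0.20 do.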
    • foldsTipText

      public String foldsTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setFolds

      public void setFolds(int f)
      Set the number of folds to use for accuracy estimation
      Parameters:
      f - the number of folds
    • getFolds

      public int getFolds()
      Get the number of folds used for accuracy estimation
      Returns:
      the number of folds
    • seedTipText

      public String seedTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setSeed

      public void setSeed(int s)
      Set the seed to use for cross validation
      Parameters:
      s - the seed
    • getSeed

      public int getSeed()
      Get the random number seed used for cross validation
      Returns:
      the seed
    • classifierTipText

      public String classifierTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setClassifier

      public void setClassifier(Classifier newClassifier)
      Set the classifier to use for accuracy estimation
      Parameters:
      newClassifier - the Classifier to use.
    • getClassifier

      public Classifier getClassifier()
      Get the classifier used as the base learner.
      Returns:
      the classifier used as the base learner
    • getOptions

      public String[] getOptions()
      Gets the current settings of WrapperSubsetEval.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class ASEvaluation
      Returns:
      an array of strings suitable for passing to setOptions()
    • getCapabilities

      public Capabilities getCapabilities()
      Returns the capabilities of this evaluator.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Overrides:
      getCapabilities in class ASEvaluation
      Returns:
      the capabilities of this evaluator
    • buildEvaluator

      public void buildEvaluator(Instances data) throws Exception
      Generates an attribute evaluator. Has to initialize all fields of the evaluator that are not being set via options.
      Specified by:
      buildEvaluator in class ASEvaluation
      Parameters:
      data - set of instances serving as training data
      Throws:
      Exception - if the evaluator has not been generated successfully
    • evaluateSubset

      public double evaluateSubset(BitSet subset) throws Exception
      Evaluates a subset of attributes
      Specified by:
      evaluateSubset in interface SubsetEvaluator
      Parameters:
      subset - a bitset representing the attribute subset to be evaluated
      Returns:
      the error rate
      Throws:
      Exception - if the subset could not be evaluated
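The subset parameter is a java.util.BitSet. A minimal sketch of building one follows. It assumes that bit i stands for the attribute at 0-based index i, which this page does not state explicitly. The class SubsetBitSetSketch and the helper toSubset are hypothetical names for this example.

```java
import java.util.BitSet;

// Hypothetical sketch, not Weka code: builds the BitSet form of an
// attribute subset. Assumption: bit i corresponds to attribute index i
// (0-based).
class SubsetBitSetSketch {

    static BitSet toSubset(int numAttributes, int... selected) {
        BitSet subset = new BitSet(numAttributes);
        for (int index : selected) {
            if (index < 0 || index >= numAttributes) {
                throw new IllegalArgumentException(
                    "attribute index out of range: " + index);
            }
            subset.set(index);
        }
        return subset;
    }

    public static void main(String[] args) {
        // Select attributes 0, 2 and 5 out of six.
        BitSet subset = toSubset(6, 0, 2, 5);
        System.out.println(subset); // prints {0, 2, 5}
    }
}
```

A BitSet built this way is the kind of argument evaluateSubset(BitSet) expects, once buildEvaluator(Instances) has been called on the evaluator.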
    • toString

      public String toString()
      Returns a string describing the wrapper
      Overrides:
      toString in class Object
      Returns:
      the description as a string
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class ASEvaluation
      Returns:
      the revision
    • clean

      public void clean()
      Description copied from class: ASEvaluation
      Tells the evaluator that the attribute selection process is complete. It can then clean up data structures and references to training data as necessary in order to save memory.
      Overrides:
      clean in class ASEvaluation
    • main

      public static void main(String[] args)
      Main method for testing this class.
      Parameters:
      args - the options