Class LogitBoost

All Implemented Interfaces:
Serializable, Cloneable, Classifier, IterativeClassifier, Sourcable, BatchPredictor, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler

Class for performing additive logistic regression.
This class performs classification using a regression scheme as the base learner, and can handle multi-class problems. For more information, see

J. Friedman, T. Hastie, R. Tibshirani (1998). Additive Logistic Regression: a Statistical View of Boosting. Stanford University.

BibTeX:

 @techreport{Friedman1998,
    address = {Stanford University},
    author = {J. Friedman and T. Hastie and R. Tibshirani},
    title = {Additive Logistic Regression: a Statistical View of Boosting},
    year = {1998},
    PS = {http://www-stat.stanford.edu/\~jhf/ftp/boost.ps}
 }
 

Valid options are:

 -Q
  Use resampling instead of reweighting for boosting.
 
 -use-estimated-priors
  Use estimated priors rather than uniform ones.
 
 -P <percent>
  Percentage of weight mass to base training on.
  (default 100, reduce to around 90 to speed up)
 
 -L <num>
  Threshold on the improvement of the likelihood.
  (default -Double.MAX_VALUE)
 
 -H <num>
  Shrinkage parameter.
  (default 1)
 
 -Z <num>
  Z max threshold for responses.
  (default 3)
 
 -O <int>
  The size of the thread pool, for example, the number of cores in the CPU. (default 1)
 
 -E <int>
  The number of threads to use for batch prediction, which should be >= size of thread pool.
  (default 1)
 
 -S <num>
  Random number seed.
  (default 1)
 
 -I <num>
  Number of iterations.
  (default 10)
 
 -W
  Full name of base classifier.
  (default: weka.classifiers.trees.DecisionStump)
 
 -output-debug-info
  If set, classifier is run in debug mode and
  may output additional info to the console
 
 -do-not-check-capabilities
  If set, classifier capabilities are not checked before classifier is built
  (use with caution).
 
 Options specific to classifier weka.classifiers.trees.DecisionStump:
 
 -output-debug-info
  If set, classifier is run in debug mode and
  may output additional info to the console
 
 -do-not-check-capabilities
  If set, classifier capabilities are not checked before classifier is built
  (use with caution).
 
Options after -- are passed to the designated learner.
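
The option list above maps directly onto the string array accepted by setOptions. A minimal sketch of assembling such an array with only the standard library; the class name LogitBoostOptionsSketch and the helper buildOptions are illustrative and not part of Weka, and only the flags -I, -H, -W and the "--" separator are taken from the list above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LogitBoostOptionsSketch {

    /**
     * Builds an option array in the form described above: LogitBoost's own
     * flags first, then "--", then any flags for the base learner.
     * (Hypothetical helper, not a Weka method.)
     */
    static String[] buildOptions(int iterations, double shrinkage,
                                 String baseClassifier, String... baseOptions) {
        List<String> opts = new ArrayList<>();
        opts.add("-I");
        opts.add(Integer.toString(iterations));
        opts.add("-H");
        opts.add(Double.toString(shrinkage));
        opts.add("-W");
        opts.add(baseClassifier);
        if (baseOptions.length > 0) {
            opts.add("--"); // everything after "--" goes to the base learner
            opts.addAll(Arrays.asList(baseOptions));
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(50, 0.5,
                "weka.classifiers.trees.DecisionStump", "-output-debug-info");
        // With Weka on the classpath, this array could be passed to
        // setOptions(String[]) on a LogitBoost instance.
        System.out.println(String.join(" ", options));
    }
}
```

Keeping the base-learner flags after "--" avoids clashes between options that both LogitBoost and the base classifier accept, such as -output-debug-info.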

Version:
$Revision: 15519 $
Author:
Len Trigg (trigg@cs.waikato.ac.nz), Eibe Frank (eibe@cs.waikato.ac.nz)
  • Constructor Details

    • LogitBoost

      public LogitBoost()
      Constructor.
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing the classifier.
      Returns:
      a description suitable for displaying in the explorer/experimenter GUI
    • getTechnicalInformation

      public TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      Specified by:
      getTechnicalInformation in interface TechnicalInformationHandler
      Returns:
      the technical information about this class
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class RandomizableIteratedSingleClassifierEnhancer
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      The valid options are those listed in the class description above.

      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class RandomizableIteratedSingleClassifierEnhancer
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
    • getOptions

      public String[] getOptions()
      Gets the current settings of the Classifier.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class RandomizableIteratedSingleClassifierEnhancer
      Returns:
      an array of strings suitable for passing to setOptions
    • ZMaxTipText

      public String ZMaxTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setZMax

      public void setZMax(double zMax)
      Set the Z max threshold on the responses
      Parameters:
      zMax - the threshold to use
    • getZMax

      public double getZMax()
      Get the Z max threshold on the responses
      Returns:
      the current threshold
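
The Z max threshold bounds the working responses that the base learner is fitted to. A minimal sketch of that clipping, assuming the working response z = (y - p) / (p (1 - p)) from the Friedman, Hastie and Tibshirani report cited in the class description. ZMaxSketch and workingResponse are illustrative names; this is not a copy of Weka's source.

```java
final class ZMaxSketch {

    /**
     * Working response for one instance and one class,
     * z = (y - p) / (p * (1 - p)), clipped to [-zMax, zMax].
     *
     * @param y    1.0 if the instance belongs to the class, else 0.0
     * @param p    current probability estimate for the class, 0 < p < 1
     * @param zMax the Z max threshold (the -Z option, default 3)
     */
    static double workingResponse(double y, double p, double zMax) {
        double z = (y - p) / (p * (1.0 - p));
        return Math.max(-zMax, Math.min(zMax, z));
    }

    public static void main(String[] args) {
        // A confident, correct estimate gives a small response.
        System.out.println(workingResponse(1.0, 0.5, 3.0));
        // A badly wrong estimate would give about 100 and is clipped to zMax.
        System.out.println(workingResponse(1.0, 0.01, 3.0));
    }
}
```

Without the clip, instances whose probability estimate is close to 0 or 1 on the wrong side would produce very large responses and dominate the regression fit.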
    • shrinkageTipText

      public String shrinkageTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getShrinkage

      public double getShrinkage()
      Get the value of Shrinkage.
      Returns:
      Value of Shrinkage.
    • setShrinkage

      public void setShrinkage(double newShrinkage)
      Set the value of Shrinkage.
      Parameters:
      newShrinkage - Value to assign to Shrinkage.
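
Shrinkage scales the contribution that each boosting iteration adds to the per-class scores. A minimal sketch, assuming the multi-class update from the Friedman et al. report (centre the base-learner predictions, rescale by (J - 1) / J, add to the scores) with the shrinkage value as an extra multiplier. ShrinkageSketch and update are illustrative names, not Weka methods.

```java
final class ShrinkageSketch {

    /**
     * Adds one boosting step to the per-class scores, scaled by shrinkage.
     *
     * @param scores    current scores F_j, updated in place
     * @param fitted    base-learner predictions f_j for this iteration
     * @param shrinkage the shrinkage parameter (the -H option, default 1)
     */
    static void update(double[] scores, double[] fitted, double shrinkage) {
        int numClasses = scores.length;
        double mean = 0.0;
        for (double f : fitted) {
            mean += f;
        }
        mean /= numClasses;
        for (int j = 0; j < numClasses; j++) {
            // centre the prediction, rescale by (J - 1) / J, then shrink
            scores[j] += shrinkage * (numClasses - 1) * (fitted[j] - mean) / numClasses;
        }
    }

    public static void main(String[] args) {
        double[] scores = {0.0, 0.0};
        update(scores, new double[] {1.0, -1.0}, 0.5);
        System.out.println(scores[0] + " " + scores[1]);
    }
}
```

Values below 1 make each step smaller, so more iterations are usually needed to reach the same fit.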
    • likelihoodThresholdTipText

      public String likelihoodThresholdTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getLikelihoodThreshold

      public double getLikelihoodThreshold()
      Get the threshold on the improvement of the likelihood.
      Returns:
      the likelihood threshold
    • setLikelihoodThreshold

      public void setLikelihoodThreshold(double newPrecision)
      Set the threshold on the improvement of the likelihood.
      Parameters:
      newPrecision - the likelihood threshold to use
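
The likelihood threshold acts as an early-stopping criterion. The sketch below shows one plausible form of such a check and why the default of -Double.MAX_VALUE effectively disables it. The exact comparison Weka performs is not stated here, so the rule in shouldStop is an assumption, and LikelihoodThresholdSketch is an illustrative name.

```java
final class LikelihoodThresholdSketch {

    /**
     * Hypothetical stopping rule: stop when the drop in negative
     * log-likelihood between two iterations falls below the threshold.
     *
     * @param previousNll negative log-likelihood before the iteration
     * @param currentNll  negative log-likelihood after the iteration
     * @param threshold   the likelihood threshold (the -L option)
     */
    static boolean shouldStop(double previousNll, double currentNll, double threshold) {
        double improvement = previousNll - currentNll;
        return improvement < threshold;
    }

    public static void main(String[] args) {
        // Tiny improvement with a positive threshold: stop.
        System.out.println(shouldStop(10.0, 9.9999, 0.001));
        // With the default threshold no finite change triggers a stop.
        System.out.println(shouldStop(10.0, 12.0, -Double.MAX_VALUE));
    }
}
```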
    • useResamplingTipText

      public String useResamplingTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setUseResampling

      public void setUseResampling(boolean r)
      Set resampling mode
      Parameters:
      r - true if resampling should be done
    • getUseResampling

      public boolean getUseResampling()
      Get whether resampling is turned on
      Returns:
      true if resampling is on
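
With resampling, the base learner is trained on a sample drawn according to the instance weights instead of receiving the weights directly. A minimal sketch of weighted sampling with replacement, assuming a seeded java.util.Random; ResamplingSketch and resample are illustrative names, and Weka's own sampling routine may differ in detail.

```java
import java.util.Random;

final class ResamplingSketch {

    /**
     * Draws weights.length indices with replacement, each index chosen
     * with probability proportional to its weight.
     */
    static int[] resample(double[] weights, long seed) {
        int n = weights.length;
        double[] cumulative = new double[n];
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            total += weights[i];
            cumulative[i] = total;
        }
        Random random = new Random(seed);
        int[] sample = new int[n];
        for (int k = 0; k < n; k++) {
            double r = random.nextDouble() * total;
            int idx = 0;
            // advance to the first index whose cumulative weight exceeds r
            while (idx < n - 1 && cumulative[idx] <= r) {
                idx++;
            }
            sample[k] = idx;
        }
        return sample;
    }

    public static void main(String[] args) {
        int[] sample = resample(new double[] {0.0, 1.0, 0.0, 3.0}, 1L);
        for (int index : sample) {
            System.out.println(index);
        }
    }
}
```

Resampling lets base learners that ignore instance weights take part in boosting, at the cost of extra randomness controlled by the seed.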
    • useEstimatedPriorsTipText

      public String useEstimatedPriorsTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setUseEstimatedPriors

      public void setUseEstimatedPriors(boolean r)
      Set whether estimated priors are used rather than uniform ones
      Parameters:
      r - true if estimated priors should be used
    • getUseEstimatedPriors

      public boolean getUseEstimatedPriors()
      Get whether estimated priors are used rather than uniform ones
      Returns:
      true if estimated priors are used
    • weightThresholdTipText

      public String weightThresholdTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setWeightThreshold

      public void setWeightThreshold(int threshold)
      Set weight thresholding
      Parameters:
      threshold - the percentage of weight mass used for training
    • getWeightThreshold

      public int getWeightThreshold()
      Get the degree of weight thresholding
      Returns:
      the percentage of weight mass used for training
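
Weight thresholding trains each iteration only on the heaviest instances, which is why lowering the percentage speeds up training. A minimal sketch of selecting instances until a given percentage of the total weight mass is covered; WeightThresholdSketch and select are illustrative names, and details such as tie handling in Weka are not reproduced.

```java
import java.util.Arrays;
import java.util.Comparator;

final class WeightThresholdSketch {

    /**
     * Returns the indices of the heaviest instances whose weights together
     * reach the given percentage of the total weight mass.
     */
    static int[] select(double[] weights, int percent) {
        Integer[] order = new Integer[weights.length];
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            order[i] = i;
            total += weights[i];
        }
        // heaviest instances first
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -weights[i]));
        double target = total * percent / 100.0;
        double mass = 0.0;
        int count = 0;
        while (count < order.length && mass < target) {
            mass += weights[order[count]];
            count++;
        }
        int[] selected = new int[count];
        for (int k = 0; k < count; k++) {
            selected[k] = order[k];
        }
        return selected;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(select(new double[] {1, 2, 3, 4}, 70)));
    }
}
```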
    • numThreadsTipText

      public String numThreadsTipText()
      Returns:
      a string to describe the option
    • getNumThreads

      public int getNumThreads()
      Gets the number of threads.
    • setNumThreads

      public void setNumThreads(int nT)
      Sets the number of threads
    • poolSizeTipText

      public String poolSizeTipText()
      Returns:
      a string to describe the option
    • getPoolSize

      public int getPoolSize()
      Gets the size of the thread pool.
    • setPoolSize

      public void setPoolSize(int nT)
      Sets the size of the thread pool
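
The thread count and the pool size are separate settings: the batch is divided into as many tasks as there are threads, while the pool size limits how many tasks run at once. A generic sketch of that arrangement using java.util.concurrent; BatchThreadsSketch and predictAll are illustrative names, the per-instance computation is a placeholder, and how Weka schedules its own prediction tasks is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BatchThreadsSketch {

    /**
     * Splits numInstances into numThreads contiguous chunks, runs one task
     * per chunk on a pool of poolSize workers, and returns one result per
     * instance.
     */
    static double[] predictAll(int numInstances, int numThreads, int poolSize) {
        double[] results = new double[numInstances];
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<?>> futures = new ArrayList<>();
            int chunk = (numInstances + numThreads - 1) / numThreads; // ceiling
            for (int t = 0; t < numThreads; t++) {
                final int lo = Math.min(numInstances, t * chunk);
                final int hi = Math.min(numInstances, lo + chunk);
                futures.add(pool.submit(() -> {
                    for (int i = lo; i < hi; i++) {
                        results[i] = 2.0 * i; // placeholder for a real prediction
                    }
                }));
            }
            for (Future<?> f : futures) {
                try {
                    f.get(); // wait for the task and surface any failure
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) {
        double[] r = predictAll(10, 4, 2);
        System.out.println(r.length);
    }
}
```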
    • getCapabilities

      public Capabilities getCapabilities()
      Returns default capabilities of the classifier.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Specified by:
      getCapabilities in interface Classifier
      Overrides:
      getCapabilities in class SingleClassifierEnhancer
      Returns:
      the capabilities of this classifier
    • buildClassifier

      public void buildClassifier(Instances data) throws Exception
      Method used to build the classifier.
      Specified by:
      buildClassifier in interface Classifier
      Overrides:
      buildClassifier in class IteratedSingleClassifierEnhancer
      Parameters:
      data - the training data to be used for generating the boosted classifier.
      Throws:
      Exception - if the classifier could not be built successfully
    • initializeClassifier

      public void initializeClassifier(Instances data) throws Exception
      Initializes the boosted classifier; boosting iterations are then performed by calling next()
      Specified by:
      initializeClassifier in interface IterativeClassifier
      Parameters:
      data - the data to train the classifier with
      Throws:
      Exception - if building fails, e.g., can't handle data
    • next

      public boolean next() throws Exception
      Perform another iteration of boosting.
      Specified by:
      next in interface IterativeClassifier
      Returns:
      false if no further iterations could be performed, true otherwise
      Throws:
      Exception - if this iteration fails for unexpected reasons
    • resumeTipText

      public String resumeTipText()
      Tool tip text for the resume property
      Returns:
      the tool tip text for the resume property
    • setResume

      public void setResume(boolean resume)
      If called with argument true, then the next time done() is called the model is effectively "frozen" and no further iterations can be performed
      Specified by:
      setResume in interface IterativeClassifier
      Parameters:
      resume - true if the model is to be finalized after performing iterations
    • getResume

      public boolean getResume()
      Returns true if the model is to be finalized (or has been finalized) after training.
      Specified by:
      getResume in interface IterativeClassifier
      Returns:
      the current value of the resume property
    • done

      public void done()
      Clean up after boosting.
      Specified by:
      done in interface IterativeClassifier
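
The three methods initializeClassifier, next and done are meant to be called in sequence. A minimal sketch of that calling pattern using a local stand-in interface, so that it compiles without Weka: Iterative, CountingLearner and train are hypothetical, the data parameter is a plain array instead of Instances, and the real methods declare checked exceptions that are omitted here.

```java
final class IterativeTrainingSketch {

    /** Hypothetical stand-in for the three training methods listed above. */
    interface Iterative {
        void initializeClassifier(double[][] data);
        boolean next();
        void done();
    }

    /** Toy learner that allows a fixed number of iterations. */
    static final class CountingLearner implements Iterative {
        private final int maxIterations;
        int performed = 0;
        boolean finished = false;

        CountingLearner(int maxIterations) {
            this.maxIterations = maxIterations;
        }

        @Override
        public void initializeClassifier(double[][] data) {
            performed = 0;
            finished = false;
        }

        @Override
        public boolean next() {
            if (performed >= maxIterations) {
                return false; // no further iterations can be performed
            }
            performed++;
            return true;
        }

        @Override
        public void done() {
            finished = true;
        }
    }

    /** Initialise once, iterate until next() returns false, then clean up. */
    static void train(Iterative learner, double[][] data) {
        learner.initializeClassifier(data);
        while (learner.next()) {
            // one iteration was performed; progress could be reported here
        }
        learner.done();
    }

    public static void main(String[] args) {
        CountingLearner learner = new CountingLearner(10);
        train(learner, new double[0][0]);
        System.out.println(learner.performed);
    }
}
```

Driving the loop by hand, instead of a single build call, makes it possible to evaluate or checkpoint the model between iterations.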
    • classifiers

      public Classifier[][] classifiers()
      Returns the array of classifiers that have been built.
      Returns:
      the built classifiers
    • implementsMoreEfficientBatchPrediction

      public boolean implementsMoreEfficientBatchPrediction()
      Performs efficient batch prediction
      Specified by:
      implementsMoreEfficientBatchPrediction in interface BatchPredictor
      Overrides:
      implementsMoreEfficientBatchPrediction in class AbstractClassifier
      Returns:
      true, as LogitBoost can perform efficient batch prediction
    • distributionForInstance

      public double[] distributionForInstance(Instance inst) throws Exception
      Calculates the class membership probabilities for the given test instance.
      Specified by:
      distributionForInstance in interface Classifier
      Overrides:
      distributionForInstance in class AbstractClassifier
      Parameters:
      inst - the instance to be classified
      Returns:
      predicted class probability distribution
      Throws:
      Exception - if instance could not be classified successfully
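
Class membership probabilities in additive logistic regression come from the per-class scores through p_j = exp(F_j) / sum_k exp(F_k), as in the Friedman et al. report. A minimal sketch of that conversion; ClassProbabilitiesSketch and probabilities are illustrative names, and subtracting the maximum score is a standard numerical safeguard, not something this page states about Weka.

```java
final class ClassProbabilitiesSketch {

    /** Converts per-class scores F_j into probabilities exp(F_j) / sum_k exp(F_k). */
    static double[] probabilities(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double f : scores) {
            max = Math.max(max, f);
        }
        double[] probs = new double[scores.length];
        double sum = 0.0;
        for (int j = 0; j < scores.length; j++) {
            probs[j] = Math.exp(scores[j] - max); // subtract max to avoid overflow
            sum += probs[j];
        }
        for (int j = 0; j < probs.length; j++) {
            probs[j] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] p = probabilities(new double[] {0.0, 0.0});
        System.out.println(p[0] + " " + p[1]);
    }
}
```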
    • distributionsForInstances

      public double[][] distributionsForInstances(Instances insts) throws Exception
      Calculates the class membership probabilities for the given test instances. Uses multi-threading if requested.
      Specified by:
      distributionsForInstances in interface BatchPredictor
      Overrides:
      distributionsForInstances in class AbstractClassifier
      Parameters:
      insts - the instances to be classified
      Returns:
      predicted class probability distributions
      Throws:
      Exception - if instances could not be classified successfully
    • toSource

      public String toSource(String className) throws Exception
      Returns the boosted model as Java source code.
      Specified by:
      toSource in interface Sourcable
      Parameters:
      className - the classname in the generated code
      Returns:
      the boosted model as Java source code
      Throws:
      Exception - if something goes wrong
    • toString

      public String toString()
      Returns description of the boosted classifier.
      Overrides:
      toString in class Object
      Returns:
      description of the boosted classifier as a string
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class AbstractClassifier
      Returns:
      the revision
    • main

      public static void main(String[] argv)
      Main method for testing this class.
      Parameters:
      argv - the options