Class Logistic

java.lang.Object
weka.classifiers.AbstractClassifier
weka.classifiers.functions.Logistic
All Implemented Interfaces:
Serializable, Cloneable, Classifier, Aggregateable<Logistic>, BatchPredictor, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, PMMLProducer, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler

Class for building and using a multinomial logistic regression model with a ridge estimator.

There are some modifications, however, compared to the paper of le Cessie and van Houwelingen (1992):

If there are k classes for n instances with m attributes, the parameter matrix B to be calculated will be an m*(k-1) matrix.

The probability for class j with the exception of the last class is

Pj(Xi) = exp(Xi*Bj)/((sum[j=1..(k-1)]exp(Xi*Bj))+1)

The last class has probability

1-(sum[j=1..(k-1)]Pj(Xi))
= 1/((sum[j=1..(k-1)]exp(Xi*Bj))+1)
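The two probability formulas above can be checked numerically with a small stand-alone sketch. It does not call Weka; the class name `LogisticProbabilitySketch`, the method name `classProbabilities`, and the plain-array layout of B (m rows, k-1 columns) are assumptions made for illustration only, and no care is taken to guard against overflow in `exp`.

```java
import java.util.Arrays;

/** Illustrative only: evaluates the class-probability formulas for one instance. */
final class LogisticProbabilitySketch {

    /**
     * @param x instance values Xi, length m
     * @param b parameter matrix B, m rows by (k-1) columns
     * @return probabilities for all k classes; the last entry belongs to the last class
     */
    static double[] classProbabilities(double[] x, double[][] b) {
        int m = x.length;
        int kMinus1 = b[0].length;
        double[] probs = new double[kMinus1 + 1];
        double denominator = 1.0; // the "+1" term in the formulas
        for (int j = 0; j < kMinus1; j++) {
            double linear = 0.0; // Xi*Bj
            for (int a = 0; a < m; a++) {
                linear += x[a] * b[a][j];
            }
            probs[j] = Math.exp(linear);
            denominator += probs[j];
        }
        for (int j = 0; j < kMinus1; j++) {
            probs[j] /= denominator;
        }
        probs[kMinus1] = 1.0 / denominator; // probability of the last class
        return probs;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0};
        double[][] b = {{0.5, -0.5}, {0.25, 0.0}};
        System.out.println(Arrays.toString(classProbabilities(x, b)));
    }
}
```

Because every class shares the same denominator, the k probabilities always sum to 1, which is why only k-1 parameter columns are needed.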

The (negative) multinomial log-likelihood is thus:

L = -sum[i=1..n]{
sum[j=1..(k-1)](Yij * ln(Pj(Xi)))
+(1 - (sum[j=1..(k-1)]Yij))
* ln(1 - sum[j=1..(k-1)]Pj(Xi))
} + ridge * (B^2)
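As a rough illustration of the objective above, the following stand-alone sketch evaluates L for a small data set. It is not Weka's code: the names `RidgeNegLogLikelihoodSketch` and `negLogLikelihood` are hypothetical, class labels are passed as integer indices instead of the indicator values Yij, and "B^2" is assumed to mean the sum of squares of every entry of B.

```java
/** Illustrative only: evaluates the ridge-penalised negative log-likelihood. */
final class RidgeNegLogLikelihoodSketch {

    /**
     * @param x     n instances, each with m attribute values
     * @param y     observed class index per instance, 0..k-1; k-1 is the last class
     * @param b     parameter matrix B, m rows by (k-1) columns
     * @param ridge ridge value multiplying the squared-parameter penalty
     */
    static double negLogLikelihood(double[][] x, int[] y, double[][] b, double ridge) {
        int kMinus1 = b[0].length;
        double nll = 0.0;
        for (int i = 0; i < x.length; i++) {
            double[] linear = new double[kMinus1];
            double denominator = 1.0;
            for (int j = 0; j < kMinus1; j++) {
                for (int a = 0; a < x[i].length; a++) {
                    linear[j] += x[i][a] * b[a][j];
                }
                denominator += Math.exp(linear[j]);
            }
            // Yij is 1 only for the observed class, so one log-probability survives per instance:
            // ln Pj = Xi*Bj - ln(denominator), and ln P(last class) = -ln(denominator).
            double logProb = (y[i] < kMinus1 ? linear[y[i]] : 0.0) - Math.log(denominator);
            nll -= logProb;
        }
        // Assumption: the penalty is the sum of squares of all entries of B.
        double sumOfSquares = 0.0;
        for (double[] row : b) {
            for (double value : row) {
                sumOfSquares += value * value;
            }
        }
        return nll + ridge * sumOfSquares;
    }

    public static void main(String[] args) {
        double[][] x = {{1.0, 2.0}, {1.0, -1.0}};
        int[] y = {0, 2};
        double[][] b = {{0.1, -0.2}, {0.3, 0.0}};
        System.out.println(negLogLikelihood(x, y, b, 1.0e-8));
    }
}
```

A larger ridge value makes large parameters more expensive, which pulls the minimiser of L towards smaller entries in B.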

To find the matrix B that minimises L, a quasi-Newton method is used to search for the optimal values of the m*(k-1) variables. Note that before the optimization procedure is run, the matrix B is 'squeezed' (flattened) into a vector of length m*(k-1). For details of the optimization procedure, see the weka.core.Optimization class.
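The 'squeezing' step can be pictured as a plain copy between a matrix and a vector. The sketch below is an assumption-laden illustration: the names `ParameterFlatteningSketch`, `flatten`, and `unflatten` are hypothetical, and the ordering (one class column after another) is chosen for the example; the source does not state which ordering the optimizer actually sees.

```java
import java.util.Arrays;

/** Illustrative only: maps an m-by-(k-1) matrix to a vector of length m*(k-1) and back. */
final class ParameterFlatteningSketch {

    /** Copies B into a vector, one class column after another (assumed ordering). */
    static double[] flatten(double[][] b) {
        int m = b.length;
        int kMinus1 = b[0].length;
        double[] vector = new double[m * kMinus1];
        for (int j = 0; j < kMinus1; j++) {
            for (int a = 0; a < m; a++) {
                vector[j * m + a] = b[a][j];
            }
        }
        return vector;
    }

    /** Inverse of flatten: rebuilds the m-by-(k-1) matrix from the vector. */
    static double[][] unflatten(double[] vector, int m, int kMinus1) {
        double[][] b = new double[m][kMinus1];
        for (int j = 0; j < kMinus1; j++) {
            for (int a = 0; a < m; a++) {
                b[a][j] = vector[j * m + a];
            }
        }
        return b;
    }

    public static void main(String[] args) {
        double[][] b = {{1.0, 2.0}, {3.0, 4.0}, {5.0, 6.0}};
        double[] vector = flatten(b);
        System.out.println(Arrays.toString(vector));
        System.out.println(Arrays.deepToString(unflatten(vector, 3, 2)));
    }
}
```

Working on a flat vector lets a general-purpose optimizer treat all m*(k-1) parameters uniformly, without knowing that they form a matrix.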

Although the original logistic regression does not deal with instance weights, the algorithm is modified slightly to handle them.
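The source does not spell out how the weights enter the objective. One common reading, assumed here purely for illustration, is that each instance's log-likelihood term is multiplied by its weight, so an instance with weight 2 counts like two identical instances. The names `WeightedLikelihoodSketch` and `weightedNegLogLikelihood` are hypothetical.

```java
/** Illustrative only: a weighted negative log-likelihood under the stated assumption. */
final class WeightedLikelihoodSketch {

    /**
     * @param logProbOfObservedClass ln of the predicted probability of each instance's observed class
     * @param weights                instance weights, one per instance
     */
    static double weightedNegLogLikelihood(double[] logProbOfObservedClass, double[] weights) {
        if (logProbOfObservedClass.length != weights.length) {
            throw new IllegalArgumentException("one weight per instance is required");
        }
        double nll = 0.0;
        for (int i = 0; i < weights.length; i++) {
            // Assumption: the weight simply scales the instance's contribution.
            nll -= weights[i] * logProbOfObservedClass[i];
        }
        return nll;
    }

    public static void main(String[] args) {
        double[] logProbs = {Math.log(0.5), Math.log(0.25)};
        double[] weights = {2.0, 1.0};
        System.out.println(weightedNegLogLikelihood(logProbs, weights));
    }
}
```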

For more information see:

le Cessie, S., van Houwelingen, J.C. (1992). Ridge Estimators in Logistic Regression. Applied Statistics. 41(1):191-201.

Note: Missing values are replaced using the ReplaceMissingValues filter, and nominal attributes are transformed into numeric attributes using the NominalToBinary filter.
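To give a feel for the two preprocessing steps named in the note, here is a stand-alone sketch that does not use the Weka filters themselves. It assumes that a missing numeric value is marked with NaN and filled with the mean of the observed values, and that a nominal value becomes one 0/1 indicator per possible value; the actual filters may differ in detail. The names `PreprocessingSketch`, `replaceMissingWithMean`, and `toIndicators` are hypothetical.

```java
import java.util.Arrays;

/** Illustrative only: simplified stand-ins for the two preprocessing steps. */
final class PreprocessingSketch {

    /** Replaces NaN entries (used here to mark missing values) with the mean of the observed entries. */
    static double[] replaceMissingWithMean(double[] column) {
        double sum = 0.0;
        int count = 0;
        for (double value : column) {
            if (!Double.isNaN(value)) {
                sum += value;
                count++;
            }
        }
        double mean = count > 0 ? sum / count : 0.0;
        double[] filled = column.clone();
        for (int i = 0; i < filled.length; i++) {
            if (Double.isNaN(filled[i])) {
                filled[i] = mean;
            }
        }
        return filled;
    }

    /** Turns the index of a nominal value into one 0/1 indicator per possible value. */
    static double[] toIndicators(int valueIndex, int numValues) {
        double[] indicators = new double[numValues];
        indicators[valueIndex] = 1.0;
        return indicators;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(replaceMissingWithMean(new double[] {1.0, Double.NaN, 3.0})));
        System.out.println(Arrays.toString(toIndicators(1, 3)));
    }
}
```

After both steps every attribute is numeric and complete, which is what the parameter matrix B requires.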

BibTeX:

 @article{leCessie1992,
    author = {le Cessie, S. and van Houwelingen, J.C.},
    journal = {Applied Statistics},
    number = {1},
    pages = {191-201},
    title = {Ridge Estimators in Logistic Regression},
    volume = {41},
    year = {1992}
 }
 

Valid options are:

 -D
  Turn on debugging output.
 
 -S
  Do not standardize the attributes in the input data.
 
 -R <ridge>
  Set the ridge in the log-likelihood.
 
 -M <number>
  Set the maximum number of iterations (default -1, until convergence).
 
Version:
$Revision: 15534 $
Author:
Xin Xu (xx5@cs.waikato.ac.nz)
  • Constructor Details

    • Logistic

      public Logistic()
      Constructor that sets the default number of decimal places to 4.
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing this classifier
      Returns:
      a description of the classifier suitable for displaying in the explorer/experimenter gui
    • getTechnicalInformation

      public TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      Specified by:
      getTechnicalInformation in interface TechnicalInformationHandler
      Returns:
      the technical information about this class
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class AbstractClassifier
      Returns:
      an enumeration of all the available options
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      Valid options are:

       -D
        Turn on debugging output.
       
       -S
        Do not standardize the attributes in the input data.
       
       -R <ridge>
        Set the ridge in the log-likelihood.
       
       -M <number>
        Set the maximum number of iterations (default -1, until convergence).
       
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class AbstractClassifier
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
    • getOptions

      public String[] getOptions()
      Gets the current settings of the classifier.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class AbstractClassifier
      Returns:
      an array of strings suitable for passing to setOptions
    • debugTipText

      public String debugTipText()
      Returns the tip text for this property
      Overrides:
      debugTipText in class AbstractClassifier
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setDebug

      public void setDebug(boolean debug)
      Sets whether debugging output will be printed.
      Overrides:
      setDebug in class AbstractClassifier
      Parameters:
      debug - true if debugging output should be printed
    • getDebug

      public boolean getDebug()
      Gets whether debugging output will be printed.
      Overrides:
      getDebug in class AbstractClassifier
      Returns:
      true if debugging output will be printed
    • useConjugateGradientDescentTipText

      public String useConjugateGradientDescentTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setUseConjugateGradientDescent

      public void setUseConjugateGradientDescent(boolean useConjugateGradientDescent)
      Sets whether conjugate gradient descent is used.
      Parameters:
      useConjugateGradientDescent - true if CGD is to be used.
    • getUseConjugateGradientDescent

      public boolean getUseConjugateGradientDescent()
      Gets whether to use conjugate gradient descent rather than BFGS updates.
      Returns:
      true if CGD is used
    • doNotStandardizeAttributesTipText

      public String doNotStandardizeAttributesTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setDoNotStandardizeAttributes

      public void setDoNotStandardizeAttributes(boolean DoNotStandardizeAttributes)
      Sets whether attributes are left unstandardized.
      Parameters:
      DoNotStandardizeAttributes - true if attributes are not to be standardized
    • getDoNotStandardizeAttributes

      public boolean getDoNotStandardizeAttributes()
      Gets whether attributes are left unstandardized.
      Returns:
      true if attributes are not being standardized
    • ridgeTipText

      public String ridgeTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setRidge

      public void setRidge(double ridge)
      Sets the ridge in the log-likelihood.
      Parameters:
      ridge - the ridge
    • getRidge

      public double getRidge()
      Gets the ridge in the log-likelihood.
      Returns:
      the ridge
    • maxItsTipText

      public String maxItsTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getMaxIts

      public int getMaxIts()
      Gets the maximum number of iterations.
      Returns:
      the maximum number of iterations (-1 means iterate until convergence)
    • setMaxIts

      public void setMaxIts(int newMaxIts)
      Sets the maximum number of iterations.
      Parameters:
      newMaxIts - the maximum number of iterations (-1 means iterate until convergence)
    • getCapabilities

      public Capabilities getCapabilities()
      Returns default capabilities of the classifier.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Specified by:
      getCapabilities in interface Classifier
      Overrides:
      getCapabilities in class AbstractClassifier
      Returns:
      the capabilities of this classifier
    • buildClassifier

      public void buildClassifier(Instances train) throws Exception
      Builds the classifier
      Specified by:
      buildClassifier in interface Classifier
      Parameters:
      train - the training data to be used for building the logistic regression model.
      Throws:
      Exception - if the classifier could not be built successfully
    • distributionForInstance

      public double[] distributionForInstance(Instance instance) throws Exception
      Computes the distribution for a given instance
      Specified by:
      distributionForInstance in interface Classifier
      Overrides:
      distributionForInstance in class AbstractClassifier
      Parameters:
      instance - the instance for which distribution is computed
      Returns:
      the distribution
      Throws:
      Exception - if the distribution can't be computed successfully
    • coefficients

      public double[][] coefficients()
      Returns the coefficients for this logistic model. The first dimension indexes the attributes, and the second the classes.
      Returns:
      the coefficients for this logistic model
    • toString

      public String toString()
      Gets a string describing the classifier.
      Overrides:
      toString in class Object
      Returns:
      a string describing the classifier that was built.
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class AbstractClassifier
      Returns:
      the revision
    • aggregate

      public Logistic aggregate(Logistic toAggregate) throws Exception
      Aggregate an object with this one
      Specified by:
      aggregate in interface Aggregateable<Logistic>
      Parameters:
      toAggregate - the object to aggregate
      Returns:
      the result of aggregation
      Throws:
      Exception - if the supplied object can't be aggregated for some reason
    • finalizeAggregation

      public void finalizeAggregation() throws Exception
      Call to complete the aggregation process. Allows implementers to do any final processing based on how many objects were aggregated.
      Specified by:
      finalizeAggregation in interface Aggregateable<Logistic>
      Throws:
      Exception - if the aggregation can't be finalized for some reason
    • main

      public static void main(String[] argv)
      Main method for testing this class.
      Parameters:
      argv - should contain the command line arguments to the scheme (see Evaluation)
    • toPMML

      public String toPMML(Instances train)
      Produce a PMML representation of this logistic model
      Specified by:
      toPMML in interface PMMLProducer
      Parameters:
      train - the training data that was used to construct the model
      Returns:
      a string containing the PMML representation
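The array shapes documented for distributionForInstance and coefficients above can be consumed as in the following sketch. It makes no Weka calls: the arrays are hand-written stand-ins for what those methods would return, and the names `ModelOutputSketch`, `mostProbableClass`, and `coefficientsForClass` are hypothetical.

```java
import java.util.Arrays;

/** Illustrative only: helpers for arrays shaped like the documented return values. */
final class ModelOutputSketch {

    /** Returns the index of the largest entry of a class distribution. */
    static int mostProbableClass(double[] distribution) {
        int best = 0;
        for (int j = 1; j < distribution.length; j++) {
            if (distribution[j] > distribution[best]) {
                best = j;
            }
        }
        return best;
    }

    /** Collects coefficients[attribute][classIndex] over all attributes (first dimension). */
    static double[] coefficientsForClass(double[][] coefficients, int classIndex) {
        double[] column = new double[coefficients.length];
        for (int a = 0; a < coefficients.length; a++) {
            column[a] = coefficients[a][classIndex];
        }
        return column;
    }

    public static void main(String[] args) {
        double[] distribution = {0.2, 0.5, 0.3};       // stand-in for a returned distribution
        double[][] coefficients = {{1.0, 2.0}, {3.0, 4.0}}; // stand-in, indexed [attribute][class]
        System.out.println(mostProbableClass(distribution));
        System.out.println(Arrays.toString(coefficientsForClass(coefficients, 1)));
    }
}
```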