Class BinC45Split

All Implemented Interfaces:
Serializable, Cloneable, RevisionHandler

public class BinC45Split extends ClassifierSplitModel
Class implementing a binary C4.5-like split on an attribute.
Version:
$Revision: 14911 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz)
  • Constructor Details

    • BinC45Split

      public BinC45Split(int attIndex, int minNoObj, double sumOfWeights, boolean useMDLcorrection)
      Initializes the split model.
  • Method Details

    • buildClassifier

      public void buildClassifier(Instances trainInstances) throws Exception
      Creates a C4.5-type split on the given data.
      Specified by:
      buildClassifier in class ClassifierSplitModel
      Throws:
      Exception - if something goes wrong
    • attIndex

      public final int attIndex()
      Returns index of attribute for which split was generated.
    • splitPoint

      public double splitPoint()
      Returns the split point (numeric attribute only).
      Returns:
      the split point used for a test on a numeric attribute
    • gainRatio

      public final double gainRatio()
      Returns (C4.5-type) gain ratio for the generated split.
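As a rough illustration of what a C4.5-type gain ratio measures for a two-way split, the sketch below divides the information gain by the split information (the entropy of the subset sizes). The class `GainRatioSketch` and its methods are hypothetical and not part of Weka. It assumes per-class weight counts for the two subsets are already available, and it ignores missing values and any MDL correction.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Weka: gain ratio of a two-way split. */
final class GainRatioSketch {

    /** Base-2 entropy of a vector of non-negative counts (0 for an empty vector). */
    static double entropy(double[] counts) {
        double total = Arrays.stream(counts).sum();
        double h = 0.0;
        for (double c : counts) {
            if (c > 0) {
                double p = c / total;
                h -= p * (Math.log(p) / Math.log(2));
            }
        }
        return h;
    }

    /** Information gain of splitting the data into a left and a right bag of per-class counts. */
    static double infoGain(double[] left, double[] right) {
        double nLeft = Arrays.stream(left).sum();
        double nRight = Arrays.stream(right).sum();
        double n = nLeft + nRight;
        if (n == 0) {
            return 0.0; // no data, no gain
        }
        double[] parent = new double[left.length];
        for (int i = 0; i < left.length; i++) {
            parent[i] = left[i] + right[i];
        }
        return entropy(parent)
            - (nLeft / n) * entropy(left)
            - (nRight / n) * entropy(right);
    }

    /** Gain ratio: information gain divided by the entropy of the subset sizes. */
    static double gainRatio(double[] left, double[] right) {
        double splitInfo = entropy(new double[] {
            Arrays.stream(left).sum(), Arrays.stream(right).sum()
        });
        // A split that sends everything to one side carries no split information.
        return splitInfo == 0 ? 0.0 : infoGain(left, right) / splitInfo;
    }
}
```

A split that separates two equally frequent classes perfectly, such as `gainRatio(new double[] {4, 0}, new double[] {0, 4})`, has a gain and a split information of one bit each.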
    • classProb

      public final double classProb(int classIndex, Instance instance, int theSubset) throws Exception
      Gets class probability for instance.
      Overrides:
      classProb in class ClassifierSplitModel
      Throws:
      Exception - if something goes wrong
    • infoGain

      public final double infoGain()
      Returns (C4.5-type) information gain for the generated split.
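The constructor's `useMDLcorrection` flag suggests that the information gain can be adjusted by an MDL-based penalty when splitting on a numeric attribute. The sketch below assumes the penalty used in C4.5 Release 8, log2 of the number of candidate thresholds divided by the total instance weight. Whether this class applies exactly that form is an assumption here, and `MdlCorrectionSketch` is a hypothetical name, not a Weka class.

```java
/** Hypothetical helper, not part of Weka: MDL-style penalty on information gain. */
final class MdlCorrectionSketch {

    /**
     * Subtracts log2(numCandidateSplits) / sumOfWeights from the raw gain.
     * Assumption: this mirrors the correction described for C4.5 Release 8.
     */
    static double correctedGain(double rawGain, int numCandidateSplits, double sumOfWeights) {
        if (numCandidateSplits < 1 || sumOfWeights <= 0) {
            throw new IllegalArgumentException("need at least one candidate split and positive weight");
        }
        double penalty = (Math.log(numCandidateSplits) / Math.log(2)) / sumOfWeights;
        return rawGain - penalty;
    }
}
```

With a single candidate threshold the penalty is zero, so the raw gain is returned unchanged.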
    • leftSide

      public final String leftSide(Instances data)
      Returns the left side of the condition as a string.
      Specified by:
      leftSide in class ClassifierSplitModel
      Parameters:
      data - the data to get the attribute name from.
      Returns:
      the attribute name
    • rightSide

      public final String rightSide(int index, Instances data)
      Returns, as a string, the condition satisfied by instances in a subset.
      Specified by:
      rightSide in class ClassifierSplitModel
      Parameters:
      index - index of the subset
      data - the training set
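To make the left-side and right-side strings concrete, here is a minimal sketch of how the right side of a binary condition might be rendered. It assumes subset 0 holds the instances that pass the test and subset 1 the rest. The class `ConditionTextSketch`, the operator spellings, and the two-decimal number format are illustrative choices, not Weka's actual output.

```java
import java.util.Locale;

/** Hypothetical helper, not part of Weka: text for the right side of a binary test. */
final class ConditionTextSketch {

    /** Numeric attribute: subset 0 is "<= split point", subset 1 is "> split point". */
    static String rightSideNumeric(int index, double splitPoint) {
        String op = (index == 0) ? " <= " : " > ";
        return op + String.format(Locale.ROOT, "%.2f", splitPoint);
    }

    /** Nominal attribute: subset 0 is "= value", subset 1 is "!= value". */
    static String rightSideNominal(int index, String value) {
        return (index == 0 ? " = " : " != ") + value;
    }
}
```

Concatenating an attribute name with one of these strings, for example `"humidity" + rightSideNumeric(0, 75.0)`, yields a readable condition such as `humidity <= 75.00`.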
    • sourceExpression

      public final String sourceExpression(int index, Instances data)
      Returns a string containing Java source code equivalent to the test made at this node. The instance being tested is called "i".
      Specified by:
      sourceExpression in class ClassifierSplitModel
      Parameters:
      index - index of the nominal value tested
      data - the data containing instance structure info
      Returns:
      the Java source expression for the test, as a String
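The following sketch shows one way such a source expression could be assembled. It assumes the instance `i` is exposed to the generated code as an `Object[]` indexed by attribute position, which is an assumption of this example; the exact expression that Weka emits may differ. `SourceExpressionSketch` is a hypothetical name.

```java
/** Hypothetical helper, not part of Weka: builds a Java test expression as text. */
final class SourceExpressionSketch {

    /** Numeric test: subset 0 is "<= split point", subset 1 is "> split point". */
    static String numericTest(int attIndex, int index, double splitPoint) {
        String op = (index == 0) ? " <= " : " > ";
        return "((Double) i[" + attIndex + "]).doubleValue()" + op + splitPoint;
    }

    /** Nominal test: subset 0 matches the value, subset 1 is its negation. */
    static String nominalTest(int attIndex, int index, String value) {
        String prefix = (index == 0) ? "" : "!";
        return prefix + "i[" + attIndex + "].equals(\"" + value + "\")";
    }
}
```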
    • setSplitPoint

      public final void setSplitPoint(Instances allInstances)
      Sets the split point to the greatest value in the given data that is less than or equal to the old split point. (C4.5 does this for some strange reason.)
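The adjustment described above can be sketched as a single pass over the attribute values. In this illustration, `NaN` stands in for a missing value, and the old split point is kept when no observed value qualifies. Both choices are assumptions of the sketch, and `SplitPointSnapSketch` is not a Weka class.

```java
/** Hypothetical helper, not part of Weka: snaps a threshold to an observed value. */
final class SplitPointSnapSketch {

    /** Returns the greatest non-missing value that is <= oldSplit, or oldSplit if none exists. */
    static double snapToObserved(double[] values, double oldSplit) {
        double best = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            // NaN marks a missing value in this sketch and is skipped.
            if (!Double.isNaN(v) && v <= oldSplit && v > best) {
                best = v;
            }
        }
        return best == Double.NEGATIVE_INFINITY ? oldSplit : best;
    }
}
```

For values 1.0, 2.0 and 4.0 with an old split point of 3.0, the new split point is 2.0. The subsets are unchanged, because no observed value lies between 2.0 and 3.0.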
    • resetDistribution

      public void resetDistribution(Instances data) throws Exception
      Sets distribution associated with model.
      Overrides:
      resetDistribution in class ClassifierSplitModel
      Throws:
      Exception
    • weights

      public final double[] weights(Instance instance)
      Returns weights if instance is assigned to more than one subset. Returns null if instance is only assigned to one subset.
      Specified by:
      weights in class ClassifierSplitModel
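A minimal sketch of the contract described above, under the usual C4.5 convention that an instance is spread over several subsets when its value for the split attribute is missing, with fractions proportional to the training weight in each subset. That convention is an assumption here. `SubsetWeightsSketch` and its parameters are hypothetical.

```java
/** Hypothetical helper, not part of Weka: fractional subset weights for an instance. */
final class SubsetWeightsSketch {

    /**
     * Returns null when the instance belongs to exactly one subset; otherwise returns
     * the fraction of the instance sent to each subset, proportional to bagWeights.
     */
    static double[] weights(boolean attributeMissing, double[] bagWeights) {
        if (!attributeMissing) {
            return null; // the instance falls into a single subset
        }
        double total = 0.0;
        for (double w : bagWeights) {
            total += w;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("subset weights must sum to a positive value");
        }
        double[] fractions = new double[bagWeights.length];
        for (int i = 0; i < bagWeights.length; i++) {
            fractions[i] = bagWeights[i] / total;
        }
        return fractions;
    }
}
```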
    • whichSubset

      public final int whichSubset(Instance instance) throws Exception
      Returns index of subset instance is assigned to. Returns -1 if instance is assigned to more than one subset.
      Specified by:
      whichSubset in class ClassifierSplitModel
      Throws:
      Exception - if something goes wrong
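The routing rule for a binary split can be sketched as follows. The sketch assumes that a numeric test sends values less than or equal to the split point to subset 0, that a nominal test sends the one selected value to subset 0, and that a missing value (encoded here as `NaN` or a negative value index) is the case that yields -1. These encodings and the name `WhichSubsetSketch` are illustrative, not Weka's.

```java
/** Hypothetical helper, not part of Weka: subset index for a binary split. */
final class WhichSubsetSketch {

    /** Numeric attribute: 0 if value <= splitPoint, 1 otherwise, -1 if missing (NaN). */
    static int numericSubset(double value, double splitPoint) {
        if (Double.isNaN(value)) {
            return -1;
        }
        return value <= splitPoint ? 0 : 1;
    }

    /** Nominal attribute: 0 if the value is the selected one, 1 otherwise, -1 if missing. */
    static int nominalSubset(int valueIndex, int splitValueIndex) {
        if (valueIndex < 0) {
            return -1;
        }
        return valueIndex == splitValueIndex ? 0 : 1;
    }
}
```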
    • getRevision

      public String getRevision()
      Returns the revision string.
      Returns:
      the revision