Class C45Split

All Implemented Interfaces:
Serializable, Cloneable, RevisionHandler

public class C45Split extends ClassifierSplitModel
Class implementing a C4.5-type split on an attribute.
Version:
$Revision: 14911 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz)
  • Constructor Details

    • C45Split

      public C45Split(int attIndex, int minNoObj, double sumOfWeights, boolean useMDLcorrection)
      Initializes the split model.
  • Method Details

    • buildClassifier

      public void buildClassifier(Instances trainInstances) throws Exception
      Creates a C4.5-type split on the given data. Assumes that none of the class values is missing.
      Specified by:
      buildClassifier in class ClassifierSplitModel
      Throws:
      Exception - if something goes wrong
    • attIndex

      public final int attIndex()
      Returns index of attribute for which split was generated.
    • splitPoint

      public double splitPoint()
      Returns the split point (numeric attribute only).
      Returns:
      the split point used for a test on a numeric attribute
    • classProb

      public final double classProb(int classIndex, Instance instance, int theSubset) throws Exception
      Gets class probability for instance.
      Overrides:
      classProb in class ClassifierSplitModel
      Throws:
      Exception - if something goes wrong
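
The summary above does not say how the probability is derived. As a hedged illustration only, one common choice in C4.5-style trees is the relative class frequency inside the chosen subset, falling back to a size-weighted mix over all subsets when the subset index is negative. The class name, the `counts[subset][class]` layout, and the fallback rules below are assumptions made for this sketch, not taken from this page.

```java
class ClassProbSketch {

    // Sum of all entries in a vector of weights.
    static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    // Relative frequency of classIndex over all subsets together.
    static double overallProb(double[][] counts, int classIndex) {
        double classWeight = 0.0;
        double total = 0.0;
        for (double[] subset : counts) {
            classWeight += subset[classIndex];
            total += sum(subset);
        }
        return total > 0.0 ? classWeight / total : 0.0;
    }

    // Assumed rule: frequency within one subset, or a size-weighted
    // mix over all subsets when subset is negative.
    static double classProb(double[][] counts, int classIndex, int subset) {
        if (subset >= 0) {
            double total = sum(counts[subset]);
            // An empty subset falls back to the overall frequency (assumption).
            return total > 0.0
                ? counts[subset][classIndex] / total
                : overallProb(counts, classIndex);
        }
        double grandTotal = 0.0;
        for (double[] s : counts) {
            grandTotal += sum(s);
        }
        if (grandTotal <= 0.0) {
            return 0.0;
        }
        double prob = 0.0;
        for (int s = 0; s < counts.length; s++) {
            double share = sum(counts[s]) / grandTotal;
            prob += share * classProb(counts, classIndex, s);
        }
        return prob;
    }

    public static void main(String[] args) {
        double[][] counts = {{3, 1}, {1, 3}};
        System.out.println(classProb(counts, 0, 0));
        System.out.println(classProb(counts, 0, -1));
    }
}
```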
    • codingCost

      public final double codingCost()
      Returns coding cost for split (used in rule learner).
      Overrides:
      codingCost in class ClassifierSplitModel
    • gainRatio

      public final double gainRatio()
      Returns (C4.5-type) gain ratio for the generated split.
    • infoGain

      public final double infoGain()
      Returns (C4.5-type) information gain for the generated split.
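
To make the two quantities concrete, here is a minimal, self-contained sketch of textbook information gain and gain ratio computed from a table of class weights per subset. It is not this class's implementation: the class name and the `counts[subset][class]` layout are hypothetical, and the sketch deliberately leaves out missing-value handling and the MDL correction mentioned in the constructor.

```java
class GainRatioSketch {

    // Shannon entropy, in bits, of a vector of non-negative weights.
    static double entropy(double[] counts) {
        double total = 0.0;
        for (double c : counts) {
            total += c;
        }
        if (total <= 0.0) {
            return 0.0;
        }
        double h = 0.0;
        for (double c : counts) {
            if (c > 0.0) {
                double p = c / total;
                h -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return h;
    }

    // Total weight in each subset; counts[subset][class].
    static double[] subsetTotals(double[][] counts) {
        double[] totals = new double[counts.length];
        for (int s = 0; s < counts.length; s++) {
            for (double c : counts[s]) {
                totals[s] += c;
            }
        }
        return totals;
    }

    // Class entropy before the split minus the size-weighted
    // class entropy of the subsets after the split.
    static double infoGain(double[][] counts) {
        double[] perSubset = subsetTotals(counts);
        double[] perClass = new double[counts[0].length];
        double total = 0.0;
        for (int s = 0; s < counts.length; s++) {
            for (int c = 0; c < counts[s].length; c++) {
                perClass[c] += counts[s][c];
            }
            total += perSubset[s];
        }
        if (total <= 0.0) {
            return 0.0;
        }
        double after = 0.0;
        for (int s = 0; s < counts.length; s++) {
            after += (perSubset[s] / total) * entropy(counts[s]);
        }
        return entropy(perClass) - after;
    }

    // Information gain divided by the entropy of the subset sizes.
    static double gainRatio(double[][] counts) {
        double splitInfo = entropy(subsetTotals(counts));
        return splitInfo <= 0.0 ? 0.0 : infoGain(counts) / splitInfo;
    }

    public static void main(String[] args) {
        double[][] counts = {{4, 0}, {0, 4}};
        System.out.println("infoGain  = " + infoGain(counts));
        System.out.println("gainRatio = " + gainRatio(counts));
    }
}
```

Dividing by the entropy of the subset sizes penalizes splits that scatter the data over many small subsets, which is the usual motivation for preferring gain ratio over raw information gain.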
    • leftSide

      public final String leftSide(Instances data)
      Prints left side of condition.
      Specified by:
      leftSide in class ClassifierSplitModel
      Parameters:
      data - training set.
    • rightSide

      public final String rightSide(int index, Instances data)
      Prints the condition satisfied by instances in a subset.
      Specified by:
      rightSide in class ClassifierSplitModel
      Parameters:
      index - index of the subset
      data - training set.
    • sourceExpression

      public final String sourceExpression(int index, Instances data)
      Returns a string containing Java source code equivalent to the test made at this node. The instance being tested is called "i".
      Specified by:
      sourceExpression in class ClassifierSplitModel
      Parameters:
      index - index of the nominal value tested
      data - the data containing instance structure info
      Returns:
      the Java source expression for the test, as a String
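
As a rough illustration of what such a generated test could look like, the sketch below builds expression strings for a nominal and a numeric test. Only the variable name "i" comes from the description above; treating "i" as an `Object[]` of attribute values, the null check for missing values, and the exact expression text are assumptions of this sketch and may differ from what the method actually emits.

```java
class SourceExpressionSketch {

    // Expression for a nominal test: attribute attIndex equals the given value.
    static String nominalTest(int attIndex, String value) {
        String cell = "i[" + attIndex + "]";
        return cell + " != null && " + cell + ".equals(\"" + value + "\")";
    }

    // Expression for a numeric test: subset 0 is taken to be "<= splitPoint",
    // subset 1 to be "> splitPoint" (assumed ordering).
    static String numericTest(int attIndex, int subset, double splitPoint) {
        String cell = "i[" + attIndex + "]";
        String op = (subset == 0) ? "<=" : ">";
        return cell + " != null && ((Double) " + cell + ").doubleValue() "
            + op + " " + splitPoint;
    }

    public static void main(String[] args) {
        System.out.println(nominalTest(2, "sunny"));
        System.out.println(numericTest(0, 0, 2.5));
        System.out.println(numericTest(0, 1, 2.5));
    }
}
```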
    • setSplitPoint

      public final void setSplitPoint(Instances allInstances)
      Sets the split point to the greatest value in the given data that is smaller than or equal to the old split point (C4.5 does this for some strange reason).
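
The adjustment described above can be sketched on a plain array of attribute values. This is a simplified stand-in, not the method itself: the class and method names are hypothetical, missing values are represented as NaN, and keeping the old split point when no value qualifies is an assumption of this sketch.

```java
class SplitPointSketch {

    // Returns the greatest non-missing value that is <= oldSplitPoint.
    static double adjustSplitPoint(double[] values, double oldSplitPoint) {
        double best = -Double.MAX_VALUE;
        boolean found = false;
        for (double v : values) {
            if (Double.isNaN(v)) {
                continue; // skip missing values
            }
            if (v <= oldSplitPoint && v > best) {
                best = v;
                found = true;
            }
        }
        // Assumed fallback: leave the split point unchanged if nothing qualifies.
        return found ? best : oldSplitPoint;
    }

    public static void main(String[] args) {
        double[] values = {1.0, 4.0, Double.NaN, 2.0, 3.0};
        System.out.println(adjustSplitPoint(values, 2.5));
    }
}
```

With this adjustment the threshold always coincides with a value that occurs in the data, so a candidate threshold lying between two observed values is moved down to the lower one.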
    • minsAndMaxs

      public final double[][] minsAndMaxs(Instances data, double[][] minsAndMaxs, int index)
      Returns the minsAndMaxs of the index-th subset.
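
The description is terse, so the following is only one plausible reading for a numeric split: the incoming bounds are copied, and the bound of the split attribute that faces the split point is tightened for the requested subset. The `[attribute][0]` = minimum and `[attribute][1]` = maximum layout, the subset ordering (0 for values at or below the split point, 1 for values above it), and all names are assumptions of this sketch.

```java
import java.util.Arrays;

class MinsAndMaxsSketch {

    // Copies the bounds and narrows the split attribute for one subset.
    static double[][] narrow(double[][] minsAndMaxs, int attIndex,
                             int subset, double splitPoint) {
        double[][] result = new double[minsAndMaxs.length][];
        for (int a = 0; a < minsAndMaxs.length; a++) {
            result[a] = minsAndMaxs[a].clone();
        }
        // Subset 0 gets a new maximum (slot 1), subset 1 a new minimum (slot 0).
        result[attIndex][1 - subset] = splitPoint;
        return result;
    }

    public static void main(String[] args) {
        double[][] bounds = {{0.0, 10.0}, {5.0, 20.0}};
        System.out.println(Arrays.deepToString(narrow(bounds, 1, 0, 12.0)));
        System.out.println(Arrays.deepToString(narrow(bounds, 1, 1, 12.0)));
    }
}
```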
    • resetDistribution

      public void resetDistribution(Instances data) throws Exception
      Sets distribution associated with model.
      Overrides:
      resetDistribution in class ClassifierSplitModel
      Throws:
      Exception
    • weights

      public final double[] weights(Instance instance)
      Returns weights if instance is assigned to more than one subset. Returns null if instance is only assigned to one subset.
      Specified by:
      weights in class ClassifierSplitModel
    • whichSubset

      public final int whichSubset(Instance instance) throws Exception
      Returns index of subset instance is assigned to. Returns -1 if instance is assigned to more than one subset.
      Specified by:
      whichSubset in class ClassifierSplitModel
      Throws:
      Exception - if something goes wrong
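
The contract shared by weights and whichSubset (a single subset index, or -1 together with a weight vector) can be sketched for a binary numeric split as follows. This is an illustration under stated assumptions, not the class's code: a missing attribute value is modeled as NaN, values at or below the split point go to subset 0, and the fractional weights are taken to be proportional to the training weight in each subset, as is conventional for C4.5-style handling of missing values.

```java
import java.util.Arrays;

class SubsetAssignmentSketch {

    // Index of the subset for a known value, or -1 if the value is missing.
    static int whichSubset(double value, double splitPoint) {
        if (Double.isNaN(value)) {
            return -1;
        }
        return value <= splitPoint ? 0 : 1;
    }

    // null for a known value; otherwise each subset's share of the total weight.
    static double[] weights(double value, double[] subsetTotals) {
        if (!Double.isNaN(value)) {
            return null;
        }
        double total = 0.0;
        for (double t : subsetTotals) {
            total += t;
        }
        double[] result = new double[subsetTotals.length];
        for (int s = 0; s < result.length; s++) {
            // Assumed fallback: uniform weights if no training weight is recorded.
            result[s] = total > 0.0 ? subsetTotals[s] / total : 1.0 / result.length;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] subsetTotals = {6.0, 2.0};
        System.out.println(whichSubset(1.0, 2.5));
        System.out.println(whichSubset(Double.NaN, 2.5));
        System.out.println(Arrays.toString(weights(Double.NaN, subsetTotals)));
    }
}
```

A caller would typically check for -1 first and only then ask for the weights, passing the instance down every branch with its weight scaled by the corresponding entry.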
    • getRevision

      public String getRevision()
      Returns the revision string.
      Returns:
      the revision