Class InputMappedClassifier

All Implemented Interfaces:
Serializable, Cloneable, Classifier, AdditionalMeasureProducer, BatchPredictor, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, Drawable, EnvironmentHandler, OptionHandler, RevisionHandler, WeightedInstancesHandler

Wrapper classifier that addresses incompatible training and test data by building a mapping between the structure of the training data that a classifier was built with and the structure of the incoming test instances. Model attributes that are not found in the incoming instances receive missing values, as do incoming nominal attribute values that the classifier has not seen before. A new classifier can be trained, or an existing one can be loaded from a file.
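The mapping described above can be pictured with a small JDK-only sketch. It is not the class's actual implementation: the class name `AttributeMappingSketch`, its methods, and the use of `Double.NaN` to stand for a missing value are assumptions made for illustration, and attributes are matched by exact name only.

```java
import java.util.List;

/** Illustrative only: maps incoming rows onto a model's attribute order. */
class AttributeMappingSketch {

    /** For each model attribute, the index of the same-named incoming attribute, or -1. */
    static int[] buildMapping(List<String> modelAttrs, List<String> incomingAttrs) {
        int[] mapping = new int[modelAttrs.size()];
        for (int i = 0; i < mapping.length; i++) {
            // indexOf returns -1 when the model attribute is absent from the input
            mapping[i] = incomingAttrs.indexOf(modelAttrs.get(i));
        }
        return mapping;
    }

    /** Reorders an incoming row into model order; unmatched attributes become NaN (missing). */
    static double[] mapRow(int[] mapping, double[] incomingRow) {
        double[] mapped = new double[mapping.length];
        for (int i = 0; i < mapping.length; i++) {
            mapped[i] = mapping[i] < 0 ? Double.NaN : incomingRow[mapping[i]];
        }
        return mapped;
    }

    /** Index of a nominal label among those seen in training, or NaN (missing) if unseen. */
    static double mapNominal(List<String> modelLabels, String incomingLabel) {
        int idx = modelLabels.indexOf(incomingLabel);
        return idx < 0 ? Double.NaN : idx;
    }
}
```

In this sketch the mapping is computed once per incoming structure and then reused for every row, which is the reason to keep `buildMapping` separate from `mapRow`.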

Valid options are:

 -I
  Ignore case when matching attribute names and nominal values.
 
 -M
  Suppress the output of the mapping report.
 
 -trim
  Trim white space from either end of names before matching.
 
 -L <path to model to load>
  Path to a model to load. If set, this model
  will be used for prediction and any base classifier
  specification will be ignored. Environment variables
  may be used in the path (e.g. ${HOME}/myModel.model)
 
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 
 -W
  Full name of base classifier.
  (default: weka.classifiers.rules.ZeroR)
 
 Options specific to classifier weka.classifiers.rules.ZeroR:
 
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 
Version:
$Revision: 15211 $
Author:
Mark Hall (mhall{[at]}pentaho{[dot]}com)
  • Constructor Details

    • InputMappedClassifier

      public InputMappedClassifier()
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing this classifier
      Returns:
      a description of the classifier suitable for displaying in the explorer/experimenter gui
    • setEnvironment

      public void setEnvironment(Environment env)
      Set the environment variables to use
      Specified by:
      setEnvironment in interface EnvironmentHandler
      Parameters:
      env - the environment variables to use
    • ignoreCaseForNamesTipText

      public String ignoreCaseForNamesTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setIgnoreCaseForNames

      public void setIgnoreCaseForNames(boolean ignore)
      Set whether to ignore case when matching attribute names and nominal values.
      Parameters:
      ignore - true if case is to be ignored
    • getIgnoreCaseForNames

      public boolean getIgnoreCaseForNames()
      Get whether to ignore case when matching attribute names and nominal values.
      Returns:
      true if case is to be ignored.
    • trimTipText

      public String trimTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setTrim

      public void setTrim(boolean trim)
      Set whether to trim white space from each end of names before matching.
      Parameters:
      trim - true to trim white space.
    • getTrim

      public boolean getTrim()
      Get whether to trim white space from each end of names before matching.
      Returns:
      true if white space is to be trimmed.
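      A minimal sketch of how the ignore-case and trim settings could affect name matching. The class `NameMatchSketch` and its methods are hypothetical and use only the JDK; the real matching logic inside InputMappedClassifier may differ in detail.

```java
import java.util.Locale;

/** Illustrative only: name comparison under the ignore-case and trim settings. */
class NameMatchSketch {

    /** Applies the two settings to a single attribute name or nominal label. */
    static String normalize(String name, boolean ignoreCase, boolean trim) {
        String result = trim ? name.trim() : name;
        // Locale.ROOT keeps lower-casing independent of the platform's default locale
        return ignoreCase ? result.toLowerCase(Locale.ROOT) : result;
    }

    /** True if the two names are equal after both have been normalized. */
    static boolean matches(String modelName, String incomingName,
                           boolean ignoreCase, boolean trim) {
        return normalize(modelName, ignoreCase, trim)
            .equals(normalize(incomingName, ignoreCase, trim));
    }
}
```

      With both settings enabled, `"Outlook"` and `" outlook "` match in this sketch; with either one disabled they do not.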
    • suppressMappingReportTipText

      public String suppressMappingReportTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setSuppressMappingReport

      public void setSuppressMappingReport(boolean suppress)
      Set whether to suppress output of the report of model-to-input mappings.
      Parameters:
      suppress - true to suppress this output.
    • getSuppressMappingReport

      public boolean getSuppressMappingReport()
      Get whether to suppress output of the report of model-to-input mappings.
      Returns:
      true if this output is to be suppressed.
    • modelPathTipText

      public String modelPathTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setModelPath

      public void setModelPath(String modelPath) throws Exception
      Set the path from which to load a model. Loading occurs when the first test instance is received or when getModelHeader() is called programmatically. Environment variables can be used in the supplied path, e.g. ${HOME}/myModel.model.
      Parameters:
      modelPath - the path to the model to load.
      Throws:
      Exception - if a problem occurs during loading.
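      The `${HOME}/myModel.model` form can be illustrated with a JDK-only sketch of variable substitution. The class `PathSubstitutionSketch` is hypothetical, the variables come from a plain `Map` rather than a Weka `Environment`, and throwing on an undefined variable is a choice made for this sketch.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: expands ${NAME} placeholders in a model path. */
class PathSubstitutionSketch {

    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${NAME} with vars.get(NAME); fails if NAME is not defined. */
    static String substitute(String path, Map<String, String> vars) {
        Matcher m = VAR.matcher(path);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Undefined variable: " + m.group(1));
            }
            // quoteReplacement stops '$' and '\' in the value from being treated as group references
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

      Passing `System.getenv()` as the map would resolve placeholders against the process environment.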
    • getModelPath

      public String getModelPath()
      Get the path used for loading a model.
      Returns:
      the path used for loading a model.
    • getCapabilities

      public Capabilities getCapabilities()
      Returns default capabilities of the classifier.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Specified by:
      getCapabilities in interface Classifier
      Overrides:
      getCapabilities in class SingleClassifierEnhancer
      Returns:
      the capabilities of this classifier
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options. The valid options are those listed in the class description.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class SingleClassifierEnhancer
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      The valid options are those listed in the class description.

      Options after -- are passed to the designated classifier.

      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class SingleClassifierEnhancer
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
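      The rule that options after `--` are passed to the designated classifier can be sketched as a split of the option array. The class `OptionSplitSketch` is hypothetical and JDK-only, and the example array in the note after the code is illustrative rather than a recommended configuration.

```java
import java.util.Arrays;

/** Illustrative only: splits an option array at the "--" separator. */
class OptionSplitSketch {

    /** Options before "--", intended for the wrapper itself. */
    static String[] wrapperOptions(String[] options) {
        return Arrays.copyOfRange(options, 0, indexOfSeparator(options));
    }

    /** Options after "--", intended for the base classifier (empty if there is no "--"). */
    static String[] baseOptions(String[] options) {
        int sep = indexOfSeparator(options);
        return sep == options.length
            ? new String[0]
            : Arrays.copyOfRange(options, sep + 1, options.length);
    }

    /** Position of the first "--", or options.length when it is absent. */
    private static int indexOfSeparator(String[] options) {
        for (int i = 0; i < options.length; i++) {
            if ("--".equals(options[i])) {
                return i;
            }
        }
        return options.length;
    }
}
```

      For the array `{"-I", "-trim", "-W", "weka.classifiers.trees.J48", "--", "-C", "0.25"}`, the sketch assigns the first four entries to the wrapper and `-C 0.25` to the base classifier.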
    • getOptions

      public String[] getOptions()
      Gets the current settings of the Classifier.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class SingleClassifierEnhancer
      Returns:
      an array of strings suitable for passing to setOptions
    • setTestStructure

      public void setTestStructure(Instances testStructure)
      Set the test structure (if known in advance) that we are likely to see. If set, then a call to buildClassifier() will not overwrite any test structure that has been recorded with the current training structure. This is useful for getting a correct mapping report output in toString() after buildClassifier has been called and before any test instance has been seen. Test structure and mapping will get reset if a test instance is received whose structure does not match the recorded test structure.
      Parameters:
      testStructure - the structure of the test instances that we are likely to see (if known in advance)
    • setModelHeader

      public void setModelHeader(Instances modelHeader)
      Set the structure of the data used to create the model. This method is useful for clients that have an existing in-memory model that they would like to wrap in the InputMappedClassifier.
      Parameters:
      modelHeader - the structure of the data used to build the wrapped model
    • buildClassifier

      public void buildClassifier(Instances data) throws Exception
      Build the classifier
      Specified by:
      buildClassifier in interface Classifier
      Parameters:
      data - the training data to be used for building the classifier.
      Throws:
      Exception - if the classifier could not be built successfully
    • getModelHeader

      public Instances getModelHeader(Instances defaultH) throws Exception
      Return the instance structure that the encapsulated model was built with. If the classifier will be built from scratch by InputMappedClassifier then this method just returns the default structure that is passed in as argument.
      Parameters:
      defaultH - the default instances structure
      Returns:
      the instances structure used to create the encapsulated model
      Throws:
      Exception - if a problem occurs
    • getMappedClassIndex

      public int getMappedClassIndex() throws Exception
      Throws:
      Exception
    • constructMappedInstance

      public Instance constructMappedInstance(Instance incoming) throws Exception
      Throws:
      Exception
    • classifyInstance

      public double classifyInstance(Instance inst) throws Exception
      Description copied from class: AbstractClassifier
      Classifies the given test instance. The instance has to belong to a dataset when it's being classified. Note that a classifier MUST implement either this or distributionForInstance().
      Specified by:
      classifyInstance in interface Classifier
      Overrides:
      classifyInstance in class AbstractClassifier
      Parameters:
      inst - the instance to be classified
      Returns:
      the predicted most likely class for the instance or Utils.missingValue() if no prediction is made
      Throws:
      Exception - if an error occurred during the prediction
    • distributionForInstance

      public double[] distributionForInstance(Instance inst) throws Exception
      Description copied from class: AbstractClassifier
      Predicts the class memberships for a given instance. If an instance is unclassified, the returned array elements must be all zero. If the class is numeric, the array must consist of only one element, which contains the predicted value. Note that a classifier MUST implement either this or classifyInstance().
      Specified by:
      distributionForInstance in interface Classifier
      Overrides:
      distributionForInstance in class AbstractClassifier
      Parameters:
      inst - the instance to be classified
      Returns:
      an array containing the estimated membership probabilities of the test instance in each class or the numeric prediction
      Throws:
      Exception - if distribution could not be computed successfully
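      For a nominal class, the relationship between the two prediction methods can be sketched as picking the most probable entry of the distribution. The class `PredictionSketch` is hypothetical and JDK-only, and `Double.NaN` stands in for a missing prediction as an assumption of this sketch; the numeric-class case, where the array holds a single predicted value, is not covered.

```java
/** Illustrative only: derives a nominal class prediction from a class distribution. */
class PredictionSketch {

    /**
     * Returns the index of the largest probability as a double,
     * or NaN when every entry is zero (no prediction was made).
     */
    static double mostLikelyClass(double[] dist) {
        int best = -1;
        double max = 0.0;
        for (int i = 0; i < dist.length; i++) {
            if (dist[i] > max) { // strict comparison: all-zero arrays leave best at -1
                max = dist[i];
                best = i;
            }
        }
        return best < 0 ? Double.NaN : best;
    }
}
```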
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • graphType

      public int graphType()
      Returns the type of graph this classifier represents.
      Specified by:
      graphType in interface Drawable
      Returns:
      the type of graph
    • enumerateMeasures

      public Enumeration<String> enumerateMeasures()
      Returns an enumeration of the additional measure names
      Specified by:
      enumerateMeasures in interface AdditionalMeasureProducer
      Returns:
      an enumeration of the measure names
    • getMeasure

      public double getMeasure(String additionalMeasureName)
      Returns the value of the named measure
      Specified by:
      getMeasure in interface AdditionalMeasureProducer
      Parameters:
      additionalMeasureName - the name of the measure to query for its value
      Returns:
      the value of the named measure
      Throws:
      IllegalArgumentException - if the named measure is not supported
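      The enumerate-then-query pattern of these two methods can be sketched without Weka on the classpath. The class `MeasureSketch`, its stored measure name, and its value are hypothetical; only the method shapes mirror enumerateMeasures() and getMeasure(String).

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: a stand-in measure producer plus a helper that reads every measure. */
class MeasureSketch {

    private final Map<String, Double> measures = new LinkedHashMap<>();

    MeasureSketch() {
        measures.put("measureExample", 7.0); // hypothetical name and value
    }

    Enumeration<String> enumerateMeasures() {
        return Collections.enumeration(measures.keySet());
    }

    double getMeasure(String additionalMeasureName) {
        Double value = measures.get(additionalMeasureName);
        if (value == null) {
            throw new IllegalArgumentException(additionalMeasureName + " not supported");
        }
        return value;
    }

    /** Queries every name that the source enumerates and collects the results in order. */
    static Map<String, Double> collect(MeasureSketch source) {
        Map<String, Double> result = new LinkedHashMap<>();
        Enumeration<String> names = source.enumerateMeasures();
        while (names.hasMoreElements()) {
            String name = names.nextElement();
            result.put(name, source.getMeasure(name));
        }
        return result;
    }
}
```

      Querying only names obtained from the enumeration avoids the IllegalArgumentException raised for unsupported measures.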
    • graph

      public String graph() throws Exception
      Returns graph describing the classifier (if possible).
      Specified by:
      graph in interface Drawable
      Returns:
      the graph of the classifier in dotty format
      Throws:
      Exception - if the classifier cannot be graphed
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class AbstractClassifier
      Returns:
      the revision
    • main

      public static void main(String[] argv)
      Main method for testing this class.
      Parameters:
      argv - should contain the following arguments: -t training file [-T test file] [-c class index]