Class NumericCleaner

All Implemented Interfaces:
Serializable, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler, WeightedAttributesHandler, WeightedInstancesHandler, StreamableFilter

public class NumericCleaner extends SimpleStreamFilter implements WeightedAttributesHandler, WeightedInstancesHandler
A filter that 'cleanses' numeric data: values that are too small, too big, or very close to a certain value are replaced by a pre-defined default.

Valid options are:

 -output-debug-info
  Turns on output of debugging information.
 
 -min <double>
  The minimum threshold. (default -Double.MAX_VALUE)
 
 -min-default <double>
  The replacement for values smaller than the minimum threshold.
  (default -Double.MAX_VALUE)
 
 -max <double>
  The maximum threshold above which values are replaced by the corresponding default.
  (default Double.MAX_VALUE)
 
 -max-default <double>
  The replacement for values larger than the maximum threshold.
  (default Double.MAX_VALUE)
 
 -closeto <double>
  The value with respect to which closeness is determined. (default 0)
 
 -closeto-default <double>
  The replacement for values that are too close to '-closeto'.
  (default 0)
 
 -closeto-tolerance <double>
  The tolerance for testing whether a value is too close. (default 1E-6)
 
 -decimals <int>
  The number of decimals to round to, -1 means no rounding at all.
  (default -1)
 
 -R <col1,col2,...>
  The list of columns to cleanse, e.g., first-last or first-3,5-last.
  (default first-last)
 
 -V
  Inverts the matching sense.
 
 -include-class
  Whether to include the class in the cleansing.
  The class column will always be skipped if this flag is not
  present. (default no)
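
The per-value effect of the options above can be sketched in plain Java. This is a minimal illustration, not the filter's actual source: the class name `CleanseSketch`, its field names, and the order in which the checks run (thresholds, then closeness, then rounding) are assumptions. The field defaults mirror the documented option defaults.

```java
/** Hypothetical stand-in for the per-value cleansing described by the options above. */
final class CleanseSketch {
    // Defaults copied from the documented option defaults.
    double min = -Double.MAX_VALUE, minDefault = -Double.MAX_VALUE;
    double max = Double.MAX_VALUE, maxDefault = Double.MAX_VALUE;
    double closeTo = 0, closeToDefault = 0, tolerance = 1E-6;
    int decimals = -1; // -1 means no rounding

    double cleanse(double v) {
        if (Double.isNaN(v)) {
            return v; // assumption: NaN marks a missing value and is left alone
        }
        if (v < min) {
            v = minDefault;          // too small: use -min-default
        } else if (v > max) {
            v = maxDefault;          // too big: use -max-default
        }
        if (Math.abs(v - closeTo) < tolerance) {
            v = closeToDefault;      // too close to -closeto: use -closeto-default
        }
        if (decimals > -1) {
            double factor = Math.pow(10, decimals);
            v = Math.round(v * factor) / factor; // round to the requested decimals
        }
        return v;
    }
}
```

With the defaults, only values within 1E-6 of 0 change (they become 0). Setting `min = 0`, `minDefault = 0`, `max = 100`, `maxDefault = 100` clamps values into the range 0 to 100.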
 
Version:
$Revision: 14508 $
Author:
fracpete (fracpete at waikato dot ac dot nz)
  • Constructor Details

    • NumericCleaner

      public NumericCleaner()
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing this filter.
      Specified by:
      globalInfo in class SimpleFilter
      Returns:
      a description of the filter suitable for displaying in the explorer/experimenter gui
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class Filter
      Returns:
      an enumeration of all the available options.
    • getOptions

      public String[] getOptions()
      Gets the current settings of the filter.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class Filter
      Returns:
      an array of strings suitable for passing to setOptions
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      Valid options are:

       -output-debug-info
        Turns on output of debugging information.
       
       -min <double>
        The minimum threshold. (default -Double.MAX_VALUE)
       
       -min-default <double>
        The replacement for values smaller than the minimum threshold.
        (default -Double.MAX_VALUE)
       
       -max <double>
        The maximum threshold above which values are replaced by the corresponding default.
        (default Double.MAX_VALUE)
       
       -max-default <double>
        The replacement for values larger than the maximum threshold.
        (default Double.MAX_VALUE)
       
       -closeto <double>
        The value with respect to which closeness is determined. (default 0)
       
       -closeto-default <double>
        The replacement for values that are too close to '-closeto'.
        (default 0)
       
       -closeto-tolerance <double>
        The tolerance for testing whether a value is too close. (default 1E-6)
       
       -decimals <int>
        The number of decimals to round to, -1 means no rounding at all.
        (default -1)
       
       -R <col1,col2,...>
        The list of columns to cleanse, e.g., first-last or first-3,5-last.
        (default first-last)
       
       -V
        Inverts the matching sense.
       
       -include-class
        Whether to include the class in the cleansing.
        The class column will always be skipped if this flag is not
        present. (default no)
       
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class Filter
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
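
As an illustration of the array shape that setOptions expects, the following sketch assembles an option array from the flags listed above. The class `NumericCleanerOptionsSketch` and the helper `clampOptions` are hypothetical names introduced here. The sketch only builds the strings; it does not call into the filter.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that assembles an option array for setOptions(String[]). */
final class NumericCleanerOptionsSketch {
    /**
     * Builds options that clamp values to [min, max], round to the given
     * number of decimals, and restrict cleansing to the given column list.
     */
    static String[] clampOptions(double min, double max, int decimals, String columns) {
        List<String> opts = new ArrayList<>();
        opts.add("-min");         opts.add(Double.toString(min));
        opts.add("-min-default"); opts.add(Double.toString(min));
        opts.add("-max");         opts.add(Double.toString(max));
        opts.add("-max-default"); opts.add(Double.toString(max));
        opts.add("-decimals");    opts.add(Integer.toString(decimals));
        opts.add("-R");           opts.add(columns);
        return opts.toArray(new String[0]);
    }
}
```

Each flag and its value occupy separate array elements, matching the "array of strings" form that getOptions returns.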
    • getCapabilities

      public Capabilities getCapabilities()
      Returns the Capabilities of this filter.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Overrides:
      getCapabilities in class Filter
      Returns:
      the capabilities of this object
    • minThresholdTipText

      public String minThresholdTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getMinThreshold

      public double getMinThreshold()
      Get the minimum threshold.
      Returns:
      the minimum threshold.
    • setMinThreshold

      public void setMinThreshold(double value)
      Set the minimum threshold.
      Parameters:
      value - the minimum threshold to use.
    • minDefaultTipText

      public String minDefaultTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getMinDefault

      public double getMinDefault()
      Get the minimum default.
      Returns:
      the minimum default.
    • setMinDefault

      public void setMinDefault(double value)
      Set the minimum default.
      Parameters:
      value - the minimum default to use.
    • maxThresholdTipText

      public String maxThresholdTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getMaxThreshold

      public double getMaxThreshold()
      Get the maximum threshold.
      Returns:
      the maximum threshold.
    • setMaxThreshold

      public void setMaxThreshold(double value)
      Set the maximum threshold.
      Parameters:
      value - the maximum threshold to use.
    • maxDefaultTipText

      public String maxDefaultTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getMaxDefault

      public double getMaxDefault()
      Get the maximum default.
      Returns:
      the maximum default.
    • setMaxDefault

      public void setMaxDefault(double value)
      Set the maximum default.
      Parameters:
      value - the maximum default to use.
    • closeToTipText

      public String closeToTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getCloseTo

      public double getCloseTo()
      Get the "close to" number.
      Returns:
      the "close to" number.
    • setCloseTo

      public void setCloseTo(double value)
      Set the "close to" number.
      Parameters:
      value - the number to use for checking closeness.
    • closeToDefaultTipText

      public String closeToDefaultTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getCloseToDefault

      public double getCloseToDefault()
      Get the "close to" default.
      Returns:
      the "close to" default.
    • setCloseToDefault

      public void setCloseToDefault(double value)
      Set the "close to" default.
      Parameters:
      value - the "close to" default to use.
    • closeToToleranceTipText

      public String closeToToleranceTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getCloseToTolerance

      public double getCloseToTolerance()
      Get the "close to" tolerance.
      Returns:
      the "close to" tolerance.
    • setCloseToTolerance

      public void setCloseToTolerance(double value)
      Set the "close to" tolerance.
      Parameters:
      value - the "close to" tolerance to use.
    • attributeIndicesTipText

      public String attributeIndicesTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getAttributeIndices

      public String getAttributeIndices()
      Gets the selection of the columns, e.g., first-last or first-3,5-last
      Returns:
      the selected indices
    • setAttributeIndices

      public void setAttributeIndices(String value)
      Sets the columns to use, e.g., first-last or first-3,5-last
      Parameters:
      value - the columns to use
    • invertSelectionTipText

      public String invertSelectionTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getInvertSelection

      public boolean getInvertSelection()
      Gets whether the selection of the columns is inverted
      Returns:
      true if the selection is inverted
    • setInvertSelection

      public void setInvertSelection(boolean value)
      Sets whether the selection of the indices is inverted or not
      Parameters:
      value - the new invert setting
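
To make the column-list syntax and the invert flag concrete, here is a small plain-Java sketch that expands a specification such as first-3,5-last into a per-column selection. It is an illustration only: `ColumnRangeSketch`, `resolve`, and `select` are hypothetical names, the sketch assumes 1-based column numbers in the string, and it does no error handling for malformed input.

```java
/** Hypothetical expansion of a column list such as "first-3,5-last". */
final class ColumnRangeSketch {
    /** Maps "first", "last", or a 1-based number to a 0-based index. */
    static int resolve(String token, int numColumns) {
        token = token.trim();
        if (token.equals("first")) {
            return 0;
        }
        if (token.equals("last")) {
            return numColumns - 1;
        }
        return Integer.parseInt(token) - 1;
    }

    /** Returns one flag per column; true means the column is selected. */
    static boolean[] select(String spec, int numColumns, boolean invert) {
        boolean[] selected = new boolean[numColumns];
        for (String part : spec.split(",")) {
            String[] ends = part.split("-");
            int from = resolve(ends[0], numColumns);
            int to = resolve(ends[ends.length - 1], numColumns); // single column: from == to
            for (int i = Math.max(from, 0); i <= Math.min(to, numColumns - 1); i++) {
                selected[i] = true;
            }
        }
        if (invert) {
            for (int i = 0; i < numColumns; i++) {
                selected[i] = !selected[i]; // invert the matching sense
            }
        }
        return selected;
    }
}
```

For six columns, first-3,5-last selects every column except the fourth. With the selection inverted, only the fourth column is selected.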
    • includeClassTipText

      public String includeClassTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getIncludeClass

      public boolean getIncludeClass()
      Gets whether the class is included in the cleaning process or always skipped.
      Returns:
      true if the class can be considered for cleaning.
    • setIncludeClass

      public void setIncludeClass(boolean value)
      Sets whether the class can be cleaned, too.
      Parameters:
      value - true if the class can be cleansed, too
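
How the include-class flag interacts with the column selection can be sketched as a single predicate. This is an assumed reading of the documented behaviour rather than the filter's code. `ClassColumnSketch` and `shouldCleanse` are hypothetical names, and `selected` stands for the already-resolved column selection.

```java
/** Hypothetical predicate: should a given column be cleansed? */
final class ClassColumnSketch {
    static boolean shouldCleanse(int column, boolean[] selected, int classIndex, boolean includeClass) {
        if (!selected[column]) {
            return false; // column is outside the chosen range
        }
        if (column == classIndex && !includeClass) {
            return false; // class column is skipped unless explicitly included
        }
        return true;
    }
}
```

Under this reading, selecting first-last without the include-class flag still leaves the class column untouched.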
    • decimalsTipText

      public String decimalsTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getDecimals

      public int getDecimals()
      Get the number of decimals to round to.
      Returns:
      the number of decimals.
    • setDecimals

      public void setDecimals(int value)
      Set the number of decimals to round to.
      Parameters:
      value - the number of decimals.
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class Filter
      Returns:
      the revision
    • main

      public static void main(String[] args)
      Runs the filter from the command line; use "-h" to see all options.
      Parameters:
      args - the commandline options for the filter