Class NumericCleaner
java.lang.Object
weka.filters.Filter
weka.filters.SimpleFilter
weka.filters.SimpleStreamFilter
weka.filters.unsupervised.attribute.NumericCleaner
- All Implemented Interfaces:
Serializable, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler, WeightedAttributesHandler, WeightedInstancesHandler, StreamableFilter
public class NumericCleaner
extends SimpleStreamFilter
implements WeightedAttributesHandler, WeightedInstancesHandler
A filter that 'cleanses' the numeric data from
values that are too small, too big or very close to a certain value,
and sets these values to a pre-defined default.
Valid options are:
-output-debug-info Turns on output of debugging information.
-min <double> The minimum threshold below which values are replaced by the corresponding default. (default -Double.MAX_VALUE)
-min-default <double> The replacement for values smaller than the minimum threshold. (default -Double.MAX_VALUE)
-max <double> The maximum threshold above which values are replaced by the corresponding default. (default Double.MAX_VALUE)
-max-default <double> The replacement for values larger than the maximum threshold. (default Double.MAX_VALUE)
-closeto <double> The value with respect to which closeness is determined. (default 0)
-closeto-default <double> The replacement for values that are too close to '-closeto'. (default 0)
-closeto-tolerance <double> The tolerance for testing whether a value is too close. (default 1E-6)
-decimals <int> The number of decimals to round to, -1 means no rounding at all. (default -1)
-R <col1,col2,...> The list of columns to cleanse, e.g., first-last or first-3,5-last. (default first-last)
-V Inverts the matching sense.
-include-class Whether to include the class in the cleansing. The class column will always be skipped if this flag is not present. (default no)
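The options above describe three replacement rules for a single numeric value. The following is a minimal sketch of those rules in plain Java, not Weka's implementation: the class `CleanseSketch` and its method are hypothetical, and the order of the checks, the strict comparisons, and the treatment of a value exactly equal to the "close to" number (left untouched) are assumptions.

```java
/** Hypothetical helper illustrating the documented rules; not part of Weka. */
final class CleanseSketch {

    /**
     * Applies the three replacement rules to one value.
     * Parameters mirror -min, -min-default, -max, -max-default,
     * -closeto, -closeto-default and -closeto-tolerance.
     */
    static double cleanse(double v,
                          double min, double minDefault,
                          double max, double maxDefault,
                          double closeTo, double closeToDefault,
                          double tolerance) {
        if (v < min) {
            return minDefault;      // too small
        }
        if (v > max) {
            return maxDefault;      // too big
        }
        // Assumption: "too close" means within the tolerance but not equal.
        if (v != closeTo && Math.abs(v - closeTo) < tolerance) {
            return closeToDefault;  // too close
        }
        return v;                   // value is left unchanged
    }

    public static void main(String[] args) {
        // Thresholds 0..100, "close to" 0 with tolerance 1E-6.
        System.out.println(cleanse(-5.0, 0.0, 0.0, 100.0, 100.0, 0.0, 0.0, 1e-6));  // prints 0.0
        System.out.println(cleanse(250.0, 0.0, 0.0, 100.0, 100.0, 0.0, 0.0, 1e-6)); // prints 100.0
        System.out.println(cleanse(1e-9, 0.0, 0.0, 100.0, 100.0, 0.0, 0.0, 1e-6));  // prints 0.0
        System.out.println(cleanse(42.5, 0.0, 0.0, 100.0, 100.0, 0.0, 0.0, 1e-6));  // prints 42.5
    }
}
```

With the default thresholds (-Double.MAX_VALUE and Double.MAX_VALUE) the first two rules never fire for finite values, so only the closeness rule has an effect in this sketch.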
- Version:
- $Revision: 14508 $
- Author:
- fracpete (fracpete at waikato dot ac dot nz)
-
Constructor Summary
-
Method Summary
Modifier and Type, Method, Description
String attributeIndicesTipText() Returns the tip text for this property
String closeToDefaultTipText() Returns the tip text for this property
String closeToTipText() Returns the tip text for this property
String closeToToleranceTipText() Returns the tip text for this property
String decimalsTipText() Returns the tip text for this property
String getAttributeIndices() Gets the selection of the columns, e.g., first-last or first-3,5-last
Capabilities getCapabilities() Returns the Capabilities of this filter.
double getCloseTo() Get the "close to" number.
double getCloseToDefault() Get the "close to" default.
double getCloseToTolerance() Get the "close to" tolerance.
int getDecimals() Get the number of decimals to round to.
boolean getIncludeClass() Gets whether the class is included in the cleaning process or always skipped.
boolean getInvertSelection() Gets whether the selection of the columns is inverted
double getMaxDefault() Get the maximum default.
double getMaxThreshold() Get the maximum threshold.
double getMinDefault() Get the minimum default.
double getMinThreshold() Get the minimum threshold.
String[] getOptions() Gets the current settings of the filter.
String getRevision() Returns the revision string.
String globalInfo() Returns a string describing this filter.
String includeClassTipText() Returns the tip text for this property
String invertSelectionTipText() Returns the tip text for this property
Enumeration<Option> listOptions() Returns an enumeration describing the available options.
static void main(String[] args) Runs the filter from commandline, use "-h" to see all options.
String maxDefaultTipText() Returns the tip text for this property
String maxThresholdTipText() Returns the tip text for this property
String minDefaultTipText() Returns the tip text for this property
String minThresholdTipText() Returns the tip text for this property
void setAttributeIndices(String value) Sets the columns to use, e.g., first-last or first-3,5-last
void setCloseTo(double value) Set the "close to" number.
void setCloseToDefault(double value) Set the "close to" default.
void setCloseToTolerance(double value) Set the "close to" tolerance.
void setDecimals(int value) Set the number of decimals to round to.
void setIncludeClass(boolean value) Sets whether the class can be cleaned, too.
void setInvertSelection(boolean value) Sets whether the selection of the indices is inverted or not
void setMaxDefault(double value) Set the maximum default.
void setMaxThreshold(double value) Set the maximum threshold.
void setMinDefault(double value) Set the minimum default.
void setMinThreshold(double value) Set the minimum threshold.
void setOptions(String[] options) Parses a given list of options.
Methods inherited from class weka.filters.SimpleStreamFilter
batchFinished, input
Methods inherited from class weka.filters.SimpleFilter
setInputFormat
Methods inherited from class weka.filters.Filter
batchFilterFile, debugTipText, doNotCheckCapabilitiesTipText, filterFile, getCapabilities, getCopyOfInputFormat, getDebug, getDoNotCheckCapabilities, getOutputFormat, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputPeek, postExecution, preExecution, run, runFilter, setDebug, setDoNotCheckCapabilities, toString, useFilter, wekaStaticWrapper
-
Constructor Details
-
NumericCleaner
public NumericCleaner()
-
-
Method Details
-
globalInfo
public String globalInfo()
Returns a string describing this filter.
- Specified by:
globalInfo in class SimpleFilter
- Returns:
- a description of the filter suitable for displaying in the explorer/experimenter GUI
-
listOptions
public Enumeration<Option> listOptions()
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class Filter
- Returns:
- an enumeration of all the available options.
-
getOptions
public String[] getOptions()
Gets the current settings of the filter.
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class Filter
- Returns:
- an array of strings suitable for passing to setOptions
-
setOptions
public void setOptions(String[] options) throws Exception
Parses a given list of options. Valid options are:
-output-debug-info Turns on output of debugging information.
-min <double> The minimum threshold below which values are replaced by the corresponding default. (default -Double.MAX_VALUE)
-min-default <double> The replacement for values smaller than the minimum threshold. (default -Double.MAX_VALUE)
-max <double> The maximum threshold above which values are replaced by the corresponding default. (default Double.MAX_VALUE)
-max-default <double> The replacement for values larger than the maximum threshold. (default Double.MAX_VALUE)
-closeto <double> The value with respect to which closeness is determined. (default 0)
-closeto-default <double> The replacement for values that are too close to '-closeto'. (default 0)
-closeto-tolerance <double> The tolerance for testing whether a value is too close. (default 1E-6)
-decimals <int> The number of decimals to round to, -1 means no rounding at all. (default -1)
-R <col1,col2,...> The list of columns to cleanse, e.g., first-last or first-3,5-last. (default first-last)
-V Inverts the matching sense.
-include-class Whether to include the class in the cleansing. The class column will always be skipped if this flag is not present. (default no)
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class Filter
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
-
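Since setOptions takes the options as an array of strings, each flag and its value go into separate array elements. The sketch below assembles such an array in plain Java; the class `CleanerOptionsSketch` and its `buildOptions` method are hypothetical and cover only a few of the flags listed above. The resulting array is the kind of value that would be passed to `setOptions(String[])`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that assembles an option array; not part of Weka. */
final class CleanerOptionsSketch {

    /** Builds an option array using a subset of the documented flags. */
    static String[] buildOptions(double min, double minDefault,
                                 int decimals, String columns,
                                 boolean includeClass) {
        List<String> opts = new ArrayList<>();
        opts.add("-min");
        opts.add(Double.toString(min));
        opts.add("-min-default");
        opts.add(Double.toString(minDefault));
        opts.add("-decimals");
        opts.add(Integer.toString(decimals));
        opts.add("-R");
        opts.add(columns);
        if (includeClass) {
            opts.add("-include-class"); // a flag without a value
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(0.0, 0.0, 2, "first-last", true);
        // prints: -min 0.0 -min-default 0.0 -decimals 2 -R first-last -include-class
        System.out.println(String.join(" ", options));
    }
}
```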
getCapabilities
public Capabilities getCapabilities()
Returns the Capabilities of this filter.
- Specified by:
getCapabilities in interface CapabilitiesHandler
- Overrides:
getCapabilities in class Filter
- Returns:
- the capabilities of this object
-
minThresholdTipText
public String minThresholdTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getMinThreshold
public double getMinThreshold()
Get the minimum threshold.
- Returns:
- the minimum threshold.
-
setMinThreshold
public void setMinThreshold(double value)
Set the minimum threshold.
- Parameters:
value - the minimum threshold to use.
-
minDefaultTipText
public String minDefaultTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getMinDefault
public double getMinDefault()
Get the minimum default.
- Returns:
- the minimum default.
-
setMinDefault
public void setMinDefault(double value)
Set the minimum default.
- Parameters:
value - the minimum default to use.
-
maxThresholdTipText
public String maxThresholdTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getMaxThreshold
public double getMaxThreshold()
Get the maximum threshold.
- Returns:
- the maximum threshold.
-
setMaxThreshold
public void setMaxThreshold(double value)
Set the maximum threshold.
- Parameters:
value - the maximum threshold to use.
-
maxDefaultTipText
public String maxDefaultTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getMaxDefault
public double getMaxDefault()
Get the maximum default.
- Returns:
- the maximum default.
-
setMaxDefault
public void setMaxDefault(double value)
Set the maximum default.
- Parameters:
value - the maximum default to use.
-
closeToTipText
public String closeToTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getCloseTo
public double getCloseTo()
Get the "close to" number.
- Returns:
- the "close to" number.
-
setCloseTo
public void setCloseTo(double value)
Set the "close to" number.
- Parameters:
value - the number to use for checking closeness.
-
closeToDefaultTipText
public String closeToDefaultTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getCloseToDefault
public double getCloseToDefault()
Get the "close to" default.
- Returns:
- the "close to" default.
-
setCloseToDefault
public void setCloseToDefault(double value)
Set the "close to" default.
- Parameters:
value - the "close to" default to use.
-
closeToToleranceTipText
public String closeToToleranceTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getCloseToTolerance
public double getCloseToTolerance()
Get the "close to" tolerance.
- Returns:
- the "close to" tolerance.
-
setCloseToTolerance
public void setCloseToTolerance(double value)
Set the "close to" tolerance.
- Parameters:
value - the "close to" tolerance to use.
-
attributeIndicesTipText
public String attributeIndicesTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getAttributeIndices
public String getAttributeIndices()
Gets the selection of the columns, e.g., first-last or first-3,5-last.
- Returns:
- the selected indices
-
setAttributeIndices
public void setAttributeIndices(String value)
Sets the columns to use, e.g., first-last or first-3,5-last.
- Parameters:
value - the columns to use
-
invertSelectionTipText
public String invertSelectionTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getInvertSelection
public boolean getInvertSelection()
Gets whether the selection of the columns is inverted.
- Returns:
- true if the selection is inverted
-
setInvertSelection
public void setInvertSelection(boolean value)
Sets whether the selection of the indices is inverted or not.
- Parameters:
value - the new invert setting
-
includeClassTipText
public String includeClassTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getIncludeClass
public boolean getIncludeClass()
Gets whether the class is included in the cleaning process or always skipped.
- Returns:
- true if the class can be considered for cleaning.
-
setIncludeClass
public void setIncludeClass(boolean value)
Sets whether the class can be cleaned, too.
- Parameters:
value - true if the class can be cleansed, too
-
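The attribute indices, the invert flag and the include-class flag together decide which columns are candidates for cleansing. The sketch below shows one way these three settings could combine, assuming 1-based column numbers with the keywords first and last, as in the range examples above. The class `ColumnSelectionSketch` is hypothetical and does no input validation; it does not use Weka's own range parsing, and it ignores the attribute type.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical column-selection sketch; not Weka's range parser. */
final class ColumnSelectionSketch {

    /** Converts "first", "last" or a 1-based number to a 0-based index. */
    static int parseIndex(String token, int numAttributes) {
        String t = token.trim();
        if (t.equals("first")) {
            return 0;
        }
        if (t.equals("last")) {
            return numAttributes - 1;
        }
        return Integer.parseInt(t) - 1;
    }

    /** Returns the 0-based indices of the columns that would be cleansed. */
    static List<Integer> selectedColumns(String ranges, boolean invert,
                                         boolean includeClass, int classIndex,
                                         int numAttributes) {
        boolean[] inRange = new boolean[numAttributes];
        for (String part : ranges.split(",")) {
            String[] bounds = part.split("-");
            int from = parseIndex(bounds[0], numAttributes);
            int to = parseIndex(bounds[bounds.length - 1], numAttributes);
            for (int i = from; i <= to; i++) {
                inRange[i] = true;
            }
        }
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < numAttributes; i++) {
            boolean selected = inRange[i] != invert;            // -V flips the match
            boolean skippedClass = (i == classIndex) && !includeClass;
            if (selected && !skippedClass) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Six attributes, class attribute is the last one (index 5).
        System.out.println(selectedColumns("first-3,5-last", false, false, 5, 6)); // prints [0, 1, 2, 4]
        System.out.println(selectedColumns("first-3,5-last", true, false, 5, 6));  // prints [3]
        System.out.println(selectedColumns("first-3,5-last", false, true, 5, 6));  // prints [0, 1, 2, 4, 5]
    }
}
```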
decimalsTipText
public String decimalsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getDecimals
public int getDecimals()
Get the number of decimals to round to.
- Returns:
- the number of decimals.
-
setDecimals
public void setDecimals(int value)
Set the number of decimals to round to.
- Parameters:
value - the number of decimals.
-
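The decimals setting rounds values to a fixed number of decimal places, with -1 meaning no rounding at all. A minimal sketch of such rounding in plain Java follows. The class `RoundingSketch` is hypothetical, and both the half-up rounding mode and the use of `BigDecimal.valueOf` are assumptions made for illustration rather than documented behavior of the filter. The sketch treats any negative number of decimals as "no rounding".

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical rounding sketch; the rounding mode is an assumption. */
final class RoundingSketch {

    /** Rounds v to the given number of decimals; negative means no rounding. */
    static double round(double v, int decimals) {
        if (decimals < 0 || Double.isNaN(v) || Double.isInfinite(v)) {
            return v; // nothing to round, or not representable as BigDecimal
        }
        return BigDecimal.valueOf(v)
                .setScale(decimals, RoundingMode.HALF_UP)
                .doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(round(3.14159, 2));  // prints 3.14
        System.out.println(round(1.5, 0));      // prints 2.0
        System.out.println(round(3.14159, -1)); // prints 3.14159
    }
}
```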
getRevision
public String getRevision()
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Overrides:
getRevision in class Filter
- Returns:
- the revision
-
main
public static void main(String[] args)
Runs the filter from commandline, use "-h" to see all options.
- Parameters:
args - the commandline options for the filter
-