Class Filter
- All Implemented Interfaces:
Serializable, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler
- Direct Known Subclasses:
AbstractTimeSeries, Add, AddCluster, AddExpression, AddID, AddNoise, AddUserFields, AddValues, AllFilter, AttributeSelection, ChangeDateFormat, ClassOrder, ClusterMembership, Copy, Discretize, FirstOrder, MakeIndicator, MergeTwoValues, NominalToBinary, NominalToBinary, NominalToString, NonSparseToSparse, NumericTransform, Obfuscate, PartitionMembership, PotentialClassIgnorer, PrincipalComponents, Randomize, RandomProjection, Remove, RemoveFolds, RemoveFrequentValues, RemoveMisclassified, RemovePercentage, RemoveRange, RemoveType, RemoveUseless, RemoveWithValues, RenameNominalValues, RenameRelation, Reorder, Resample, Resample, ReservoirSample, SimpleFilter, SparseToNonSparse, SpreadSubsample, StratifiedRemoveFolds, StringToNominal, StringToWordVector, SwapValues
A simple example of filter use. This example doesn't remove instances from the output queue until all instances have been input, so it uses more memory than an approach that consumes output instances as they become available:
Filter filter = ..some type of filter..
Instances data = ..some instances..
filter.setInputFormat(data); // the input format must be set before any call to input()
for (int i = 0; i < data.numInstances(); i++) {
  filter.input(data.instance(i));
}
filter.batchFinished();
Instances newData = filter.getOutputFormat();
Instance processed;
while ((processed = filter.output()) != null) {
  newData.add(processed);
}
..do something with newData..
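The lower-memory alternative mentioned above can be sketched as follows. This is a minimal illustration, not the library's own code: the class name StreamingFilterSketch, the helper names, the choice of the Remove filter, and the tiny dataset are all assumptions made for the example. It collects each filtered instance as soon as input() reports that one is available, so the filter's output queue does not grow with the size of the data.

```java
import java.util.ArrayList;

import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;

// Hypothetical example class; not part of Weka.
public class StreamingFilterSketch {

  /** Pushes data through the filter, collecting output as soon as it is available. */
  static Instances streamThrough(Filter filter, Instances data) throws Exception {
    filter.setInputFormat(data);
    Instances result = null;
    for (int i = 0; i < data.numInstances(); i++) {
      // input() returns true when a filtered instance can be collected right away.
      if (filter.input(data.instance(i))) {
        if (result == null) {
          result = filter.getOutputFormat();
        }
        result.add(filter.output());
      }
    }
    // Filters that need the whole batch only release their output after this call.
    filter.batchFinished();
    if (result == null) {
      result = filter.getOutputFormat();
    }
    Instance processed;
    while ((processed = filter.output()) != null) {
      result.add(processed);
    }
    return result;
  }

  /** Builds a tiny two-attribute dataset (illustrative values only). */
  static Instances demoData() {
    ArrayList<Attribute> attrs = new ArrayList<>();
    attrs.add(new Attribute("a"));
    attrs.add(new Attribute("b"));
    Instances data = new Instances("demo", attrs, 3);
    data.add(new DenseInstance(1.0, new double[] {1.0, 10.0}));
    data.add(new DenseInstance(1.0, new double[] {2.0, 20.0}));
    data.add(new DenseInstance(1.0, new double[] {3.0, 30.0}));
    return data;
  }

  public static void main(String[] args) throws Exception {
    Remove remove = new Remove();
    remove.setAttributeIndices("1"); // drop the first attribute
    System.out.println(streamThrough(remove, demoData()));
  }
}
```

In a real streaming setting the collected instances would typically be written out or passed on immediately rather than accumulated in a result set as done here.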
- Version:
- $Revision: 14804 $
- Author:
- Len Trigg (trigg@cs.waikato.ac.nz)
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
static void | batchFilterFile(Filter filter, String[] options) | Method for testing a filter's ability to process multiple batches.
boolean | batchFinished() | Signify that this batch of input to the filter is finished.
String | debugTipText() | Returns the tip text for this property.
String | doNotCheckCapabilitiesTipText() | Returns the tip text for this property.
static void | filterFile(Filter filter, String[] options) | Method for testing filters.
Capabilities | getCapabilities() | Returns the Capabilities of this filter.
Capabilities | getCapabilities(Instances data) | Returns the Capabilities of this filter, customized based on the data.
Instances | getCopyOfInputFormat() | Gets a copy of just the structure of the input format instances.
boolean | getDebug() | Get whether debugging is turned on.
boolean | getDoNotCheckCapabilities() | Get whether capabilities checking is turned off.
String[] | getOptions() | Gets the current settings of the filter.
Instances | getOutputFormat() | Gets the format of the output instances.
String | getRevision() | Returns the revision string.
boolean | input(Instance instance) | Input an instance for filtering.
boolean | isFirstBatchDone() | Returns true if the first batch of instances got processed.
boolean | isNewBatch() | Returns true if a new batch was started, either because a new instance of the filter was created or because batchFinished() was called.
boolean | isOutputFormatDefined() | Returns whether the output format is ready to be collected.
Enumeration<Option> | listOptions() | Returns an enumeration describing the available options.
static void | main(String[] args) | Main method for testing this class.
static Filter[] | makeCopies(Filter model, int num) | Creates a given number of deep copies of the given filter using serialization.
static Filter | makeCopy(Filter model) | Creates a deep copy of the given filter using serialization.
boolean | mayRemoveInstanceAfterFirstBatchDone() | Default implementation returns false.
int | numPendingOutput() | Returns the number of instances pending output.
Instance | output() | Output an instance after filtering and remove it from the output queue.
Instance | outputPeek() | Output an instance after filtering but do not remove it from the output queue.
void | postExecution() | Perform any teardown that might need to happen after execution.
void | preExecution() | Perform any setup that might need to happen before command-line execution.
void | run(Object toRun, String[] options) | Execute the supplied object.
static void | runFilter(Filter filter, String[] options) | Runs the filter instance with the given options.
void | setDebug(boolean debug) | Set debugging mode.
void | setDoNotCheckCapabilities(boolean doNotCheckCapabilities) | Set whether capabilities checking is turned off.
boolean | setInputFormat(Instances instanceInfo) | Sets the format of the input instances.
void | setOptions(String[] options) | Parses a given list of options.
String | toString() | Returns a description of the filter, by default only the class name.
static Instances | useFilter(Instances data, Filter filter) | Filters an entire set of instances through a filter and returns the new set.
static String | wekaStaticWrapper(Sourcable filter, String className, Instances input, Instances output) | Generates source code from the filter.
-
Constructor Details
-
Filter
public Filter()
-
-
Method Details
-
isNewBatch
public boolean isNewBatch()
Returns true if a new batch was started, either because a new instance of the filter was created or because batchFinished() was called.
- Returns:
- true if a new batch has been initiated
- See Also:
- m_NewBatch
- batchFinished()
-
isFirstBatchDone
public boolean isFirstBatchDone()
Returns true if the first batch of instances got processed. Necessary for supervised filters, which "learn" from the first batch and then shouldn't get updated with subsequent calls of batchFinished().
- Returns:
- true if the first batch has been processed
- See Also:
- m_FirstBatchDone
- batchFinished()
-
mayRemoveInstanceAfterFirstBatchDone
public boolean mayRemoveInstanceAfterFirstBatchDone()
Default implementation returns false. Some filters cannot produce an output instance for every instance input after the first batch has been completed. Such filters should override this method and return true.
- Returns:
- false by default
-
getCapabilities
public Capabilities getCapabilities()
Returns the Capabilities of this filter. Derived filters have to override this method to enable capabilities.
- Specified by:
- getCapabilities in interface CapabilitiesHandler
- Returns:
- the capabilities of this object
-
getRevision
public String getRevision()
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Returns:
- the revision
-
getCapabilities
public Capabilities getCapabilities(Instances data)
Returns the Capabilities of this filter, customized based on the data. That is, it removes all class capabilities if no class attribute is present, or removes the NO_CLASS capability if a class attribute is present.
- Parameters:
data - the data to use for customization
- Returns:
- the capabilities of this object, based on the data
-
getCopyOfInputFormat
public Instances getCopyOfInputFormat()
Gets a copy of just the structure of the input format instances.
- Returns:
- a copy of the structure (attribute information) of the input format instances
-
setInputFormat
public boolean setInputFormat(Instances instanceInfo) throws Exception
Sets the format of the input instances. If the filter is able to determine the output format before seeing any input instances, it does so here. This default implementation clears the output format and output queue, and sets the new batch flag. Overriding methods should call super.setInputFormat(Instances).
- Parameters:
instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored; only the structure is required).
- Returns:
- true if the outputFormat may be collected immediately
- Throws:
Exception - if the inputFormat can't be set successfully
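A short sketch of how the return value might be used: when setInputFormat reports true, the output header can be fetched before any instance has been input. The class InputFormatSketch, its helper method, and the use of the Remove filter are assumptions chosen for illustration only.

```java
import java.util.ArrayList;

import weka.core.Attribute;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;

// Hypothetical example class; not part of Weka.
public class InputFormatSketch {

  /** Returns the output header if the filter can determine it up front, else null. */
  static Instances outputHeaderIfReady(Filter filter, Instances header) throws Exception {
    boolean ready = filter.setInputFormat(header);
    return ready ? filter.getOutputFormat() : null;
  }

  /** Builds an empty two-attribute header (only the structure matters here). */
  static Instances demoHeader() {
    ArrayList<Attribute> attrs = new ArrayList<>();
    attrs.add(new Attribute("a"));
    attrs.add(new Attribute("b"));
    return new Instances("demo", attrs, 0);
  }

  public static void main(String[] args) throws Exception {
    Remove remove = new Remove();
    remove.setAttributeIndices("1"); // drop the first attribute
    Instances outHeader = outputHeaderIfReady(remove, demoHeader());
    System.out.println(outHeader == null ? "output format not yet known" : outHeader.toString());
  }
}
```

A filter that needs to see a full batch before it knows its output structure would return false here, in which case the header has to be fetched after batchFinished() instead.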
-
getOutputFormat
public Instances getOutputFormat()
Gets the format of the output instances. This should only be called after input() or batchFinished() has returned true. The relation name of the output instances should be changed to reflect the action of the filter (e.g., by adding the filter name and options).
- Returns:
- an Instances object containing the output instance structure only.
- Throws:
NullPointerException - if no input structure has been defined (or the output format hasn't been determined yet)
-
input
public boolean input(Instance instance) throws Exception
Input an instance for filtering. Ordinarily the instance is processed and made available for output immediately. Some filters require all instances to be read before producing output, in which case output instances should be collected after calling batchFinished(). If the input marks the start of a new batch, the output queue is cleared. This default implementation assumes all instance conversion will occur when batchFinished() is called.
- Parameters:
instance - the input instance
- Returns:
- true if the filtered instance may now be collected with output().
- Throws:
NullPointerException - if the input format has not been defined.
Exception - if the input instance was not of the correct format or if there was a problem with the filtering.
-
batchFinished
public boolean batchFinished() throws Exception
Signify that this batch of input to the filter is finished. If the filter requires all instances prior to filtering, output() may now be called to retrieve the filtered instances. Any subsequent instances should be filtered based on the settings obtained from the first batch (unless the inputFormat has been re-assigned or new options have been set). This default implementation assumes all instance processing occurs during setInputFormat() and input().
- Returns:
- true if there are instances pending output
- Throws:
NullPointerException - if no input structure has been defined.
Exception - if there was a problem finishing the batch.
-
output
public Instance output()
Output an instance after filtering and remove it from the output queue.
- Returns:
- the instance that has most recently been filtered (or null if the queue is empty).
- Throws:
NullPointerException - if no output structure has been defined
-
outputPeek
public Instance outputPeek()
Output an instance after filtering but do not remove it from the output queue.
- Returns:
- the instance that has most recently been filtered (or null if the queue is empty).
- Throws:
NullPointerException - if no input structure has been defined
-
numPendingOutput
public int numPendingOutput()
Returns the number of instances pending output.
- Returns:
- the number of instances pending output
- Throws:
NullPointerException - if no input structure has been defined
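The difference between output(), outputPeek() and numPendingOutput() can be illustrated with a small sketch. The class OutputQueueSketch, its method name, and the choice of the Remove filter (which makes each instance available immediately after input) are assumptions made for this example.

```java
import java.util.ArrayList;

import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;
import weka.filters.unsupervised.attribute.Remove;

// Hypothetical example class; not part of Weka.
public class OutputQueueSketch {

  /** Returns the pending-output counts after input, after a peek, and after output(). */
  static int[] pendingCounts() throws Exception {
    ArrayList<Attribute> attrs = new ArrayList<>();
    attrs.add(new Attribute("a"));
    attrs.add(new Attribute("b"));
    Instances data = new Instances("demo", attrs, 1);
    data.add(new DenseInstance(1.0, new double[] {1.0, 10.0}));

    Remove remove = new Remove();
    remove.setAttributeIndices("1"); // drop the first attribute
    remove.setInputFormat(data);

    remove.input(data.instance(0));
    int afterInput = remove.numPendingOutput();

    Instance peeked = remove.outputPeek(); // inspects the queue without changing it
    int afterPeek = remove.numPendingOutput();

    Instance taken = remove.output(); // removes the instance from the queue
    int afterOutput = remove.numPendingOutput();

    return new int[] {afterInput, afterPeek, afterOutput};
  }

  public static void main(String[] args) throws Exception {
    int[] counts = pendingCounts();
    System.out.println(counts[0] + " " + counts[1] + " " + counts[2]);
  }
}
```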
-
isOutputFormatDefined
public boolean isOutputFormatDefined()
Returns whether the output format is ready to be collected.
- Returns:
- true if the output format is set
-
makeCopy
public static Filter makeCopy(Filter model) throws Exception
Creates a deep copy of the given filter using serialization.
- Parameters:
model - the filter to copy
- Returns:
- a deep copy of the filter
- Throws:
Exception - if an error occurs
-
makeCopies
public static Filter[] makeCopies(Filter model, int num) throws Exception
Creates a given number of deep copies of the given filter using serialization.
- Parameters:
model - the filter to copy
num - the number of filter copies to create.
- Returns:
- an array of filters.
- Throws:
Exception - if an error occurs
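As a rough illustration of what "deep copy" means for makeCopy and makeCopies, the sketch below changes a setting on a copy and checks that the original is unaffected. The class FilterCopySketch, its method names, and the use of the Remove filter are assumptions for the example.

```java
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;

// Hypothetical example class; not part of Weka.
public class FilterCopySketch {

  /** Returns true if reconfiguring a copy leaves the original filter unchanged. */
  static boolean copyIsIndependent() throws Exception {
    Remove original = new Remove();
    original.setAttributeIndices("1");

    // makeCopy returns a Filter; the runtime type matches the model's type.
    Remove copy = (Remove) Filter.makeCopy(original);
    copy.setAttributeIndices("2");

    return original.getAttributeIndices().equals("1")
        && copy.getAttributeIndices().equals("2");
  }

  /** Returns how many copies makeCopies produced for the requested number. */
  static int numberOfCopies(int num) throws Exception {
    Filter[] copies = Filter.makeCopies(new Remove(), num);
    return copies.length;
  }

  public static void main(String[] args) throws Exception {
    System.out.println("independent: " + copyIsIndependent());
    System.out.println("copies: " + numberOfCopies(3));
  }
}
```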
-
useFilter
public static Instances useFilter(Instances data, Filter filter) throws Exception
Filters an entire set of instances through a filter and returns the new set.
- Parameters:
data - the data to be filtered
filter - the filter to be used
- Returns:
- the filtered set of data
- Throws:
Exception - if the filter can't be used successfully
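A common way to use useFilter is to pass a training set first and a test set second through the same filter object, so that the second batch is transformed with the settings learned from the first. The sketch below assumes the Normalize filter with its default settings; the class UseFilterSketch, its helpers, and the data values are made up for illustration.

```java
import java.util.ArrayList;

import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Normalize;

// Hypothetical example class; not part of Weka.
public class UseFilterSketch {

  /** Builds a dataset with a single numeric attribute "x" holding the given values. */
  static Instances singleAttributeData(double... values) {
    ArrayList<Attribute> attrs = new ArrayList<>();
    attrs.add(new Attribute("x"));
    Instances data = new Instances("demo", attrs, values.length);
    for (double v : values) {
      data.add(new DenseInstance(1.0, new double[] {v}));
    }
    return data;
  }

  /** Normalizes a test value using the range observed in the training batch. */
  static double normalizedTestValue() throws Exception {
    Instances train = singleAttributeData(0.0, 10.0);
    Instances test = singleAttributeData(5.0);

    Normalize normalize = new Normalize();
    normalize.setInputFormat(train);

    // First batch: the filter learns the attribute range from the training data.
    Instances newTrain = Filter.useFilter(train, normalize);
    // Later batch: the same filter object reuses the range from the first batch.
    Instances newTest = Filter.useFilter(test, normalize);

    return newTest.instance(0).value(0);
  }

  public static void main(String[] args) throws Exception {
    System.out.println(normalizedTestValue());
  }
}
```

Calling setInputFormat again on the filter before the second useFilter call would reset it, and the test data would then be normalized against its own range instead.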
-
toString
public String toString()
Returns a description of the filter, by default only the class name.
-
wekaStaticWrapper
public static String wekaStaticWrapper(Sourcable filter, String className, Instances input, Instances output) throws Exception
Generates source code from the filter.
- Parameters:
filter - the filter to output as source
className - the name of the generated class
input - the input data the header is generated for
output - the output data the header is generated for
- Returns:
- the generated source code
- Throws:
Exception - if source code cannot be generated
-
filterFile
public static void filterFile(Filter filter, String[] options) throws Exception
Method for testing filters.
- Parameters:
filter - the filter to use
options - should contain the following arguments:
-i input_file
-o output_file
-c class_index
-z classname (for filters implementing weka.filters.Sourcable)
-decimal num (the number of decimal places to use in the output; default = 6)
or -h for help on options
- Throws:
Exception - if something goes wrong or the user requests help on command options
-
batchFilterFile
public static void batchFilterFile(Filter filter, String[] options) throws Exception
Method for testing a filter's ability to process multiple batches.
- Parameters:
filter - the filter to use
options - should contain the following arguments:
-i (first) input file
-o (first) output file
-r (second) input file
-s (second) output file
-c class_index
-z classname (for filters implementing weka.filters.Sourcable)
-decimal num (the number of decimal places to use in the output; default = 6)
or -h for help on options
- Throws:
Exception - if something goes wrong or the user requests help on command options
-
runFilter
public static void runFilter(Filter filter, String[] options)
Runs the filter instance with the given options.
- Parameters:
filter - the filter to run
options - the command-line options
-
listOptions
public Enumeration<Option> listOptions()
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(String[] options) throws Exception
Parses a given list of options. Valid options are:
-D
If set, filter is run in debug mode and may output additional info to the console.
-do-not-check-capabilities
If set, filter capabilities are not checked before filter is built (use with caution).
- Specified by:
- setOptions in interface OptionHandler
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
-
getOptions
public String[] getOptions()
Gets the current settings of the filter.
- Specified by:
- getOptions in interface OptionHandler
- Returns:
- an array of strings suitable for passing to setOptions
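Because getOptions returns an array suitable for passing to setOptions, the settings of one filter can be transferred to another. The sketch below is one possible illustration: the class OptionsRoundTripSketch and its method name are hypothetical, and AllFilter is used only because it is a simple concrete subclass. It sets a property through its setter rather than hard-coding a flag string.

```java
import weka.core.Utils;
import weka.filters.AllFilter;

// Hypothetical example class; not part of Weka.
public class OptionsRoundTripSketch {

  /** Copies the settings of one filter to another via getOptions/setOptions. */
  static boolean settingsSurviveRoundTrip() throws Exception {
    AllFilter source = new AllFilter();
    source.setDoNotCheckCapabilities(true);

    // Snapshot the current settings as a command-line style option array.
    String[] options = source.getOptions();

    AllFilter target = new AllFilter();
    target.setOptions(options);

    return target.getDoNotCheckCapabilities();
  }

  public static void main(String[] args) throws Exception {
    AllFilter filter = new AllFilter();
    filter.setDoNotCheckCapabilities(true);
    // Utils.joinOptions turns the array into a single printable string.
    System.out.println(Utils.joinOptions(filter.getOptions()));
    System.out.println("round trip ok: " + settingsSurviveRoundTrip());
  }
}
```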
-
setDebug
public void setDebug(boolean debug)
Set debugging mode.
- Parameters:
debug - true if debug output should be printed
-
getDebug
public boolean getDebug()
Get whether debugging is turned on.
- Returns:
- true if debugging output is on
-
debugTipText
public String debugTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setDoNotCheckCapabilities
public void setDoNotCheckCapabilities(boolean doNotCheckCapabilities)
Set whether capabilities checking is turned off.
- Specified by:
- setDoNotCheckCapabilities in interface CapabilitiesIgnorer
- Parameters:
doNotCheckCapabilities - true if capabilities are not to be checked.
-
getDoNotCheckCapabilities
public boolean getDoNotCheckCapabilities()
Get whether capabilities checking is turned off.
- Specified by:
- getDoNotCheckCapabilities in interface CapabilitiesIgnorer
- Returns:
- true if capabilities checking is turned off.
-
doNotCheckCapabilitiesTipText
public String doNotCheckCapabilitiesTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
preExecution
public void preExecution() throws Exception
Perform any setup that might need to happen before command-line execution. Subclasses should override this method if they need to do something here.
- Specified by:
- preExecution in interface CommandlineRunnable
- Throws:
Exception - if a problem occurs during setup
-
run
public void run(Object toRun, String[] options) throws Exception
Execute the supplied object.
- Specified by:
- run in interface CommandlineRunnable
- Parameters:
toRun - the object to execute
options - any options to pass to the object
- Throws:
Exception - if the object is not of the expected type.
-
postExecution
public void postExecution() throws Exception
Perform any teardown that might need to happen after execution. Subclasses should override this method if they need to do something here.
- Specified by:
- postExecution in interface CommandlineRunnable
- Throws:
Exception - if a problem occurs during teardown
-
main
public static void main(String[] args)
Main method for testing this class.
- Parameters:
args - should contain arguments to the filter: use -h for help
-