Package weka.filters
Class SimpleBatchFilter
java.lang.Object
weka.filters.Filter
weka.filters.SimpleFilter
weka.filters.SimpleBatchFilter
- All Implemented Interfaces:
Serializable, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler
- Direct Known Subclasses:
AddClassification, CartesianProduct, ClassBalancer, ClassConditionalProbabilities, DateToNumeric, InterquartileRange, KernelFilter, MergeInfrequentNominalValues, MergeNominalValues, NumericToDate, NumericToNominal, PartitionedMultiFilter, RandomSubset, RemoveDuplicates, ReplaceWithMissingValue, SubsetByExpression, Transpose
This filter is a superclass for simple batch filters.
General notes:
- After adding instances to the filter via input(Instance) one always has to call batchFinished() to make them available via output().
- After the first call of batchFinished() the field m_FirstBatchDone is set to true.
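The two notes above can be illustrated with a small stand-alone model. This is only a sketch and does not use any Weka classes: BatchProtocolSketch, its double-valued rows and the "double each value" processing step are hypothetical stand-ins, and the model covers the first batch only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model of the input()/batchFinished()/output() protocol; NOT the Weka implementation.
class BatchProtocolSketch {
    private final List<Double> buffer = new ArrayList<>();         // rows waiting for batchFinished()
    private final Queue<Double> outputQueue = new ArrayDeque<>();  // rows ready for output()
    private boolean firstBatchDone = false;                        // plays the role of m_FirstBatchDone

    // Buffers a row; returns false because nothing can be collected yet.
    boolean input(double value) {
        buffer.add(value);
        return false;
    }

    // Processes the buffered rows and makes them available via output().
    boolean batchFinished() {
        for (double v : buffer) {
            outputQueue.add(v * 2); // hypothetical processing step: double each value
        }
        buffer.clear();
        firstBatchDone = true;
        return !outputQueue.isEmpty();
    }

    // Returns the next processed row, or null if none is pending.
    Double output() {
        return outputQueue.poll();
    }

    boolean isFirstBatchDone() {
        return firstBatchDone;
    }

    public static void main(String[] args) {
        BatchProtocolSketch f = new BatchProtocolSketch();
        f.input(1.0);
        f.input(2.0);
        System.out.println(f.output());           // null: batchFinished() has not been called
        System.out.println(f.batchFinished());    // true
        System.out.println(f.output());           // 2.0
        System.out.println(f.isFirstBatchDone()); // true
    }
}
```

In this model, as in the notes, nothing can be collected until batchFinished() has been called.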
Example:
The following code snippet uses the filter SomeFilter on a dataset that is loaded from filename.
    import weka.core.*;
    import weka.filters.*;
    import java.io.*;
    ...
    SomeFilter filter = new SomeFilter();
    // set necessary options for the filter
    Instances data = new Instances(
        new BufferedReader(
            new FileReader(filename)));
    // the input format must be set before the filter is applied
    filter.setInputFormat(data);
    Instances filteredData = Filter.useFilter(data, filter);
Implementation:
Only the following abstract methods need to be implemented:
- globalInfo()
- determineOutputFormat(Instances)
- process(Instances)
In addition, the getCapabilities() method must declare which kinds of attributes and classes the filter can handle.
If more options are necessary, the following methods also need to be overridden:
- listOptions()
- setOptions(String[])
- getOptions()
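The three option methods are meant to work together: whatever getOptions() returns should restore the same state when passed back to setOptions(String[]). The following stand-alone sketch models that round trip without using Weka. OptionRoundTripSketch and the -N <name> option are hypothetical, and listOptions() returns plain strings here rather than Weka's option descriptors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of an option round trip; NOT Weka code.
class OptionRoundTripSketch {
    private boolean debug = false;        // corresponds to the -D flag
    private String attributeName = "bla"; // hypothetical option -N <name>

    // Describes the supported options (simplified to plain strings).
    List<String> listOptions() {
        return Arrays.asList(
            "-D: turns on output of debugging information",
            "-N <name>: name of the added attribute (hypothetical)");
    }

    // Resets to defaults, then applies the given options.
    void setOptions(String[] options) {
        debug = false;
        attributeName = "bla";
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("-D")) {
                debug = true;
            } else if (options[i].equals("-N") && i + 1 < options.length) {
                attributeName = options[++i];
            }
        }
    }

    // Returns the current settings in a form that setOptions() accepts.
    String[] getOptions() {
        List<String> result = new ArrayList<>();
        if (debug) {
            result.add("-D");
        }
        result.add("-N");
        result.add(attributeName);
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        OptionRoundTripSketch f = new OptionRoundTripSketch();
        f.setOptions(new String[] {"-D", "-N", "index"});
        System.out.println(Arrays.toString(f.getOptions())); // [-D, -N, index]
    }
}
```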
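To make the filter runnable from the command line, add the following main method (replace <Filtername> with the actual filter class name):
    public static void main(String[] args) {
        runFilter(new <Filtername>(), args);
    }
Example implementation: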
    import weka.core.*;
    import weka.core.Capabilities.*;
    import weka.filters.*;

    public class SimpleBatch extends SimpleBatchFilter {

        public String globalInfo() {
            return "A simple batch filter that adds an additional attribute 'bla' at the end containing the index of the processed instance.";
        }

        public Capabilities getCapabilities() {
            Capabilities result = super.getCapabilities();
            result.enableAllAttributes();
            result.enableAllClasses();
            result.enable(Capability.NO_CLASS); // filter doesn't need class to be set
            return result;
        }

        protected Instances determineOutputFormat(Instances inputFormat) {
            Instances result = new Instances(inputFormat, 0);
            result.insertAttributeAt(new Attribute("bla"), result.numAttributes());
            return result;
        }

        protected Instances process(Instances inst) {
            Instances result = new Instances(determineOutputFormat(inst), 0);
            for (int i = 0; i < inst.numInstances(); i++) {
                double[] values = new double[result.numAttributes()];
                for (int n = 0; n < inst.numAttributes(); n++)
                    values[n] = inst.instance(i).value(n);
                values[values.length - 1] = i;
                result.add(new DenseInstance(1, values));
            }
            return result;
        }

        public static void main(String[] args) {
            runFilter(new SimpleBatch(), args);
        }
    }
Options:
Valid filter-specific options are:
-D
    Turns on output of debugging information.
- Version:
- $Revision: 14804 $
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
- See Also:
SimpleStreamFilter, input(Instance), batchFinished(), Filter.m_FirstBatchDone
- Serialized Form
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
boolean | allowAccessToFullInputFormat() | Returns whether to allow the determineOutputFormat(Instances) method access to the full dataset rather than just the header.
boolean | batchFinished() | Signify that this batch of input to the filter is finished.
boolean | input(Instance instance) | Input an instance for filtering.
boolean | input(Instances instances) | A version of the input(Instance) method that enables input of a whole dataset represented as an Instances object into the filter.
Methods inherited from class weka.filters.SimpleFilter
globalInfo, setInputFormat
Methods inherited from class weka.filters.Filter
batchFilterFile, debugTipText, doNotCheckCapabilitiesTipText, filterFile, getCapabilities, getCapabilities, getCopyOfInputFormat, getDebug, getDoNotCheckCapabilities, getOptions, getOutputFormat, getRevision, isFirstBatchDone, isNewBatch, isOutputFormatDefined, listOptions, main, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputPeek, postExecution, preExecution, run, runFilter, setDebug, setDoNotCheckCapabilities, setOptions, toString, useFilter, wekaStaticWrapper
-
Constructor Details
-
SimpleBatchFilter
public SimpleBatchFilter()
-
-
Method Details
-
allowAccessToFullInputFormat
public boolean allowAccessToFullInputFormat()
Returns whether to allow the determineOutputFormat(Instances) method access to the full dataset rather than just the header. Default implementation returns false.
- Returns:
- whether determineOutputFormat has access to the full input dataset
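The effect of this switch can be sketched with a stand-alone model. None of the names below come from Weka: ToyBatchFilter, setUp and the list-of-rows dataset are hypothetical, and the sketch only shows the idea that the output format is derived either from an empty header or from the full data.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of header-only versus full-dataset access; NOT Weka code.
class FullInputAccessSketch {

    abstract static class ToyBatchFilter {
        // Default mirrors the documented behaviour: header only.
        boolean allowAccessToFullInputFormat() {
            return false;
        }

        // Receives either an empty list (header only) or all rows.
        abstract String determineOutputFormat(List<double[]> input);

        // Hypothetical driver that decides what determineOutputFormat() gets to see.
        String setUp(List<double[]> data) {
            List<double[]> visible =
                allowAccessToFullInputFormat() ? data : List.<double[]>of();
            return determineOutputFormat(visible);
        }
    }

    // Reports how many rows were visible while the output format was determined.
    static class HeaderOnly extends ToyBatchFilter {
        String determineOutputFormat(List<double[]> input) {
            return "rows visible: " + input.size();
        }
    }

    // Same filter, but opts in to seeing the full dataset.
    static class FullAccess extends HeaderOnly {
        boolean allowAccessToFullInputFormat() {
            return true;
        }
    }

    public static void main(String[] args) {
        List<double[]> data = Arrays.asList(new double[] {1.0}, new double[] {2.0});
        System.out.println(new HeaderOnly().setUp(data)); // rows visible: 0
        System.out.println(new FullAccess().setUp(data)); // rows visible: 2
    }
}
```

A filter whose output structure depends on the data itself, for example on the number of rows, would be the kind of filter that opts in.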
-
input
public boolean input(Instance instance) throws Exception
Input an instance for filtering. The filter requires all training instances to be read before producing output (calling the method batchFinished() makes the data available). If this instance is part of a new batch, m_NewBatch is set to false.
- Overrides:
input in class Filter
- Parameters:
instance - the input instance
- Returns:
- true if the filtered instance may now be collected with output().
- Throws:
IllegalStateException - if no input structure has been defined
Exception - if something goes wrong
- See Also:
-
input
public boolean input(Instances instances) throws Exception
A version of the input(Instance) method that enables input of a whole dataset, represented as an Instances object, into the filter. This method is more efficient when processing batches other than the first batch of data because it can apply the process(Instances) method to the full batch and does not have to process individual instances independently.
- Parameters:
instances - the input instances
- Returns:
- true if the filtered instances may now be collected with output().
- Throws:
IllegalStateException - if no input structure has been defined
Exception - if something goes wrong
- See Also:
-
batchFinished
public boolean batchFinished() throws Exception
Signify that this batch of input to the filter is finished. If the filter requires all instances prior to filtering, output() may now be called to retrieve the filtered instances. Any subsequent instances should be filtered based on the settings obtained from the first batch (unless the input format has been re-assigned via setInputFormat or new options have been set). Sets m_FirstBatchDone and m_NewBatch to true.
- Overrides:
batchFinished in class Filter
- Returns:
- true if there are instances pending output
- Throws:
IllegalStateException - if no input format has been set.
Exception - if something goes wrong
- See Also:
-
Filter.m_NewBatch
Filter.m_FirstBatchDone
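The rule that later batches reuse settings obtained from the first batch can be sketched with a stand-alone model. This is not Weka code: FirstBatchSettingsSketch is hypothetical, and mean-centering is used only as an example of a setting learned from the first batch.

```java
import java.util.Arrays;

// Toy model: centers values using the mean of the first batch only; NOT Weka code.
class FirstBatchSettingsSketch {
    private boolean firstBatchDone = false; // plays the role of m_FirstBatchDone
    private double mean;                    // setting learned from the first batch

    // Processes one complete batch and returns the filtered values.
    double[] processBatch(double[] batch) {
        if (!firstBatchDone) {
            // Only the first batch is allowed to determine the setting.
            mean = Arrays.stream(batch).average().orElse(0.0);
            firstBatchDone = true;
        }
        double[] out = new double[batch.length];
        for (int i = 0; i < batch.length; i++) {
            out[i] = batch[i] - mean;
        }
        return out;
    }

    public static void main(String[] args) {
        FirstBatchSettingsSketch f = new FirstBatchSettingsSketch();
        double[] first = f.processBatch(new double[] {1.0, 2.0, 3.0});
        double[] second = f.processBatch(new double[] {10.0});
        System.out.println(Arrays.toString(first));  // [-1.0, 0.0, 1.0]
        System.out.println(Arrays.toString(second)); // [8.0]
    }
}
```

The second batch is centered with the mean of the first batch (2.0), not with its own mean. This is the behaviour needed when a filter is configured on training data and then applied to test data.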
-