Class Resample

java.lang.Object
weka.filters.Filter
weka.filters.supervised.instance.Resample
All Implemented Interfaces:
Serializable, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, Randomizable, RevisionHandler, WeightedAttributesHandler, SupervisedFilter

public class Resample extends Filter implements SupervisedFilter, OptionHandler, Randomizable, WeightedAttributesHandler
Produces a random subsample of a dataset using either sampling with replacement or without replacement.
The original dataset must fit entirely in memory. The number of instances in the generated dataset may be specified as a percentage of the input dataset. The dataset must have a nominal class attribute; if it does not, use the unsupervised version of this filter. The filter can be made to maintain the class distribution in the subsample, or to bias the class distribution toward a uniform distribution. When used in batch mode (e.g. in the FilteredClassifier), subsequent batches are NOT resampled.
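
To make the interaction between output size and bias concrete, the following is a minimal sketch of one plausible reading of the description above: each class's share of the output is a linear blend between its original share (bias 0) and an equal share (bias 1). The class `BiasSketch` and the method `targetCounts` are hypothetical names introduced for illustration, and the rounding used here is an assumption; the filter's own arithmetic may differ in detail.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Weka: illustrates the bias factor only. */
final class BiasSketch {

    /**
     * Computes a target number of instances per class for the subsample.
     *
     * @param classCounts number of input instances in each class
     * @param sizePercent output size as a percentage of the input (cf. -Z)
     * @param bias        0 = keep input distribution, 1 = uniform (cf. -B)
     */
    static int[] targetCounts(int[] classCounts, double sizePercent, double bias) {
        if (bias < 0 || bias > 1) {
            throw new IllegalArgumentException("bias must be between 0 and 1");
        }
        int total = Arrays.stream(classCounts).sum();
        if (total == 0) {
            return new int[classCounts.length]; // nothing to sample from
        }
        double sampleSize = total * sizePercent / 100.0;
        int numClasses = classCounts.length;
        int[] targets = new int[numClasses];
        for (int i = 0; i < numClasses; i++) {
            double originalShare = (double) classCounts[i] / total;
            double uniformShare = 1.0 / numClasses;
            // Linear blend between the two distributions (assumed, see lead-in).
            double share = (1 - bias) * originalShare + bias * uniformShare;
            targets[i] = (int) Math.round(sampleSize * share);
        }
        return targets;
    }

    public static void main(String[] args) {
        int[] counts = {90, 10};
        System.out.println(Arrays.toString(targetCounts(counts, 100, 0.0))); // [90, 10]
        System.out.println(Arrays.toString(targetCounts(counts, 100, 1.0))); // [50, 50]
        System.out.println(Arrays.toString(targetCounts(counts, 50, 0.5)));  // [35, 15]
    }
}
```

In the second call the minority class target (50) exceeds the 10 instances available for that class, so a target like this can only be met exactly when sampling with replacement.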

Valid options are:

 -S <num>
  Specify the random number seed (default 1)
 
 -Z <num>
  The size of the output dataset, as a percentage of
  the input dataset (default 100)
 
 -B <num>
  Bias factor towards uniform class distribution.
  0 = distribution in input data -- 1 = uniform distribution.
  (default 0)
 
 -no-replacement
  Disables replacement of instances
  (default: with replacement)
 
 -V
  Inverts the selection - only available with '-no-replacement'.
 
Version:
$Revision: 15265 $
Author:
Len Trigg (len@reeltwo.com), FracPete (fracpete at waikato dot ac dot nz), Eibe Frank
  • Constructor Details

    • Resample

      public Resample()
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing this filter.
      Returns:
      a description of the filter suitable for displaying in the explorer/experimenter gui
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class Filter
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      Valid options are:

       -S <num>
        Specify the random number seed (default 1)
       
       -Z <num>
        The size of the output dataset, as a percentage of
        the input dataset (default 100)
       
       -B <num>
        Bias factor towards uniform class distribution.
        0 = distribution in input data -- 1 = uniform distribution.
        (default 0)
       
       -no-replacement
        Disables replacement of instances
        (default: with replacement)
       
       -V
        Inverts the selection - only available with '-no-replacement'.
       
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class Filter
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
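
As an illustration of the array that setOptions expects, the sketch below assembles the documented flags into a String[]. The class `ResampleOptionsSketch` and the method `buildOptions` are hypothetical names, not part of Weka. Rejecting -V without -no-replacement is a design choice of this sketch that mirrors the documented restriction; it is not a statement about how the filter itself reacts to that combination.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Weka: assembles the documented flags. */
final class ResampleOptionsSketch {

    static String[] buildOptions(int seed, double sizePercent, double bias,
                                 boolean noReplacement, boolean invert) {
        // The documentation states that -V is only available with -no-replacement.
        if (invert && !noReplacement) {
            throw new IllegalArgumentException(
                "-V is only available with -no-replacement");
        }
        List<String> opts = new ArrayList<>();
        opts.add("-S");
        opts.add(Integer.toString(seed));
        opts.add("-Z");
        opts.add(Double.toString(sizePercent));
        opts.add("-B");
        opts.add(Double.toString(bias));
        if (noReplacement) {
            opts.add("-no-replacement");
        }
        if (invert) {
            opts.add("-V");
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] opts = buildOptions(42, 50.0, 1.0, true, false);
        System.out.println(String.join(" ", opts));
        // prints: -S 42 -Z 50.0 -B 1.0 -no-replacement

        // With Weka on the classpath (an assumption; this sketch does not need it),
        // the array could then be handed to the filter:
        //   Resample filter = new Resample();
        //   filter.setOptions(opts);
    }
}
```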
    • getOptions

      public String[] getOptions()
      Gets the current settings of the filter.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class Filter
      Returns:
      an array of strings suitable for passing to setOptions
    • biasToUniformClassTipText

      public String biasToUniformClassTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getBiasToUniformClass

      public double getBiasToUniformClass()
      Gets the bias towards a uniform class. A value of 0 leaves the class distribution as-is, a value of 1 ensures the class distributions are uniform in the output data.
      Returns:
      the current bias
    • setBiasToUniformClass

      public void setBiasToUniformClass(double newBiasToUniformClass)
      Sets the bias towards a uniform class. A value of 0 leaves the class distribution as-is, a value of 1 ensures the class distributions are uniform in the output data.
      Parameters:
      newBiasToUniformClass - the new bias value, between 0 and 1.
    • randomSeedTipText

      public String randomSeedTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getRandomSeed

      public int getRandomSeed()
      Gets the random number seed.
      Returns:
      the random number seed.
    • setRandomSeed

      public void setRandomSeed(int newSeed)
      Sets the random number seed.
      Parameters:
      newSeed - the new random number seed.
    • setSeed

      @ProgrammaticProperty public void setSeed(int seed)
      Description copied from interface: Randomizable
      Set the seed for random number generation.
      Specified by:
      setSeed in interface Randomizable
      Parameters:
      seed - the seed
    • getSeed

      @ProgrammaticProperty public int getSeed()
      Description copied from interface: Randomizable
      Gets the seed for random number generation.
      Specified by:
      getSeed in interface Randomizable
      Returns:
      the seed for the random number generation
    • sampleSizePercentTipText

      public String sampleSizePercentTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getSampleSizePercent

      public double getSampleSizePercent()
      Gets the subsample size as a percentage of the original set.
      Returns:
      the subsample size
    • setSampleSizePercent

      public void setSampleSizePercent(double newSampleSizePercent)
      Sets the size of the subsample, as a percentage of the original set.
      Parameters:
      newSampleSizePercent - the subsample set size, between 0 and 100.
    • noReplacementTipText

      public String noReplacementTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getNoReplacement

      public boolean getNoReplacement()
      Gets whether instances are drawn with or without replacement.
      Returns:
      true if the replacement is disabled
    • setNoReplacement

      public void setNoReplacement(boolean value)
      Sets whether instances are drawn with or without replacement.
      Parameters:
      value - if true then the replacement of instances is disabled
    • invertSelectionTipText

      public String invertSelectionTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getInvertSelection

      public boolean getInvertSelection()
      Gets whether the selection is inverted (only applies if instances are drawn WITHOUT replacement).
      Returns:
      true if the selection is inverted
      See Also:
      • m_NoReplacement
    • setInvertSelection

      public void setInvertSelection(boolean value)
      Sets whether the selection is inverted (only applies if instances are drawn WITHOUT replacement).
      Parameters:
      value - if true then selection is inverted
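
The difference between the sampling modes, and why inversion only makes sense without replacement, can be shown at the level of instance indices. This is a minimal sketch under stated assumptions: `SamplingModesSketch` and its methods are hypothetical names, it ignores class stratification and bias, and it uses java.util.Random and Collections.shuffle rather than whatever procedure the filter uses internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of Weka: index-level view of the sampling modes. */
final class SamplingModesSketch {

    /** Draws k indices from 0..n-1 with replacement; duplicates are possible. */
    static List<Integer> withReplacement(int n, int k, long seed) {
        Random rnd = new Random(seed);
        List<Integer> picked = new ArrayList<>();
        for (int i = 0; i < k; i++) {
            picked.add(rnd.nextInt(n));
        }
        return picked;
    }

    /**
     * Draws k distinct indices from 0..n-1. If invert is true, returns the
     * n - k indices that were NOT drawn instead.
     */
    static List<Integer> withoutReplacement(int n, int k, long seed, boolean invert) {
        if (k > n) {
            throw new IllegalArgumentException("cannot draw more than n distinct indices");
        }
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            all.add(i);
        }
        // The same seed always produces the same permutation.
        Collections.shuffle(all, new Random(seed));
        return new ArrayList<>(invert ? all.subList(k, n) : all.subList(0, k));
    }

    public static void main(String[] args) {
        System.out.println("with replacement:    " + withReplacement(10, 4, 1L));
        System.out.println("without replacement: " + withoutReplacement(10, 4, 1L, false));
        System.out.println("inverted selection:  " + withoutReplacement(10, 4, 1L, true));
    }
}
```

In this sketch, running the non-inverted and inverted selections with the same seed and size splits the indices into two disjoint sets that together cover the whole input. With replacement there is no such well-defined complement, since an index may be drawn more than once.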
    • getCapabilities

      public Capabilities getCapabilities()
      Returns the Capabilities of this filter.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Overrides:
      getCapabilities in class Filter
      Returns:
      the capabilities of this object
    • setInputFormat

      public boolean setInputFormat(Instances instanceInfo) throws Exception
      Sets the format of the input instances.
      Overrides:
      setInputFormat in class Filter
      Parameters:
      instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required).
      Returns:
      true if the outputFormat may be collected immediately
      Throws:
      Exception - if the input format can't be set successfully
    • input

      public boolean input(Instance instance)
      Input an instance for filtering. The filter requires all training instances to be read before producing output.
      Overrides:
      input in class Filter
      Parameters:
      instance - the input instance
      Returns:
      true if the filtered instance may now be collected with output().
      Throws:
      IllegalStateException - if no input structure has been defined
    • batchFinished

      public boolean batchFinished()
      Signify that this batch of input to the filter is finished. If the filter requires all instances prior to filtering, output() may now be called to retrieve the filtered instances.
      Overrides:
      batchFinished in class Filter
      Returns:
      true if there are instances pending output
      Throws:
      IllegalStateException - if no input structure has been defined
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class Filter
      Returns:
      the revision
    • main

      public static void main(String[] argv)
      Main method for testing this class.
      Parameters:
      argv - should contain arguments to the filter: use -h for help