Package weka.core

Class FilteredDistance

java.lang.Object
weka.core.FilteredDistance
All Implemented Interfaces:
Serializable, DistanceFunction, OptionHandler

public class FilteredDistance extends Object implements DistanceFunction, OptionHandler, Serializable
Applies the given filter before calling the given distance function.

Valid options are:

 -F
  The filter to use. (default: weka.filters.unsupervised.attribute.RandomProjection)
 -E
  The distance function to use. (default: weka.core.EuclideanDistance)
 
 Options specific to filter weka.filters.unsupervised.attribute.RandomProjection:
 
 -N <number>
  The number of dimensions (attributes) the data should be reduced to
  (default 10; exclusive of the class attribute, if it is set).
 -D [SPARSE1|SPARSE2|GAUSSIAN]
  The distribution to use for calculating the random matrix.
  Sparse1 is:
    sqrt(3)*{-1 with prob(1/6), 0 with prob(2/3), +1 with prob(1/6)}
  Sparse2 is:
    {-1 with prob(1/2), +1 with prob(1/2)}
 
 -P <percent>
  The percentage of dimensions (attributes) the data should
  be reduced to (exclusive of the class attribute, if it is set). The -N
  option is ignored if this option is present and is greater
  than zero.
 -M
  Replace missing values using the ReplaceMissingValues filter
 -R <num>
  The random seed for the random number generator used for
  calculating the random matrix (default 42).
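
The Sparse1 and Sparse2 distributions listed under -D can be illustrated with a small standalone sketch. It uses only the JDK, not Weka; the class name `SparseProjectionSketch` and all of its methods are hypothetical, and it makes no claim about how RandomProjection builds or scales its matrix internally.

```java
import java.util.Random;

public class SparseProjectionSketch {

    private static final double SQRT3 = Math.sqrt(3.0);

    /** One Sparse1 entry: sqrt(3) * {-1 (p=1/6), 0 (p=2/3), +1 (p=1/6)}. */
    public static double sparse1(Random rnd) {
        int k = rnd.nextInt(6);      // uniform over 0..5
        if (k == 0) return -SQRT3;   // 1 of 6 outcomes
        if (k == 1) return SQRT3;    // 1 of 6 outcomes
        return 0.0;                  // remaining 4 of 6 outcomes
    }

    /** One Sparse2 entry: -1 or +1, each with probability 1/2. */
    public static double sparse2(Random rnd) {
        return rnd.nextBoolean() ? 1.0 : -1.0;
    }

    /** Builds a targetDims x sourceDims matrix of Sparse1 entries from a seed. */
    public static double[][] sparse1Matrix(int targetDims, int sourceDims, long seed) {
        Random rnd = new Random(seed);
        double[][] m = new double[targetDims][sourceDims];
        for (double[] row : m) {
            for (int j = 0; j < row.length; j++) {
                row[j] = sparse1(rnd);
            }
        }
        return m;
    }

    /** Projects x onto the rows of m, giving one value per target dimension. */
    public static double[] project(double[][] m, double[] x) {
        double[] y = new double[m.length];
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < x.length; j++) {
                y[i] += m[i][j] * x[j];
            }
        }
        return y;
    }

    public static void main(String[] args) {
        // 10 target dimensions and seed 42, echoing the defaults listed above
        double[][] m = sparse1Matrix(10, 50, 42L);
        double[] x = new double[50];
        java.util.Arrays.fill(x, 1.0);
        System.out.println(project(m, x).length);   // prints 10
    }
}
```

Reusing the same seed reproduces the same matrix, which is the role the -R option plays for the real filter.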
 
 Options specific to distance function weka.core.EuclideanDistance:
 
 -D
  Turns off the normalization of attribute 
  values in distance calculation.
 -R <col1,col2-col4,...>
  Specifies the list of columns to use in the calculation of the 
  distance. 'first' and 'last' are valid indices.
  (default: first-last)
 -V
  Invert matching sense of column indices.
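
The idea in the class description, applying a filter and then a distance function, can be pictured with a minimal standalone sketch. It is an assumption-laden illustration, not Weka code: plain `double[]` vectors stand in for instances, and `FilterThenDistanceSketch`, `filteredDistance`, `euclidean`, and `keepFirstTwo` are hypothetical names.

```java
import java.util.function.ToDoubleBiFunction;
import java.util.function.UnaryOperator;

public class FilterThenDistanceSketch {

    /** Transforms both inputs with the same filter, then measures the distance. */
    public static double filteredDistance(UnaryOperator<double[]> filter,
                                          ToDoubleBiFunction<double[], double[]> distance,
                                          double[] first, double[] second) {
        return distance.applyAsDouble(filter.apply(first), filter.apply(second));
    }

    /** Plain Euclidean distance between two equal-length vectors. */
    public static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Toy filter that keeps only the first two values (a stand-in for dimensionality reduction). */
    public static double[] keepFirstTwo(double[] x) {
        return new double[] {x[0], x[1]};
    }

    public static void main(String[] args) {
        double[] a = {0.0, 0.0, 5.0};
        double[] b = {3.0, 4.0, 9.0};
        double d = filteredDistance(FilterThenDistanceSketch::keepFirstTwo,
                                    FilterThenDistanceSketch::euclidean, a, b);
        System.out.println(d);   // prints 5.0: the third value is dropped before measuring
    }
}
```

Keeping the filter and the distance as separate arguments mirrors the -F and -E options: either part can be swapped without touching the other.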
Version:
$Revision: 8034 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz)
  • Constructor Details

    • FilteredDistance

      public FilteredDistance()
      Default constructor. Sets up the Remove filter.
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing this object.
      Returns:
      a description of this distance function suitable for displaying in the explorer/experimenter gui
    • filterTipText

      public String filterTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setFilter

      public void setFilter(Filter filter)
      Sets the filter.
      Parameters:
      filter - the filter with all options set.
    • getFilter

      public Filter getFilter()
      Gets the filter used.
      Returns:
      the filter
    • distanceTipText

      public String distanceTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setDistance

      public void setDistance(DistanceFunction distance)
      Sets the distance function.
      Parameters:
      distance - the distance with all options set.
    • getDistance

      public DistanceFunction getDistance()
      Gets the distance used.
      Returns:
      the distance
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Returns:
      an enumeration of all the available options.
    • getOptions

      public String[] getOptions()
      Gets the current settings. Returns empty array.
      Specified by:
      getOptions in interface OptionHandler
      Returns:
      an array of strings suitable for passing to setOptions()
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.
      Specified by:
      setOptions in interface OptionHandler
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
    • setInstances

      public void setInstances(Instances insts)
      Sets the instances.
      Specified by:
      setInstances in interface DistanceFunction
      Parameters:
      insts - the instances to use
    • getInstances

      public Instances getInstances()
      Returns the instances currently set.
      Specified by:
      getInstances in interface DistanceFunction
      Returns:
      the current instances
    • setAttributeIndices

      public void setAttributeIndices(String value)
      Sets the range of attributes to use in the calculation of the distance. Indices start from 1; 'first' and 'last' are also valid. Example: first-3,5,6-last
      Specified by:
      setAttributeIndices in interface DistanceFunction
      Parameters:
      value - the new attribute index range
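
To make the range syntax concrete, here is a minimal standalone sketch that expands a string such as first-3,5,6-last into 0-based indices, with an optional flag for inverting the selection. It is not Weka's own parsing code; `AttributeRangeSketch`, `resolve`, and `select` are hypothetical names, and error handling is kept deliberately thin.

```java
import java.util.ArrayList;
import java.util.List;

public class AttributeRangeSketch {

    /** Resolves "first", "last", or a 1-based number to a 0-based index. */
    static int resolve(String token, int numAttributes) {
        String t = token.trim();
        if (t.equalsIgnoreCase("first")) return 0;
        if (t.equalsIgnoreCase("last")) return numAttributes - 1;
        int idx = Integer.parseInt(t) - 1;   // the range syntax is 1-based
        if (idx < 0 || idx >= numAttributes) {
            throw new IllegalArgumentException("Index out of range: " + t);
        }
        return idx;
    }

    /** Expands a range string into sorted 0-based indices; invert flips the selection. */
    public static List<Integer> select(String ranges, int numAttributes, boolean invert) {
        boolean[] chosen = new boolean[numAttributes];
        for (String part : ranges.split(",")) {
            int dash = part.indexOf('-');
            // A part without a dash is a single index, so both ends are the same token
            int lo = resolve(dash < 0 ? part : part.substring(0, dash), numAttributes);
            int hi = resolve(dash < 0 ? part : part.substring(dash + 1), numAttributes);
            for (int i = Math.min(lo, hi); i <= Math.max(lo, hi); i++) {
                chosen[i] = true;
            }
        }
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < numAttributes; i++) {
            if (chosen[i] != invert) out.add(i);
        }
        return out;
    }

    public static void main(String[] args) {
        // With 8 attributes, only the 4th attribute (0-based index 3) is left out
        System.out.println(select("first-3,5,6-last", 8, false));  // prints [0, 1, 2, 4, 5, 6, 7]
        System.out.println(select("first-3,5,6-last", 8, true));   // prints [3]
    }
}
```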
    • getAttributeIndices

      public String getAttributeIndices()
      Gets the range of attributes used in the calculation of the distance.
      Specified by:
      getAttributeIndices in interface DistanceFunction
      Returns:
      the attribute index range
    • setInvertSelection

      public void setInvertSelection(boolean value)
      Sets whether the matching sense of attribute indices is inverted or not.
      Specified by:
      setInvertSelection in interface DistanceFunction
      Parameters:
      value - if true the matching sense is inverted
    • getInvertSelection

      public boolean getInvertSelection()
      Gets whether the matching sense of attribute indices is inverted or not.
      Specified by:
      getInvertSelection in interface DistanceFunction
      Returns:
      true if the matching sense is inverted
    • distance

      public double distance(Instance first, Instance second)
      Calculates the distance between two instances.
      Specified by:
      distance in interface DistanceFunction
      Parameters:
      first - the first instance
      second - the second instance
      Returns:
      the distance between the two given instances
    • distance

      public double distance(Instance first, Instance second, PerformanceStats stats) throws Exception
      Calculates the distance between two instances.
      Specified by:
      distance in interface DistanceFunction
      Parameters:
      first - the first instance
      second - the second instance
      stats - the performance stats object
      Returns:
      the distance between the two given instances
      Throws:
      Exception - if calculation fails
    • distance

      public double distance(Instance first, Instance second, double cutOffValue)
      Calculates the distance between two instances. Offers a speed-up in nearest neighbour search (if the distance function class in use supports it) by taking the cutoff, or maximum distance, into account. Depending on the distance function class, the distances may need post-processing with postProcessDistances(double[]) if this method is used.
      Specified by:
      distance in interface DistanceFunction
      Parameters:
      first - the first instance
      second - the second instance
      cutOffValue - If the distance being calculated becomes larger than cutOffValue then the rest of the calculation is discarded.
      Returns:
      the distance between the two given instances or Double.POSITIVE_INFINITY if the distance being calculated becomes larger than cutOffValue.
    • distance

      public double distance(Instance first, Instance second, double cutOffValue, PerformanceStats stats)
      Calculates the distance between two instances. Offers a speed-up in nearest neighbour search (if the distance function class in use supports it) by taking the cutoff, or maximum distance, into account. Depending on the distance function class, the distances may need post-processing with postProcessDistances(double[]) if this method is used.
      Specified by:
      distance in interface DistanceFunction
      Parameters:
      first - the first instance
      second - the second instance
      cutOffValue - If the distance being calculated becomes larger than cutOffValue then the rest of the calculation is discarded.
      stats - the performance stats object
      Returns:
      the distance between the two given instances or Double.POSITIVE_INFINITY if the distance being calculated becomes larger than cutOffValue.
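
The cutoff behaviour described for these two overloads can be sketched for a plain Euclidean distance as follows. This is a standalone illustration over `double[]` vectors, not the Weka implementation; the class name `CutoffDistanceSketch` is hypothetical, and whether a particular distance function class actually stops early depends on that class.

```java
public class CutoffDistanceSketch {

    /**
     * Euclidean distance that abandons the calculation once the running
     * sum of squares exceeds cutOffValue squared.
     */
    public static double distance(double[] first, double[] second, double cutOffValue) {
        // Compare squared values so no square root is needed inside the loop
        double limit = cutOffValue * cutOffValue;
        double sum = 0.0;
        for (int i = 0; i < first.length; i++) {
            double diff = first[i] - second[i];
            sum += diff * diff;
            if (sum > limit) {
                return Double.POSITIVE_INFINITY;   // already farther than the cutoff
            }
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[] a = {0.0, 0.0};
        double[] b = {3.0, 4.0};
        System.out.println(distance(a, b, 10.0));  // prints 5.0
        System.out.println(distance(a, b, 4.0));   // prints Infinity
    }
}
```

In a nearest neighbour search the cutoff would typically be the distance to the best candidate found so far, so any instance that cannot beat it is rejected without finishing the sum.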
    • postProcessDistances

      public void postProcessDistances(double[] distances)
      Post-processes the distances (if necessary) returned by distance(Instance first, Instance second, double cutOffValue). Depending on the distance function, post-processing may be needed to put the distances on the correct scale. Some distance function classes may not return correct distances from the cutOffValue variant of distance, in order to minimize the inaccuracies resulting from floating-point comparison and manipulation.
      Specified by:
      postProcessDistances in interface DistanceFunction
      Parameters:
      distances - the distances to post-process
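
One plausible reading of "the correct scale" is that a distance function works with squared distances during the search and takes square roots only at the end. The standalone sketch below illustrates that pattern under this assumption; `PostProcessSketch` and its methods are hypothetical and are not taken from any Weka class.

```java
public class PostProcessSketch {

    /** Squared Euclidean distance: cheaper to compute and safe to compare, but not on the final scale. */
    public static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /** Converts squared distances to true distances in place. */
    public static void postProcessDistances(double[] distances) {
        for (int i = 0; i < distances.length; i++) {
            distances[i] = Math.sqrt(distances[i]);
        }
    }

    public static void main(String[] args) {
        double[] query = {0.0, 0.0};
        double[][] neighbours = {{3.0, 4.0}, {0.0, 2.0}};
        double[] distances = new double[neighbours.length];
        for (int i = 0; i < neighbours.length; i++) {
            distances[i] = squaredDistance(query, neighbours[i]);   // 25.0 and 4.0
        }
        postProcessDistances(distances);
        System.out.println(distances[0] + " " + distances[1]);      // prints 5.0 2.0
    }
}
```

Because the square root is monotonic, ranking neighbours by squared distance gives the same order as ranking by true distance, which is why the conversion can be deferred.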
    • update

      public void update(Instance ins)
      Updates the distance function (if necessary) for the newly added instance.
      Specified by:
      update in interface DistanceFunction
      Parameters:
      ins - the instance to add
    • clean

      public void clean()
      Frees any references to training instances.
      Specified by:
      clean in interface DistanceFunction