Package weka.core

Interface DistanceFunction

All Superinterfaces:
OptionHandler
All Known Implementing Classes:
ChebyshevDistance, EuclideanDistance, FilteredDistance, ManhattanDistance, MinkowskiDistance, NormalizableDistance

public interface DistanceFunction extends OptionHandler
Interface for any class that can compute and return distances between two instances.
Version:
$Revision: 10535 $
Author:
Ashraf M. Kibriya (amk14@cs.waikato.ac.nz)
  • Method Details

    • setInstances

      void setInstances(Instances insts)
      Sets the instances.
      Parameters:
      insts - the instances to use
    • getInstances

      Instances getInstances()
      Returns the instances currently set.
      Returns:
      the current instances
    • setAttributeIndices

      void setAttributeIndices(String value)
      Sets the range of attributes to use when calculating the distance. Indices are 1-based; the keywords 'first' and 'last' are also valid. Example: first-3,5,6-last
      Parameters:
      value - the new attribute index range
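      To make the range syntax concrete, the sketch below expands a string such as first-3,5,6-last into a per-attribute selection mask. It is a plain-Java illustration only, not Weka's own parser: the class AttributeRangeSketch and its methods are hypothetical names, and the handling of reversed ranges is an assumption.

```java
import java.util.Arrays;

// Hypothetical stand-in for illustration; not part of weka.core.
final class AttributeRangeSketch {

    // Converts one token ("first", "last", or a 1-based number) to a 0-based index.
    static int toIndex(String token, int numAttributes) {
        String t = token.trim();
        if (t.equalsIgnoreCase("first")) {
            return 0;
        }
        if (t.equalsIgnoreCase("last")) {
            return numAttributes - 1;
        }
        int index = Integer.parseInt(t) - 1; // the range string is 1-based
        if (index < 0 || index >= numAttributes) {
            throw new IllegalArgumentException("Index out of range: " + token);
        }
        return index;
    }

    // Expands a comma-separated range list into a mask over all attributes.
    static boolean[] parse(String range, int numAttributes) {
        boolean[] selected = new boolean[numAttributes];
        for (String part : range.split(",")) {
            int dash = part.indexOf('-');
            int lo;
            int hi;
            if (dash < 0) {
                lo = toIndex(part, numAttributes);
                hi = lo;
            } else {
                lo = toIndex(part.substring(0, dash), numAttributes);
                hi = toIndex(part.substring(dash + 1), numAttributes);
            }
            // Assumption: a reversed range such as "5-3" is treated like "3-5".
            for (int i = Math.min(lo, hi); i <= Math.max(lo, hi); i++) {
                selected[i] = true;
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // With 8 attributes, only attribute 4 (0-based index 3) is left out.
        System.out.println(Arrays.toString(parse("first-3,5,6-last", 8)));
    }
}
```

      With 8 attributes, the example range selects every attribute except the fourth.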
    • getAttributeIndices

      String getAttributeIndices()
      Gets the range of attributes used in the calculation of the distance.
      Returns:
      the attribute index range
    • setInvertSelection

      void setInvertSelection(boolean value)
      Sets whether the matching sense of attribute indices is inverted or not.
      Parameters:
      value - if true the matching sense is inverted
    • getInvertSelection

      boolean getInvertSelection()
      Gets whether the matching sense of attribute indices is inverted or not.
      Returns:
      true if the matching sense is inverted
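      One way to read "inverting the matching sense" is that the attributes named by the index range are excluded from the calculation and all others are used instead. The sketch below shows that reading in plain Java; InvertSelectionSketch and activeAttributes are hypothetical names, not Weka API.

```java
// Hypothetical stand-in for illustration; not part of weka.core.
final class InvertSelectionSketch {

    // Returns which attributes take part in the distance calculation.
    // selected[i] is true if attribute i is named by the index range.
    static boolean[] activeAttributes(boolean[] selected, boolean invert) {
        boolean[] active = new boolean[selected.length];
        for (int i = 0; i < selected.length; i++) {
            // XOR: with invert == true, the named attributes are the ones left out.
            active[i] = selected[i] != invert;
        }
        return active;
    }
}
```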
    • distance

      double distance(Instance first, Instance second)
      Calculates the distance between two instances.
      Parameters:
      first - the first instance
      second - the second instance
      Returns:
      the distance between the two given instances
    • distance

      double distance(Instance first, Instance second, PerformanceStats stats) throws Exception
      Calculates the distance between two instances.
      Parameters:
      first - the first instance
      second - the second instance
      stats - the performance stats object
      Returns:
      the distance between the two given instances
      Throws:
      Exception - if calculation fails
    • distance

      double distance(Instance first, Instance second, double cutOffValue)
      Calculates the distance between two instances, using a cut-off (maximum) distance to speed up nearest neighbour search if the distance function class in use supports it. Depending on the distance function class, the returned distances may need post-processing with postProcessDistances(double[]) if this method is used.
      Parameters:
      first - the first instance
      second - the second instance
      cutOffValue - if the distance being calculated becomes larger than cutOffValue, the rest of the calculation is skipped
      Returns:
      the distance between the two given instances or Double.POSITIVE_INFINITY if the distance being calculated becomes larger than cutOffValue.
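      The cut-off behaviour can be sketched with a squared Euclidean distance over plain double arrays. This is an illustration under stated assumptions, not the code of any Weka class: CutOffDistanceSketch is a hypothetical name, instances are reduced to double[] vectors, and the cut-off is assumed to be expressed on the same (squared) scale as the running sum.

```java
// Hypothetical stand-in for illustration; not part of weka.core.
final class CutOffDistanceSketch {

    // Accumulates the squared Euclidean distance and gives up as soon as the
    // running sum exceeds cutOffValue (assumed to be on the squared scale).
    static double squaredDistance(double[] first, double[] second, double cutOffValue) {
        if (first.length != second.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < first.length; i++) {
            double diff = first[i] - second[i];
            sum += diff * diff;
            if (sum > cutOffValue) {
                // The remaining attributes cannot bring the sum back under the cut-off.
                return Double.POSITIVE_INFINITY;
            }
        }
        return sum;
    }
}
```

      In a nearest neighbour search, the caller would typically pass the distance of the best candidate found so far as the cut-off, so that worse candidates are abandoned early.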
    • distance

      double distance(Instance first, Instance second, double cutOffValue, PerformanceStats stats)
      Calculates the distance between two instances, using a cut-off (maximum) distance to speed up nearest neighbour search if the distance function class in use supports it. Depending on the distance function class, the returned distances may need post-processing with postProcessDistances(double[]) if this method is used.
      Parameters:
      first - the first instance
      second - the second instance
      cutOffValue - if the distance being calculated becomes larger than cutOffValue, the rest of the calculation is skipped
      stats - the performance stats object
      Returns:
      the distance between the two given instances or Double.POSITIVE_INFINITY if the distance being calculated becomes larger than cutOffValue.
    • postProcessDistances

      void postProcessDistances(double[] distances)
      Post-processes (if necessary) the distances returned by distance(Instance first, Instance second, double cutOffValue). Depending on the distance function, post-processing may be needed to put the distances on the correct scale. Some distance function classes may not return correctly scaled distances from the cutOffValue variant of distance, in order to minimize the inaccuracies resulting from floating-point comparison and manipulation.
      Parameters:
      distances - the distances to post-process
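      As an illustration of why post-processing can be needed, suppose a Euclidean-style distance function returns squared distances from its cut-off variant to avoid taking a square root per comparison. The sketch below then restores the true scale in place. PostProcessSketch is a hypothetical name, and the squared-scale assumption may not hold for every implementing class.

```java
// Hypothetical stand-in for illustration; not part of weka.core.
final class PostProcessSketch {

    // Converts squared distances to true Euclidean distances, in place.
    // Double.POSITIVE_INFINITY (a cut-off result) stays infinite under sqrt.
    static void postProcess(double[] distances) {
        for (int i = 0; i < distances.length; i++) {
            distances[i] = Math.sqrt(distances[i]);
        }
    }
}
```

      Comparing squared values preserves the ordering of neighbours, so the square root is only needed once, on the distances that are finally reported.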
    • update

      void update(Instance ins)
      Updates the distance function (if necessary) for the newly added instance.
      Parameters:
      ins - the instance to add
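      A distance function that normalizes attribute values is one case where an update may be necessary: the minimum and maximum seen per attribute have to widen when a new instance falls outside them. The sketch below shows that idea in plain Java. RangeTrackerSketch is a hypothetical name, instances are reduced to double[] vectors, and min-max normalization is an assumption about the implementing class.

```java
import java.util.Arrays;

// Hypothetical stand-in for illustration; not part of weka.core.
final class RangeTrackerSketch {
    private final double[] min;
    private final double[] max;

    RangeTrackerSketch(int numAttributes) {
        min = new double[numAttributes];
        max = new double[numAttributes];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
    }

    // Widens the per-attribute ranges to cover a newly added instance.
    void update(double[] values) {
        for (int i = 0; i < values.length; i++) {
            min[i] = Math.min(min[i], values[i]);
            max[i] = Math.max(max[i], values[i]);
        }
    }

    // Scales a value into [0, 1] using the ranges seen so far.
    double normalize(int attribute, double value) {
        double width = max[attribute] - min[attribute];
        if (!(width > 0.0)) {
            return 0.0; // no instances yet, or a constant attribute
        }
        return (value - min[attribute]) / width;
    }
}
```

      Distances computed before and after an update are therefore not directly comparable if the new instance changed a range.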
    • clean

      void clean()
      Frees any references to training instances.