Package weka.core
Class EuclideanDistance
java.lang.Object
weka.core.NormalizableDistance
weka.core.EuclideanDistance
- All Implemented Interfaces:
Serializable, Cloneable, DistanceFunction, OptionHandler, RevisionHandler, TechnicalInformationHandler
public class EuclideanDistance
extends NormalizableDistance
implements Cloneable, TechnicalInformationHandler
Implements the Euclidean distance (or similarity) function.
An object of this class does not define a single distance; it defines the data model within which the distances between objects of that data model can be computed.
Attention: for efficiency reasons, few consistency checks are performed (for example, whether the data models of the two instances are exactly the same).
For more information, see:
Wikipedia. Euclidean distance. URL http://en.wikipedia.org/wiki/Euclidean_distance. BibTeX:
@misc{missing_id,
  author = {Wikipedia},
  title = {Euclidean distance},
  URL = {http://en.wikipedia.org/wiki/Euclidean_distance}
}
Valid options are:
-D Turns off the normalization of attribute values in distance calculation.
-R <col1,col2-col4,...> Specifies the list of columns to use in the calculation of the distance. 'first' and 'last' are valid indices. (default: first-last)
-V Invert matching sense of column indices.
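As a rough illustration of what the -D option switches off, the following self-contained sketch computes a Euclidean distance with and without per-attribute normalization. It does not use the Weka classes: the class name NormalizedEuclideanSketch, its method names, and the assumption that normalization rescales each attribute by its observed minimum and maximum are all illustrative and not taken from the Weka source.

```java
// Illustrative sketch only; not the Weka implementation.
final class NormalizedEuclideanSketch {

    /** Rescales x to [0, 1] using the attribute's range; a zero-width range maps to 0 (assumption). */
    static double norm(double x, double min, double max) {
        double width = max - min;
        return width == 0.0 ? 0.0 : (x - min) / width;
    }

    /**
     * Euclidean distance between a and b.
     * mins and maxs hold the per-attribute ranges; dontNormalize mimics the effect of -D.
     */
    static double distance(double[] a, double[] b, double[] mins, double[] maxs,
                           boolean dontNormalize) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double va = dontNormalize ? a[i] : norm(a[i], mins[i], maxs[i]);
            double vb = dontNormalize ? b[i] : norm(b[i], mins[i], maxs[i]);
            double diff = va - vb;
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[] a = {0.0, 0.0};
        double[] b = {3.0, 400.0};
        double[] mins = {0.0, 0.0};
        double[] maxs = {3.0, 400.0};
        System.out.println(distance(a, b, mins, maxs, true));   // raw attribute values
        System.out.println(distance(a, b, mins, maxs, false));  // normalized attribute values
    }
}
```

Without normalization the second attribute dominates the result because of its larger scale; with normalization both attributes contribute equally.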
- Version:
- $Revision: 8034 $
- Author:
- Gabi Schmidberger (gabi@cs.waikato.ac.nz), Ashraf M. Kibriya (amk14@cs.waikato.ac.nz), FracPete (fracpete at waikato dot ac dot nz)
-
Field Summary
Fields inherited from class weka.core.NormalizableDistance
R_MAX, R_MIN, R_WIDTH
-
Constructor Summary
Constructor: Description
EuclideanDistance(): Constructs a Euclidean distance object; the instances must still be set.
EuclideanDistance(Instances data): Constructs a Euclidean distance object and automatically initializes the ranges.
-
Method Summary
Modifier and Type, Method: Description
int closestPoint(Instance instance, Instances allPoints, int[] pointList): Returns the index of the closest point to the current instance.
double distance(Instance first, Instance second): Calculates the distance between two instances.
double distance(Instance first, Instance second, PerformanceStats stats): Calculates the distance (or similarity) between two instances.
double getMiddle(double[] ranges): Returns value in the middle of the two parameter values.
String getRevision(): Returns the revision string.
TechnicalInformation getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
String globalInfo(): Returns a string describing this object.
void postProcessDistances(double[] distances): Does post processing of the distances (if necessary) returned by distance(Instance first, Instance second, double cutOffValue).
double sqDifference(int index, double val1, double val2): Returns the squared difference of two values of an attribute.
boolean valueIsSmallerEqual(Instance instance, int dim, double value): Returns true if the value of the given dimension is smaller than or equal to the value to be compared with.
Methods inherited from class weka.core.NormalizableDistance
attributeIndicesTipText, clean, distance, distance, dontNormalizeTipText, getAttributeIndices, getDontNormalize, getInstances, getInvertSelection, getOptions, getRanges, initializeRanges, initializeRanges, initializeRanges, initializeRangesEmpty, inRanges, invertSelectionTipText, listOptions, rangesSet, setAttributeIndices, setDontNormalize, setInstances, setInvertSelection, setOptions, toString, update, updateRanges, updateRanges, updateRanges, updateRangesFirst
-
Constructor Details
-
EuclideanDistance
public EuclideanDistance()
Constructs a Euclidean distance object; the instances must still be set.
-
EuclideanDistance
public EuclideanDistance(Instances data)
Constructs a Euclidean distance object and automatically initializes the ranges.
- Parameters:
data
- the instances the distance function should work on
-
-
Method Details
-
globalInfo
public String globalInfo()
Returns a string describing this object.
- Specified by:
globalInfo in class NormalizableDistance
- Returns:
- a description of the evaluator suitable for displaying in the explorer/experimenter GUI
-
getTechnicalInformation
public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
distance
public double distance(Instance first, Instance second)
Calculates the distance between two instances.
- Specified by:
distance in interface DistanceFunction
- Overrides:
distance in class NormalizableDistance
- Parameters:
first
- the first instance
second
- the second instance
- Returns:
- the distance between the two given instances
-
distance
public double distance(Instance first, Instance second, PerformanceStats stats)
Calculates the distance (or similarity) between two instances. The returned distance must later be passed to postProcessDistances to put it on the correct scale.
Note: do not mix the use of this method with distance(Instance first, Instance second), which already does the post-processing. Consider passing Double.POSITIVE_INFINITY as the cutOffValue to this method and post-processing all the distances afterwards.
- Specified by:
distance in interface DistanceFunction
- Overrides:
distance in class NormalizableDistance
- Parameters:
first
- the first instance
second
- the second instance
stats
- the structure for storing performance statistics
- Returns:
- the distance between the two given instances or Double.POSITIVE_INFINITY.
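The cut-off behaviour described here can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not the Weka code: the class CutoffDistanceSketch and its method are hypothetical, instances are represented as plain double arrays, and the cut-off is assumed to be compared against the running sum of squared differences.

```java
// Illustrative sketch only; not the Weka implementation.
final class CutoffDistanceSketch {

    /**
     * Accumulates squared differences and stops as soon as the running sum
     * exceeds cutOffValue, returning Double.POSITIVE_INFINITY in that case.
     * The result is a squared distance; a square root is still needed afterwards.
     */
    static double squaredDistance(double[] first, double[] second, double cutOffValue) {
        double sum = 0.0;
        for (int i = 0; i < first.length; i++) {
            double diff = first[i] - second[i];
            sum += diff * diff;
            if (sum > cutOffValue) {
                return Double.POSITIVE_INFINITY;  // too far away, stop early
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] a = {0.0, 0.0};
        double[] b = {3.0, 4.0};
        // No cut-off: the full squared distance is returned.
        System.out.println(squaredDistance(a, b, Double.POSITIVE_INFINITY));
        // Cut-off of 10: the running sum exceeds it, so infinity is returned.
        System.out.println(squaredDistance(a, b, 10.0));
    }
}
```

Passing Double.POSITIVE_INFINITY as the cut-off disables the early exit, which matches the recommendation above when all distances are post-processed afterwards.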
-
postProcessDistances
public void postProcessDistances(double[] distances)
Does post processing of the distances (if necessary) returned by distance(Instance first, Instance second, double cutOffValue). This is necessary to get the correct distances if distance(Instance first, Instance second, double cutOffValue) is used, because that method actually returns the squared distance to avoid inaccuracies arising from floating-point comparison.
- Specified by:
postProcessDistances in interface DistanceFunction
- Overrides:
postProcessDistances in class NormalizableDistance
- Parameters:
distances
- the distances to post-process
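Since the distances to be post-processed are squared distances, the post-processing step can be pictured as taking the square root of each array element in place. The sketch below assumes exactly that; the class PostProcessSketch and its method are hypothetical stand-ins rather than the Weka implementation.

```java
import java.util.Arrays;

// Illustrative sketch only; not the Weka implementation.
final class PostProcessSketch {

    /** Replaces each squared distance with its square root, in place. */
    static void postProcess(double[] squaredDistances) {
        for (int i = 0; i < squaredDistances.length; i++) {
            squaredDistances[i] = Math.sqrt(squaredDistances[i]);
        }
    }

    public static void main(String[] args) {
        double[] distances = {25.0, 4.0, Double.POSITIVE_INFINITY};
        postProcess(distances);
        System.out.println(Arrays.toString(distances)); // prints [5.0, 2.0, Infinity]
    }
}
```

Because the square root is monotonic, comparing squared distances gives the same ordering as comparing the final distances, so the root only has to be taken once at the end.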
-
sqDifference
public double sqDifference(int index, double val1, double val2)
Returns the squared difference of two values of an attribute.
- Parameters:
index
- the attribute index
val1
- the first value
val2
- the second value
- Returns:
- the squared difference
-
getMiddle
public double getMiddle(double[] ranges)
Returns value in the middle of the two parameter values.
- Parameters:
ranges
- the ranges to this dimension
- Returns:
- the middle value
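One way to read this is as the midpoint of a dimension's range. The sketch below assumes the ranges array stores a minimum, a maximum, and a width, in the spirit of the inherited R_MIN, R_MAX, and R_WIDTH fields; the index values used here and the class RangeMiddleSketch are assumptions made for illustration only.

```java
// Illustrative sketch only; not the Weka implementation.
final class RangeMiddleSketch {

    // Hypothetical positions inside the ranges array (assumption).
    static final int MIN = 0;
    static final int MAX = 1;
    static final int WIDTH = 2;

    /** Returns the midpoint of the range: minimum plus half the width. */
    static double middle(double[] ranges) {
        return ranges[MIN] + ranges[WIDTH] * 0.5;
    }

    public static void main(String[] args) {
        double[] ranges = new double[3];
        ranges[MIN] = 2.0;
        ranges[MAX] = 10.0;
        ranges[WIDTH] = ranges[MAX] - ranges[MIN];
        System.out.println(middle(ranges)); // prints 6.0
    }
}
```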
-
closestPoint
public int closestPoint(Instance instance, Instances allPoints, int[] pointList) throws Exception
Returns the index of the closest point to the current instance. The index refers to the Instances object that is passed as the second parameter.
- Parameters:
instance
- the instance to assign a cluster to
allPoints
- all points
pointList
- the list of points
- Returns:
- the index of the closest point
- Throws:
Exception
- if something goes wrong
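The relationship between allPoints, pointList, and the returned index can be sketched as follows. The code assumes that pointList holds indices into allPoints and that only those candidates are examined; it uses plain double arrays and a hypothetical class ClosestPointSketch instead of the Weka types, so it is an illustration rather than the actual implementation.

```java
// Illustrative sketch only; not the Weka implementation.
final class ClosestPointSketch {

    /** Squared Euclidean distance; sufficient for finding the nearest point. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /**
     * Examines only the points whose indices are listed in pointList and
     * returns the index (into allPoints) of the one closest to instance.
     * Returns -1 if pointList is empty (a choice made for this sketch).
     */
    static int closestPoint(double[] instance, double[][] allPoints, int[] pointList) {
        double minDist = Double.POSITIVE_INFINITY;
        int best = -1;
        for (int idx : pointList) {
            double dist = squaredDistance(instance, allPoints[idx]);
            if (dist < minDist) {
                minDist = dist;
                best = idx;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] allPoints = {{0.0, 0.0}, {10.0, 10.0}, {1.0, 1.0}, {5.0, 5.0}};
        double[] query = {0.9, 0.9};
        System.out.println(closestPoint(query, allPoints, new int[]{0, 1, 2, 3})); // prints 2
        System.out.println(closestPoint(query, allPoints, new int[]{1, 3}));       // prints 3
    }
}
```

In the second call the globally nearest point (index 2) is not in pointList, so the nearest listed candidate is returned instead.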
-
valueIsSmallerEqual
public boolean valueIsSmallerEqual(Instance instance, int dim, double value)
Returns true if the value of the given dimension is smaller than or equal to the value to be compared with.
- Parameters:
instance
- the instance from which the value should be taken
dim
- the dimension of the value
value
- the value to compare with
- Returns:
- true if the value of the instance is smaller than or equal to value
-
getRevision
public String getRevision()
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Returns:
- the revision
-