Package weka.core
Class NormalizableDistance
java.lang.Object
weka.core.NormalizableDistance
- All Implemented Interfaces:
Serializable, DistanceFunction, OptionHandler, RevisionHandler
- Direct Known Subclasses:
ChebyshevDistance, EuclideanDistance, ManhattanDistance, MinkowskiDistance
public abstract class NormalizableDistance
extends Object
implements DistanceFunction, OptionHandler, Serializable, RevisionHandler
Represents the abstract ancestor for normalizable distance functions, like
Euclidean or Manhattan distance.
- Version:
- $Revision: 14813 $
- Author:
- Fracpete (fracpete at waikato dot ac dot nz), Gabi Schmidberger (gabi@cs.waikato.ac.nz) -- original code from weka.core.EuclideanDistance, Ashraf M. Kibriya (amk14@cs.waikato.ac.nz) -- original code from weka.core.EuclideanDistance
-
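Field Summary
Modifier and Type | Field | Description
static final int | R_MAX | Index in ranges for MAX.
static final int | R_MIN | Index in ranges for MIN.
static final int | R_WIDTH | Index in ranges for WIDTH.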
-
Constructor Summary
Constructor | Description
NormalizableDistance() | Invalidates the distance function; the Instances must still be set.
NormalizableDistance(Instances data) | Initializes the distance function and automatically initializes the ranges.
-
Method Summary
Modifier and Type | Method | Description
String | attributeIndicesTipText() | Returns the tip text for this property.
void | clean() | Frees any references to training instances.
double | distance(Instance first, Instance second) | Calculates the distance between two instances.
double | distance(Instance first, Instance second, double cutOffValue) | Calculates the distance between two instances.
double | distance(Instance first, Instance second, double cutOffValue, PerformanceStats stats) | Calculates the distance between two instances.
double | distance(Instance first, Instance second, PerformanceStats stats) | Calculates the distance between two instances.
String | dontNormalizeTipText() | Returns the tip text for this property.
String | getAttributeIndices() | Gets the range of attributes used in the calculation of the distance.
boolean | getDontNormalize() | Gets whether the attribute values are to be normalized in the distance calculation.
Instances | getInstances() | Returns the instances currently set.
boolean | getInvertSelection() | Gets whether the matching sense of attribute indices is inverted or not.
String[] | getOptions() | Gets the current settings.
double[][] | getRanges() | Method to get the ranges.
abstract String | globalInfo() | Returns a string describing this object.
double[][] | initializeRanges() | Initializes the ranges using all instances of the dataset.
double[][] | initializeRanges(int[] instList) | Initializes the ranges of a subset of the instances of this dataset.
double[][] | initializeRanges(int[] instList, int startIdx, int endIdx) | Initializes the ranges of a subset of the instances of this dataset.
void | initializeRangesEmpty(int numAtt, double[][] ranges) | Used to initialize the ranges.
boolean | inRanges(Instance instance, double[][] ranges) | Tests if an instance is within the given ranges.
String | invertSelectionTipText() | Returns the tip text for this property.
Enumeration<Option> | listOptions() | Returns an enumeration describing the available options.
void | postProcessDistances(double[] distances) | Does nothing; derived classes may override it.
boolean | rangesSet() | Checks if ranges are set.
void | setAttributeIndices(String value) | Sets the range of attributes to use in the calculation of the distance.
void | setDontNormalize(boolean dontNormalize) | Sets whether the attribute values are to be normalized in the distance calculation.
void | setInstances(Instances insts) | Sets the instances.
void | setInvertSelection(boolean value) | Sets whether the matching sense of attribute indices is inverted or not.
void | setOptions(String[] options) | Parses a given list of options.
String | toString() | Returns an empty string.
void | update(Instance ins) | Updates the distance function (if necessary) for the newly added instance.
void | updateRanges(Instance instance) | Updates the ranges with a new instance.
double[][] | updateRanges(Instance instance, double[][] ranges) | Updates the ranges given a new instance.
void | updateRanges(Instance instance, int numAtt, double[][] ranges) | Updates the minimum, maximum and width values for all the attributes based on a new instance.
void | updateRangesFirst(Instance instance, int numAtt, double[][] ranges) | Used to initialize the ranges.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface weka.core.RevisionHandler
getRevision
-
Field Details
-
R_MIN
public static final int R_MIN
Index in ranges for MIN.
-
R_MAX
public static final int R_MAX
Index in ranges for MAX.
-
R_WIDTH
public static final int R_WIDTH
Index in ranges for WIDTH.
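The three constants index into the per-attribute rows of the ranges array. The sketch below illustrates that layout in plain Java. It makes several assumptions that this page does not state: the constant values 0, 1 and 2, one row per attribute, and min-max scaling as the normalization. The class RangeLayoutSketch and its methods are hypothetical and do not use the Weka classes.

```java
/** Hypothetical stand-alone sketch; it does not use the Weka classes. */
final class RangeLayoutSketch {
    // Assumed values: this page only says the constants are indices into the ranges.
    static final int R_MIN = 0;
    static final int R_MAX = 1;
    static final int R_WIDTH = 2;

    /** Builds the {min, max, width} row for one attribute. */
    static double[] rangeOf(double min, double max) {
        double[] range = new double[3];
        range[R_MIN] = min;
        range[R_MAX] = max;
        range[R_WIDTH] = max - min;
        return range;
    }

    /** Min-max normalization (an assumption); a zero-width attribute maps to 0. */
    static double normalize(double value, double[] range) {
        if (range[R_WIDTH] == 0.0) {
            return 0.0;
        }
        return (value - range[R_MIN]) / range[R_WIDTH];
    }

    public static void main(String[] args) {
        // One row per attribute: ranges[attributeIndex][R_MIN | R_MAX | R_WIDTH].
        double[][] ranges = { rangeOf(1.0, 5.0), rangeOf(10.0, 30.0) };
        System.out.println(normalize(3.0, ranges[0]));
        System.out.println(normalize(30.0, ranges[1]));
    }
}
```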
-
Constructor Details
-
NormalizableDistance
public NormalizableDistance()
Invalidates the distance function; the Instances must still be set.
-
NormalizableDistance
public NormalizableDistance(Instances data)
Initializes the distance function and automatically initializes the ranges.
- Parameters:
data
- the instances the distance function should work on
-
-
Method Details
-
globalInfo
public abstract String globalInfo()
Returns a string describing this object.
- Returns:
- a description of the evaluator suitable for displaying in the explorer/experimenter gui
-
listOptions
public Enumeration<Option> listOptions()
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Returns:
- an enumeration of all the available options.
-
getOptions
public String[] getOptions()
Gets the current settings. Returns empty array.
- Specified by:
getOptions in interface OptionHandler
- Returns:
- an array of strings suitable for passing to setOptions()
-
setOptions
public void setOptions(String[] options) throws Exception
Parses a given list of options.
- Specified by:
setOptions in interface OptionHandler
- Parameters:
options
- the list of options as an array of strings
- Throws:
Exception
- if an option is not supported
-
dontNormalizeTipText
public String dontNormalizeTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setDontNormalize
public void setDontNormalize(boolean dontNormalize)
Sets whether the attribute values are to be normalized in the distance calculation.
- Parameters:
dontNormalize
- if true the values are not normalized
-
getDontNormalize
public boolean getDontNormalize()
Gets whether the attribute values are to be normalized in the distance calculation (default false, i.e. attribute values are normalized).
- Returns:
- false if values get normalized
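To show what the flag changes, the following sketch sums absolute per-attribute differences with and without scaling by the attribute width. It is a stand-alone illustration and not Weka code: the class DontNormalizeSketch, the dense double[] instances, the {min, max, width} row layout and the Manhattan-style sum are all assumptions.

```java
/** Hypothetical stand-alone sketch; it does not use the Weka classes. */
final class DontNormalizeSketch {
    /**
     * Sums absolute per-attribute differences (Manhattan style).
     * ranges[j] is assumed to hold {min, max, width} for attribute j.
     */
    static double distance(double[] a, double[] b, double[][] ranges,
                           boolean dontNormalize) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) {
            double diff = Math.abs(a[j] - b[j]);
            double width = ranges[j][2];
            if (!dontNormalize && width > 0.0) {
                // Equal to |norm(a) - norm(b)| under min-max scaling.
                diff /= width;
            }
            sum += diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] ranges = { {1.0, 5.0, 4.0}, {10.0, 30.0, 20.0} };
        double[] a = {1.0, 10.0};
        double[] b = {5.0, 30.0};
        System.out.println(distance(a, b, ranges, false)); // normalized
        System.out.println(distance(a, b, ranges, true));  // raw values
    }
}
```

Without normalization the second attribute dominates the sum simply because its values span a wider interval.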
-
attributeIndicesTipText
public String attributeIndicesTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setAttributeIndices
public void setAttributeIndices(String value)
Sets the range of attributes to use in the calculation of the distance. The indices start from 1; 'first' and 'last' are valid as well. E.g.: first-3,5,6-last
- Specified by:
setAttributeIndices in interface DistanceFunction
- Parameters:
value
- the new attribute index range
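The range syntax can be illustrated with a small parser. The sketch below turns a string such as first-3,5,6-last into a 0-based selection mask and optionally inverts it, which is how the invert-selection setting is assumed to interact with the range. The class AttributeIndexSketch is hypothetical, does no validation of malformed or out-of-bounds input, and is not the parser Weka uses.

```java
/** Hypothetical stand-alone sketch; it does not use the Weka classes. */
final class AttributeIndexSketch {
    /** Returns a mask where mask[i] is true if 0-based attribute i is selected. */
    static boolean[] select(String spec, int numAtt, boolean invert) {
        boolean[] active = new boolean[numAtt];
        for (String part : spec.split(",")) {
            String p = part.trim();
            int dash = p.indexOf('-');
            int lo;
            int hi;
            if (dash < 0) {
                lo = toIndex(p, numAtt);
                hi = lo;
            } else {
                lo = toIndex(p.substring(0, dash), numAtt);
                hi = toIndex(p.substring(dash + 1), numAtt);
            }
            for (int i = lo; i <= hi; i++) {
                active[i] = true;
            }
        }
        if (invert) {
            for (int i = 0; i < numAtt; i++) {
                active[i] = !active[i];
            }
        }
        return active;
    }

    /** Converts a 1-based token, 'first' or 'last' to a 0-based index. */
    static int toIndex(String token, int numAtt) {
        String t = token.trim();
        if (t.equals("first")) {
            return 0;
        }
        if (t.equals("last")) {
            return numAtt - 1;
        }
        return Integer.parseInt(t) - 1;
    }

    public static void main(String[] args) {
        boolean[] mask = select("first-3,5,6-last", 8, false);
        System.out.println(java.util.Arrays.toString(mask));
    }
}
```

With eight attributes, the example string selects every attribute except the fourth; inverting the selection leaves only the fourth.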
-
getAttributeIndices
public String getAttributeIndices()
Gets the range of attributes used in the calculation of the distance.
- Specified by:
getAttributeIndices in interface DistanceFunction
- Returns:
- the attribute index range
-
invertSelectionTipText
public String invertSelectionTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setInvertSelection
public void setInvertSelection(boolean value)
Sets whether the matching sense of attribute indices is inverted or not.
- Specified by:
setInvertSelection in interface DistanceFunction
- Parameters:
value
- if true the matching sense is inverted
-
getInvertSelection
public boolean getInvertSelection()
Gets whether the matching sense of attribute indices is inverted or not.
- Specified by:
getInvertSelection in interface DistanceFunction
- Returns:
- true if the matching sense is inverted
-
setInstances
public void setInstances(Instances insts)
Sets the instances.
- Specified by:
setInstances in interface DistanceFunction
- Parameters:
insts
- the instances to use
-
getInstances
public Instances getInstances()
Returns the instances currently set.
- Specified by:
getInstances in interface DistanceFunction
- Returns:
- the current instances
-
postProcessDistances
public void postProcessDistances(double[] distances)
Does nothing; derived classes may override it.
- Specified by:
postProcessDistances in interface DistanceFunction
- Parameters:
distances
- the distances to post-process
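One plausible reason for a subclass to override this hook is to defer an expensive final step, such as a square root, until a whole batch of distances is known. The sketch below shows that pattern under stated assumptions: DistanceBaseSketch and SquaredEuclideanSketch are hypothetical stand-ins, not the Weka classes, and whether any particular Weka subclass works this way is not confirmed by this page.

```java
/** Hypothetical stand-in for the abstract base; the hook does nothing by default. */
abstract class DistanceBaseSketch {
    void postProcessDistances(double[] distances) {
        // Intentionally empty, mirroring the documented default behaviour.
    }
}

/** Hypothetical subclass that defers the square root to post-processing. */
final class SquaredEuclideanSketch extends DistanceBaseSketch {
    /** Returns the squared Euclidean distance between two dense vectors. */
    double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) {
            double diff = a[j] - b[j];
            sum += diff * diff;
        }
        return sum;
    }

    /** Replaces each squared distance with its square root, in place. */
    @Override
    void postProcessDistances(double[] distances) {
        for (int i = 0; i < distances.length; i++) {
            distances[i] = Math.sqrt(distances[i]);
        }
    }

    public static void main(String[] args) {
        SquaredEuclideanSketch fn = new SquaredEuclideanSketch();
        double[] d = {
            fn.distance(new double[] {0, 0}, new double[] {3, 4}),
            fn.distance(new double[] {1, 1}, new double[] {1, 3})
        };
        fn.postProcessDistances(d);
        System.out.println(d[0] + " " + d[1]);
    }
}
```

Because the square root is monotonic, neighbours can be ranked on the squared values and only the retained distances need post-processing.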
-
update
public void update(Instance ins)
Updates the distance function (if necessary) for the newly added instance.
- Specified by:
update in interface DistanceFunction
- Parameters:
ins
- the instance to add
-
distance
public double distance(Instance first, Instance second)
Calculates the distance between two instances.
- Specified by:
distance in interface DistanceFunction
- Parameters:
first
- the first instance
second
- the second instance
- Returns:
- the distance between the two given instances
-
distance
public double distance(Instance first, Instance second, PerformanceStats stats)
Calculates the distance between two instances.
- Specified by:
distance in interface DistanceFunction
- Parameters:
first
- the first instance
second
- the second instance
stats
- the performance stats object
- Returns:
- the distance between the two given instances
-
distance
public double distance(Instance first, Instance second, double cutOffValue)
Calculates the distance between two instances. In nearest neighbour search this can be faster (if the distance function class in use supports it) because the calculation takes the cut-off, or maximum, distance into account. Depending on the distance function class, the distances may need post-processing with postProcessDistances(double[]) when this method is used.
- Specified by:
distance in interface DistanceFunction
- Parameters:
first
- the first instance
second
- the second instance
cutOffValue
- If the distance being calculated becomes larger than cutOffValue then the rest of the calculation is discarded.
- Returns:
- the distance between the two given instances or Double.POSITIVE_INFINITY if the distance being calculated becomes larger than cutOffValue.
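The cut-off behaviour can be sketched as an early exit from the accumulation loop. The example below is a minimal illustration for a Euclidean-style distance on dense vectors. The class CutOffDistanceSketch is hypothetical, and the choice to compare the running sum of squares against the squared cut-off and to take the root inside the method is an assumption, not a description of any Weka subclass.

```java
/** Hypothetical stand-alone sketch; it does not use the Weka classes. */
final class CutOffDistanceSketch {
    /**
     * Euclidean distance that stops as soon as the partial sum of squares
     * exceeds the squared cut-off, returning positive infinity in that case.
     */
    static double distance(double[] first, double[] second, double cutOffValue) {
        double limit = cutOffValue * cutOffValue;
        double sum = 0.0;
        for (int j = 0; j < first.length; j++) {
            double diff = first[j] - second[j];
            sum += diff * diff;
            if (sum > limit) {
                // The remaining attributes are skipped.
                return Double.POSITIVE_INFINITY;
            }
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[] a = {0.0, 0.0};
        double[] b = {3.0, 4.0};
        System.out.println(distance(a, b, 10.0));
        System.out.println(distance(a, b, 4.0));
    }
}
```

In a nearest neighbour search the cut-off would typically be the distance to the worst neighbour kept so far, so candidates that cannot improve the result are abandoned early.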
-
distance
public double distance(Instance first, Instance second, double cutOffValue, PerformanceStats stats)
Calculates the distance between two instances. In nearest neighbour search this can be faster (if the distance function class in use supports it) because the calculation takes the cut-off, or maximum, distance into account. Depending on the distance function class, the distances may need post-processing with postProcessDistances(double[]) when this method is used.
- Specified by:
distance in interface DistanceFunction
- Parameters:
first
- the first instance
second
- the second instance
cutOffValue
- If the distance being calculated becomes larger than cutOffValue then the rest of the calculation is discarded.
stats
- the performance stats object
- Returns:
- the distance between the two given instances or Double.POSITIVE_INFINITY if the distance being calculated becomes larger than cutOffValue.
-
initializeRanges
public double[][] initializeRanges()
Initializes the ranges using all instances of the dataset. Sets m_Ranges.
- Returns:
- the ranges
-
updateRangesFirst
public void updateRangesFirst(Instance instance, int numAtt, double[][] ranges)
Used to initialize the ranges. To save time, the values of the first instance are used: low and high are set to the values of the first instance and width is set to zero.
- Parameters:
instance
- the new instance
numAtt
- number of attributes in the model (ignored)
ranges
- low, high and width values for all attributes
-
updateRanges
public void updateRanges(Instance instance, int numAtt, double[][] ranges)
Updates the minimum, maximum and width values for all the attributes based on a new instance.
- Parameters:
instance
- the new instance
numAtt
- number of attributes in the model (ignored)
ranges
- low, high and width values for all attributes
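The two steps, seeding the ranges from the first instance and then widening them with each later instance, can be sketched as follows. This is a stand-alone illustration, not Weka code: RangeUpdateSketch is hypothetical, instances are dense double[] arrays, the index values 0, 1 and 2 are assumed, and treating NaN as a missing value that is skipped is also an assumption.

```java
/** Hypothetical stand-alone sketch; it does not use the Weka classes. */
final class RangeUpdateSketch {
    // Assumed index values for the {min, max, width} row of each attribute.
    static final int R_MIN = 0;
    static final int R_MAX = 1;
    static final int R_WIDTH = 2;

    /** Seeds the ranges: low and high take the instance values, width is zero. */
    static void updateRangesFirst(double[] instance, double[][] ranges) {
        for (int j = 0; j < instance.length; j++) {
            ranges[j][R_MIN] = instance[j];
            ranges[j][R_MAX] = instance[j];
            ranges[j][R_WIDTH] = 0.0;
        }
    }

    /** Widens the ranges so that they also cover the new instance. */
    static void updateRanges(double[] instance, double[][] ranges) {
        for (int j = 0; j < instance.length; j++) {
            double v = instance[j];
            if (Double.isNaN(v)) {
                continue; // NaN stands in for a missing value here (assumption)
            }
            if (v < ranges[j][R_MIN]) {
                ranges[j][R_MIN] = v;
            }
            if (v > ranges[j][R_MAX]) {
                ranges[j][R_MAX] = v;
            }
            ranges[j][R_WIDTH] = ranges[j][R_MAX] - ranges[j][R_MIN];
        }
    }

    public static void main(String[] args) {
        double[][] ranges = new double[2][3];
        updateRangesFirst(new double[] {2.0, 7.0}, ranges);
        updateRanges(new double[] {5.0, 3.0}, ranges);
        System.out.println(java.util.Arrays.deepToString(ranges));
    }
}
```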
-
initializeRangesEmpty
public void initializeRangesEmpty(int numAtt, double[][] ranges)
Used to initialize the ranges.
- Parameters:
numAtt
- number of attributes in the model
ranges
- low, high and width values for all attributes
-
updateRanges
public double[][] updateRanges(Instance instance, double[][] ranges)
Updates the ranges given a new instance.
- Parameters:
instance
- the new instance
ranges
- low, high and width values for all attributes
- Returns:
- the updated ranges
-
initializeRanges
public double[][] initializeRanges(int[] instList) throws Exception
Initializes the ranges of a subset of the instances of this dataset. Because only a subset is used, m_Ranges is not set.
- Parameters:
instList
- list of indexes of the subset
- Returns:
- the ranges
- Throws:
Exception
- if something goes wrong
-
initializeRanges
public double[][] initializeRanges(int[] instList, int startIdx, int endIdx) throws Exception
Initializes the ranges of a subset of the instances of this dataset. Because only a subset is used, m_Ranges is not set. The caller of this method should ensure that the supplied start and end indices are valid (startIdx <= endIdx, endIdx < instList.length, etc.) and correct.
- Parameters:
instList
- list of indexes of the instances
startIdx
- start index of the subset of instances in the indices array
endIdx
- end index of the subset of instances in the indices array
- Returns:
- the ranges
- Throws:
Exception
- if something goes wrong
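The role of the index list and the start and end positions can be shown with a stand-alone sketch. It computes {min, max, width} rows over the rows of a dense data matrix selected by instList[startIdx..endIdx] and returns a fresh array instead of storing it, in line with the note that m_Ranges is not set. SubsetRangesSketch is hypothetical, and treating endIdx as inclusive is an assumption suggested only by the documented constraint that the end index must be smaller than instList.length.

```java
/** Hypothetical stand-alone sketch; it does not use the Weka classes. */
final class SubsetRangesSketch {
    /**
     * Computes {min, max, width} per attribute over the rows of data whose
     * indices are stored in instList[startIdx..endIdx] (endIdx inclusive).
     */
    static double[][] initializeRanges(double[][] data, int[] instList,
                                       int startIdx, int endIdx) {
        int numAtt = data[0].length;
        double[][] ranges = new double[numAtt][3];
        for (int j = 0; j < numAtt; j++) {
            ranges[j][0] = Double.POSITIVE_INFINITY;
            ranges[j][1] = Double.NEGATIVE_INFINITY;
        }
        for (int i = startIdx; i <= endIdx; i++) {
            double[] row = data[instList[i]];
            for (int j = 0; j < numAtt; j++) {
                ranges[j][0] = Math.min(ranges[j][0], row[j]);
                ranges[j][1] = Math.max(ranges[j][1], row[j]);
            }
        }
        for (int j = 0; j < numAtt; j++) {
            ranges[j][2] = ranges[j][1] - ranges[j][0];
        }
        return ranges;
    }

    public static void main(String[] args) {
        double[][] data = { {9.0}, {1.0}, {4.0}, {7.0}, {2.0} };
        int[] instList = {4, 0, 2, 3};
        // Positions 1..2 of instList refer to rows 0 and 2 of data.
        double[][] ranges = initializeRanges(data, instList, 1, 2);
        System.out.println(java.util.Arrays.toString(ranges[0]));
    }
}
```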
-
updateRanges
public void updateRanges(Instance instance)
Updates the ranges with a new instance.
- Parameters:
instance
- the new instance
-
inRanges
public boolean inRanges(Instance instance, double[][] ranges)
Tests if an instance is within the given ranges. Missing values are skipped. Inefficient when using sparse data.
- Parameters:
instance
- the instance
ranges
- the ranges the instance is tested to be in
- Returns:
- true if instance is within the ranges
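A containment test of this kind can be sketched on dense arrays as follows. InRangesSketch is hypothetical and does not use the Weka classes. Treating NaN as a missing value, and treating both bounds as inclusive, are assumptions that this page does not confirm.

```java
/** Hypothetical stand-alone sketch; it does not use the Weka classes. */
final class InRangesSketch {
    /**
     * Returns true if every non-missing value lies within [min, max] of its
     * attribute; ranges[j] is assumed to hold {min, max, width}.
     */
    static boolean inRanges(double[] instance, double[][] ranges) {
        for (int j = 0; j < ranges.length; j++) {
            double v = instance[j];
            if (Double.isNaN(v)) {
                continue; // missing values are skipped
            }
            if (v < ranges[j][0] || v > ranges[j][1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[][] ranges = { {1.0, 5.0, 4.0}, {10.0, 30.0, 20.0} };
        System.out.println(inRanges(new double[] {3.0, 20.0}, ranges));
        System.out.println(inRanges(new double[] {6.0, 20.0}, ranges));
    }
}
```

The loop visits every attribute, including those that would be implicit zeros in a sparse representation, which is consistent with the remark about sparse data.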
-
rangesSet
public boolean rangesSet()
Checks if ranges are set.
- Returns:
- true if ranges are set
-
getRanges
public double[][] getRanges() throws Exception
Method to get the ranges.
- Returns:
- the ranges
- Throws:
Exception
- if no ranges are set yet
-
clean
public void clean()
Description copied from interface: DistanceFunction
Frees any references to training instances.
- Specified by:
clean in interface DistanceFunction
-
toString
public String toString()
Returns an empty string.
-