Package weka.clusterers
Class FarthestFirst
java.lang.Object
weka.clusterers.AbstractClusterer
weka.clusterers.RandomizableClusterer
weka.clusterers.FarthestFirst
- All Implemented Interfaces:
- Serializable, Cloneable, Clusterer, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler
Cluster data using the FarthestFirst algorithm.
For more information see:
Hochbaum, Shmoys (1985). A best possible heuristic for the k-center problem. Mathematics of Operations Research. 10(2):180-184.
Sanjoy Dasgupta: Performance Guarantees for Hierarchical Clustering. In: 15th Annual Conference on Computational Learning Theory, 351-363, 2002.
Notes:
- works as a fast simple approximate clusterer
- modelled after SimpleKMeans, might be a useful initializer for it
BibTeX:
@article{Hochbaum1985,
   author = {Hochbaum and Shmoys},
   journal = {Mathematics of Operations Research},
   number = {2},
   pages = {180-184},
   title = {A best possible heuristic for the k-center problem},
   volume = {10},
   year = {1985}
}

@inproceedings{Dasgupta2002,
   author = {Sanjoy Dasgupta},
   booktitle = {15th Annual Conference on Computational Learning Theory},
   pages = {351-363},
   publisher = {Springer},
   title = {Performance Guarantees for Hierarchical Clustering},
   year = {2002}
}

Valid options are:
-N <num> number of clusters. (default = 2).
-S <num> Random number seed. (default 1)
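The farthest-first traversal that gives this class its name can be illustrated in plain Java. The sketch below is not Weka's implementation: it assumes purely numeric points and squared Euclidean distance, and the names FarthestFirstSketch, chooseCenters and squaredDistance are hypothetical. It draws the first center at random from a seed, then repeatedly adds the point that is farthest from its nearest already-chosen center.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Illustrative sketch only; hypothetical names, not Weka's FarthestFirst code.
class FarthestFirstSketch {

    // Squared Euclidean distance between two numeric points of equal length.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Returns the indices of k centers: the first is drawn at random, each
    // later one is the point farthest from its nearest chosen center.
    static List<Integer> chooseCenters(double[][] points, int k, long seed) {
        int n = points.length;
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("k must be between 1 and " + n);
        }
        List<Integer> centers = new ArrayList<>();
        double[] minDist = new double[n]; // distance to the nearest chosen center
        Arrays.fill(minDist, Double.POSITIVE_INFINITY);

        int next = new Random(seed).nextInt(n);
        for (int c = 0; c < k; c++) {
            centers.add(next);
            int farthest = 0;
            for (int i = 0; i < n; i++) {
                // Refresh the distance to the nearest center, then track the maximum.
                minDist[i] = Math.min(minDist[i], squaredDistance(points[i], points[next]));
                if (minDist[i] > minDist[farthest]) {
                    farthest = i;
                }
            }
            next = farthest; // candidate center for the following round
        }
        return centers;
    }

    public static void main(String[] args) {
        double[][] points = { {0.0}, {1.0}, {2.0}, {10.0} };
        System.out.println(chooseCenters(points, 2, 1L));
    }
}
```

Because every round only updates one distance per point, the whole selection costs on the order of n times k distance computations, which is consistent with the note above describing the method as a fast, simple, approximate clusterer.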
- Version:
- $Revision: 15519 $
- Author:
- Bernhard Pfahringer (bernhard@cs.waikato.ac.nz)
Constructor Summary
FarthestFirst()
Method Summary
Modifier and Type | Method | Description
void | buildClusterer(Instances data) | Generates a clusterer.
int | clusterInstance(Instance instance) | Classifies a given instance.
Capabilities | getCapabilities() | Returns default capabilities of the clusterer.
Instances | getClusterCentroids() | Gets the centroids found by FarthestFirst.
int | getNumClusters() | Gets the number of clusters to generate.
String[] | getOptions() | Gets the current settings of FarthestFirst.
String | getRevision() | Returns the revision string.
TechnicalInformation | getTechnicalInformation() | Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
String | globalInfo() | Returns a string describing this clusterer.
Enumeration<Option> | listOptions() | Returns an enumeration describing the available options.
static void | main(String[] argv) | Main method for testing this class.
int | numberOfClusters() | Returns the number of clusters.
String | numClustersTipText() | Returns the tip text for this property.
void | setNumClusters(int n) | Sets the number of clusters to generate.
void | setOptions(String[] options) | Parses a given list of options.
String | toString() | Returns a string describing this clusterer.
Methods inherited from class weka.clusterers.RandomizableClusterer
getSeed, seedTipText, setSeed
Methods inherited from class weka.clusterers.AbstractClusterer
debugTipText, distributionForInstance, doNotCheckCapabilitiesTipText, forName, getDebug, getDoNotCheckCapabilities, makeCopies, makeCopy, postExecution, preExecution, run, runClusterer, setDebug, setDoNotCheckCapabilities
-
Constructor Details
-
FarthestFirst
public FarthestFirst()
-
-
Method Details
-
globalInfo
Returns a string describing this clusterer.
- Returns:
- a description of the clusterer suitable for displaying in the explorer/experimenter gui
-
getTechnicalInformation
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
- getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
getCapabilities
Returns default capabilities of the clusterer.
- Specified by:
- getCapabilities in interface CapabilitiesHandler
- Specified by:
- getCapabilities in interface Clusterer
- Overrides:
- getCapabilities in class AbstractClusterer
- Returns:
- the capabilities of this clusterer
-
buildClusterer
Generates a clusterer. Has to initialize all fields of the clusterer that are not being set via options.
- Specified by:
- buildClusterer in interface Clusterer
- Specified by:
- buildClusterer in class AbstractClusterer
- Parameters:
- data - set of instances serving as training data
- Throws:
- Exception - if the clusterer has not been generated successfully
-
clusterInstance
Classifies a given instance.
- Specified by:
- clusterInstance in interface Clusterer
- Overrides:
- clusterInstance in class AbstractClusterer
- Parameters:
- instance - the instance to be assigned to a cluster
- Returns:
- the number of the assigned cluster as an integer
- Throws:
- Exception - if instance could not be classified successfully
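Assigning an instance to a cluster usually amounts to finding the closest centroid. The following is a minimal sketch of that idea, assuming numeric attributes and squared Euclidean distance; Weka's own distance handling may differ, and NearestCentroidSketch and nearestCentroid are hypothetical names rather than part of the Weka API.

```java
// Illustrative sketch only; hypothetical names, not Weka's clusterInstance code.
class NearestCentroidSketch {

    // Returns the index of the centroid closest to x (ties go to the lowest index).
    static int nearestCentroid(double[][] centroids, double[] x) {
        if (centroids.length == 0) {
            throw new IllegalArgumentException("at least one centroid is required");
        }
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double sum = 0.0;
            for (int i = 0; i < x.length; i++) {
                double d = x[i] - centroids[c][i];
                sum += d * d; // squared Euclidean distance, assumed for this sketch
            }
            if (sum < bestDist) {
                bestDist = sum;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centroids = { {0.0, 0.0}, {10.0, 10.0} };
        System.out.println(nearestCentroid(centroids, new double[] {1.0, 1.0}));
    }
}
```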
-
numberOfClusters
Returns the number of clusters.
- Specified by:
- numberOfClusters in interface Clusterer
- Specified by:
- numberOfClusters in class AbstractClusterer
- Returns:
- the number of clusters generated for a training dataset.
- Throws:
- Exception - if number of clusters could not be returned successfully
-
getClusterCentroids
Gets the centroids found by FarthestFirst.
- Returns:
- the centroids found by FarthestFirst
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Overrides:
- listOptions in class RandomizableClusterer
- Returns:
- an enumeration of all the available options.
-
numClustersTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setNumClusters
Sets the number of clusters to generate.
- Parameters:
- n - the number of clusters to generate
- Throws:
- Exception - if number of clusters is negative
-
getNumClusters
public int getNumClusters()
Gets the number of clusters to generate.
- Returns:
- the number of clusters to generate
-
setOptions
Parses a given list of options.
Valid options are:
-N <num> number of clusters. (default = 2).
-S <num> Random number seed. (default 1)
- Specified by:
- setOptions in interface OptionHandler
- Overrides:
- setOptions in class RandomizableClusterer
- Parameters:
- options - the list of options as an array of strings
- Throws:
- Exception - if an option is not supported
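To make the option format concrete, here is a small stand-alone sketch that parses the two documented flags and writes them back out as an array. It is only an illustration: FarthestFirstOptionsSketch, parse and toOptions are hypothetical names, and the sketch does not reproduce how Weka itself parses options. The defaults (2 clusters, seed 1) and the rejection of a negative cluster count are taken from the descriptions on this page.

```java
// Illustrative sketch only; hypothetical class, not Weka's option handling.
class FarthestFirstOptionsSketch {
    int numClusters = 2; // documented default for -N
    int seed = 1;        // documented default for -S

    // Parses "-N <num>" and "-S <num>" pairs; rejects any other flag.
    static FarthestFirstOptionsSketch parse(String[] options) {
        FarthestFirstOptionsSketch parsed = new FarthestFirstOptionsSketch();
        for (int i = 0; i < options.length; i += 2) {
            if (i + 1 >= options.length) {
                throw new IllegalArgumentException("Missing value for " + options[i]);
            }
            int value = Integer.parseInt(options[i + 1]);
            switch (options[i]) {
                case "-N":
                    if (value < 0) {
                        throw new IllegalArgumentException("Number of clusters must not be negative");
                    }
                    parsed.numClusters = value;
                    break;
                case "-S":
                    parsed.seed = value;
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return parsed;
    }

    // Returns the current settings in the same flag/value layout that parse accepts.
    String[] toOptions() {
        return new String[] { "-N", String.valueOf(numClusters), "-S", String.valueOf(seed) };
    }

    public static void main(String[] args) {
        FarthestFirstOptionsSketch o = parse(new String[] { "-N", "5" });
        System.out.println(String.join(" ", o.toOptions()));
    }
}
```

On the real class, the corresponding round trip goes through setOptions(String[]) and getOptions(), whose returned array is documented as suitable for passing back to setOptions().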
-
getOptions
Gets the current settings of FarthestFirst.
- Specified by:
- getOptions in interface OptionHandler
- Overrides:
- getOptions in class RandomizableClusterer
- Returns:
- an array of strings suitable for passing to setOptions()
-
toString
Returns a string describing this clusterer.
-
getRevision
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Overrides:
- getRevision in class AbstractClusterer
- Returns:
- the revision
-
main
Main method for testing this class.
- Parameters:
- argv - should contain the following arguments: -t training file [-N number of clusters]
-