Type Parameters:
T - type of the points to cluster
public class FuzzyKMeansClusterer<T extends Clusterable> extends Clusterer<T>
The Fuzzy K-Means algorithm is a variation of the classical K-Means algorithm, with the major difference that a single data point is not uniquely assigned to a single cluster. Instead, each point i has a set of weights $u_{ij}$ which indicate the degree of membership to the cluster j.
The algorithm then tries to minimize the objective function:
$J = \sum_{i=1}^{C} \sum_{k=1}^{N} u_{ik}^{m} d_{ik}^{2}$
with $d_{ik}$ being the distance between data point i and the cluster center k.
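The objective function above can be sketched in plain Java as follows. This is a minimal illustration rather than the library's implementation: the class name `FuzzyObjectiveSketch`, the array-based inputs, the Euclidean distance, and the reading of the exponent m as the fuzziness factor are all assumptions made for the example.

```java
public class FuzzyObjectiveSketch {

    /** Euclidean distance between two points of equal dimension (assumed measure). */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /**
     * Sums u^m * d^2 over every (point, center) pair.
     * memberships[p][c] is the weight of point p in cluster c.
     */
    static double objective(double[][] points, double[][] centers,
                            double[][] memberships, double m) {
        double j = 0.0;
        for (int p = 0; p < points.length; p++) {
            for (int c = 0; c < centers.length; c++) {
                double dist = euclidean(points[p], centers[c]);
                j += Math.pow(memberships[p][c], m) * dist * dist;
            }
        }
        return j;
    }

    public static void main(String[] args) {
        double[][] points = {{0.0, 0.0}, {2.0, 0.0}};
        double[][] centers = {{0.0, 0.0}, {2.0, 0.0}};
        // Hypothetical memberships: each point mostly belongs to its nearest center.
        double[][] u = {{0.9, 0.1}, {0.1, 0.9}};
        System.out.println(objective(points, centers, u, 2.0));
    }
}
```

A smaller value means the points with high membership in a cluster lie close to that cluster's center.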
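The algorithm requires two parameters: k, the number of clusters to split the data into, and fuzziness, the fuzziness factor, which must be > 1.0.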
The fuzzy variant of the K-Means algorithm is more robust with regard to the selection of the initial cluster centers.
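To make the membership weights concrete, the sketch below computes one point's weights from its distances to the cluster centers using the textbook fuzzy c-means update rule, u_j = 1 / sum over l of (d_j / d_l)^(2 / (m - 1)). The class name `FuzzyMembershipSketch` and the method `memberships` are hypothetical, and the handling of a zero distance is an assumption of this example; the library's internal update may differ in detail.

```java
public class FuzzyMembershipSketch {

    /**
     * Computes the membership weights of one point from its distances
     * to each of the k cluster centers (textbook fuzzy c-means rule).
     */
    static double[] memberships(double[] distances, double m) {
        if (m <= 1.0) {
            throw new IllegalArgumentException("fuzziness must be > 1.0");
        }
        int k = distances.length;
        double[] u = new double[k];
        // Assumption: a point lying exactly on a center belongs fully
        // to the first such center, and not at all to the others.
        for (int j = 0; j < k; j++) {
            if (distances[j] == 0.0) {
                u[j] = 1.0;
                return u;
            }
        }
        double exponent = 2.0 / (m - 1.0);
        for (int j = 0; j < k; j++) {
            double sum = 0.0;
            for (int l = 0; l < k; l++) {
                sum += Math.pow(distances[j] / distances[l], exponent);
            }
            u[j] = 1.0 / sum;
        }
        return u;
    }

    public static void main(String[] args) {
        // Hypothetical distances from one point to k = 2 centers.
        double[] u = memberships(new double[] {1.0, 3.0}, 2.0);
        System.out.println(u[0] + " " + u[1]);
    }
}
```

The weights of a point sum to 1, and a larger fuzziness value pushes them toward equal shares across clusters.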
Constructor | Description |
---|---|
FuzzyKMeansClusterer(int k, double fuzziness) | Creates a new instance of a FuzzyKMeansClusterer. |
FuzzyKMeansClusterer(int k, double fuzziness, int maxIterations, DistanceMeasure measure) | Creates a new instance of a FuzzyKMeansClusterer. |
FuzzyKMeansClusterer(int k, double fuzziness, int maxIterations, DistanceMeasure measure, double epsilon, RandomGenerator random) | Creates a new instance of a FuzzyKMeansClusterer. |
Modifier and Type | Method | Description |
---|---|---|
List<CentroidCluster<T>> | cluster(Collection<T> dataPoints) | Performs Fuzzy K-Means cluster analysis. |
List<CentroidCluster<T>> | getClusters() | Returns the list of clusters resulting from the last call to cluster(Collection). |
List<T> | getDataPoints() | Returns an unmodifiable list of the data points used in the last call to cluster(Collection). |
double | getEpsilon() | Returns the convergence criterion used by this instance. |
double | getFuzziness() | Returns the fuzziness factor used by this instance. |
int | getK() | Returns the number of clusters this instance will use. |
int | getMaxIterations() | Returns the maximum number of iterations this instance will use. |
RealMatrix | getMembershipMatrix() | Returns the n x k membership matrix, where n is the number of data points and k the number of clusters. |
double | getObjectiveFunctionValue() | Gets the value of the objective function. |
RandomGenerator | getRandomGenerator() | Returns the random generator this instance will use. |
Methods inherited from class Clusterer: distance, getDistanceMeasure
public FuzzyKMeansClusterer(int k, double fuzziness) throws NumberIsTooSmallException
Creates a new instance of a FuzzyKMeansClusterer. The Euclidean distance will be used as the default distance measure.
Parameters:
k - the number of clusters to split the data into
fuzziness - the fuzziness factor, must be > 1.0
Throws:
NumberIsTooSmallException - if fuzziness <= 1.0
public FuzzyKMeansClusterer(int k, double fuzziness, int maxIterations, DistanceMeasure measure) throws NumberIsTooSmallException
Creates a new instance of a FuzzyKMeansClusterer.
Parameters:
k - the number of clusters to split the data into
fuzziness - the fuzziness factor, must be > 1.0
maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
measure - the distance measure to use
Throws:
NumberIsTooSmallException - if fuzziness <= 1.0
public FuzzyKMeansClusterer(int k, double fuzziness, int maxIterations, DistanceMeasure measure, double epsilon, RandomGenerator random) throws NumberIsTooSmallException
Creates a new instance of a FuzzyKMeansClusterer.
Parameters:
k - the number of clusters to split the data into
fuzziness - the fuzziness factor, must be > 1.0
maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
measure - the distance measure to use
epsilon - the convergence criterion (default is 1e-3)
random - random generator to use for choosing initial centers
Throws:
NumberIsTooSmallException - if fuzziness <= 1.0
public int getK()
public double getFuzziness()
public int getMaxIterations()
public double getEpsilon()
public RandomGenerator getRandomGenerator()
public RealMatrix getMembershipMatrix()
Returns the n x k membership matrix, where n is the number of data points and k the number of clusters. The element $U_{i,j}$ represents the membership value for data point i to cluster j.
Throws:
MathIllegalStateException - if cluster(Collection) has not been called before
public List<T> getDataPoints()
Returns an unmodifiable list of the data points used in the last call to cluster(Collection). Returns null if cluster(Collection) has not been called before.
public List<CentroidCluster<T>> getClusters()
Returns the list of clusters resulting from the last call to cluster(Collection). Returns null if cluster(Collection) has not been called before.
public double getObjectiveFunctionValue()
Gets the value of the objective function.
Throws:
MathIllegalStateException - if cluster(Collection) has not been called before
public List<CentroidCluster<T>> cluster(Collection<T> dataPoints) throws MathIllegalArgumentException
Performs Fuzzy K-Means cluster analysis.
Specified by:
cluster in class Clusterer<T extends Clusterable>
Parameters:
dataPoints - the points to cluster
Throws:
MathIllegalArgumentException - if the data points are null or the number of clusters is larger than the number of data points
Copyright © 2003–2016 The Apache Software Foundation. All rights reserved.