ee.Clusterer.wekaCascadeKMeans
Cascade simple k-means selects the best number of clusters, k, within the range minClusters to maxClusters according to the Calinski-Harabasz criterion. For more information, see:
Calinski, T. and J. Harabasz. 1974. A dendrite method for cluster analysis. Commun. Stat. 3: 1-27.
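As a rough illustration of the criterion, the sketch below computes the Calinski-Harabasz index in plain Python as the ratio of between-cluster dispersion to within-cluster dispersion, each divided by its degrees of freedom. The function name `calinski_harabasz` and the sample data are hypothetical, squared Euclidean distance is assumed, and this is not the Weka or Earth Engine implementation, which may differ in detail.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Return the Calinski-Harabasz index for a labelling of points.

    points: list of equal-length numeric tuples.
    labels: list of cluster ids, one per point.
    Assumes squared Euclidean distance (an assumption of this sketch).
    """
    n = len(points)
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    k = len(groups)
    if k < 2 or k >= n:
        raise ValueError("the index is only defined for 2 <= k < n")

    dim = len(points[0])

    def mean(ps):
        return tuple(sum(p[d] for p in ps) / len(ps) for d in range(dim))

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0  # dispersion of cluster centroids around the overall mean
    within = 0.0   # dispersion of points around their own centroid
    for members in groups.values():
        centroid = mean(members)
        between += len(members) * sqdist(centroid, overall)
        within += sum(sqdist(p, centroid) for p in members)

    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical sample data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = [0, 0, 0, 1, 1, 1]  # labelling that matches the two groups
bad = [0, 1, 0, 1, 0, 1]   # labelling that mixes the two groups

print(calinski_harabasz(points, good))
print(calinski_harabasz(points, bad))
```

A higher index indicates compact, well-separated clusters, so a cascade approach can score the clustering obtained for each candidate k and keep the k with the highest score.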
Usage | Returns |
---|---|
ee.Clusterer.wekaCascadeKMeans(minClusters, maxClusters, restarts, manual, init, distanceFunction, maxIterations) | Clusterer |
Argument | Type | Details |
---|---|---|
minClusters | Integer, default: 2 | Minimum number of clusters. |
maxClusters | Integer, default: 10 | Maximum number of clusters. |
restarts | Integer, default: 10 | Number of restarts. |
manual | Boolean, default: false | Whether to select the number of clusters manually. |
init | Boolean, default: false | Whether to initialize using the probabilistic farthest-first-like method of the k-means++ algorithm, rather than the standard random selection of initial cluster centers. |
distanceFunction | String, default: "Euclidean" | Distance function to use. Options are: Euclidean and Manhattan. |
maxIterations | Integer, default: null | Maximum number of iterations for k-means. |
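The arguments above might be assembled and passed from the Earth Engine Python client roughly as follows. This is a minimal sketch: the helper names `cascade_kmeans_args` and `make_clusterer` are hypothetical, the check that `minClusters` does not exceed `maxClusters` is an assumption rather than documented behavior, and `make_clusterer` assumes the `earthengine-api` package is installed and `ee.Initialize()` has already been called.

```python
ALLOWED_DISTANCES = ("Euclidean", "Manhattan")  # options listed in the table


def cascade_kmeans_args(min_clusters=2, max_clusters=10, restarts=10,
                        manual=False, init=False,
                        distance_function="Euclidean", max_iterations=None):
    """Build keyword arguments named as in the argument table (hypothetical helper)."""
    if distance_function not in ALLOWED_DISTANCES:
        raise ValueError("distanceFunction must be Euclidean or Manhattan")
    # Assumption of this sketch: an empty range of k is treated as an error.
    if min_clusters > max_clusters:
        raise ValueError("minClusters must not exceed maxClusters")
    args = {
        "minClusters": min_clusters,
        "maxClusters": max_clusters,
        "restarts": restarts,
        "manual": manual,
        "init": init,
        "distanceFunction": distance_function,
    }
    # Leave maxIterations out when unset so the documented default (null) applies.
    if max_iterations is not None:
        args["maxIterations"] = max_iterations
    return args


def make_clusterer(args):
    """Create the clusterer; needs an initialized Earth Engine session."""
    import ee  # imported here so the helper above works without the package
    return ee.Clusterer.wekaCascadeKMeans(**args)


# Example: search k in 3..8 with Manhattan distance and k-means++ style init.
example_args = cascade_kmeans_args(min_clusters=3, max_clusters=8,
                                   init=True, distance_function="Manhattan")
print(example_args)
```

The returned Clusterer is untrained. In a typical workflow it would then be trained on a FeatureCollection with `train()` and applied to an image with `cluster()`.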
Last updated 2024-09-19 UTC.