# Evaluating results

Key points:

- Real-world data often lacks clearly defined clusters, making visual assessment of clustering quality difficult and requiring the use of heuristics and best practices.
- Evaluating clustering involves assessing cluster cardinality, cluster magnitude, downstream performance, and the similarity measure used.
- The optimal number of clusters can be determined by analyzing the relationship between the number of clusters and the sum of cluster magnitudes, often using the elbow method.
- If clustering results are unsatisfactory, re-evaluate data preparation, the similarity measure, and the algorithm's assumptions to make sure they align with the data's characteristics.

Because clustering is unsupervised, no
[ground truth](/machine-learning/glossary#ground-truth) is available to verify
results. The absence of ground truth complicates assessments of quality.
Moreover, real-world datasets rarely contain clusters as obvious as those shown
in Figure 1.

**Figure 1: An ideal data plot. Real-world data rarely looks like this.**

Instead, real-world data often looks more like Figure 2, making it difficult to
visually assess clustering quality.

**Figure 2: A more realistic data plot.**

However, there are heuristics and best practices that you can apply iteratively
to improve the quality of your clustering. The following flowchart gives an
overview of how to evaluate your clustering results. We'll expand upon each
step.

[**Click here to see a larger version of this chart.**](/machine-learning/clustering/flowchart)

Step 1: Assess quality of clustering
------------------------------------

First check that the clusters look as you'd expect, and that examples you
consider similar to each other appear in the same cluster.

Then check these commonly used metrics (not an exhaustive list):

- Cluster cardinality
- Cluster magnitude
- Downstream performance

### Cluster cardinality

**Cluster cardinality** is the number of examples per cluster. Plot the
cluster cardinality for all clusters and investigate clusters that are major
outliers. In Figure 3, this would be cluster 5.

**Figure 3: Cardinality of several clusters.**

### Cluster magnitude

**Cluster magnitude** is the sum of distances from all examples in a cluster
to the cluster's centroid. Plot cluster magnitude for all clusters and
investigate outliers. In Figure 4, cluster 0 is an outlier.

Also consider looking at the maximum or average distance of examples
from centroids, by cluster, to find outliers.

**Figure 4: Magnitude of several clusters.**

### Magnitude versus cardinality

You may have noticed that a higher cluster cardinality tends to correspond to a
higher cluster magnitude, which makes intuitive sense: the more points in a
cluster (cardinality), the greater the probable sum of the distances of those
points from the centroid (magnitude). You can also identify anomalous clusters
by looking for ones where the relationship between cardinality and magnitude is
very different from that of the other clusters. In Figure 5, fitting a line to
the plot of cardinality versus magnitude suggests that cluster 0 is anomalous.
(Cluster 5 is also far from the line, but if cluster 0 were omitted, the new
fitted line would be much closer to cluster 5.)

**Figure 5: Cardinality versus magnitude for the previously shown clusters.**
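The diagnostics above can be computed from any fitted clustering model. The
following is a minimal sketch of how you might compute cluster cardinality,
cluster magnitude, and a cardinality-versus-magnitude line fit using
scikit-learn's `KMeans`; the feature matrix `data`, the choice of six clusters,
and the line fit through the origin are illustrative assumptions, not part of
the course material.

```python
# Minimal sketch: cluster cardinality, cluster magnitude, and a
# cardinality-vs-magnitude line fit. `data` and n_clusters=6 are
# placeholder assumptions.
import numpy as np
from sklearn.cluster import KMeans

data = np.random.rand(1000, 4)   # stand-in for your prepared feature matrix
kmeans = KMeans(n_clusters=6, n_init=10, random_state=0).fit(data)
labels = kmeans.labels_
centroids = kmeans.cluster_centers_

# Cluster cardinality: number of examples per cluster.
cardinality = np.bincount(labels, minlength=kmeans.n_clusters)

# Cluster magnitude: sum of distances from each example to its centroid.
distances = np.linalg.norm(data - centroids[labels], axis=1)
magnitude = np.array(
    [distances[labels == c].sum() for c in range(kmeans.n_clusters)])

# Fit a line through the origin (zero examples implies zero magnitude) and
# flag clusters that deviate strongly from it as candidates for inspection.
slope = (cardinality @ magnitude) / (cardinality @ cardinality)
deviation = magnitude - slope * cardinality
for c in range(kmeans.n_clusters):
    print(f"cluster {c}: cardinality={cardinality[c]}, "
          f"magnitude={magnitude[c]:.2f}, deviation={deviation[c]:.2f}")
```

Plotting `cardinality`, `magnitude`, and the deviations from the fitted line
reproduces the kinds of charts shown in Figures 3 through 5.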
### Downstream performance

Since clustering outputs are often used in downstream ML systems, check whether
downstream model performance improves when your clustering process changes.
This offers a real-world evaluation of the quality of your clustering results,
although it can be complex and expensive to run this kind of test.

Step 2: Reassess your similarity measure
----------------------------------------

Your clustering algorithm is only as good as your similarity measure. Make sure
your similarity measure returns sensible results. A quick check is to identify
pairs of examples known to be more or less similar. Calculate the similarity
measure for each pair of examples, and compare your results to your knowledge:
pairs of similar examples should have a higher similarity measure than pairs of
dissimilar examples.

The examples you use to spot-check your similarity measure should be
representative of the dataset, so you can be confident that your similarity
measure holds for all your examples. The performance of your similarity
measure, whether manual or supervised, must be consistent across your dataset.
If your similarity measure is inconsistent for some examples, those examples
won't be clustered with similar examples.

If you find examples with inaccurate similarity scores, then your similarity
measure probably doesn't fully capture the feature data that distinguishes
those examples. Experiment with your similarity measure until it returns more
accurate and consistent results.

Step 3: Find the optimal number of clusters
-------------------------------------------

k-means requires you to decide the number of clusters \(k\) beforehand. How do
you determine an optimal \(k\)? Try running the algorithm with increasing
values of \(k\) and note the sum of all cluster magnitudes. As \(k\) increases,
clusters become smaller, and the total distance of points from centroids
decreases. We can treat this total distance as a loss. Plot this loss against
the number of clusters. (A minimal code sketch of this sweep appears at the end
of this page.)

As shown in Figure 6, above a certain \(k\), the reduction in loss becomes
marginal as \(k\) increases. Consider using the \(k\) where the slope first
changes drastically, which is called the
[elbow method](https://wikipedia.org/wiki/Elbow_method_(clustering)). For the
plot shown, the optimal \(k\) is approximately 11. If you prefer more granular
clusters, you can choose a higher \(k\) after consulting this plot.

**Figure 6: Loss versus number of clusters.**

Troubleshooting questions
-------------------------

If you discover issues in the course of your evaluation, reassess your data
preparation steps and chosen similarity measure. Ask:

- Is your data appropriately scaled?
- Is your similarity measure correct?
- Is your algorithm performing semantically meaningful operations on the data?
- Do your algorithm's assumptions match the data?
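To make the \(k\)-sweep from Step 3 concrete, here is a minimal sketch that
computes the loss (the sum of cluster magnitudes) for a range of \(k\) values
and plots it. The feature matrix `data` and the range of \(k\) values are
placeholder assumptions chosen purely for illustration.

```python
# Minimal sketch of the elbow-method k-sweep from Step 3.
# `data` and the k range are placeholder assumptions.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

data = np.random.rand(1000, 4)   # stand-in for your prepared feature matrix

k_values = range(2, 21)
losses = []
for k in k_values:
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(data)
    # Sum of distances from each example to its assigned centroid, i.e. the
    # sum of all cluster magnitudes. (scikit-learn's inertia_ uses squared
    # distances and would serve the same purpose.)
    dists = np.linalg.norm(
        data - kmeans.cluster_centers_[kmeans.labels_], axis=1)
    losses.append(dists.sum())

plt.plot(list(k_values), losses, marker="o")
plt.xlabel("Number of clusters (k)")
plt.ylabel("Sum of cluster magnitudes (loss)")
plt.show()
```

Look for the \(k\) where the curve's slope flattens out, as described in
Step 3.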