A highly efficient multi-core k-means algorithm for clustering extremely large datasets
To estimate the number of clusters, switch to the cluster number estimation tab (the second tab at the top of the frame). First choose the minimum and maximum number of clusters to test, the maximum number of iterations for the clustering algorithm, and the maximum number of repetitions for each cluster number k.
Press the 'Cluster number estimation' button to start. The result is shown as a boxplot chart (displaying the median and interquartile range).
The most stable clustering (the one with the greatest difference between the mean result of the cluster runs and the mean result of the random clusterings) is marked in the figure by a '*' and shown in the lower right panel.
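The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not McKmeans' own code: the function name most_stable_k, the dictionary layout, and all score values are hypothetical, and the scores stand in for the per-repetition stability values (MCA-index) of the real clusterings and of the random baseline.

```python
from statistics import mean


def most_stable_k(cluster_scores, random_scores):
    """Pick the k with the largest gap between mean cluster score and mean baseline score.

    Both arguments map a cluster number k to a list of stability scores,
    one score per repetition (hypothetical layout, assumed for this sketch).
    """
    gaps = {
        k: mean(cluster_scores[k]) - mean(random_scores[k])
        for k in cluster_scores
    }
    # The k with the largest gap is the one that would be marked with '*'.
    best_k = max(gaps, key=gaps.get)
    return best_k, gaps


# Made-up scores for k = 2..4, three repetitions each (illustrative only).
cluster_scores = {2: [0.90, 0.92, 0.91], 3: [0.97, 0.98, 0.99], 4: [0.80, 0.85, 0.82]}
random_scores = {2: [0.70, 0.72, 0.71], 3: [0.60, 0.62, 0.61], 4: [0.55, 0.58, 0.56]}

best_k, gaps = most_stable_k(cluster_scores, random_scores)
print(best_k, round(gaps[best_k], 2))
```

With these made-up values the gap is largest for k = 3, so that k would be the one reported as most stable.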
Additionally, a p-value is displayed from a U-test of the null hypothesis that the MCA-index of the clustering does not differ from that of the random baseline. The best clustering (from 100 random restarts with k*) is shown in the cluster analysis tab and can be saved via 'Menu -> File -> Save clustering'.
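To make the displayed p-value more concrete, here is a minimal sketch of a two-sided Mann-Whitney U-test in plain Python. It assumes the normal approximation without continuity or tie correction; the text does not say which variant McKmeans uses, so the numbers it reports may differ. The function name mann_whitney_u and the sample values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U-test (normal approximation, no continuity
    or tie correction in the variance). Returns (U, p_value)."""
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for t in range(i, j + 1):
            ranks[t] = avg_rank
        i = j + 1
    # Rank sum of the first sample and the corresponding U statistic.
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    # Normal approximation of the distribution of U under the null hypothesis.
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


# Made-up MCA-index values: clustering runs vs. random baseline (illustrative only).
mca_cluster = [0.97, 0.98, 0.99, 0.96, 0.95]
mca_random = [0.60, 0.62, 0.61, 0.63, 0.59]
u_stat, p_value = mann_whitney_u(mca_cluster, mca_random)
print(u_stat, p_value)
```

A small p-value indicates that the scores of the real clusterings are unlikely to come from the same distribution as the random baseline. For samples as small as in this example an exact test would be more appropriate than the normal approximation.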
NOTE: McKmeans is licensed and distributed under the terms of the Artistic License 2.0.