Robust clustering based on the most frequent value method
Keywords: Most Frequent Value, k-MFVs, outlier map, robust clustering, anomaly detection
Assigning observations to highly separable yet relatively homogeneous groups remains a challenging task despite the abundance of well-elaborated theories and effective, practical algorithms. Not only the aim of clustering but also the underlying data itself influences the choice of method and the way the results are assessed. Outliers and non-normally distributed data can lead to surprising, unstable and often undesirable clustering results, especially in higher dimensions. This underlines the importance of some human supervision even for such unsupervised algorithms. In this paper, a robust clustering alternative based on the Most Frequent Value method is presented for crisp-type clustering of real-life data. The proposed approach is compared with the k-Medians algorithm. A favourable attribute of the proposed procedure is its ease of application to multidimensional data sets, where critical judgment of the formed groups is particularly troublesome.