IEEE VIS 2024 Content: Towards a Visual Perception-Based Analysis of Clustering Quality Metrics

Towards a Visual Perception-Based Analysis of Clustering Quality Metrics

Graziano Blasilli - Sapienza University of Rome, Rome, Italy

Daniel Kerrigan - Northeastern University, Boston, United States

Enrico Bertini - Northeastern University, Boston, United States

Giuseppe Santucci - Sapienza University of Rome, Rome, Italy

Room: Bayshore I

2024-10-13T17:05:00Z
Exemplar figure caption: This paper presents the first attempt at a deep and exhaustive evaluation of the perceptual aspects of clustering quality metrics, focusing on the Davies-Bouldin Index, Dunn Index, Calinski-Harabasz Index, and Silhouette Score. Our research is centered around two main objectives: a) assessing how humans perceive the metrics in 2D scatterplots and b) exploring the potential of Large Multimodal Models, in particular GPT-4o, to emulate the assessed human perception.
Abstract

Clustering is an essential technique across various domains, such as data science, machine learning, and eXplainable Artificial Intelligence. Information visualization and visual analytics techniques have been proven to effectively support human involvement in the visual exploration of clustered data, enhancing the understanding and refinement of cluster assignments. This paper presents an attempt at a deep and exhaustive evaluation of the perceptual aspects of clustering quality metrics, focusing on the Davies-Bouldin Index, Dunn Index, Calinski-Harabasz Index, and Silhouette Score. Our research is centered around two main objectives: a) assessing the human perception of common cluster validity indices (CVIs) in 2D scatterplots and b) exploring the potential of Large Language Models (LLMs), in particular GPT-4o, to emulate the assessed human perception. By discussing the obtained results and highlighting limitations and areas for further exploration, this paper aims to lay a foundation for future research activities.
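For readers unfamiliar with the four metrics named in the abstract, the following is a minimal sketch of how they can be computed for labeled 2D points such as those in a scatterplot. It is not the authors' code: the function names, the toy data, and the two labelings are assumptions made for illustration, all distances are Euclidean, and the Dunn Index is shown in its classic form (smallest between-cluster point distance divided by largest cluster diameter), which is only one of several variants in the literature.

```python
from itertools import combinations
from math import dist
from statistics import mean


def group(points, labels):
    """Group points by cluster label (hypothetical helper)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def centroid(pts):
    """Coordinate-wise mean of a list of points."""
    return tuple(mean(coords) for coords in zip(*pts))


def dunn(clusters):
    """Classic Dunn Index: min separation / max diameter. Higher is better."""
    separation = min(dist(p, q)
                     for a, b in combinations(clusters.values(), 2)
                     for p in a for q in b)
    diameter = max(dist(p, q)
                   for c in clusters.values()
                   for p, q in combinations(c, 2))
    return separation / diameter


def davies_bouldin(clusters):
    """Mean of each cluster's worst scatter-to-separation ratio. Lower is better."""
    cents = {k: centroid(v) for k, v in clusters.items()}
    scatter = {k: mean(dist(p, cents[k]) for p in v) for k, v in clusters.items()}
    return mean(max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                    for j in clusters if j != i)
                for i in clusters)


def calinski_harabasz(clusters):
    """Between-cluster over within-cluster dispersion. Higher is better."""
    pts = [p for v in clusters.values() for p in v]
    n, k = len(pts), len(clusters)
    overall = centroid(pts)
    between = sum(len(v) * dist(centroid(v), overall) ** 2 for v in clusters.values())
    within = sum(dist(p, centroid(v)) ** 2 for v in clusters.values() for p in v)
    return (between / (k - 1)) / (within / (n - k))


def silhouette(clusters):
    """Mean silhouette coefficient over all points, in [-1, 1]. Higher is better."""
    scores = []
    for k, v in clusters.items():
        for p in v:
            if len(v) == 1:          # singleton clusters score 0 by convention
                scores.append(0.0)
                continue
            a = sum(dist(p, q) for q in v) / (len(v) - 1)   # mean intra-cluster distance
            b = min(mean(dist(p, q) for q in w)             # nearest other cluster
                    for j, w in clusters.items() if j != k)
            scores.append((b - a) / max(a, b))
    return mean(scores)


# Toy data (assumed): two well-separated unit squares in the plane.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
good = group(points, [0, 0, 0, 0, 1, 1, 1, 1])   # labels match the visual groups
bad = group(points, [0, 0, 0, 1, 1, 1, 1, 0])    # one point swapped between groups

for name, metric in [("Dunn", dunn), ("Davies-Bouldin", davies_bouldin),
                     ("Calinski-Harabasz", calinski_harabasz),
                     ("Silhouette", silhouette)]:
    print(f"{name}: good={metric(good):.3f} bad={metric(bad):.3f}")
```

On this toy example the labeling that matches the two visible groups scores better on all four metrics than the labeling with one point swapped. Note that the Davies-Bouldin Index is the only one of the four for which lower values indicate better clustering.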