Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Interactive visual cluster analysis by contrastive dimensionality reduction

Xia, Jiazhi, Huang, Linquan, Lin, Weixing, Zhao, Xin, Wu, Jing ORCID: https://orcid.org/0000-0001-5123-9861, Chen, Yang, Zhao, Ying and Chen, Wei 2023. Interactive visual cluster analysis by contrastive dimensionality reduction. IEEE Transactions on Visualization and Computer Graphics 29 (1), pp. 734-744. 10.1109/TVCG.2022.3209423

PDF - Accepted Post-Print Version

Abstract

We propose a contrastive dimensionality reduction approach (CDR) for interactive visual cluster analysis. Although dimensionality reduction of high-dimensional data is widely used in visual cluster analysis in conjunction with scatterplots, there are several limitations on effective visual cluster analysis. First, it is non-trivial for an embedding to present clear visual cluster separation when keeping neighborhood structures. Second, as cluster analysis is a subjective task, user steering is required. However, it is also non-trivial to enable interactions in dimensionality reduction. To tackle these problems, we introduce contrastive learning into dimensionality reduction for high-quality embedding. We then redefine the gradient of the loss function to the negative pairs to enhance the visual cluster separation of embedding results. Based on the contrastive learning scheme, we employ link-based interactions to steer embeddings. After that, we implement a prototype visual interface that integrates the proposed algorithms and a set of visualizations. Quantitative experiments demonstrate that CDR outperforms existing techniques in terms of preserving correct neighborhood structures and improving visual cluster separation. The ablation experiment demonstrates the effectiveness of gradient redefinition. The user study verifies that CDR outperforms t-SNE and UMAP in the task of cluster identification. We also showcase two use cases on real-world datasets to present the effectiveness of link-based interactions.

Item Type: Article
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics
Publisher: Institute of Electrical and Electronics Engineers
ISSN: 1077-2626
Date of First Compliant Deposit: 11 October 2022
Date of Acceptance: 16 July 2022
Last Modified: 08 Nov 2023 08:04
URI: https://orca.cardiff.ac.uk/id/eprint/153298
