Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Optimisation and parallelisation of the partitioning around medoids function in R

Piotrowski, Michal, Forster, Thorsten, Dobrzelecki, Bartosz, Sloan, Terence M., Mitchell, Lawrence, Ghazal, Peter ORCID: https://orcid.org/0000-0003-0035-2228, Mewissen, Muriel, Petrou, Savvas, Trew, Arthur and Hill, Jon 2011. Optimisation and parallelisation of the partitioning around medoids function in R. p. 707. 10.1109/HPCSim.2011.5999896

Full text not available from this repository.

Abstract

R is a free statistical programming language commonly used for the analysis of high-throughput microarray and other data. It cannot currently exploit multi-processor architectures easily without substantial changes to existing R scripts. Further, working with large volumes of data often leads to slow processing and even memory allocation faults. A recent survey highlighted clustering algorithms as both computation- and data-intensive bottlenecks in post-genomic data analyses. These algorithms aim to sort numeric vectors (such as gene expression profiles) into groups by minimising vector distances within groups and maximising them between groups. This paper describes the optimisation and parallelisation of a popular clustering algorithm, partitioning around medoids (PAM), for the Simple Parallel R INTerface (SPRINT). SPRINT allows R users to exploit high performance computing systems without expert knowledge of such systems. This paper reports on a serial optimisation of the original code and a subsequent parallel implementation. The parallel implementation enables the processing of data sets that exceed the available physical memory and can yield, depending on the data set, a more than 100-fold increase in performance.
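For readers unfamiliar with partitioning around medoids, the idea in the abstract (grouping numeric vectors so that within-group distances are small) can be illustrated with a minimal serial sketch. This is not the SPRINT implementation and not the R `pam` function: it is a simplified version that assumes Euclidean distance, a greedy BUILD phase, and a first-improvement SWAP phase. All names (`pam`, `total_cost`, `euclidean`) and the sample data are hypothetical and chosen only for illustration.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def total_cost(dist, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)


def pam(points, k):
    """Simplified PAM sketch: returns (medoid indices, labels, total cost)."""
    n = len(points)
    # Full pairwise distance matrix: O(n^2) memory, which is the kind of
    # cost that becomes a bottleneck on large data sets.
    dist = [[euclidean(p, q) for q in points] for p in points]

    # BUILD: start from the most central point, then greedily add the
    # candidate that lowers the total cost the most.
    medoids = [min(range(n), key=lambda c: sum(dist[c]))]
    while len(medoids) < k:
        candidates = [c for c in range(n) if c not in medoids]
        medoids.append(
            min(candidates, key=lambda c: total_cost(dist, medoids + [c]))
        )

    # SWAP: replace a medoid with a non-medoid whenever that strictly
    # lowers the total cost; stop when no swap helps.
    best = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for c in range(n):
                if c in medoids:
                    continue
                trial = medoids[:pos] + [c] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True

    # Assign every point to the cluster of its nearest medoid.
    labels = [min(range(k), key=lambda j: dist[i][medoids[j]]) for i in range(n)]
    return medoids, labels, best


# Hypothetical toy data: two well-separated groups of 2-D vectors.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1),
          (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
medoids, labels, cost = pam(points, 2)
print(medoids, labels, round(cost, 4))
```

The sketch keeps the whole distance matrix in memory and re-evaluates the total cost for every candidate swap, which makes both the memory and the computation demands of the method easy to see.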

Item Type: Conference or Workshop Item (Paper)
Date Type: Published Online
Status: Published
Schools: Medicine
Last Modified: 23 Oct 2022 14:02
URI: https://orca.cardiff.ac.uk/id/eprint/112591

Citation Data

Cited 3 times in Scopus.
