Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Exploiting extensible background knowledge for clustering-based automatic keyphrase extraction

Alrehamy, Hassan and Walker, Coral ORCID: https://orcid.org/0000-0002-0258-9301 2018. Exploiting extensible background knowledge for clustering-based automatic keyphrase extraction. Soft Computing - A Fusion of Foundations, Methodologies and Applications 22 (21), pp. 7041-7057. 10.1007/s00500-018-3414-4

PDF - Published Version
Available under License Creative Commons Attribution.

Abstract

Keyphrases are single- or multi-word phrases that describe the essential content of a document. Keyphrase extraction methods often use an external knowledge source such as WordNet to obtain relation information about terms and thereby improve their results, but a single knowledge source often has limited coverage. This is known as the coverage limitation problem. In this paper, we introduce SemCluster, a clustering-based unsupervised keyphrase extraction method that addresses the coverage limitation problem with an extensible approach that integrates an internal ontology (i.e., WordNet) with other knowledge sources to gain wider background knowledge. SemCluster is evaluated against three unsupervised methods, TextRank, ExpandRank, and KeyCluster, using the F1-measure. The evaluation results demonstrate that SemCluster achieves better accuracy and computational efficiency and is more robust when dealing with documents from different domains.
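
For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of how an F1-measure is commonly computed for keyphrase extraction. It assumes exact matching after lowercasing and whitespace trimming; the paper's actual matching protocol (for example, stemming or partial matches) is not stated on this page and may differ. The function name `f1_score` and the sample phrases are illustrative assumptions, not taken from the paper.

```python
def f1_score(extracted, gold):
    """Exact-match F1 between extracted and gold keyphrases.

    Assumption: phrases are compared case-insensitively after trimming
    whitespace. The paper's own matching rules may differ.
    """
    extracted_set = {p.strip().lower() for p in extracted}
    gold_set = {p.strip().lower() for p in gold}
    if not extracted_set or not gold_set:
        return 0.0

    matched = len(extracted_set & gold_set)
    if matched == 0:
        return 0.0

    precision = matched / len(extracted_set)  # share of extracted phrases that are correct
    recall = matched / len(gold_set)          # share of gold phrases that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical example data, for illustration only.
extracted = ["keyphrase extraction", "WordNet", "clustering"]
gold = ["Keyphrase Extraction", "wordnet", "ontology", "background knowledge"]
score = f1_score(extracted, gold)
```

In this made-up example two of the three extracted phrases match two of the four gold phrases, so precision is 2/3 and recall is 1/2.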

Item Type: Article
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics
Publisher: Springer Verlag (Germany)
ISSN: 1432-7643
Date of First Compliant Deposit: 28 August 2018
Date of Acceptance: 16 August 2018
Last Modified: 04 May 2023 16:36
URI: https://orca.cardiff.ac.uk/id/eprint/114371

Citation Data

Cited 10 times in Scopus.
