Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

A simple probability based term weighting scheme for automated text classification

Liu, Ying and Loh, Han Tong 2007. A simple probability based term weighting scheme for automated text classification. Presented at: 20th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems (IEA/AIE 2007), Kyoto, Japan, 26-29 June 2007. Published in: Okuno, Hiroshi G. and Ali, Moonis eds. New Trends in Applied Artificial Intelligence: 20th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2007, Kyoto, Japan, June 26-29, 2007. Proceedings. Lecture Notes in Computer Science, vol. 4570. Berlin Heidelberg: Springer, pp. 33-43. 10.1007/978-3-540-73325-6_4

Full text not available from this repository.

Abstract

In automated text classification, tfidf is often treated as the default term weighting scheme and has been widely reported in the literature. However, tfidf does not directly reflect terms’ category membership. Inspired by an analysis of various feature selection methods, we propose a simple probability based term weighting scheme that directly uses two critical information ratios, i.e. relevance indicators. These relevance indicators are supported by probability estimates that embody category membership. Our experimental study on two data sets, including Reuters-21578, demonstrates that the proposed probability based term weighting scheme significantly outperforms tfidf with both a Bayesian classifier and Support Vector Machines (SVM).
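
The abstract's point that tfidf does not reflect category membership can be illustrated with a minimal sketch. It assumes the common textbook form tf × log(N/df), which may differ from the exact variant used in the paper, and the toy documents, labels, and the function name `tfidf` are hypothetical. The proposed probability based scheme is not sketched here because the abstract does not define its two ratios.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumes the textbook form tf * log(N / df); note that category
    labels are never passed in, so they cannot influence the weights.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw term counts within this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Hypothetical toy corpus with hypothetical category labels.
docs = [
    ["oil", "price", "rise"],
    ["oil", "export", "fall"],
    ["wheat", "price", "fall"],
    ["wheat", "harvest", "rise"],
]
labels = ["crude", "crude", "grain", "grain"]

weights = tfidf(docs)

# "oil" occurs only in "crude" documents, while "price" is split across
# both categories, yet both appear in two documents and so receive the
# same tfidf weight in the first document.
print(weights[0]["oil"], weights[0]["price"])
```

In this toy corpus a term that perfectly separates the categories and a term that does not are weighted identically, which is the gap a category-aware weighting scheme aims to close.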

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Engineering
Subjects: T Technology > TA Engineering (General). Civil engineering (General)
Publisher: Springer
ISBN: 9783540733225
ISSN: 0302-9743
Last Modified: 04 Jun 2017 05:27
URI: http://orca.cf.ac.uk/id/eprint/51440

Citation Data

Cited 1 time in Google Scholar.
