ECTI TRANSACTIONS ON COMPUTER INFORMATION TECHNOLOGY

Volume 15, No. 2, August 2021, Pages 166-176

Hierarchical text classification using relative inverse document frequency

Boonthida Chiraratanasopha, Thanaruk Theeramunkong, Salin Boonbrahm

Abstract

Automatic hierarchical text classification is a challenging and increasingly needed task as hierarchical taxonomies multiply with the growth of knowledge organization. The hierarchical structure captures dependency relationships between categories, in which general and specific concepts within the tree can overlap. This paper uses the frequency with which a term occurs in related categories of the hierarchical tree to aid document classification. Four extended term weightings of Relative Inverse Document Frequency (IDFr), computed with respect to a term's own category, its parent category, its sibling categories, and its child categories, are used to build a classifier model with a centroid-based technique. In experiments on hierarchical text classification of Thai documents, IDFr achieved its best accuracy and F-measure, 53.65% and 50.80% respectively, with the Top-n feature set under family-based evaluation; these are higher than TF-IDF by 2.35% and 1.15%, respectively, in the same settings.
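For readers unfamiliar with the centroid-based technique named in the abstract, the following is a minimal sketch of a centroid classifier over the standard TF-IDF baseline. It is not the authors' implementation: the abstract does not give the IDFr formula, so IDFr is not reproduced here, the categories are treated as flat rather than hierarchical, and the function names (`fit_idf`, `vectorize`, `train_centroids`, `classify`), the smoothed IDF variant, and the toy English documents are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def fit_idf(docs):
    """Standard IDF from tokenised training docs (plus 1 so no weight is zero)."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}

def vectorize(doc, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen terms are dropped."""
    tf = Counter(term for term in doc if term in idf)
    vec = {term: count * idf[term] for term, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {term: w / norm for term, w in vec.items()} if norm else {}

def train_centroids(docs, labels, idf):
    """Average the document vectors of each category into one centroid."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter(labels)
    for doc, label in zip(docs, labels):
        for term, w in vectorize(doc, idf).items():
            sums[label][term] += w
    return {label: {term: w / counts[label] for term, w in vec.items()}
            for label, vec in sums.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(doc, centroids, idf):
    """Assign the category whose centroid is most similar to the document."""
    vec = vectorize(doc, idf)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

# Hypothetical toy corpus (illustrative only; the paper uses Thai documents).
train_docs = [
    ["rice", "farm", "harvest"],
    ["rice", "soil", "farm"],
    ["virus", "vaccine", "clinic"],
    ["vaccine", "clinic", "patient"],
]
train_labels = ["agriculture", "agriculture", "health", "health"]

idf = fit_idf(train_docs)
centroids = train_centroids(train_docs, train_labels, idf)
predicted = classify(["farm", "soil"], centroids, idf)
```

In a sketch like this, a category-aware weighting such as the one the abstract describes would replace the weights returned by `fit_idf`, leaving the centroid construction and similarity steps unchanged.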

Keywords

Hierarchical Text Classification, Term Weighting, Hierarchical Categories, Relative Inverse Document Frequency (IDFr)

Published by: ECTI Association
Contributions welcome at: http://www.ecti-thailand.org/paper/journal/ECTI-CIT
