Improving Text Classification by Fusing Linguistic and Semantic Features

By:

Sarang Shaikh
Ehtesham Hashmi
Sule Yildirim Yayilgan
Mohamed Abomhara
Rajendra Akerkar

Technology and society

Big data and Emerging Technologies

Article

Id:

February 2025

Publisher:

IEEE Xplore

Text classification remains a fundamental challenge in natural language processing (NLP), with performance often limited by reliance on either traditional linguistic features or semantic embedding techniques in isolation. This study addresses that limitation by proposing a feature fusion method that integrates traditional linguistic features (such as part-of-speech tags, bag-of-words, TF-IDF, and n-grams) with semantic embedding techniques such as Word2Vec and Doc2Vec. The proposed approach aims to capture both syntactic and semantic nuances, improving the robustness and accuracy of text classification. To evaluate its effectiveness, the method was applied to five datasets across three domains: fake news detection, Bloom's taxonomy classification, and hate speech detection.
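The idea of feature fusion can be illustrated with a minimal sketch. It assumes that fusion means simple concatenation of a TF-IDF vector with an averaged word-embedding vector; the abstract does not specify the fusion operator, so this is an illustration rather than the authors' implementation. The toy corpus, the two-dimensional `toy_vectors` table (standing in for trained Word2Vec embeddings), and all function names are hypothetical.

```python
import math
from collections import Counter

# Toy corpus (hypothetical; not taken from the paper's datasets).
docs = [
    "breaking news shocking claim".split(),
    "students recall the definition".split(),
    "shocking claim about students".split(),
]

# Hypothetical 2-dimensional word vectors standing in for trained
# Word2Vec embeddings; real embeddings would be learned from data.
toy_vectors = {
    "breaking": [0.9, 0.1], "news": [0.8, 0.2], "shocking": [0.7, 0.3],
    "claim": [0.6, 0.4], "students": [0.1, 0.9], "recall": [0.2, 0.8],
    "definition": [0.3, 0.7],
}

# Fixed, sorted vocabulary so every document maps to the same columns.
vocab = sorted({word for doc in docs for word in doc})


def idf(term):
    """Inverse document frequency: log(N / df) + 1."""
    df = sum(term in doc for doc in docs)
    return math.log(len(docs) / df) + 1.0


def tfidf_vector(doc):
    """Linguistic view: term frequency times IDF for each vocabulary term."""
    counts = Counter(doc)
    return [counts[term] / len(doc) * idf(term) for term in vocab]


def mean_embedding(doc, dim=2):
    """Semantic view: average of the vectors of the words that have one."""
    known = [toy_vectors[word] for word in doc if word in toy_vectors]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def fused_features(doc):
    """Assumed fusion step: concatenate the two views into one vector."""
    return tfidf_vector(doc) + mean_embedding(doc)


features = [fused_features(doc) for doc in docs]
```

Each row of `features` has one column per vocabulary term followed by the embedding dimensions, so it can be passed to any standard classifier. In practice the two views would typically come from a library vectorizer and a trained embedding model rather than hand-written functions.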

This paper received the Best Paper Award at the 6th International Conference on Advancements in Computational Sciences (ICACS25) in Lahore, Pakistan, February 2025.

Link:

Conference article: Improving Text Classification by Fusing Linguistic and Sema…

Violence-Inducing Behaviour Prevention in Social-Cyber Space of Local Communities