A Comparative Analysis of Text Vectorization and Machine Learning Classifiers for Fake News Detection

Ashutosh Dhamija, Mukesh Kumar

doi:10.52783/pst.3302

Published: Mar 16, 2026

Abstract

In today’s digital era, the media landscape has shifted from print to online platforms, making information far easier to access and exchange. This shift has also intensified a major challenge: the rapid proliferation of fake news, that is, fabricated or misleading information that is easy to produce and disseminate. This paper addresses the growing global concern of misinformation and explores potential solutions based on machine learning. The study develops a model that assesses the authenticity of news articles and compares two Bag-of-Words text vectorization methods, Count Vectorizer and TF-IDF Vectorizer. Two classification algorithms, the Multinomial Naive Bayes Classifier and the Passive Aggressive Classifier, are used to detect fake news. The study further investigates how text pre-processing influences overall model performance. The dataset chosen for training comprises 67.7% curated information, while the remaining 33.3% is untrained raw data. Under optimum conditions, the model reaches an efficiency rate of 93.78%. This result indicates that the proposed methodology is effective at differentiating real news from fake news.
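
The pipeline named in the abstract (Bag-of-Words counts, TF-IDF reweighting, a Multinomial Naive Bayes classifier and a Passive Aggressive classifier) can be illustrated with a minimal standard-library Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy headlines and labels, the function names, the smoothed IDF formula, the Laplace smoothing value, and the PA-I update rule with C = 1.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and keep alphabetic runs (a deliberately simple pre-processing step)."""
    return re.findall(r"[a-z]+", text.lower())

def count_vectorize(docs):
    """Bag-of-Words: one sparse term-count mapping per document."""
    return [Counter(tokenize(d)) for d in docs]

def tfidf_vectorize(count_vecs, idf=None):
    """Reweight counts by a smoothed IDF (assumed formula), then L2-normalise."""
    n = len(count_vecs)
    if idf is None:  # fit IDF on these documents; otherwise reuse a fitted one
        df = Counter(t for vec in count_vecs for t in vec)
        idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    out = []
    for vec in count_vecs:
        w = {t: c * idf[t] for t, c in vec.items() if t in idf}
        norm = math.sqrt(sum(v * v for v in w.values())) or 1.0
        out.append({t: v / norm for t, v in w.items()})
    return out, idf

def train_nb(vecs, labels, alpha=1.0):
    """Multinomial Naive Bayes with Laplace smoothing (alpha is an assumed value)."""
    vocab = {t for v in vecs for t in v}
    model = {}
    for c in sorted(set(labels)):
        totals = defaultdict(float)
        for v, y in zip(vecs, labels):
            if y == c:
                for t, x in v.items():
                    totals[t] += x
        denom = sum(totals.values()) + alpha * len(vocab)
        log_prior = math.log(labels.count(c) / len(labels))
        log_like = {t: math.log((totals[t] + alpha) / denom) for t in vocab}
        model[c] = (log_prior, log_like)
    return model

def predict_nb(model, vec):
    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(x * log_like[t] for t, x in vec.items() if t in log_like)
    return max(model, key=score)

def train_pa(vecs, labels, positive, C=1.0, epochs=5):
    """Binary Passive Aggressive (PA-I variant): update only when hinge loss > 0."""
    w = defaultdict(float)
    for _ in range(epochs):
        for v, label in zip(vecs, labels):
            y = 1.0 if label == positive else -1.0
            loss = max(0.0, 1.0 - y * sum(w[t] * x for t, x in v.items()))
            sq_norm = sum(x * x for x in v.values())
            if loss > 0.0 and sq_norm > 0.0:
                tau = min(C, loss / sq_norm)  # step size capped by C
                for t, x in v.items():
                    w[t] += tau * y * x
    return w

def predict_pa(w, vec, positive, negative):
    score = sum(w.get(t, 0.0) * x for t, x in vec.items())
    return positive if score >= 0 else negative

# Hypothetical toy data, not the paper's dataset.
train_docs = [
    "shocking miracle cure doctors hate",
    "aliens secretly control the government",
    "city council approves new budget",
    "central bank raises interest rates",
]
train_labels = ["FAKE", "FAKE", "REAL", "REAL"]
test_docs = ["miracle cure shocking aliens", "council approves budget rates"]

count_train = count_vectorize(train_docs)
tfidf_train, idf = tfidf_vectorize(count_train)
count_test = count_vectorize(test_docs)
tfidf_test, _ = tfidf_vectorize(count_test, idf)  # reuse the training IDF

nb_count = train_nb(count_train, train_labels)
nb_tfidf = train_nb(tfidf_train, train_labels)
pa_tfidf = train_pa(tfidf_train, train_labels, positive="FAKE")

for i, doc in enumerate(test_docs):
    print(doc, "->",
          predict_nb(nb_count, count_test[i]),
          predict_nb(nb_tfidf, tfidf_test[i]),
          predict_pa(pa_tfidf, tfidf_test[i], "FAKE", "REAL"))
```

On real data the same comparison would be run over a held-out test split, with each vectorizer and classifier pairing scored separately; a library such as scikit-learn provides tested versions of all four components.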

Issue

Vol. 50 No. 1 (2026)

Section

Articles
