EMAIL SPAM DETECTION USING SUPPORT VECTOR MACHINES AND NATURAL LANGUAGE PROCESSING

Kamatchy, B and Kalaichelvi, N and Muthukumaran, S (2026) EMAIL SPAM DETECTION USING SUPPORT VECTOR MACHINES AND NATURAL LANGUAGE PROCESSING. In: 4th INTERNATIONAL CONFERENCE ON CYBER SECURITY AND GENERATIVE ARTIFICIAL INTELLIGENCE, 13/03/2026, SRM, Chennai.

Conference_srm_spammail-paper.pdf - Published Version

Abstract

The growing use of email communication has led to a massive influx of unwanted and viral
messages, otherwise known as spam. Spam must be detected effectively so that legitimate
email is not lost and email communication remains secure. This paper introduces an automated
email spam detection solution based on Natural Language Processing (NLP) and a Multi-Kernel
Support Vector Machine (MK-SVM) model. The Term Frequency-Inverse Document Frequency
(TF-IDF) technique converts email text into numerical features, and Support Vector Machines
with Linear, Radial Basis Function (RBF), Polynomial, and Sigmoid kernels perform the
classification. Each kernel is evaluated on several metrics, including accuracy, precision,
recall, specificity, and error rate. The experimental results indicate that the Linear and
Sigmoid kernel SVMs classify most accurately, with the lowest error of 0.011, compared with
the RBF and Polynomial kernels. These results suggest that simpler kernel functions are better
suited to large, high-dimensional text data and are the most appropriate choice for NLP-based
email spam filters.
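
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard confusion-matrix definitions with spam as the positive class; the paper may define the metrics differently. The names `ConfusionCounts` and `evaluate` are hypothetical, and the counts in the example are made up for illustration, not results from the paper.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container; spam is treated as the positive class."""
    tp: int  # spam correctly flagged as spam
    fp: int  # legitimate mail wrongly flagged as spam
    tn: int  # legitimate mail correctly passed
    fn: int  # spam wrongly passed as legitimate


def _ratio(numerator: int, denominator: int) -> float:
    # Return 0.0 instead of dividing by zero when a class is absent.
    return numerator / denominator if denominator else 0.0


def evaluate(c: ConfusionCounts) -> Dict[str, float]:
    """Compute the five metrics from confusion counts (standard definitions)."""
    total = c.tp + c.fp + c.tn + c.fn
    if total == 0:
        raise ValueError("confusion counts must not all be zero")
    return {
        "accuracy": (c.tp + c.tn) / total,
        "precision": _ratio(c.tp, c.tp + c.fp),
        "recall": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "error_rate": (c.fp + c.fn) / total,
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; not taken from the paper.
    counts = ConfusionCounts(tp=90, fp=5, tn=400, fn=5)
    for name, value in evaluate(counts).items():
        print(f"{name}: {value:.3f}")
```

Under these definitions, error rate and accuracy sum to 1, so an error of 0.011 would correspond to an accuracy of 0.989 if the paper uses the same convention.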

Item Type: Conference or Workshop Item (Paper)
Subjects: Computer Applications > Artificial Intelligence
Computer Applications > Computer Networks
Domains: Computer Applications
Depositing User: Mr IR Admin
Date Deposited: 16 May 2026 11:40
Last Modified: 18 May 2026 12:30
URI: https://ir.vistas.ac.in/id/eprint/19880
