1 Introduction

Cyber threats have risen considerably as the world adopts digital devices at a rapid pace. Network security has thus become a pressing issue for individuals and enterprises alike. Traditional Intrusion Detection Systems (IDS) have long been used to monitor and mitigate such threats [1]. Nevertheless, they are not very effective against today's dynamic, fast-changing threat landscape. The major limitation of traditional IDS is that they are largely signature-based: they match observed network traffic against a database of known attack signatures. This approach is effective against known threats but not against novel or unknown attacks, known as zero-day threats [2]. In addition, traditional IDS must be updated manually and frequently to maintain detection accuracy, which is both time-consuming and prone to human error.

Recent developments in Machine Learning (ML) and Deep Learning (DL) have shown that these techniques can boost IDS performance. These methods can reveal complex patterns and anomalous behaviour in network data and thereby improve the detection of cyberattacks [4]. Still, conventional ML-based approaches rely on feature engineering, a process that is time-consuming and requires specialist skills [5]. In contrast, DL-based IDS can automate feature learning, which reduces the need to hand-craft features and can yield higher classification accuracy [6]. DL models can also process large quantities of network data in a timely manner to detect modern cyberattacks and anomalies [7].

Although DL-based IDS are useful, they face challenges. Labeled training data, which is vital for building accurate models, is often scarce or unavailable [8]. In addition, the computational cost of these methods may be a barrier for organizations with limited resources [9]. To overcome these constraints, researchers have used Generative Adversarial Networks (GANs) to generate synthetic training data and transfer learning to improve model performance [10]. The use of cloud infrastructure and distributed computing has also been suggested to address the high resource demands of DL-based IDS [11].

In this work we present an enhanced Multi-Layer Perceptron (MLP) model for detecting and classifying intrusions. The goal is to overcome the limitations of traditional IDS and the difficulties faced by current DL methods. To demonstrate its effectiveness, we test the model on a large-scale dataset and compare it with other IDS methods; the results show that the model detects and classifies cyberattacks with high accuracy. The remainder of the paper is organized as follows. Existing literature on IDS and DL-based IDS methods is reviewed in Sect. 2. The proposed Enhanced MLP deep learning model and its design are explained in Sect. 3. In Sect. 4, we evaluate the model's performance on a large dataset and compare it with traditional methods. Finally, Sect. 5 concludes the study and discusses directions for future research.

2 Related Works

Geng et al. enhanced a CNN with a deep sparse autoencoder and an attention mechanism for network intrusion detection. The approach improves feature extraction and the detection of rare intrusions through data expansion with ADSAE. It achieved higher precision: 89.1 on UNSW-NB15 and 94.2 on CSE-CIC-IDS2018 [12].

Ahamed et al. proposed CIDS, a machine-learning-based intrusion detection system that addresses difficulties such as high false-positive rates and zero-day attacks. CIDS was tested on the KDD Cup 1999 and NSL-KDD datasets and demonstrated higher accuracy in packet sniffing and in identifying suspicious network behaviour than classical signature-based and anomaly-based systems [13]. Ali et al. compared several ML algorithms for network intrusion detection, with a focus on handling high-dimensional data. On a dataset containing 41 features, Random Forest scored 99.78 compared with 53.15 for SVM. Their work highlights differences in ML performance, which can help in developing effective NIDS [14].

Shi et al. assigned a hunger weight to the algorithm's foraging behaviour, which helps to balance exploiting existing solutions against exploring new ones. To find better solutions and avoid becoming trapped in local optima, they applied an alternating and cooperative foraging strategy. They also used greedy Cauchy mutation, which improves global search capability and exploits the hummingbirds' position data. To confirm the improvements, the team tested the algorithm on 10 well-known benchmark functions and conducted statistical analyses. They further introduced a binary version, known as BEAHA, for feature selection, which is used to find the optimal feature set [15].

Saravanan et al. studied mobile ad hoc networks (MANETs), which establish connections on their own and are known for being dynamic and cooperative. Intermediate nodes play a key role in forwarding data packets between the source and the destination. Unfortunately, attackers target these nodes to obtain sensitive data. This shows how essential a strong intrusion detection mechanism is in MANETs, particularly for multipath routing. To identify possible intrusions at these nodes, the authors presented a technique based on a graph neural network (GNN). After training it on a variety of datasets, they evaluated the GNN in the network. According to the results, the GNN provides more protection against attacks than other strategies [16].

3 Methodology

This study proposes a new method of intrusion detection that combines the benefits of a CNN and an Enhanced Multilayer Perceptron (EMLP). The research is carried out in six systematic stages. First, the data is obtained from the KDD CUP99 dataset, a well-known and dependable source. Second, the raw data is cleaned and processed so that it is consistent and ready to be analyzed. Third, a model based on EMLP-CNN is implemented and trained to extract features and discover latent patterns in the data. Fourth, the results of the proposed model are assessed with standard criteria: accuracy, precision, recall, and F1-score. Fifth, the model is checked to ensure reliable and efficient operation. Lastly, in the sixth stage, the results and findings of the research are documented (Fig. 1).

Fig. 1 Proposed methodology dataflow. The flowchart begins with "Data Collection (KDD CUP99 Dataset)", followed by "Data Preprocessing" and "Feature Extraction (EMLP-CNN model)". The process continues to "Testing", loops back to "Feature Extraction", proceeds to "Evaluation metrics (Accuracy, Recall, F1-Score)", and ends with "Output".

Data Collection:

The KDD CUP99 dataset is a popular resource for evaluating intrusion detection systems. It has 494,021 records, each with 41 features, and is divided into five categories: normal, probe, DoS, user to root (U2R), and remote to local (R2L). The dataset is highly imbalanced: 97,277 records (about 19.7%) are marked as normal, 4,071 (0.8%) are probes, 229,853 (46.5%) are DoS incidents, 52 (0.01%) are U2R, and 1,126 (0.2%) are R2L. The 41 features can be grouped into four types: basic features, content features, time-based features, and host-based features. The dataset presents challenges such as class imbalance, a wide range of features, and noise and outliers, which makes it a useful benchmark for testing IDSs.

Data Preprocessing:

Preparing the data is an important step in building a good IDS. Preprocessing transforms the raw data into a format that is suitable for both training and evaluating the EMLP-CNN model. We preprocess the KDD CUP99 dataset to fix problems such as errors, inconsistencies, and missing information, and we ensure that the features are properly scaled and formatted. This helps the EMLP-CNN model work better and leads to more accurate intrusion detection results.

Data Cleaning:

Data cleaning removes mistakes, inconsistencies, and gaps from the dataset so that the information can be trusted. We began by removing duplicate records from the KDD CUP99 dataset, which eliminates redundant data. To address missing values, we used the mean imputation method [17], in which each missing value is replaced with the average of that feature. After these steps, the data is ready for the next stage.
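
The two cleaning steps can be illustrated with a minimal sketch. It assumes records are held as lists of numbers with `None` marking a missing value; the function names and the toy records are hypothetical and are not taken from the study's implementation.

```python
from statistics import mean


def drop_duplicates(rows):
    """Return rows with exact duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))
    return unique


def impute_mean(rows):
    """Replace None entries in each column with the mean of the observed values."""
    filled = [list(row) for row in rows]
    for j in range(len(rows[0])):
        observed = [row[j] for row in rows if row[j] is not None]
        col_mean = mean(observed)  # assumes every column has at least one value
        for row in filled:
            if row[j] is None:
                row[j] = col_mean
    return filled


# Toy records (made up): one exact duplicate and one missing value.
records = [
    [0.0, 10.0],
    [0.0, 10.0],
    [2.0, None],
    [4.0, 30.0],
]
cleaned = impute_mean(drop_duplicates(records))
print(cleaned)
```

In this sketch duplicates are removed before imputation, so that repeated records do not bias the column means.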

Data Normalization:

Normalization is an important component of data preparation for machine learning. It rescales the values of the features to a common range so that no individual feature improperly affects the model. Without it, features with greater numerical ranges would take precedence and could skew the results. In this analysis, Min-Max scaling is used so that all feature values are mapped to the range from 0 to 1. The approach is common in data analytics and machine learning because it limits the influence of differing feature scales, leading to a more precise and consistent model. Normalization thus guards against the model concentrating on one or a few features over the rest.

Min-Max Scale Method:

Min-Max scaling is one of the most frequently used normalization methods and works especially well when datasets include features with different scales or units of measurement. It maps the values of each feature to the same range, most often [0, 1], so that no feature has too strong an influence on the learning process [18]. Because all features are brought to the same scale, Min-Max normalization improves the stability and performance of the models.

To apply Min-Max scaling, the minimum value of each feature is subtracted from every value of that feature, and the result is divided by the range, that is, the difference between the highest and lowest values. The formula is:

$$ X' = \frac{X - X_{\min}}{X_{\max} - X_{\min}} $$
(1)

In the above formula, X is the original value, Xmin is the smallest value of the feature, Xmax is the largest, and X' is the normalized value. Applying this method maps the data to a shared range, which can help to:

  • Lessen the impact of different feature scales

  • Make the model more stable and accurate

  • Stop features with large ranges from taking control of the model
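
As a small worked example of Eq. (1), the sketch below scales a single feature column. The function name and the sample values are illustrative assumptions, and mapping a constant feature to 0.0 is a choice made here to avoid division by zero, not something the study specifies.

```python
def min_max_scale(values):
    """Scale a list of numbers to [0, 1] following Eq. (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: the denominator would be zero, so map to 0.0 (assumption).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up values for one numeric feature.
feature = [0, 250, 500, 1000]
scaled = min_max_scale(feature)
print(scaled)
```

The minimum and maximum would normally be computed on the training data only and then reused for the test data, so that no information from the test set leaks into training.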

Data Transformation:

Transforming data is an important part of preparing it for analysis. This process changes the data into a format that works well for training and testing the model we have in mind. It’s essential because machine learning algorithms only work with numbers, and the KDD CUP99 dataset has categories that need to be converted into numbers. In this work, we took those categorical features and turned them into numerical ones by using a method called one-hot encoding, which is quite common for this kind of task.

So, how does one-hot encoding work? For each category it generates a new binary column. Suppose we have a feature called "protocol" with three possible values: "tcp", "udp", and "icmp". One-hot encoding creates three new columns: "protocol tcp", "protocol udp", and "protocol icmp". In each row, the new column gets a 1 if the original category appears in that row and a 0 if it does not. This modification allows categorical data to be processed by machine learning algorithms in the same way as numerical data.
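
The "protocol" example can be written out as a short sketch. The helper name, the underscore in the generated column names, and the alphabetical ordering of the columns are assumptions made for illustration.

```python
def one_hot(values, prefix):
    """Return (column_names, rows) with one binary column per distinct category."""
    categories = sorted(set(values))
    columns = [f"{prefix}_{c}" for c in categories]
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return columns, rows


columns, encoded = one_hot(["tcp", "udp", "icmp", "tcp"], "protocol")
print(columns)
for row in encoded:
    print(row)
```

Each row contains exactly one 1, in the column that matches its original category.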

This approach has advantages: it is easy to set up and straightforward to use. However, it also has drawbacks: the additional features may complicate the data and introduce multicollinearity among them. In this study, we applied one-hot encoding to the categorical features of the KDD CUP99 dataset, namely the "protocol", "service", and "flag" features. The resulting numerical features are then used to train and test our EMLP-CNN model. Representing the categorical features numerically in this way helped our model to learn the connections and complex relationships in the data.

Feature Selection:

Selecting the right features is a key part of building a good intrusion detection system. This process involves picking out the most important features from the dataset in order to improve the model's performance. The aim is to find the characteristics that are closely related to the target, in this case whether a record is normal or an attack. When we focus on the most relevant features, the model learns better and becomes more accurate at spotting intrusions.

To pick the features, we used a correlation-based method in this study. This approach measures how strongly each feature is connected to the target. With values ranging from -1 to 1, the correlation coefficient describes the relationship between two variables. A value of -1 represents a strong negative relationship, and a value of 1 represents a strong positive relationship. For further analysis, we select the features with a correlation coefficient above 0.5.
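
A minimal sketch of this selection rule is given below, assuming the Pearson correlation coefficient and a binary target (0 for normal, 1 for attack). The feature names and values are made up, and treating a constant column as having zero correlation is an assumption of the sketch.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: correlation undefined, treated as 0 (assumption)
    return cov / (sx * sy)


def select_features(features, target, threshold=0.5):
    """Keep the names of features whose correlation with the target exceeds threshold."""
    return [name for name, col in features.items() if pearson(col, target) > threshold]


target = [0, 0, 1, 1, 1, 0]  # 0 = normal, 1 = attack (toy labels)
features = {
    "count": [1, 2, 8, 9, 7, 2],
    "duration": [5, 1, 4, 2, 6, 3],
    "constant": [1, 1, 1, 1, 1, 1],
}
selected = select_features(features, target)
print(selected)
```

The rule is implemented as stated, with a signed coefficient above 0.5. A variant that compares the absolute value would also keep strongly negatively correlated features; whether the study did so is not stated.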

The correlation method is common in machine learning and data analysis, is easy to use, and is compatible with a wide range of algorithms. It does have certain disadvantages. It mainly captures linear relationships and may miss non-linear ones. It can also select features that are highly similar to one another, which could cause multicollinearity.

Feature Extraction:

Feature extraction is important for creating an effective intrusion detection system because it transforms the raw features into a more compact and efficient representation. Dimensionality reduction, data transformation, and the generation of new features are the most common approaches to feature extraction. In this work, we used Principal Component Analysis (PCA) to reduce the dimensionality of the dataset substantially while preserving the most important information.

PCA is widely used for feature extraction in machine learning and data mining. It transforms the original features into a new set of uncorrelated variables, referred to as principal components, that are arranged in order of importance. This transformation gives an effective and simplified representation of the initial data, which can boost the performance of the EMLP-CNN model [20]. PCA was used to extract the principal components of the dataset, which were then fed into the EMLP-CNN model for training and testing. The outcomes indicate that feature extraction using PCA contributed meaningfully to the intrusion detection performance.
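
To make the idea concrete, the sketch below finds the first principal component of a tiny two-feature dataset. It uses power iteration on the sample covariance matrix, which is a simple stand-in chosen for illustration; a library PCA routine would normally be used, and the study does not say how its PCA was computed. The function name and data points are hypothetical.

```python
from math import sqrt


def first_principal_component(rows, iterations=100):
    """Return (component, scores) for the first principal component of rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeatedly multiply by the covariance matrix and renormalize.
    # The fixed start vector would fail if it were orthogonal to the true component.
    v = [1.0 / sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Project each centered record onto the component.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Made-up points lying close to the line y = x.
points = [[1.0, 1.0], [2.0, 2.1], [3.0, 2.9], [4.0, 4.0]]
component, scores = first_principal_component(points)
print(component)
print(scores)
```

Because the points lie almost on a line, a single component captures nearly all of the variance, and the two original features can be replaced by one score per record.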

Multilayer Perceptron:

A Feed Forward Neural Network (FNN) is a kind of neural network in which a set of inputs is linked to a set of outputs. A typical FNN, the Multi-Layer Perceptron (MLP), is composed of three types of layers: input, hidden, and output. Each hidden and output layer uses a nonlinear activation function, which permits the network to learn non-linear relationships. All layers are fully interconnected, in that every neuron in one layer is connected to every neuron in the next. The performance of the network is measured with an error function (E), which can be mathematically defined as:

$$ E=\sum_{k=1}^n{\left({d}^{(k)}-{y}^{(k)}\right)}^2 $$
(2)

In this equation, d denotes the target value and y is the output vector of the MLP. Once the error value E is determined, the bias and weight can be adjusted using the following formulas:

$$ {w}_{new}={w}_{prev}-\eta\ \frac{\partial E}{\partial {w}_{prev}} $$
(3)
$$ {\theta}_{new}={\theta}_{prev}-\eta\ \frac{\partial E}{\partial {\theta}_{prev}} $$
(4)

Here d^(k) denotes the k-th target vector and y^(k) the corresponding output vector, and η represents the learning rate in Eqs. (3) and (4). The weight is denoted by w and the bias by θ.
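
The update rules in Eqs. (3) and (4) can be traced on a deliberately small case. The sketch assumes a single neuron with a linear output y = w·x + θ and a squared-error loss, so that the gradients can be written by hand; a real MLP has many weights, nonlinear activations, and gradients obtained by backpropagation. All names and data values are illustrative.

```python
def squared_error(w, theta, xs, ds):
    """Sum of squared differences between targets d and outputs y = w * x + theta."""
    return sum((d - (w * x + theta)) ** 2 for x, d in zip(xs, ds))


def gradient_step(w, theta, xs, ds, eta):
    """Apply one update of Eqs. (3) and (4) to the weight w and bias theta."""
    grad_w = 0.0
    grad_theta = 0.0
    for x, d in zip(xs, ds):
        y = w * x + theta
        grad_w += -2.0 * (d - y) * x   # dE/dw for the squared error
        grad_theta += -2.0 * (d - y)   # dE/dtheta for the squared error
    return w - eta * grad_w, theta - eta * grad_theta


xs = [0.0, 1.0, 2.0]
ds = [1.0, 3.0, 5.0]   # toy targets generated by d = 2x + 1
w, theta = 0.0, 0.0
before = squared_error(w, theta, xs, ds)
for _ in range(500):
    w, theta = gradient_step(w, theta, xs, ds, eta=0.05)
after = squared_error(w, theta, xs, ds)
print(w, theta, before, after)
```

With a small enough learning rate η, each step moves w and θ against the gradient and the error shrinks; here the parameters approach the values 2 and 1 that generated the toy targets.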

Enhanced Multilayer Perceptron:

The Enhanced Multilayer Perceptron (EMLP) is a type of neural network that builds on the traditional Multilayer Perceptron and adds further features and techniques to improve performance. In this research, we used the EMLP design to create a strong system for detecting intrusions. By training the EMLP model with the selected features, we were able to achieve high accuracy in identifying intrusions. The strength of the EMLP design lies in how well it recognizes complex patterns and connections in data, through the use of several hidden layers that give the model layered insight into the data. Batch normalization and dropout regularization are also used with EMLP; they make the model more adaptable and reduce the risk of overfitting. Overall, the EMLP structure is an effective way to create intrusion detection systems (Fig. 2).

Fig. 2 Proposed EMLP-CNN model architecture. The architecture starts with an "Input Layer", followed by an "EMLP Component" with two hidden layers and a "CNN Component" with two convolutional layers and max pooling. This is followed by a "Fully Connected Layer" with a softmax function and ends with an "Output Layer".

Convolutional Neural Networks (CNNs):

Applying CNNs to intrusion classification is an important element of modern security. Conventional detection methods have difficulty recognizing sophisticated or persistent attacks, whereas CNNs and other deep learning algorithms are effective in this area. These networks are particularly good at finding patterns in unstructured data, so they can pick up small variations in system operation or network traffic that might point to an intrusion. Their ability to handle high-dimensional data with complex structure makes them well suited to intrusion classification.

Technically, a CNN is a form of feed-forward neural network that uses convolutional layers to extract features automatically from the input data [27]. For instance, a one-dimensional CNN (1D CNN) acts on a single vector and carries out convolution operations to create new feature representations [28]. The output of a CNN can be expressed mathematically as:

$$ y(x)=f\left(\sum_{j=1}^{n}\sum_{i=1}^{m}{w}_{ij}{x}_{ij}+b\right) $$
(5)

In this equation, f(·) is the activation function, w_ij represents the weight of the convolution kernel at position (i, j) in an m × n kernel, x_ij is the corresponding input value, and b is the offset value.
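
For the one-dimensional case, Eq. (5) reduces to sliding a kernel over the input vector, taking a weighted sum at each position, adding the offset, and applying the activation. The sketch below assumes ReLU as the activation, stride 1, and no padding; the kernel and input values are made up.

```python
def relu(z):
    """Rectified linear unit, used here as the activation f (an assumption)."""
    return max(0.0, z)


def conv1d(x, kernel, bias=0.0, activation=relu):
    """Slide kernel over x with stride 1 and no padding, as in Eq. (5) for one dimension."""
    k = len(kernel)
    return [
        activation(sum(kernel[i] * x[start + i] for i in range(k)) + bias)
        for start in range(len(x) - k + 1)
    ]


features = [0.0, 1.0, 2.0, 1.0, 0.0]   # toy input vector
kernel = [1.0, 0.0, -1.0]              # toy kernel that responds to falling values
out = conv1d(features, kernel)
print(out)
```

As in most deep learning frameworks, the kernel is applied without flipping, which is strictly a cross-correlation. An input of length 5 and a kernel of length 3 give an output of length 3.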

The softmax function acts as the activation function for the fully connected layer, with its output defined as:

$$ {\sigma}_t=\mathrm{softmax}\left({w}_{ho}\ast H+{b}_0\right) $$
(6)

Here, w_ho is the convolution kernel, H stands for the feature representation, and b_0 is the offset value, which ranges from one to three.
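
The softmax step in Eq. (6) turns the scores of the fully connected layer into class probabilities. The sketch below treats the product of w_ho and H as an ordinary matrix-vector product, which is an interpretation made here; the weights, feature vector, and three-class setup are made up.

```python
from math import exp


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)                        # subtract the max for numerical stability
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense_softmax(weights, h, biases):
    """Compute softmax(W h + b) for a weight matrix given as a list of rows."""
    scores = [sum(w * x for w, x in zip(row, h)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)


weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy weights, three classes
h = [0.5, 0.5]                                    # toy feature representation
probs = dense_softmax(weights, h, [0.0, 0.0, 0.0])
print(probs)
```

The class with the largest score receives the largest probability, and the predicted label is the index of that maximum.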

By using CNNs, security systems can quickly respond to new threats without the hassle of manual feature engineering. This not only boosts the accuracy and efficiency of detecting malicious actions but also helps cybersecurity experts stay ahead of fast-evolving online dangers, ensuring the protection of vital data and digital infrastructures.

Testing:

In the testing stage, we used a testing set consisting of 10,000 samples to evaluate how well the new EMLP-CNN model works. Each sample had 41 features and a label indicating whether it was normal or an attack. The dataset was divided into two parts: 80% for training the model and 20% for testing. Evaluating on held-out test data shows how well the model generalizes and helps to detect overfitting.
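
An 80/20 split of this kind can be sketched as follows. The shuffling, the fixed seed, and the function name are assumptions made for illustration; the study does not describe how its split was drawn.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=42):
    """Shuffle indices with a fixed seed and hold out test_fraction of the samples."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(samples) * test_fraction))
    test_idx = set(indices[:n_test])
    train = [s for i, s in enumerate(samples) if i not in test_idx]
    test = [s for i, s in enumerate(samples) if i in test_idx]
    return train, test


samples = list(range(100))   # stand-in for 100 labeled records
train, test = train_test_split(samples)
print(len(train), len(test))
```

With strongly imbalanced classes, a plain random split can leave very few examples of the rare classes in the test set; a stratified split is a common alternative.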

4 Results and Findings

In this section, we present the results of the EMLP-CNN model tested on the KDD CUP99 dataset. The model achieves 98.5% accuracy, with 97.8% recall, 98.2% precision, and a 98.0% F1-score. These results show how well the model performs in identifying intrusions.

We compare the EMLP-CNN model with other models: SVM, Random Forest, CNN, and MLP. The outcome demonstrates that the EMLP-CNN model outperforms them in terms of accuracy, recall, precision, and F1-score, which suggests that the method is better suited to identifying intrusions in the KDD CUP99 dataset. The combination of the EMLP and CNN structures results in strong performance. The CNN framework supports learning spatial features, while the EMLP framework helps to identify complex patterns and relationships in the data. This combination accounts for the model's strong results.

Accuracy:

Accuracy is the proportion of correct predictions among all predictions, counting both positive and negative results. Higher accuracy indicates better overall performance.

$$ \textrm{Accuracy}=\left(\textrm{TP}+\textrm{TN}\right)/\left(\textrm{TP}+\textrm{TN}+\textrm{FP}+\textrm{FN}\right) $$
(7)

Precision:

Precision focuses on how many of the predicted positive results are actually correct. It checks how well the model avoids making mistakes when it predicts a positive outcome. When precision is higher, it means there are fewer incorrect positive results.

$$ \textrm{Precision}=\textrm{TP}/\left(\textrm{TP}+\textrm{FP}\right) $$
(8)

Recall:

Recall looks at how many of the actual positive cases the model correctly identifies. It measures how good the model is at finding all the real positives. A higher recall means it misses fewer positive cases.

$$ \textrm{Recall}=\textrm{TP}/\left(\textrm{TP}+\textrm{FN}\right) $$
(9)

F1-Score:

F1-score combines both precision and recall into one number. It helps find a balance between how many positive predictions are correct and how many actual positives are found. A higher F1-score shows that the model is performing better overall (Table 1).

$$ \textrm{F1-score}=2\times \left(\textrm{Precision}\times \textrm{Recall}\right)/\left(\textrm{Precision}+\textrm{Recall}\right) $$
(10)
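
The four metrics in Eqs. (7) to (10) can be computed directly from the confusion-matrix counts. The counts in the sketch below are made up for illustration and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1-score from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * (precision * recall) / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 90 attacks detected, 15 missed, 10 false alarms, 85 true negatives.
metrics = classification_metrics(tp=90, tn=85, fp=10, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The sketch assumes at least one predicted positive and one actual positive; otherwise the precision or recall denominator is zero and needs separate handling.
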
Table 1 Comparison of the proposed EMLP-CNN model

The EMLP-CNN model performed best, with an accuracy of 98.5%, the highest among all tested models. Its recall, precision, and F1-score were 97.8%, 98.2%, and 98.0%, respectively, which indicates that it is an effective model. The CNN model was the next best, with an accuracy of 97.2%, recall of 96.5%, precision of 97.8%, and F1-score of 97.1%. The Random Forest model followed, with an accuracy of 96.3%, recall of 95.6%, precision of 96.9%, and F1-score of 96.2%. The SVM model recorded an accuracy of 95.2%, with a recall of 94.5%, precision of 95.8%, and F1-score of 95.1%. The MLP model recorded the lowest performance, with an accuracy of 94.8%, recall of 94.1%, precision of 95.4%, and F1-score of 94.7%.

Overall, the results suggest that the EMLP-CNN model outperformed all other evaluated methods in terms of accuracy, recall, precision, and F1-score. Based on this performance, the EMLP-CNN appears to be a capable and robust model for identifying intrusions in the KDD CUP99 dataset. Its success can be attributed to its ability to learn complex functions by using the EMLP and CNN architectures in tandem (Fig. 3).
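
As a quick consistency check, the F1-scores can be recomputed from the reported precision and recall values using the F1 formula. The numbers below are the percentages stated in the text; the dictionary layout and names are only for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) for each model, as stated in the text.
reported = {
    "EMLP-CNN": (98.2, 97.8, 98.0),
    "SVM": (95.8, 94.5, 95.1),
    "Random Forest": (96.9, 95.6, 96.2),
    "CNN": (97.8, 96.5, 97.1),
    "MLP": (95.4, 94.1, 94.7),
}

recomputed = {name: round(f1_score(p, r), 1) for name, (p, r, _) in reported.items()}
for name, (_, _, stated) in reported.items():
    print(f"{name}: recomputed {recomputed[name]}, stated {stated}")
```

For all five models the recomputed value, rounded to one decimal place, matches the stated F1-score.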

Fig. 3 Comparison of the proposed EMLP-CNN model. The bar chart compares the performance of the proposed EMLP-CNN, SVM, Random Forest, CNN, and MLP models across four categories, with performance values ranging from 0 to 100 on the vertical axis.

5 Conclusion

This study reports an accuracy of 98.5% and a recall (detection rate) of 97.8% for detecting and classifying intrusions. Compared with the other models, the proposed model has a clear structure and achieved the best results. The results demonstrate that the enhanced MLP model is highly effective in identifying and classifying intrusions, which positions it as a competitive candidate for strengthening network security. The study highlights how deep learning methods, particularly the enhanced MLP model, can improve intrusion detection and classification systems. Because of its high accuracy and efficiency in detecting and classifying intrusions, the model is a meaningful addition to the network security field. In the future, researchers might apply this model in areas such as IoT security or cloud computing and investigate different deep learning frameworks to improve its performance.