Fusion ofstatistical importance for feature selection in Deep Neural Network-basedIntrusion Detection System | Tài liệu môn An toàn thông tin

Intrusion Detection System (IDS) is an essential part of network as it contributes towards securing the network against various vulnerabilities and threats. Over the past decades, there has been comprehensive study in the field of IDS and various approaches have been developed to design intrusion detection and classification system. With the proliferation in the usage of Deep Learning (DL) techniques and their ability to learn data extensively, Tài liệu giúp bạn tham khảo, ôn tập và đạt kết quả cao. Mời bạn đọc đón xem!

Information Fusion 90 (2023) 353–363
Available online 3 October 2022
1566-2535/© 2022 Elsevier B.V. All rights reserved.
Contents lists available at ScienceDirect
Information Fusion
journal homepage: www.elsevier.com/locate/inffus
Full length article
Fusion of statistical importance for feature selection in Deep Neural
Network-based Intrusion Detection System
Ankit Thakkar, Ritika Lohiya
Institute of Technology, Nirma University, Ahmedabad, Gujarat 382 481, India
A R T I C L E I N F O
Keywords:
Intrusion Detection System
Deep Learning
Filter-based feature selection
Deep Neural Network
Standard deviation
Fusion of statistical importance
A B S T R A C T
Intrusion Detection System (IDS) is an essential part of network as it contributes towards securing the network
against various vulnerabilities and threats. Over the past decades, there has been comprehensive study in the
field of IDS and various approaches have been developed to design intrusion detection and classification system.
With the proliferation in the usage of Deep Learning (DL) techniques and their ability to learn data extensively,
we aim to design Deep Neural Network (DNN)-based IDS. In this study, we aim to focus on enhancing the
performance of DNN-based IDS by proposing a novel feature selection technique that selects features via
fusion of statistical importance using Standard Deviation and Difference of Mean and Median. Here, in the
proposed approach, features are pruned based on their rank derived using fusion of statistical importance.
Moreover, fusion of statistical importance aims to derive relevant features that possess high discernibility and
deviation, that assists in better learning of data. The performance of the proposed approach is evaluated using
three intrusion detection datasets, namely, NSL-KDD, UNSW_NB-15, and CIC-IDS-2017. Performance analysis is
presented in terms of different evaluation metrics such as accuracy, precision, recall, -score, and False Positive𝑓
Rate (FPR) and the results are compared with existing feature selection techniques. Apart from evaluation
metrics, performance comparison is also presented in terms of execution time. Moreover, results achieved are
also statistically tested using Wilcoxon Signed Rank test.
1. Introduction
The fundamental rationale of developing Intrusion Detection System
(IDS) is to detect and classify network samples with precise classi-
fication accuracy and minimum false alarms [1,2]. Hence, various
philosophical and analytical design principles should be taken into
consideration while developing intrusion detection and classification
system. Over past decades, there has been various efforts in designing
an efficient IDS using Deep Learning (DL) techniques [3]. Furthermore,
DL techniques such as Deep Neural Network (DNN) has emerged as one
of the leading solutions for building an efficient IDS [ ]. This is because,4
DNN has an intriguing characteristic attribute of performing end-to-end
learning and in-depth analysis to derive patterns in data for prediction
and classification [ ]. Hence, DL techniques such as DNN can be5
considered as one of the intelligent techniques that performs implicit
learning on high-dimensional data with ease. Apart from handling high-
dimensional data, DNN offers high-level data abstraction and good
generalization ability for underlying attack classification problem [ ].6
In this research, we aim to design a DNN-based intrusion detection
and classification system by applying fusion of statistical importance
by considering statistical measures for feature engineering on intrusion
Corresponding author.
E-mail addresses: ankit.thakkar@nirmauni.ac.in (A. Thakkar), 18ftphde30@nirmauni.ac.in (R. Lohiya).
detection datasets. We have applied fusion of statistical importance
using statistical measures to derive association and identify significance
of features for feature selection [ ]. For intrusion classification, DNN7
requires a large amount of data for learning and deriving patterns. In
the field of IDS, various intrusion detection datasets have been devel-
oped for analysis and learning [ ]. These datasets have been developed8
by capturing raw network traffic flowing through underlying network
environment. Various networking tools such as Wireshark and Nmap
are used for capturing raw network traffic [9]. Further, the captured
data is stored in the form of pcap or tcpdump files, which are processed
to extract network features from network packets consisting header
and payload information [10]. Hence, intrusion detection datasets used
for the performance evaluation comprises of high-dimensional network
feature space for learning. However, considering network features,
there is a possibility that intrusion detection datasets might consist
of redundant and irrelevant features that conceivably would either
affect or would not contribute towards prediction and classification
process [ ].11
Consequently, considering the role and importance of feature en-
gineering in intrusion detection and classification process, we aim
https://doi.org/10.1016/j.inffus.2022.09.026
Received 23 February 2022; Received in revised form 26 September 2022; Accepted 28 September 2022
Information Fusion 90 (2023) 353–363
354
A. Thakkar and R. Lohiya
Fig. 1. Scientific contribution and importance of the proposed feature selection technique.
to design a novel feature engineering process that selects features
based on statistical interpretation of features for the underlying intru-
sion detection datasets. Generating a reduced subset of interpretative
features is a critical process and hence, we aim to design a novel filter-
based feature selection technique that considers standard deviation,
mean, and median statistical measures for deriving reduced feature
subset for learning and performance enhancement of DNN-based IDS.
Unlike other feature selection technique, filter-based feature selection
approach aims at deriving feature subset without any influence of
classification technique applied for learning and prediction [ ].12
1.1. Scientific contribution and importance of the proposed feature selection
technique
With a goal of designing effective predictive model for the under-
lying classification problem, feature selection technique can be consid-
ered as a heuristic approach that may not guarantee optimal, perfect,
or rational performance but is an adequate medium to achieve im-
mediate and effective performance for the underlying classification
problem [ ]. Hence, the science and art of the proposed feature selec-13
tion technique is based on the heuristics, namely, standard deviation
and difference ( ) of the features in the given dataset.|𝑀𝑒𝑎𝑛 𝑀 𝑒𝑑𝑖𝑎𝑛|
The scientific contribution and importance of the proposed feature
selection technique is shown in .Fig. 1
The major contributions of our proposed work are summarized as
follows.
We have proposed a novel feature selection technique based on
fusion of statistical importance of features for intrusion detection
and classification.
DNN is applied for learning and classification process using re-
duced feature subset.
For fusion of statistical importance of features, statistical mea-
sures, namely, standard deviation, mean, and median are taken
into consideration.
The proposed approach is evaluated using three intrusion de-
tection datasets, namely, NSL-KDD, UNSW_NB-15, and CIC-IDS-
2017.
Performance analysis of the proposed approach is presented in
terms of varied evaluation metrics such as accuracy, precision,
recall, 𝑓 -score, and False Positive Rate (FPR).
Performance of the proposed approach is compared with differ-
ent feature selection techniques such as Chi-Square, Correlation-
based Feature Selection (CFS), Recursive Feature Elimination,
Genetic Algorithm (GA), Mutual Information (MI), Relief-f, and
Random Forest (RF).
Comparative analysis with existing feature selection techniques
is presented using different evaluation metrics that have been
considered as well as execution time.
The remainder of the paper is organized as follows. Section 2
presents overview of feature selection techniques for intrusion detec-
tion and classification. Section describes the proposed feature selec-3
tion technique for DNN-based IDS. Section 4 discusses experimental
methodology adopted for intrusion detection and classification. Sec-
tion present and discuss result analysis of implemented techniques.5
Section concludes the work presented in this paper.6
2. Related work
Over the past years, various approaches have been proposed for
intrusion detection and classification. In [ ], a wrapper-based feature14
selection technique is proposed using Genetic Algorithm (GA) and
Logistic Regression (LR). Here, in the proposed approach, GA along
with LR is applied for feature selection and Decision Tree (DT) clas-
sifier is applied for classification. The performance of the proposed
approach is evaluated using Weka tool with two intrusion detection
datasets, namely, KDD CUP 99 and UNSW_NB-15 datasets. Result anal-
ysis revealed that the proposed feature selection technique achieved
99.90% Detection Rate (DR) and 0.1% False Alarm Rate (FAR) for KDD
CUP 99 dataset with 18 features and 81.24% DR and 6.39% FAR for
UNSW_NB-15 dataset with 20 features.
Flexible Mutual Information (FMI) feature selection technique is
proposed in [ ] for deriving reduced feature subset for intrusion15
detection and classification. Here, in the proposed approach, Least
Square-Support Vector Machine (LS-SVM) is applied for classification
and features are selected using FMI considering correlation among the
features of the dataset. Moreover, FMI is a non-linear feature selection
techniques that uses correlation as a metric for feature selection. For
assessing the performance of the proposed approach three intrusion
detection datasets are used, namely, Kyoto 2006, KDD CUP 99, and
NSL-KDD datasets. The performance analysis of the proposed approach
is presented in terms of DR and FAR.
Correlation-based Feature Selection (CFS) technique is applied in
[16] for intrusion detection and classification. Here, in the proposed
approach, DT classifier is applied for classification that uses reduced
feature subset derived using CFS. The performance of the proposed
approach is evaluated using NSL-KDD dataset consisting of 41 features.
The proposed approach derives reduced feature subset with 14 fea-
tures which is further used for intrusion detection and classification.
Performance analysis of the proposed approach is presented in terms
of accuracy and it can be inferred from the results that the proposed
approach achieved accuracy of 90.30% for NSL-KDD dataset with 14
features.
Comparative analysis of various classifiers is performed in [ ]17
using Weka tool. Here, for the comparative analysis various feature
selection techniques are applied, namely, attribute evaluator, greedy
stepwise, IG, and ranker technique. Further, two feature subsets are
derived for intrusion detection and classification by performing defined
number of simulations. It is inferred from the performance analysis
that Random Forest (RF) classifier performs better in terms of overall
performance for both derived feature subsets. Moreover, result analysis
is presented in terms of Kappa statistic and accuracy for demonstrating
the performance of each feature subset.
A filter-based feature selection technique using IG is proposed
in [ ] for intrusion detection and classification. Here, in the proposed18
Information Fusion 90 (2023) 353–363
355
A. Thakkar and R. Lohiya
approach, classification is performed by integrating rule-based ap-
proach with multiple tree classifiers. The performance of the proposed
approach is evaluated using UNSW_NB-15 dataset with 22 features
derived using IG technique. Furthermore, results analysis is presented
in terms of accuracy, -score, and FAR.𝑓
Feature selection using RF is applied in [ ], wherein feature rank-19
ing is performed using feature importance. Here, in the proposed study,
feature importance of each feature is computed and features are ranked
based on their feature importance value. This implies that feature
with highest rank can be considered as most significant feature for
intrusion detection and classification. For the prediction and classifi-
cation, various classification techniques were implemented, namely,
𝑘 𝑘-Nearest Neighbour ( NN), DT, Bagging Meta Estimator (BME), XG-
Boost, and RF. The performance of the proposed approach is assessed
using UNSW_NB-15 dataset with reduced feature subset consisting of 11
features. Furthermore, result analysis is presented in terms of accuracy
and -score.𝑓
An ensemble two-tier IDS is designed in [ ], wherein, hybrid20
feature selection techniques is implemented along with majority voting-
based classification. Here, in the proposed approach, features are se-
lected using hybrid technique designed using PSO, GA, and Ant Colony
Optimization (ACO). Further, for the classification, rotation forest and
bagging classifier are applied and predictions are generated using
majority voting technique. The performance of the proposed approach
is evaluated using KDD CUP 99 dataset with reduced feature subset
consisting of 19 features. Moreover, the performance of the proposed
approach is validated using 10-fold cross validation technique and
result analysis is presented in terms of accuracy, precision, recall, and
FAR.
A two-stage intrusion classification model is designed using RF
classifier in [ ]. Here, in the proposed approach, IG is applied as21
feature selection technique. In the first stage, detection of minority
class is performed and in the second stage, majority class is detected.
Prediction from each stage are combined to generate classification
result. The performance of the proposed approach is assessed using
UNSW_NB-15 dataset and results are presented in terms of accuracy
and FAR.
RepTree-based two stage IDS is proposed in [22], wherein IG is
used as feature selection technique. Here, in the initial stage under-
lying dataset is divided into three categories based on the protocol
type and further, in the second stage classification is performed. The
performance of the proposed approach is evaluated using UNSW_NB-
15 dataset with a reduced feature set consisting of 20 features and
results are presented in terms of accuracy. An incremental technique
comprising of Extreme Learning Machine (IELM) and Advanced Prin-
cipal Component (APCA) algorithm is proposed in [ ] for intrusion23
detection and classification. Here, in the proposed approach, ELM is
applied for classification and APCA is applied for adaptive feature
selection. The performance of the proposed approach is assessed using
UNSW_NB-15 dataset and results are presented in terms of accuracy,
DR, and FAR.
Intrusion detection and classification system is designed in [ ]25
using NB and MLP classifier. Here, in the proposed approach, combined
feature selection technique is applied that consists of three feature
selection techniques, namely, IG, GR, and ReliefF. Performance of the
proposed approach is evaluated using KDD CUP 99 dataset and result
analysis is presented in terms of accuracy and FAR. An ensemble
framework along with feature selection is designed in [ ] for intrusion24
detection and classification. Here, in the proposed approach, GR is
applied for selecting significant features for learning and Bagging classi-
fier is applied for classification. Performance evaluation of the proposed
approach is conducted using NSL-KDD dataset and results are presented
in terms of classification accuracy and FAR. The comparative summary
of existing ML approaches for IDS is presented in Table .Table 1
2.1. Comparative analysis of existing approaches for IDS
The design and development of discussed research work for in-
trusion detection and classification using feature engineering is en-
couraging. However, varied IDS have been designed using different
learning algorithms and feature selection techniques which adapt to
unique learning strategy for feature selection as well as attack classifica-
tion [ ]. However, research gaps still exist with different learning26 28
mechanisms for intrusion detection and classification such as,
Majority of the research work have designed IDS considering
existing feature selection techniques using visualization tools such
as WeKa [ , ].14 17
The existing techniques designed for intrusion detection and clas-
sification using feature selection techniques have been analysed
and compared using outdated datasets which lack experimental
scenario.
Hence, there is a scope for developing an enhanced technique for
intrusion detection and classification. Therefore, we aim to design a
novel feature selection approach for DNN-based IDS that uses fusion
of statistical importance derived using standard deviation and absolute
difference of mean and median to select relevant and contributing
features for prediction and classification process. The application of
statistical importance to select features is effective because it derives
features based on statistical reasoning, which enhances the perfor-
mance of designed DNN-based IDS with feature discernibility and
deviation. Thus, novelties of our proposed work can be summarized
as follows.
A novel feature selection technique based on fusion of statistical
importance of features is proposed for intrusion detection and
classification.
Experimental-based analysis of novel feature selection techniques
is performed using recent datasets and datasets used in literature,
namely, NSL-KDD, UNSW_NB-15, and CIC-IDS-2017.
We have presented comparative analysis of the proposed feature
selection technique with existing feature selection techniques.
3. Proposed feature selection technique for intrusion detection
and classification
For intrusion detection and classification, various feature selec-
tion techniques are applied which are categorized in three categories,
namely, filter-based feature selection technique, wrapper-based feature
selection technique, and embedded feature selection technique [ ].29
In filter-based feature selection technique, reduced feature subset is
derived based on certain relevancy criteria that defines significance
of features pertaining to learning and classification [ ]. Hence, in30
filter-based feature selection technique a relevancy score is derived and
features are filtered based on the calculated score [ ]. In wrapper-30
based feature selection technique, classification algorithm is considered
and knowledge based is constructed to derive feature subset for learn-
ing and classification [ ]. The knowledge base of features reveals31
the importance of features in refined form based on underlying clas-
sification algorithm. Feature selection using wrapper-based techniques
is performed using predefined rules and conditions. However, perfor-
mance of wrapper-based feature selection technique is dependent on
the type of classification algorithm used [31]. Embedded feature se-
lection technique is implemented by incorporating two phases that are
learning phase and feature selection phase [ ]. This implies that em-32
bedded technique employs selecting features separately in two phases,
wherein outcomes of learning phase are used to add or delete features
in feature selection phase [ ].32
Information Fusion 90 (2023) 353–363
356
A. Thakkar and R. Lohiya
Table 1
Comparative summary of existing Machine Learning (ML) approaches for IDS.
Ref Technique Feature selection Dataset Results
[16] DT CFS NSL-KDD Accuracy for NSL-KDD: 90.30%
[15] LS-SVM FMI Kyoto 2006, KDD
CUP 99, and
NSL-KDD
DR for KDD CUP 99: 99.46%
DR for NSL-KDD: 98.76%
DR for Kyoto 2006: 99.64%
[17] RF Attribute evaluator,
greedy stepwise, IG,
and ranker
KDD CUP 99,
UNSW_NB-15
Comparative analysis is presented in
graphical format for feature selection
technique considered.
[22] RepTree IG NSL-KDD,
UNSW_NB-15
Accuracy for NSL-KDD: 89.85%
Accuracy for UNSW_NB-15: 88.95%
[14] DT GA KDD CUP 99,
UNSW_NB-15
DR for KDD CUP 99: 99.90%
DR for UNSW_NB-15: 81.24%
[19] kNN, DT, BME,
XGBoost, and RF
Feature importance UNSW_NB-15 Accuracy for kNN: 71.01%
Accuracy for DT: 74.22%
Accuracy for BME: 74.64%
Accuracy for XGBoost: 71.43%
Accuracy for RF: 74.87%
[21] RF IG UNSW_NB-15 Accuracy for UNSW_NB-15: 85.78%
[24] Bagging classifier GR NSL-KDD Accuracy for NSL-KDD: 84.25%
[23] IELM APCA NSL-KDD,
UNSW_NB-15
Accuracy for NSL-KDD: 81.22%
Accuracy for UNSW_NB-15: 70.51%
[20] Rotation forest and
Bagging classifier
PSO, ACO, and GA KDD CUP 99 Accuracy for KDD CUP 99: 72.52%
[25] NB, MLP Combined feature
selection technique
KDD CUP 99 Accuracy for NB: 93.00%
Accuracy for MLP: 97.00%
[18] Rule-based multiple
tree classifiers
IG UNSW_NB-15 Accuracy for UNSW_NB-15: 84.83%
3.1. Association between intrusion classification and feature selection
Intrusion detection datasets are developed by sniffing network pack-
ets flowing through network environment using various networking
tools such as Wireshark and Nmap [ ]. Captured network packets26
are accumulated in form of raw network files such as pcap files or
tcpdump files. These files consist of various details regarding network
communication which are extracted from network packet header and
network packet payload. The details regarding network communication
from captured network traffic serve as network features for designed
IDS. Intrusion detection and classification system examines the network
activities and analyses the data to inspect whether the analysed data
flow is anomalous network traffic or normal network traffic [32]. IDS
analyses the data to check whether system’s confidentiality, integrity,
or availability is compromised or not. While designing an IDS, various
aspects are considered such as monitoring network, collecting data,
statistically analysing the collected data, intrusion detection, intimidat-
ing the security administrator of an intrusive event, and responding to
intrusions [ ].8
In literature various researchers have designed hybridized
approaches by combining feature selection techniques with classifi-
cation technique [ ]. Considerably, feature selection technique is26
incorporated to enhance the performance of the designed IDS. How-
ever, the combination of feature selection and intrusion detection
should focus on achieving increased classification accuracy with re-
duced number of false positives. Therefore, the designed IDS requires
an efficient feature selection technique which is capable of extracting
significant features from the underlying dataset. Hence, the proposed
feature selection technique aims to select features by considering
statistical characteristics of features.
3.2. Conceptualization of the proposed feature selection technique
Nowadays, enormous amount of network traffic is generated from
various network resources. Features from flowing network traffic are
studied for deriving patterns of normal and anomalous network traffic.
However, network traffic data needs to examined with precision and
effectiveness. Selecting relevant features plays a significant role in
deriving pertinent information from a large number of data samples.
Feature selection is one of the critical approaches that directs to select
features from the underlying dataset which can contribute better in
enhancing the predictive capability for the given classification problem.
Hence, feature selection can be described as selection strategy adopted
to remove irrelevant and redundant features for better representation
of data.
In our study, a novel filter-based feature selection technique is de-
signed to derive relevant features from the intrusion detection dataset
that can contribute more towards learning and classification process.
Hence, with an aim to enhance the performance of DNN-based IDS,
a new and discerning feature selection technique named as Feature
Selection via Standard Deviation and Difference of Mean and Median
is proposed in our study. The proposed feature selection technique
derives reduced feature subset with high discernibility and deviation.
The application of standard deviation, mean, and median is effective
in deriving features as these measures perform quantitative and statis-
tical reasoning to derive relevant features for intrusion detection and
classification [ ]. The fusion of statistical importance using statistical33
measures aims at improving the performance of prediction and clas-
sification through quantitative and descriptive comparisons [33]. The
conceptualization strategy of the proposed feature selection techniques
is presented in .Fig. 2
3.3. Standard deviation
Standard deviation for features can be described as a statistical mea-
sure that measures the amount of variation or deviation in features from
mean [ ]. The standard deviation can be computed using Eq. [ ].34 (1) 34
𝜎
=
(𝑥
𝑖
𝜇)
2
𝑁
(1)
Here, in Eq. is the total number(1), 𝜎 represents standard deviation, 𝑁
of samples, represents each value from underlying feature, and𝑥
𝑖
𝜇
represents mean of underlying feature.
Information Fusion 90 (2023) 353–363
357
A. Thakkar and R. Lohiya
Fig. 2. Conceptualization strategy of the proposed feature selection technique.
Interpretation of standard deviation reveals that high standard de-
viation value indicates that feature is dispersed over large range of
values and a low standard deviation value indicates that feature values
are closely located with respect to mean [ ]. Hence, feature selection33
using standard deviation chooses features with high standard deviation
value because as feature values are extended over large range, effective
prediction outcome can be achieved. Moreover, standard deviation
represents features distinguishable capability and therefore, standard
deviation of a feature manifests its differences on all samples. This
implies that high standard deviation value reveals more differences the
feature has on all samples [ ].33
3.4. Mean and median
Mean and Median can be defined as descriptive statistical measures
that are used to characterize data distribution [ ]. Moreover, these35
statistical measures represents relative magnitude of deviation in a data
distribution [ ]. For feature selection, we have utilized the absolute35
value of the difference between mean and median to derive relevant
features from the dataset, which is represented using Eq. .(2)
𝐷 = |𝑀𝑒𝑎𝑛 𝑀 𝑒𝑑𝑖𝑎𝑛| (2)
Here, in Eq. represents absolute value of the difference between(2), 𝐷
mean and median for a given feature. Interpretation of the difference of
mean and median reveals that high difference value indicates deviation
over a large range of values and hence, features with high difference
value can be selected as relevant features from dataset for effective
prediction and classification process [ ].34
3.5. Process of feature selection for intrusion detection and classification
The outcome of feature selection process is a set of relevant features
that are strongly related with class output labels and contribute more
towards the learning patterns from data. To quantify contribution of a
given feature in classification process, we introduce combined feature
rank that represents significance of a features. The combined feature
rank is computed based on ranks derived using fusion of standard
deviation and difference of mean and median. From the description of
standard deviation and difference of mean and median is revealed that
features with highest values possess strong discernibility and minimum
redundancy. Hence, process of feature selection is described as follows.
Compute the standard deviation ( ) of the features of dataset.𝜎
Rank the features based on standard deviation value from high to
low. Assign rank derived using standard deviation ( ) as .𝜎 𝑅𝑎𝑛𝑘
1
Compute the absolute difference ( ) of mean and median of the𝐷
features of dataset.
Rank the features based on difference value from high to low.
Assign rank derived using difference ( ) as .𝐷 𝑅𝑎𝑛𝑘
2
Compute combined feature rank as Combined Feature Rank =
𝑅𝑎𝑛𝑘 𝑅𝑎𝑛𝑘
1
+
2
.
Recursively add features to feature subset based on combined
feature rank until accuracy is not better than the previous derived
feature subset.
The algorithm for recursive feature selection using proposed technique
is presented in Algorithm . The derived feature subset is given input1
to DNN model for training and classification.
4. Experimental methodology
The proposed work implements DNN for intrusion detection and
classification. DNN architecture is a multi-layered neural network struc-
ture that performs mathematical transformations on input data to
derive and learn patterns for prediction and classification [36]. The
experimental methodology of the proposed approach consists of various
phases such as deciding on intrusion detection datasets required for the
performance evaluation, data pre-processing for transforming data for
ease of experimentation, feature selection to derived reduced feature
subset for learning, training DNN with reduced feature subset, and
performance evaluation. The schematic of the proposed approach is as
shown in .Fig. 3
4.1. Dataset description
The performance of the proposed approach is evaluated using three
intrusion detection datasets, namely, NSL-KDD, UNSW_NB-15, and CIC-
IDS-2017. These datasets consist of wide variety of network features
and have been developed in different network environments [8]. More-
over, these datasets consist of realistic as well as synthetic network
traffic. Hence, performance of the proposed approach can be indis-
putably advocated using diversified network traffic from three different
datasets. A brief description of each dataset is as follows.
Information Fusion 90 (2023) 353–363
358
A. Thakkar and R. Lohiya
Fig. 3. Schematic of the proposed approach.
Algorithm 1 Recursive Feature Selection Using Fusion of Standard
Deviation and Absolute Difference of Mean and Median
1: Consider Dataset for intrusion detection and classification where𝐷
𝑙
𝐷 = {NSL-KDD, UNSW_NB-15, CIC-IDS-2017}.
2: For features of dataset , calculate standard deviation for each𝐷
𝑙
feature using equation .(1)
3: Sort features from high to low based on their standard deviation
and rank them. Consider the assigned rank as .𝑅𝑎𝑛𝑘
1
4: For features of dataset , calculate absolute value of difference𝐷
𝑙
between mean and median of each feature using equation .(2)
5: Sort features from high to low based on the absolute value of the
difference and rank them. Consider the assigned rank as .𝑅𝑎𝑛𝑘
2
6: Compute combined feature rank by summing𝑅 𝑅𝑎𝑛𝑘
1
and .𝑅𝑎𝑛𝑘
2
7: For each feature of dataset do,𝐹
𝑖
𝐹 𝐷
𝑙
8: Remove the highest rank feature from F and update𝐹
𝑖
𝑆
𝑙
as =𝑆
𝑙
𝑆 𝐹
𝑙
𝑖
.
9: Train DNN model on training set with features and compute𝑆
𝑙
model accuracy.
10: Repeat Steps [8-9], for features until increase in accuracy is𝐹
𝑖
recorded more than previous computed accuracy.
11: Store the derived relevant features in subset for the Dataset .𝑆
𝑙
𝐷
𝑙
12: Use feature subset for training DNN-based IDS for the dataset𝑆
𝑙
𝐷
𝑙
.
NSL-KDD Dataset: It is an intrusion detection dataset that has
been developed by eliminating missing and duplicate samples
from KDD CUP 99 dataset [ ]. It consist of different categories37
of network features as well as samples for four attack categories
and normal network traffic [ ]. Moreover, a distinct training38
dataset and test dataset with 125,973 and 22,544 data samples,
respectively, is present in NSL-KDD dataset [ ].38
UNSW_NB-15 Dataset: It is an intrusion detection dataset that has
been developed using IXIA Perfect Storm tool which captures the
network packets flowing in the designed network testbed [ ].39
The dataset is developed with network traffic that describes se-
curity vulnerabilities and exploits along with normal network
traffic. The network features of the designed dataset are extracted
using software tools, namely, Argus and Bro-IDS [ ]. Moreover,39
a distinct training and test datasets with 175,341 and 82,332 data
samples, respectively, is present in UNSW_NB-15 dataset [ ].39
CIC-IDS-2017 Dataset: It is one of the largest and recent intru-
sion detection datasets that is developed by sniffing real-time
network packets flowing through the network [ ]. The designed40
intrusion detection dataset covers wide range of network services,
protocols, and modernistic attack categories. The dataset consist
of data samples captured over the span of five days. Moreover,
the designed dataset consist of distinctive wide range of network
features extracted using CICFlowMeter tool [ ].41
The statistics of intrusion detection datasets used for experimentation
is presented in .Table 2
4.2. Data pre-processing
Data pre-processing techniques are applied for ease of experimen-
tation to convert the data for smooth processing and learning [ ].36
In the proposed work, two data pre-processing techniques are ap-
plied, namely, feature encoding and feature normalization. Feature
encoding is performed to convert categorical features into numerical
features [4]. Intrusion detection datasets used for experimentation
consist of categorical features such as flag, service type, and protocol
type. The categorical features are converted into numerical features by
applying one-hot encoding technique. One-hot encoding is one of the
common feature encoding technique that is applied for numeralization
of categorical features [ ]. Further, after feature encoding, feature4
normalization is performed, as datasets might consist of features that
are in different dimensions and scale of values. Hence, for feature
normalization standard scalar technique is applied that normalizes the
features by subtracting the mean and scaling the feature values to unit
variance.
4.3. Feature selection
Feature selection process is performed to derived reduced feature
subset consisting of relevant and contributing features from considered
intrusion detection dataset. Features are selected using proposed fea-
ture selection technique described in Section . The proposed feature3
selection technique is applied on intrusion detection datasets, namely,
NSL-KDD, UNSW_NB-15, and CIC-IDS-2017. Application of proposed
Information Fusion 90 (2023) 353–363
359
A. Thakkar and R. Lohiya
Table 2
Statistics of the experimental datasets [ ].8
Criteria ( )/Dataset ( ) NSL-KDD UNSW_NB-15 CIC-IDS-2017
Type of network traffic Real & Synthetic Synthetic Real
Number of features 41 42 79
Number of attack categories 4 9 7
Number of classes 5 10 15
Number of data samples 148 517 257 673 225 745
Number of samples in training set 125 973 175341 165 730
Number of samples in test set 22 544 82 332 60 015
Table 3
Neural network architecture and configuration details [ ].4
Criteria Values
Model Sequential
Number of hidden layers [4] 3
Size of input NSL-KDD: 21, UNSW_NB: 21,
CIC-IDS-2017: 64
Number of neurons in hidden layers [4] 1024, 768, 512
Activation function for hidden layer [4] ReLU
Activation function for output layer [4] Sigmoid
Dropout techniques Standard dropout (p 0.1)=
(Derived using GridSearchCV)
Batch-size [4] 1024
Epochs [4] 300
feature selection technique results in reduced feature susbet of 21
features out of 41 features for NSL-KDD dataset, 21 features out of 42
features for UNSW_NB-15 dataset, and 64 features out of 79 features
for CIC-IDS-2017 dataset. The reduced feature subset is given input to
DNN model for learning and prediction.
4.4. Deep neural network for intrusion detection and classification
Multi-layered DNN architecture is designed for intrusion detection
and classification. The designed DNN architecture consists of an input
layer with input dimension equal to the number of features derived
using feature selection, three fully connected dense hidden layers with
varied number of neurons for data transformation and learning, and
an output layer with one neuron for binary classification. The intricate
layered structure of neurons learns patterns by exhibiting end-to-end
learning and performs prediction for given input sample. With each
fully connected layer ReLU activation function is used to strengthen
effect of learning process [5]. Moreover, succeeding to every dense
layer, a dropout layer is incorporated to achieve generalization and
avoid co-adaptation in neural network [42]. For output layer, Sigmoid
activation function is used to predict the class output label. Further-
more, performance of the DNN structure is evaluated by applying
binary cross entropy loss function. The structure of DNN is presented
in Fig. 4 and its configuration details are presented in .Table 3
4.5. Performance evaluation
Performance of the proposed approach is presented using varied
evaluation metrics derived from confusion matrix, namely, accuracy,
precision, recall, 𝑓 -score, and FPR [26]. The evaluation metrics are
expressed using Eqs. .(3)(7)
𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 =
𝑇
𝑝
+ 𝑇
𝑛
𝑇 𝐹
𝑝
+ 𝐹
𝑝
+
𝑛
+ 𝑇
𝑛
(3)
𝑃 𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 =
𝑇
𝑝
𝑇
𝑝
+ 𝐹
𝑝
(4)
𝑅𝑒𝑐𝑎𝑙𝑙 =
𝑇
𝑝
𝑇
𝑝
+ 𝐹
𝑛
(5)
𝑓
𝑠𝑐𝑜𝑟𝑒 =
2 𝑃 𝑟𝑒𝑐 𝑖𝑠𝑖𝑜𝑛 𝑅𝑒𝑐𝑎𝑙𝑙
𝑃 𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛
+ 𝑅𝑒𝑐𝑎𝑙𝑙
(6)
Fig. 4. Deep Neural Network.
𝐹 𝑃 𝑅 =
𝐹
𝑝
𝐹
𝑝
+ 𝑇
𝑛
(7)
Here, in Eqs. (3)(7), 𝑇
𝑝
, ,𝑇
𝑛
𝐹
𝑝
, and 𝐹
𝑛
represent true positive, true
negative, false positive, and false negative, respectively [ ].26
5. Result analysis
Experiments for evaluation of the proposed approach is performed
on Intel(R) Core(TM) i5-8265U CPU processor with 64-bit Windows
10 operating system and 8.00 GB RAM using Python. The experiments
are performed on pre-processed intrusion detection datasets namely,
NSL-KDD, UNSW_NB-15, and CIC-IDS-2017 with reduced feature subset
derived using proposed feature selection technique. Experiments are
performed for ten runs and achieved results are averaged. For the per-
formance analysis, we have compared our proposed feature selection
technique with existing feature selection techniques that are described
as follows.
Recursive Feature Elimination (RFE) : In RFE, feature selection
is performed by recursively eliminating features based on feature
importance and deriving a feature subset that consist of relevant
features with superior feature importance scores [ ].12
Chi-Square: In Chi-Square feature selection technique reduced
feature subset is derived by performing chi-square statistical test
that measures the dependency among features [ ].12
Correlation-based Feature Selection (CFS): CFS technique is based
on the hypothesis that a good feature subset consist of features
that are in strong correlation with the target class and in low
correlation with each other [ ]. Hence, in CFS features are43
selected based on their computed correlation coefficient score.
Genetic Algorithm: In feature selection technique based on ge-
netic algorithm, feature are selected based on their fitness values
computed using defined fitness function. In [44], fitness function
Information Fusion 90 (2023) 353–363
360
A. Thakkar and R. Lohiya
Table 4
Results for NSL-KDD Dataset.
Technique Feature selection No. of selected features Accuracy Precision Recall f-score FPR Execution time (s)
DNN Recursive Feature Elimination (RFE) [12] 13 98.94 99.39 98.75 99.33 0.012 32 519.045
DNN Chi-Square [12] 13 98.92 99.92 98.73 99.32 0.012 31 613.115
DNN Correlation-based Feature Selection (CFS) [43] 30 92.65 99.59 91.24 95.23 0.082 25 915.630
DNN Genetic Algorithm [44] 23 94.90 95.10 94.30 94.70 0.094 35 569.016
DNN Mutual information 13 98.89 99.90 98.70 99.30 0.0291 33 033.130
DNN Relief-f 20 81.94 81.91 98.46 89.42 0.0530 36 045.110
DNN Random forest 16 98.88 99.89 98.71 99.30 0.0210 34 761.620
DNN Proposed feature selection technique 21 99.84 99.94 98.81 99.37 0.011 22 318.015
Note: The value in boldface indicates the best performance for the experiments conducted for NSL-KDD dataset.
Table 5
Results for UNSW_NB-15 Dataset.
Technique Feature selection No. of selected features Accuracy Precision Recall f-score FPR Execution time (s)
DNN Recursive Feature Elimination (RFE) [12] 13 82.21 78.71 98.86 87.64 0.013 22 314.470
DNN Chi-Square [12] 13 82.41 79.02 98.61 87.73 0.013 21 832.195
DNN Correlation-based Feature Selection (CFS) [43] 30 75.34 67.43 98.29 79.99 0.017 21 766.215
DNN Genetic Algorithm [44] 30 76.70 92.70 95.00 93.83 0.069 25 387.412
DNN Mutual information 21 76.26 72.87 97.92 83.55 0.077 18 190.205
DNN Relief-f 13 72.34 73.26 89.09 80.40 0.1090 18 643.650
DNN Random forest 17 82.69 79.37 98.41 87.87 0.0518 18 152.130
DNN Proposed feature selection technique 21 89.03 95.00 98.95 96.93 0.011 13 913.500
Note: The value in boldface indicates the best performance for the experiments conducted for UNSW_NB-15 dataset.
is defined using accuracy, -score, and FPR, which is used to𝑓
calculate fitness of features and further, features with high fitness
values are chosen for intrusion detection and classification.
Mutual Information: In mutual information, feature selection is
performed by estimating dependency between the features. The
selection of the feature relies upon non-parametric procedure,
namely, entropy estimation [ ].45
Relief-f: This feature selection technique is based on sensitive
feature interactions, wherein feature score is derived for each
feature which is further considered to rank features. The feature
score is obtained by estimating feature value differences among
the nearest neighbour instance pairs [ ].45
Random Forest: Random forest is one of the popular classifica-
tion technique that possess implicit feature selection capability.
In random forest, features are selected based on their measure
of impurity, namely, Gini index. Hence, while training random
forest classifier, there is a possibility to determine how much
each feature reduces the impurity. The more a feature reduces
impurity, the more significant it is [ ].46
The experimental results for NSL-KDD, UNSW_NB-15, and CIC-IDS-
2017 are presented in , and , respectively. It can be inferredTables 4, 5 6
from the results that the proposed feature selection technique achieved
better results compared to existing features selection technique with
DNN-based IDS for all the three intrusion detection datasets. For NSL-
KDD dataset, the proposed approach achieved 99.84% accuracy with
the derived reduced feature subset. Hence, an approximate increase
of 1%–18% in accuracy is recorded with proposed feature selection
technique for DNN-based IDS using NSL-KDD dataset. For UNSW_NB-
15 dataset, the proposed approach achieved 89.03% accuracy with
the derived reduced feature subset. Hence, an approximate increase of
7%–17% in accuracy is recorded with the proposed feature selection
technique for DNN-based IDS using UNSW_NB-15 dataset. For CIC-
IDS-2017 dataset, the proposed approach achieved 99.80% accuracy
with the reduced feature subset. Hence, an approximate increase of
0.9%–2% in accuracy is recorded with the proposed feature selection
technique for DNN-based IDS using CIC-IDS-2017 dataset.
Precision and recall evaluation metrics illustrate the relevancy and
sensitivity of underlying classification technique for a given application
problem. For the proposed approach, promising scores for precision and
recall evaluation metrics are achieved for all the three datasets. For
NSL-KDD dataset, approximate increase of 0.04%–18% in precision and
0.06%–7% in recall is recorded with precision of 99.94% and recall of
98.81% using proposed feature selection technique. For UNSW_NB-15
dataset, approximate increase of 3%–28% in precision and 0.09%–9%
in recall is recorded with precision of 95.00% and recall of 98.95%
using proposed feature selection technique. For CIC-IDS-2017 dataset,
approximate increase of 0.05%–2% in precision and 0.67%–2% in
recall is recorded with precision of 99.85% and recall of 99.94% using
proposed feature selection technique.
It is interesting to study the performance of classification techniques
with unbalanced dataset using -score. This is because, -score can be𝑓 𝑓
considered as one of the important performance measure which is a
balanced metric that considers both precision and recall. For NSL-KDD
dataset, 99.37% 𝑓 -score value is achieved with proposed approach,
which is approximately, 0.07%–10% more compared to other feature
selection technique. For UNSW_NB-15 dataset, 96.93% -score value𝑓
is achieved with proposed approach, which is approximately, 3%–17%
more compared to other feature selection technique. For CIC-IDS-2017
dataset, 99.89% 𝑓 -score value is achieved with proposed approach,
which is approximately, 0.59%–2% more compared to other feature
selection technique. Moreover, the proposed feature selection approach
has outperformed with respect to minimum FPR for DNN-based IDS
compared to other feature selection techniques.
Apart from the evaluation metrics, the proposed approach can also
be compared based on the execution time recorded that consist of pre-
processing, feature selection, training, and classification. The execution
time for all the three intrusion detection datasets is presented in Ta-
bles 4 6 . It can be inferred from the execution time that with derived
feature subset the proposed DNN-based IDS has recorded less execution
time even though the number of features are higher than few of the
existing techniques that have been implemented.
Information Fusion 90 (2023) 353–363
361
A. Thakkar and R. Lohiya
Table 6
Results for CIC-IDS-2017 Dataset.
Technique Feature selection No. of selected features Accuracy Precision Recall f-score FPR Execution time (s)
DNN Recursive Feature Elimination (RFE) [12] 13 98.81 98.00 98.00 98.00 0.041 35 214.235
DNN Chi-Square [12] 13 98.15 98.20 98.93 98.56 0.062 35 517.115
DNN Correlation-based Feature Selection (CFS) [43] 54 97.78 97.77 97.00 97.38 0.094 32 261.510
DNN Genetic Algorithm [44] 38 98.00 98.89 97.77 98.32 0.069 40 552.215
DNN Mutual information 35 98.00 98.17 99.82 98.98 0.0178 32 031.522
DNN Relief-f 35 98.99 99.07 99.07 99.07 0.0801 32 045.130
DNN Random forest 37 98.90 99.90 98.70 99.30 0.0601 32 029.250
DNN Proposed feature selection technique 64 99.80 99.85 99.94 99.89 0.012 27 719.360
Note: The value in boldface indicates the best performance for the experiments conducted for CIC-IDS-2017 dataset.
5.1. Comparison with existing studies and future scope
The hypothesis of feature selection technique reveals that feature
selection is an essential part of a learning model that facilitates the
model to extract and learn features and thereby reduce the complexity
of the model [ ]. Feature engineering can be performed by selecting or26
extracting relevant features from the dataset. In our proposed approach
we aim to focus on enhancing the performance of DNN-based IDS by
proposing a novel feature selection technique that selects features via
fusion of statistical importance using Standard Deviation and Difference
of Mean and Median. Result analysis of the proposed approach commu-
nicates that the proposed approach performs better to existing feature
selection techniques considered for performance comparison. Apart
from comparative analysis with existing feature selection technique, we
have also presented comparative analysis with existing research work
in the field of intrusion detection and classification as shown in .Table 7
Considering and comparing with the research work and their
achieved results following key insights can be derived.
It can be deduced from result analysis that DL techniques per-
forms better compared to ML techniques for intrusion detection
and classification. There are various factors that contribute in
better performance of DL-based IDS such as, efficacy to handle
high-dimensional data, better feature learning capability, and
effective learning strategy. The proposed approach is able to
achieve to improved performance for NSL-KDD dataset with an
approximate increase in accuracy of 26.27% compared to [ ].47
Comparing the results of the proposed approach with other DL
techniques presented in [ , ], it can be deduced that the48 49
proposed approach achieved better results for NSL-KDD dataset
in terms of accuracy, wherein increase in accuracy is reported
approximately 9% and 1% compared to [ ,48 49], respectively.
Moreover, the proposed approach has achieved improved perfor-
mance in terms of FPR for NSL-KDD dataset with approximate
decrease of 7% and 6% compared to [ ,48 49], respectively.
Moreover, comparable performance is achieved for other perfor-
mance metrics such as precision, recall, and -score.𝑓
However, for the performance recorded for UNSW_NB-15 dataset
in [ , ] is better in terms of varied performance metrics such as48 49
accuracy, precision, recall, and -score. This is because based on the𝑓
exploratory analysis of UNSW_NB-15 dataset consist of high number of
outliers and skewed data, which can be effectively handled by CNN
and LSTM architectures [50]. However, the proposed approach records
better performance in terms of FPR for UNSW_NB-15 dataset com-
pared to [ , ]. Hence, in future, it would be promising to consider48 49
analysing the resiliency of IDS by optimizing the neural network ar-
chitecture using nature-inspired algorithms or by using nature-inspired
algorithms as feature selection techniques.
5.2. Complexity analysis
In this section, we discuss the time complexity of Algorithm . For1
analysing the time complexity, assume that is the number of data𝑁
samples in the underlying dataset, is the number of features, is𝑑 𝑚
the number of features in feature subset is the number of nested𝑆, 𝑛
subset of features. The computational significance of Algorithm is1
to derive feature subset to enhance the process of DNN-based IDS𝑆
by minimizing the generalization error and increasing the predictive
capability.
The proposed feature selection requires to compute standard devi-
ation and difference of mean and median for each feature. The time
complexity of calculating standard deviation, mean, median, and com-
bined rank of all features is . Time complexity of recursively𝑂( )𝑑𝑁𝑙𝑜𝑔𝑁
eliminating one feature from the feature subset is
𝑂(
𝑁𝑑
2
2
) [52]. Hence,
the time complexity of Algorithm
1 is 𝑚𝑎𝑥 𝑑𝑁𝑙𝑜𝑔𝑁[𝑂( ), 𝑂(
𝑁𝑑
2
2
)].
5.3. Energy consumption analysis
For profiling energy consumption analysis for different datasets,
it is intriguing to note that the energy consumption for a given task
refers to the core power usage during the task execution time [ ].53
This implies that the energy consumption is directly proportional to
the execution time during which the power is consumed. From the
Tables 4 6 , it can be inferred that the proposed approach recorded less
execution time compared to the existing feature selection techniques
for all the three intrusion detection datasets considered for performance
evaluation. Hence, from the result analysis, it can be deduced that the
proposed approach consumed less energy compared to other existing
feature selection techniques considered for comparative analysis.
5.4. Statistical significance and discussions
The attained results are also statistically validated using Wilcoxon
signed-rank test for all the performance measures considered for the
experimentation. Significance of the results achieved can be expressed
using 𝑝-value, wherein -value should be less than 0.05 [ ]. It can be𝑝 54
inferred from the , that the -value obtained for all the threeTable 8 𝑝
datasets considered for experimentation is less than 0.05. Hence, the
results achieved are statistically significant.
Considering the role and importance of feature engineering in intru-
sion detection and classification process, the proposed feature selection
technique recursively derives features based on their statistical proper-
ties. Focus of the proposed feature selection approach is to recursively
derive significant features from the underlying dataset based on their
computed combined rank. Single-feature ranking procedure assumes
that the features in the underlying dataset are independent of each
other. However, there are often correlations among features that should
be considered to impose feature redundancy while feature selection.
Hence, multi-feature ranking can be considered that takes correlation
as well as combined feature to derive reduced feature subset. Thus,
this can serve as an important future research direction which can
be consider prospective scope in the field of feature engineering for
intrusion detection and classification.
Information Fusion 90 (2023) 353–363
362
A. Thakkar and R. Lohiya
Table 7
Comparison with existing studies.
Ref. Technique Feature
selection
Dataset Result analysis
[51] Decision Tree
(DT)
Linear
correlation
coefficient
KDD CUP 99 Accuracy: 95.03%, Detection Rate: 95.23%, FPR: 1.65%
[47] Ensemble
tree classifier
CFS-Bat
algorithm
NSL-KDD Results for NSL-KDD: Accuracy of 73.57%, Detection Rate of 73.6% and FPR
of 12.92%
[48] Optimized
CNN
Hierarchical
multi-scale
LSTM
NSL-KDD,
ISCX,
UNSW_NB-15
NSL-KDD dataset Accuracy: 90.67%, Precision: 86.71%, Recall: 95.19%,
𝑓 -score: 91.46%, FPR: 8.86%, and Training time: 5118 s.
ISCX dataset Accuracy: 95.33%, Precision: 100%, Recall: 94.77%, -score:𝑓
97.61%, FPR: 7.84%, and Training time: 54 480 s.
UNSW_NB-15 dataset Accuracy: 96.33%, Precision: 100%, Recall: 95.87%,
𝑓 -score: 98.13%, FPR: 5.87%, and Training time: 30 665 s.
[49] Black widow
optimized
Conv LSTM
Artificial bee
colony
NSL-KDD,
UNSW_NB-
15, ISCX,
CIC-IDS-2018
NSL-KDD dataset accuracy: 98.67%, Precision: 97.48%, Recall: 100%,
𝑓 -score: 98.73%, FPR: 7.50%, and Training time: 4675.45 s.
UNSW_NB-15 dataset accuracy: 98.66%, Precision: 100%, Recall: 98.77%,
𝑓 -score: 98.77%, FPR: 4.48%, and Training time: 26 721.2 s.
ISCX dataset accuracy: 97.00%, Precision: 100%, Recall: 95.78%, -score:𝑓
99.67%, FPR: 5.76%, and Training time: 48 761.05 s.
CSE-CIC-IDS-2018 dataset accuracy: 98.25%, Precision: 97.48%, Recall:
98.67%, 𝑓 -score: 98.18%, FPR: 2.52%, and Training time: 22 713.02 s.
Our Study DNN Proposed
feature
selection
NSL-KDD,
UNSW_NB-
15,
CIC-IDS-2017
NSL-KDD dataset accuracy: 99.84%, Precision: 99.94%, Recall: 98.81%,
𝑓 -score: 99.37%, FPR: 1.1%, and Execution time: 22 318.015 s.
UNSW_NB-15 dataset accuracy: 89.03%, Precision: 95.00%, Recall: 98.95%,
𝑓 -score: 96.93%, FPR: 1.1%, and Execution time: 13 913.50 s.
CIC-IDS-2017 dataset accuracy: 99.80%, Precision: 99.85%, Recall: 99.94%,
𝑓 -score: 99.89%, FPR: 1.2%, and Execution time: 27 719.36 s.
Table 8
Wilcoxon signed-rank test results.
Dataset -value𝑝
NSL-KDD 0.0027
UNSW_NB-15 0.0053
CIC-IDS-2017 0.0054
6. Concluding remarks
The study proposes a novel feature selection technique based on
fusion of statistical importance using standard deviation and difference
of mean and median for enhancing the performance of intrusion detec-
tion and classification. The proposed feature selection technique aims
at deriving reduced feature subset that consist of features that have
attributes such as high discernibility and deviation. For prediction and
classification Deep Neural Network (DNN) technique is applied that
considers reduced feature subset for learning and deriving patterns in
data. Performance evaluation of the proposed approach is performed
using three intrusion detection datasets, namely, NSL-KDD, UNSW_NB-
15, and CIC-IDS-2017. The performance of the proposed approach is
demonstrated in terms of accuracy, precision, recall, -score, False𝑓
Positive Rate (FPR), and execution time. From the experiments per-
formed, it can be deduced that the proposed approach achieved better
performance compared to existing feature selection techniques for all
the three intrusion detection datasets with reduced execution time.
Hence, the derived features using proposed feature selection technique
were able to enhance the performance of DNN-based IDS.
Declaration of competing interest
The authors declare that they have no known competing finan-
cial interests or personal relationships that could have appeared to
influence the work reported in this paper.
Data availability
The authors are unable or have chosen not to specify which data
has been used.
References
[1] A. Thakkar, R. Lohiya, Role of swarm and evolutionary algorithms for intrusion
detection system: A survey, Swarm Evol. Comput. 53 (2020) 100631.
[2] R. Lohiya, A. Thakkar, Application domains, evaluation datasets, and research
challenges of IoT: A systematic review, IEEE Internet Things J. (2020).
[3] A. Thakkar, R. Lohiya, A review on machine learning and deep learning
perspectives of IDS for IoT: Recent updates, security issues, and challenges,
Arch. Comput. Methods Eng. (2020) 1–33, http://dx.doi.org/10.1007/s11831-
020-09496-0.
[4] A. Thakkar, R. Lohiya, Analyzing fusion of regularization techniques in the deep
learning-based intrusion detection system, Int. J. Intell. Syst. (2021).
[5] M.A. Chang, D. Bottini, L. Jian, P. Kumar, A. Panda, S. Shenker, How to train
your DNN: The network operator edition, 2020, arXiv preprint .arXiv:2004.10275
[6] R. Lohiya, A. Thakkar, Intrusion detection using deep neural network with
antirectifier layer, in: Applied Soft Computing and Communication Networks,
Springer, 2021, pp. 89–105.
[7] F.E. White, Data Fusion Lexicon, Technical Report, Joint Directors of Labs
Washington DC, 1991.
[8] A. Thakkar, R. Lohiya, A review of the advancement in intrusion detection
datasets, Procedia Comput. Sci. 167 (2020) 636–645.
[9] G. Bagyalakshmi, G. Rajkumar, N. Arunkumar, M. Easwaran, K. Narasimhan, V.
Elamaran, M. Solarte, I. Hernández, G. Ramirez-Gonzalez, Network vulnerability
analysis on brain signal/image databases using nmap and wireshark tools, IEEE
Access 6 (2018) 57144–57151.
[10] A. Gharib, I. Sharafaldin, A.H. Lashkari, A.A. Ghorbani, An evaluation framework
for intrusion detection dataset, in: 2016 International Conference on Information
Science and Security (ICISS), IEEE, 2016, pp. 1–6.
[11] G. Creech, J. Hu, Generation of a new IDS test dataset: Time to retire the KDD
collection, in: 2013 IEEE Wireless Communications and Networking Conference
(WCNC), IEEE, 2013, pp. 4487–4492.
[12] A. Thakkar, R. Lohiya, Attack classification using feature selection techniques:
a comparative study, J. Ambient Intell. Humaniz. Comput. 12 (1) (2021)
1249–1266.
[13] O. Almomani, A feature selection model for network intrusion detection system
based on PSO, GWO, FFA and GA algorithms, Symmetry 12 (6) (2020) 1046.
[14] C. Khammassi, S. Krichen, A GA-LR wrapper approach for feature selection in
network intrusion detection, Comput. Secur. 70 (2017) 255–277.
[15] M.A. Ambusaidi, X. He, P. Nanda, Z. Tan, Building an intrusion detection system
using a filter-based feature selection algorithm, IEEE Trans. Comput. 65 (10)
(2016) 2986–2998.
[16] B. Ingre, A. Yadav, Performance analysis of NSL-KDD dataset using ANN, in: 2015
International Conference on Signal Processing and Communication Engineering
Systems, IEEE, 2015, pp. 92–96.
[17] T. Janarthanan, S. Zargari, Feature selection in UNSW-NB15 and KDDCUP’99
datasets, in: 2017 IEEE 26th International Symposium on Industrial Electronics
(ISIE), IEEE, 2017, pp. 1881–1886.
Information Fusion 90 (2023) 353–363
363
A. Thakkar and R. Lohiya
[18] V. Kumar, D. Sinha, A.K. Das, S.C. Pandey, R.T. Goswami, An integrated rule
based intrusion detection system: analysis on UNSW-NB15 data set and the real
time online dataset, Cluster Comput. 23 (2) (2020) 1397–1418.
[19] N.M. Khan, N. Madhav C, A. Negi, I.S. Thaseen, Analysis on improving the
performance of machine learning models using feature selection technique,
in: International Conference on Intelligent Systems Design and Applications,
Springer, 2018, pp. 69–77.
[20] B.A. Tama, M. Comuzzi, K.-H. Rhee, TSE-IDS: A two-stage classifier ensemble
for intelligent anomaly-based intrusion detection system, IEEE Access 7 (2019)
94497–94507.
[21] W. Zong, Y.-W. Chow, W. Susilo, A two-stage classifier approach for network
intrusion detection, in: International Conference on Information Security Practice
and Experience, Springer, 2018, pp. 329–340.
[22] M. Belouch, S. El Hadaj, M. Idhammad, A two-stage classifier approach using
reptree algorithm for network intrusion detection, Int. J. Adv. Comput. Sci. Appl.
8 (6) (2017) 389–394.
[23] J. Gao, S. Chai, B. Zhang, Y. Xia, Research on network intrusion detection based
on incremental extreme learning machine and adaptive principal component
analysis, Energies 12 (7) (2019) 1223.
[24] N.T. Pham, E. Foo, S. Suriadi, H. Jeffrey, H.F.M. Lahza, Improving performance
of intrusion detection system using ensemble methods and feature selection, in:
Proceedings of the Australasian Computer Science Week Multiconference, 2018,
pp. 1–6.
[25] A.A. Salih, M.B. Abdulrazaq, Combining best features selection using three
classifiers in intrusion detection system, in: 2019 International Conference on
Advanced Science and Engineering (ICOASE), IEEE, 2019, pp. 94–99.
[26] A. Thakkar, R. Lohiya, A survey on intrusion detection system: feature selection,
model, performance measures, application perspective, challenges, and future
research directions, Artif. Intell. Rev. (2021) 1–111.
[27] Y. Xin, L. Kong, Z. Liu, Y. Chen, Y. Li, H. Zhu, M. Gao, H. Hou, C. Wang, Machine
learning and deep learning methods for cybersecurity, IEEE Access (2018).
[28] A.L. Buczak, E. Guven, A survey of data mining and machine learning methods
for cyber security intrusion detection, IEEE Commun. Surv. Tutor. 18 (2) (2016)
1153–1176.
[29] L.-H. Li, R. Ahmad, W.-C. Tsai, A.K. Sharma, A feature selection based DNN for
intrusion detection system, in: 2021 15th International Conference on Ubiquitous
Information Management and Communication (IMCOM), IEEE, 2021, pp. 1–8.
[30] T.-S. Chou, K.K. Yen, J. Luo, Network intrusion detection design using feature
selection of soft computing paradigms, Int. J. Comput. Intell. 4 (3) (2008)
196–208.
[31] S. Zaman, F. Karray, Features selection for intrusion detection systems based
on support vector machines, in: Consumer Communications and Networking
Conference, 2009. CCNC 2009. 6th IEEE, IEEE, 2009, pp. 1–8.
[32] S. Aljawarneh, M. Aldwairi, M.B. Yassein, Anomaly-based intrusion detection
system through feature selection analysis and building hybrid efficient model, J.
Comput. Sci. 25 (2018) 152–160.
[33] J. Xie, M. Wang, S. Xu, Z. Huang, P.W. Grant, The unsupervised feature selection
algorithms based on standard deviation and cosine similarity for genomic data
analysis, Front. Genet. 12 (2021).
[34] R. de Nijs, T.L. Klausen, On the expected difference between mean and median,
Electron. J. Appl. Statist. Anal. 6 (1) (2013) 110–117.
[35] T. Pham-Gia, T.L. Hung, The mean and median absolute deviations, Math.
Comput. Modelling 34 (7–8) (2001) 921–936.
[36] P. Chen, Y. Guo, J. Zhang, Y. Wang, H. Hu, A novel preprocessing methodology
for DNN-based intrusion detection, in: 2020 IEEE 6th International Conference
on Computer and Communications (ICCC), IEEE, 2020, pp. 2059–2064.
[37] U. Repository, NSL-KDD dataset, 2009, URL https://www.unb.ca/cic/datasets/
nsl.html (accessed April 22, 2019).
[38] L. Dhanabal, S. Shantharajah, A study on NSL-KDD dataset for intrusion detection
system based on classification algorithms, Int. J. Adv. Res. Comput. Commun.
Eng. 4 (6) (2015) 446–452.
[39] N. Moustafa, J. Slay, UNSW-NB15: a comprehensive data set for network
intrusion detection systems (UNSW-NB15 network data set), in: 2015 Military
Communications and Information Systems Conference (MilCIS), IEEE, 2015, pp.
1–6.
[40] I. Sharafaldin, A.H. Lashkari, A.A. Ghorbani, Toward generating a new intrusion
detection dataset and intrusion traffic characterization, in: ICISSP, 2018, pp.
108–116.
[41] R. Panigrahi, S. Borah, A detailed analysis of CICIDS2017 dataset for designing
intrusion detection systems, Int. J. Eng. Technol. 7 (3.24) (2018) 479–482.
[42] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout:
a simple way to prevent neural networks from overfitting, J. Mach. Learn. Res.
15 (1) (2014) 1929–1958.
[43] N. Gopika, M.E.A. Meena Kowshalaya, Correlation based feature selection
algorithm for machine learning, in: 2018 3rd International Conference on
Communication and Electronics Systems (ICCES), IEEE, 2018, pp. 692–695.
[44] Z. Liu, Y. Shi, A hybrid IDS using GA-based feature selection method and random
forest, Int. J. Mach. Learn. Comput. 12 (2) (2022).
[45] Y. Zhang, X. Ren, J. Zhang, Intrusion detection method based on information
gain and relieff feature selection, in: 2019 International Joint Conference on
Neural Networks (IJCNN), IEEE, 2019, pp. 1–5.
[46] X. Li, W. Chen, Q. Zhang, L. Wu, Building auto-encoder intrusion detection
system based on random forest feature selection, Comput. Secur. 95 (2020)
101851.
[47] Y. Zhou, G. Cheng, S. Jiang, M. Dai, Building an efficient intrusion detection
system based on feature selection and ensemble classifier, Comput. Netw. 174
(2020) 107247.
[48] P.R. Kanna, P. Santhi, Unified deep learning approach for efficient intrusion
detection system using integrated spatial–temporal features, Knowl.-Based Syst.
226 (2021) 107132.
[49] P.R. Kanna, P. Santhi, Hybrid intrusion detection using MapReduce based black
widow optimized convolutional long short-term memory neural networks, Expert
Syst. Appl. 194 (2022) 116545.
[50] N. Sharma, N.S. Yadav, S. Sharma, Classification of UNSW-NB15 dataset using
exploratory data analysis using ensemble learning, EAI Endorsed Trans. Ind.
Netw. Intell. Syst. 8 (29) (2021) e4.
[51] S. Mohammadi, H. Mirvaziri, M. Ghazizadeh-Ahsaee, H. Karimipour, Cyber
intrusion detection by combined feature selection algorithm, J. Inf. Secur. Appl.
44 (2019) 80–88.
[52] X. Ding, F. Yang, F. Ma, An efficient model selection for linear discriminant
function-based recursive feature elimination, J. Biomed. Inform. 129 (2022)
104070.
[53] S. Hajiamini, B.A. Shirazi, A study of DVFS methodologies for multicore systems
with islanding feature, in: Advances in Computers, Vol. 119, Elsevier, 2020, pp.
35–71.
[54] S. Taheri, G. Hesamian, A generalization of the wilcoxon signed-rank test and
its applications, Statist. Papers 54 (2) (2013) 457–470.
| 1/11

Preview text:

Information Fusion 90 (2023) 353–363
Contents lists available at ScienceDirect Information Fusion
journal homepage: www.elsevier.com/locate/inffus Full length article
Fusion of statistical importance for feature selection in Deep Neural
Network-based Intrusion Detection System
Ankit Thakkar, Ritika Lohiya ∗
Institute of Technology, Nirma University, Ahmedabad, Gujarat 382 481, India A R T I C L E I N F O A B S T R A C T Keywords:
Intrusion Detection System (IDS) is an essential part of network as it contributes towards securing the network Intrusion Detection System
against various vulnerabilities and threats. Over the past decades, there has been comprehensive study in the Deep Learning
field of IDS and various approaches have been developed to design intrusion detection and classification system. Filter-based feature selection
With the proliferation in the usage of Deep Learning (DL) techniques and their ability to learn data extensively, Deep Neural Network
we aim to design Deep Neural Network (DNN)-based IDS. In this study, we aim to focus on enhancing the Standard deviation
performance of DNN-based IDS by proposing a novel feature selection technique that selects features via
Fusion of statistical importance
fusion of statistical importance using Standard Deviation and Difference of Mean and Median. Here, in the
proposed approach, features are pruned based on their rank derived using fusion of statistical importance.
Moreover, fusion of statistical importance aims to derive relevant features that possess high discernibility and
deviation, that assists in better learning of data. The performance of the proposed approach is evaluated using
three intrusion detection datasets, namely, NSL-KDD, UNSW_NB-15, and CIC-IDS-2017. Performance analysis is
presented in terms of different evaluation metrics such as accuracy, precision, recall, 𝑓 -score, and False Positive
Rate (FPR) and the results are compared with existing feature selection techniques. Apart from evaluation
metrics, performance comparison is also presented in terms of execution time. Moreover, results achieved are
also statistically tested using Wilcoxon Signed Rank test. 1. Introduction
detection datasets. We have applied fusion of statistical importance
using statistical measures to derive association and identify significance
The fundamental rationale of developing Intrusion Detection System
of features for feature selection [7]. For intrusion classification, DNN
(IDS) is to detect and classify network samples with precise classi-
requires a large amount of data for learning and deriving patterns. In
fication accuracy and minimum false alarms [1,2]. Hence, various
the field of IDS, various intrusion detection datasets have been devel-
philosophical and analytical design principles should be taken into
oped for analysis and learning [8]. These datasets have been developed
consideration while developing intrusion detection and classification
by capturing raw network traffic flowing through underlying network
system. Over past decades, there has been various efforts in designing
environment. Various networking tools such as Wireshark and Nmap
an efficient IDS using Deep Learning (DL) techniques [3]. Furthermore,
are used for capturing raw network traffic [9]. Further, the captured
DL techniques such as Deep Neural Network (DNN) has emerged as one
data is stored in the form of pcap or tcpdump files, which are processed
of the leading solutions for building an efficient IDS [4]. This is because,
to extract network features from network packets consisting header
DNN has an intriguing characteristic attribute of performing end-to-end
and payload information [10]. Hence, intrusion detection datasets used
learning and in-depth analysis to derive patterns in data for prediction
for the performance evaluation comprises of high-dimensional network
and classification [5]. Hence, DL techniques such as DNN can be
feature space for learning. However, considering network features,
considered as one of the intelligent techniques that performs implicit
there is a possibility that intrusion detection datasets might consist
learning on high-dimensional data with ease. Apart from handling high-
dimensional data, DNN offers high-level data abstraction and good
of redundant and irrelevant features that conceivably would either
generalization ability for underlying attack classification problem [6].
affect or would not contribute towards prediction and classification
In this research, we aim to design a DNN-based intrusion detection process [11].
and classification system by applying fusion of statistical importance
Consequently, considering the role and importance of feature en-
by considering statistical measures for feature engineering on intrusion
gineering in intrusion detection and classification process, we aim ∗ Corresponding author.
E-mail addresses: ankit.thakkar@nirmauni.ac.in (A. Thakkar), 18ftphde30@nirmauni.ac.in (R. Lohiya).
https://doi.org/10.1016/j.inffus.2022.09.026
Received 23 February 2022; Received in revised form 26 September 2022; Accepted 28 September 2022
Available online 3 October 2022
1566-2535/© 2022 Elsevier B.V. All rights reserved.
A. Thakkar and R. Lohiya
Information Fusion 90 (2023) 353–363
Fig. 1. Scientific contribution and importance of the proposed feature selection technique.
to design a novel feature engineering process that selects features
methodology adopted for intrusion detection and classification. Sec-
based on statistical interpretation of features for the underlying intru-
tion 5 present and discuss result analysis of implemented techniques.
sion detection datasets. Generating a reduced subset of interpretative
Section 6 concludes the work presented in this paper.
features is a critical process and hence, we aim to design a novel filter-
based feature selection technique that considers standard deviation, 2. Related work
mean, and median statistical measures for deriving reduced feature
subset for learning and performance enhancement of DNN-based IDS.
Over the past years, various approaches have been proposed for
Unlike other feature selection technique, filter-based feature selection
intrusion detection and classification. In [14], a wrapper-based feature
approach aims at deriving feature subset without any influence of
selection technique is proposed using Genetic Algorithm (GA) and
classification technique applied for learning and prediction [12].
Logistic Regression (LR). Here, in the proposed approach, GA along
with LR is applied for feature selection and Decision Tree (DT) clas-
1.1. Scientific contribution and importance of the proposed feature selection
sifier is applied for classification. The performance of the proposed technique
approach is evaluated using Weka tool with two intrusion detection
datasets, namely, KDD CUP 99 and UNSW_NB-15 datasets. Result anal-
With a goal of designing effective predictive model for the under-
ysis revealed that the proposed feature selection technique achieved
lying classification problem, feature selection technique can be consid-
99.90% Detection Rate (DR) and 0.1% False Alarm Rate (FAR) for KDD
ered as a heuristic approach that may not guarantee optimal, perfect,
CUP 99 dataset with 18 features and 81.24% DR and 6.39% FAR for
or rational performance but is an adequate medium to achieve im-
UNSW_NB-15 dataset with 20 features.
mediate and effective performance for the underlying classification
Flexible Mutual Information (FMI) feature selection technique is
problem [13]. Hence, the science and art of the proposed feature selec-
tion technique is based on the heuristics, namely, standard deviation
proposed in [15] for deriving reduced feature subset for intrusion
detection and classification. Here, in the proposed approach, Least
and difference (|𝑀𝑒𝑎𝑛 𝑀𝑒𝑑𝑖𝑎𝑛|) of the features in the given dataset.
The scientific contribution and importance of the proposed feature
Square-Support Vector Machine (LS-SVM) is applied for classification
and features are selected using FMI considering correlation among the
selection technique is shown in Fig. 1.
The major contributions of our proposed work are summarized as
features of the dataset. Moreover, FMI is a non-linear feature selection follows.
techniques that uses correlation as a metric for feature selection. For
assessing the performance of the proposed approach three intrusion
• We have proposed a novel feature selection technique based on
detection datasets are used, namely, Kyoto 2006, KDD CUP 99, and
fusion of statistical importance of features for intrusion detection
NSL-KDD datasets. The performance analysis of the proposed approach and classification.
is presented in terms of DR and FAR.
• DNN is applied for learning and classification process using re-
Correlation-based Feature Selection (CFS) technique is applied in duced feature subset.
[16] for intrusion detection and classification. Here, in the proposed
• For fusion of statistical importance of features, statistical mea-
approach, DT classifier is applied for classification that uses reduced
sures, namely, standard deviation, mean, and median are taken
feature subset derived using CFS. The performance of the proposed into consideration.
approach is evaluated using NSL-KDD dataset consisting of 41 features.
• The proposed approach is evaluated using three intrusion de-
The proposed approach derives reduced feature subset with 14 fea-
tection datasets, namely, NSL-KDD, UNSW_NB-15, and CIC-IDS-
tures which is further used for intrusion detection and classification. 2017.
Performance analysis of the proposed approach is presented in terms
• Performance analysis of the proposed approach is presented in
of accuracy and it can be inferred from the results that the proposed
terms of varied evaluation metrics such as accuracy, precision,
approach achieved accuracy of 90.30% for NSL-KDD dataset with 14
recall, 𝑓 -score, and False Positive Rate (FPR). features.
• Performance of the proposed approach is compared with differ-
Comparative analysis of various classifiers is performed in [17]
ent feature selection techniques such as Chi-Square, Correlation-
using Weka tool. Here, for the comparative analysis various feature
based Feature Selection (CFS), Recursive Feature Elimination,
selection techniques are applied, namely, attribute evaluator, greedy
Genetic Algorithm (GA), Mutual Information (MI), Relief-f, and
stepwise, IG, and ranker technique. Further, two feature subsets are Random Forest (RF).
derived for intrusion detection and classification by performing defined
• Comparative analysis with existing feature selection techniques
number of simulations. It is inferred from the performance analysis
is presented using different evaluation metrics that have been
that Random Forest (RF) classifier performs better in terms of overall
considered as well as execution time.
performance for both derived feature subsets. Moreover, result analysis
The remainder of the paper is organized as follows. Section 2
is presented in terms of Kappa statistic and accuracy for demonstrating
presents overview of feature selection techniques for intrusion detec-
the performance of each feature subset.
tion and classification. Section 3 describes the proposed feature selec-
A filter-based feature selection technique using IG is proposed
tion technique for DNN-based IDS. Section 4 discusses experimental
in [18] for intrusion detection and classification. Here, in the proposed 354
A. Thakkar and R. Lohiya
Information Fusion 90 (2023) 353–363
approach, classification is performed by integrating rule-based ap-
2.1. Comparative analysis of existing approaches for IDS
proach with multiple tree classifiers. The performance of the proposed
approach is evaluated using UNSW_NB-15 dataset with 22 features
The design and development of discussed research work for in-
derived using IG technique. Furthermore, results analysis is presented
trusion detection and classification using feature engineering is en-
in terms of accuracy, 𝑓 -score, and FAR.
couraging. However, varied IDS have been designed using different
Feature selection using RF is applied in [19], wherein feature rank-
learning algorithms and feature selection techniques which adapt to
ing is performed using feature importance. Here, in the proposed study,
unique learning strategy for feature selection as well as attack classifica-
feature importance of each feature is computed and features are ranked
tion [26–28]. However, research gaps still exist with different learning
based on their feature importance value. This implies that feature
mechanisms for intrusion detection and classification such as,
with highest rank can be considered as most significant feature for
intrusion detection and classification. For the prediction and classifi-
• Majority of the research work have designed IDS considering
cation, various classification techniques were implemented, namely,
existing feature selection techniques using visualization tools such
𝑘-Nearest Neighbour (𝑘NN), DT, Bagging Meta Estimator (BME), XG- as WeKa [14,17].
Boost, and RF. The performance of the proposed approach is assessed
• The existing techniques designed for intrusion detection and clas-
using UNSW_NB-15 dataset with reduced feature subset consisting of 11
sification using feature selection techniques have been analysed
features. Furthermore, result analysis is presented in terms of accuracy
and compared using outdated datasets which lack experimental and 𝑓 -score. scenario.
An ensemble two-tier IDS is designed in [20], wherein, hybrid
Hence, there is a scope for developing an enhanced technique for
feature selection techniques is implemented along with majority voting-
intrusion detection and classification. Therefore, we aim to design a
based classification. Here, in the proposed approach, features are se-
novel feature selection approach for DNN-based IDS that uses fusion
lected using hybrid technique designed using PSO, GA, and Ant Colony
of statistical importance derived using standard deviation and absolute
Optimization (ACO). Further, for the classification, rotation forest and
difference of mean and median to select relevant and contributing
bagging classifier are applied and predictions are generated using
majority voting technique. The performance of the proposed approach
features for prediction and classification process. The application of
is evaluated using KDD CUP 99 dataset with reduced feature subset
statistical importance to select features is effective because it derives
consisting of 19 features. Moreover, the performance of the proposed
features based on statistical reasoning, which enhances the perfor-
mance of designed DNN-based IDS with feature discernibility and
approach is validated using 10-fold cross validation technique and
result analysis is presented in terms of accuracy, precision, recall, and
deviation. Thus, novelties of our proposed work can be summarized FAR. as follows.
A two-stage intrusion classification model is designed using RF
• A novel feature selection technique based on fusion of statistical
classifier in [21]. Here, in the proposed approach, IG is applied as
importance of features is proposed for intrusion detection and
feature selection technique. In the first stage, detection of minority classification.
class is performed and in the second stage, majority class is detected.
• Experimental-based analysis of novel feature selection techniques
Prediction from each stage are combined to generate classification
is performed using recent datasets and datasets used in literature,
result. The performance of the proposed approach is assessed using
namely, NSL-KDD, UNSW_NB-15, and CIC-IDS-2017.
UNSW_NB-15 dataset and results are presented in terms of accuracy
• We have presented comparative analysis of the proposed feature and FAR.
selection technique with existing feature selection techniques.
RepTree-based two stage IDS is proposed in [22], wherein IG is
used as feature selection technique. Here, in the initial stage under-
lying dataset is divided into three categories based on the protocol
3. Proposed feature selection technique for intrusion detection
type and further, in the second stage classification is performed. The and classification
performance of the proposed approach is evaluated using UNSW_NB-
15 dataset with a reduced feature set consisting of 20 features and
For intrusion detection and classification, various feature selec-
results are presented in terms of accuracy. An incremental technique
tion techniques are applied which are categorized in three categories,
comprising of Extreme Learning Machine (IELM) and Advanced Prin-
namely, filter-based feature selection technique, wrapper-based feature
cipal Component (APCA) algorithm is proposed in [23] for intrusion
selection technique, and embedded feature selection technique [29].
detection and classification. Here, in the proposed approach, ELM is
In filter-based feature selection technique, reduced feature subset is
applied for classification and APCA is applied for adaptive feature
derived based on certain relevancy criteria that defines significance
selection. The performance of the proposed approach is assessed using
of features pertaining to learning and classification [30]. Hence, in
UNSW_NB-15 dataset and results are presented in terms of accuracy,
filter-based feature selection technique a relevancy score is derived and DR, and FAR.
features are filtered based on the calculated score [30]. In wrapper-
Intrusion detection and classification system is designed in [25]
based feature selection technique, classification algorithm is considered
using NB and MLP classifier. Here, in the proposed approach, combined
and knowledge based is constructed to derive feature subset for learn-
feature selection technique is applied that consists of three feature
ing and classification [31]. The knowledge base of features reveals
selection techniques, namely, IG, GR, and ReliefF. Performance of the
the importance of features in refined form based on underlying clas-
proposed approach is evaluated using KDD CUP 99 dataset and result
sification algorithm. Feature selection using wrapper-based techniques
analysis is presented in terms of accuracy and FAR. An ensemble
is performed using predefined rules and conditions. However, perfor-
framework along with feature selection is designed in [24] for intrusion
mance of wrapper-based feature selection technique is dependent on
detection and classification. Here, in the proposed approach, GR is
the type of classification algorithm used [31]. Embedded feature se-
applied for selecting significant features for learning and Bagging classi-
lection technique is implemented by incorporating two phases that are
fier is applied for classification. Performance evaluation of the proposed
learning phase and feature selection phase [32]. This implies that em-
approach is conducted using NSL-KDD dataset and results are presented
bedded technique employs selecting features separately in two phases,
in terms of classification accuracy and FAR. The comparative summary
wherein outcomes of learning phase are used to add or delete features
of existing ML approaches for IDS is presented in Table Table 1.
in feature selection phase [32]. 355
A. Thakkar and R. Lohiya
Information Fusion 90 (2023) 353–363 Table 1
Comparative summary of existing Machine Learning (ML) approaches for IDS. Ref Technique Feature selection Dataset Results [16] DT CFS NSL-KDD Accuracy for NSL-KDD: 90.30% ⋅ [15] LS-SVM FMI Kyoto 2006, KDD DR for KDD CUP 99: 99.46% ⋅ CUP 99, and DR for NSL-KDD: 98.76% ⋅ NSL-KDD DR for Kyoto 2006: 99.64% ⋅ [17] RF Attribute evaluator, KDD CUP 99,
Comparative analysis is presented in greedy stepwise, IG, UNSW_NB-15
graphical format for feature selection and ranker technique considered. [22] RepTree IG NSL-KDD, Accuracy for NSL-KDD: 89.85% ⋅ UNSW_NB-15
Accuracy for UNSW_NB-15: 88.95% ⋅ [14] DT GA KDD CUP 99, DR for KDD CUP 99: 99.90% ⋅ UNSW_NB-15 DR for UNSW_NB-15: 81.24% ⋅ [19] kNN, DT, BME, Feature importance UNSW_NB-15 ⋅ Accuracy for kNN: 71.01% XGBoost, and RF Accuracy for DT: 74.22% Accuracy for BME: 74.64% Accuracy for XGBoost: 71.43% Accuracy for RF: 74.87% [21] RF IG UNSW_NB-15
Accuracy for UNSW_NB-15: 85.78% ⋅ [24] Bagging classifier GR NSL-KDD Accuracy for NSL-KDD: 84.25% [23] IELM APCA NSL-KDD, Accuracy for NSL-KDD: 81.22% ⋅ UNSW_NB-15
Accuracy for UNSW_NB-15: 70.51% ⋅ [20] Rotation forest and PSO, ACO, and GA KDD CUP 99
⋅ Accuracy for KDD CUP 99: 72.52% Bagging classifier [25] NB, MLP Combined feature KDD CUP 99 Accuracy for NB: 93.00% ⋅ selection technique Accuracy for MLP: 97.00% ⋅ [18] Rule-based multiple IG UNSW_NB-15
Accuracy for UNSW_NB-15: 84.83% ⋅ tree classifiers
3.1. Association between intrusion classification and feature selection
effectiveness. Selecting relevant features plays a significant role in
deriving pertinent information from a large number of data samples.
Intrusion detection datasets are developed by sniffing network pack-
Feature selection is one of the critical approaches that directs to select
ets flowing through network environment using various networking
features from the underlying dataset which can contribute better in
tools such as Wireshark and Nmap [26]. Captured network packets
enhancing the predictive capability for the given classification problem.
are accumulated in form of raw network files such as pcap files or
Hence, feature selection can be described as selection strategy adopted
tcpdump files. These files consist of various details regarding network
to remove irrelevant and redundant features for better representation
communication which are extracted from network packet header and of data.
network packet payload. The details regarding network communication
In our study, a novel filter-based feature selection technique is de-
from captured network traffic serve as network features for designed
signed to derive relevant features from the intrusion detection dataset
IDS. Intrusion detection and classification system examines the network
that can contribute more towards learning and classification process.
activities and analyses the data to inspect whether the analysed data
Hence, with an aim to enhance the performance of DNN-based IDS,
flow is anomalous network traffic or normal network traffic [32]. IDS
a new and discerning feature selection technique named as Feature
analyses the data to check whether system’s confidentiality, integrity,
Selection via Standard Deviation and Difference of Mean and Median
or availability is compromised or not. While designing an IDS, various
is proposed in our study. The proposed feature selection technique
aspects are considered such as monitoring network, collecting data,
derives reduced feature subset with high discernibility and deviation.
statistically analysing the collected data, intrusion detection, intimidat-
The application of standard deviation, mean, and median is effective
ing the security administrator of an intrusive event, and responding to
in deriving features as these measures perform quantitative and statis- intrusions [8].
tical reasoning to derive relevant features for intrusion detection and
In literature various researchers have designed hybridized
classification [33]. The fusion of statistical importance using statistical
approaches by combining feature selection techniques with classifi-
measures aims at improving the performance of prediction and clas-
cation technique [26]. Considerably, feature selection technique is
sification through quantitative and descriptive comparisons [33]. The
incorporated to enhance the performance of the designed IDS. How-
conceptualization strategy of the proposed feature selection techniques
ever, the combination of feature selection and intrusion detection is presented in Fig. 2.
should focus on achieving increased classification accuracy with re-
duced number of false positives. Therefore, the designed IDS requires 3.3. Standard deviation
an efficient feature selection technique which is capable of extracting
significant features from the underlying dataset. Hence, the proposed
Standard deviation for features can be described as a statistical mea-
feature selection technique aims to select features by considering
sure that measures the amount of variation or deviation in features from
statistical characteristics of features.
mean [34]. The standard deviation can be computed using Eq. (1) [34].
3.2. Conceptualization of the proposed feature selection technique
√ ∑(𝑥𝑖 𝜇)2 𝜎 = (1)
Nowadays, enormous amount of network traffic is generated from 𝑁
various network resources. Features from flowing network traffic are
Here, in Eq. (1), 𝜎 represents standard deviation, 𝑁 is the total number
studied for deriving patterns of normal and anomalous network traffic.
of samples, 𝑥 represents each value from underlying feature, and 𝑖 𝜇
However, network traffic data needs to examined with precision and
represents mean of underlying feature. 356
A. Thakkar and R. Lohiya
Information Fusion 90 (2023) 353–363
Fig. 2. Conceptualization strategy of the proposed feature selection technique.
Interpretation of standard deviation reveals that high standard de-
• Compute the standard deviation (𝜎) of the features of dataset.
viation value indicates that feature is dispersed over large range of
• Rank the features based on standard deviation value from high to
values and a low standard deviation value indicates that feature values
low. Assign rank derived using standard deviation (𝜎) as 𝑅𝑎𝑛𝑘 . 1
are closely located with respect to mean [33]. Hence, feature selection
• Compute the absolute difference (𝐷) of mean and median of the
using standard deviation chooses features with high standard deviation features of dataset.
value because as feature values are extended over large range, effective
• Rank the features based on difference value from high to low.
prediction outcome can be achieved. Moreover, standard deviation
Assign rank derived using difference (𝐷) as 𝑅𝑎𝑛𝑘 . 2
represents features distinguishable capability and therefore, standard
• Compute combined feature rank as Combined Feature Rank =
deviation of a feature manifests its differences on all samples. This 𝑅𝑎𝑛𝑘 𝑅𝑎𝑛𝑘 . 1 + 2
implies that high standard deviation value reveals more differences the
• Recursively add features to feature subset based on combined
feature has on all samples [33].
feature rank until accuracy is not better than the previous derived feature subset. 3.4. Mean and median
The algorithm for recursive feature selection using proposed technique
Mean and Median can be defined as descriptive statistical measures
is presented in Algorithm 1. The derived feature subset is given input
that are used to characterize data distribution [35]. Moreover, these
to DNN model for training and classification.
statistical measures represents relative magnitude of deviation in a data
distribution [35]. For feature selection, we have utilized the absolute
value of the difference between mean and median to derive relevant
4. Experimental methodology
features from the dataset, which is represented using Eq. (2).
The proposed work implements DNN for intrusion detection and
𝐷 = |𝑀𝑒𝑎𝑛 𝑀 𝑒𝑑𝑖𝑎𝑛| (2)
classification. DNN architecture is a multi-layered neural network struc-
ture that performs mathematical transformations on input data to
Here, in Eq. (2), 𝐷 represents absolute value of the difference between
derive and learn patterns for prediction and classification [36]. The
mean and median for a given feature. Interpretation of the difference of
experimental methodology of the proposed approach consists of various
mean and median reveals that high difference value indicates deviation
phases such as deciding on intrusion detection datasets required for the
over a large range of values and hence, features with high difference
performance evaluation, data pre-processing for transforming data for
value can be selected as relevant features from dataset for effective
ease of experimentation, feature selection to derived reduced feature
prediction and classification process [34].
subset for learning, training DNN with reduced feature subset, and
performance evaluation. The schematic of the proposed approach is as
3.5. Process of feature selection for intrusion detection and classification shown in Fig. 3.
The outcome of feature selection process is a set of relevant features
4.1. Dataset description
that are strongly related with class output labels and contribute more
towards the learning patterns from data. To quantify contribution of a
The performance of the proposed approach is evaluated using three
given feature in classification process, we introduce combined feature
intrusion detection datasets, namely, NSL-KDD, UNSW_NB-15, and CIC-
rank that represents significance of a features. The combined feature
IDS-2017. These datasets consist of wide variety of network features
rank is computed based on ranks derived using fusion of standard
and have been developed in different network environments [8]. More-
deviation and difference of mean and median. From the description of
over, these datasets consist of realistic as well as synthetic network
standard deviation and difference of mean and median is revealed that
traffic. Hence, performance of the proposed approach can be indis-
features with highest values possess strong discernibility and minimum
putably advocated using diversified network traffic from three different
redundancy. Hence, process of feature selection is described as follows.
datasets. A brief description of each dataset is as follows. 357
A. Thakkar and R. Lohiya
Information Fusion 90 (2023) 353–363
Fig. 3. Schematic of the proposed approach.
Algorithm 1 Recursive Feature Selection Using Fusion of Standard
a distinct training and test datasets with 175,341 and 82,332 data
Deviation and Absolute Difference of Mean and Median
samples, respectively, is present in UNSW_NB-15 dataset [39].
1: Consider Dataset 𝐷 for intrusion detection and classification where 𝑙
• CIC-IDS-2017 Dataset: It is one of the largest and recent intru-
𝐷 = {NSL-KDD, UNSW_NB-15, CIC-IDS-2017}.
sion detection datasets that is developed by sniffing real-time
2: For features of dataset 𝐷 , calculate standard deviation for each 𝑙
network packets flowing through the network [40]. The designed feature using equation (1).
intrusion detection dataset covers wide range of network services,
3: Sort features from high to low based on their standard deviation
protocols, and modernistic attack categories. The dataset consist
and rank them. Consider the assigned rank as 𝑅𝑎𝑛𝑘 . 1
of data samples captured over the span of five days. Moreover,
4: For features of dataset 𝐷 , calculate absolute value of difference 𝑙
the designed dataset consist of distinctive wide range of network
between mean and median of each feature using equation (2).
features extracted using CICFlowMeter tool [41].
5: Sort features from high to low based on the absolute value of the
difference and rank them. Consider the assigned rank as 𝑅𝑎𝑛𝑘 .
The statistics of intrusion detection datasets used for experimentation 2
6: Compute combined feature rank 𝑅 by summing 𝑅𝑎𝑛𝑘 and 𝑅𝑎𝑛𝑘 . is presented in Table 2. 1 2
7: For each feature 𝐹 of dataset do, 𝑖 𝐹 𝐷𝑙
8: Remove the highest rank feature 𝐹 from F and update as 𝑆 =
4.2. Data pre-processing 𝑖 𝑆𝑙 𝑙 𝑆 𝐹 . 𝑙 𝑖
9: Train DNN model on training set with 𝑆 features and compute
Data pre-processing techniques are applied for ease of experimen- 𝑙 model accuracy.
tation to convert the data for smooth processing and learning [36].
10: Repeat Steps [8-9], for features 𝐹 until increase in accuracy is
In the proposed work, two data pre-processing techniques are ap- 𝑖
recorded more than previous computed accuracy.
plied, namely, feature encoding and feature normalization. Feature
11: Store the derived relevant features in subset 𝑆 for the Dataset .
encoding is performed to convert categorical features into numerical 𝑙 𝐷𝑙
12: Use feature subset 𝑆 for training DNN-based IDS for the dataset
features [4]. Intrusion detection datasets used for experimentation 𝑙 𝐷 .
consist of categorical features such as flag, service type, and protocol 𝑙
type. The categorical features are converted into numerical features by
applying one-hot encoding technique. One-hot encoding is one of the
common feature encoding technique that is applied for numeralization
• NSL-KDD Dataset: It is an intrusion detection dataset that has
of categorical features [4]. Further, after feature encoding, feature
been developed by eliminating missing and duplicate samples
normalization is performed, as datasets might consist of features that
from KDD CUP 99 dataset [37]. It consist of different categories
are in different dimensions and scale of values. Hence, for feature
of network features as well as samples for four attack categories
normalization standard scalar technique is applied that normalizes the
and normal network traffic [38]. Moreover, a distinct training
features by subtracting the mean and scaling the feature values to unit
dataset and test dataset with 125,973 and 22,544 data samples, variance.
respectively, is present in NSL-KDD dataset [38]. 4.3. Feature selection
• UNSW_NB-15 Dataset: It is an intrusion detection dataset that has
been developed using IXIA Perfect Storm tool which captures the
Feature selection process is performed to derived reduced feature
network packets flowing in the designed network testbed [39].
subset consisting of relevant and contributing features from considered
The dataset is developed with network traffic that describes se-
intrusion detection dataset. Features are selected using proposed fea-
curity vulnerabilities and exploits along with normal network
ture selection technique described in Section 3. The proposed feature
traffic. The network features of the designed dataset are extracted
selection technique is applied on intrusion detection datasets, namely,
using software tools, namely, Argus and Bro-IDS [39]. Moreover,
NSL-KDD, UNSW_NB-15, and CIC-IDS-2017. Application of proposed 358
A. Thakkar and R. Lohiya
Information Fusion 90 (2023) 353–363 Table 2
Statistics of the experimental datasets [8]. Criteria (↓)/Dataset (→) NSL-KDD UNSW_NB-15 CIC-IDS-2017 Type of network traffic Real & Synthetic Synthetic Real Number of features 41 42 79 Number of attack categories 4 9 7 Number of classes 5 10 15 Number of data samples 148 517 257 673 225 745
Number of samples in training set 125 973 175 341 165 730 Number of samples in test set 22 544 82 332 60 015 Table 3
Neural network architecture and configuration details [4]. Criteria Values Model Sequential Number of hidden layers [4] 3 Size of input NSL-KDD: 21, UNSW_NB: 21, CIC-IDS-2017: 64
Number of neurons in hidden layers [4] 1024, 768, 512
Activation function for hidden layer [4] ReLU
Activation function for output layer [4] Sigmoid Dropout techniques Standard dropout (p = 0.1) (Derived using GridSearchCV) Batch-size [4] 1024 Epochs [4] 300
feature selection technique results in reduced feature susbet of 21
features out of 41 features for NSL-KDD dataset, 21 features out of 42
features for UNSW_NB-15 dataset, and 64 features out of 79 features
Fig. 4. Deep Neural Network.
for CIC-IDS-2017 dataset. The reduced feature subset is given input to
DNN model for learning and prediction.
4.4. Deep neural network for intrusion detection and classification 𝐹𝑝 𝐹 𝑃 𝑅 = (7) 𝐹
Multi-layered DNN architecture is designed for intrusion detection 𝑝 + 𝑇𝑛
and classification. The designed DNN architecture consists of an input
Here, in Eqs. (3)–(7), 𝑇 , 𝑇 , , and represent true positive, true 𝑝 𝑛 𝐹𝑝 𝐹𝑛
layer with input dimension equal to the number of features derived
negative, false positive, and false negative, respectively [26].
using feature selection, three fully connected dense hidden layers with
varied number of neurons for data transformation and learning, and 5. Result analysis
an output layer with one neuron for binary classification. The intricate
layered structure of neurons learns patterns by exhibiting end-to-end
Experiments for evaluation of the proposed approach is performed
learning and performs prediction for given input sample. With each
on Intel(R) Core(TM) i5-8265U CPU processor with 64-bit Windows
fully connected layer ReLU activation function is used to strengthen
10 operating system and 8.00 GB RAM using Python. The experiments
effect of learning process [5]. Moreover, succeeding to every dense
are performed on pre-processed intrusion detection datasets namely,
layer, a dropout layer is incorporated to achieve generalization and
NSL-KDD, UNSW_NB-15, and CIC-IDS-2017 with reduced feature subset
avoid co-adaptation in neural network [42]. For output layer, Sigmoid
derived using proposed feature selection technique. Experiments are
activation function is used to predict the class output label. Further-
performed for ten runs and achieved results are averaged. For the per-
more, performance of the DNN structure is evaluated by applying
formance analysis, we have compared our proposed feature selection
binary cross entropy loss function. The structure of DNN is presented
technique with existing feature selection techniques that are described
in Fig. 4 and its configuration details are presented in Table 3. as follows.
4.5. Performance evaluation
• Recursive Feature Elimination (RFE) : In RFE, feature selection
is performed by recursively eliminating features based on feature
Performance of the proposed approach is presented using varied
importance and deriving a feature subset that consist of relevant
evaluation metrics derived from confusion matrix, namely, accuracy,
features with superior feature importance scores [12].
precision, recall, 𝑓 -score, and FPR [26]. The evaluation metrics are
• Chi-Square: In Chi-Square feature selection technique reduced
expressed using Eqs. (3)–(7).
feature subset is derived by performing chi-square statistical test 𝑇
that measures the dependency among features [12]. 𝑝 + 𝑇𝑛
𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 = (3) 𝑇 𝐹
• Correlation-based Feature Selection (CFS): CFS technique is based
𝑝 + 𝐹𝑝 + 𝑛 + 𝑇𝑛
on the hypothesis that a good feature subset consist of features 𝑇𝑝
that are in strong correlation with the target class and in low
𝑃 𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 = (4)
𝑇𝑝 + 𝐹𝑝
correlation with each other [43]. Hence, in CFS features are 𝑇
selected based on their computed correlation coefficient score. 𝑝
𝑅𝑒𝑐𝑎𝑙𝑙 = (5) 𝑇
• Genetic Algorithm: In feature selection technique based on ge- 𝑝 + 𝐹𝑛
netic algorithm, feature are selected based on their fitness values
2 ∗ 𝑃 𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 𝑅𝑒𝑐𝑎𝑙𝑙
𝑓 𝑠𝑐𝑜𝑟𝑒 = (6)
computed using defined fitness function. In [44], fitness function
𝑃 𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 + 𝑅𝑒𝑐𝑎𝑙𝑙 359
A. Thakkar and R. Lohiya
Information Fusion 90 (2023) 353–363 Table 4
Results for NSL-KDD Dataset. Technique Feature selection No. of selected features Accuracy Precision Recall f-score FPR Execution time (s) DNN
Recursive Feature Elimination (RFE) [12] 13 98.94 99.39 98.75 99.33 0.012 32 519.045 DNN Chi-Square [12] 13 98.92 99.92 98.73 99.32 0.012 31 613.115 DNN
Correlation-based Feature Selection (CFS) [43] 30 92.65 99.59 91.24 95.23 0.082 25 915.630 DNN Genetic Algorithm [44] 23 94.90 95.10 94.30 94.70 0.094 35 569.016 DNN Mutual information 13 98.89 99.90 98.70 99.30 0.0291 33 033.130 DNN Relief-f 20 81.94 81.91 98.46 89.42 0.0530 36 045.110 DNN Random forest 16 98.88 99.89 98.71 99.30 0.0210 34 761.620 DNN
Proposed feature selection technique 21 99.84 99.94 98.81 99.37 0.011 22 318.015
Note: The value in boldface indicates the best performance for the experiments conducted for NSL-KDD dataset. Table 5
Results for UNSW_NB-15 Dataset. Technique Feature selection No. of selected features Accuracy Precision Recall f-score FPR Execution time (s) DNN
Recursive Feature Elimination (RFE) [12] 13 82.21 78.71 98.86 87.64 0.013 22 314.470 DNN Chi-Square [12] 13 82.41 79.02 98.61 87.73 0.013 21 832.195 DNN
Correlation-based Feature Selection (CFS) [43] 30 75.34 67.43 98.29 79.99 0.017 21 766.215 DNN Genetic Algorithm [44] 30 76.70 92.70 95.00 93.83 0.069 25 387.412 DNN Mutual information 21 76.26 72.87 97.92 83.55 0.077 18 190.205 DNN Relief-f 13 72.34 73.26 89.09 80.40 0.1090 18 643.650 DNN Random forest 17 82.69 79.37 98.41 87.87 0.0518 18 152.130 DNN
Proposed feature selection technique 21 89.03 95.00 98.95 96.93 0.011 13 913.500
Note: The value in boldface indicates the best performance for the experiments conducted for UNSW_NB-15 dataset.
is defined using accuracy, 𝑓 -score, and FPR, which is used to
Precision and recall evaluation metrics illustrate the relevancy and
calculate fitness of features and further, features with high fitness
sensitivity of underlying classification technique for a given application
values are chosen for intrusion detection and classification.
problem. For the proposed approach, promising scores for precision and
• Mutual Information: In mutual information, feature selection is
recall evaluation metrics are achieved for all the three datasets. For
performed by estimating dependency between the features. The
NSL-KDD dataset, approximate increase of 0.04%–18% in precision and
selection of the feature relies upon non-parametric procedure,
0.06%–7% in recall is recorded with precision of 99.94% and recall of
namely, entropy estimation [45].
98.81% using proposed feature selection technique. For UNSW_NB-15
• Relief-f: This feature selection technique is based on sensitive
dataset, approximate increase of 3%–28% in precision and 0.09%–9%
feature interactions, wherein feature score is derived for each
in recall is recorded with precision of 95.00% and recall of 98.95%
feature which is further considered to rank features. The feature
using proposed feature selection technique. For CIC-IDS-2017 dataset,
score is obtained by estimating feature value differences among
approximate increase of 0.05%–2% in precision and 0.67%–2% in
the nearest neighbour instance pairs [45].
recall is recorded with precision of 99.85% and recall of 99.94% using
• Random Forest: Random forest is one of the popular classifica-
proposed feature selection technique.
tion technique that possess implicit feature selection capability.
In random forest, features are selected based on their measure
It is interesting to study the performance of classification techniques
of impurity, namely, Gini index. Hence, while training random
with unbalanced dataset using 𝑓 -score. This is because, 𝑓 -score can be
forest classifier, there is a possibility to determine how much
considered as one of the important performance measure which is a
each feature reduces the impurity. The more a feature reduces
balanced metric that considers both precision and recall. For NSL-KDD
impurity, the more significant it is [46].
dataset, 99.37% 𝑓 -score value is achieved with proposed approach,
which is approximately, 0.07%–10% more compared to other feature
The experimental results for NSL-KDD, UNSW_NB-15, and CIC-IDS-
selection technique. For UNSW_NB-15 dataset, 96.93% 𝑓 -score value
2017 are presented in Tables 4, 5, and 6, respectively. It can be inferred
is achieved with proposed approach, which is approximately, 3%–17%
from the results that the proposed feature selection technique achieved
more compared to other feature selection technique. For CIC-IDS-2017
better results compared to existing features selection technique with
dataset, 99.89% 𝑓 -score value is achieved with proposed approach,
DNN-based IDS for all the three intrusion detection datasets. For NSL-
which is approximately, 0.59%–2% more compared to other feature
KDD dataset, the proposed approach achieved 99.84% accuracy with
selection technique. Moreover, the proposed feature selection approach
the derived reduced feature subset. Hence, an approximate increase
has outperformed with respect to minimum FPR for DNN-based IDS
of 1%–18% in accuracy is recorded with proposed feature selection
compared to other feature selection techniques.
technique for DNN-based IDS using NSL-KDD dataset. For UNSW_NB-
15 dataset, the proposed approach achieved 89.03% accuracy with
Apart from the evaluation metrics, the proposed approach can also
the derived reduced feature subset. Hence, an approximate increase of
be compared based on the execution time recorded that consist of pre-
7%–17% in accuracy is recorded with the proposed feature selection
processing, feature selection, training, and classification. The execution
technique for DNN-based IDS using UNSW_NB-15 dataset. For CIC-
time for all the three intrusion detection datasets is presented in Ta-
IDS-2017 dataset, the proposed approach achieved 99.80% accuracy
bles 4–6. It can be inferred from the execution time that with derived
with the reduced feature subset. Hence, an approximate increase of
feature subset the proposed DNN-based IDS has recorded less execution
0.9%–2% in accuracy is recorded with the proposed feature selection
time even though the number of features are higher than few of the
technique for DNN-based IDS using CIC-IDS-2017 dataset.
existing techniques that have been implemented. 360
A. Thakkar and R. Lohiya
Information Fusion 90 (2023) 353–363 Table 6
Results for CIC-IDS-2017 Dataset. Technique Feature selection No. of selected features Accuracy Precision Recall f-score FPR Execution time (s) DNN
Recursive Feature Elimination (RFE) [12] 13 98.81 98.00 98.00 98.00 0.041 35 214.235 DNN Chi-Square [12] 13 98.15 98.20 98.93 98.56 0.062 35 517.115 DNN
Correlation-based Feature Selection (CFS) [43] 54 97.78 97.77 97.00 97.38 0.094 32 261.510 DNN Genetic Algorithm [44] 38 98.00 98.89 97.77 98.32 0.069 40 552.215 DNN Mutual information 35 98.00 98.17 99.82 98.98 0.0178 32 031.522 DNN Relief-f 35 98.99 99.07 99.07 99.07 0.0801 32 045.130 DNN Random forest 37 98.90 99.90 98.70 99.30 0.0601 32 029.250 DNN
Proposed feature selection technique 64 99.80 99.85 99.94 99.89 0.012 27 719.360
Note: The value in boldface indicates the best performance for the experiments conducted for CIC-IDS-2017 dataset.
5.1. Comparison with existing studies and future scope
5.2. Complexity analysis
In this section, we discuss the time complexity of Algorithm 1. For
The hypothesis of feature selection technique reveals that feature
analysing the time complexity, assume that 𝑁 is the number of data
selection is an essential part of a learning model that facilitates the
samples in the underlying dataset, 𝑑 is the number of features, 𝑚 is
model to extract and learn features and thereby reduce the complexity
the number of features in feature subset 𝑆, 𝑛 is the number of nested
of the model [26]. Feature engineering can be performed by selecting or
subset of features. The computational significance of Algorithm 1 is
extracting relevant features from the dataset. In our proposed approach
to derive feature subset 𝑆 to enhance the process of DNN-based IDS
we aim to focus on enhancing the performance of DNN-based IDS by
by minimizing the generalization error and increasing the predictive
proposing a novel feature selection technique that selects features via capability.
The proposed feature selection requires to compute standard devi-
fusion of statistical importance using Standard Deviation and Difference
ation and difference of mean and median for each feature. The time
of Mean and Median. Result analysis of the proposed approach commu-
complexity of calculating standard deviation, mean, median, and com-
nicates that the proposed approach performs better to existing feature
bined rank of all features is 𝑂(𝑑𝑁𝑙𝑜𝑔𝑁 ). Time complexity of recursively
selection techniques considered for performance comparison. Apart
eliminating one feature from the feature subset is 𝑁𝑑2 𝑂( ) [52]. Hence,
from comparative analysis with existing feature selection technique, we 2
the time complexity of Algorithm 1 is 𝑁𝑑2
𝑚𝑎𝑥[𝑂(𝑑𝑁𝑙𝑜𝑔𝑁 ), 𝑂( )].
have also presented comparative analysis with existing research work 2
in the field of intrusion detection and classification as shown in Table 7.
5.3. Energy consumption analysis
Considering and comparing with the research work and their
achieved results following key insights can be derived.
For profiling energy consumption analysis for different datasets,
it is intriguing to note that the energy consumption for a given task
• It can be deduced from result analysis that DL techniques per-
refers to the core power usage during the task execution time [53].
forms better compared to ML techniques for intrusion detection
This implies that the energy consumption is directly proportional to
and classification. There are various factors that contribute in
the execution time during which the power is consumed. From the
better performance of DL-based IDS such as, efficacy to handle
Tables 4–6, it can be inferred that the proposed approach recorded less
execution time compared to the existing feature selection techniques
high-dimensional data, better feature learning capability, and
for all the three intrusion detection datasets considered for performance
effective learning strategy. The proposed approach is able to
evaluation. Hence, from the result analysis, it can be deduced that the
achieve to improved performance for NSL-KDD dataset with an
proposed approach consumed less energy compared to other existing
approximate increase in accuracy of 26.27% compared to [47].
feature selection techniques considered for comparative analysis.
• Comparing the results of the proposed approach with other DL
techniques presented in [48,49], it can be deduced that the
5.4. Statistical significance and discussions
proposed approach achieved better results for NSL-KDD dataset
in terms of accuracy, wherein increase in accuracy is reported
The attained results are also statistically validated using Wilcoxon
signed-rank test for all the performance measures considered for the
approximately 9% and 1% compared to [48,49], respectively.
experimentation. Significance of the results achieved can be expressed
Moreover, the proposed approach has achieved improved perfor-
using 𝑝-value, wherein 𝑝-value should be less than 0.05 [54]. It can be
mance in terms of FPR for NSL-KDD dataset with approximate
inferred from the Table 8, that the 𝑝-value obtained for all the three
decrease of 7% and 6% compared to [48,49], respectively.
datasets considered for experimentation is less than 0.05. Hence, the
• Moreover, comparable performance is achieved for other perfor-
results achieved are statistically significant.
mance metrics such as precision, recall, and 𝑓 -score.
Considering the role and importance of feature engineering in intru-
sion detection and classification process, the proposed feature selection
However, for the performance recorded for UNSW_NB-15 dataset
technique recursively derives features based on their statistical proper-
in [48,49] is better in terms of varied performance metrics such as
ties. Focus of the proposed feature selection approach is to recursively
accuracy, precision, recall, and 𝑓 -score. This is because based on the
derive significant features from the underlying dataset based on their
exploratory analysis of UNSW_NB-15 dataset consist of high number of
computed combined rank. Single-feature ranking procedure assumes
outliers and skewed data, which can be effectively handled by CNN
that the features in the underlying dataset are independent of each
other. However, there are often correlations among features that should
and LSTM architectures [50]. However, the proposed approach records
be considered to impose feature redundancy while feature selection.
better performance in terms of FPR for UNSW_NB-15 dataset com-
Hence, multi-feature ranking can be considered that takes correlation
pared to [48,49]. Hence, in future, it would be promising to consider
as well as combined feature to derive reduced feature subset. Thus,
analysing the resiliency of IDS by optimizing the neural network ar-
this can serve as an important future research direction which can
chitecture using nature-inspired algorithms or by using nature-inspired
be consider prospective scope in the field of feature engineering for
algorithms as feature selection techniques.
intrusion detection and classification. 361
A. Thakkar and R. Lohiya
Information Fusion 90 (2023) 353–363 Table 7
Comparison with existing studies. Ref. Technique Feature Dataset Result analysis selection [51] Decision Tree Linear KDD CUP 99
Accuracy: 95.03%, Detection Rate: 95.23%, FPR: 1.65% (DT) correlation coefficient [47] Ensemble CFS-Bat NSL-KDD
Results for NSL-KDD: Accuracy of 73.57%, Detection Rate of 73.6% and FPR tree classifier algorithm of 12.92% [48] Optimized Hierarchical NSL-KDD,
NSL-KDD dataset Accuracy: 90.67%, Precision: 86.71%, Recall: 95.19%, ⋅ CNN multi-scale ISCX,
𝑓 -score: 91.46%, FPR: 8.86%, and Training time: 5118 s. LSTM UNSW_NB-15
ISCX dataset Accuracy: 95.33%, Precision: 100%, Recall: 94.77%, 𝑓 -score: ⋅
97.61%, FPR: 7.84%, and Training time: 54 480 s.
UNSW_NB-15 dataset Accuracy: 96.33%, Precision: 100%, Recall: 95.87%, ⋅
𝑓 -score: 98.13%, FPR: 5.87%, and Training time: 30 665 s. [49] Black widow Artificial bee NSL-KDD,
NSL-KDD dataset accuracy: 98.67%, Precision: 97.48%, Recall: 100%, ⋅ optimized colony UNSW_NB-
𝑓 -score: 98.73%, FPR: 7.50%, and Training time: 4675.45 s. Conv LSTM 15, ISCX,
UNSW_NB-15 dataset accuracy: 98.66%, Precision: 100%, Recall: 98.77%, ⋅ CIC-IDS-2018
𝑓 -score: 98.77%, FPR: 4.48%, and Training time: 26 721.2 s.
ISCX dataset accuracy: 97.00%, Precision: 100%, Recall: 95.78%, 𝑓 -score: ⋅
99.67%, FPR: 5.76%, and Training time: 48 761.05 s.
CSE-CIC-IDS-2018 dataset accuracy: 98.25%, Precision: 97.48%, Recall: ⋅
98.67%, 𝑓 -score: 98.18%, FPR: 2.52%, and Training time: 22 713.02 s. Our Study DNN Proposed NSL-KDD,
NSL-KDD dataset accuracy: 99.84%, Precision: 99.94%, Recall: 98.81%, ⋅ feature UNSW_NB-
𝑓 -score: 99.37%, FPR: 1.1%, and Execution time: 22 318.015 s. selection 15,
UNSW_NB-15 dataset accuracy: 89.03%, Precision: 95.00%, Recall: 98.95%, ⋅ CIC-IDS-2017
𝑓 -score: 96.93%, FPR: 1.1%, and Execution time: 13 913.50 s.
CIC-IDS-2017 dataset accuracy: 99.80%, Precision: 99.85%, Recall: 99.94%, ⋅
𝑓 -score: 99.89%, FPR: 1.2%, and Execution time: 27 719.36 s. Table 8 References
Wilcoxon signed-rank test results. Dataset 𝑝-value
[1] A. Thakkar, R. Lohiya, Role of swarm and evolutionary algorithms for intrusion NSL-KDD 0.0027
detection system: A survey, Swarm Evol. Comput. 53 (2020) 100631. UNSW_NB-15 0.0053
[2] R. Lohiya, A. Thakkar, Application domains, evaluation datasets, and research CIC-IDS-2017 0.0054
challenges of IoT: A systematic review, IEEE Internet Things J. (2020).
[3] A. Thakkar, R. Lohiya, A review on machine learning and deep learning
perspectives of IDS for IoT: Recent updates, security issues, and challenges,
Arch. Comput. Methods Eng. (2020) 1–33, http://dx.doi.org/10.1007/s11831- 6. Concluding remarks 020-09496-0.
[4] A. Thakkar, R. Lohiya, Analyzing fusion of regularization techniques in the deep
learning-based intrusion detection system, Int. J. Intell. Syst. (2021).
The study proposes a novel feature selection technique based on
[5] M.A. Chang, D. Bottini, L. Jian, P. Kumar, A. Panda, S. Shenker, How to train
fusion of statistical importance using standard deviation and difference
your DNN: The network operator edition, 2020, arXiv preprint arXiv:2004.10275.
of mean and median for enhancing the performance of intrusion detec-
[6] R. Lohiya, A. Thakkar, Intrusion detection using deep neural network with
antirectifier layer, in: Applied Soft Computing and Communication Networks,
tion and classification. The proposed feature selection technique aims Springer, 2021, pp. 89–105.
at deriving reduced feature subset that consist of features that have
[7] F.E. White, Data Fusion Lexicon, Technical Report, Joint Directors of Labs
attributes such as high discernibility and deviation. For prediction and Washington DC, 1991.
classification Deep Neural Network (DNN) technique is applied that
[8] A. Thakkar, R. Lohiya, A review of the advancement in intrusion detection
considers reduced feature subset for learning and deriving patterns in
datasets, Procedia Comput. Sci. 167 (2020) 636–645.
[9] G. Bagyalakshmi, G. Rajkumar, N. Arunkumar, M. Easwaran, K. Narasimhan, V.
data. Performance evaluation of the proposed approach is performed
Elamaran, M. Solarte, I. Hernández, G. Ramirez-Gonzalez, Network vulnerability
using three intrusion detection datasets, namely, NSL-KDD, UNSW_NB-
analysis on brain signal/image databases using nmap and wireshark tools, IEEE
15, and CIC-IDS-2017. The performance of the proposed approach is Access 6 (2018) 57144–57151.
demonstrated in terms of accuracy, precision, recall, 𝑓 -score, False
[10] A. Gharib, I. Sharafaldin, A.H. Lashkari, A.A. Ghorbani, An evaluation framework
Positive Rate (FPR), and execution time. From the experiments per-
for intrusion detection dataset, in: 2016 International Conference on Information
Science and Security (ICISS), IEEE, 2016, pp. 1–6.
formed, it can be deduced that the proposed approach achieved better
[11] G. Creech, J. Hu, Generation of a new IDS test dataset: Time to retire the KDD
performance compared to existing feature selection techniques for all
collection, in: 2013 IEEE Wireless Communications and Networking Conference
the three intrusion detection datasets with reduced execution time.
(WCNC), IEEE, 2013, pp. 4487–4492.
Hence, the derived features using proposed feature selection technique
[12] A. Thakkar, R. Lohiya, Attack classification using feature selection techniques:
a comparative study, J. Ambient Intell. Humaniz. Comput. 12 (1) (2021)
were able to enhance the performance of DNN-based IDS. 1249–1266.
[13] O. Almomani, A feature selection model for network intrusion detection system
Declaration of competing interest
based on PSO, GWO, FFA and GA algorithms, Symmetry 12 (6) (2020) 1046.
[14] C. Khammassi, S. Krichen, A GA-LR wrapper approach for feature selection in
The authors declare that they have no known competing finan-
network intrusion detection, Comput. Secur. 70 (2017) 255–277.
[15] M.A. Ambusaidi, X. He, P. Nanda, Z. Tan, Building an intrusion detection system
cial interests or personal relationships that could have appeared to
using a filter-based feature selection algorithm, IEEE Trans. Comput. 65 (10)
influence the work reported in this paper. (2016) 2986–2998.
[16] B. Ingre, A. Yadav, Performance analysis of NSL-KDD dataset using ANN, in: 2015 Data availability
International Conference on Signal Processing and Communication Engineering
Systems, IEEE, 2015, pp. 92–96.
[17] T. Janarthanan, S. Zargari, Feature selection in UNSW-NB15 and KDDCUP’99
The authors are unable or have chosen not to specify which data
datasets, in: 2017 IEEE 26th International Symposium on Industrial Electronics has been used.
(ISIE), IEEE, 2017, pp. 1881–1886. 362
A. Thakkar and R. Lohiya
Information Fusion 90 (2023) 353–363
[18] V. Kumar, D. Sinha, A.K. Das, S.C. Pandey, R.T. Goswami, An integrated rule
[37] U. Repository, NSL-KDD dataset, 2009, URL https://www.unb.ca/cic/datasets/
based intrusion detection system: analysis on UNSW-NB15 data set and the real
nsl.html (accessed April 22, 2019).
time online dataset, Cluster Comput. 23 (2) (2020) 1397–1418.
[38] L. Dhanabal, S. Shantharajah, A study on NSL-KDD dataset for intrusion detection
[19] N.M. Khan, N. Madhav C, A. Negi, I.S. Thaseen, Analysis on improving the
system based on classification algorithms, Int. J. Adv. Res. Comput. Commun.
performance of machine learning models using feature selection technique, Eng. 4 (6) (2015) 446–452.
in: International Conference on Intelligent Systems Design and Applications,
[39] N. Moustafa, J. Slay, UNSW-NB15: a comprehensive data set for network Springer, 2018, pp. 69–77.
intrusion detection systems (UNSW-NB15 network data set), in: 2015 Military
[20] B.A. Tama, M. Comuzzi, K.-H. Rhee, TSE-IDS: A two-stage classifier ensemble
for intelligent anomaly-based intrusion detection system, IEEE Access 7 (2019)
Communications and Information Systems Conference (MilCIS), IEEE, 2015, pp. 94497–94507. 1–6.
[21] W. Zong, Y.-W. Chow, W. Susilo, A two-stage classifier approach for network
[40] I. Sharafaldin, A.H. Lashkari, A.A. Ghorbani, Toward generating a new intrusion
intrusion detection, in: International Conference on Information Security Practice
detection dataset and intrusion traffic characterization, in: ICISSP, 2018, pp.
and Experience, Springer, 2018, pp. 329–340. 108–116.
[22] M. Belouch, S. El Hadaj, M. Idhammad, A two-stage classifier approach using
[41] R. Panigrahi, S. Borah, A detailed analysis of CICIDS2017 dataset for designing
reptree algorithm for network intrusion detection, Int. J. Adv. Comput. Sci. Appl.
intrusion detection systems, Int. J. Eng. Technol. 7 (3.24) (2018) 479–482. 8 (6) (2017) 389–394.
[42] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout:
[23] J. Gao, S. Chai, B. Zhang, Y. Xia, Research on network intrusion detection based
a simple way to prevent neural networks from overfitting, J. Mach. Learn. Res.
on incremental extreme learning machine and adaptive principal component 15 (1) (2014) 1929–1958.
analysis, Energies 12 (7) (2019) 1223.
[43] N. Gopika, M.E.A. Meena Kowshalaya, Correlation based feature selection
[24] N.T. Pham, E. Foo, S. Suriadi, H. Jeffrey, H.F.M. Lahza, Improving performance
of intrusion detection system using ensemble methods and feature selection, in:
algorithm for machine learning, in: 2018 3rd International Conference on
Proceedings of the Australasian Computer Science Week Multiconference, 2018,
Communication and Electronics Systems (ICCES), IEEE, 2018, pp. 692–695. pp. 1–6.
[44] Z. Liu, Y. Shi, A hybrid IDS using GA-based feature selection method and random
[25] A.A. Salih, M.B. Abdulrazaq, Combining best features selection using three
forest, Int. J. Mach. Learn. Comput. 12 (2) (2022).
classifiers in intrusion detection system, in: 2019 International Conference on
[45] Y. Zhang, X. Ren, J. Zhang, Intrusion detection method based on information
Advanced Science and Engineering (ICOASE), IEEE, 2019, pp. 94–99.
gain and relieff feature selection, in: 2019 International Joint Conference on
[26] A. Thakkar, R. Lohiya, A survey on intrusion detection system: feature selection,
Neural Networks (IJCNN), IEEE, 2019, pp. 1–5.
model, performance measures, application perspective, challenges, and future
[46] X. Li, W. Chen, Q. Zhang, L. Wu, Building auto-encoder intrusion detection
research directions, Artif. Intell. Rev. (2021) 1–111.
system based on random forest feature selection, Comput. Secur. 95 (2020)
[27] Y. Xin, L. Kong, Z. Liu, Y. Chen, Y. Li, H. Zhu, M. Gao, H. Hou, C. Wang, Machine 101851.
learning and deep learning methods for cybersecurity, IEEE Access (2018).
[47] Y. Zhou, G. Cheng, S. Jiang, M. Dai, Building an efficient intrusion detection
[28] A.L. Buczak, E. Guven, A survey of data mining and machine learning methods
for cyber security intrusion detection, IEEE Commun. Surv. Tutor. 18 (2) (2016)
system based on feature selection and ensemble classifier, Comput. Netw. 174 1153–1176. (2020) 107247.
[29] L.-H. Li, R. Ahmad, W.-C. Tsai, A.K. Sharma, A feature selection based DNN for
[48] P.R. Kanna, P. Santhi, Unified deep learning approach for efficient intrusion
intrusion detection system, in: 2021 15th International Conference on Ubiquitous
detection system using integrated spatial–temporal features, Knowl.-Based Syst.
Information Management and Communication (IMCOM), IEEE, 2021, pp. 1–8. 226 (2021) 107132.
[30] T.-S. Chou, K.K. Yen, J. Luo, Network intrusion detection design using feature
[49] P.R. Kanna, P. Santhi, Hybrid intrusion detection using MapReduce based black
selection of soft computing paradigms, Int. J. Comput. Intell. 4 (3) (2008)
widow optimized convolutional long short-term memory neural networks, Expert 196–208. Syst. Appl. 194 (2022) 116545.
[31] S. Zaman, F. Karray, Features selection for intrusion detection systems based
[50] N. Sharma, N.S. Yadav, S. Sharma, Classification of UNSW-NB15 dataset using
on support vector machines, in: Consumer Communications and Networking
exploratory data analysis using ensemble learning, EAI Endorsed Trans. Ind.
Conference, 2009. CCNC 2009. 6th IEEE, IEEE, 2009, pp. 1–8.
Netw. Intell. Syst. 8 (29) (2021) e4.
[32] S. Aljawarneh, M. Aldwairi, M.B. Yassein, Anomaly-based intrusion detection
system through feature selection analysis and building hybrid efficient model, J.
[51] S. Mohammadi, H. Mirvaziri, M. Ghazizadeh-Ahsaee, H. Karimipour, Cyber
Comput. Sci. 25 (2018) 152–160.
intrusion detection by combined feature selection algorithm, J. Inf. Secur. Appl.
[33] J. Xie, M. Wang, S. Xu, Z. Huang, P.W. Grant, The unsupervised feature selection 44 (2019) 80–88.
algorithms based on standard deviation and cosine similarity for genomic data
[52] X. Ding, F. Yang, F. Ma, An efficient model selection for linear discriminant
analysis, Front. Genet. 12 (2021).
function-based recursive feature elimination, J. Biomed. Inform. 129 (2022)
[34] R. de Nijs, T.L. Klausen, On the expected difference between mean and median, 104070.
Electron. J. Appl. Statist. Anal. 6 (1) (2013) 110–117.
[53] S. Hajiamini, B.A. Shirazi, A study of DVFS methodologies for multicore systems
[35] T. Pham-Gia, T.L. Hung, The mean and median absolute deviations, Math.
with islanding feature, in: Advances in Computers, Vol. 119, Elsevier, 2020, pp.
Comput. Modelling 34 (7–8) (2001) 921–936. 35–71.
[36] P. Chen, Y. Guo, J. Zhang, Y. Wang, H. Hu, A novel preprocessing methodology
[54] S. Taheri, G. Hesamian, A generalization of the wilcoxon signed-rank test and
for DNN-based intrusion detection, in: 2020 IEEE 6th International Conference
on Computer and Communications (ICCC), IEEE, 2020, pp. 2059–2064.
its applications, Statist. Papers 54 (2) (2013) 457–470. 363