Intelligent Traffic-Monitoring System Based On YOLO and Convolutional Fuzzy Neural Networks
ABSTRACT With the rapid pace of urbanization, the number of vehicles traveling between cities has
increased significantly. Consequently, many traffic-related problems have emerged, such as traffic jams and an excessive number and variety of vehicles. To solve these traffic problems, road data collection is important.
Therefore, in this paper, we develop an intelligent traffic-monitoring system based on you only look once (YOLO) and a convolutional fuzzy neural network (CFNN), which records traffic volume and vehicle-type information from the road. In this system, YOLO is first used to detect vehicles and is combined
with a vehicle-counting method to calculate traffic flow. Then, two effective models (CFNN and Vector-
CFNN) and a network mapping fusion method are proposed for vehicle classification. In our experiments,
the proposed method achieved an accuracy of 90.45% on the Beijing Institute of Technology public dataset.
On the GRAM-RTM data set, the mean average precision and F-measure (F1) of the proposed YOLO-CFNN
and YOLO-VCFNN vehicle classification methods are 99%, superior to those of other methods. On actual
roads in Taiwan, the proposed YOLO-CFNN and YOLO-VCFNN methods not only have a high F1 score
for vehicle classification but also have outstanding accuracy in vehicle counting. In addition, the proposed
system can maintain a detection speed of more than 30 frames per second on the AGX embedded platform.
Therefore, the proposed intelligent traffic monitoring system is suitable for real-time vehicle classification
and counting in the actual environment.
INDEX TERMS Traffic-monitoring system, fuzzy neural network, vehicle classification, feature fusion,
deep learning.
into AdaBoost to filter important features. Then, the filtered features were input into a support vector machine (SVM) for classification to improve its recognition accuracy. Sun et al. [3] and David and Athira [4] used Gabor filters to obtain vehicle characteristics and then input them into an SVM to determine whether a vehicle is present in an image. Wei et al. [5] designed a two-step vehicle-detection method. First, they used Haar-like features and AdaBoost to obtain the region of interest with vehicles and subsequently used the histogram of oriented gradients (HOG) [6] and an SVM to reverify the region. According to their experimental results, their method exhibited improved vehicle-detection capability. Yan et al. [7] designed a vehicle-detection system that used vehicle shadows to select the boundaries of vehicles and the HOG to extract features. These features were then input into an AdaBoost classifier and an SVM classifier for verification. In this method, when vehicles block each other, they are regarded as one vehicle because their shadows are connected, which weakens the detection effect.

In terms of dynamics, Seenouvong et al. [8] proposed a vehicle-detection and counting system based on dynamic features. Background subtraction was used to obtain a difference map from a given current image to achieve segmentation of the corresponding foreground image. In addition, various morphological operations were used to obtain the outline and bounding box of a moving object, detect moving vehicles, and count the vehicles passing through a designated area. A few researchers have used Gaussian mixture models (GMMs) [9], [10] or adaptive background models [11]–[13] to model the background, with the aim of solving a common problem of background subtraction: poor foreground segmentation caused by gradual changes in brightness. The aforementioned static and dynamic methods have many limitations. For example, traditional feature-extraction methods must be manually designed by experts on the basis of their experience, meaning that the process is complicated. Moreover, the extracted features are mostly pieces of shallow vertical and horizontal information, which cannot effectively describe the changes in vehicle features and cannot be widely used. The dynamic feature method increases the complexity of subsequent image-processing operations in cases of extensive background changes, in addition to yielding poor detection results. With recent advancements in deep learning, these conventional methods have gradually been replaced by deep learning techniques.

II. LITERATURE REVIEW
In recent years, deep learning has been widely used in many fields, and good prediction results have been obtained with this method. Compared with traditional methods that require artificial feature determination, the convolutional neural network (CNN) method greatly improves the accuracy of image recognition. Initially, LeCun et al. [14] proposed the LeNet model to solve the problem of recognizing handwritten digits in the banking industry. Krizhevsky et al. [15] proposed AlexNet to improve the traditional CNN by deepening the model architecture and using the ReLU activation function and the dropout layer to increase the effectiveness of the network during learning and prevent overfitting. Szegedy et al. [16] proposed GoogLeNet, which uses multiple filters of different sizes to extract features that enrich feature information. Simonyan and Zisserman [17] proposed two models, namely VGG-16 and VGG-19. They replaced the large convolution kernel by successively using multiple small convolution kernels to perform operations and proved that increasing the depth of a model can improve its accuracy. He et al. [18] proposed the ResNet model. They used residual blocks to solve the problems of gradient disappearance and inability to converge caused by excessive network depth. Howard et al. [19] proposed MobileNet, which uses depthwise separable convolution to extract fewer and more useful features and reduces the number of redundant parameters in a CNN model.

The aforementioned studies have focused on improving the feature-description capabilities of a CNN to extend the application of CNNs to more complex problems, such as object detection. Several researchers [20]–[24] have used region-based CNN (R-CNN) series models to solve the vehicle-detection problem. R-CNN uses the region proposal network (RPN) [25] to extract the position of an object and then classifies it by using a traditional CNN. RetinaNet [26] is the latest network architecture of R-CNN models. The R-CNN framework comprises a two-stage mechanism and uses a multilayer neural network for classification [27], [28]. This architecture substantially increases the number of parameters used and decreases the execution speed; thus, it is unsuitable for real-time detection. To solve this problem, one-stage mechanism methods have been proposed for vehicle detection, such as the you-only-look-once (YOLO) framework model [29]–[31] and the single-shot multibox detector (SSD) [32] framework model. One-stage methods are fast and can detect objects in real time, but their classification accuracy is lower than that of R-CNN methods [33], [34].

The aforementioned object-detection methods have the following problems: 1) Two-stage object-detection methods have high classification accuracy, but their large number of network parameters decreases the detection speed. 2) One-stage object-detection methods have a high real-time detection speed but lower accuracy than two-stage object-detection methods. 3) To increase the number of object categories, the entire network must be retrained, which is time-consuming and reduces the scalability of the method.

Recently, fuzzy neural networks (FNNs) [35]–[39], which combine a human-like fuzzy inference mechanism with the powerful learning functions of neural networks, have been widely used in various fields, such as classification, control, and forecasting. Asim et al. [35] applied an adaptive network-based fuzzy inference system to classification problems. Compared with traditional neural networks, this method yielded higher classification accuracy. Lin et al. [36] used an interval type-2 FNN and tool chips to predict flank wear, and their method
yielded superior prediction results. A few researchers have used a locally recurrent functional link fuzzy neural network [37] and Takagi–Sugeno–Kang-type FNNs [38], [39] to solve system identification and prediction problems, and both methods have yielded good results. In this study, an FNN was embedded into a deep learning network to reduce the number of parameters used in the network and obtain superior classification results. Conventional CNNs use pooling, global pooling [40], and channel pooling [41] methods for feature fusion. Global pooling methods sum the spatial information and perform operations on each feature map to achieve feature fusion and can be divided into global average pooling (GAP) [42] and global max pooling (GMP) [43]. Thus, global pooling methods are more robust to spatial translations of the input and prevent overfitting. Channel pooling methods include channel average pooling (CAP) [44] and channel max pooling (CMP) [45], which perform feature fusion by computing the average or maximum pixel values, respectively, at the same positions in each channel of the feature maps. However, these methods only compress features and do not contain learnable weights, leading to poor classification results. In this study, a new feature fusion method named network mapping was proposed to enhance the utility of feature fusion and to explore the effectiveness of different feature fusion methods.

To design an intelligent traffic-monitoring system with fast execution speed, high classification accuracy, and high category extensibility, a two-stage object-detection method was adopted in this study. The proposed intelligent traffic-monitoring system based on YOLO and a convolutional FNN (CFNN) collects real-time information on traffic volume and vehicle type on the road. In this system, a novel modified YOLOv4-tiny (mYOLOv4-tiny) is first used to detect vehicles and is then combined with a vehicle-counting method to calculate the traffic flow. Furthermore, two effective models (CFNN and Vector-CFNN) and a network mapping fusion method that improve the computational efficiency, classification accuracy, and category extensibility were proposed for vehicle classification. The proposed model architecture has fewer network parameters compared with other models; therefore, the system can achieve real-time, high-accuracy vehicle classification with limited hardware resources and flexible extensibility for different categories.

The contributions of this study can be summarized as follows:
• An intelligent traffic-monitoring system was developed to record real-time information about traffic volume and vehicle types.
• An mYOLOv4-tiny model was proposed to achieve real-time object detection and improve detection efficiency.
• Two effective models (CFNN and Vector-CFNN) that adopt a new network mapping fusion method were implemented to increase the classification accuracy and greatly reduce the number of model parameters.
• Category extensions (e.g., a new vehicle type) only require training of the classification model (CFNN) without retraining of the object-detection model (YOLO). This not only saves substantial training time but also improves the flexibility of category extension.
• The proposed intelligent traffic-monitoring system was implemented on the NVIDIA AGX Xavier embedded platform and applied to Provincial Highway 1 (T362) in Kaohsiung, Taiwan, for real-time vehicle tracking, counting, and classification.

The remainder of this paper is organized as follows: Section III introduces the proposed YOLO-CFNN method for intelligent traffic monitoring. The experimental results of the proposed method are described in Section IV. Section V presents our conclusions and an outline of future work.

III. PROPOSED YOLO-CFNN FOR INTELLIGENT TRAFFIC MONITORING
In this section, the proposed intelligent traffic-monitoring system is introduced. The proposed system has three functions, namely (1) vehicle detection, (2) vehicle counting, and (3) vehicle classification. The system architecture is illustrated in Fig. 1.

FIGURE 1. Three functions of the proposed intelligent traffic-monitoring system.

A flowchart of the proposed intelligent traffic-monitoring system is presented in Fig. 2. First, real-time road images are obtained from traffic cameras. Then, the proposed mYOLOv4-tiny model is used to detect the position of a vehicle. To solve the problem of the repeated recording of the same car as different vehicles in different frames,

FIGURE 2. Flowchart of the proposed intelligent traffic-monitoring system.
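As a rough illustration of the counting step in the flowchart, the following is a minimal Python sketch. It assumes an IoU-based assignment of persistent track IDs (the reference list cites SORT [46], but the exact tracker and zone geometry used in the paper are not shown in this excerpt) and a hypothetical horizontal virtual detection line: a tracked vehicle is counted once, under its predicted class, when its center first crosses the line. The helper names (ZoneCounter, line_y) are illustrative, not taken from the paper.

from collections import defaultdict

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / (union + 1e-9)

class ZoneCounter:
    """Hypothetical per-class counter: each track ID is counted once when
    its box center crosses a horizontal virtual detection line."""
    def __init__(self, line_y, iou_threshold=0.3):
        self.line_y = line_y
        self.iou_threshold = iou_threshold
        self.tracks = {}                # track_id -> box from the previous frame
        self.counted = set()            # track IDs that have already been counted
        self.counts = defaultdict(int)  # class name -> count
        self.next_id = 0

    def update(self, detections):
        """detections: list of (box, class_name) produced by the detector."""
        unmatched = dict(self.tracks)
        new_tracks = {}
        for box, cls in detections:
            # Greedy IoU matching against tracks from the previous frame.
            best_id, best_iou = None, self.iou_threshold
            for tid, prev_box in unmatched.items():
                overlap = iou(box, prev_box)
                if overlap > best_iou:
                    best_id, best_iou = tid, overlap
            if best_id is None:                  # new vehicle entering the scene
                best_id, self.next_id = self.next_id, self.next_id + 1
            else:
                del unmatched[best_id]
            new_tracks[best_id] = box
            # Count once when the box center passes the line (assumes traffic
            # moves toward the bottom of the frame).
            if best_id not in self.counted and 0.5 * (box[1] + box[3]) >= self.line_y:
                self.counted.add(best_id)
                self.counts[cls] += 1
        self.tracks = new_tracks
        return dict(self.counts)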
in Fig. 5. In the CFNN model in Fig. 5(a), at the outset, the convolutional layer is used to extract features from the image, and the max-pooling layer is then used to compress these features to reduce the amount of calculation. Convolutional and pooling layers are stacked alternately to increase the model depth and obtain various shape-feature combinations, and a feature fusion layer is added to reduce the dimensionality of the features and integrate their information. Finally, the fused feature information is sent to the FNN for classification to obtain the vehicle-type classification result. To solve the problem of multiple redundant parameters in the traditional CNN model, this study proposes a Vector-CFNN model (Fig. 5(b)). The architecture of this model is similar to that of the CFNN, but the traditional convolutional layer is replaced with a two-layer vector-kernel convolutional layer [47] to further reduce the number of parameters and the computational complexity of the model.

FIGURE 5. Schematic of the proposed network architecture: (a) CFNN model and (b) Vector-CFNN model.

Next, the feature fusion layer and the FNN classifier used in the proposed models are explained.

1) FEATURE FUSION LAYER
In the feature fusion layer, different fusion methods can be used to integrate different types of feature information to obtain more useful features. Given a large number of input features, a suitable fusion method is selected to compress the features and reduce the dimensionality of the information between them. For method selection, the features are fused using either pooling operations or network mapping. Based on the different operation rules between features, different fusion results can be obtained, as summarized in Table 1.

TABLE 1. Different fusion methods.

In this study, a network mapping fusion method is proposed. This method assigns a weight to the information of each extracted feature and then integrates these weighted values to obtain new features. The calculation method is shown in Fig. 6, and the calculation formula is as follows:

f_z = \sum_{i=1}^{n} w_{zi} x_i   (1)

where f_z is the output of the zth fusion, n is the total number of input features, x_i is the ith input feature element, and w_{zi} is the ith input weight used in the zth fusion result.

FIGURE 6. Schematic of the network mapping fusion method.

2) FNNS
FNNs mimic human logical thinking and learning abilities. In terms of network design, an FNN can be divided into the input layer, fuzzification layer, rule layer, and defuzzification layer. The fuzzy sets are contained in the fuzzification layer, and their members can have different degrees of membership on the interval [0, 1]; this is known as a membership function. The fuzzy membership function converts input data to a value in [0, 1] based on the degree of membership of a specified set, providing a measure of the degree of similarity of an element to a fuzzy set. Common fuzzy membership functions include triangular, trapezoidal, bell-shaped, and Gaussian functions; among these, the Gaussian membership function has the highest accuracy [48]. Therefore, the Gaussian function is adopted as the membership function in the proposed CFNN. The feature vectors extracted by the convolution operations are classified by an FNN. If–then rules are used to represent the fuzzy rules and perform fuzzy inference (Fig. 7).
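Eq. (1) describes each fused output as a learnable weighted sum over all input feature elements. The following is a minimal Keras sketch of such a layer under that reading; flattening the feature maps before the mapping and the output size of 64 are illustrative assumptions, since the exact arrangement of Fig. 6 is not reproduced in this excerpt.

import tensorflow as tf

class NetworkMappingFusion(tf.keras.layers.Layer):
    """Learnable fusion per Eq. (1): each output f_z is a weighted sum of all
    input feature elements x_i, with weights w_zi learned during training."""
    def __init__(self, num_outputs, **kwargs):
        super().__init__(**kwargs)
        self.flatten = tf.keras.layers.Flatten()
        # A bias-free dense layer is exactly a set of learnable weighted sums.
        self.mapping = tf.keras.layers.Dense(num_outputs, use_bias=False)

    def call(self, feature_maps):
        x = self.flatten(feature_maps)   # input feature elements x_1 ... x_n
        return self.mapping(x)           # f_z = sum_i w_zi * x_i

# Illustrative use: fuse a 14 x 14 x 64 feature map into 64 fused features.
fused = NetworkMappingFusion(64)(tf.zeros((1, 14, 14, 64)))
print(fused.shape)  # (1, 64)

Unlike global or channel pooling, this mapping keeps trainable weights, which is the property the text identifies as missing from the pooling-based fusion methods.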
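For the FNN classifier, a minimal sketch is given below, assuming a zero-order Takagi–Sugeno–Kang-style formulation: Gaussian membership functions fuzzify each input feature, rule firing strengths are products of memberships, and defuzzification is a normalized weighted sum of rule consequents. The number of rules and the consequent form are assumptions; the paper's exact rule and defuzzification definitions (Fig. 7) are not shown in this excerpt.

import tensorflow as tf

class FuzzyNeuralClassifier(tf.keras.layers.Layer):
    """Sketch of an FNN head: Gaussian fuzzification, product rule layer,
    and normalized weighted-sum defuzzification (zero-order TSK assumption)."""
    def __init__(self, num_rules, num_classes, **kwargs):
        super().__init__(**kwargs)
        self.num_rules = num_rules
        self.num_classes = num_classes

    def build(self, input_shape):
        dim = int(input_shape[-1])
        # Gaussian membership parameters: one center and width per rule and input.
        self.centers = self.add_weight(name="centers", shape=(self.num_rules, dim),
                                       initializer="random_normal")
        self.widths = self.add_weight(name="widths", shape=(self.num_rules, dim),
                                      initializer="ones")
        # Rule consequents mapping firing strengths to class scores.
        self.consequents = self.add_weight(name="consequents",
                                           shape=(self.num_rules, self.num_classes),
                                           initializer="random_normal")

    def call(self, x):
        x = tf.expand_dims(x, axis=1)                      # (batch, 1, dim)
        # Fuzzification layer: Gaussian membership degree in [0, 1].
        membership = tf.exp(-tf.square(x - self.centers) /
                            (2.0 * tf.square(self.widths) + 1e-8))
        # Rule layer: firing strength is the product of memberships (fuzzy AND).
        firing = tf.reduce_prod(membership, axis=-1)       # (batch, num_rules)
        # Defuzzification layer: normalized weighted sum of rule consequents.
        weights = firing / (tf.reduce_sum(firing, axis=-1, keepdims=True) + 1e-8)
        return tf.matmul(weights, self.consequents)        # (batch, num_classes)

# Illustrative use on a 64-dimensional fused feature vector and 6 classes.
logits = FuzzyNeuralClassifier(num_rules=8, num_classes=6)(tf.zeros((1, 64)))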
A. EXPERIMENTAL DESIGN
To evaluate the output results of the model, this study used the category with the highest model output value (top-1) as the classification result and accuracy as the evaluation indicator. The calculation formula is as follows:

accuracy = \frac{TP + TN}{TP + FP + TN + FN}   (2)

where TP, FP, TN, and FN denote true positives, false positives, true negatives, and false negatives, respectively. The mean average precision (mAP), precision, recall, F-measure (F1), and detection speed (FPS) were also adopted to verify the effectiveness of the various object-detection models. The evaluation indicators can be calculated as follows:

mAP = \frac{\sum_{k=1}^{n} AP_k}{n}   (3)

precision = \frac{TP}{TP + FP}   (4)

recall = \frac{TP}{TP + FN}   (5)

F1 = \frac{2 \times precision \times recall}{precision + recall}   (6)

FPS = \frac{frames}{second}   (7)

Here, n indicates the number of classes, and AP_k denotes the average precision (AP) of class k. In the experimental environment, TensorFlow and Keras were used as the deep learning environment and development tool, respectively, and an RTX 2080 Ti graphics card was used to train the network models. The parameter settings of the proposed CFNN and Vector-CFNN models are summarized in Tables 2 and 3, respectively.

In the CFNN model, the input image size is set to 224 × 224 × 3, and four sets of convolutional and pooling layers are used to achieve feature extraction. Each convolutional layer uses a 3 × 3 (see Table 2), 3 × 1, or 1 × 3 (see Table 3) convolution kernel to extract features. Each feature map is compressed through a max-pooling layer of size 2 × 2 to reduce the computational load. In the convolutional layers, 32, 64, and 128 convolution kernels are used in the first three layers to extract various shape-feature combinations. Then, the number of convolution kernels in the last layer is set to 64, and the feature fusion layer
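A minimal Keras sketch of a backbone following the parameter description above (224 × 224 × 3 input, four convolution plus max-pooling blocks with 32, 64, 128, and 64 kernels) is given below. The activation choices, the network-mapping output size, the classifier head, and the class count are illustrative assumptions that stand in for the paper's fusion layer and FNN; switching vector_kernels to True replaces each 3 × 3 convolution with a 3 × 1 followed by a 1 × 3 convolution, in the spirit of the vector-kernel layers used by Vector-CFNN [47].

import tensorflow as tf

def build_cfnn_backbone(num_classes=6, fused_features=64, vector_kernels=False):
    """Sketch of the feature extractor described in the text; the fusion and
    classifier heads below are only illustrative stand-ins."""
    def conv_block(x, filters):
        if vector_kernels:
            # Vector-CFNN variant: 3x1 then 1x3 kernels instead of one 3x3 [47].
            x = tf.keras.layers.Conv2D(filters, (3, 1), padding="same", activation="relu")(x)
            x = tf.keras.layers.Conv2D(filters, (1, 3), padding="same", activation="relu")(x)
        else:
            x = tf.keras.layers.Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
        return tf.keras.layers.MaxPooling2D((2, 2))(x)

    inputs = tf.keras.Input(shape=(224, 224, 3))
    x = inputs
    for filters in (32, 64, 128, 64):        # four convolution + pooling blocks
        x = conv_block(x, filters)
    # Stand-in for the network mapping fusion layer (learnable weighted sums).
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(fused_features, use_bias=False, name="network_mapping")(x)
    # Stand-in for the FNN classifier head.
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)

model = build_cfnn_backbone()
model.summary()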
TABLE 4. Number of each vehicle type.

TABLE 5. Experimental results of the CFNN and Vector-CFNN models with various feature fusion methods.

segmented (Fig. 8), and the vehicle types and numbers after segmentation are listed in Table 4.

In the training and testing of the model, according to the processing method described in [49], 200 vehicles were randomly selected from each category to form the training and testing data. In total, 2400 images each were used as the training and test datasets for the experiment. Ten experiments were performed using these datasets, and the average of the values obtained in these experiments was used for evaluation. This study used different fusion methods to evaluate the performance of the proposed CFNN and Vector-CFNN models. The experimental results are listed in Table 5. The accuracies of the CFNN and Vector-CFNN models reached 90.20% and 90.45%, respectively, with the network mapping fusion method. Compared with the global pooling and channel pooling methods, the proposed network mapping fusion method has higher accuracy.

Moreover, the two proposed models were compared with other common models, namely AlexNet, GoogLeNet, VGG-16, VGG-19, ResNet50, Sparse Laplacian CNN [49], and PCN-Net [50]. The experimental comparison results are summarized in Table 6. According to the table, the accuracy of the two CFNN models is higher than that of the other deep learning classification methods. The accuracy of the CFNN and Vector-CFNN models is 0.89% and 1.93% higher, respectively, than that of PCN-Net, and 51.7% and 57.1% fewer parameters are used in the CFNN and Vector-CFNN models, respectively, than in PCN-Net.

C. VEHICLE CLASSIFICATION RESULTS ON THE GRAM-RTM DATA SET
The GRAM-RTM (M-30) data set [51] was used to compare the performance of the proposed YOLO-CFNN and state-of-the-art object detection methods, including RetinaNet, SSD, YOLOv4, and YOLOv4-tiny. The M-30 contains 7520 frames with a resolution of 800 × 480 at 30 fps recorded using a
TABLE 8. Vehicle classification results for the proposed YOLO-CFNN with various object detection methods.

TABLE 11. Vehicle type classification results obtained using CFNN and Vector-CFNN with different fusion methods.

TABLE 13. Traffic flow counting results obtained using actual road traffic videos at 7:00.

TABLE 14. Traffic flow counting results obtained using actual road traffic videos at 17:00.

TABLE 15. Traffic flow counting results obtained using actual road traffic videos for rainy conditions.

FIGURE 14. Precision versus recall curves of the various detection methods using actual road traffic videos at 7:00.

FIGURE 15. Precision versus recall curves of the various detection methods using actual road traffic videos at 17:00.

FIGURE 16. Precision versus recall curves of the various detection methods using actual road traffic videos for rainy conditions.

traffic scene was used for verification. Three actual road traffic videos were used to evaluate the proposed vehicle-counting method. Each video was 5 min long; two of the videos were recorded at 07:00 and 17:00, and the remaining video was taken in rainy conditions. Still images from the three videos are displayed in Fig. 11.

In the evaluation, the proposed vehicle flow counting result was divided by the manual counting result to determine the accuracy of vehicle counting. In addition, different occlusion conditions were included in the real road scene, as presented in Fig. 12. As shown in Fig. 12, a larger bus blocks a car, resulting in a missed count. The visual vehicle detection and counting results are shown in Fig. 13. The text in the first half of the green label in Fig. 13 represents the type of vehicle, and the text in the second half represents the number of counts. When a vehicle enters the virtual detection zone, the proposed intelligent traffic-monitoring system immediately performs vehicle classification and counting.

The traffic flow counting results of each video are summarized in Tables 13–15. The precision versus recall curves of the proposed YOLO-CFNN and YOLO-VCFNN models are shown in Figs. 14–16. As shown in Table 13, the mAP of RetinaNet and SSD was 94%, but their F1 scores were only 76.06% and 86.28%, respectively. The mAP and F1 score of YOLOv4 were 88.82% and 85.55%, respectively. However, the mAP for trailers was only 64.44%. Although YOLOv4-tiny has a detection speed of 145 FPS, its motorcycle detection performance was poor (65.49%). The proposed YOLO-CFNN and YOLO-VCFNN are superior to the other methods in terms of F1 score (99%). After introducing the counting method into CFNN and VCFNN, the FPS can be maintained above 30 to achieve real-time detection. The two proposed methods also had an accuracy of 97.05% in traffic flow vehicle counting.

For the afternoon road traffic video (Table 14), the mAP and F1 of YOLO-CFNN and YOLO-VCFNN were higher than those of the other methods. The accuracy of flow counting was 98.5%. For the rain video (Table 15), except for the SSD method, the mAP of motorcycle detection was lower because images captured in rainy conditions are blurry,
affecting the judgment results. However, the mAP and F1 of the two proposed methods were higher than 90%, and the counting accuracy was 100%. These scenarios reveal that the proposed intelligent traffic-monitoring system is suitable for real-time vehicle counting in actual environments and has a high counting accuracy.

V. CONCLUSION
In this study, an intelligent traffic-monitoring system was proposed to calculate traffic flows and classify vehicle types. The major contributions of this study are as follows:
• A novel intelligent traffic-monitoring system combining a YOLOv4-tiny model and a counting method was proposed for traffic volume statistics and vehicle type classification.
• The proposed CFNN and Vector-CFNN were designed by introducing the fusion method and FNN, which can not only effectively reduce the number of network parameters but also enhance the classification accuracy.
• The proposed network mapping fusion method was superior to the commonly used pooling methods, and it could effectively integrate image features and improve the classification accuracy.
• Compared with the current state-of-the-art object detection methods (RetinaNet, SSD, YOLOv4, and YOLOv4-tiny), the proposed YOLO-CFNN and YOLO-VCFNN have a high mAP, accurate counting, and real-time vehicle counting and classification ability (over 30 FPS).

The experimental results indicated that the performance of the proposed CFNN and Vector-CFNN models was superior to that of common deep learning models. On the BIT dataset, compared with the pooling methods, the proposed network mapping fusion method improved the recognition accuracy by 3.59%–5.92%. In addition, compared with the PCN-Net model, the proposed CFNN and Vector-CFNN models improved the accuracy by 1.93% and reduced the number of parameters by 57.1%. On the GRAM-RTM data set, the mAP and F1 of the two proposed vehicle classification methods were 99%, higher than those of other methods. In addition, among the FPS indicators, the proposed method was 1.65 times faster than the traditional YOLOv4. On the T362 vehicle type dataset, compared with the general pooling methods, the accuracy of the proposed network mapping fusion method was 2.3%–5.36% higher. In addition, compared with the AlexNet model, the accuracy of the proposed CFNN and Vector-CFNN models was 1.19% and
1.83% higher, respectively, and the number of parameters decreased by 98.8%. In three actual road traffic scenarios, the proposed YOLO-CFNN and YOLO-VCFNN methods yielded a high F1 score for vehicle classification and high accuracy for vehicle counting. In summary, the CFNN and Vector-CFNN models proposed in this study not only have favorable vehicle classification effects but also have fewer parameters relative to other models. Therefore, the proposed models are suitable for information analysis in environments with limited hardware performance.

In terms of the extensibility of the proposed models, many factors that affect the machining accuracy of machine tools in intelligent manufacturing have been identified, such as temperature and tool wear. Therefore, developing an accurate model of the effects of these factors is crucial. In future studies, the proposed CFNN and Vector-CFNN models and the network mapping fusion method will be applied for modeling in intelligent manufacturing.

REFERENCES
[1] A. Mohamed, A. Issam, B. Mohamed, and B. Abdellatif, "Real-time detection of vehicles using the Haar-like features and artificial neuron networks," Proc. Comput. Sci., vol. 73, pp. 24–31, Jan. 2015.
[2] X. Wen, L. Shao, W. Fang, and Y. Xue, "Efficient feature selection and classification for vehicle detection," IEEE Trans. Circuits Syst. Video Technol., vol. 25, no. 3, pp. 508–517, Mar. 2015.
[3] Z. Sun, G. Bebis, and R. Miller, "On-road vehicle detection using Gabor filters and support vector machines," in Proc. 14th Int. Conf. Digit. Signal Process. (DSP), Jul. 2002, pp. 1019–1022.
[4] H. David and T. A. Athira, "Improving the performance of vehicle detection and verification by log Gabor filter optimization," in Proc. 4th Int. Conf. Adv. Comput. Commun., Aug. 2014, pp. 50–55.
[5] Y. Wei, Q. Tian, J. Guo, W. Huang, and J. Cao, "Multi-vehicle detection algorithm through combining Harr and HOG features," Math. Comput. Simul., vol. 155, pp. 130–145, Jan. 2018.
[6] S. Bougharriou, F. Hamdaoui, and A. Mtibaa, "Linear SVM classifier based HOG car detection," in Proc. 18th Int. Conf. Sci. Techn. Autom. Control Comput. Eng. (STA), Dec. 2017, pp. 241–245.
[7] G. Yan, M. Yu, Y. Yu, and L. Fan, "Real-time vehicle detection using histograms of oriented gradients and AdaBoost classification," Optik, vol. 127, no. 19, pp. 7941–7951, 2016.
[8] N. Seenouvong, U. Watchareeruetai, C. Nuthong, K. Khongsomboon, and N. Ohnishi, "A computer vision based vehicle detection and counting system," in Proc. 8th Int. Conf. Knowl. Smart Technol. (KST), Feb. 2016, pp. 224–227.
[9] P. K. Bhaskar and S.-P. Yong, "Image processing based vehicle detection and tracking method," in Proc. Int. Conf. Comput. Inf. Sci. (ICCOINS), Jun. 2014, pp. 1–5.
[10] N. Seenouvong, U. Watchareeruetai, C. Nuthong, K. Khongsomboon, and N. Ohnishi, "Vehicle detection and classification system based on virtual detection zone," in Proc. 13th Int. Joint Conf. Comput. Sci. Softw. Eng. (JCSSE), Jul. 2016, pp. 1–5.
[11] M. Anandhalli and V. P. Baligar, "Improvised approach using background subtraction for vehicle detection," in Proc. IEEE Int. Advance Comput. Conf. (IACC), Jun. 2015, pp. 303–308.
[12] N. S. Sakpal and M. Sabnis, "Adaptive background subtraction in images," in Proc. Int. Conf. Adv. Commun. Comput. Technol. (ICACCT), Feb. 2018, pp. 439–444.
[13] N. Shah, A. Pingale, V. Patel, and N. V. George, "An adaptive background subtraction scheme for video surveillance systems," in Proc. IEEE Int. Symp. Signal Process. Inf. Technol. (ISSPIT), Dec. 2017, pp. 13–17.
[14] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998.
[15] A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. 25th Int. Conf. Neural Inf. Process. Syst. (NIPS), vol. 1, Dec. 2012, pp. 1097–1105.
[16] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 1–9.
[17] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, arXiv:1409.1556.
[18] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 770–778.
[19] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, "MobileNets: Efficient convolutional neural networks for mobile vision applications," 2017, arXiv:1704.04861.
[20] K. Shi, H. Bao, and N. Ma, "Forward vehicle detection based on incremental learning and fast R-CNN," in Proc. 13th Int. Conf. Comput. Intell. Secur. (CIS), Dec. 2017, pp. 73–76.
[21] S.-C. Hsu, C.-L. Huang, and C.-H. Chuang, "Vehicle detection using simplified fast R-CNN," in Proc. Int. Workshop Adv. Image Technol. (IWAIT), Jan. 2018, pp. 1–3.
[22] S. Rujikietgumjorn and N. Watcharapinchai, "Vehicle detection with sub-class training using R-CNN for the UA-DETRAC benchmark," in Proc. 14th IEEE Int. Conf. Adv. Video Signal Based Surveill. (AVSS), Aug. 2017, pp. 1–5.
[23] W. Zhang, Y. Zheng, Q. Gao, and Z. Mi, "Part-aware region proposal for vehicle detection in high occlusion environment," IEEE Access, vol. 7, pp. 100383–100393, 2019.
[24] L. Wang, Y. Lu, H. Wang, Y. Zheng, H. Ye, and X. Xue, "Evolving boxes for fast vehicle detection," in Proc. IEEE Int. Conf. Multimedia Expo. (ICME), Jul. 2017, pp. 1135–1140.
[25] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1137–1149, Jun. 2017.
[26] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollar, "Focal loss for dense object detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 2, pp. 318–327, Feb. 2020.
[27] N. A. Al-Sammarraie, Y. M. H. Al-Mayali, and Y. A. Baker El-Ebiary, "Classification and diagnosis using back propagation artificial neural networks (ANN)," in Proc. Int. Conf. Smart Comput. Electron. Enterprise (ICSCEE), Jul. 2018, pp. 1–5.
[28] O. I. Abiodun, A. Jantan, A. E. Omolara, K. V. Dada, N. A. Mohamed, and H. Arshad, "State-of-the-art in artificial neural network applications: A survey," Heliyon, vol. 4, no. 11, Nov. 2018, Art. no. e00938.
[29] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," 2018, arXiv:1804.02767.
[30] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: Optimal speed and accuracy of object detection," 2020, arXiv:2004.10934.
[31] Z. Jiang, L. Zhao, S. Li, and Y. Jia, "Real-time object detection method based on improved YOLOv4-tiny," 2020, arXiv:2011.04244.
[32] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Fu, and A. C. Berg, "SSD: Single shot multibox detector," in Proc. Eur. Conf. Comput. Vis., Amsterdam, The Netherlands, Oct. 2016, pp. 21–37.
[33] J. Huang, V. Rathod, C. Sun, M. Zhu, A. Korattikara, A. Fathi, I. Fischer, Z. Wojna, Y. Song, S. Guadarrama, and K. Murphy, "Speed/accuracy trade-offs for modern convolutional object detectors," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 3296–3297.
[34] P. Soviany and R. T. Ionescu, "Optimizing the trade-off between single-stage and two-stage deep object detectors using image difficulty prediction," in Proc. 20th Int. Symp. Symbolic Numeric Algorithms Sci. Comput. (SYNASC), Sep. 2018, pp. 209–214.
[35] Y. Asim, B. Raza, A. K. Malik, A. R. Shahid, M. Faheem, and Y. J. Kumar, "A hybrid adaptive neuro-fuzzy inference system (ANFIS) approach for professional bloggers classification," in Proc. 22nd Int. Multitopic Conf. (INMIC), Nov. 2019, pp. 1–6.
[36] C.-J. Lin, J.-Y. Jhang, S.-H. Chen, and K.-Y. Young, "Using an interval type-2 fuzzy neural network and tool chips for flank wear prediction," IEEE Access, vol. 8, pp. 122626–122640, 2020.
[37] D. K. Bebarta, R. Bisoi, and P. K. Dash, "Locally recurrent functional link fuzzy neural network and unscented H-infinity filter for short-term prediction of load time series in energy markets," in Proc. IEEE Power, Commun. Inf. Technol. Conf. (PCITC), Oct. 2015, pp. 663–670.
[38] J.-W. Yeh and S.-F. Su, "Efficient approach for RLS type learning in TSK neural fuzzy systems," IEEE Trans. Cybern., vol. 47, no. 9, pp. 2343–2352, Sep. 2017.
[39] C.-J. Lin, C.-H. Lin, and J.-Y. Jhang, "Dynamic system identification and prediction using a self-evolving Takagi–Sugeno–Kang-type fuzzy CMAC network," Electronics, vol. 9, no. 4, p. 631, Apr. 2020.
[40] M. Lin, Q. Chen, and S. Yan, "Network in network," 2013, arXiv:1312.4400.
[41] Z. Ma, D. Chang, J. Xie, Y. Ding, S. Wen, X. Li, Z. Si, and J. Guo, "Fine-grained vehicle classification with channel max pooling modified CNNs," IEEE Trans. Veh. Technol., vol. 68, no. 4, pp. 3224–3233, Apr. 2019.
[42] V. Christlein, L. Spranger, M. Seuret, A. Nicolaou, P. Kral, and A. Maier, "Deep generalized max pooling," in Proc. Int. Conf. Document Anal. Recognit. (ICDAR), Sep. 2019, pp. 1090–1096.
[43] Z. Li, S.-H. Wang, R.-R. Fan, G. Cao, Y.-D. Zhang, and T. Guo, "Teeth category classification via seven-layer deep convolutional neural network with max pooling and global average pooling," Int. J. Imag. Syst. Technol., vol. 29, no. 4, pp. 577–583, May 2019.
[44] Z. Gao, Y. Li, Y. Yang, N. Dong, X. Yang, and C. Grebogi, "A coincidence-filtering-based approach for CNNs in EEG-based recognition," IEEE Trans. Ind. Informat., vol. 16, no. 11, pp. 7159–7167, Nov. 2020.
[45] L. Cheng, D. Chang, J. Xie, R. Ma, C. Wu, and Z. Ma, "Channel max pooling for image classification," in Intelligence Science and Big Data Engineering. Visual Data Engineering, Z. Cui, J. Pan, S. Zhang, L. Xiao, and J. Yang, Eds. Cham, Switzerland: Springer, 2019, pp. 273–284.
[46] A. Bewley, Z. Ge, L. Ott, F. Ramos, and B. Upcroft, "Simple online and realtime tracking," in Proc. IEEE Int. Conf. Image Process. (ICIP), Sep. 2016, p. 346.
[47] J. Ou and Y. Li, "Vector-kernel convolutional neural networks," Neurocomputing, vol. 330, pp. 253–258, Feb. 2019.
[48] N. Talpur, M. N. M. Salleh, and K. Hussain, "An investigation of membership functions on performance of ANFIS for solving classification problems," IOP Conf. Ser., Mater. Sci. Eng., vol. 226, Aug. 2017, Art. no. 012103.
[49] Z. Dong, Y. Wu, M. Pei, and Y. Jia, "Vehicle type classification using a semisupervised convolutional neural network," IEEE Trans. Intell. Transp. Syst., vol. 16, no. 4, pp. 2247–2256, Aug. 2015.
[50] F. C. Soon, H. Y. Khaw, J. H. Chuah, and J. Kanesan, "Semisupervised PCA convolutional network for vehicle type classification," IEEE Trans. Veh. Technol., vol. 69, no. 8, pp. 8267–8277, Aug. 2020.
[51] R. Guerrero-Gómez-Olmedo, R. J. López-Sastre, S. Maldonado-Bascón, and A. Fernández-Caballero, "Vehicle tracking by simultaneous detection and viewpoint estimation," in Natural and Artificial Computation in Engineering and Medical Applications, J. M. F. Vicente, J. R. Sánchez, F. de la Paz López, and F. J. T. Moreo, Eds. Berlin, Germany: Springer, 2013, pp. 306–316.

CHENG-JIAN LIN (Senior Member, IEEE) received the B.S. degree in electrical engineering from the Tatung Institute of Technology, Taipei, Taiwan, in 1986, and the M.S. and Ph.D. degrees in electrical and control engineering from the National Chiao Tung University, Taiwan, in 1991 and 1996, respectively. Currently, he is a Chair Professor with the Computer Science and Information Engineering Department, National Chin-Yi University of Technology, Taichung, Taiwan, and the Dean of the Intelligence College, National Taichung University of Science and Technology, Taichung. His current research interests include machine learning, pattern recognition, intelligent control, image processing, intelligent manufacturing, and evolutionary robots.

JYUN-YU JHANG received the B.S. and M.S. degrees from the Department of Computer Science and Information Engineering, National Chin-Yi University of Technology, Taichung, Taiwan, in 2015, and the Ph.D. degree in electrical and control engineering from the National Yang Ming Chiao Tung University, Taiwan, in 2021. He is currently an Assistant Professor with the Computer Science and Information Engineering Department, National Taichung University of Science and Technology, Taichung. His current research interests include fuzzy logic theory, type-2 neural fuzzy systems, evolutionary computation, machine learning, and computer vision and applications.