
# Deep Learning-Based Road Traffic Noise Annoyance Assessment


This section performs a secondary optimization of either the overall model parameters or only the decoder parameters; the two strategies are denoted total-tuning and fine-tuning, respectively. The training dataset is the complete listening experiment dataset. Because the value ranges differ, during the secondary optimization the model output is empirically multiplied by eight to match the annoyance range. To compare performance, artificial neural networks, linear regression, Lasso regression, Ridge regression, and a directly trained model (denoted direct) are used as comparison algorithms. Unlike the deep learning-based algorithms, the machine learning algorithms (the artificial neural network and the regression algorithms) require feature engineering first. In this paper, the input is the amplitude spectrum, and principal component analysis (PCA) is used for feature dimension reduction before the post-processed feature vectors are fed to these algorithms. The PCA retains the first 28 principal components per channel, giving 56 features in total (left and right channels). The artificial neural network has nonlinear fitting capability; the network built here consists of five fully connected linear layers, each followed by a LeakyReLU activation. The regression algorithms are common analysis methods [16,17,50]. Since it is difficult to directly determine the relationship among the post-processed features, this paper uses linear regression, Lasso regression, and Ridge regression as baseline comparison algorithms for evaluating noise annoyance. The MAE is chosen to measure the error between the evaluated and true results, while Pearson's correlation coefficient (Formula (13), denoted PCC) and Spearman's correlation coefficient (Formula (14), denoted SCC) measure the correlation between the model's evaluation results and the true results.
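As an illustration of this feature-engineering step, the sketch below reduces per-channel amplitude spectra to their first 28 principal components and concatenates both channels into a 56-dimensional feature vector. The sample count, spectrum size, and random data are hypothetical, not taken from the paper.

```python
import numpy as np

def pca_features(spectra, n_components=28):
    """Project amplitude spectra onto their first n_components
    principal directions. `spectra` has shape (n_samples, n_bins)."""
    X = spectra - spectra.mean(axis=0)           # center each frequency bin
    # SVD of the centered data matrix; rows of Vt are the principal directions
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:n_components].T               # shape (n_samples, n_components)

# Hypothetical data: 100 stimuli with 512-bin amplitude spectra per channel
rng = np.random.default_rng(0)
left = pca_features(rng.random((100, 512)))
right = pca_features(rng.random((100, 512)))
features = np.hstack([left, right])              # 28 + 28 = 56 features per stimulus
print(features.shape)                            # (100, 56)
```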


$$\mathrm{PCC} = \frac{\mathrm{cov}(x, y)}{\sigma_x \, \sigma_y} \qquad (13)$$

where $x$ and $y$ are two different arrays, $\mathrm{cov}(\cdot,\cdot)$ is the covariance, $\sigma_x$ is the standard deviation of $x$, and $\sigma_y$ is the standard deviation of $y$.
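Formula (13) can be computed directly; a minimal sketch, with illustrative ratings that are not the paper's data:

```python
import numpy as np

def pcc(x, y):
    """Pearson correlation coefficient: cov(x, y) / (sigma_x * sigma_y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = ((x - x.mean()) * (y - y.mean())).mean()   # population covariance
    return cov / (x.std() * y.std())

# Hypothetical true annoyance ratings vs. model outputs
true_scores = [2.0, 4.5, 5.0, 6.5, 8.0]
predicted   = [2.3, 4.1, 5.4, 6.2, 7.8]
print(round(pcc(true_scores, predicted), 3))
```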


$$\mathrm{SCC} = \frac{\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{N}(x_i-\bar{x})^2}\cdot\sqrt{\sum_{i=1}^{N}(y_i-\bar{y})^2}} \qquad (14)$$

where $x$ and $y$ are two different arrays, $N$ is the total number of elements in each array, $\bar{x}$ is the mean value of $x$, and $\bar{y}$ is the mean value of $y$.
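In practice, Spearman's coefficient is obtained by evaluating Formula (14) on the rank-transformed arrays, which makes it sensitive only to monotone agreement. A minimal sketch (the tie-averaging ranking scheme is the standard convention, assumed here rather than stated in the paper):

```python
import numpy as np

def ranks(a):
    """1-based ranks; tied values receive the mean of their positions."""
    a = np.asarray(a, dtype=float)
    r = np.empty(len(a))
    r[a.argsort()] = np.arange(1, len(a) + 1)
    for v in np.unique(a):            # average the ranks of tied entries
        r[a == v] = r[a == v].mean()
    return r

def scc(x, y):
    """Spearman correlation: Formula (14) evaluated on the ranks of x and y."""
    dx = ranks(x) - ranks(x).mean()
    dy = ranks(y) - ranks(y).mean()
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# Any strictly monotone relation yields SCC = 1, even when PCC would not
print(scc([1.0, 2.0, 3.0, 4.0], [1.0, 4.0, 9.0, 16.0]))   # 1.0
```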

A comparison of the evaluated performance of each algorithm over different annoyance intervals is shown in Table 5. On the entire mixed dataset, the mean absolute error of direct is just 0.57, reducing the mean error of the regression algorithms and the neural network by about 30%; overall, the deep learning models perform best. In the annoyance intervals [2,5) and [6,9), total-tuning, fine-tuning, and direct all obtain better results than the regression algorithms and the artificial neural network. In the interval [5,6), however, the regression algorithms have the smallest evaluation error, better than direct, total-tuning, and fine-tuning, and within [6,7) the artificial neural network gives the most accurate assessment, with an error of 0.26. The main reason for this is the uneven distribution of samples across intervals: both the regression algorithms and the artificial neural network prioritize the features present in the majority of samples, and accounting for the features of the few samples in sparse intervals is difficult, which leads to large errors there. Although their accuracy is high in the well-populated intervals, their overall performance remains unsatisfactory, reflecting a severe over-fitting issue. Direct, total-tuning, and fine-tuning use deep learning models with strong feature-learning capability and perform better than the regression and artificial neural network algorithms, but their results differ because of their different training methods.
Although direct shows a significant improvement in the interval [2,3) compared to the regression algorithms and the artificial neural network, total-tuning and fine-tuning use transfer learning to further reduce the evaluation error and outperform direct in all annoyance intervals. In terms of the difference between the maximum and minimum assessment errors across annoyance intervals, direct reaches 1.21 (maximum in interval [2,3), minimum in interval [4,5)), while fine-tuning and total-tuning reach only 0.45 and 0.38, respectively; transfer learning thus greatly improves the robustness of the assessment.
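The difference between the two transfer strategies can be sketched as selective parameter updates: total-tuning moves every parameter group, while fine-tuning freezes the encoder and updates only the decoder. The group names, sizes, and update rule below are illustrative, not the paper's implementation.

```python
import numpy as np

def sgd_step(params, grads, lr=0.1, trainable=("encoder", "decoder")):
    """One gradient step over the groups named in `trainable`;
    every other parameter group is left frozen."""
    return {name: w - lr * grads[name] if name in trainable else w.copy()
            for name, w in params.items()}

params = {"encoder": np.ones(4), "decoder": np.ones(2)}
grads  = {"encoder": np.full(4, 0.5), "decoder": np.full(2, 0.5)}

total = sgd_step(params, grads)                           # total-tuning: all groups move
fine  = sgd_step(params, grads, trainable=("decoder",))   # fine-tuning: encoder frozen
print(fine["encoder"][0], fine["decoder"][0])             # 1.0 0.95
```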
Table 6 shows the MAE, PCC, and SCC results on the mixed dataset (data shown in Table 3). Total-tuning has the smallest evaluation error, although it is close to that of fine-tuning. The MAE of fine-tuning is 0.45, reducing the error of direct by about 30%, and the evaluation results obtained with the fine-tuning strategy correlate most strongly with the true results: the Pearson and Spearman correlation coefficients reach 0.93 and 0.92, improvements of about 6% and 5% over direct, respectively. This shows that the algorithms trained with transfer learning achieve a larger overall improvement than direct training.