Inversion Study of Nitrogen Content of Hyperspectral Apple Canopy Leaves Using Optimized Least Squares Support Vector Machine Approach


1. Introduction

Apples, the fruits of plants in the Rosaceae family, are rich in vitamins and minerals. China is the world’s largest producer of apples, with production reaching 48 million tons in 2022. Xinjiang is one of China’s most important apple production areas, with a long cultivation history [1]. The total area under cultivation has reached 390,000 mu, making apples an important cash crop around the Tarim Basin that plays a positive role in helping farmers to increase their income [2].
Nitrogen is one of the most important elements affecting the growth and development of apples. A lack of nitrogen causes yellowing and curling of leaves, which reduces fruit quality and can even result in early fruit drop [3]. Meanwhile, excessive use of nitrogen fertilizers reduces the sugar content of the fruits and exacerbates soil acidification, which affects sustainable development [4]. Therefore, timely and accurate measurement of the nitrogen content of apple canopy leaves can facilitate the real-time management of orchards by providing information on growth and allowing for continued monitoring [5].
Traditional methods of plant nitrogen determination require that plant samples collected in orchards are brought back to the laboratory, ground, and treated with chemical reagents to determine their nitrogen content using formaldehyde [6] and distillation [7] methods. The formaldehyde method is an indirect acid–base titration in which formaldehyde reacts with an ammonium salt, and the resulting compound is measured via colorimetry. The amount of nitrogen in the sample is thus determined indirectly; however, operating conditions must be strictly controlled, otherwise large errors easily arise [6]. The distillation method obtains the nitrogen concentration by distilling out the nitrogen in the water sample and neutralizing and titrating it with a standard hydrochloric acid solution. It gives accurate and reliable results; however, the operation is cumbersome and time-consuming, and if the purity of the sample is poor, the analytical results are often low [7].
Compared with these traditional determination methods, hyperspectral remote sensing has the advantage of rapidly and non-destructively acquiring the canopy spectral information of crops [8,9]. Spectral analysis and nutritional diagnosis exploit the different degrees to which crop leaf cells, pigments, and water absorb and reflect light [9]. Nitrogen content and chlorophyll are closely related: a nitrogen content that is too low slows the rate of chlorophyll synthesis in crop leaves. Chlorophyll absorbs strongly in the red and blue light bands, and the shape and position of the red edge change under nitrogen deficiency, so a spectral diagnosis of the nitrogen content of the crop can be carried out accordingly [9]. Hyperspectral technology has been widely used for nutrient estimation in crop leaves, which is currently a focus of precision agriculture research [8,9].
Hyperspectral data typically suffer from large volume, redundancy, duplication, and noise, all of which constrain the prediction accuracy of a model. The data should therefore be preprocessed to improve their quality. Previous studies have shown that preprocessing spectral data effectively reduces noise, and that extracting the nitrogen-sensitive bands before constructing the model improves model accuracy. Ma et al. [10] explored the use of hyperspectral techniques for detecting total soil nitrogen. They preprocessed the spectral data with Savitzky–Golay (SG) smoothing and multiplicative scatter correction (MSC) and combined them with five modeling methods (partial least squares (PLS), back propagation (BP) neural network, radial basis function (RBF) neural network, extreme learning machine (ELM), and support vector regression (SVR)), using chemical analysis results as the reference for comparing errors. The results showed that all five models could be used to detect soil total nitrogen content. The SG smoothing preprocessing model had a better detection ability than MSC, with an R-squared of 0.8767 and an RMSE of 1.302, and the SVR model had the best accuracy, with an R-squared of 0.9121 and an RMSE of 0.7581.
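The two accuracy measures quoted throughout this section, R-squared and RMSE, can be sketched as follows. This is a minimal illustration in plain Python; the function names and the sample values are hypothetical and are not data from any of the cited studies.

```python
import math

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical measured and predicted nitrogen contents (illustration only).
measured = [2.0, 2.5, 3.0, 3.5]
predicted = [2.1, 2.4, 3.1, 3.4]
print(r_squared(measured, predicted), rmse(measured, predicted))
```

A higher R-squared and a lower RMSE both indicate a better fit, which is why the two values are normally reported together.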
Since hyperspectral data are high-dimensional, and much of the wavelength information in the visible and near-infrared spectra may be irrelevant to the target, extracting feature bands can reduce the bias caused by irrelevant wavelengths and improve model prediction accuracy [11]. For example, the successive projections algorithm (SPA) can effectively extract feature bands from severely overlapping spectra, thus minimizing collinearity between spectral variables [12]. The CARS-PLS algorithm rejects redundant information by filtering during feature selection [13]. Previous studies have shown that nonlinear models have more obvious advantages than linear models in quantitative prediction [14,15]. Common nonlinear models include SVR, LSTM, and RIME-LSSVM [16,17,18]. Therefore, this study further explores nonlinear modeling methods applicable to the prediction of leaf nitrogen content in apple trees.
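The projection step that lets SPA avoid collinear bands can be sketched as below. This is a simplified illustration, assuming NumPy is available; the function name `spa_select`, the fixed starting band, and the toy matrix are assumptions, and the sketch omits the usual outer loop that tries several starting bands and keeps the subset with the lowest validation error.

```python
import numpy as np

def spa_select(X, n_select, start=0):
    """Pick n_select column indices of X by successive orthogonal projections."""
    Xp = np.array(X, dtype=float)      # working copy; its columns get projected
    selected = [start]
    for _ in range(n_select - 1):
        v = Xp[:, selected[-1]]
        # Project every column onto the subspace orthogonal to the last pick.
        Xp = Xp - np.outer(v, v @ Xp) / (v @ v)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0         # never pick the same band twice
        # The column with the largest remaining norm is the least collinear.
        selected.append(int(np.argmax(norms)))
    return selected

# Toy matrix (rows = samples, columns = bands): band 1 is a multiple of band 0.
X = np.array([[1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
chosen = spa_select(X, n_select=2, start=0)
print(chosen)
```

In the toy matrix the second band carries no information beyond the first, so after projection its norm drops to zero and the third band is chosen instead.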
To date, hyperspectral-based nitrogen inversion studies have mostly focused on wheat, rice, soybean, corn, and other grain crops. For example, Bruning et al. [19] conducted nitrogen content inversion studies on hyperspectral data for four different genotypes of wheat using multiple regression methods combined with 10 spectral preprocessing techniques. The results showed that nitrogen content was predicted from the visible and near-infrared bands (400–1000 nm) with an R-squared of 0.59. Adding the shortwave infrared band (1000–2500 nm) improved the prediction model accuracy to an R-squared of 0.66, indicating that the red-edge band has a better effect for nitrogen content prediction. Guo et al. [20] took different varieties of winter wheat as the research object, used the continuum removal method to enhance the characteristic absorption bands of nitrogen, analyzed the correlation with leaf nitrogen accumulation (LNA), and compared the prediction accuracy of three nonlinear modeling methods for LNA. Their results showed that continuum removal improved the correlation with the LNA, and the Support Vector Machine (SVM) regression model had a higher accuracy, with an R-squared of 0.8950. Yu et al. [21] investigated the relationship between nitrogen (N) content and spectral reflectance difference of rice in cold regions and established a hyperspectral reflectance difference model for the difference in N content of rice. After processing the spectral data using discrete wavelet multi-scale decomposition, the successive projections algorithm, and principal component analysis, they modeled the data with the PLS, ELM, and genetic algorithm-extreme learning machine (GA-ELM) algorithms. The results show that the GA-ELM model established via discrete wavelet multiscale decomposition obtains optimal results in both dataset modeling and training, and the R-squared values of both the training and validation datasets are above 0.68.
Hyperspectral techniques for nutrient element monitoring in fruit trees have also been studied by previous researchers. Azadnia et al. [22] combined visible/near-infrared spectroscopy with four different machine learning algorithms, namely SVM, artificial neural networks (ANNs), Random Forest (RF), and PLS, to predict N, phosphorus (P), and potassium (K) contents in apple leaves. The results showed that the nonlinear modeling approach outperforms the linear one in all models. Gómez-Casero et al. [23] combined hyperspectral reflectance curves of olive trees under different nitrogen and potassium treatments with the optimal wavelengths for distinguishing between those treatments to explore changes in the nutrient content of olive tree leaves using vegetation indices, achieving an accuracy of up to 94.4%. The results showed that nitrogen or potassium deficiencies in olive tree leaves are mainly reflected in the near-infrared region of hyperspectral reflectance. Somers et al. [24] collected spectral data on fruit tree leaves, fruits, and canopies in citrus orchards and cross-referenced them with biophysical and biochemical characteristics of the trees to explore the effect of citrus fruits on the spectral reflectance of canopies. The results showed that the presence of fruit resulted in a significant decrease in reflectance in the infrared region (700 to 2500 nm) of the electromagnetic spectrum. In the visible (VIS: 350 to 700 nm) region, the fruit had less effect, mainly due to leaf chlorosis resulting from nitrogen competition between canopy elements. Einzmann et al. [25] conducted two years of field monitoring of Norway spruce forests to explore the effects of artificial stress (bark stripping) on tree vigor in conjunction with hyperspectral data, transforming needle and canopy spectra with spectral derivatives, vegetation indices, and angular indices, and checking the separability of all features (ring-barked trees vs. control trees) using an RF classification algorithm. The results showed that younger, well-maintained stands changed less over the 2-year period, while changes in older stands were observed in both needle and canopy spectra, suggesting the great potential of hyperspectral remote sensing in detecting early vigor changes in stressed trees. In contrast, there have been fewer studies on the use of hyperspectral technology for the determination of apple physiological and biochemical indices; therefore, the use of hyperspectral technology for monitoring apple growth information around the Tarim Basin in southern Xinjiang still needs to be further discussed.

In this study, the raw spectral data were preprocessed via SG smoothing and MSC, and the feature bands were extracted using the CARS-PLS and SPA algorithms. Based on the feature variables extracted with these two methods, three models (SVR, LSTM, and RIME-LSSVM) were established and their prediction accuracies compared, and the optimal model for estimating the nitrogen content of apple canopy leaves was determined.

4. Discussion

Nitrogen is an essential element for plant growth and development and a key component of important macromolecules. Adequate nitrogen application, especially in the early growth period of fruit trees, determines yield and fruit quality. Alva et al. [41] took “Valencia”, “Parson Brown”, “Hamlin”, and “Sunburst” as experimental materials to explore the effects of different nitrogen application conditions on nitrogen accumulation and fruit growth and development during the growth period. Their results showed a rapid increase in cumulative nitrogen values and in fruit weight and diameter in June, August, and September, and a slow increase in the rest of the reproductive period, indicating the importance of adequate nitrogen supply for fruit growth and quality at the initial stages of fruit development. Consequently, measuring the nitrogen content by means of inversion can provide a more comprehensive and rapid understanding of the nutritional status and growth condition of apples.
Studies that construct inversion models of crop physiological growth indicators from hyperspectral reflectance data have found that models built directly on the original reflectance often have relatively low accuracy. This is because canopy spectra are susceptible to crop structural characteristics, light intensity, and anthropogenic disturbances. Previous studies have shown [42] that preprocessing of spectral data can eliminate spectral noise, enhance spectral properties, and improve model accuracy. For example, Jayaselan et al. [43] performed nutrient prediction for oil palm and developed PLS prediction models after preprocessing the spectral data with MSC, first and second derivatives, standard normal variate (SNV), Gaussian filtering, and SG smoothing, respectively. The results show that the highest model accuracy is obtained after MSC preprocessing, with a predicted R-squared of 0.91, and that preprocessing of spectral data can effectively improve model accuracy. In this study, the raw hyperspectral data were preprocessed with SG smoothing and MSC.
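The two preprocessing steps named above can be sketched as follows, assuming NumPy is available. This is an illustration rather than the study's actual pipeline: the 5-point quadratic window, the edge padding, the use of the mean spectrum as the MSC reference, and all function names are assumptions, since the text does not state the settings used.

```python
import numpy as np

# Tabulated 5-point quadratic Savitzky-Golay smoothing weights.
SG5 = np.array([-3.0, 12.0, 17.0, 12.0, -3.0]) / 35.0

def sg_smooth(spectrum):
    """Smooth one spectrum with a 5-point quadratic SG filter."""
    s = np.asarray(spectrum, dtype=float)
    padded = np.pad(s, 2, mode="edge")   # repeat end values to keep the length
    return np.convolve(padded, SG5, mode="valid")

def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    X = np.asarray(spectra, dtype=float)
    ref = X.mean(axis=0)
    corrected = np.empty_like(X)
    for i, row in enumerate(X):
        # Fit row ~= slope * ref + intercept, then undo both scatter terms.
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected

# Synthetic spectra: one base curve with different gains and offsets.
base = np.array([0.1, 0.2, 0.4, 0.3])
raw = np.vstack([base, 2.0 * base + 0.1, 0.5 * base - 0.05])
fixed = msc(raw)
print(fixed)
```

Because the synthetic spectra differ only by a gain and an offset, MSC maps all of them onto the mean spectrum, which is the behavior expected when scatter is the only source of variation.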
For this study, we constructed a variety of apple canopy leaf hyperspectral nitrogen inversion models for the different fertility stages of apples. Modeling with the full band suffers from data redundancy and low model accuracy, so the data dimension should be reduced by removing redundant information and noise, which improves the accuracy of the estimation model and avoids saturation. The CARS-PLS and SPA methods were used to screen the full-band spectral data, and the extracted sensitive bands were used as input values to construct the nitrogen inversion models with three different algorithms. The CARS-PLS algorithm extracted 18 sensitive bands, mainly in the range of 750–1032 nm. Studies have shown that nitrogen affects photosynthesis during crop growth, which in turn affects the absorption of blue and red light by the crop, which is in line with previous findings [44,45]. The feature bands screened by CARS-PLS were fewer than those screened by SPA. The nitrogen content estimation models built on the CARS-PLS bands had R-squared values of 0.6155, 0.7468, and 0.964, with RMSE values of 0.0041, 0.0014, and 0.052, respectively. Their accuracy was higher than that of the models built on the feature bands screened by SPA, consistent with the results of [32,46]. The reason may be that SPA mainly examines individual spectral bands when screening the characteristic bands and does not take into account the synergistic effect of combinations of bands. CARS-PLS, in contrast, obtains the optimal subset of variables through adaptive reweighted sampling and an exponential decay function, which not only effectively reduces collinearity between spectral bands and eliminates redundant information in the spectral data, but also takes into account the synergistic effect between the retained bands.
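The exponential decay function mentioned above controls how many bands CARS keeps at each sampling run. A minimal sketch follows, assuming the form usually given in the CARS literature, where the retained ratio at run i is r_i = a * exp(-k * i) and the constants are chosen so that all bands are kept in the first run and two in the last. The function name and the band and run counts are hypothetical, and the PLS fitting and adaptive reweighted sampling that follow each run are not shown.

```python
import math

def cars_retention_schedule(n_bands, n_runs):
    """Number of bands kept at each CARS sampling run (exponential decay)."""
    # Constants chosen so that r_1 = 1 and r_N = 2 / n_bands.
    a = (n_bands / 2) ** (1 / (n_runs - 1))
    k = math.log(n_bands / 2) / (n_runs - 1)
    schedule = []
    for i in range(1, n_runs + 1):
        ratio = a * math.exp(-k * i)
        schedule.append(max(2, round(n_bands * ratio)))
    return schedule

# Hypothetical setting: 200 candidate bands and 50 sampling runs.
sched = cars_retention_schedule(200, 50)
print(sched[:5], sched[-5:])
```

The schedule removes many bands in the early runs and few in the late runs, so the search moves from fast coarse elimination to a slower refinement of the remaining subset.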

Comparing the inversion models constructed by the three different algorithms, the results show that the CARS-PLS-RIME-LSSVM model has a higher accuracy than the CARS-PLS-LSTM and CARS-PLS-SVR models, because the RIME-LSSVM model can better process the data and is more robust to parameter selection. The SVR modeling accuracy is in turn better than that of the LSTM, owing to the better generalization ability of the SVR model. However, the spectral feature-band modeling methods still have shortcomings, and the RIME algorithm has much room for improvement. In future work, the method proposed in this paper will be combined with vegetation indices for more in-depth research, which will provide a method for realizing a more accurate inversion of the nitrogen content of apple canopy leaves.
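For readers unfamiliar with the LSSVM part of the best-performing model, the regression step can be sketched as below, assuming NumPy is available. Unlike standard SVR, LSSVM fits by solving a single linear system. The regularization parameter `gamma` and the RBF width `sigma` are fixed by hand here purely for illustration; in the study they are tuned by the RIME optimizer, which this sketch does not implement. All function names and the synthetic data are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the LSSVM linear system for the bias b and dual weights alpha."""
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0                      # constraint: sum(alpha) = 0
    M[1:, 0] = 1.0                      # bias column
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Predict as a kernel-weighted sum over the training samples plus bias."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Synthetic one-dimensional data (not spectra), used only to exercise the code.
X_train = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y_train = np.sin(2.0 * np.pi * X_train[:, 0])
b, alpha = lssvm_fit(X_train, y_train, gamma=1000.0, sigma=0.2)
fitted = lssvm_predict(X_train, b, alpha, X_train, sigma=0.2)
train_rmse = float(np.sqrt(np.mean((fitted - y_train) ** 2)))
print(train_rmse)
```

An optimizer such as RIME would wrap `lssvm_fit` in a loop that proposes `gamma` and `sigma` values and keeps the pair with the lowest validation error.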
