A Rapid Diagnostic Grading System for Cucumber Downy Mildew Based on Visible Light - Hyperspectral Imaging System

Downy mildew is a common, fast-spreading and highly damaging cucumber disease worldwide, and it strongly affects cucumber yield. Rapid identification of its symptoms and rapid grading of disease severity are of great significance for the rapid diagnosis of cucumber downy mildew and for proper pesticide treatment after infection. To classify the occurrence and severity of cucumber downy mildew quickly and accurately, this paper proposes a rapid diagnosis and grading method based on visible light hyperspectral imaging. Stepwise regression and principal component analysis (PCA) were used to reduce the dimensionality of the sensitive bands and to extract their feature information. The two kinds of feature information were used as model inputs to construct SVM-based models for disease detection and severity classification. The model based on the stepwise regression method was used to distinguish downy mildew leaves from normal leaves. In this model, the Sigmoid kernel function gave the highest classification test accuracy, 95.00%, and the recognition rate for cucumber downy mildew leaves of different severity reached 93.88%, which is a high classification accuracy. The results show that rapid diagnosis and grading of cucumber downy mildew can be achieved by combining the visible light hyperspectral imaging system with an automatic SVM classification model, which provides a new method and reference for dealing with cucumber downy mildew in time.


Introduction
Cucumber is one of the most widely cultivated crops in the world, and with the spread of modern facilities its cultivation is no longer limited by season: cucumbers can be grown in protected facilities all year round. Cucumber downy mildew is one of the most common cucumber diseases, and its outbreaks require high humidity. In the middle and lower reaches of the Yangtze River in China, the temperature is suitable, the relative humidity is high, and the leaves are often wet, which easily leads to cucumber downy mildew [1]. Lacking scientific and systematic knowledge of plant pathology, most cucumber growers can hardly judge the severity of an outbreak accurately, so they decide how much pesticide to spray by subjective judgment alone. This results either in excessive pesticide use, with excessive residues and environmental pollution, or in ineffective control of pests and diseases and the resulting crop loss [2]. Neither outcome is in line with current social requirements for food safety and environmental friendliness. Therefore, the rapid detection and grading of cucumber downy mildew is of great significance: it supports effective prediction of the disease and scientific treatment after infection, and it reduces the yield loss caused by downy mildew.
For the detection of cucumber downy mildew, apart from visual inspection by eye, image processing based on leaf images has long been a common recognition method. Jiang Longquan et al. [3] grouped samples by acquiring images of plants in various agricultural scenes. After analyzing and extracting key features such as color and HSV in each group, they carried out a pixel-based statistical analysis of image characteristics, used a series of intelligent algorithms to form HOG key features, established an SVM-based machine learning method for detecting plant diseases and insect pests, and completed a verification experiment with high accuracy. Ma Juncheng et al. [4] processed images of cucumber leaves with downy mildew by image segmentation, extracted the lesion feature information after transforming the acquired images into HSV space, and built an automatic recognition model for cucumber downy mildew with the SVM classification method. Because of its high recognition accuracy, the model can be used for the automatic image recognition of downy mildew. Based on visible spectrum images of cucumber powdery mildew, Bai Xuebing et al. [5] proposed a regression discriminant model combined with the SI-PLSR method, which was applied to the nondestructive detection of cucumber powdery mildew with small error.
Spectroscopy is also increasingly used in disease detection. The spectral characteristics of a plant reflect its comprehensive spectral response to environmental factors during growth [6]. A variety of spectral techniques have been applied to research on diseases and insect pests in various crops. Jiang Jinbao et al. [7] analyzed field-induced wheat stripe rust of different severity with a near-infrared and visible light system. Taking the first-order derivative of the near-infrared spectrum as the entry point, they statistically analyzed the green edge, red edge, core area and other plant features, and established a set of correlation models to analyze the disease over multiple growth periods. Wang Xiangyu et al. [8] carried out data acquisition experiments on artificially induced powdery mildew leaves with a visible light hyperspectral system. They used principal component analysis and SVM to establish a rapid identification and detection system for cucumber powdery mildew, and achieved fast recognition by properly dividing the data into verification, correction and test sets. Ramalingam et al. [9] analyzed the hyperspectral characteristics of leaves under different levels of water stress. Using a portable multi-spectrometer, they acquired multi-spectral data of typical leaves of the target crops, extracted the reflectance intensity in the visible (400-700 nm), short-wave near-infrared (700-1300 nm) and near-infrared (1300-2500 nm) regions, and established a spectral prediction model of leaf moisture content.
Furthermore, the prediction model was used to determine how strongly the intensity in each band is affected by spectral background compensation; the short-wave near-infrared region is affected more than the visible and near-infrared regions. Wu Nan et al. [10] analyzed the relationship among leaf moisture content, disease index and canopy spectral characteristics of Camellia oleifera under anthracnose stress. They used large-area hyperspectral remote sensing of Camellia oleifera fields to analyze the relationship between moisture content and disease severity, and then established a model with spectral intensity as input. In addition, combined with intelligent algorithms, a rapid determination system for Camellia oleifera leaf moisture content was established.
After a disease occurs, the texture characteristics of crop leaves change obviously, and the spectral characteristics at the disease location change as well. Spectral imaging technology not only has the clarity of conventional imaging, but also generates a spectral image cube that contains the spectral distribution of each point of the target sample. The spectral feature information of a target point or region of interest can be obtained directly from the image, which makes spectral data acquisition and processing more convenient and the selection of regions more accurate. This paper mainly uses a visible light hyperspectral imaging system, with which the normal and diseased areas can be observed clearly and intuitively and the spectral data of each area can be read directly from the hyperspectral image. At present, near-infrared hyperspectral systems have made some progress in disease detection and identification, while there is relatively little research on disease analysis based on visible light imaging systems.

Test plan design
All samples in this experiment came from crops cultivated in the multifunctional Venlo multi-span glass greenhouse (32.201°N, 119.518°E) independently developed by Jiangsu University. Different crops are cultivated in different cultivation tanks. The crops are grown by soilless cultivation in pots filled with perlite, with one plant per pot. In order to study outbreaks of diseases and insect pests in the natural growth state of facility crops, the diseased samples were produced naturally under the standard cultivation mode. A modified Yamazaki standard nutrient solution was used for daily watering to support crop growth. In addition to daily nutrient irrigation, the disease status was recorded every day, and the leaves were picked and collected during the period of high incidence of cucumber downy mildew.
After diseased leaves of different severity were obtained, they were photographed promptly for comparison after the experiment, and the disease severity was graded into four levels.
In this paper, the pathological characteristics of the diseased leaves were graded and confirmed by the plant protection experts of our school. The leaves are divided into four grades. In grade Ⅰ, the lesions are slight, and the diseased leaves account for less than 5% of the total leaves. In grade Ⅱ, the lesions are at the primary stage, and the diseased leaves account for 5% to 10% of the total leaves. In grade Ⅲ, the lesion area is medium, and the diseased leaves account for 10% to 30% of the total leaves. In grade Ⅳ, the lesions are more serious, and the diseased leaves account for more than 30% of the total leaves. With this grading scheme, the diseased cucumber leaves in the greenhouse can be classified accurately.
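The grading thresholds above can be written as a small lookup, shown here as a minimal sketch. The function name `downy_mildew_grade` is hypothetical, and the handling of the boundary values is an assumption, since the text does not say which grade receives exactly 10% or 30%: here 10% and 30% are both assigned to grade III.

```python
def downy_mildew_grade(diseased_percent: float) -> str:
    """Map a diseased proportion (in percent) to grade I, II, III or IV.

    Thresholds follow the text: <5%, 5-10%, 10-30%, >30%.
    Boundary handling (10% -> III, 30% -> III) is an assumption.
    """
    if not 0.0 <= diseased_percent <= 100.0:
        raise ValueError("diseased_percent must be between 0 and 100")
    if diseased_percent < 5.0:
        return "I"
    if diseased_percent < 10.0:
        return "II"
    if diseased_percent <= 30.0:
        return "III"
    return "IV"


if __name__ == "__main__":
    for value in (2.0, 7.5, 20.0, 45.0):
        print(value, downy_mildew_grade(value))
```

In practice the grades in this study were assigned by plant protection experts; the function only encodes the numeric thresholds.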
The hyperspectral scanning of the collected diseased leaves is best carried out within two hours after picking. Immediately after picking, the biological activity of the leaf has changed little, so the leaf is close to its normal growth state and its spectral characteristics can be captured with better imaging quality by visible light hyperspectral imaging. In the dark box, the distance between the lens and the stage is 25 cm, two light sources are used, one on each side of the sample, and the cucumber leaf is positioned on the center line of the line-scan imaging. During the experiment, the target leaf is placed on the loading platform of the displacement stage. The loading platform moves from right to left, and the hyperspectral image of the sample is acquired by line scanning.

Instrument part
The visible light hyperspectral imaging scanning experiment was carried out in the Spectrum Central Laboratory of the Agricultural Engineering Department, the Key Laboratory of the Ministry of Education of Jiangsu University. The experiment was conducted at constant temperature and normal atmospheric humidity. The data were collected inside a collection box with a blackened inner wall, and an adjustable light source provided the illumination, which eliminates interference from external light. The image clipping and splicing functions of ENVI 4.2 were used to crop the acquired cucumber hyperspectral images. The minimum bounding rectangle of the leaf to be cropped is determined first. The single-band images were then examined band by band, and the gray-level difference between the cucumber leaf and the background was found to be largest in the 476 nm band image, so threshold segmentation was performed on that band. Further processing based on computer vision operations followed: missing areas were filled in, and arithmetic mean filtering was used to remove irrelevant noise points in the main leaf area. In ENVI, the mask was generated and processed on the basis of the threshold segmentation, and further imaging operations were carried out. The background of the resulting target image was set to white to simplify subsequent processing. As shown in the figure below, the acquired mask image effectively removes the interference of the background area outside the leaf. In order to delineate and extract the diseased areas in the spectral image of the leaves accurately, this paper uses the spectral angle classification method to render the diseased areas of similar degree.
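The mask generation described above was done in ENVI; the following is only a minimal standard-library sketch of the same idea on a tiny single-band image. It assumes that leaf pixels are brighter than the background in the 476 nm band, and it cleans the mask by mean-filtering it and thresholding again at 0.5, which is a simplification chosen for illustration and not the authors' exact procedure. All function names and the threshold value are hypothetical.

```python
def threshold_mask(band_image, threshold):
    """Return 1 for pixels above the threshold (assumed leaf), else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in band_image]


def mean_filter(image, size=3):
    """Arithmetic mean filter; border pixels average the neighbours that exist."""
    height, width = len(image), len(image[0])
    radius = size // 2
    filtered = []
    for i in range(height):
        row = []
        for j in range(width):
            window = [
                image[y][x]
                for y in range(max(0, i - radius), min(height, i + radius + 1))
                for x in range(max(0, j - radius), min(width, j + radius + 1))
            ]
            row.append(sum(window) / len(window))
        filtered.append(row)
    return filtered


def clean_mask(mask, size=3):
    """Fill small holes and drop isolated points via mean filter + re-threshold."""
    smoothed = mean_filter(mask, size)
    return [[1 if value > 0.5 else 0 for value in row] for row in smoothed]


def apply_mask(band_image, mask, background=255):
    """Keep leaf pixels and set everything else to a white background."""
    return [
        [value if keep else background for value, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(band_image, mask)
    ]


if __name__ == "__main__":
    # Toy band image: a bright 3x3 "leaf" with a dark hole and one noise pixel.
    band = [
        [200, 10, 10, 10, 10],
        [10, 200, 200, 200, 10],
        [10, 200, 10, 200, 10],
        [10, 200, 200, 200, 10],
        [10, 10, 10, 10, 10],
    ]
    mask = clean_mask(threshold_mask(band, threshold=100))
    for row in apply_mask(band, mask):
        print(row)
```

On real data the same steps would run on the 476 nm slice of the hyperspectral cube, and the resulting mask would then be applied to every band.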
The core idea of the spectral angle classification method lies in the size of the angle between two spectra. A small angle means that the two spectra are highly similar, and therefore that the corresponding pixels or regions are more likely to belong to the same disease. The method searches the whole hyperspectral image for the pixels whose spectral angle to the target disease area is smallest. For the detection of diseases and insect pests and the delineation of diseased areas, the distribution of the disease can then be shown clearly and intuitively by rendering the pixels whose spectral angle falls within the corresponding range. In this way, the areas of the whole leaf that are spectrally close to the disease area can be identified and rendered quickly. The general formula for calculating the spectral angle is as follows:

$$\theta = \arccos\left(\frac{\sum_{i=1}^{n} x_i x_i^{*}}{\sqrt{\sum_{i=1}^{n} x_i^{2}}\,\sqrt{\sum_{i=1}^{n} \left(x_i^{*}\right)^{2}}}\right)$$

In this formula, $x$ is the spectral vector of the known region, $x^{*}$ is the spectral vector of another pixel to be evaluated, and $n$ is the number of bands. The figure below shows the result obtained with the HSI hyperspectral image processing system after spectral angle classification of the target diseased leaves; the method effectively classifies the diseased areas of the spectral image and delineates the characteristic regions.
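The spectral angle between two spectra is commonly computed as the arccosine of their normalized dot product. The sketch below illustrates this with the standard library only; the function names, the toy spectra and the angle limit of 0.1 rad are hypothetical values chosen for illustration, not settings from this study.

```python
import math


def spectral_angle(reference, pixel):
    """Angle in radians between a reference spectrum and a pixel spectrum."""
    if len(reference) != len(pixel):
        raise ValueError("spectra must have the same number of bands")
    dot = sum(a * b for a, b in zip(reference, pixel))
    norm_reference = math.sqrt(sum(a * a for a in reference))
    norm_pixel = math.sqrt(sum(b * b for b in pixel))
    if norm_reference == 0.0 or norm_pixel == 0.0:
        raise ValueError("spectra must not be all zeros")
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cosine = max(-1.0, min(1.0, dot / (norm_reference * norm_pixel)))
    return math.acos(cosine)


def within_angle(reference, pixels, max_angle):
    """Flag the pixels whose spectral angle to the reference is small enough."""
    return [spectral_angle(reference, p) <= max_angle for p in pixels]


if __name__ == "__main__":
    lesion_reference = [0.12, 0.18, 0.35, 0.40]   # hypothetical lesion spectrum
    candidates = [
        [0.24, 0.36, 0.70, 0.80],   # same shape, brighter: angle close to 0
        [0.40, 0.35, 0.18, 0.12],   # different shape: larger angle
    ]
    print(within_angle(lesion_reference, candidates, max_angle=0.1))
```

Because the angle depends only on the direction of the spectral vector, a pixel that is uniformly brighter or darker than the reference still matches it, which is one reason the measure is tolerant of uneven illumination.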

Screening of disease characteristics of cucumber downy mildew
After the preliminary analysis and processing of the spectral data combined with the images, a total of 50 samples of cucumber downy mildew leaves were collected. Several linear spectral regions on each leaf were selected, and the maximum, minimum and average spectral intensity and the standard deviation of the spectral data in these regions were analyzed statistically to determine the suitability of the selected disease samples. The number of pixels in the selected diseased areas ranges from 40 to 1220. The spectral intensity of the pixels in these regions differs obviously in many band regions. The samples cover diseased areas of various degrees in multiple regions, meet the requirements of the experiment for the number of disease samples and the range of disease levels, and provide abundant and representative data for screening the disease-sensitive bands. The sample statistics are as follows:

Fig. 4 Numerical statistics in the characteristic wavelength region of the target area
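The per-region statistics mentioned in the text (maximum, minimum, mean and standard deviation of spectral intensity) can be computed as in the following minimal sketch. The function name and the sample intensities are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which definition was used.

```python
import statistics


def region_statistics(intensities):
    """Summarize the spectral intensities of the pixels in one selected region."""
    if not intensities:
        raise ValueError("region must contain at least one pixel")
    return {
        "max": max(intensities),
        "min": min(intensities),
        "mean": statistics.fmean(intensities),
        # Population standard deviation (assumption; stdev() is the sample form).
        "std": statistics.pstdev(intensities),
    }


if __name__ == "__main__":
    # Hypothetical intensities of one band over the pixels of a diseased region.
    print(region_statistics([2, 4, 4, 4, 5, 5, 7, 9]))
```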

Fig. 5 Comparison of the spectral intensity of downy mildew leaves and normal leaves
Combining the earlier comparison of spectral curve intensity in the cucumber downy mildew regions with a statistical correlation analysis of multiple bands and band regions in SPSS, five sensitive intervals of cucumber downy mildew were identified: 400-440 nm, 470-510 nm, 540-590 nm, 620-650 nm and 690-740 nm. In order to reduce the data dimension required for analysis and processing, two feature extraction methods are adopted in this paper to extract the characteristic wavelengths of the sensitive bands.
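As a small illustration of how the five sensitive intervals restrict the data before feature extraction, the sketch below selects the band indices whose wavelengths fall inside them. The interval values come from the text; the function name and the 10 nm wavelength grid are hypothetical and do not describe the actual band spacing of the instrument.

```python
# Sensitive intervals of cucumber downy mildew reported in the text (nm).
SENSITIVE_INTERVALS_NM = [(400, 440), (470, 510), (540, 590), (620, 650), (690, 740)]


def sensitive_band_indices(wavelengths_nm):
    """Return the indices of bands that lie inside any sensitive interval."""
    return [
        index
        for index, wavelength in enumerate(wavelengths_nm)
        if any(low <= wavelength <= high for low, high in SENSITIVE_INTERVALS_NM)
    ]


if __name__ == "__main__":
    # Hypothetical band centers from 400 nm to 750 nm in 10 nm steps.
    wavelengths = list(range(400, 760, 10))
    selected = sensitive_band_indices(wavelengths)
    print([wavelengths[i] for i in selected])
```

Only the selected bands would then be passed on to stepwise regression or PCA.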

Extraction of Spectral Feature Information of Cucumber Downy Mildew
The main idea of the stepwise regression method is to take the variable region under study as the object, treat every individual variable in the region as a factor to be tested by a significance index, test the variables in the region one by one, compare their significance, and retain the variable with the largest significance index in the region [11]. ... problem of most data. The extracted features can be used to construct a single-factor diseased leaf recognition system and a feature-layer fusion recognition system with a data dimension much lower than that of the original data [12]. Principal component analysis of the sensitive wavelengths of cucumber downy mildew shows that the cumulative contribution rate of the first five principal components, which have the highest contribution rates, exceeds 99%, so most of the available hyperspectral information is covered.
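The cumulative contribution rate used to choose the number of principal components is the running sum of the component variances divided by their total. The sketch below shows this arithmetic; the function names and the eigenvalues are hypothetical illustration values, not results from this study.

```python
def cumulative_contribution(eigenvalues):
    """Cumulative share of total variance, components sorted by variance."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive value")
    rates, running = [], 0.0
    for value in sorted(eigenvalues, reverse=True):
        running += value
        rates.append(running / total)
    return rates


def components_needed(eigenvalues, target=0.99):
    """Smallest number of components whose cumulative rate reaches the target."""
    for count, rate in enumerate(cumulative_contribution(eigenvalues), start=1):
        if rate >= target:
            return count
    return len(eigenvalues)


if __name__ == "__main__":
    # Hypothetical variances of the principal components.
    variances = [60.0, 25.0, 8.0, 4.0, 2.5, 0.3, 0.2]
    print(cumulative_contribution(variances))
    print(components_needed(variances, target=0.99))
```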

Automatic detection and classification of cucumber downy mildew leaf based on support vector machine
The support vector machine (SVM), a common linear classifier in machine learning, was proposed by Cortes and Vapnik in 1995 to solve pattern recognition problems involving high-dimensional and nonlinear data [13]. Its guiding principle for pattern recognition is to find the classification boundary that maximizes the margin between classes. In this paper, SVM is used for disease feature classification and monitoring. In order to classify the different disease grades, a multi-class method based on SVM is introduced. Specifically, the four disease grades Ⅰ, Ⅱ, Ⅲ and Ⅳ are handled during training as follows: when grade Ⅰ is the positive set, grades Ⅱ, Ⅲ and Ⅳ form the negative set; when grade Ⅱ is the positive set, grades Ⅰ, Ⅲ and Ⅳ form the negative set; and so on. Four output values are thus obtained, and the grade with the largest output value is selected as the classification result. This approach achieves a strong recognition rate even with small sample sets.
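The one-against-the-rest scheme described above can be sketched independently of any SVM library: each grade gets its own binary label vector for training, and prediction picks the grade whose classifier returns the largest output. The function names and the decision values below are hypothetical; in practice the four outputs would come from four trained SVM classifiers.

```python
GRADES = ["I", "II", "III", "IV"]


def one_vs_rest_labels(labels, positive_grade):
    """Binary training labels: 1 for the positive grade, 0 for all the others."""
    return [1 if label == positive_grade else 0 for label in labels]


def predict_grade(decision_values):
    """Pick the grade whose binary classifier gives the largest output."""
    if len(decision_values) != len(GRADES):
        raise ValueError("expected one decision value per grade")
    best_index = max(range(len(GRADES)), key=lambda i: decision_values[i])
    return GRADES[best_index]


if __name__ == "__main__":
    training_labels = ["I", "II", "III", "IV", "II"]
    for grade in GRADES:
        print(grade, one_vs_rest_labels(training_labels, grade))
    # Hypothetical outputs of the four binary classifiers for one leaf.
    print(predict_grade([-0.8, 0.3, 1.2, -0.1]))
```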
This paper compares four kernel functions to verify the accuracy of the models; the four kernel functions are listed in Table 3. Before classification and verification, 100 cucumber downy mildew leaves and 100 normal cucumber leaves were randomly selected from the collected samples for testing. For each class, 70 leaves were used as the training set and 30 leaves as the verification set. The output for a normal leaf was defined as 0 and the output for a cucumber downy mildew leaf as 1, and the leaves were tested in random order. The accuracy of the classification results is shown in Table 4 and Table 5. As these tables show, the SVM classifier used in this paper classifies cucumber downy mildew leaves and normal leaves well, particularly with the features obtained by the stepwise regression method. Among the four kernel functions, the Sigmoid kernel function has the highest classification accuracy for both feature models, reaching 95.00% and 83.33%.
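The random 70/30 division per class and the accuracy figure can be illustrated with the sketch below. The function names, the random seed and the predicted labels are hypothetical; the counts in the usage example only show the arithmetic (57 correct out of 60 verification leaves gives 95.00%) and are not taken from the result tables.

```python
import random


def split_train_validation(samples, n_train, seed=0):
    """Randomly split one class into a training part and a verification part."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def accuracy_percent(true_labels, predicted_labels):
    """Percentage of samples whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * correct / len(true_labels)


if __name__ == "__main__":
    diseased_ids = range(100)          # 100 downy mildew leaves (label 1)
    normal_ids = range(100, 200)       # 100 normal leaves (label 0)
    train_d, valid_d = split_train_validation(diseased_ids, n_train=70)
    train_n, valid_n = split_train_validation(normal_ids, n_train=70)
    print(len(train_d) + len(train_n), len(valid_d) + len(valid_n))

    # Hypothetical predictions: 3 of the 60 verification leaves are wrong.
    truth = [1] * 30 + [0] * 30
    predicted = [0, 0] + [1] * 28 + [1] + [0] * 29
    print(f"{accuracy_percent(truth, predicted):.2f}%")
```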

Leaf classification and recognition results of four levels of cucumber downy mildew
For the automatic grading of cucumber downy mildew based on the support vector machine (the division of training and test sets at each level is shown in Table 6), the recognition rate on the test set was verified and the classification accuracy of the leaves at each level was counted. The classification accuracy of the two feature extraction methods is shown in Table 7 and Table 8. The comparison of these accuracies shows that hyperspectral information provides effective classification accuracy in the detection of cucumber downy mildew and the grading of its severity.
Whichever feature extraction method the classification model is based on, the Sigmoid kernel function has the highest test accuracy among the four kernel functions, reaching 93.88% and 84.69% in the hyperspectral single-factor detection of cucumber downy mildew and powdery mildew, followed by the linear kernel function at 87.75% and 83.67%. Comparing the classification accuracy of the models built on stepwise regression features with that of the models built on principal component analysis features, it is clear that the model based on the stepwise regression method is more accurate than the model based on principal component analysis.

Conclusion
This paper uses a visible light hyperspectral imaging system to scan the collected downy mildew and powdery mildew leaves. After the spectral images were acquired, interfering information was removed by cropping the images with a spectral image mask generated in ENVI. The HSI Analyzer system was used to analyze the spectral images further. After the diseased areas were quickly classified by the spectral angle method, the disease-sensitive spectral bands were extracted for these areas. The characteristic spectra in the disease-sensitive bands were extracted by stepwise regression and PCA, and the spectral feature information of powdery mildew and downy mildew was extracted respectively. The gray-scale intensity at the characteristic wavelengths of diseased cucumber leaves was also extracted as disease feature information, which can additionally serve as a feature-layer element for multi-source information fusion. A verification model of cucumber leaf classification accuracy based on the hyperspectral system was then established, and the features extracted by the two algorithms were used as inputs to verify the automatic disease classification system based on the SVM algorithm. The results show that automatic classification of cucumber leaf disease based on hyperspectral information is feasible, and that the feature model extracted by the stepwise regression method is superior to the one extracted by principal component analysis.
The results of this study show that the visible light hyperspectral imaging system is effective in the recognition and automatic grading of cucumber downy mildew. The classification and recognition model in this paper uses only visible light hyperspectral information. In future work, multi-source information fusion could describe cucumber downy mildew leaves more fully by adding other information, and more classification methods and data preprocessing algorithms could be introduced to optimize the model continuously, improve recognition accuracy and speed, and strengthen its generalization ability.