Abstract:
Objective To improve the accuracy of predicting the yellowing degree of tobacco leaves during curing, the influence of environmental factors on the yellowing degree of upper-layer and lower-layer tobacco leaves was studied.
Method A staged segmentation algorithm combining color threshold segmentation with a Gaussian mixture model (GMM) was proposed to extract the yellowing degree from images of upper-layer and lower-layer tobacco leaves during curing. The particle swarm optimization (PSO) algorithm was used to optimize the hyperparameters of three machine learning algorithms: random forest (RF), support vector regression (SVR) and back-propagation neural network (BPNN). Prediction models of tobacco leaf yellowing degree were then constructed from the environmental factors of the curing barn (temperature, humidity and curing time). The SHAP method was used to interpret the optimal prediction model and to reveal the relationship between the environmental factors of the curing barn and the yellowing degree of tobacco leaves.
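As an illustration of the PSO step only, the following is a minimal sketch of a particle swarm search over two hyperparameters. It is not the authors' implementation: the function names (`pso_minimize`, `toy_cv_error`), the inertia and acceleration coefficients, the search bounds and the toy objective are all assumptions made for this example. In practice the objective would be the cross-validated error of the RF, SVR or BPNN model for a given hyperparameter vector.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over box `bounds` with a basic global-best PSO.

    All default coefficients are illustrative assumptions, not values
    reported in the abstract.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive pull + social pull.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_cv_error(params):
    """Hypothetical stand-in for a cross-validated error surface."""
    n_trees, max_depth = params
    return ((n_trees - 200.0) / 100.0) ** 2 + ((max_depth - 12.0) / 5.0) ** 2


# Search over (number of trees, maximum depth); bounds are assumed.
best_params, best_err = pso_minimize(toy_cv_error, [(10.0, 500.0), (2.0, 30.0)])
```

Integer-valued hyperparameters such as the number of trees would need rounding inside the objective before the model is fitted.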
Result The mean absolute error (MAE) and mean squared error (MSE) of the staged segmentation algorithm were 0.02407 and 0.00058, respectively, smaller than those of color threshold segmentation alone (0.07657 and 0.00588) and of the GMM alone (0.06541 and 0.00429), indicating high accuracy in extracting the yellowing degree of tobacco leaves. In five-fold cross-validation, the PSO-RF model had the best prediction accuracy for the yellowing degree of both upper-layer and lower-layer tobacco leaves. For the upper layer model, the standard deviations of MAE, MSE and R² (0.0073, 0.0058 and 0.0066) and the corresponding coefficients of variation (0.0440, 0.1246 and 0.0069) were the smallest. For the lower layer model, the standard deviations of MAE, MSE and R² (0.0062, 0.0051 and 0.0052) and the corresponding coefficients of variation (0.0403, 0.1181 and 0.0053) were also the smallest. In the analysis of the prediction results, the PSO-BPNN model achieved R² < 0.90 and MSE > 0.15, the PSO-SVR model achieved R² > 0.90 and MSE < 0.15, and the PSO-RF model had the highest accuracy (R² > 0.95, MSE < 0.06). SHAP analysis of the optimal PSO-RF model showed that upper layer temperature was the key environmental factor influencing the yellowing degree of upper-layer tobacco leaves, and curing time was the key factor for lower-layer tobacco leaves.
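For readers unfamiliar with the reported statistics, the sketch below shows one common way to compute MAE, MSE and R², and to summarize the stability of a metric across cross-validation folds by its standard deviation and coefficient of variation. The function names are hypothetical, and the use of the population standard deviation is an assumption; the abstract does not state which estimator was used.

```python
from statistics import mean, pstdev


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fold_stability(scores):
    """Return (standard deviation, coefficient of variation) of per-fold scores.

    Assumption: population standard deviation; a sample standard
    deviation (statistics.stdev) would give slightly larger values.
    """
    sd = pstdev(scores)
    return sd, sd / abs(mean(scores))
```

A smaller standard deviation and coefficient of variation across the five folds indicate that a model's performance depends less on how the data were split.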
Conclusion The GMM-PSO-RF model can accurately predict the yellowing degree of tobacco leaves in different barn layers under complex curing conditions, and provides a scientific basis for adjusting the curing process.