Hyperspectral Estimation of Soil Copper Concentration Based on Improved TabNet Model in the Eastern Junggar Coalfield

China is the world's largest coal consumer. The large-scale exploitation and use of coal resources have caused serious heavy metal pollution and environmental contamination, including soil degradation, water pollution, crop damage, and even threats to human life. Rapid, real-time monitoring of soil heavy metal pollution is therefore an urgent task. This research formulated a new preprocessing method for soil hyperspectral data, inspired by few-shot learning, and combined it with soil-related auxiliary information to extract effective information from the soil hyperspectra; different regression methods were then used to predict soil heavy metal content. The approach was verified on 168 real soil samples from the Eastern Junggar coalfield in Xinjiang. Because copper (Cu) is a trace element in soil and its spectral characteristics are affected by impurities, improper hyperspectral preprocessing may introduce interfering information or remove useful information, which degrades model performance. To address these problems, the preprocessing method in this experiment combined the second-order derivative with data augmentation (DA) and added auxiliary information so that more effective features enter the model. Next, the attentive interpretable tabular learning (TabNet) model was modified in three different ways, and regression models were built with the original TabNet model and the three improved TabNet models. One of the improved TabNet models performed best and also provided a ranking of the top 30 features by importance.
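The two preprocessing steps named above can be illustrated with a minimal sketch. It assumes a central finite difference for the second-order derivative and uses Gaussian noise jitter as a stand-in for data augmentation; the abstract does not specify either implementation, and the names `second_derivative`, `jitter_augment`, `spacing`, and `sigma` are hypothetical.

```python
import random


def second_derivative(spectrum, spacing=1.0):
    """Central second-order finite difference of a 1-D spectrum.

    `spacing` is the (assumed uniform) distance between adjacent bands.
    The result has two fewer values than the input because the two
    edge bands have no neighbour on one side.
    """
    if len(spectrum) < 3:
        raise ValueError("need at least three bands")
    h2 = spacing * spacing
    return [
        (spectrum[i - 1] - 2.0 * spectrum[i] + spectrum[i + 1]) / h2
        for i in range(1, len(spectrum) - 1)
    ]


def jitter_augment(spectrum, n_copies=3, sigma=0.01, seed=0):
    """Illustrative augmentation only: noisy copies of one spectrum.

    This is NOT the paper's augmentation method, just a common
    stand-in that shows where augmentation fits in the pipeline.
    """
    rng = random.Random(seed)
    return [
        [value + rng.gauss(0.0, sigma) for value in spectrum]
        for _ in range(n_copies)
    ]


# Toy spectrum sampled from x**2, whose second derivative is constant.
toy = [0.0, 1.0, 4.0, 9.0, 16.0]
derived = second_derivative(toy)
augmented = jitter_augment(derived, n_copies=3)
```

In practice a smoothing derivative filter is often preferred over a raw finite difference, because differentiation amplifies noise in the spectrum.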
Meanwhile, regression prediction of Cu content with four different convolutional neural networks (CNNs) showed that the model with a residual block was the strongest and slightly outperformed the improved TabNet model, but it offered no interpretation of the input data. This experiment also applied different preprocessing methods to various models and found that traditional preprocessing methods performed best with traditional regression models [e.g., partial least squares regression (PLSR)] but underperformed with deep learning models. The selected optimal model was compared with the PLSR and CNN models. The results indicated that both the improved TabNet model and the improved CNN model performed better with the new preprocessing approach proposed in this article, with the improved TabNet yielding a coefficient of determination ($\text{R}^{2}$), root-mean-square error (RMSE), and ratio of performance to interquartile range (RPIQ) of 0.94, 1.341, and 4.474, respectively. The improved CNN model had a coefficient of determination of 0.942, an RMSE of 1.324, and an RPIQ of 4.531 on the test dataset.
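For reference, the three reported metrics can be computed as in the sketch below. It assumes the common definitions $\text{R}^{2} = 1 - SS_{res}/SS_{tot}$, RMSE as the square root of the mean squared error, and RPIQ as the interquartile range of the observed values divided by the RMSE. The function name `regression_metrics` and the choice of the inclusive quartile method are assumptions, since the abstract does not state how the quartiles were computed.

```python
import math
import statistics


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and RPIQ for paired observed/predicted values.

    Raises ZeroDivisionError if the observations are constant
    (SS_tot = 0) or the predictions are perfect (RMSE = 0).
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences of size >= 2")
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    rmse = math.sqrt(ss_res / n)
    # Quartiles of the observed values; "inclusive" interpolates
    # linearly between the sample minimum and maximum.
    q1, _, q3 = statistics.quantiles(y_true, n=4, method="inclusive")
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        "rpiq": (q3 - q1) / rmse,
    }


# Toy example: every prediction is off by 0.5.
metrics = regression_metrics(
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.5, 2.5, 3.5, 4.5, 5.5],
)
```

Rearranging the RPIQ definition gives IQR = RPIQ × RMSE. Both reported pairs (4.474 × 1.341 and 4.531 × 1.324) imply an interquartile range of about 6.0.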