This study examines the predictive performance of Feature Attention (FA), Temporal Attention (TA), and Feature and Temporal Attention (FATA) within Gated Recurrent Unit (GRU), Long Short‐Term Memory (LSTM), and Transformer architectures using price data from four Chinese carbon markets (CEA, BEA, GDEA, and HBEA). Drawing on multiple forecasting accuracy measures and significance testing, the results show that attention mechanisms can enhance forecasting accuracy in certain market‐model combinations, but their effectiveness critically depends on the alignment among market conditions, model architectures, and attention mechanisms. In markets with high average prices and volatility, FA achieves the best performance with GRU and LSTM; in lower price, moderately volatile markets, TA combined with Transformer is more effective; and in the high‐price, high‐volatility CEA market, FATA shows promise when paired with Transformer, but lacks robustness across markets. These findings highlight a pronounced compatibility pattern among market conditions, model architectures, and attention mechanisms, suggesting that the deployment of attention mechanisms in carbon price forecasting should be tailored to specific market conditions and model structures rather than applied universally.
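To make the distinction among the three mechanisms concrete, the following is a minimal NumPy sketch, not the formulation used in the study. It assumes that FA reweights input features at each time step, that TA pools a sequence over time steps, and that FATA applies the two in sequence. The scoring parameters `w` and `v`, the function names, and the random data are hypothetical, and the GRU, LSTM, or Transformer encoder that would normally sit between the two steps is omitted for brevity.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def feature_attention(x, w):
    # x: (T, F) input window; w: (F, F) hypothetical scoring matrix.
    # Produces one weight per feature at every time step.
    alpha = softmax(x @ w, axis=1)   # each row sums to 1 over features
    return x * alpha, alpha

def temporal_attention(h, v):
    # h: (T, H) sequence of states; v: (H,) hypothetical scoring vector.
    # Produces one weight per time step and a pooled context vector.
    beta = softmax(h @ v, axis=0)    # sums to 1 over time steps
    return beta @ h, beta

# Assumed toy data: 30 time steps, 5 input features.
rng = np.random.default_rng(0)
x = rng.normal(size=(30, 5))
w = rng.normal(size=(5, 5))
v = rng.normal(size=5)

# FATA-style composition (assumption): feature weights first, then temporal pooling.
weighted, alpha = feature_attention(x, w)
context, beta = temporal_attention(weighted, v)
```

In a full model, `w` and `v` would be learned jointly with the encoder, and `context` would feed the layer that outputs the price forecast.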
               