27 October 2025

Transformer-based Collaborative Reinforcement Learning for Fluid Antenna System (FAS)-enabled 3D UAV Positioning

In this paper, a novel three-dimensional (3D) positioning framework for fluid antenna system (FAS)-enabled unmanned aerial vehicles (UAVs) is developed. In the proposed framework, a set of controlled UAVs, consisting of one active UAV and four FAS-enabled passive UAVs, cooperatively estimates the real-time 3D position of a target UAV. The active UAV transmits a measurement signal that reaches the passive UAVs via reflection from the target UAV. Each passive UAV estimates the distance of the active-target-passive UAV link and selects an antenna port to share this distance information with the base station (BS), which calculates the real-time position of the target UAV. Since the target UAV moves as it carries out its task, the controlled UAVs must optimize their trajectories, and the passive UAVs must select the optimal antenna ports for transmitting the positioning information, in order to track the real-time position of the target UAV. We formulate this as an optimization problem whose goal is to minimize the target UAV positioning error by jointly optimizing the trajectories of all controlled UAVs and the antenna port selection of the passive UAVs. To solve this problem, an attention-based recurrent multi-agent reinforcement learning (AR-MARL) scheme is proposed. It enables each controlled UAV to use its local Q function to determine its trajectory and antenna port while optimizing the target UAV positioning performance, without knowing the trajectories and antenna port selections of the other controlled UAVs. Unlike current MARL methods that use feedforward neural networks to approximate Q functions, the proposed method uses a recurrent neural network (RNN) that incorporates the historical state-action pairs of each controlled UAV, together with an attention mechanism that weighs the importance of these historical state-action pairs, thus improving both the global Q function approximation accuracy and the target UAV positioning accuracy.
Simulation results show that the proposed scheme reduces the average positioning error by up to 17.5% and 58.5% compared to the value decomposition-based MARL scheme with FAS and the proposed AR-MARL method without FAS, respectively.