Token Communications: A Large Model-Driven Framework for Cross-Modal Context-Aware Semantic Communications

In this article, we introduce token communications (TokCom), a large model-driven framework to leverage cross-modal context information in generative semantic communications (GenSC). TokCom is a new paradigm, motivated by the recent success of generative foundation models and multimodal large language models (GFM/MLLMs), in which the communication units are tokens, enabling efficient transformer-based token processing at the transmitter and receiver. We outline the potential opportunities and challenges of leveraging context in GenSC, explore how to integrate GFM/MLLM-based token processing into semantic communication systems to leverage cross-modal context effectively at affordable complexity, and present the key principles for efficient TokCom at various layers in future wireless networks. In a typical image semantic communication setup, we demonstrate that TokCom significantly improves bandwidth efficiency by leveraging the context information among tokens. Finally, we identify potential research directions to facilitate the adoption of TokCom in future wireless networks.

Qiao Li, Mashhadi Mahdi Boloursaz, Gao Zhen, Tafazolli Rahim, Bennis Mehdi, Niyato Dusit

A1 Journal article (refereed), original research

L. Qiao, M. B. Mashhadi, Z. Gao, R. Tafazolli, M. Bennis and D. Niyato, "Token Communications: A Large Model-Driven Framework for Cross-Modal Context-Aware Semantic Communications," in IEEE Wireless Communications, vol. 32, no. 5, pp. 80-88, October 2025, doi: 10.1109/MWC.001.2500084.

https://doi.org/10.1109/MWC.001.2500084