Reconciling Image Captioning and User’s Comments for Urban Tourism

Image captioning, the process of assigning a textual description to an image, has gained momentum thanks to recent advances in deep learning architectures and the availability of associated tools. In the era of digital tourism, it offers a valuable framework for reconciling widely available tourism images with user-generated content. This paper presents a novel approach to performing such reconciliation in a way that benefits the tourism industry. Specifically, the IMGUR online image-sharing service has been used to construct a novel database, referred to as Tourism48, which contains gallery tourism images from 48 countries together with their associated user comments. The Google Cloud Vision API has been employed to caption the underlying images, while similarity analysis has been used to match user comments to the resulting captions. The outcomes can inform subsequent policy research in the tourism industry and behavior analysis. In addition, the Tourism48 dataset has been made available to the research community.¹