DIPA2: An Image Dataset with Cross-cultural Privacy Perception Annotations

The world today is increasingly visual. Many of the most popular online social networking services are largely powered by images, making image privacy protection a critical research topic in the fields of ubiquitous computing, usable security, and human-computer interaction (HCI). One topical issue is understanding privacy-threatening content in images that are shared online. This dataset article introduces DIPA2, an open-source image dataset that offers object-level annotations with high-level reasoning properties to show how perceptions of privacy differ across cultures. DIPA2 provides 5,897 annotations describing perceived privacy risks of 3,347 objects in 1,304 images. Each annotation contains the type of the object and four additional privacy metrics: 1) the information type, indicating what kind of information may leak if the image containing the object is shared, 2) a 7-point Likert item estimating the perceived severity of the privacy leakage, 3) the intended recipient scope when annotators assume they are the image owner, and 4) the intended recipient scope when annotators allow others to repost the image. Our dataset contains unique data from two cultures: we recruited annotators from both Japan and the U.K. to demonstrate the impact of culture on object-level privacy perceptions. In this paper, we first describe how we designed and constructed DIPA2 and present an analysis of the collected annotations. Second, we provide two machine-learning baselines to demonstrate how DIPA2 challenges the current image privacy recognition task. DIPA2 facilitates various types of research on image privacy, including machine learning methods that infer privacy threats in complex scenarios, quantitative analysis of cultural influences on privacy preferences, understanding of image-sharing behaviors, and promotion of cyber hygiene for general user populations.
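To make the annotation structure described above concrete, the following is a minimal sketch of how one annotation record might be represented and summarized per culture. It is an illustration only and not the released file format: the class name `PrivacyAnnotation`, the helper `mean_severity_by_culture`, all field names, and the free-form string representation of information types and recipient scopes are assumptions introduced here.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterable, List

# The two annotator populations named in the abstract.
CULTURES = {"Japan", "U.K."}


@dataclass
class PrivacyAnnotation:
    """Hypothetical record for one annotation of one object in one image."""

    image_id: str
    object_id: str
    object_type: str          # type of the annotated object
    annotator_culture: str    # "Japan" or "U.K."
    information_type: str     # what kind of information may leak (free-form here)
    severity: int             # 7-point Likert item, assumed to be coded 1..7
    # Recipient scopes under the two assumptions; category labels are
    # not specified here, so they are kept as free-form strings.
    owner_sharing_scope: List[str] = field(default_factory=list)
    repost_scope: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.annotator_culture not in CULTURES:
            raise ValueError(f"unknown culture: {self.annotator_culture!r}")
        if not 1 <= self.severity <= 7:
            raise ValueError("severity must be on the 7-point scale (1..7)")


def mean_severity_by_culture(
    annotations: Iterable[PrivacyAnnotation],
) -> Dict[str, float]:
    """Average the Likert severity separately for each annotator culture."""
    totals: Dict[str, List[int]] = defaultdict(list)
    for ann in annotations:
        totals[ann.annotator_culture].append(ann.severity)
    return {culture: sum(vals) / len(vals) for culture, vals in totals.items()}
```

Under these assumptions, a list of `PrivacyAnnotation` objects can be passed to `mean_severity_by_culture` for a first, coarse comparison of perceived severity between the two annotator groups. Treating a Likert item as numeric is itself a simplification; an ordinal analysis may be more appropriate.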