Do trustworthiness judgements help people to recognise synthetic faces?

Dublin Core

Title

Do trustworthiness judgements help people to recognise synthetic faces?

Creator

Haisa Shan

Date

8 September 2021

Description

Recent advances in generative models for digital images have made it possible to create fake imagery, including highly photorealistic synthetic human faces. The style-based Generative Adversarial Network (StyleGAN) is among the most advanced generative models in this field and has been widely used for facial image generation. However, as such models become easier to use, security in many domains, such as forensics, border control and mass media, becomes vulnerable to threats arising from the misuse of image generation technologies. To date, there has been only limited empirical research into the facial characteristics of StyleGAN-generated faces to support the design of methods for detecting such synthetic faces. This study used StyleGAN2 (an improved version of StyleGAN) to generate faces and invited people to complete two facial image evaluation tasks: (1) a discrimination task and (2) a trustworthiness rating task. In the discrimination task, subjects had trouble recognising synthetic faces by direct, explicit judgement; in the trustworthiness rating task, subjects perceived the synthetic faces as significantly more trustworthy than the real faces. The study further analysed gender and ethnicity biases in the perception of facial trustworthiness, and the results showed some differences across gender and ethnicity groups. In conclusion, people’s ability to recognise synthetic faces is poor, but it is possible that people rely on the perception of facial trustworthiness to discriminate synthetic from real faces. The findings of this study have implications for the development of methods for detecting digitally generated faces.

Subject

StyleGAN, synthetic face, trustworthiness perception, facial trustworthiness

Source

Subjects and design
Three hundred and fifty-seven subjects (114 males, mean age = 25.2, SD = 5.8; 227 females, mean age = 25.0, SD = 6.3; 10 non-binary, mean age = 23.6, SD = 8.93) were recruited to complete an online survey delivered on www.qualtrics.com. Responses from subjects who started but did not complete the survey were excluded to avoid distorting the results. Computer-synthesised facial images served as fake faces and were mixed with real faces to examine people’s ability to detect fake faces and the differences in perceived trustworthiness between real and fake faces. Subjects received no reward for participating, though they could see their test score at the end of the survey. The Qualtrics survey used a within-subjects design in which all subjects viewed the same two sets of adult facial images and completed both tasks. To eliminate the effect of differences between the sets, the assignment of image sets to tasks was counterbalanced across subjects. Before the survey started, all subjects provided informed consent and completed a demographic questionnaire about their age, gender and ethnicity. A power calculation for a small effect, with power of 0.8 and a significance level of 0.05, indicated that the study needed at least 198 subjects.
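The sample-size reasoning above can be illustrated with a minimal sketch. The record does not name the test, the numerical effect size or the software behind the power calculation, so everything specific here is an assumption: a two-sided paired comparison, Cohen's d = 0.2 as the "small" effect, and a normal approximation rather than an exact t-based calculation. The function name `required_n` is hypothetical.

```python
import math
from statistics import NormalDist


def required_n(effect_size, alpha=0.05, power=0.8):
    """Approximate number of subjects for a two-sided paired comparison.

    Assumption: normal approximation, n = ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is a standardised effect size (Cohen's d). This is an
    illustration, not the calculation used in the study.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return math.ceil(((z_alpha + z_power) / effect_size) ** 2)


# Assumed "small" effect of d = 0.2, with the alpha and power stated in the text.
n_small = required_n(0.2, alpha=0.05, power=0.8)
print(n_small)
```

Under these assumptions the approximation gives 197 subjects, close to the reported minimum of 198; an exact t-based calculation would give a slightly larger figure, and the study's own settings may have differed.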
Stimuli
A total of thirty-two human facial images (1024×1024 resolution), 16 real and 16 synthetic, were used as stimuli in the survey. All real faces were taken from Flickr-Faces-HQ (FFHQ), a publicly available dataset of high-quality human facial images that was created as a benchmark for GANs (see https://github.com/NVlabs/ffhq-dataset), and all synthetic faces were obtained from the StyleGAN2 generative image model (see https://github.com/NVlabs/stylegan2). To ensure diversity, each of the two sets contained 4 Black, 4 East Asian, 4 South Asian and 4 White faces, with 2 males and 2 females for each ethnicity. Of the sixteen faces in each set, half were real and half were synthetic, but subjects were not told this.
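The stimulus layout and the counterbalancing of sets across tasks can be sketched as follows. The sketch is illustrative only. It assumes that each ethnicity-by-gender cell in a set holds one real and one synthetic face (the record gives only the set-level totals) and that sets are swapped between tasks for alternating subjects. The names `build_set` and `assign_sets` are hypothetical and do not come from the study materials.

```python
from itertools import product

ETHNICITIES = ["Black", "East Asian", "South Asian", "White"]
GENDERS = ["male", "female"]
SOURCES = ["real", "synthetic"]  # assumed: one of each per ethnicity/gender cell


def build_set(label):
    """Return 16 stimulus descriptors: 4 ethnicities x 2 genders x 2 sources."""
    return [
        {"set": label, "ethnicity": e, "gender": g, "source": s}
        for e, g, s in product(ETHNICITIES, GENDERS, SOURCES)
    ]


def assign_sets(subject_index):
    """Counterbalance which set is used for which task (assumed alternation)."""
    if subject_index % 2 == 0:
        return {"discrimination": "A", "trustworthiness": "B"}
    return {"discrimination": "B", "trustworthiness": "A"}


stimuli = build_set("A") + build_set("B")
real_count = sum(1 for face in stimuli if face["source"] == "real")
print(len(stimuli), real_count)   # total faces and number of real faces
print(assign_sets(0), assign_sets(1))
```

Alternating the assignment by subject index is one simple way to balance the sets across tasks; random assignment within the survey platform would serve the same purpose.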

Publisher

Lancaster University

Format

data/Excel.csv

Identifier

Cognitive, Perception
Forensic

Contributor

Joanne Roe

Rights

Open

Relation

None

Language

English

Type

Data

Coverage

LA1 4YW

LUSTRE

Supervisor

Sophie Nightingale

Project Level

MSc

Topic

Cognitive, Perception
Forensic
Social

Sample Size

357 Participants

Statistical Analysis Type

ANOVA
Power Analysis
T-Test

Files

19_dataset2021 (4).zip

Collection

Questionnaire-based study

Citation

Haisa Shan, “Do trustworthiness judgements help people to recognise synthetic faces?,” LUSTRE, accessed October 30, 2025, https://www.johnntowse.com/LUSTRE/items/show/127.