Is a Mean Difference of 0.46 Relevant? Towards Determining the Smallest Effect Size of Interest for Visual Aesthetics of Websites


Imagine conducting a study to determine whether users rate the visual aesthetics of your website more positively than those of a competitor's website. To assess users' perceptions of both websites, you use a validated survey scale for visual aesthetics and observe a statistically significant difference of 0.5 between the ratings of the two websites on a 7-point Likert-type scale. However, determining whether such a difference is practically (and theoretically) meaningful is challenging. In this paper, I follow the procedure outlined in Anvari & Lakens (2021) to determine the smallest subjectively experienced difference in VisAWI-s ratings using an anchor-based method. A sample of N = 249 participants rated and compared screenshots of eight websites in an online survey. I determined an estimate of a population-specific mean difference of 0.4, or 6.58% in POMP units, which translates to a mean difference of 0.46 on the 7-point Likert-type scale of the VisAWI-s. These values suggest that differences in VisAWI-s scores exceeding these estimates, such as the 0.5 mentioned above, are likely noticeable and meaningful to users. However, the estimate of this smallest subjectively experienced difference is affected by the overall visual aesthetics rating of the stimuli used. Researchers can use this effect size to inform study design and sample size planning. Still, whenever possible, they should aim to determine a domain- and research-design-specific smallest effect size of interest.
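As a rough illustration of how such an estimate might feed into sample size planning, the following minimal sketch converts a raw mean difference into a standardized effect and applies the common normal-approximation formula for a two-sided comparison of two independent group means. The function name `n_per_group`, the standard deviation of 1.0, the alpha level, and the target power are all hypothetical assumptions chosen for illustration; none of them is taken from the study.

```python
import math
from statistics import NormalDist


def n_per_group(raw_diff, sd, alpha=0.05, power=0.80):
    """Approximate participants per group needed to detect raw_diff.

    Uses the normal approximation for a two-sided, two-sample
    comparison of means: n = 2 * (z_(1-alpha/2) + z_power)^2 / d^2.
    This is an illustrative sketch, not the procedure used in the paper.
    """
    d = raw_diff / sd  # standardized effect size (Cohen's d)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


SESOI_RAW = 0.46   # raw mean difference on the 7-point scale (from the abstract)
ASSUMED_SD = 1.0   # hypothetical SD of ratings; replace with a pilot estimate

print(n_per_group(SESOI_RAW, ASSUMED_SD))
```

Because the normal approximation slightly underestimates the sample size required by a t-test, a dedicated power-analysis tool is preferable for actual planning. The required sample size is also highly sensitive to the assumed standard deviation, so that value should come from pilot data or prior studies in the same domain.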