Introduction

This pilot experiment gathered preliminary data to assess whether the stimuli generated by Pyllusion behave as expected for each of the 10 illusion types (i.e., whether increasing task difficulty and illusion strength leads to more errors), and to develop an intuition about the magnitude of these effects, so that the stimuli parameters could be refined to a more sensible range (i.e., neither overly easy nor impossibly hard) for the next study.

In line with open-science standards, all the material (stimuli generation code, experiment code, raw data, analysis script with complementary figures and analyses, preregistration, etc.) is available at https://github.com/RealityBending/IllusionGameValidation.

Methods

Procedure

We generated 56 stimuli for each of the 10 illusion types. These stimuli resulted from the combination of 8 linearly-spread levels of task difficulty (e.g., [1, 2, 3, 4, 5, 6, 7, 8], where 1 corresponds to the highest difficulty, i.e., the smallest objective difference between targets) and 7 levels of illusion strength (3 values on the congruent side, 3 on the incongruent side, and 0; e.g., [-3, -2, -1, 0, 1, 2, 3], where negative values correspond to congruent illusion strengths).

The 10 illusion blocks were presented in a random order, and the order of the 56 stimuli within each block was also randomized. After the first series of 10 blocks, a second series was administered (with newly randomized orders of blocks and trials). In total, each participant saw the 56 stimuli of each of the 10 illusion types twice (total = 1120 trials), to which they had to respond “as fast as possible without making errors” (i.e., an explicit double constraint to mitigate inter-individual variability in the speed-accuracy trade-off). The task was implemented using jsPsych (De Leeuw 2015), and the instructions for each illusion type are available in the experiment code.
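As a minimal sketch (with illustrative level values; the actual stimuli were generated in Python with Pyllusion), the factorial structure of the design and the resulting trial counts can be reproduced as follows:

# Factorial design sketch: 8 difficulty levels x 7 strength levels = 56 stimuli
difficulty <- 1:8   # 1 = hardest (smallest objective difference between targets)
strength <- -3:3    # negative = congruent side, 0 = no illusion
design <- expand.grid(
  Illusion_Difference = difficulty,
  Illusion_Strength = strength
)
nrow(design)            # 56 stimuli per illusion type
nrow(design) * 10 * 2   # 1120 trials: 10 illusion types, 2 repetitions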

library(tidyverse)
library(ggdist)
library(ggside)
library(easystats)
library(patchwork)
library(brms)

# Load the raw data; order participants by date and set factor levels
df <- read.csv("../data/study1.csv") |>
  mutate(
    Date = as.Date(Date),
    Participant = fct_reorder(Participant, Date),
    Screen_Refresh = as.character(Screen_Refresh),
    Illusion_Side = as.factor(Illusion_Side),
    Block = as.factor(Block),
    Education = fct_relevel(Education, "Master", "Bachelor", "High School", "Other")
  )

Exclusions

outliers <- c(
  # Half of the trials have very short RT
  # Prolific Status: REJECTED
  "S33",
  # Block n2 with very short RTs
  # Prolific Status: RETURNED
  "S20",
  # Error rate of 46.2% and short RTs
  # Prolific Status: RETURNED
  "S51",
  # Error rate of 46.2%
  # Prolific Status: RETURNED
  "S49",
  # Error rate of 47.9%
  # Prolific Status: REJECTED
  "S47",
  # Error rate of 42.1% and very large RT SD
  # Prolific Status: REJECTED
  "S12"
)

Error Rate

We removed 6 participants (in red) upon inspection of the average error rate (when close to 50%, i.e., the chance level of this two-alternative task, suggesting random answers) and/or of the reaction time distribution (when implausibly fast).

dfsub <- df |>
  group_by(Participant) |>
  summarize(
    # n = n(),
    Error = sum(Error) / n(),
    RT_Mean = mean(RT),
    RT_SD = sd(RT)
  ) |>
  ungroup() |>
  arrange(desc(Error))

knitr::kable(dfsub) |> 
  kableExtra::row_spec(which(dfsub$Participant %in% outliers), background  = "#EF9A9A") |> 
  kableExtra::kable_styling(full_width = TRUE) |> 
  kableExtra::scroll_box(width = "100%", height = "500px")
Participant Error RT_Mean (ms) RT_SD (ms)
S47 0.479 262 296
S51 0.465 263 372
S49 0.462 630 611
S12 0.421 507 1679
S20 0.402 492 725
S41 0.300 723 579
S50 0.287 659 333
S16 0.269 578 172
S05 0.262 716 271
S04 0.262 588 206
S42 0.260 718 1294
S43 0.255 837 711
S33 0.251 599 1326
S46 0.246 699 414
S06 0.246 560 197
S39 0.243 693 232
S15 0.238 1160 1660
S40 0.227 748 243
S08 0.225 780 427
S27 0.219 785 836
S30 0.216 506 150
S26 0.215 738 375
S10 0.213 1076 557
S23 0.210 870 489
S52 0.207 804 378
S34 0.206 652 367
S28 0.205 511 147
S32 0.203 627 223
S35 0.202 1133 982
S44 0.202 869 424
S24 0.201 720 298
S01 0.198 1110 738
S48 0.196 867 652
S18 0.189 983 830
S45 0.187 711 195
S25 0.186 803 397
S36 0.185 846 765
S09 0.182 1045 1082
S19 0.181 1001 562
S22 0.176 1307 1091
S14 0.175 1146 900
S02 0.174 703 243
S29 0.173 1046 918
S38 0.173 616 381
S17 0.171 927 550
S37 0.170 769 312
S13 0.157 848 364
S31 0.156 712 383
S07 0.142 1109 863
S11 0.139 895 816
S03 0.124 741 331
S21 0.092 884 509

Error Rate per Illusion Block

For each block, we computed the error rate and discarded the whole block if it was 50% or higher (as this likely indicates that the instructions were misunderstood; for instance, participants selecting the smaller instead of the bigger circle).

temp <- df |>
  group_by(Participant, Illusion_Type, Block) |>
  summarize(ErrorRate_per_block = sum(Error) / n()) |>
  ungroup() |> 
  arrange(desc(ErrorRate_per_block))

temp2 <- temp |> 
  filter(ErrorRate_per_block >= 0.5) |> 
  group_by(Illusion_Type, Block) |> 
  summarize(n = n()) |> 
  arrange(desc(n), Illusion_Type) |> 
  ungroup() |> 
  mutate(n_trials = n * 56,  # each discarded block contains 56 trials
         p_trials = n_trials / nrow(df))

# knitr::kable(temp2)

p1 <- temp |>
  estimate_density(at = c("Illusion_Type", "Block")) |>
  ggplot(aes(x = x, y = y)) +
  geom_line(aes(color = Illusion_Type, linetype = Block)) + 
  geom_vline(xintercept = 0.5, linetype = "dashed") +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0)) +
  labs(y = "Distribution", x = "Error Rate") +
  theme_modern()

p2 <- temp2 |> 
  mutate(Block = fct_rev(Block)) |> 
  ggplot(aes(x = Illusion_Type, y = p_trials)) +
  geom_bar(stat="identity", aes(fill = Block)) +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_continuous(labels = scales::percent, expand = c(0, 0)) +
  labs(y = "Percentage of Trials Removed", x = "Illusion Type") +
  theme_modern() +
  theme(axis.text.x = element_text(angle=45, hjust = 1))

p1 | p2



# Drop
df <- df |>
  group_by(Participant, Illusion_Type, Block) |>
  mutate(ErrorRate_per_block = sum(Error) / n()) |>
  ungroup() |> 
  filter(ErrorRate_per_block < 0.5) |>
  select(-ErrorRate_per_block)

rm(temp, temp2)

Reaction Time Distribution

# RT distribution
p <- estimate_density(df, select = "RT", at = c("Participant", "Block")) |>
  group_by(Participant) |>
  normalize(select = "y") |>
  ungroup() |>
  mutate(color = ifelse(Participant %in% outliers, "red", "blue")) |>
  ggplot(aes(x = x, y = y)) +
  geom_area(data = normalize(estimate_density(df, select = "RT"), select = "y"), alpha = 0.2) +
  geom_line(aes(color = color, group = interaction(Participant, Block), linetype = Block)) +
  geom_vline(xintercept = 2500, linetype = "dashed", color = "red") +
  scale_color_manual(values=c("red"="red", "blue"="blue"), guide = "none") +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0)) +
  coord_cartesian(xlim = c(0, 3000)) +
  theme_modern() +
  theme(axis.text.y = element_blank()) +
  facet_wrap(~Participant) +
  labs(y = "", x = "Reaction Time (ms)")
p

# ggsave("figures/outliers.png", p, width=20, height=15)

# Filter out
df <- filter(df, !Participant %in% outliers)

Reaction Time per Trial

p1 <- estimate_density(df, select = "RT", at = "Participant") |>
  group_by(Participant) |>
  normalize(select = "y") |>
  ungroup() |>
  ggplot(aes(x = x, y = y)) +
  geom_area(data = normalize(estimate_density(df, select = "RT"), select = "y"), alpha = 0.2) +
  geom_line(aes(color = Participant, group = Participant)) +
  geom_vline(xintercept = c(150, 3000), linetype = "dashed", color = "red") +
  scale_color_material_d("rainbow", guide = "none") +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0)) +
  coord_cartesian(xlim = c(0, 3500)) +
  theme_modern() +
  theme(axis.text.y = element_blank()) +
  # facet_wrap(~Participant) +
  labs(y = "", x = "Reaction Time (ms)")


df$Outlier <- df$RT < 150 | df$RT > 3000

p2 <- df |>
  group_by(Participant) |>
  summarize(Outlier = sum(Outlier) / n()) |>
  mutate(Participant = fct_reorder(Participant, Outlier)) |>
  ggplot(aes(x = Participant, y = Outlier)) +
  geom_bar(stat = "identity", aes(fill = Participant)) +
  scale_fill_material_d("rainbow", guide = "none") +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0), labels = scales::percent) +
  see::theme_modern() +
  theme(axis.text.x = element_blank())

p1 | p2

We removed 692 (1.37%) outlier trials (RT < 150 ms or RT > 3000 ms).

df <- filter(df, Outlier == FALSE)

Participants

Fifty-two participants were recruited via Prolific (www.prolificacademic.co.uk), a crowd-sourcing platform providing high data quality (Peer et al. 2022). The only inclusion criterion was fluency in English, to ensure that the task instructions would be well understood. Participants were incentivised with a monetary reward for completing the task, which took about 50 minutes to finish. Demographic variables (age, gender, and ethnicity) were self-reported on a voluntary basis.

We removed 6 participants upon inspection of the average error rate (when close to 50%, suggesting random answers) and of the reaction time distribution (when implausibly fast). For the remaining participants, we discarded blocks with an error rate of 50% or higher (possibly indicating that the instructions were misunderstood; e.g., participants selecting the shorter line instead of the longer one). Finally, we removed 692 (1.37%) trials with implausibly short or long response times (< 150 ms or > 3000 ms).

The final sample included 46 participants (Mean age = 26.7, SD = 7.7, range: [19, 60]; Sex: 39.1% females, 56.5% males, and 4.4% other).

dfsub <- df |>
  group_by(Participant) |>
  select(Participant, Age, Sex, Education, Nationality, Ethnicity, Duration, Break_Duration, Screen_Resolution, Screen_Refresh, Device_OS) |>
  slice(1) |>
  ungroup()

report::report_participants(dfsub, age="Age", sex="Sex")
plot_distribution <- function(dfsub, what = "Age", title = what, subtitle = "", fill = "orange") {
  dfsub |>
    ggplot(aes_string(x = what)) +
    geom_density(fill = fill) +
    geom_vline(xintercept = mean(dfsub[[what]]), color = "red", linetype = "dashed") +
    scale_x_continuous(expand = c(0, 0)) +
    scale_y_continuous(expand = c(0, 0)) +
    ggtitle(title, subtitle = subtitle) +
    theme_modern() +
    theme(
      plot.title = element_text(face = "bold", hjust = 0.5),
      plot.subtitle = element_text(face = "italic", hjust = 0.5),
      axis.title.x = element_blank(),
      axis.title.y = element_blank(),
      axis.text.y = element_blank()
    )
}

plot_waffle <- function(dfsub, what = "Nationality") {
  ggwaffle::waffle_iron(dfsub, what) |>
    # mutate(label = emojifont::fontawesome('fa-twitter')) |>
    ggplot(aes(x, y, fill = group)) +
    ggwaffle::geom_waffle() +
    # geom_point() +
    # geom_text(aes(label=label), family='fontawesome-webfont', size=4) +
    coord_equal() +
    ggtitle(what) +
    labs(fill = "") +
    theme_void() +
    theme(plot.title = element_text(face = "bold", hjust = 0.5))
}
p1 <- plot_distribution(dfsub, "Age", fill = "#FF9800")
p2 <- plot_distribution(dfsub, "Duration", title = "Total Duration", subtitle = "in minutes", fill = "#F44336")
p3 <- plot_distribution(dfsub, "Break_Duration", title = "Break Duration", subtitle = "in minutes", fill = "#3F51B5")

p4 <- plot_waffle(dfsub, "Sex") +
  scale_fill_manual(values = c("Male" = "#2196F3", "Female" = "#E91E63", "Other" = "#FF9800"))

p5 <- plot_waffle(dfsub, "Education") +
  scale_fill_viridis_d()

p6 <- plot_waffle(dfsub, "Nationality") +
  scale_fill_metro_d()

p7 <- plot_waffle(dfsub, "Ethnicity") +
  scale_fill_manual(values = c("Latino" = "#FF5722", "Asian" = "#FF9800", "Caucasian" = "#2196F3", "African" = "#4CAF50", "Jewish" = "#9C27B0"))

p8 <- plot_waffle(dfsub, "Screen_Resolution") +
  scale_fill_pizza_d()

p9 <- plot_waffle(dfsub, "Device_OS") +
  scale_fill_bluebrown_d()

# p10 <- plot_waffle(dfsub, "Screen_Refresh") +
#   scale_fill_viridis_d()


(p1 / p2 / p3) | (p4 / p5 / p6) | (p7 / p8 / p9)

Data Analysis

The analysis focused on the probability of errors as the main outcome variable. For each illusion, we started by visualizing the average effect of task difficulty and illusion strength to gain some intuition about the underlying generative model. Next, we compared the performance of various logistic models differing in their specification: with or without a transformation of task difficulty (log, square root, or cube root), with or without a 2nd-order polynomial term for illusion strength, and with or without the illusion side (up vs. down, or left vs. right) as an additional predictor. We then fitted the best-performing model under a Bayesian framework and compared its visualization with that of a Generalized Additive Model (GAM), which is better able to capture potential non-linear relationships (at the expense of model simplicity).
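As an illustration, here is a minimal sketch of one cell of that specification grid (cube-root difficulty, linear strength, and a participant-level random intercept), assuming data holds the trials of a single illusion as in the per-illusion sections below:

# One candidate logistic mixed model from the specification grid
# compared (via BIC) by best_models() below
library(glmmTMB)

cbrt <- function(x) sign(x) * abs(x)^(1/3)  # signed cube root

m <- glmmTMB(
  Error ~ cbrt(Illusion_Difference) * Illusion_Strength + (1 | Participant),
  data = data, family = "binomial"
)
BIC(m)  # model comparison metric used in the Model Selection sections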

The analysis was carried out using R 4.2 (R Core Team 2022), brms (Bürkner 2017), the tidyverse (Wickham et al. 2019), and the easystats collection of packages (Makowski, Ben-Shachar, and Lüdecke 2019; Makowski et al. 2020; Lüdecke et al. 2021; Lüdecke, Waggoner, and Makowski 2019).

Results

Summary

The statistical models suggested that the effect of task difficulty on the error rate followed a cube-root function for the Delboeuf and Ebbinghaus illusions (both composed of circular shapes), a square-root function for the Rod and Frame and Vertical-Horizontal illusions, a cube-root function for the Zöllner and Poggendorff illusions, a logarithmic function for the White illusion, a cube-root function for the Müller-Lyer and Ponzo illusions (both based on line lengths), and a linear function for the Contrast illusion. All models suggested significant effects of both illusion strength and task difficulty. See details and figures in the analysis script.
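For instance, the best-performing Delboeuf specification (fitted below with brms) corresponds, on the logit scale, to:

$$\operatorname{logit} P(\mathrm{Error}_{ij}) = \beta_0 + \beta_1 \sqrt[3]{D_{ij}} + \beta_2 S_{ij} + \beta_3 \sqrt[3]{D_{ij}}\, S_{ij} + u_{0j}$$

where \(D\) is the task difficulty, \(S\) the illusion strength, and \(u_{0j}\) the random intercept of participant \(j\).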

Delboeuf

Descriptive

plot_descriptive <- function(data, side="leftright") {
  
  # Recode raw answers (e.g., "arrowleft", "arrowup") into readable labels
  # ("Left"/"Right" or "Up"/"Down"), depending on the axis the illusion varies on
  if(side == "leftright") {
    x <- data[data$Error == 0 & data$Illusion_Side == 1, ]$Answer[1]
    x <- tools::toTitleCase(gsub("arrow", "", x))
    if(x == "Left") {
      data$Answer <- ifelse(data$Illusion_Side == 1, "Left", "Right")
    } else if(x == "Right") {
      data$Answer <- ifelse(data$Illusion_Side == 1, "Right", "Left")
    }
  } else {
    x <- data[data$Error == 0 & data$Illusion_Side == 1, ]$Answer[1]
    x <- tools::toTitleCase(gsub("arrow", "", x))
    if(x == "Up") {
      data$Answer <- ifelse(data$Illusion_Side == 1, "Up", "Down")
    } else if(x == "Down") {
      data$Answer <- ifelse(data$Illusion_Side == 1, "Down", "Up")
    }
    data$Answer <- fct_rev(data$Answer)
  }
  
  dodge1 <- 0.1 * diff(range(data$Illusion_Difference))
  dodge2 <- -0.1 * diff(range(data$Illusion_Strength))
  
  colors <- colorRampPalette(c("#4CAF50", "#009688", "#00BCD4", "#2196F3", "#3F51B5", "#673AB7", "#9C27B0"))(length(unique(data$Illusion_Strength)))
  
  p1 <- data |> 
    group_by(Illusion_Difference, Illusion_Strength, Answer) |> 
    summarize(Error = mean(Error)) |> 
    mutate(Illusion_Strength = as.factor(round(Illusion_Strength, 2))) |> 
    ggplot(aes(x = Illusion_Difference, y = Error)) +
    geom_bar(aes(fill=Illusion_Strength), position = position_dodge(width=dodge1), stat="identity") +
    geom_line(aes(color = Illusion_Strength), position = position_dodge(width=dodge1)) +
    scale_y_continuous(limits = c(0, 1), expand = c(0, 0), labels = scales::percent) +
    scale_x_continuous(expand = c(0, 0)) +
    scale_color_manual(values = colors) +
    scale_fill_manual(values = colors) +
    theme_modern() +
    labs(
      color = "Illusion Strength", 
      fill = "Illusion Strength",
      y = "Probability of Error",
      x = "Task Difficulty"
    ) +
    theme(plot.title = element_text(face = "bold", hjust = 0.5))
  
  colors <- colorRampPalette(c("#F44336", "#FFC107", "#4CAF50"))(length(unique(data$Illusion_Difference)))
    
  p2 <- data |> 
    group_by(Illusion_Difference, Illusion_Strength, Answer) |> 
    summarize(Error = mean(Error)) |> 
    mutate(Illusion_Difference = as.factor(round(Illusion_Difference, 2))) |> 
    ggplot(aes(x = Illusion_Strength, y = Error)) +
    geom_vline(xintercept=0, linetype="dotted", alpha=0.6) +
    geom_bar(aes(fill=Illusion_Difference), position = position_dodge(width=dodge2), stat="identity") +
    geom_line(aes(color = Illusion_Difference), position = position_dodge(width=dodge2)) +
    scale_y_continuous(limits = c(0, 1), expand = c(0, 0), labels = scales::percent) +
    scale_x_continuous(expand = c(0, 0)) +
    scale_color_manual(values = colors) +
    scale_fill_manual(values = colors) +
    theme_modern() +
    labs(
      color = "Task Difficulty", 
      fill = "Task Difficulty",
      y = "Probability of Error",
      x = "Illusion Strength"
    ) +
    theme(plot.title = element_text(face = "bold", hjust = 0.5)) 
  
  if(side == "leftright") {
    p <- ((p1 + facet_wrap(~Answer, ncol=2, labeller = "label_both")) /
      (p2 + facet_wrap(~Answer, ncol=2, labeller = "label_both"))) + 
    plot_annotation(title = paste(data$Illusion_Type[1], "Illusion"), 
                    theme = theme(plot.title = element_text(face = "bold", hjust = 0.5)))
  } else {
    p <- ((p1 + facet_wrap(~Answer, nrow=2, labeller = "label_both")) |
      (p2 + facet_wrap(~Answer, nrow=2, labeller = "label_both"))) + 
    plot_annotation(title = paste(data$Illusion_Type[1], "Illusion"), 
                    theme = theme(plot.title = element_text(face = "bold", hjust = 0.5)))
  }
  p
}

data <- filter(df, Illusion_Type == "Delboeuf")

plot_descriptive(data)

Model Selection

best_models <- function(data) {
  # The formulas below reference pre-transformed difficulty columns
  # (e.g., Illusion_Difference_cbrt); recompute them here in case they
  # are not already present in the data
  data <- dplyr::mutate(
    data,
    Illusion_Difference_log = log(Illusion_Difference),
    Illusion_Difference_sqrt = sqrt(Illusion_Difference),
    Illusion_Difference_cbrt = sign(Illusion_Difference) * abs(Illusion_Difference)^(1/3)
  )
  
  # Fit all candidate logistic mixed models, crossing a transformation of
  # difficulty (raw, log, sqrt, cbrt) with polynomial degrees for difficulty
  # (i) and strength (j), both fixed to 1 here, and an optional illusion-side term
  models <- list()
  for(i in 1:1) {
    for(j in 1:1) {
      for(k1 in c("", "_log", "_sqrt", "_cbrt")) { 
        for(k2 in c("")) {  
          for(side in c("", "-side")) {
            name <- paste0("dif", k1, i, "-", "str", k2, j, side)
            # print(name)
            f <- paste0("poly(Illusion_Difference", 
                        k1,
                        ", ",
                        i,
                        ") * poly(Illusion_Strength",
                        k2, 
                        ", ",
                        j, 
                        ") + (1|Participant)")
            
            if(side == "-side") f <- paste0("Illusion_Side * ", f)
            
            m <- glmmTMB::glmmTMB(as.formula(paste0("Error ~ ", f)), 
                                  data = data, family = "binomial")
            if(performance::check_convergence(m)) {
              models[[name]] <- m
            }
          }
        }
      }
    }
  }

  to_keep <- compare_performance(models, metrics = c("BIC")) |> 
    arrange(BIC) |> 
    slice(1:10) |> 
    pull(Name)
  
  
  test <- test_performance(models[to_keep], reference=1)
  perf <- compare_performance(models[to_keep], metrics = c("BIC", "R2")) 
  
  merge(perf, test) |> 
    arrange(BIC) |> 
    select(Name, BIC, R2_marginal, BF) |> 
    mutate(BF = insight::format_bf(BF, name=""))
}

best_models(data)
##                  Name  BIC R2_marginal       BF
## 1      dif_cbrt1-str1 3305       0.392         
## 2      dif_sqrt1-str1 3307       0.400  = 0.353
## 3       dif_log1-str1 3317       0.412  = 0.003
## 4 dif_cbrt1-str1-side 3330       0.394  < 0.001
## 5 dif_sqrt1-str1-side 3332       0.401  < 0.001
## 6           dif1-str1 3333       0.424  < 0.001
## 7  dif_log1-str1-side 3341       0.414  < 0.001
## 8      dif1-str1-side 3356       0.426  < 0.001

Model Visualization

cbrt <- function(x) sign(x) * abs(x)**(1/3)  # signed cube root (handles negative inputs, unlike x^(1/3))

formula <- brms::bf(
  Error ~ cbrt(Illusion_Difference) * Illusion_Strength + 
    (1 | Participant),
  family = "bernoulli"
)

model <- brms::brm(formula,
  data = data,
  refresh = 0
)
## Running MCMC with 4 parallel chains...
## 
## Chain 4 finished in 12.3 seconds.
## Chain 2 finished in 13.7 seconds.
## Chain 1 finished in 13.8 seconds.
## Chain 3 finished in 14.1 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 13.5 seconds.
## Total execution time: 14.2 seconds.

# parameters::parameters(model)
plot_model <- function(data, model) {
  data <- mutate(data, .dots_side = ifelse(Error == 1, "bottom", "top"))
  
  # Get variables
  vars <- insight::find_predictors(model)$conditional
  vardiff <- vars[1]
  varstrength <- vars[2]
  
  # Get predicted
  pred <- estimate_relation(model,
                            at = vars,
                            length = c(NA, 25))
  pred[[vardiff]] <- as.factor(pred[[vardiff]])
  
  # Set colors for lines
  colors <- colorRampPalette(c("#F44336", "#FFC107", "#4CAF50"))(length(unique(data[[vardiff]])))
  diffvals <- as.numeric(as.character(unique(sort(pred[[vardiff]]))))
  names(colors) <- diffvals
  
  # Assign color from the same palette to every observation of data (for geom_dots)
  closest <- diffvals[max.col(-abs(outer(data[[vars[1]]], diffvals, "-")))]
  data$color <- colors[as.character(closest)]
  data$color <- fct_reorder(data$color, closest)
  
  # Manual jittering
  xrange <- 0.05*diff(range(data[[varstrength]]))
  data$x <- data[[varstrength]]
  data$x[data$x > 0] <- data$x[data$x > 0] - runif(sum(data$x > 0), 0, xrange)
  data$x[data$x < 0] <- data$x[data$x < 0] + runif(sum(data$x < 0), 0, xrange)
  data$x[round(data$x, 2) == 0] <- data$x[round(data$x, 2) == 0] + runif(sum(round(data$x, 2) == 0), -xrange/2, xrange/2)
  
  
  pred |>
    ggplot(aes_string(x = varstrength, y = "Predicted")) +
    geom_dots(
      data = data,
      aes(x=x, y = Error, group = Error, side = .dots_side, order=color), 
      fill = data$color,
      color = NA, 
      alpha=0.5) +
    geom_slab(data=data, aes(y = Error)) +
    geom_ribbon(aes_string(ymin = "CI_low", ymax = "CI_high", fill = vardiff, group = vardiff), alpha = 0.2) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    geom_hline(yintercept = c(0.05, 0.5, 0.95), linetype = "dotted", alpha = 0.5) +
    geom_line(aes_string(color = vardiff, group = vardiff)) +
    scale_y_continuous(limits = c(0, 1), expand = c(0, 0), labels = scales::percent) +
    scale_x_continuous(expand = c(0, 0)) +
    scale_color_manual(values = colors) +
    scale_fill_manual(values = colors) +
    coord_cartesian(xlim=c(min(data[[varstrength]]), max(data[[varstrength]]))) +
    theme_modern() +
    labs(
      title = paste0(data$Illusion_Type[1], " Illusion"),
      color = "Difficulty", fill = "Difficulty",
      y = "Probability of Error",
      x = "Illusion Strength"
    ) +
    theme(plot.title = element_text(face = "bold", hjust = 0.5))
}

plot_model(data, model)

GAM

make_gam <- function(data) {
  
  # Bayesian GAM with a tensor-product smooth (cubic regression splines)
  # of task difficulty and illusion strength
  formula <- brms::bf(
    Error ~ t2(Illusion_Difference, Illusion_Strength, bs = "cr", k=4) + 
      (1 | Participant),
    family = "bernoulli"
  )

  model <- brms::brm(formula,
    data = data,
    refresh = 0
  )
  
  list(p = plot_model(data, model), model = model)
}

gam <- make_gam(data)
## Running MCMC with 4 parallel chains...
## 
## Chain 3 finished in 24.7 seconds.
## Chain 4 finished in 27.2 seconds.
## Chain 1 finished in 31.1 seconds.
## Chain 2 finished in 33.4 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 29.1 seconds.
## Total execution time: 33.5 seconds.
gam$p

# Parameter Standardization

std_params <- function(model, min=0, max=2) {
  # Predict the error rate over a fine grid of difficulty values at zero
  # illusion strength, then keep the difficulty values whose predicted error
  # rate is closest to the 0.5%, 2.5% and 25% targets
  estimate_relation(
    model,
    at = list(Illusion_Strength = c(0), 
              Illusion_Difference = seq(min, max, length.out=500))
  ) |> 
    select(Illusion_Strength, Illusion_Difference, Error = Predicted) |> 
    slice(c(which.min(abs(Error - 0.005)), 
            which.min(abs(Error - 0.025)), 
            which.min(abs(Error - 0.25)))) |> 
    mutate(Error = insight::format_value(Error, as_percent=TRUE))
}
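
In effect, std_params() inverts the fitted psychometric curve at zero illusion strength via a grid search over 500 difficulty values, approximately solving:

$$D^{*}(p) = \arg\min_{D \in [\min,\, \max]} \left| \hat{P}(\mathrm{Error} \mid D,\, S = 0) - p \right|, \qquad p \in \{0.005, 0.025, 0.25\}$$

yielding the difficulty values expected to produce these target error rates, which can then be used to calibrate the stimulus ranges of the next study.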

# range(data$Illusion_Difference)
# range(data$Illusion_Strength)

std_params(model, min=0, max=2)
std_params(gam$model, min=0, max=2)

Ebbinghaus

Descriptive

data <- filter(df, Illusion_Type == "Ebbinghaus")

plot_descriptive(data)

Model Selection

best_models(data)
##                  Name  BIC R2_marginal       BF
## 1      dif_cbrt1-str1 2212       0.615         
## 2      dif_sqrt1-str1 2215       0.615  = 0.199
## 3 dif_cbrt1-str1-side 2227       0.623  < 0.001
## 4 dif_sqrt1-str1-side 2230       0.623  < 0.001
## 5       dif_log1-str1 2232       0.617  < 0.001
## 6           dif1-str1 2258       0.619  < 0.001

Model Visualization

formula <- brms::bf(
  Error ~ cbrt(Illusion_Difference) * Illusion_Strength + 
    (1 | Participant),
  family = "bernoulli"
)

model <- brms::brm(formula,
  data = data,
  refresh = 0
)
## Running MCMC with 4 parallel chains...
## 
## Chain 4 finished in 16.5 seconds.
## Chain 2 finished in 16.7 seconds.
## Chain 1 finished in 17.0 seconds.
## Chain 3 finished in 17.8 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 17.0 seconds.
## Total execution time: 17.9 seconds.

# parameters::parameters(model)
plot_model(data, model)

GAM

gam <- make_gam(data)
## Running MCMC with 4 parallel chains...
## 
## Chain 4 finished in 30.6 seconds.
## Chain 1 finished in 31.7 seconds.
## Chain 2 finished in 31.6 seconds.
## Chain 3 finished in 35.3 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 32.3 seconds.
## Total execution time: 35.4 seconds.
gam$p

# Parameter Standardization

# range(data$Illusion_Difference)
# range(data$Illusion_Strength)

std_params(model, min=0, max=2)
std_params(gam$model, min=0, max=2)

Rod and Frame

Descriptive

data <- filter(df, Illusion_Type == "Rod-Frame")

plot_descriptive(data)


data <- filter(data, abs(Illusion_Strength) < 15)

plot_descriptive(data)

Model Selection

best_models(data)
##                  Name  BIC R2_marginal       BF
## 1      dif_sqrt1-str1 2070       0.464         
## 2       dif_log1-str1 2072       0.458  = 0.371
## 3      dif_cbrt1-str1 2075       0.452  = 0.063
## 4 dif_sqrt1-str1-side 2095       0.478  < 0.001
## 5           dif1-str1 2095       0.505  < 0.001
## 6  dif_log1-str1-side 2097       0.471  < 0.001
## 7 dif_cbrt1-str1-side 2101       0.465  < 0.001
## 8      dif1-str1-side 2119       0.525  < 0.001

Model Visualization

formula <- brms::bf(
  Error ~ sqrt(Illusion_Difference) * poly(Illusion_Strength, 2) + 
    (1 | Participant),
  family = "bernoulli"
)

model <- brms::brm(formula,
  data = data,
  refresh = 0
)
## Running MCMC with 4 parallel chains...
## 
## Chain 3 finished in 6.3 seconds.
## Chain 2 finished in 6.8 seconds.
## Chain 1 finished in 7.0 seconds.
## Chain 4 finished in 7.4 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 6.9 seconds.
## Total execution time: 7.5 seconds.

# parameters::parameters(model)
plot_model(data, model)

GAM

gam <- make_gam(data)
## Running MCMC with 4 parallel chains...
## 
## Chain 1 finished in 15.6 seconds.
## Chain 3 finished in 17.2 seconds.
## Chain 2 finished in 17.7 seconds.
## Chain 4 finished in 17.9 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 17.1 seconds.
## Total execution time: 18.0 seconds.
gam$p

# Parameter Standardization

# range(data$Illusion_Difference)
# range(data$Illusion_Strength)

std_params(model, min=0.01, max=12)
std_params(gam$model, min=0.01, max=12)

Vertical-Horizontal

Descriptive

data <- filter(df, Illusion_Type == "Vertical-Horizontal")

plot_descriptive(data)


data <- filter(data, abs(Illusion_Strength) < 90)

plot_descriptive(data)

Model Selection

best_models(data)
##                  Name  BIC R2_marginal       BF
## 1      dif_sqrt1-str1 2011       0.661         
## 2      dif_cbrt1-str1 2014       0.657  = 0.234
## 3       dif_log1-str1 2017       0.674  = 0.050
## 4           dif1-str1 2022       0.679  = 0.004
## 5 dif_sqrt1-str1-side 2031       0.665  < 0.001
## 6 dif_cbrt1-str1-side 2034       0.662  < 0.001
## 7  dif_log1-str1-side 2038       0.677  < 0.001
## 8      dif1-str1-side 2043       0.681  < 0.001

Model Visualization

formula <- brms::bf(
  Error ~ sqrt(Illusion_Difference) * Illusion_Strength + 
    (1 | Participant),
  family = "bernoulli"
)

model <- brms::brm(formula,
  data = data,
  refresh = 0
)
## Running MCMC with 4 parallel chains...
## 
## Chain 4 finished in 22.6 seconds.
## Chain 3 finished in 23.3 seconds.
## Chain 2 finished in 24.0 seconds.
## Chain 1 finished in 24.2 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 23.5 seconds.
## Total execution time: 24.3 seconds.

# parameters::parameters(model)
plot_model(data, model)

GAM

gam <- make_gam(data)
## Running MCMC with 4 parallel chains...
## 
## Chain 1 finished in 22.7 seconds.
## Chain 2 finished in 22.9 seconds.
## Chain 3 finished in 23.0 seconds.
## Chain 4 finished in 23.3 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 23.0 seconds.
## Total execution time: 23.5 seconds.
gam$p

# Parameter Standardization

# range(data$Illusion_Difference)
# range(data$Illusion_Strength)

std_params(model, min=0, max=0.40)
std_params(gam$model, min=0, max=0.40)

Zöllner

Descriptive

data <- filter(df, Illusion_Type == "Zöllner")

plot_descriptive(data)


data <- filter(data, abs(Illusion_Strength) < 45)

plot_descriptive(data)

Model Selection

best_models(data)
##                  Name  BIC R2_marginal       BF
## 1      dif_cbrt1-str1 1041       0.405         
## 2       dif_log1-str1 1043       0.414  = 0.464
## 3      dif_sqrt1-str1 1044       0.423  = 0.260
## 4           dif1-str1 1065       0.491  < 0.001
## 5 dif_cbrt1-str1-side 1066       0.409  < 0.001
## 6  dif_log1-str1-side 1068       0.418  < 0.001
## 7 dif_sqrt1-str1-side 1069       0.427  < 0.001
## 8      dif1-str1-side 1090       0.494  < 0.001

Model Visualization

formula <- brms::bf(
  Error ~ cbrt(Illusion_Difference) * Illusion_Strength + 
    (1 | Participant),
  family = "bernoulli"
)

model <- brms::brm(formula,
  data = data,
  refresh = 0
)
## Running MCMC with 4 parallel chains...
## 
## Chain 2 finished in 9.7 seconds.
## Chain 3 finished in 9.8 seconds.
## Chain 1 finished in 9.9 seconds.
## Chain 4 finished in 10.3 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 9.9 seconds.
## Total execution time: 10.4 seconds.

# parameters::parameters(model)
plot_model(data, model)

Parameter Standardization

# Parameter Standardization

# range(data$Illusion_Difference)
# range(data$Illusion_Strength)

std_params(model, min=0, max=5)

White

Descriptive

data <- filter(df, Illusion_Type == "White")

plot_descriptive(data)

Model Selection

best_models(data)
##                  Name  BIC R2_marginal       BF
## 1       dif_log1-str1 2176       0.783         
## 2      dif_cbrt1-str1 2178       0.784  = 0.404
## 3      dif_sqrt1-str1 2181       0.784  = 0.101
## 4  dif_log1-str1-side 2188       0.788  = 0.003
## 5 dif_cbrt1-str1-side 2190       0.789  = 0.001
## 6 dif_sqrt1-str1-side 2192       0.790  < 0.001
## 7           dif1-str1 2195       0.787  < 0.001
## 8      dif1-str1-side 2206       0.793  < 0.001

Model Visualization

formula <- brms::bf(
  Error ~ log(Illusion_Difference) * poly(Illusion_Strength, 2) + 
    (1 | Participant),
  family = "bernoulli"
)

model <- brms::brm(formula,
  data = data,
  refresh = 0
)
## Running MCMC with 4 parallel chains...
## 
## Chain 1 finished in 73.3 seconds.
## Chain 3 finished in 84.4 seconds.
## Chain 4 finished in 89.1 seconds.
## Chain 2 finished in 93.7 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 85.1 seconds.
## Total execution time: 93.9 seconds.

# parameters::parameters(model)
plot_model(data, model)

GAM

gam <- make_gam(data)
## Running MCMC with 4 parallel chains...
## 
## Chain 1 finished in 36.6 seconds.
## Chain 3 finished in 45.0 seconds.
## Chain 4 finished in 45.1 seconds.
## Chain 2 finished in 45.7 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 43.1 seconds.
## Total execution time: 45.9 seconds.
gam$p

# range(data$Illusion_Difference)
# range(data$Illusion_Strength)
# unique(data$Illusion_Strength)

std_params(model, min=0, max=20)
std_params(gam$model, min=0, max=20)

Müller-Lyer

Descriptive

data <- filter(df, Illusion_Type == "Müller-Lyer")

plot_descriptive(data, side = "updown")

Model Selection

best_models(data)
## Random effect variances not available. Returned R2 does not account for random effects.
## Random effect variances not available. Returned R2 does not account for random effects.
##                  Name  BIC R2_marginal       BF
## 1      dif_cbrt1-str1 2449       0.749         
## 2  dif_log1-str1-side 2534       0.748  < 0.001
## 3 dif_cbrt1-str1-side 2535       0.757  < 0.001

Model Visualization

formula <- brms::bf(
  Error ~ log(Illusion_Difference) * poly(Illusion_Strength, 2) + 
    (1 | Participant),
  family = "bernoulli"
)

model <- brms::brm(formula,
  data = data,
  refresh = 0
)
## Running MCMC with 4 parallel chains...
## 
## Chain 3 finished in 17.2 seconds.
## Chain 2 finished in 18.1 seconds.
## Chain 4 finished in 20.2 seconds.
## Chain 1 finished in 20.7 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 19.1 seconds.
## Total execution time: 20.8 seconds.

# parameters::parameters(model)
plot_model(data, model)

GAM

gam <- make_gam(data)
## Running MCMC with 4 parallel chains...
## 
## Chain 4 finished in 32.0 seconds.
## Chain 1 finished in 32.3 seconds.
## Chain 2 finished in 32.5 seconds.
## Chain 3 finished in 32.7 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 32.4 seconds.
## Total execution time: 32.7 seconds.
gam$p

# range(data$Illusion_Difference)
# range(data$Illusion_Strength)

std_params(model, min=0, max=0.6)
std_params(gam$model, min=0, max=0.6)

Ponzo

Descriptive

data <- filter(df, Illusion_Type == "Ponzo")

plot_descriptive(data, side = "updown")

Model Selection

best_models(data)
##                 Name  BIC R2_marginal       BF
## 1     dif_cbrt1-str1 2551       0.661         
## 2     dif_sqrt1-str1 2555       0.665  = 0.187
## 3      dif_log1-str1 2576       0.678  < 0.001
## 4          dif1-str1 2590       0.684  < 0.001
## 5 dif_log1-str1-side 2592       0.687  < 0.001
## 6     dif1-str1-side 2606       0.694  < 0.001

Model Visualization

formula <- brms::bf(
  Error ~ poly(Illusion_Difference, 2) * poly(Illusion_Strength, 2) + 
    (1 | Participant),
  family = "bernoulli"
)

model <- brms::brm(formula,
  data = data,
  refresh = 0
)
## Running MCMC with 4 parallel chains...
## 
## Chain 2 finished in 12.6 seconds.
## Chain 1 finished in 12.8 seconds.
## Chain 3 finished in 13.8 seconds.
## Chain 4 finished in 15.0 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 13.5 seconds.
## Total execution time: 15.0 seconds.

# parameters::parameters(model)
plot_model(data, model)

GAM

gam <- make_gam(data)
## Running MCMC with 4 parallel chains...
## 
## Chain 2 finished in 29.8 seconds.
## Chain 3 finished in 29.9 seconds.
## Chain 1 finished in 30.3 seconds.
## Chain 4 finished in 31.5 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 30.4 seconds.
## Total execution time: 31.7 seconds.
gam$p

# range(data$Illusion_Difference)
# range(data$Illusion_Strength)

std_params(model, min=0, max=20)
std_params(gam$model, min=0, max=20)

Poggendorff

Descriptive

data <- filter(df, Illusion_Type == "Poggendorff")

plot_descriptive(data, side = "updown")


data <- filter(data, abs(Illusion_Strength) < 45)

plot_descriptive(data, side = "updown")

Model Selection

best_models(data)
##                  Name  BIC R2_marginal       BF
## 1      dif_cbrt1-str1 1900       0.527         
## 2      dif_sqrt1-str1 1905       0.526  = 0.086
## 3       dif_log1-str1 1923       0.522  < 0.001
## 4 dif_cbrt1-str1-side 1929       0.529  < 0.001
## 5           dif1-str1 1930       0.521  < 0.001
## 6 dif_sqrt1-str1-side 1934       0.529  < 0.001
## 7  dif_log1-str1-side 1952       0.527  < 0.001
## 8      dif1-str1-side 1958       0.526  < 0.001

Model Visualization

formula <- brms::bf(
  Error ~ cbrt(Illusion_Difference) * poly(Illusion_Strength, 2) + 
    (1 | Participant),
  family = "bernoulli"
)

model <- brms::brm(formula,
  data = data,
  refresh = 0
)
## Running MCMC with 4 parallel chains...
## 
## Chain 3 finished in 11.5 seconds.
## Chain 1 finished in 11.6 seconds.
## Chain 2 finished in 11.6 seconds.
## Chain 4 finished in 11.8 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 11.6 seconds.
## Total execution time: 11.9 seconds.

# parameters::parameters(model)
plot_model(data, model)

GAM

gam <- make_gam(data)
## Running MCMC with 4 parallel chains...
## 
## Chain 4 finished in 18.0 seconds.
## Chain 3 finished in 19.5 seconds.
## Chain 1 finished in 20.2 seconds.
## Chain 2 finished in 20.6 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 19.6 seconds.
## Total execution time: 20.7 seconds.
gam$p

# range(data$Illusion_Difference)
# range(data$Illusion_Strength)

std_params(model, min=0, max=0.5)
std_params(gam$model, min=0, max=0.5)

Contrast

Descriptive

data <- filter(df, Illusion_Type == "Contrast")

plot_descriptive(data, side = "updown")

Model Selection

best_models(data)
## Random effect variances not available. Returned R2 does not account for random effects.
## Random effect variances not available. Returned R2 does not account for random effects.
##                  Name  BIC R2_marginal       BF
## 1      dif_sqrt1-str1 2353       0.673         
## 2      dif_cbrt1-str1 2354       0.673  = 0.713
## 3       dif_log1-str1 2357       0.672  = 0.128
## 4           dif1-str1 2359       0.675  = 0.047
## 5 dif_cbrt1-str1-side 2580       0.685  < 0.001
## 6      dif1-str1-side 2585       0.685  < 0.001

Model Visualization

formula <- brms::bf(
  Error ~ Illusion_Difference * poly(Illusion_Strength, 2) + 
    (1 | Participant),
  family = "bernoulli"
)

model <- brms::brm(formula,
  data = data,
  refresh = 0
)
## Running MCMC with 4 parallel chains...
## 
## Chain 1 finished in 21.4 seconds.
## Chain 4 finished in 22.1 seconds.
## Chain 2 finished in 22.5 seconds.
## Chain 3 finished in 22.8 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 22.2 seconds.
## Total execution time: 22.9 seconds.
plot_model(data, model)

GAM

gam <- make_gam(data)
## Running MCMC with 4 parallel chains...
## 
## Chain 1 finished in 31.7 seconds.
## Chain 3 finished in 31.6 seconds.
## Chain 4 finished in 32.2 seconds.
## Chain 2 finished in 33.5 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 32.2 seconds.
## Total execution time: 33.6 seconds.
gam$p

# Parameter Standardization

# range(data$Illusion_Difference)
# range(data$Illusion_Strength)

std_params(model, min=0, max=25)
std_params(gam$model, min=0, max=25)

Discussion

This pilot study provided a clearer understanding of the magnitude of the parametric effects at stake and of the type of interaction between them. Furthermore, it allowed us to better understand and test the stimuli generated by Pyllusion, and to uncover incidental bugs and technical issues (for instance, the direction of the illusion strength specification was reversed for a few illusions), which were fixed in a new software release. Crucially, this study allowed us to refine the ranges of task difficulty and illusion strength values in order to maximize information gain.

For most illusions, task difficulty exhibited monotonic, power-law-scaled effects, in line with the psychophysics literature on perceptual decisions (Ditzinger 2010; Shekhar and Rahnev 2021; Bogacz et al. 2006). One notable result was the illusion effect pattern of the Zöllner illusion, which suggested a non-linear relationship. By generating a wider range of illusion strength values, the next study will attempt to clarify this point.

References

Bogacz, Rafal, Eric Brown, Jeff Moehlis, Philip Holmes, and Jonathan D. Cohen. 2006. “The Physics of Optimal Decision Making: A Formal Analysis of Models of Performance in Two-Alternative Forced-Choice Tasks.” Psychological Review 113 (4): 700–765. https://doi.org/10.1037/0033-295X.113.4.700.
Bürkner, Paul-Christian. 2017. “brms: An R Package for Bayesian Multilevel Models Using Stan.” Journal of Statistical Software 80 (1): 1–28. https://doi.org/10.18637/jss.v080.i01.
De Leeuw, Joshua R. 2015. “jsPsych: A JavaScript Library for Creating Behavioral Experiments in a Web Browser.” Behavior Research Methods 47 (1): 1–12.
Ditzinger, Thomas. 2010. “Optical Illusions: Examples for Nonlinear Dynamics in Perception.” In, 328:179–91.
Lüdecke, Daniel, Mattan Ben-Shachar, Indrajeet Patil, Philip Waggoner, and Dominique Makowski. 2021. “performance: An R Package for Assessment, Comparison and Testing of Statistical Models.” Journal of Open Source Software 6 (60): 3139. https://doi.org/10.21105/joss.03139.
Lüdecke, Daniel, Philip Waggoner, and Dominique Makowski. 2019. “insight: A Unified Interface to Access Information from Model Objects in R.” Journal of Open Source Software 4 (38): 1412. https://doi.org/10.21105/joss.01412.
Makowski, Dominique, Mattan Ben-Shachar, and Daniel Lüdecke. 2019. “bayestestR: Describing Effects and Their Uncertainty, Existence and Significance Within the Bayesian Framework.” Journal of Open Source Software 4 (40): 1541. https://doi.org/10.21105/joss.01541.
Makowski, Dominique, Mattan Ben-Shachar, Indrajeet Patil, and Daniel Lüdecke. 2020. “Methods and Algorithms for Correlation Analysis in R.” Journal of Open Source Software 5 (51): 2306. https://doi.org/10.21105/joss.02306.
Peer, Eyal, David Rothschild, Andrew Gordon, Zak Evernden, and Ekaterina Damer. 2022. “Data Quality of Platforms and Panels for Online Behavioral Research.” Behavior Research Methods 54 (4): 1643–62. https://doi.org/10.3758/s13428-021-01694-3.
R Core Team. 2022. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Shekhar, Medha, and Dobromir Rahnev. 2021. “The Nature of Metacognitive Inefficiency in Perceptual Decision Making.” Psychological Review 128 (1): 45–70. https://doi.org/10.1037/rev0000249.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the Tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.