library(here)         # file paths relative to the project root
library(tidyverse)    # data wrangling and plotting
library(lme4)         # frequentist mixed-effects models
library(BayesFactor)  # Bayes factor model comparisons
library(brms)         # Bayesian regression models via Stan
frames <- read_csv(here("data", "data_samplesize.csv"))  # data from the sampling frames experiment

If you’d done the sampling frames experiment, which analyses would you actually report in a paper? Here we’ll work through one frequentist approach and two Bayesian approaches.

Load and plot data

fullframes <- frames %>% 
  # rescale responses so they lie strictly between 0 and 1 (needed for the logit-link and beta models below)
  mutate(generalisation = (response + .1) / 9.2) %>% 
  mutate(id = factor(id)) %>% 
  mutate(sample_size = factor(sample_size, levels = c("small", "medium", "large")))

fullframes_avg <- fullframes %>%
  group_by(test_item, condition, sample_size) %>%
  summarise(
    n = n(),
    sd = sd(generalisation), 
    se = sd / sqrt(n),
    generalisation = mean(generalisation)
  ) %>%
  ungroup()

expsummary <- fullframes_avg %>%
  ggplot(aes(x = test_item, y = generalisation, colour = condition)) +
  geom_line() +
  geom_point() +
  geom_errorbar(aes(ymin = generalisation - se, ymax = generalisation + se)) +
  facet_wrap(~sample_size)
plot(expsummary)

Eyeballing the data, it seems clear that

  1. responses are higher overall for the category than the property condition
  2. responses decrease as test_item increases
  3. there is an interaction between n_obs and test_item (as n_obs increases, the difference between small and large test items increases)
  4. there is an interaction between test_item and condition (as test_item increases, the difference between category and property sampling increases)
  5. there is an interaction between n_obs and condition (as n_obs increases, the difference between category and property sampling increases)
  6. there is a three-way interaction between n_obs, test_item and condition (as n_obs increases, the interaction between test_item and condition becomes more pronounced)

On the other hand, it’s not clear whether

  1. the average response increases or decreases as n_obs increases

Frequentist and Bayesian methods can both be used to test these qualitative impressions.

Frequentist approach

The day4/statistics.Rmd notebook ended up finding that a logistic regression model with the formula

generalisation ~ condition * test_item * n_obs + (1 + test_item * n_obs|id) 

was the best at capturing the responses of individual participants. Here, however, we’ll use a model whose only random effect is a by-participant intercept

generalisation ~ condition * test_item * n_obs + (1|id) 

The more elaborate version of the model would be a better choice in some circumstances. For example, if the project focused on individual differences and aimed to develop a cognitive model that accounts for responses at the individual level, the richer random-effects structure would probably be worth the extra complexity. Here we use the simpler version because it is easier for readers to understand, and because it parallels the Bayes factor approach that the authors used in the published paper.
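
For reference, here is a sketch of how the fuller model would be fit (not evaluated here; it is slower and typically more prone to convergence problems, and the object name is just illustrative):

logitmod_slopes <- glmer(
  formula = generalisation ~ condition * test_item * n_obs + (1 + test_item * n_obs|id), 
  family = gaussian(link = "logit"), 
  data = fullframes)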

First we fit the model:

logitmod <- glmer(
  formula = generalisation ~ condition * test_item * n_obs + (1|id), 
  family = gaussian(link = "logit"), 
  data = fullframes)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 1.26663 (tol = 0.001, component 1)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?

There is a convergence warning, so we use allFit() to refit the model with a range of different optimizers and check whether they agree.

af <- allFit(logitmod)
## Loading required namespace: dfoptim
## bobyqa :
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.0017224 (tol = 0.001, component 1)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?
## [OK]
## Nelder_Mead :
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model failed to converge with max|grad| = 0.0555489 (tol = 0.001, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?
## [OK]
## nlminbwrap :
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?
## [OK]
## nmkbw : [OK]
## optimx.L-BFGS-B : [OK]
## nloptwrap.NLOPT_LN_NELDERMEAD :
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model failed to converge with max|grad| = 0.033128 (tol = 0.001, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?
## [OK]
## nloptwrap.NLOPT_LN_BOBYQA :
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model failed to converge with max|grad| = 0.033128 (tol = 0.001, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?
## [OK]
summary(af)
## $which.OK
##                        bobyqa                   Nelder_Mead 
##                          TRUE                          TRUE 
##                    nlminbwrap                         nmkbw 
##                          TRUE                         FALSE 
##               optimx.L-BFGS-B nloptwrap.NLOPT_LN_NELDERMEAD 
##                         FALSE                          TRUE 
##     nloptwrap.NLOPT_LN_BOBYQA 
##                          TRUE 
## 
## $msgs
## $msgs$bobyqa
## $msgs$bobyqa[[1]]
## [1] "Model failed to converge with max|grad| = 0.0017224 (tol = 0.001, component 1)"
## 
## $msgs$bobyqa[[2]]
## [1] "Model is nearly unidentifiable: very large eigenvalue\n - Rescale variables?"
## 
## 
## $msgs$Nelder_Mead
## $msgs$Nelder_Mead[[1]]
## [1] "Model failed to converge with max|grad| = 0.0555489 (tol = 0.001, component 1)"
## 
## $msgs$Nelder_Mead[[2]]
## [1] "Model is nearly unidentifiable: very large eigenvalue\n - Rescale variables?"
## 
## 
## $msgs$nlminbwrap
## $msgs$nlminbwrap[[1]]
## [1] "Model is nearly unidentifiable: very large eigenvalue\n - Rescale variables?"
## 
## 
## $msgs$nloptwrap.NLOPT_LN_NELDERMEAD
## $msgs$nloptwrap.NLOPT_LN_NELDERMEAD[[1]]
## [1] "Model failed to converge with max|grad| = 0.033128 (tol = 0.001, component 1)"
## 
## $msgs$nloptwrap.NLOPT_LN_NELDERMEAD[[2]]
## [1] "Model is nearly unidentifiable: very large eigenvalue\n - Rescale variables?"
## 
## 
## $msgs$nloptwrap.NLOPT_LN_BOBYQA
## $msgs$nloptwrap.NLOPT_LN_BOBYQA[[1]]
## [1] "Model failed to converge with max|grad| = 0.033128 (tol = 0.001, component 1)"
## 
## $msgs$nloptwrap.NLOPT_LN_BOBYQA[[2]]
## [1] "Model is nearly unidentifiable: very large eigenvalue\n - Rescale variables?"
## 
## 
## 
## $fixef
##                               (Intercept) conditionproperty  test_item
## bobyqa                           1.822592        -0.5863202 -0.3072154
## Nelder_Mead                      1.822078        -0.5859371 -0.3071836
## nlminbwrap                       1.822625        -0.5863583 -0.3072184
## nloptwrap.NLOPT_LN_NELDERMEAD    1.823458        -0.5875537 -0.3073020
## nloptwrap.NLOPT_LN_BOBYQA        1.823458        -0.5875537 -0.3073020
##                                    n_obs conditionproperty:test_item
## bobyqa                        0.05741118                  0.09228230
## Nelder_Mead                   0.05743683                  0.09221985
## nlminbwrap                    0.05740964                  0.09228586
## nloptwrap.NLOPT_LN_NELDERMEAD 0.05735635                  0.09240829
## nloptwrap.NLOPT_LN_BOBYQA     0.05735635                  0.09240829
##                               conditionproperty:n_obs test_item:n_obs
## bobyqa                                     0.06365571    -0.007968392
## Nelder_Mead                                0.06361331    -0.007974932
## nlminbwrap                                 0.06365757    -0.007968091
## nloptwrap.NLOPT_LN_NELDERMEAD              0.06373498    -0.007957674
## nloptwrap.NLOPT_LN_BOBYQA                  0.06373498    -0.007957674
##                               conditionproperty:test_item:n_obs
## bobyqa                                              -0.03361771
## Nelder_Mead                                         -0.03360303
## nlminbwrap                                          -0.03361808
## nloptwrap.NLOPT_LN_NELDERMEAD                       -0.03363252
## nloptwrap.NLOPT_LN_BOBYQA                           -0.03363252
## 
## $llik
##                        bobyqa                   Nelder_Mead 
##                      27.87748                      27.87747 
##                    nlminbwrap nloptwrap.NLOPT_LN_NELDERMEAD 
##                      27.87748                      27.87746 
##     nloptwrap.NLOPT_LN_BOBYQA 
##                      27.87746 
## 
## $sdcor
##                               id.(Intercept)     sigma
## bobyqa                             0.5934868 0.2394024
## Nelder_Mead                        0.5934220 0.2394028
## nlminbwrap                         0.5934865 0.2394025
## nloptwrap.NLOPT_LN_NELDERMEAD      0.5934470 0.2394051
## nloptwrap.NLOPT_LN_BOBYQA          0.5934470 0.2394051
## 
## $theta
##                               id.(Intercept)
## bobyqa                              2.479034
## Nelder_Mead                         2.478760
## nlminbwrap                          2.479032
## nloptwrap.NLOPT_LN_NELDERMEAD       2.478840
## nloptwrap.NLOPT_LN_BOBYQA           2.478840
## 
## $times
##                               user.self sys.self elapsed user.child sys.child
## bobyqa                           35.368    0.736  36.182          0         0
## Nelder_Mead                      26.224    0.506  26.773          0         0
## nlminbwrap                        4.427    0.099   4.546          0         0
## nloptwrap.NLOPT_LN_NELDERMEAD     2.876    0.073   2.970          0         0
## nloptwrap.NLOPT_LN_BOBYQA         2.842    0.066   2.922          0         0
## 
## $feval
##                        bobyqa                   Nelder_Mead 
##                          6162                          4564 
##                    nlminbwrap nloptwrap.NLOPT_LN_NELDERMEAD 
##                            NA                           284 
##     nloptwrap.NLOPT_LN_BOBYQA 
##                           284 
## 
## attr(,"class")
## [1] "summary.allFit"

We get essentially the same estimates using several different optimization methods, so it looks like we’re OK despite the convergence warning. We’ll now compare the full model to a set of reduced models, each of which drops one of the terms in the formula.

logitmod_no_condition <- glmer(
  formula = generalisation ~ test_item + n_obs + condition:test_item + condition:n_obs + test_item:n_obs + condition:test_item:n_obs + (1|id), 
  family = gaussian(link = "logit"), 
  data = fullframes)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.00152743 (tol = 0.001, component 1)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?
anova(logitmod, logitmod_no_condition)
## Data: fullframes
## Models:
## logitmod_no_condition: generalisation ~ test_item + n_obs + condition:test_item + condition:n_obs + 
## logitmod_no_condition:     test_item:n_obs + condition:test_item:n_obs + (1 | id)
## logitmod: generalisation ~ condition * test_item * n_obs + (1 | id)
##                       Df     AIC    BIC logLik deviance  Chisq Chi Df
## logitmod_no_condition  9 -32.198 25.948 25.099  -50.198              
## logitmod              10 -35.732 28.875 27.866  -55.732 5.5338      1
##                       Pr(>Chisq)  
## logitmod_no_condition             
## logitmod                 0.01865 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
logitmod_no_test_item <- glmer(
  formula = generalisation ~ condition + n_obs + condition:test_item + condition:n_obs + test_item:n_obs + condition:test_item:n_obs + (1|id), 
  family = gaussian(link = "logit"), 
  data = fullframes)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model failed to converge with max|grad| = 0.435203 (tol = 0.001, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?
anova(logitmod, logitmod_no_test_item)
## Data: fullframes
## Models:
## logitmod: generalisation ~ condition * test_item * n_obs + (1 | id)
## logitmod_no_test_item: generalisation ~ condition + n_obs + condition:test_item + condition:n_obs + 
## logitmod_no_test_item:     test_item:n_obs + condition:test_item:n_obs + (1 | id)
##                       Df     AIC    BIC logLik deviance  Chisq Chi Df
## logitmod              10 -35.732 28.875 27.866  -55.732              
## logitmod_no_test_item 10 -35.734 28.872 27.867  -55.734 0.0029      0
##                       Pr(>Chisq)    
## logitmod                            
## logitmod_no_test_item  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
logitmod_no_n_obs <- glmer(
  formula = generalisation ~ condition + test_item + condition:test_item + condition:n_obs + test_item:n_obs + condition:test_item:n_obs + (1|id), 
  family = gaussian(link = "logit"), 
  data = fullframes)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model failed to converge with max|grad| = 0.0631325 (tol = 0.001, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?
anova(logitmod, logitmod_no_n_obs)
## Data: fullframes
## Models:
## logitmod: generalisation ~ condition * test_item * n_obs + (1 | id)
## logitmod_no_n_obs: generalisation ~ condition + test_item + condition:test_item + 
## logitmod_no_n_obs:     condition:n_obs + test_item:n_obs + condition:test_item:n_obs + 
## logitmod_no_n_obs:     (1 | id)
##                   Df     AIC    BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## logitmod          10 -35.732 28.875 27.866  -55.732                        
## logitmod_no_n_obs 10 -35.365 29.241 27.683  -55.365     0      0          1
# TODO: could add 4 more reduced models, one for each of the interaction terms
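
As a sketch of what one of those comparisons might look like, here is a test of the three-way interaction term (not evaluated here; the object name is ours, and the two-way interaction terms would be handled the same way):

logitmod_no_threeway <- glmer(
  formula = generalisation ~ condition + test_item + n_obs + condition:test_item + condition:n_obs + test_item:n_obs + (1|id), 
  family = gaussian(link = "logit"), 
  data = fullframes)
anova(logitmod, logitmod_no_threeway)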

The anova results suggest that the main-effect terms for condition and test_item make an important contribution (p = .019 and p < 2.2e-16 respectively) but that the main-effect term for n_obs does not (p = 1). Note, though, that the test_item and n_obs comparisons each have zero degrees of freedom (dropping a main effect while keeping all of its interactions does not reduce the number of parameters), so those two p-values should be treated with caution. The AIC and BIC scores suggest that the reduced models logitmod_no_test_item and logitmod_no_n_obs are both about as good as the full model logitmod.

More investigation and model checking are needed, but the basic approach here would be to report the full model (coefficients and standard errors) together with a write-up and interpretation of each model comparison.
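
As a minimal sketch of where that write-up and model checking might start (not run above):

summary(logitmod)  # fixed-effect estimates and standard errors for the full model
plot(logitmod)     # fitted values versus residuals, a basic check on the model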

Bayesian approach (1)

In their paper, Hayes et al. report a set of Bayes factors computed using JASP. JASP is worth knowing about — it’s a stand-alone program that lets you load a data set and run both frequentist and Bayesian analyses through a point-and-click interface. Under the hood, JASP computes Bayes factors using the BayesFactor package in R, so we’ll use that package here to compare a set of reduced models to the full model.

The BayesFactor package doesn’t support logistic regression, so we use linear models here. Instead of writing the random effect as (1|id) in the model formula, we add id as a predictor and tell lmBF() that it is a random effect via whichRandom="id".

full_bf <-  lmBF( generalisation ~ condition * test_item * n_obs + id, data=fullframes, whichRandom="id")
## Warning: data coerced from tibble to data frame
no_condition_bf <-  lmBF( generalisation ~ test_item + n_obs + condition:test_item + condition:n_obs + test_item:n_obs + condition:test_item:n_obs + id, data=fullframes, whichRandom="id")
## Warning: data coerced from tibble to data frame
summary(full_bf/no_condition_bf)
## Bayes factor analysis
## --------------
## [1] condition * test_item * n_obs + id : 22599.96 ±4.25%
## 
## Against denominator:
##   generalisation ~ test_item + n_obs + condition:test_item + condition:n_obs + test_item:n_obs + condition:test_item:n_obs + id 
## ---
## Bayes factor type: BFlinearModel, JZS
no_test_item_bf <-  lmBF( generalisation ~ condition + n_obs + condition:test_item + condition:n_obs + test_item:n_obs + condition:test_item:n_obs + id, data=fullframes, whichRandom="id")
## Warning: data coerced from tibble to data frame
summary(full_bf/no_test_item_bf)
## Bayes factor analysis
## --------------
## [1] condition * test_item * n_obs + id : 1.664674e+331 ±3.47%
## 
## Against denominator:
##   generalisation ~ condition + n_obs + condition:test_item + condition:n_obs + test_item:n_obs + condition:test_item:n_obs + id 
## ---
## Bayes factor type: BFlinearModel, JZS
no_n_obs_bf <-  lmBF( generalisation ~ condition + test_item + condition:test_item + condition:n_obs + test_item:n_obs + condition:test_item:n_obs + id, data=fullframes, whichRandom="id")
## Warning: data coerced from tibble to data frame
summary(full_bf/no_n_obs_bf)
## Bayes factor analysis
## --------------
## [1] condition * test_item * n_obs + id : 0.04973803 ±4.09%
## 
## Against denominator:
##   generalisation ~ condition + test_item + condition:test_item + condition:n_obs + test_item:n_obs + condition:test_item:n_obs +     id 
## ---
## Bayes factor type: BFlinearModel, JZS
# TODO: could add similar tests of interactions 
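
For example, a Bayes factor for the three-way interaction could be sketched like this (not evaluated here; the object name is ours, and the two-way interaction terms would be tested the same way):

no_threeway_bf <-  lmBF( generalisation ~ condition + test_item + n_obs + condition:test_item + condition:n_obs + test_item:n_obs + id, data=fullframes, whichRandom="id")
summary(full_bf/no_threeway_bf)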

Most of the Bayes factors are large, suggesting that the full model should be preferred to the reduced models. The one exception is the Bayes factor comparing the full model to the model without a main effect of n_obs: at about 0.05 (odds of roughly 20 to 1 in favour of the reduced model), it suggests that the model without the main effect of n_obs should be preferred to the full model.

If following this approach, the paper would include a writeup and interpretation for each of the Bayes factors computed. For example, in their paper Hayes et al. write

“Property generalization decreased as the size of the test rocks increased (i.e. as they became less similar to the observed sample), BF10 > 10,000. Generalization ratings averaged across test items were higher overall for smaller than for larger samples, BF10 > 10,000. Overall generalization was also stronger following category than property sampling, BF10 > 10,000.”

The Bayes factors we computed line up with these results except for the Bayes factor comparing models with and without the n_obs term. The reason (I think) is that the Bayes factor reported by Hayes et al. is not a comparison of two individual models: instead it is an “inclusion Bayes factor” that compares two classes of models, one class that includes n_obs and another that does not. See here for more about inclusion Bayes factors in JASP. There might be grounds for using and reporting inclusion Bayes factors, but we have not computed them here because they are not directly supported by the BayesFactor package. If you’d like to explore them further, see the bayesfactor_inclusion() function in the bayestestR package (sketched below).
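
A sketch of how that might look, using generalTestBF() to fit the family of candidate models and bayestestR to compute inclusion Bayes factors (not evaluated here, and the exact set of candidate models matters for the result):

library(bayestestR)
all_bf <- generalTestBF( generalisation ~ condition * test_item * n_obs + id, data=fullframes, whichRandom="id", neverExclude="id")
bayesfactor_inclusion(all_bf)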

Bayesian approach (2)

Psychologists seem to expect hypothesis tests, so for a psychology paper I’d probably go for one of the previous two approaches. But some statisticians argue that the discrete hypothesis tests we’ve been using in previous sections are misguided — with enough data we would be able to tell that every coefficient in the regression model is non-zero. From this perspective the real question is what we should believe about the signs and magnitudes of these coefficients, and one way to answer it is to report posterior distributions over the coefficients. We can do this with the brms package, fitting a Bayesian regression model (here a beta regression, since the dependent variable is bounded between 0 and 1). Warning: the chunk below takes a long time to run, so set eval = FALSE unless you are willing to wait.

betamod_bayes <- brm(
  formula = generalisation ~ condition * test_item * n_obs + (1|id), 
  family = Beta, 
  data = fullframes)
## Compiling the C++ model
## Start sampling
## 
## SAMPLING FOR MODEL 'a2746bde36060dd9507f14989f110309' NOW (CHAIN 1).
## Chain 1: Rejecting initial value:
## Chain 1:   Error evaluating the log probability at the initial value.
## Chain 1: Exception: beta_lpdf: Second shape parameter[18] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 1: Rejecting initial value:
## Chain 1:   Error evaluating the log probability at the initial value.
## Chain 1: Exception: beta_lpdf: Second shape parameter[21] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 1: 
## Chain 1: Gradient evaluation took 0.003091 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 30.91 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 188.59 seconds (Warm-up)
## Chain 1:                60.0834 seconds (Sampling)
## Chain 1:                248.674 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'a2746bde36060dd9507f14989f110309' NOW (CHAIN 2).
## Chain 2: Rejecting initial value:
## Chain 2:   Error evaluating the log probability at the initial value.
## Chain 2: Exception: beta_lpdf: Second shape parameter[19] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 2: Rejecting initial value:
## Chain 2:   Error evaluating the log probability at the initial value.
## Chain 2: Exception: beta_lpdf: Second shape parameter[19] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 2: Rejecting initial value:
## Chain 2:   Error evaluating the log probability at the initial value.
## Chain 2: Exception: beta_lpdf: Second shape parameter[29] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 2: Rejecting initial value:
## Chain 2:   Error evaluating the log probability at the initial value.
## Chain 2: Exception: beta_lpdf: Second shape parameter[19] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 2: Rejecting initial value:
## Chain 2:   Error evaluating the log probability at the initial value.
## Chain 2: Exception: beta_lpdf: Second shape parameter[19] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 2: 
## Chain 2: Gradient evaluation took 0.001271 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 12.71 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 178.042 seconds (Warm-up)
## Chain 2:                63.1729 seconds (Sampling)
## Chain 2:                241.215 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'a2746bde36060dd9507f14989f110309' NOW (CHAIN 3).
## Chain 3: Rejecting initial value:
## Chain 3:   Error evaluating the log probability at the initial value.
## Chain 3: Exception: beta_lpdf: Second shape parameter[60] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 3: 
## Chain 3: Gradient evaluation took 0.00167 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 16.7 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 185.898 seconds (Warm-up)
## Chain 3:                83.9625 seconds (Sampling)
## Chain 3:                269.861 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'a2746bde36060dd9507f14989f110309' NOW (CHAIN 4).
## Chain 4: Rejecting initial value:
## Chain 4:   Error evaluating the log probability at the initial value.
## Chain 4: Exception: beta_lpdf: Second shape parameter[13] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 4: Rejecting initial value:
## Chain 4:   Error evaluating the log probability at the initial value.
## Chain 4: Exception: beta_lpdf: Second shape parameter[1] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 4: Rejecting initial value:
## Chain 4:   Error evaluating the log probability at the initial value.
## Chain 4: Exception: beta_lpdf: Second shape parameter[48] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 4: Rejecting initial value:
## Chain 4:   Error evaluating the log probability at the initial value.
## Chain 4: Exception: beta_lpdf: Second shape parameter[55] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 4: Rejecting initial value:
## Chain 4:   Error evaluating the log probability at the initial value.
## Chain 4: Exception: beta_lpdf: Second shape parameter[56] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 4: Rejecting initial value:
## Chain 4:   Error evaluating the log probability at the initial value.
## Chain 4: Exception: beta_lpdf: Second shape parameter[56] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 4: Rejecting initial value:
## Chain 4:   Error evaluating the log probability at the initial value.
## Chain 4: Exception: beta_lpdf: Second shape parameter[19] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 4: Rejecting initial value:
## Chain 4:   Error evaluating the log probability at the initial value.
## Chain 4: Exception: beta_lpdf: Second shape parameter[20] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 4: Rejecting initial value:
## Chain 4:   Error evaluating the log probability at the initial value.
## Chain 4: Exception: beta_lpdf: Second shape parameter[1] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 4: Rejecting initial value:
## Chain 4:   Error evaluating the log probability at the initial value.
## Chain 4: Exception: beta_lpdf: Second shape parameter[14] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 4: Rejecting initial value:
## Chain 4:   Error evaluating the log probability at the initial value.
## Chain 4: Exception: beta_lpdf: Second shape parameter[106] is 0, but must be > 0!  (in 'model980a838cb72_a2746bde36060dd9507f14989f110309' at line 58)
## 
## Chain 4: 
## Chain 4: Gradient evaluation took 0.00185 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 18.5 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 179.796 seconds (Warm-up)
## Chain 4:                72.7766 seconds (Sampling)
## Chain 4:                252.573 seconds (Total)
## Chain 4:
summary(betamod_bayes)
##  Family: beta 
##   Links: mu = logit; phi = identity 
## Formula: generalisation ~ condition * test_item * n_obs + (1 | id) 
##    Data: fullframes (Number of observations: 4725) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Group-Level Effects: 
## ~id (Number of levels: 225) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.65      0.03     0.58     0.72 1.00      835     1839
## 
## Population-Level Effects: 
##                                   Estimate Est.Error l-95% CI u-95% CI Rhat
## Intercept                             1.13      0.12     0.90     1.36 1.00
## conditionproperty                    -0.28      0.17    -0.61     0.04 1.01
## test_item                            -0.23      0.02    -0.27    -0.18 1.00
## n_obs                                 0.06      0.01     0.04     0.08 1.00
## conditionproperty:test_item           0.05      0.03    -0.01     0.11 1.00
## conditionproperty:n_obs               0.02      0.02    -0.01     0.06 1.00
## test_item:n_obs                      -0.01      0.00    -0.01    -0.00 1.00
## conditionproperty:test_item:n_obs    -0.02      0.00    -0.03    -0.01 1.00
##                                   Bulk_ESS Tail_ESS
## Intercept                             1367     2188
## conditionproperty                     1240     2001
## test_item                             2090     2713
## n_obs                                 2125     2799
## conditionproperty:test_item           1721     2246
## conditionproperty:n_obs               1761     2519
## test_item:n_obs                       2094     2722
## conditionproperty:test_item:n_obs     1771     2377
## 
## Family Specific Parameters: 
##     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## phi     2.26      0.04     2.18     2.35 1.00     4962     3047
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).

If following this approach, the paper would include a writeup of this single model along with plots showing posterior distributions on all coefficients.
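
A sketch of how those posterior plots might be produced with the bayesplot package (not evaluated here; the regular expression "^b_" picks out the population-level coefficients):

library(bayesplot)
draws <- as.matrix(betamod_bayes)      # posterior draws, one column per parameter
mcmc_areas(draws, regex_pars = "^b_")  # density and interval plots for the population-level coefficients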