Despite its popularity, issues concerning the estimation of power in multilevel logistic regression models are prevalent because of the complexity involved in its calculation (i.e., computer-simulation-based approaches). These issues are further compounded by the fact that the distribution of the predictors can play a role in the power to estimate these effects. To address both matters, we present a sample of cases documenting the influence that predictor distributions have on statistical power, as well as a user-friendly, web-based application to conduct power analysis for multilevel logistic regression.

Method: Computer simulations are implemented to estimate statistical power in multilevel logistic regression with varying numbers of clusters, varying cluster sample sizes, and non-normal and non-symmetrical distributions of the Level 1/2 predictors. Power curves were simulated to see in what ways non-normal/unbalanced distributions of a binary predictor and a continuous predictor affect the detection of population effect sizes for main effects, a cross-level interaction and the variance of the random effects.

Skewed continuous predictors and unbalanced binary ones require larger sample sizes at both levels than balanced binary predictors and normally-distributed continuous ones. In the most extreme case of imbalance (10% incidence) and skewness of a chi-square distribution with 1 degree of freedom, even 110 Level 2 units and 100 Level 1 units were not sufficient for all predictors to reach power of 80%, mostly hovering at around 50% with the exception of the skewed, continuous Level 2 predictor.

Given the complex interactive influence among sample sizes, effect sizes and predictor distribution characteristics, it seems unwarranted to make generic rule-of-thumb sample size recommendations for multilevel logistic regression, aside from the fact that larger sample sizes are required when the distributions of the predictors are not symmetric or balanced. The more skewed or imbalanced the predictor is, the larger the sample size requirements. To assist researchers in planning research studies, a user-friendly web application that conducts power analysis via computer simulations in the R programming language is provided. With this web application, users can conduct simulations, tailored to their study design, to estimate statistical power for multilevel logistic regression models.

Data with dependencies due to clustering or repeated measurements are commonplace within the behavioural and health sciences. Acknowledging these dependencies increases the complexity of research hypotheses and places new demands on the analytical methods needed to test said hypotheses. From the array of statistical techniques that can handle these types of dependencies, multilevel modelling or linear mixed effects models have become commonplace, with a wide variety of applications within epidemiological, social, educational and psychological fields. In spite of the popularity of these statistical approaches, the added complexity implied by them demands more sophisticated technical knowledge from the user, whether it relates to issues of estimation, interpretation or distributional assumptions of the data. Sample size determination falls within this spectrum of added complexity since it cannot be calculated exactly and needs to be approximated via computer simulation. Maas and Hox’s and Pacagnella’s simulation studies provide some of the most often-cited guidelines regarding sample sizes in multilevel models, where they claim that, if fixed effects are of interest, a minimum of 30 Level 1 units and 10 Level 2 units are required and, if the inferences pertain to random effects, the number of Level 2 units should increase to 50. This is sometimes referred to in the literature as the “50–30 rule” of multilevel modelling and has been used before as a sample size justification for using this type of statistical method.
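To make the simulation-based approach to power more concrete, the sketch below shows one way its two core steps could look: drawing a clustered binary data set from a random-intercept logistic model, and counting how often a test rejects across replications. It is a minimal illustration in Python, not the R code behind the web application. The function names, the effect sizes, the random-intercept standard deviation, and the placement of the skewed continuous predictor at Level 1 and the unbalanced binary predictor at Level 2 are all assumptions made for the example. The `fit_p_value` argument is a hypothetical callback standing in for whatever multilevel logistic regression routine is used to obtain a p-value.

```python
import math
import random


def simulate_dataset(n_clusters, cluster_size, rng,
                     b0=0.0, b_x=0.3, b_z=0.3, b_xz=0.3,
                     tau=1.0, incidence=0.10):
    """Draw one data set from a random-intercept logistic model.

    All default parameter values are illustrative assumptions.
    Returns a list of (cluster_id, x, z, y) tuples.
    """
    rows = []
    for j in range(n_clusters):
        u_j = rng.gauss(0.0, tau)                    # Level 2 random intercept
        z_j = 1 if rng.random() < incidence else 0   # unbalanced binary Level 2 predictor
        for _ in range(cluster_size):
            # Skewed Level 1 predictor: chi-square(1) draw,
            # centred (mean 1) and scaled (sd sqrt(2)).
            x_ij = (rng.gauss(0.0, 1.0) ** 2 - 1.0) / math.sqrt(2.0)
            # Linear predictor with main effects and a cross-level interaction.
            eta = b0 + b_x * x_ij + b_z * z_j + b_xz * x_ij * z_j + u_j
            p = 1.0 / (1.0 + math.exp(-eta))         # inverse logit
            y = 1 if rng.random() < p else 0
            rows.append((j, x_ij, z_j, y))
    return rows


def estimate_power(fit_p_value, n_reps, n_clusters, cluster_size,
                   alpha=0.05, seed=1):
    """Monte Carlo power: share of replications with p-value below alpha.

    fit_p_value is a hypothetical, user-supplied function that fits the
    multilevel logistic model to one data set and returns the p-value of
    the effect of interest.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        data = simulate_dataset(n_clusters, cluster_size, rng)
        if fit_p_value(data) < alpha:
            rejections += 1
    return rejections / n_reps
```

Repeating `estimate_power` over a grid of `n_clusters` and `cluster_size` values would trace out a power curve of the kind described above. Changing `incidence`, or swapping the chi-square draw for a normal one, shows how the predictor distribution enters the calculation.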