Sampling weights in multilevel modelling: an investigation using PISA sampling structures

Julia Mang, Helmut Küchenhoff, Sabine Meinck, Manfred Prenzel

Research output: Contribution to journal › Article › peer-review

21 Scopus citations


Background: Standard methods for analysing data from large-scale assessments (LSA) cannot simply be adopted when hierarchical (or multilevel) regression modelling is to be applied. Various approaches currently exist; they generally follow a design-based model of estimation using the pseudo maximum likelihood method and adjusted weights for the corresponding hierarchies. Specifically, several different approaches to using and scaling sampling weights in hierarchical models are promoted, yet no study has compared them to provide evidence of which method performs best and should therefore be preferred. Furthermore, different software programs implement different estimation algorithms, leading to different results.

Objective and method: In this study, we use a simulation to determine the estimation procedure that shows the smallest distortion of the actual population features. We consider different estimation, optimization and acceleration methods, and different approaches to using sampling weights. Three scenarios were simulated using the statistical program R. The analyses were performed with two software packages for hierarchical modelling of LSA data, namely Mplus and SAS.

Results and conclusions: The simulation results revealed three weighting approaches that performed best in retrieving the true population parameters. One of them uses only level-two weights (here: final school weights) and, because of its simple implementation, is the most favourable one. This finding should provide a clear recommendation to researchers for using weights in multilevel modelling (MLM) when analysing LSA data, or data with a similar structure. Further, we found only small differences in the performance and default settings of the software programs used, with Mplus providing slightly more precise estimates. Different algorithm starting settings or different acceleration methods for optimization could cause these differences. However, it should be emphasized that with the recommended weighting approach, both software packages perform equally well. Finally, two scaling techniques for student weights were investigated; both provide nearly identical results. We use data from the Programme for International Student Assessment (PISA) 2015 to illustrate the practical importance and relevance of weighting in analysing large-scale assessment data with hierarchical models.
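The abstract does not name the two scaling techniques for student weights. As an illustration only, the sketch below shows two rescalings that are commonly discussed in the multilevel weighting literature: one makes the within-school weights sum to the school's sample size, the other makes them sum to the school's effective sample size. That these correspond to the two techniques studied here is an assumption, and the function name `scale_student_weights` and its arguments are hypothetical.

```python
from collections import defaultdict


def scale_student_weights(weights, schools, method="size"):
    """Rescale level-one (student) weights within each school.

    method="size":      scaled weights sum to the school sample size n_j
    method="effective": scaled weights sum to the effective sample size
                        (sum of w)^2 / (sum of w^2)
    Both methods are assumptions for illustration, not taken from the abstract.
    """
    w_sum = defaultdict(float)     # sum of raw weights per school
    w_sq_sum = defaultdict(float)  # sum of squared raw weights per school
    n = defaultdict(int)           # number of sampled students per school

    for w, s in zip(weights, schools):
        w_sum[s] += w
        w_sq_sum[s] += w * w
        n[s] += 1

    scaled = []
    for w, s in zip(weights, schools):
        if method == "size":
            factor = n[s] / w_sum[s]
        elif method == "effective":
            factor = w_sum[s] / w_sq_sum[s]
        else:
            raise ValueError(f"unknown method: {method}")
        scaled.append(w * factor)
    return scaled


# Toy data: two students in school "A" with unequal weights,
# two students in school "B" with equal weights.
raw = [1.0, 3.0, 2.0, 2.0]
school = ["A", "A", "B", "B"]
by_size = scale_student_weights(raw, school, method="size")
by_effective = scale_student_weights(raw, school, method="effective")
```

When weights are equal within a school, as in school "B", both rescalings set every student weight to 1, which is one way to see why the two techniques can give very similar results when within-school weights vary little.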

Original language: English
Article number: 6
Journal: Large-Scale Assessments in Education
Issue number: 1
State: Published - Dec 2021
Externally published: Yes


  • Hierarchical models (HLM)
  • Large-scale assessment (LSA)
  • Multilevel models (MLM)
  • Programme for International Student Assessment (PISA)
  • Sampling weights
  • Scaling of sampling weights

