Improving the modelling of response variation in international large-scale assessments
Annemiek Punter is a PhD student in the Department of Research Methodology, Measurement and Data Analysis (OMD). Her supervisors are prof.dr. C.A.W. Glas and prof.dr. T.J.H.M. Eggen from the Faculty of Behavioural, Management and Social Sciences (BMS).
International large-scale assessments (ILSAs) play a major role in the evaluation of educational systems. These projects are characterized by the standardized assessment of student achievement and the collection of contextual data by means of curriculum, student, teacher, school, and home questionnaires. Together, the resulting high-quality data on student achievement and contextual factors provide great opportunities for more theory-oriented educational effectiveness research, particularly in international contexts. To ensure the validity of analyses based on these data, particularly with respect to measurement invariance, response behaviour must be evaluated across the (sub)populations of interest. A lack of measurement invariance, characterized by differences in response behaviour between these (sub)populations, is called differential item functioning (DIF).
This thesis presents five studies that contribute to research in the field of education by deploying ILSA data in research areas where the availability of standardized data from multiple countries offers new research opportunities. The topics addressed are computer and information literacy, parental involvement and reading literacy, and language demand in testing mathematics. In addition, each chapter explores methods for identifying and handling potential DIF within the framework of item response theory.
The studies in this thesis show that DIF analyses can be insightful when a methodological focus on validity is combined with a focus on more substantive research questions. More than simply a task to tick off before the “real” questions are investigated, DIF analyses can lead to insights into the effects underlying test results. The studies therefore show how, in research with a substantive interest in comparing groups, the study of the validity of both test and questionnaire items should be integrated into the methodology. Though no clear-cut one-method-fits-all strategy is presented here, the thesis shows that there are many ways to approach the issue.