Dr. Reeve will provide an introduction and overview of item response theory (IRT). His presentation will introduce the IRT models and describe how to interpret the item and person parameters. Trace curves, information curves, and standard error of measurement curves will be reviewed. Dr. Reeve will discuss how IRT complements classical test theory approaches. Finally, Dr. Reeve will discuss how IRT can enhance measurement of health outcomes through: 1) questionnaire development and evaluation; 2) testing for differential item functioning; 3) questionnaire linking; and 4) item banking and computerized adaptive testing. This presentation will provide a framework for the presentations to follow in the symposium.
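As a rough illustration of the trace curves, information curves, and standard error of measurement curves mentioned above, the sketch below assumes a two-parameter logistic (2PL) model for dichotomous items. This is a simplification: health outcome items are often polytomous and the abstract does not name a specific model. The function names and the item parameters (discrimination `a`, location `b`) are hypothetical values chosen for illustration only.

```python
import math


def trace_2pl(theta, a, b):
    """Trace curve: probability of endorsing the item at trait level theta (2PL)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Item information for a 2PL item: a^2 * P * (1 - P)."""
    p = trace_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def standard_error(theta, items):
    """Standard error of measurement: 1 / sqrt(sum of item information)."""
    total = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 / math.sqrt(total)


# Hypothetical (a, b) pairs for a three-item scale; not taken from any real instrument.
items = [(1.5, -1.0), (1.0, 0.0), (2.0, 1.0)]

for theta in (-2.0, 0.0, 2.0):
    info = sum(item_information(theta, a, b) for a, b in items)
    print(f"theta={theta:+.1f}  information={info:.3f}  SEM={standard_error(theta, items):.3f}")
```

Evaluating these functions over a grid of `theta` values gives the three curves. Under this model, information peaks near each item's location `b`, and the standard error is smallest where the items are concentrated.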
Dr. Edelen will discuss key IRT concepts, including model assumptions, model fit, IRT scoring approaches, and the detection and evaluation of differential item functioning (DIF). Her presentation will review IRT model assumptions and describe how to evaluate them. Dr. Edelen will also discuss the various ways to evaluate model fit, including overall fit, examination of local dependence (LD) indices, and item-level fit. The principles behind IRT-based pattern scoring and summed scoring will be reviewed. Finally, Dr. Edelen will describe optimal approaches for DIF detection and evaluation in the IRT framework. This presentation will build on the introduction by providing useful information about concepts relevant to the application of IRT in the development of patient-reported outcome (PRO) measures.
The disease activity score for 28 joints (DAS-28) is a widely used index measure for assessing the disease activity of individual rheumatoid arthritis (RA) patients and for evaluating treatment effectiveness aimed at reaching or sustaining a state of remission. However, despite the frequent involvement of the 28 joints included in the DAS-28, it has been argued that the omission of the foot joints causes the DAS-28 to underestimate actual disease activity in early RA patients predominantly suffering from disease activity in the feet. Consequently, the exclusion of the foot joints remains a topic of debate and research. In this presentation I will demonstrate how item response theory (and information curves in particular) can provide insight into the effect of including foot joints on the measurement range and measurement precision of the total joint count.
Dr. Krishnan has been a co-principal investigator for the NIH PROMIS site at Stanford University, along with Dr. James Fries. Together, they have developed, tested, and disseminated IRT-based PROMIS tools to measure physical function. He will discuss the contributions of the Stanford PROMIS site to physical function measurement, the limitations, and a vision for future development. One main feature of this talk will be the clinical impact of the IRT-based physical function tools developed at Stanford University in terms of sample size requirements for clinical trials and observational studies.
Dr. Glas will discuss the use of IRT measures such as those obtained in traditional linear item administration designs or in computerized adaptive testing. The first issue addressed will be how to equate the outcomes of different versions of an instrument that are supposed to be parallel in some sense, and how to link measures obtained using instruments that target closely related constructs. The second issue addressed will be how to use IRT measures in further analyses, such as analyses of variance and regression analyses. The third and final issue will be how to use IRT measures in a multilevel framework, such as a framework where patients are nested under hospitals or physicians.
Patient-reported outcome measures (PROMs) are increasingly used in clinical research settings to evaluate treatment outcomes from the perspective of the patient. Typically, multiple instruments are available for measuring the same construct in a given field. A drawback of this is that each measure usually has its own metric that is idiosyncratically related to the underlying construct that it purports to measure. This is a significant barrier to the interpretation of studies that use different instruments. To achieve comparability of scores from different PROMs, a number of statistical linking or equating techniques can be used to convert the system of units of one measure to that of another. The resulting conversion tables, or crosswalks, aid the interpretability of research results by making explicit the relationship between total score levels of the cross-walked instruments. The topic of this presentation will be the development and (cross-cultural) validation of a crosswalk that was recently created to convert scores between two PROMs that are frequently used to assess physical function in rheumatology.
Dr. Bjorner will describe the unique features of item banks and computerized adaptive testing (CAT), discuss how to develop item banks, and describe some of the opportunities and challenges for using CAT in medical research. In CAT, a computer selects the items from an item bank that are most relevant for and informative about the particular respondent, thereby optimizing test relevance and precision. Dr. Bjorner will discuss principles and practical experiences in developing item banks and CAT for health outcomes. In addition to careful psychometric analysis, the clinical application of PROM assessments needs clear construct definitions, good items, and well-founded interpretation guidelines. The application of CAT in medical research offers new opportunities and challenges for test strategies and for the statistical analysis of data from a CAT. The strongest statistical analysis is usually achieved through expanded IRT models that specify the research question of interest as a parameter in the model. Such specialized IRT models represent an interesting area for statistical development, in particular for studies with data collection at multiple time points for each participant.
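The item selection step described above can be sketched as follows. This is a minimal sketch of one common rule, maximum-information selection, assuming a two-parameter logistic (2PL) model for dichotomous items. The abstract does not say which selection rule or model is used, and the item bank, its item names, and its parameters are all hypothetical.

```python
import math


def information(theta, a, b):
    """2PL item information at trait level theta: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def select_next_item(theta, bank, administered):
    """Return the name of the unused item that is most informative at theta.

    bank maps an item name to its (a, b) parameters; administered is the set
    of item names already shown to the respondent.
    """
    candidates = [name for name in bank if name not in administered]
    if not candidates:
        raise ValueError("item bank is exhausted")
    return max(candidates, key=lambda name: information(theta, *bank[name]))


# Hypothetical bank: equal discrimination, different locations on the trait scale.
bank = {"easy": (1.2, -1.5), "medium": (1.2, 0.0), "hard": (1.2, 1.5)}

print(select_next_item(0.0, bank, set()))        # respondent near the middle of the scale
print(select_next_item(-1.0, bank, {"medium"}))  # lower estimate, "medium" already used
```

In a full CAT, this step would alternate with re-estimating `theta` after each response and checking a stopping rule, such as a target standard error or a maximum number of items.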
This presentation is about the construction of a multidimensional computerized adaptive test for fatigue in rheumatoid arthritis (CATFatigueRA). The challenges that can arise during the development of a computerized adaptive test in health care will be discussed. In addition, the results of a usability test of the new measurement instrument will be presented, and a demonstration of the CATFatigueRA will be given.
Dr. Hays will discuss possible future directions for applications of IRT in patient-reported outcomes research. His presentation will touch upon issues such as calibration of item parameters for domains applicable to a subset of the population, modeling common and unique variance in multiple domains (e.g., bifactor model, multidimensional CAT), use of the nominal response model, and person fit.