On Varying Item Difficulty by Changing the Response Format for a Mathematical Competence Test

Authors

  • Christine Hohensinn, Faculty of Psychology, University of Vienna, Center of Testing and Consulting - Division of Psychological Assessment and Applied Psychometrics
  • Klaus D. Kubinger, Faculty of Psychology, University of Vienna, Center of Testing and Consulting - Division of Psychological Assessment and Applied Psychometrics

DOI:

https://doi.org/10.17713/ajs.v38i4.276

Abstract

Educational and psychological aptitude and achievement tests employ a variety of response formats. Today the predominant format is multiple choice with a single correct answer option (out of, at most, four options altogether), because it allows fast, economical, and objective scoring. However, it is often argued that multiple-choice questions are easier than questions with a constructed response format, for which the examinee has to produce the solution without any given answer options. The present study investigates the influence of three different response formats on item difficulty using stem-equivalent items in a mathematical competence test. The impact of the formats is modelled with the Linear Logistic Test Model (Fischer, 1974) from Item Response Theory. In summary, the different response formats measure the same latent trait, but they bias the difficulty of the item.
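The Linear Logistic Test Model (Fischer, 1973) decomposes each Rasch item difficulty into a weighted sum of basic parameters, so a response-format effect can be estimated as one such parameter. The following Python sketch illustrates this decomposition; it is not the authors' analysis (the study used the eRm package in R), and the design matrix and parameter values below are hypothetical.

```python
import math

def lltm_difficulties(Q, eta):
    """LLTM decomposition: item difficulty beta_i = sum_j q_ij * eta_j,
    where Q is the design (weight) matrix and eta the basic parameters."""
    return [sum(q * e for q, e in zip(row, eta)) for row in Q]

def rasch_probability(theta, beta):
    """Rasch model: P(correct) = exp(theta - beta) / (1 + exp(theta - beta))."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

# Hypothetical design: 3 stem-equivalent items, 2 basic parameters
# (a base difficulty and a response-format effect).
Q = [
    [1, 0],  # item 1: constructed response (no format effect)
    [1, 1],  # item 2: multiple choice
    [1, 1],  # item 3: multiple choice
]
eta = [0.5, -0.8]  # hypothetical values: a negative format effect makes items easier

betas = lltm_difficulties(Q, eta)
for i, b in enumerate(betas, 1):
    p = rasch_probability(0.0, b)
    print(f"item {i}: beta = {b:+.2f}, P(correct | theta = 0) = {p:.3f}")
```

All items load on the same latent trait theta; only the difficulty parameter shifts with the format, which mirrors the paper's conclusion that the formats measure the same trait but bias item difficulty.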

References

Andersen, E. B. (1973). A goodness of fit test for the Rasch model. Psychometrika, 38, 123-140.

Birenbaum, M., Tatsuoka, K. K., and Gutvirtz, Y. (1992). Effects of response format on diagnostic assessment of scholastic achievement. Applied Psychological Measurement, 16, 353-363.

Bridgeman, B. (1992). A comparison of quantitative questions in open-ended and multiple-choice formats. Journal of Educational Measurement, 29, 253-271.

Cronbach, L. J. (1941). An experimental comparison of the multiple true-false and multiple multiple-choice tests. The Journal of Educational Psychology, 32, 533-543.

De Boeck, P., and Wilson, M. (Eds.). (2004). Explanatory Item Response Models. New York: Springer.

Fischer, G. H. (1973). The linear logistic test model as an instrument in educational research. Acta Psychologica, 37, 359-374.

Fischer, G. H. (1974). Einführung in die Theorie psychologischer Tests. Bern: Huber.

Fischer, G. H. (1995). The linear logistic test model. In G. H. Fischer and I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments and Applications (p. 131-156). New York: Springer.

Katz, I. R., Bennett, R. E., and Berger, A. E. (2000). Effects of response format on difficulty of SAT-mathematics items: It's not the strategy. Journal of Educational Measurement, 37, 39-57.

Kubinger, K. D. (2008). On the revival of the Rasch model-based LLTM: From constructing tests using item generating rules to measuring item administration effects. Psychology Science Quarterly, 50, 311-327.

Kubinger, K. D., Frebort, M., Holocher-Ertl, S., Khorramdel, L., Sonnleitner, P., Weitensfelder, L., et al. (2007). Large-Scale Assessment zu den Bildungsstandards in Österreich: Testkonzept, Testdurchführung und Ergebnisverwertung. Erziehung und Unterricht, 157, 588-599.

Kubinger, K. D., Holocher-Ertl, S., and Frebort, M. (2006). Zur testtheoretischen Qualität von Multiple Choice-Items: 2 richtige aus 5 vs. 1 richtige aus 6 Antwortmöglichkeiten. In B. Gula, R. Alexandrowicz, S. Strauß, E. Brunner, B. Jenull-Schiefer, and O. Vitouch (Eds.), Perspektiven psychologischer Forschung in Österreich. Proceedings zur 7. Wissenschaftlichen Tagung der Österreichischen Gesellschaft für Psychologie (p. 459-464). Lengerich: Pabst.

Kubinger, K. D., Holocher-Ertl, S., Reif, M., Hohensinn, C., and Frebort, M. (2010). On minimizing guessing effects on multiple-choice items: Superiority of a 2-solutions-and-3-distractors item format to a 1-solution-and-5-distractors item format. International Journal of Selection and Assessment.

Mair, P., and Hatzinger, R. (2007). eRm: Extended Rasch modeling. R package. (http://cran.r-project.org/)

Poinstingl, H., Mair, P., and Hatzinger, R. (2007). Manual zum Softwarepackage eRm (extended Rasch modeling) – Anwendung des Rasch-Modells (1-PL Modell). Lengerich: Pabst Science Publishers.

Pomplun, M., and Omar, M. H. (1997). Multiple-mark items: An alternative objective item format. Educational and Psychological Measurement, 57, 949-962.

Rasch, G. (1980). Probabilistic Models for some Intelligence and Attainment Tests. Chicago: The University of Chicago Press.

Thissen, D., Wainer, H., and Wang, X.-B. (1994). Are tests comprising both multiple-choice and free-response items necessarily less unidimensional than multiple-choice tests? Journal of Educational Measurement, 31, 113-123.

Traub, R. E., and MacRury, K. (2006). Antwort-Auswahl- vs Freie-Antwort-Aufgaben bei Lernerfolgstests. In K. Ingenkamp and R. S. Jäger (Eds.), Tests und Trends 8: Jahrbuch der Pädagogischen Diagnostik. Weinheim: Beltz Verlag.

Published

2016-04-03

How to Cite

Hohensinn, C., & Kubinger, K. D. (2016). On Varying Item Difficulty by Changing the Response Format for a Mathematical Competence Test. Austrian Journal of Statistics, 38(4), 231–239. https://doi.org/10.17713/ajs.v38i4.276
