Decomposing Difficulty of Reading Literacy Test Items

Keywords: reading, elementary school, testing, item difficulty modeling, LLTM


This study investigates how the difficulty of test items can be decomposed according to item characteristics (such as item format and the type of text to which the item belongs) and the reading actions required to answer them (retrieving information from the text, making simple inferences, making complex inferences, and critically interpreting the text). The sample consisted of fourth-grade elementary school students in Krasnoyarsk who completed the computerized reading literacy test "Progress" in spring 2022. Research method: psychometric modeling using the LLTM+e model. Research hypothesis: decomposing item difficulties will show that the reading actions required to complete the items form a hierarchy of difficulty similar to traditional taxonomies (B. Bloom). That is, items requiring analysis, synthesis, and interpretation of information will be more difficult than items requiring simple inferences, which in turn will be more difficult than items requiring the reader to find information in the text. The results show that the reading action to which an item is assigned is a significant factor in its difficulty. The effect sizes do not support a strict hierarchy, but when other item attributes are controlled, items requiring retrieval of explicitly stated information are easier for students than items requiring complex inferences or critical understanding of the text.
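For readers unfamiliar with the model named in the abstract, the following is a sketch of the linear logistic test model with a random item error term (commonly written LLTM+e), in the form usually given in the explanatory item response modeling literature. The notation is an assumption introduced here for illustration and is not taken from the article: $\theta_p$ is the ability of student $p$, $\beta_i$ the difficulty of item $i$, $q_{ik}$ the coding of item $i$ on property $k$, $\eta_k$ the effect of property $k$, and $\varepsilon_i$ the item-level residual.

```latex
% Requires amsmath. A generic LLTM+e sketch, not the article's exact specification.
\begin{align}
  \operatorname{logit} P(X_{pi} = 1 \mid \theta_p) &= \theta_p - \beta_i, \\
  \beta_i &= \sum_{k=1}^{K} q_{ik}\,\eta_k + \varepsilon_i,
  \qquad \varepsilon_i \sim N(0, \sigma_\varepsilon^2).
\end{align}
```

In this reading, the item properties would correspond to the attributes listed in the abstract (format, text type, required reading action), and $\varepsilon_i$ absorbs the part of an item's difficulty that those attributes do not explain. Setting $\sigma_\varepsilon^2 = 0$ recovers the original LLTM, in which the coded properties are assumed to account for item difficulty completely.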




Anderson L.W., Krathwohl D.R., Airasian P.W., Cruikshank K.A., Mayer R., Pintrich P.R., Raths J., Wittrock M.C. (eds) (2001) A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom’s Taxonomy of Educational Objectives. New York, NY: Longman.

Bakai E.A., Yusupova E.M., Antipkina I.V. (2023) Chitayut ili delayut vid? Analiz povedeniya uchashchikhsya nachalʾnykh klassov pri vypolnenii zadaniy testa chitatelʾskoy gramotnosti [Reading or Pretending to Read? Analysis of the Behavior of Primary School Students during a Reading Comprehension Test]. Voprosy obrazovaniya / Educational Studies Moscow, no 1, pp. 8–28.

Becker A., Nekrasova-Beker T. (2018) Investigating the Effect of Different Selected-Response Item Formats for Reading Comprehension. Educational Assessment, vol. 23, no 4, pp. 296–317.

Burnham K., Anderson D. (eds) (2002) Model Selection and Multi-Model Inference. A Practical Information-Theoretic Approach. New York, Berlin, Heidelberg: Springer.

De Boeck P., Bakker M., Zwitser R., Nivard M., Hofman A., Tuerlinckx F., Partchev I. (2011) The Estimation of Item Response Models with the lmer Function from the lme4 Package in R. Journal of Statistical Software, vol. 39, iss. 12.

De Boeck P., Wilson M. (eds) (2004) Explanatory Item Response Models: A Generalized Linear and Nonlinear Approach. New York, NY: Springer.

Delgado P., Vargas C., Ackerman R., Salmerón L. (2018) Don't Throw Away Your Printed Books: A Meta-Analysis on the Effects of Reading Media on Reading Comprehension. Educational Research Review, vol. 25, November, pp. 23–38.

Desjardins C.D., Bulut O. (2018) Handbook of Educational Measurement and Psychometrics Using R. Boca Raton, FL: CRC.

Douglas G. (1982) Issues in the Fit of Data to Psychometric Models. Education Research and Perspectives, vol. 9, no 1, pp. 32–43.

Effatpanah F., Baghaei P. (2021) Cognitive Components of Writing in a Second Language: An Analysis with the Linear Logistic Test Model. Psychological Test and Assessment Modeling, vol. 63, no 1, pp. 13–44.

Fischer G.H. (2005) Linear Logistic Test Models. Encyclopedia of Social Measurement (ed. K. Kempf-Leonard), Boston; London: Elsevier, pp. 505–514.

Fischer G.H. (1973) The Linear Logistic Test Model as an Instrument in Educational Research. Acta Psychologica, vol. 37, no 6, pp. 359–374.

Gosteva Yu.N., Kuznetsova M.I., Ryabinina L.A., Sidorova G.A., Chaban T.Yu. (2019) Teoriya i praktika otsenivaniya chitatel´skoy gramotnosti kak komponenta funktsional´noy gramotnosti [Theory and Practice of Reading Literacy as a Component of Functional Literacy]. Otechestvennaya i zarubezhnaya pedagogika, vol. 1, no 4 (61), pp. 34–57.

Lang J.W., Tay L. (2021) The Science and Practice of Item Response Theory in Organizations. Annual Review of Organizational Psychology and Organizational Behavior, vol. 8, pp. 311–338.

Lenkeit J., Chan J., Hopfenbeck T.N., Baird J.A. (2015) A Review of the Representation of PIRLS Related Research in Scientific Journals. Educational Research Review, vol. 16, October, pp. 102–115.

Linacre J.M. (2004) Rasch Model Estimation: Further Topics. Journal of Applied Measurement, vol. 5, no 1, pp. 95–110.

Mair P., Hatzinger R., Maier M.J., Rusch T., Mair M.P. (2020) eRm: Extended Rasch Modeling. 1.0-2. Available at: https://cran.r-project.org/package=eRm (accessed 17 September 2023).

Mead R. (2008) A Rasch Primer: The Measurement Theory of Georg Rasch. Psychometrics Services Research Memorandum 2008–001. Maple Grove, MN: Data Recognition Corporation. Available at: (accessed 13 September 2023).

Melentieva Yu.P. (2015) Obshchaya teoriya chteniya [Theory of Reading]. Moscow: Nauka.

Mullis I.V., Martin M.O., Sainsbury M. (2016) PIRLS 2016 Reading Framework. Chestnut Hill, MA: TIMSS & PIRLS International Study Center, pp. 11–29.

OECD (2019) PISA 2018 Assessment and Analytical Framework. Paris: OECD.

Pearson P.D., Johnson D.D. (1978) Teaching Reading Comprehension. New York, NY: Holt, Rinehart and Winston.

Rahman T., Alexander P.A., Chae S.E. (2022) Reader Attributes, Task Attributes, and Reading Comprehension Proficiency: The Relation Revealed by Two Analytic Approaches. Reading Psychology, vol. 43, no 7, pp. 495–522.

Støle H., Mangen A., Schwippert K. (2020) Assessing Children's Reading Comprehension on Paper and Screen: A Mode-Effect Study. Computers & Education, vol. 151, March, Article no 103861.

Schwippert K., Lenkeit J. (eds) (2012) Progress in Reading Literacy in National and International Context. The Impact of PIRLS 2006 in 12 Countries. Münster: Waxmann Verlag.

Thompson B., Gipe J.P., Pitts M.M. (1985) Validity of the Pearson‐Johnson Taxonomy of Comprehension Questions. Reading Psychology: An International Quarterly, vol. 6, no 1–2, pp. 43–49.

Verguts T., De Boeck P. (2000) A Note on the Martin-Löf Test for Unidimensionality. Methods of Psychological Research, vol. 5, no 1, pp. 77–82.

Warren W., Nicholas D., Trabasso T. (1979) Event Chains and Inferences in Understanding Narratives. New Directions in Discourse Processing, Vol. 2. Advances in Discourse Processing (ed. R.O. Freedle), Norwood, NJ: Ablex Publishing Corporation.

Whitely S.E. (1983) Construct Validity: Construct Representation Versus Nomothetic Span. Psychological Bulletin, vol. 93, no 1, pp. 179–197.

Woodcock S., Howard S.J., Ehrich J. (2020) A Within-Subject Experiment of Item Format Effects on Early Primary Students’ Language, Reading, and Numeracy Assessment Results. School Psychology, vol. 35, no 1, pp. 80–87.

Wright B.D. (1996) Reliability and Separation. Rasch Measurement Transactions, vol. 9, no 4, p. 472.

Zuckerman G.A., Kovaleva G.S., Baranova V.Yu. (2018) Chitatel´skie umeniya rossijskikh chetveroklassnikov: uroki PIRLS-2016 [Reading Literacy of Russian Fourth-Graders: Lessons from PIRLS-2016]. Voprosy obrazovaniya / Educational Studies Moscow, no 1, pp. 58–78.

How to Cite
Ivanova, Alina Ye., and Inna V. Antipkina. 2023. “Decomposing Difficulty of Reading Literacy Test Items”. Voprosy Obrazovaniya / Educational Studies Moscow, no. 3 (November).
SI Psychometrics