Measuring Learning Progress Based on Cognitive Operations
Abstract
Measuring students’ growth and change is considered one of the main tools for the evidence-based development of educational systems. It remains a non-trivial methodological task, however, despite the numerous approaches available for conceptualizing it and implementing it statistically. In this article, we describe in detail the main features of measuring students’ growth and change with Item Response Theory (IRT). We then extend this approach to model cognitive operations with the Linear Logistic Test Model (LLTM). We show that combining traditional IRT models for measuring growth and change with the LLTM substantially enriches the interpretability of ability estimates while preserving the advantages of the traditional approach. To illustrate the approach, we apply it to a set of monitoring tests that measure educational progress in secondary-school mathematics.
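The abstract gives no formulas, so the following is only a minimal sketch of the general idea behind the LLTM in its standard textbook form: the Rasch difficulty of each item is constrained to be a weighted sum of the difficulties of the cognitive operations the item requires. The Q-matrix, the operation difficulties, the ability values, and the function names are all hypothetical and chosen purely for illustration. They are not taken from the article's data or model specification.

```python
import math

# Hypothetical Q-matrix: one row per item, one column per cognitive operation.
# Entry Q[i][k] is the weight of operation k in item i (here 0 or 1).
Q = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
]

# Hypothetical operation difficulties in logits (illustrative values only).
ETA = [-0.5, 0.4, 0.9]


def lltm_item_difficulties(q_matrix, eta):
    """LLTM constraint: item difficulty = weighted sum of operation difficulties."""
    return [sum(q * e for q, e in zip(row, eta)) for row in q_matrix]


def rasch_probability(theta, beta):
    """Rasch probability of a correct response for ability theta and difficulty beta."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


beta = lltm_item_difficulties(Q, ETA)

# With items placed on a common logit scale, growth between two measurement
# occasions shows up as a change in ability (hypothetical values).
theta_time1, theta_time2 = 0.0, 0.6
p_time1 = [rasch_probability(theta_time1, b) for b in beta]
p_time2 = [rasch_probability(theta_time2, b) for b in beta]

for i, (b, p1, p2) in enumerate(zip(beta, p_time1, p_time2), start=1):
    print(f"item {i}: difficulty={b:+.2f}  P(t1)={p1:.3f}  P(t2)={p2:.3f}")
```

In this toy setup an item's difficulty can be read directly off the operations it involves, which is the kind of interpretability gain the abstract refers to. In practice the operation difficulties would be estimated from response data rather than fixed in advance.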