Measuring Educational Progress Based on Cognitive Operations

Sergey Tarasov; Irina Zueva; Denis Federiakin

doi:10.17323/vo-2023-16902

Sergey Tarasov, National Research University Higher School of Economics, https://orcid.org/0000-0003-4151-115X
Irina Zueva, National Research University Higher School of Economics
Denis Federiakin, Johannes Gutenberg University (Mainz, Germany)

Keywords: measurement of progress, IRT, LLTM, cognitive operations, diagnostic criterion-referenced thresholds

Abstract

Measuring educational progress remains a nontrivial methodological problem, even though the literature describes many approaches to conceptualizing and modeling it. This article considers a methodological approach to measuring educational progress within modern test theory (item response theory, IRT) and extends the traditional conceptualization of that approach by modeling cognitive operations. The authors show that combining traditional models for measuring educational progress with the linear logistic test model (LLTM), one of the most popular models in modern test theory and one that allows cognitive operations to be modeled, substantially enriches the interpretation of students' test scores while retaining all the advantages of the traditional approach. The proposed approach is illustrated with a series of tests used to monitor educational progress in mathematics in grades 8 and 9 of secondary school.
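As a rough illustration of the idea behind the LLTM named in the abstract, the sketch below expresses each item's difficulty as a weighted sum of the difficulties of the cognitive operations the item requires and then feeds that difficulty into a Rasch-type response function. The Q-matrix, the operation difficulties, and the function names are all hypothetical values invented for this example; they are not the authors' data, and no estimation procedure is shown.

```python
import math


def lltm_item_difficulty(q_row, eta):
    """Item difficulty as a linear combination of operation difficulties:
    beta_i = sum_k q_ik * eta_k."""
    return sum(q * e for q, e in zip(q_row, eta))


def rasch_probability(theta, beta):
    """Probability of a correct response for ability theta and difficulty beta."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


# Hypothetical Q-matrix: rows are items, columns are cognitive operations.
# A 1 means the item requires that operation.
Q = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]

# Hypothetical operation difficulties in logits (assumed, not estimated).
eta = [-0.5, 0.4, 0.9]

# Item difficulties implied by the operations each item requires.
betas = [lltm_item_difficulty(row, eta) for row in Q]

# Success probabilities for a student with ability theta = 0.
probs = [rasch_probability(0.0, b) for b in betas]

for i, (b, p) in enumerate(zip(betas, probs), start=1):
    print(f"item {i}: difficulty = {b:.2f}, P(correct) = {p:.3f}")
```

In this toy setup every extra operation with positive difficulty makes an item harder, so a difference between two items can be read in terms of the operations that separate them. That is the kind of added interpretability the abstract refers to.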
