Is Psychometrics So Useful for Academic Psychology?

Keywords: psychometric modelling, latent construct modelling, psychological construct, psychological theory, test

Abstract

Psychological theories of ability and personality traits often rely on the results of psychometric modelling. Such modelling is assumed to link responses to test items to an unobserved 'construct' (a trait or ability), which is 'modelled' from the test data. However, does agreement between the data and the model indicate that the model represents a psychological construct? To what extent is 'psychometric modelling' modelling in the general scientific sense of the term? The validity of using modelling results to understand psychological phenomena depends on the answers to these questions. The article analyses the logic of psychometric modelling in comparison with modelling in other sciences and argues that psychological phenomena, as the subject of modelling, are involved neither in the construction nor in the correction of models. It raises the problem of unjustified interpretations of modelling results in psychology and their undesirable consequences for psychological theory. At the same time, the use of psychometric modelling for human resource decision-making still awaits evaluation.

Published
2023-11-06
How to Cite
Tyumeneva, Yulia A. 2023. “Is Psychometrics So Useful for Academic Psychology?”. Voprosy Obrazovaniya / Educational Studies Moscow, no. 3 (November). https://doi.org/10.17323/vo-2023-16781.
Section
SI Psychometrics