Preliminary Development of an Item Bank and an Adaptive Test in Mathematical Knowledge for University Students

Fernanda Belén Ghio 1*, Manuel Bruzzone 1, Luis Rojas-Torres 2, Marcos Cupani 1
1 Instituto de Investigaciones Psicológicas (IIPSI), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Facultad de Psicología, Universidad Nacional de Córdoba (UNC), ARGENTINA
2 Institute for Psychological Research, University of Costa Rica, COSTA RICA
* Corresponding Author
EUR J SCI MATH ED, Volume 10, Issue 3, pp. 352-365. https://doi.org/10.30935/scimath/11968
Published: 06 April 2022

ABSTRACT

In recent decades, the development of computerized adaptive testing (CAT) has allowed more precise measurement with a smaller number of items. In this study, we developed an item bank (IB) to support the adaptive algorithm and simulated the functioning of a CAT to assess domains of mathematical knowledge in Argentinian university students (N=773). Data were analyzed using the Rasch model, and a simulation study implemented in the R software was used to determine how many IB items are needed to estimate examinee ability. Our results indicate that the IB covering the domains of mathematical knowledge is adequate for use in a CAT, especially for estimating average ability levels. CAT is recommended for rapidly generating indicators of the knowledge students have acquired and for designing educational strategies that enhance student performance. Results, constraints, and implications of this study are discussed.
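
To make the procedure described in the abstract concrete, the sketch below shows one way a Rasch-based CAT simulation of this kind can be set up in base R. It is not the authors' code: the bank size, item difficulties, fixed test length, maximum-information item selection, and EAP ability estimation on a quadrature grid are all assumptions made only for illustration.

## Minimal sketch (assumed setup, not the authors' implementation):
## a dichotomous Rasch model, a hypothetical bank of 100 items,
## maximum-information item selection, and EAP ability estimation.

set.seed(123)

n_items    <- 100
b          <- rnorm(n_items, 0, 1.2)   # hypothetical item difficulties (logits)
theta_true <- 0.5                      # simulated examinee ability
test_len   <- 20                       # fixed-length stopping rule

grid  <- seq(-4, 4, length.out = 121)  # quadrature grid for EAP
prior <- dnorm(grid)                   # standard normal prior

p_rasch <- function(theta, b) 1 / (1 + exp(-(theta - b)))

administered <- integer(0)
responses    <- integer(0)
theta_hat    <- 0                      # starting ability estimate

for (step in seq_len(test_len)) {
  # Fisher information of each unused item at the current ability estimate
  available <- setdiff(seq_len(n_items), administered)
  p_avail   <- p_rasch(theta_hat, b[available])
  info      <- p_avail * (1 - p_avail)
  next_item <- available[which.max(info)]

  # Simulate the examinee's response from the true ability
  resp <- as.integer(runif(1) < p_rasch(theta_true, b[next_item]))

  administered <- c(administered, next_item)
  responses    <- c(responses, resp)

  # EAP update: posterior over the grid given all responses so far
  like <- sapply(grid, function(t) {
    p <- p_rasch(t, b[administered])
    prod(p^responses * (1 - p)^(1 - responses))
  })
  post      <- like * prior
  theta_hat <- sum(grid * post) / sum(post)
}

se_hat <- sqrt(sum(grid^2 * post) / sum(post) - theta_hat^2)
cat(sprintf("EAP estimate after %d items: %.2f (SE = %.2f)\n",
            test_len, theta_hat, se_hat))

Repeating such a loop over many simulated examinees and bank configurations is one way to gauge how many items are required to reach a target precision, which is the kind of question the simulation design in the study addresses.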

CITATION

Ghio, F. B., Bruzzone, M., Rojas-Torres, L., & Cupani, M. (2022). Preliminary Development of an Item Bank and an Adaptive Test in Mathematical Knowledge for University Students. European Journal of Science and Mathematics Education, 10(3), 352-365. https://doi.org/10.30935/scimath/11968
