Research Article

Preliminary Development of an Item Bank and an Adaptive Test in Mathematical Knowledge for University Students

Fernanda Belén Ghio 1*, Manuel Bruzzone 1, Luis Rojas-Torres 2, Marcos Cupani 1
1 Instituto de Investigaciones Psicológicas (IIPSI), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Facultad de Psicología, Universidad Nacional de Córdoba (UNC), ARGENTINA
2 Institute for Psychological Research, University of Costa Rica, COSTA RICA
* Corresponding Author
European Journal of Science and Mathematics Education, 10(3), July 2022, 352-365, https://doi.org/10.30935/scimath/11968
Published: 06 April 2022
OPEN ACCESS

ABSTRACT

In recent decades, the development of computerized adaptive testing (CAT) has allowed more precise measurement with fewer items. In this study, we develop an item bank (IB) to generate the adaptive algorithm and simulate the functioning of a CAT that assesses domains of mathematical knowledge in Argentinian university students (N=773). Data were analyzed using the Rasch model. A simulation design created in R was used to determine the IB items needed to estimate examinee ability. Our results indicate that the IB in the domains of mathematical knowledge is adequate for use in CAT, especially for estimating average ability levels. The use of CAT is recommended for rapidly generating indicators of the knowledge acquired by students and for designing educational strategies that enhance student performance. Results, constraints, and implications of this study are discussed.
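To make the idea of a Rasch-based CAT simulation concrete, the following is a minimal illustrative sketch in Python. It is not the authors' R code or their simulation design. The selection rule (maximum Fisher information), the ability estimator (expected a posteriori with a standard normal prior), the stopping rule (`se_target`, `max_items`), and all function names are assumptions chosen for illustration only.

```python
import math
import random


def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_information(theta, b):
    """Fisher information of a Rasch item at ability theta: p * (1 - p)."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)


def eap_estimate(responses, difficulties, grid=None):
    """Expected a posteriori ability estimate and posterior SD.

    Assumes a standard normal prior evaluated on a fixed grid
    (an illustrative choice, not taken from the article).
    """
    if grid is None:
        grid = [-4.0 + 0.1 * k for k in range(81)]
    weights = []
    for t in grid:
        w = math.exp(-0.5 * t * t)  # unnormalized N(0, 1) prior density
        for x, b in zip(responses, difficulties):
            p = rasch_prob(t, b)
            w *= p if x == 1 else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)


def simulate_cat(true_theta, bank, se_target=0.4, max_items=30, rng=None):
    """Simulate one adaptive test for an examinee with ability true_theta.

    bank is a list of Rasch item difficulties. Returns the final ability
    estimate, its posterior SD, and the number of items administered.
    """
    rng = rng or random.Random(0)
    remaining = list(range(len(bank)))
    responses, used = [], []
    theta, se = 0.0, float("inf")
    while remaining and len(used) < max_items and se > se_target:
        # Pick the unused item that is most informative at the current estimate.
        nxt = max(remaining, key=lambda i: item_information(theta, bank[i]))
        remaining.remove(nxt)
        # Draw a simulated response from the Rasch model at the true ability.
        x = 1 if rng.random() < rasch_prob(true_theta, bank[nxt]) else 0
        responses.append(x)
        used.append(bank[nxt])
        theta, se = eap_estimate(responses, used)
    return theta, se, len(used)


if __name__ == "__main__":
    # Hypothetical bank: 100 items with difficulties evenly spaced on [-3, 3].
    bank = [-3.0 + 6.0 * k / 99 for k in range(100)]
    print(simulate_cat(0.5, bank))
```

Because a Rasch item is most informative when its difficulty matches the examinee's ability, a bank with many items near the middle of the scale tends to reach the stopping criterion faster for average examinees than for extreme ones, which is consistent with the abstract's remark about average ability levels.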

CITATION (APA)

Ghio, F. B., Bruzzone, M., Rojas-Torres, L., & Cupani, M. (2022). Preliminary Development of an Item Bank and an Adaptive Test in Mathematical Knowledge for University Students. European Journal of Science and Mathematics Education, 10(3), 352-365. https://doi.org/10.30935/scimath/11968
