Establishing a physics concept inventory using computer marked free-response questions

Mark A. J. Parker; Holly Hedgeland; Sally E. Jordan; Nicholas St. J. Braithwaite

doi:10.30935/scimath/12680

Mark A. J. Parker ¹*, Holly Hedgeland ¹,², Sally E. Jordan ¹, Nicholas St. J. Braithwaite ¹

¹ School of Physical Sciences, The Open University, Walton Hall, Milton Keynes, MK7 6AA, UK
² Clare Hall, University of Cambridge, Herschel Road, Cambridge, CB3 9AL, UK
* Corresponding author

European Journal of Science and Mathematics Education, Volume 11, Issue 2, pp. 360-375. https://doi.org/10.30935/scimath/12680

Published Online: 05 December 2022, Published: 01 April 2023

OPEN ACCESS

ABSTRACT

This study covers the development and testing of the alternative mechanics survey (AMS), a modified force concept inventory (FCI) that uses automatically marked free-response questions. Data were collected over three academic years from 611 participants taking physics classes at high school and university level. In total, 8,091 question responses were gathered to develop and test the AMS. The AMS questions were tested for reliability using classical test theory (CTT), and the AMS computer marking rules were tested for reliability using inter-rater reliability (IRR). Findings from the CTT and IRR studies demonstrated that the AMS questions and marking rules were reliable overall. The AMS was therefore established as a physics concept inventory that uses automatically marked free-response questions. The approach used to develop and test the AMS could be applied in further attempts to develop concept inventories that use automatically marked free-response questions.

Keywords: computer-marked assessment, automated marking, concept inventories, free-response questions, physics education
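
As an illustration of the two reliability checks named in the abstract, the sketch below computes one common CTT statistic (item difficulty, the fraction of participants answering correctly) and one common IRR statistic (Cohen's kappa between a human marker and the computer marking rules). The abstract does not state which statistics the authors used, so both choices are assumptions, and the function names and marks are hypothetical data for illustration only.

```python
def item_difficulty(item_scores):
    """CTT item difficulty: proportion of responses marked correct (1)."""
    if not item_scores:
        raise ValueError("need at least one score")
    return sum(item_scores) / len(item_scores)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty mark lists")
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Observed proportion of responses on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical marks for one question: 1 = correct, 0 = incorrect.
human_marks = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
computer_marks = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]

print("difficulty (computer marks):", item_difficulty(computer_marks))
print("kappa (human vs computer):", round(cohen_kappa(human_marks, computer_marks), 2))
```

A kappa close to 1 would suggest that the marking rules agree with the human marker well beyond chance; the threshold treated as acceptable is a judgment the full paper would need to specify.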

CITATION

Parker, M. A. J., Hedgeland, H., Jordan, S. E., & Braithwaite, N. S. J. (2023). Establishing a physics concept inventory using computer marked free-response questions. European Journal of Science and Mathematics Education, 11(2), 360-375. https://doi.org/10.30935/scimath/12680

