ChatGPT versus engineering education assessment: a multidisciplinary and multi-institutional benchmarking and analysis of this generative artificial intelligence tool to investigate assessment integrity

Sasha Nikolic, Scott Daniel, Rezwanul Haque, Marina Belkina, Ghulam Mubashar Hassan, Sarah Grundy, Sarah Lyden, Peter Neal, Caz Sandison

Research output: Contribution to journal, Article, peer-review

27 Citations (Scopus)

Abstract

ChatGPT, a sophisticated online chatbot, sent shockwaves through many sectors once reports filtered through that it could pass exams. In higher education, it has raised many questions about the authenticity of assessment and challenges in detecting plagiarism. Amongst the resulting frenetic hubbub, hints of potential opportunities in how ChatGPT could support learning and the development of critical thinking have also emerged. In this paper, we examine how ChatGPT may affect assessment in engineering education by exploring ChatGPT responses to existing assessment prompts from ten subjects across seven Australian universities. We explore the strengths and weaknesses of current assessment practice and discuss opportunities on how ChatGPT can be used to facilitate learning. As artificial intelligence is rapidly improving, this analysis sets a benchmark for ChatGPT’s performance as of early 2023 in responding to engineering education assessment prompts. ChatGPT did pass some subjects and excelled with some assessment types. Findings suggest that changes in current practice are needed, as typically with little modification to the input prompts, ChatGPT could generate passable responses to many of the assessments, and it is only going to get better as future versions are trained on larger data sets.
Original language: English
Pages (from-to): 559-614
Number of pages: 56
Journal: European Journal of Engineering Education
Volume: 48
Issue number: 4
Early online date: 26 May 2023
Publication status: Published - 2023
