ChatGPT versus engineering education assessment: a multidisciplinary and multi-institutional benchmarking and analysis of this generative artificial intelligence tool to investigate assessment integrity

TLDR

ChatGPT’s ability to pass exams has raised concerns about assessment authenticity in higher education, while also hinting at potential benefits for learning and critical thinking. This study examines how ChatGPT may affect engineering assessment by testing its responses to existing assessment prompts from ten subjects across seven Australian universities and by evaluating current assessment practice. The results set a benchmark for ChatGPT’s performance as of early 2023, covering strengths and weaknesses of current assessment and opportunities for using ChatGPT to facilitate learning. ChatGPT passed some subjects and excelled at some assessment types. With little modification to the input prompts, it could typically generate passable responses to many of the assessments, which suggests that current practice needs to change as future versions improve.

Abstract

ChatGPT, a sophisticated online chatbot, sent shockwaves through many sectors once reports filtered through that it could pass exams. In higher education, it has raised many questions about the authenticity of assessment and challenges in detecting plagiarism. Amongst the resulting frenetic hubbub, hints of potential opportunities in how ChatGPT could support learning and the development of critical thinking have also emerged. In this paper, we examine how ChatGPT may affect assessment in engineering education by exploring ChatGPT responses to existing assessment prompts from ten subjects across seven Australian universities. We explore the strengths and weaknesses of current assessment practice and discuss opportunities on how ChatGPT can be used to facilitate learning. As artificial intelligence is rapidly improving, this analysis sets a benchmark for ChatGPT’s performance as of early 2023 in responding to engineering education assessment prompts. ChatGPT did pass some subjects and excelled with some assessment types. Findings suggest that changes in current practice are needed, as typically with little modification to the input prompts, ChatGPT could generate passable responses to many of the assessments, and it is only going to get better as future versions are trained on larger data sets.
