
Coding energy knowledge in constructed responses with explainable NLP models

Publication: Article, 15 Dec 2022, Netherlands. Publisher: Wiley. Journal: Journal of Computer Assisted Learning, volume 39, pages 767-786 (issn: 0266-4909, eissn: 1365-2729)

Authors: Sebastian Gombert; Daniele Di Mitri; Onur Karademir; Marcus Kubsch; Hannah Kolbe; Simon Tautz; Adrian Grimm; +3 Authors

doi: 10.1111/jcal.12767

handle: 1820/596123c6-87df-4736-9de6-3b74cdcc737b


Abstract

Background: Formative assessments are needed to enable monitoring how student knowledge develops throughout a unit. Constructed response items which require learners to formulate their own free‐text responses are well suited for testing their active knowledge. However, assessing such constructed responses in an automated fashion is a complex task and requires the application of natural language processing methodology. In this article, we implement and evaluate multiple machine learning models for coding energy knowledge in free‐text responses of German K‐12 students to items in formative science assessments which were conducted during synchronous online learning sessions.

Dataset: The dataset we collected for this purpose consists of German constructed responses from 38 different items dealing with aspects of energy such as manifestation and transformation. The units and items were implemented with the help of project‐based pedagogy and evidence‐centered design, and the responses were coded for seven core ideas concerning the manifestation and transformation of energy. The data was collected from students in seventh, eighth and ninth grade.

Methodology: We train various transformer‐ and feature‐based models and compare their ability to recognize the respective ideas in students' writing. Moreover, as domain knowledge and its development can be formally modeled through knowledge networks, we evaluate how well the detection of the ideas within responses translated into accurate co‐occurrence‐based knowledge networks. Finally, in terms of the descriptive accuracy of our models, we inspect what features played a role for which prediction outcome and if the models pick up on undesired shortcuts. In addition to this, we analyze how much the models match human coders in what evidence within responses they consider important for their coding decisions.

Results: A model based on a modified GBERT‐large can achieve the overall most promising results, although descriptive accuracy varies much more than predictive accuracy for the different ideas assessed. For reasons of comparability, we also evaluate the same machine learning architecture using the SciEntsBank 3‐Way benchmark with an English RoBERTa‐large model, where it achieves state‐of‐the‐art results in two out of three evaluation categories.
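The co-occurrence-based knowledge networks mentioned in the methodology can be illustrated with a minimal sketch. It assumes each response has already been coded into a set of idea labels and counts how often two ideas appear in the same response. The function name and the idea labels are hypothetical placeholders, not the coding scheme or implementation used in the article.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(coded_responses):
    """Count how often each pair of ideas is coded in the same response.

    Returns a Counter mapping an alphabetically ordered pair of idea
    labels to the number of responses containing both ideas.
    """
    edges = Counter()
    for ideas in coded_responses:
        # Sorting gives every undirected pair a single canonical key.
        for a, b in combinations(sorted(set(ideas)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical idea labels, for illustration only.
responses = [
    {"kinetic", "transformation"},
    {"kinetic", "thermal", "transformation"},
    {"thermal"},
]
network = cooccurrence_network(responses)
for (a, b), weight in sorted(network.items()):
    print(f"{a} -- {b}: {weight}")
```

Under this sketch, a network built from model predictions could be compared edge by edge with one built from human codes, which is one plausible way to read the evaluation the abstract describes.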

Country

Netherlands

Related Organizations

Goethe University Frankfurt, Germany
Open University in the Netherlands, Netherlands
Ruhr University Bochum, Germany
German Institute for International Educational Research, Germany
Leibniz Institute for Science and Mathematics Education, Germany


Keywords

Physikunterricht, Pupil Evaluation, Lower secondary, Modell, Teaching of physics, energy didactics, Secondary education lower level, Automatisierung, Germany, Task, Empirische Bildungsforschung, knowledge networks, Sekundarstufe I, Energy, Coding, Empirische Untersuchung, Pupils, Text, Empirical study, Physics Education, automated short answer scoring, Knowledge, Lower level secondary education, Schüler, Energie, Lower secondary education, Leistungsbeurteilung, Encoding (Psychology), Erziehung, Schul- und Bildungswesen, Assessment, Wissen, Education, Judgment, Achievement test, Codierung, Deutschland, automated coding, constructed response assessment, Pupil, Physics lessons, Computerunterstützter Unterricht, energy transformation, Antwort, Achievement measurement, Judgement, Encoding, short answer scoring, Bewertung, Aufgabe, ddc:370

Impact by BIP!

citations: 23. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.