Validating automated assessments of teaching effectiveness using multimodal data

  • Tim Fütterer
  • Ruikun Hou
  • Babette Bühler
  • Efe Bozkir
  • Courtney Bell
  • Enkelejda Kasneci
  • Peter Gerjets
  • Ulrich Trautwein

Research output: Contribution to journal › Article › peer-review

Abstract

Background: High-quality teaching is essential for enhancing student learning in classrooms. Research has highlighted core dimensions of effective teaching, including classroom management, student support, and cognitive activation. However, traditional methods of assessing teaching effectiveness dimensions (e.g., student surveys) have limitations, including rating biases and resource intensiveness.

Aims: To overcome these challenges, we explored machine learning (ML) algorithms for the automated assessment of teaching effectiveness.

Sample: The study analyzed multimodal data (such as video, audio, and transcripts) from the Global Teaching Insights study, which included video recordings and transcripts from 46 teachers and 1,132 students in Germany.

Method: Scores for 18 teaching effectiveness subdimensions from three core dimensions were automatically generated by training attention-based ML models on multimodal features extracted from pretrained encoders. These ML-generated scores were compared with scores provided by human experts. A content validity study was conducted, in which human experts evaluated the plausibility of ML-generated scores against human-generated scores. Structural equation models were used to assess the relationship between teaching effectiveness subdimensions and students’ tested achievement.

Results: ML-generated scores were more reliable for some subdimensions (e.g., nature of discourse), and they were also plausible and content valid. ML-generated scores achieved higher absolute accuracy than human scores in 11 of 18 subdimensions. Limitations include reliance on human ratings as ground truth and inconsistent predictive validity, underscoring the need for refined models to generate actionable insights, such as real-time feedback systems.

Conclusions: The findings provide valuable insights for the development of automated feedback, enhancing the practical application of teaching effectiveness assessments.

Original language: English
Article number: 102264
Journal: Learning and Instruction
Volume: 101
DOIs
State: Published - Feb 2026

Keywords

  • Artificial intelligence
  • Automated assessment
  • Machine learning
  • Multimodal data
  • Teaching effectiveness
