
Using Semantics of Textbook Highlights to Predict Student Comprehension and Knowledge Retention

EasyChair Preprint no. 9161

13 pages · Date: October 26, 2022


As students read textbooks, they often highlight the material they deem to be most important. We mine students' highlights to predict their subsequent performance on quiz questions. Past research in this area has encoded highlights in terms of where the highlights appear in the stream of text — a positional representation. In this work, we construct a semantic representation based on a state-of-the-art deep-learning sentence embedding technique (SBERT) that captures the content-based similarity between quiz questions and highlighted (as well as non-highlighted) sentences in the text. We construct regression models that include latent variables for student skill level and question difficulty and augment the models with highlighting features. We find that highlighting features reliably boost model performance. We conduct experiments that validate models on held-out questions, students, and student–question pairs, and find strong generalization for the latter two but not for held-out questions. Surprisingly, highlighting features improve models for questions at all levels of the Bloom taxonomy, from straightforward recall questions to inferential synthesis/evaluation/creation questions.
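The content-based similarity feature described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes sentence embeddings (e.g., produced by an SBERT model) are already available as vectors, and the feature names are hypothetical placeholders for whatever summary statistics the models actually use.

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def highlight_features(question_emb, highlighted_embs, non_highlighted_embs):
    """Summarize question-to-sentence similarity separately for highlighted
    and non-highlighted sentences (hypothetical featurization for a
    regression model over student skill and question difficulty)."""
    hi = [cosine_similarity(question_emb, s) for s in highlighted_embs]
    lo = [cosine_similarity(question_emb, s) for s in non_highlighted_embs]
    return {
        "max_sim_highlighted": max(hi) if hi else 0.0,
        "mean_sim_highlighted": float(np.mean(hi)) if hi else 0.0,
        "max_sim_non_highlighted": max(lo) if lo else 0.0,
        "mean_sim_non_highlighted": float(np.mean(lo)) if lo else 0.0,
    }
```

In practice the embeddings would come from a pretrained sentence encoder (the paper names SBERT), with one vector per quiz question and per textbook sentence; the resulting scalar features then augment the latent-variable regression model.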

Keyphrases: deep embeddings, natural language processing, student modeling, textbook annotation

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
  @booklet{EasyChair:9161,
  author = {David Young-Jae Kim and Tyler Scott and Debshila Mallick and Michael Mozer},
  title = {Using Semantics of Textbook Highlights to Predict Student Comprehension and Knowledge Retention},
  howpublished = {EasyChair Preprint no. 9161},
  year = {EasyChair, 2022}}