Sentiment and intent classification of in-text citations using BERT

17 pages • Published: July 18, 2022

Abstract

Methods such as the h-index and the journal impact factor are commonly used by the scientific community to quantify the quality or impact of research output. These methods rely primarily on citation frequency without taking the context of citations into consideration. Furthermore, these methods weigh each citation equally, ignoring valuable citation characteristics such as citation intent and sentiment. The correct classification of citation intents and sentiments can therefore be used to further improve scientometric impact metrics.
In this paper we evaluate BERT for intent and sentiment classification of in-text citations of articles contained in the database of the Association for Computing Machinery (ACM) library. We analyse various BERT models which are fine-tuned with appropriately labelled datasets for citation sentiment classification and citation intent classification.
Our results show that BERT can be used effectively to classify in-text citations. We also find that shorter citation context ranges can significantly improve their classification. Lastly, we also evaluate these models with a manually annotated test dataset for sentiment classification and find that BERT-cased and SciBERT-cased perform the best.

Keyphrases: bert, citation analysis, neural networks, text classification

In: Aurona Gerber (editor). Proceedings of 43rd Conference of the South African Institute of Computer Scientists and Information Technologists, vol 85, pages 129-145.

Links:	https://easychair.org/publications/paper/6Fbc
	https://doi.org/10.29007/wk21

BibTeX entry

@inproceedings{SAICSIT2022:Sentiment_intent_classification_text,
  author    = {Ruan Visser and Marcel Dunaiski},
  title     = {Sentiment and intent classification of in-text citations using BERT},
  booktitle = {Proceedings of 43rd Conference of the South African Institute of Computer Scientists and Information Technologists},
  editor    = {Aurona Gerber},
  series    = {EPiC Series in Computing},
  volume    = {85},
  publisher = {EasyChair},
  bibsource = {EasyChair, https://easychair.org},
  issn      = {2398-7340},
  url       = {https://easychair.org/publications/paper/6Fbc},
  doi       = {10.29007/wk21},
  pages     = {129--145},
  year      = {2022}}
