Arianna Graciotti


2025

“... like a needle in a haystack”: Annotation and Classification of Comparative Statements
Pritha Majumdar | Franziska Pannach | Arianna Graciotti | Johan Bos
Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025)

We present a clear distinction between the phenomena of comparisons and similes, along with fine-grained annotation guidelines that facilitate the structural annotation and assessment of the two classes. Our work makes three major contributions: 1) a publicly available annotated data set of 100 comparative statements; 2) theoretically grounded annotation guidelines for human annotators; and 3) results of machine learning experiments that establish how the often subtle distinction between the two phenomena can be automated.

2024

Latent vs Explicit Knowledge Representation: How ChatGPT Answers Questions about Low-Frequency Entities
Arianna Graciotti | Valentina Presutti | Rocco Tripodi
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

In this paper, we present an evaluation of two different approaches to the free-form Question Answering (QA) task. The main difference between the two approaches is that one is based on latent representations of knowledge, while the other uses explicit knowledge representation. For the evaluation, we developed DynaKnowledge, a new benchmark composed of questions concerning low-frequency Wikipedia entities. We wanted to ensure, on the one hand, that the questions are answerable and, on the other, that the models can provide information about very specific facts. Our evaluation highlights that the proposed benchmark is particularly challenging: the best model answers only 50% of the questions correctly. Analysing the results, we also found that ChatGPT is unreliable on questions about low-frequency entities, manifesting a popularity bias. On the other hand, a simpler model based on explicit knowledge is less affected by this bias. With this paper, we aim to provide a living benchmark for free-form QA on which knowledge-based and latent-representation models can be tested dynamically.