Articles

DeepSeek-R1 Thoughtology: Let’s think about LLM reasoning

SV Marjanović, A Patel, V Adlakha, M Aghajohari, P BehnamGhader, M Bhatia, A Khandelwal, A Kraft, B Krojer, XH Lù, N Meade, D Shin, A Kazemnejad, G Kamath, M Mosbach, K Stańczak, S Reddy

In TMLR 2026

Posted on 02 April 2025

Large Reasoning Models like DeepSeek-R1 mark a fundamental shift in how LLMs approach complex problems. Instead of directly producing an answer for a given input, DeepSeek-R1 creates detailed multi-step reasoning chains, seemingly “thinking” about a problem before providing an answer. This reasoning process is publicly available to the user, creating...

A Reality Check on Context Utilisation for Retrieval-Augmented Generation

L Hagström, SV Marjanović, H Yu, A Arora, C Lioma, M Maistro, P Atanasova, I Augenstein

In ACL 2025

Posted on 24 January 2025

Retrieval-augmented generation (RAG) helps address the limitations of the parametric knowledge embedded within a language model (LM). However, investigations of how LMs utilise retrieved information of varying complexity in real-world scenarios have been limited to synthetic contexts. We introduce DRUID (Dataset of Retrieved Unreliable, Insufficient and Difficult-to-understand contexts) with real-world...

DYNAMICQA: Tracing Internal Knowledge Conflicts in Language Models

SV Marjanović, H Yu, P Atanasova, M Maistro, C Lioma, I Augenstein

In EMNLP Findings 2024

Posted on 24 July 2024

Knowledge-intensive language understanding tasks require Language Models (LMs) to integrate relevant context, mitigating their inherent weaknesses, such as incomplete or outdated knowledge. However, conflicting knowledge can be present in the LM’s parameters, termed intra-memory conflict, which can affect a model’s propensity to accept contextual knowledge. To study the effect of...

Investigating the Impact of Model Instability on Explanations and Uncertainty

SV Marjanović, I Augenstein, C Lioma

In ACL Findings 2024

Posted on 04 June 2024

Explainable AI methods facilitate the understanding of model behaviour, yet small, imperceptible perturbations to inputs can vastly distort explanations. As these explanations are typically evaluated holistically, before model deployment, it is difficult to assess when a particular explanation is trustworthy. Some studies have tried to create confidence estimators for explanations,...

Quantifying gender biases towards politicians on Reddit

Sara Marjanović, Karolina Stańczak, Isabelle Augenstein

In PLOS ONE

Posted on 26 October 2022

Despite attempts to increase gender parity in politics, global efforts have struggled to ensure equal female representation. This is likely tied to implicit gender biases against women in authority. We present a comprehensive study of gender biases that appear in online political discussion. To this end, we collect 10 million...

Ridiculing the “tinfoil hats:” Citizen responses to COVID-19 misinformation in the Danish facemask debate on Twitter

N Johansen, SV Marjanović, CV Kjaer, RB Baglini, R Adler-Nissen

In HKS Misinformation Review

Posted on 02 March 2022

We study how citizens engage with misinformation on Twitter in Denmark during the COVID-19 pandemic. We find that misinformation regarding facemasks is not corrected through counterarguments or fact-checking. Instead, many tweets rejecting misinformation use humor to mock misinformation spreaders, whom they pejoratively label wearers of “tinfoil hats.” Tweets rejecting misinformation...
