
The expression of hate speech against Afro-descendant, Roma, and LGBTQ+ communities in YouTube comments
What’s in a word, and especially one mobilized for online hate speech (OHS)? A team of INESC-ID researchers has asked exactly that.
Authored by INESC-ID Information and Decision Support Systems (IDSS) researchers Paula Carvalho and Danielle Caled, and Human Language Technologies (HLT) researchers Fernando Batista and Ricardo Ribeiro (together with Cláudia Silva from ITI-LARSyS)*, “The expression of hate speech against Afro-descendant, Roma, and LGBTQ+ communities in YouTube comments”, published this month in the Journal of Language Aggression and Conflict, explores the prevalence of overt and covert hate speech, counter-speech and offensive speech in CO-HATE (Counter, Offensive and Hate speech), a corpus of 20,590 Portuguese YouTube comments posted by more than 8,000 different online users.
The team asked two simple yet challenging questions: 1) how does OHS against the Afro-descendant, Roma, and LGBTQ+ communities materialize in the Portuguese social context? and 2) which are the main linguistic and rhetorical features underlying the expression of covert hate speech? To answer them, the researchers created a detailed database of written Portuguese (essential for studying and identifying online hate speech targeting these communities on social media) and analyzed the specific characteristics of hateful comments towards these groups by combining quantitative and qualitative research methods based on corpus linguistics, which analyzes large collections of texts to uncover patterns and relationships between words and structures, thus providing data-driven insights into how language is used. They then measured agreement among annotators when identifying OHS and related topics.
By studying how people express hatred in their comments, the team found that comment writers often use specific language and persuasive techniques. They also discovered that hate speech is often hidden behind irony and misleading arguments, a kind of speech that tries to make people afraid and encourages them to take action.
This study offers valuable insights that can help detect online hate speech more effectively. It also deepens our understanding of how hate speech works online in Portugal, especially towards marginalized groups. Furthermore, the corpus created by Paula Carvalho et al. will be a valuable resource for those interested in developing methods to detect both obvious and hidden hate speech, as well as other related behaviors like counter-speech and offensive language, in Portuguese.
Future research avenues might involve expanding this study to other social media platforms like Twitter and including more communities such as migrants and refugees. The team is also planning to involve more annotators, considering their social backgrounds, to better assess agreement between different communities.
This project follows a very successful research line at INESC-ID. Last year we reported on the FCT-funded HATE COVID-19.PT project, coordinated by Paula Carvalho, under which methods were created for semi-automatically building a large-scale annotated Portuguese corpus covering online hate speech.
*Paula Carvalho, Danielle Caled and Cláudia Silva are also affiliated with Instituto Superior Técnico, and Ricardo Ribeiro with Instituto Universitário de Lisboa (ISCTE-IUL).
Upcoming Events
INESC Lisboa 2023 Annual Meeting

The first Meeting of INESC Lx, a consortium of the INESC institutions based in Lisbon (INESC-ID, INOV and INESC MN), will take place on November 3rd.
The event will be held at Hotel Golf Mar, in Vimeiro, giving all participants the opportunity to discuss and share the work being developed at INESC Lx. The agenda includes Teresa Almeida, President of the Commission for Regional Development and Coordination of Lisbon and Tagus Valley (CCDR LVT), in the Opening Session, and the Portuguese Minister of Economy, António da Costa Silva, in the Closing Session.
If you are a member of INESC Lx you can register here by September 29th.
About INESC Lisboa
INESC Lisboa is a unique research consortium in Portugal, representing INESC-ID, INESC MN and INOV, research institutions in the areas of computer science, electronics and computer engineering, and physics engineering in Lisbon. Its mission is to facilitate synergies among the institutes to promote the Research, Development, and Innovation (R&D+i) made in Portugal.
ER 2023 in Lisbon, November 6-9

This year’s edition of the International Conference on Conceptual Modeling (ER 2023) will be held from November 6 to 9 in Lisbon, at the Congress Center of the Instituto Superior Técnico (Alameda Campus).
ER 2023 is the main international forum for discussing the state of the art, emerging issues, and future challenges in research and practice on conceptual modeling.
Topics of interest span the entire spectrum of conceptual modeling, including research and practice in areas such as theories of concepts and ontologies, techniques for transforming conceptual models into effective implementations, and methods and tools for developing and communicating conceptual models.
The event is being jointly organized by INESC-ID, Instituto Superior Técnico and the University of Twente.
Keynote speakers:
- Daniel Jackson, “Software Design, Concepts and AI”
- Maurizio Lenzerini, “Conceptual Modeling and Knowledge Representation: a journey from Data Modeling to Knowledge Graphs”
- Walid S. Saba, “Reverse Engineering of Language at Scale: Towards Symbolic and Explainable Large Language Models”
- Catia Pesquita, “True or False? The impact of negative knowledge in biomedical artificial intelligence”
The event’s agenda is available here, along with online registration.
More on the ER 2023 website.
OLISSIPO Workshop: “Metabolism and mathematical models: Two for a tango”

The 3rd Edition of the workshop “Metabolism and mathematical models: Two for a tango” will take place online on November 14-15, 2023.
The topic of the workshop is metabolism in general. As in previous editions, a special focus will be placed on parasitology. Besides exploring the biological, biochemical and biomedical aspects of metabolism, the workshop will also present some of the mathematical modelling, algorithmic theory and software development that have become crucial to exploring such aspects.
More details will be provided soon, but initial information can already be found here.
Keynote speakers:
- Barbara Bakker (Faculty of Medical Sciences, University of Groningen, The Netherlands)
- Igor Cestari (Institute of Parasitology, McGill University, Montréal, Canada)
- Vassily Hatzimanikatis (École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland)
- Steffen Klamt (Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany)
- Ina Koch (Goethe Universität, Frankfurt am Main, Germany)
- Laura-Isobel McCall (San Diego State University, US)
Besides the talks given by the keynote speakers above, there will be two sessions, one per day, of 30 minutes each, dedicated to a discussion of some specific open questions. More details on these will be provided very soon. The workshop will take place in the afternoons, from 2pm to 5:30pm CET.
Registration is free but required; you can register here.
Olissipo website.