INESC-ID Talk: “Machine Learning as a magnifying glass to study society” by Joana Gonçalves de Sá (SPAC, LIP)

On January 23, at 13:00, INESC-ID will host a talk by Joana Gonçalves de Sá, a researcher at LIP and the coordinating investigator of the Social Physics and Complexity (SPAC) research group, at Técnico. The talk is titled “Machine Learning as a magnifying glass to study society”.

Date & Time: January 23, 13:00
Where: INESC Lisboa, Rua Alves Redol, 9, 1000-029 Lisboa | Room 9 (Auditorium), Ground Floor

“Machine Learning Algorithms (MLAs) are trained on vast amounts of data and work by learning patterns and finding non-linear and often black box mathematical relations between that data. A central challenge MLAs face is that the data used to train them is not generated in a social vacuum: if the data or the targets are biased, the models will also be biased. This creates an important problem: how should MLAs be trained to identify relevant differences in data while not perpetuating or even amplifying prejudice or social bias? To date, the main approach has been deductive, or top-down: researchers or coders start by listing known biases, such as racial prejudice, and then search for signs of their presence in the data, the models, or in societies.

The implicit assumptions are that a) all biases or all types of biased features are known a priori; b) they are identifiable; and c) once identified, they can be debiased against. However, there is no comprehensive and universal list of biases, new biases emerge dynamically, and the coder’s or researcher’s contextual background influences the debiasing approach.

In summary, even screened datasets or models are likely to contain biased patterns. Therefore, it is crucial to develop inductive systems to identify biases in MLAs.

The talk will be divided into two parts. In the first, I will describe the first (to the best of our knowledge) experimental audit study for detecting possible differential tracking in misinformation websites and its impact on third-party content and search engine results. We created a two-stage experimental audit, which resorts to stateful crawlers to mimic users browsing the web, while experimentally controlling for websites, time and geo-location, and collecting online tracking data. By analyzing differences in search-engine recommendations to bots that have different browsing experiences (and, thus, collected different cookies), it should be possible to audit their algorithms for biased customization. I will present results indicating that 1) disinformation websites are tracked more heavily by third parties than non-disinformation websites, 2) simply changing the location of the bots is sufficient to customize the content being recommended, and 3) this has implications for polarization and misinformation spread.

In the second part, I will discuss the possibility of expanding on this and other work and taking advantage of MLAs to identify novel biases. That MLAs so efficiently learn from widely recognized prejudice suggests that it should be possible to use algorithms to reverse the problem and develop statistical, bottom-up tools to identify latent, unknown biases. This is a very preliminary project, and I would value the community’s input.”

Joana Gonçalves de Sá is a researcher at LIP and the coordinating investigator of the Social Physics and Complexity (SPAC) research group. She has a degree in Physics Engineering from Instituto Superior Técnico – University of Lisbon, and a PhD in Systems Biology from NOVA – ITQB, having developed her thesis at Harvard University, USA. Her current research uses data analytics and machine learning to study complex problems at the interface between Biomedicine, Social Sciences, and Computation, with a strong ethical and societal focus. Before that, she was an Associate Professor at Nova School of Business and Economics and a Principal Investigator at Instituto Gulbenkian de Ciência, where she also coordinated the Science for Society Initiative and was the founder and Director of the Graduate Program Science for Development (PGCD), which aims to improve scientific research in Africa. She received two ERC grants (Stg_2019 and PoC_2022) to study human and algorithmic biases using fake news as a model system.

The event is finished.


INESC-ID, “Instituto de Engenharia de Sistemas e Computadores: Investigação e Desenvolvimento em Lisboa”, is a Research and Development and Innovation Organization (R&D+i) in the fields of Computer Science and Electrical and Computer Engineering. INESC-ID’s mission is to produce added value for people and society, supporting the response of public policies to scientific, health, environmental, cultural, social, economic and political challenges. INESC-ID promotes cooperation between academia and industry by addressing research on daily life issues, such as healthcare, space, mobility, agri-food, industry 4.0, and smart grids. This high level of knowledge transfer is achieved through both competitive research projects and direct contracted research. Public and private entities therefore have access to a pool of knowledge, resources and services provided through the unique competencies available at the institution.

