Inria & CNRS, University of Montpellier
With the advent of Web 3.0, the Internet of Things, and citizen science applications, users are producing ever larger amounts of diverse data, which are stored in a wide variety of systems. Since the users’ data spaces are scattered among those independent systems, data sharing becomes a challenging problem. Distributed search and recommendation provides a general solution for data sharing. Among its various alternatives, gossip-based approaches are particularly interesting, as they provide scalability, dynamicity, autonomy and decentralized control. Generally, in these approaches each participant maintains a cluster of “relevant” users, which are later employed in query processing. However, considering only relevance in the construction of the cluster introduces a significant amount of redundancy among users, which in turn leads to reduced recall. Indeed, when a query is submitted, the high similarity among the users in a cluster increases the probability of retrieving the same set of relevant items, thus limiting the number of distinct results that can be obtained. To address this, we take diversity into account, alongside relevance, in the clustering score. In this talk I will present the resulting new gossip-based clustering algorithms and their experimental evaluation over four real datasets, which shows that a diversity-based clustering score yields major gains in terms of recall. In addition, I will present some ongoing work on scientific data management carried out by the Zenith Inria team.
Esther Pacitti is a full professor of Computer Science at the University of Montpellier in the south of France. She is co-head of the Zenith team (Inria & CNRS), pursuing her research in distributed data management and scientific data management. She teaches in an engineering school (Polytech’ Montpellier), where she is responsible for international relations and for welcoming foreign students. Previously, she was an assistant professor at the University of Nantes (2002-2009). Her teaching and research interests include data replication, recommendation systems, query processing in large-scale distributed systems (cluster, P2P, cloud) and scientific workflow management. She has published more than 90 technical papers. She has served or is serving as a program committee member of major international conferences, including SIGMOD, ICDE, CIKM, VLDB and EDBT.
For more information: