Feature extraction for content-based recommendation – Mining the long tail
Paula Vaz Lobo,
Inesc-ID –
Abstract:
The number of items available for consumption surpasses our processing capabilities. New content (books, news, music, video, etc.) is published every day, far exceeding our capacity to make informed choices. Items that we do not know about become potentially useless, because we are not aware of their existence and cannot specifically search for them.
Current recommendation systems try to predict what we want to consume. Nevertheless, they often tend to recommend popular items, because they are mostly based on ratings. This phenomenon shapes the consumer curve as a Pareto distribution, placing popular, rated items in the “head” (the first 20% of the total items) and unpopular, unrated items in the “long tail” (the remaining 80%). Items in the long tail are of recognized interest to smaller groups of people. However, current recommendation systems fail to surface these unpopular items because ratings are scarce. There is a need to help people find interesting unrated items in the long tail.
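As a rough illustration of the head and long-tail split described above, the following sketch ranks items by how many ratings they have received and cuts the ranking at 20%. The function name `split_head_tail`, the `rating_counts` mapping and the sample data are hypothetical and are not taken from the thesis.

```python
def split_head_tail(rating_counts, head_fraction=0.2):
    """Split item ids into (head, tail) by descending rating count.

    rating_counts: dict mapping item id -> number of ratings (hypothetical input).
    head_fraction: share of items treated as the popular "head" (20% in the text).
    """
    # Rank items from most rated to least rated; ties keep insertion order.
    ranked = sorted(rating_counts, key=rating_counts.get, reverse=True)
    if not ranked:
        return [], []
    # Keep at least one item in the head so the split is never degenerate.
    head_size = max(1, round(len(ranked) * head_fraction))
    return ranked[:head_size], ranked[head_size:]


# Illustrative data only: two heavily rated items and eight rarely rated ones.
counts = {"a": 100, "b": 50, "c": 3, "d": 1, "e": 0,
          "f": 0, "g": 0, "h": 0, "i": 0, "j": 0}
head, tail = split_head_tail(counts)
print("head:", head)
print("tail:", tail)
```

Items that land in `tail` are the ones a purely rating-based recommender would rarely surface.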
In this thesis we explore textual features of documents in the long tail. We use document content to find similar documents with a top-N recommendation algorithm. We use semantic similarity (documents about the same subjects) as well as stylometric similarity (documents with similar writing styles) to find documents that are closer to user preferences. Document similarity is measured using the documents' semantic and stylometric features. Combining these two feature types can improve recommendation novelty and help people find interesting documents in the long tail.
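The idea of combining semantic and stylometric similarity in a top-N recommender can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used in the thesis: the bag-of-words semantic vector, the three style features, the cosine measure, the weight `alpha` and all function names are hypothetical choices made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def semantic_vector(text):
    """Term counts as a stand-in for semantic features (assumption)."""
    return Counter(tokenize(text))


def stylometric_vector(text):
    """Three simple style features (assumption): mean word length,
    mean sentence length in words, and type-token ratio."""
    words = tokenize(text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return {}
    return {
        "avg_word_len": sum(len(w) for w in words) / len(words),
        "avg_sent_len": len(words) / len(sentences),
        "type_token_ratio": len(set(words)) / len(words),
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def combined_similarity(doc_a, doc_b, alpha=0.5):
    """Weighted mix of semantic and stylometric similarity.
    alpha=1.0 uses only semantics, alpha=0.0 only style."""
    sem = cosine(semantic_vector(doc_a), semantic_vector(doc_b))
    sty = cosine(stylometric_vector(doc_a), stylometric_vector(doc_b))
    return alpha * sem + (1 - alpha) * sty


def top_n(liked_doc, candidates, n=2, alpha=0.5):
    """Return the ids of the n candidates most similar to liked_doc.
    candidates: dict mapping document id -> text."""
    scored = [(combined_similarity(liked_doc, text, alpha), doc_id)
              for doc_id, text in candidates.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:n]]


# Illustrative data only.
liked = "The dragon guarded the gold. The knight fought the dragon."
candidates = {
    "a": "A knight and a dragon fought over gold.",
    "b": "Stock markets fell sharply today.",
}
print(top_n(liked, candidates, n=1, alpha=1.0))
```

Note that no ratings are involved: the ranking depends only on document content, which is what allows unrated long-tail documents to be recommended. A realistic system would likely use richer semantic representations and normalize the style features before comparing them.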
Date: 2011-Mar-09 Time: 14:30:00 Room: 336
Upcoming Events
INESC Brussels HUB Winter Meeting 2023

This edition of the HUB Winter Meeting will be co-organised with Science Business and will take place on 30 and 31 January, in Lisbon, at Instituto Superior Técnico, Department of Computer Science and Engineering.
Please see below a summary of the agenda, which will be updated regularly on the INESC Brussels HUB website (confirmed speakers and other relevant information). Places for onsite participation are limited, so registration is mandatory. Online participants will be sent a Zoom link for each specific session on 27 January.
INESC Brussels HUB website: https://hub.inesc.pt/
Monday, 30 January
a) Digital Europe Programme & Chips Act: state of play and possibilities for INESC.
9h to 10h30 GMT
(Exclusive for INESC researchers and administrators).
b) Science Business: how INESC can tap into the Science Business network, activities and communication tools.
(Exclusive for INESC researchers and administrators).
c) Networking Lunch (for all onsite participants).
d) Roundtable: From rhetoric to reality – Embedding international strategy in the DNA of research organisations.
(Closed-door, roundtable workshop, Chatham House rules, open to INESC researchers and administrators, external participants by invitation only).
e) Networking Dinner
(By invitation only – INESC researchers participating onsite in the event are eligible to join).
Tuesday, 31 January
f) Workshop: How did they do it? Strategic positioning for structural success in Horizon Europe: a discussion of best practices.
(Exclusive for INESC researchers, administrators and international invited speakers).
g) The public consultation on European R&I Programmes: Towards FP10.
(Closed-door, roundtable workshop, Chatham House rules, open to INESC researchers and administrators, external participants by invitation only).
h) Networking Lunch (for all onsite participants).
i) Management Committee meeting (Directors and POB members)
The HUB Winter Meeting aims at bringing together researchers and administrators from the 5 INESC institutes, affiliated higher education institutions in Portugal and abroad, with key European and global players, to:
– Discuss key research and innovation issues at EU level.
– Inform institutional policy and strategy.
– Exchange best practices about R&I management, career development and policy positioning.
– Promote, discuss and deliver vision, visibility, networking and impactful communication.
– Create, identify and deepen partnerships and collaboration opportunities for collaborative R&I.