A Linear Time Biclustering Algorithm for Time Series Genomic Expression Data
Sara C. Madeira, Universidade da Beira Interior
Abstract:
Recent developments in DNA chips now enable the simultaneous measurement of the expression levels of a large number of genes (sometimes all the genes of an organism) under a given experimental condition. Most commonly, gene expression data is arranged in a data matrix, where each gene corresponds to one row and each condition to one column. The conditions may correspond to different time points, different environmental conditions, different organs or different individuals. Simply visualizing this kind of data is challenging. Using it to extract biologically relevant knowledge is even harder.
Several unsupervised machine learning methods have been used to analyze gene expression data obtained from microarray experiments. Recently, biclustering, an unsupervised approach that clusters the rows and columns of the data matrix simultaneously, has been shown to be remarkably effective in a variety of applications. The goal of biclustering is to find subgroups of genes and subgroups of conditions, where the genes exhibit highly correlated behavior. In the most common settings, biclustering is an NP-complete problem, and heuristic approaches are used to obtain sub-optimal solutions with reasonable computational resources.
In this talk, we describe a particular setting of the problem, where we are concerned with finding biclusters in time series expression data, and present a linear time biclustering algorithm to achieve this goal.
When analyzing time series expression data with the goal of isolating coherent activity between genes in a subset of conditions, it is reasonable to restrict attention to biclusters with contiguous columns. We support this view by assuming that the activation of a set of genes under specific conditions corresponds to the activation of a particular biological process. As time goes on, biological processes start and finish, leading to increased (or decreased) activity of sets of genes that can be identified because they form biclusters with contiguous columns. In this setting, we are interested in finding biclusters whose columns are consecutive in time. For this particular version of the problem, we propose an algorithm that finds and reports all relevant biclusters in time linear in the size of the data matrix. This reduction in complexity is obtained by manipulating a discretized version of the data matrix and by using string manipulation techniques based on suffix trees.
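To make the notion of a bicluster with contiguous columns concrete, here is a minimal Python sketch. It is not the algorithm presented in the talk: it enumerates every column window naively instead of using a suffix tree, so it does not run in linear time, and it also reports non-maximal biclusters. The three-symbol alphabet (U, D, N), the `threshold` value and the function names `discretize` and `contiguous_biclusters` are assumptions made for illustration only.

```python
from collections import defaultdict


def discretize(matrix, threshold=0.5):
    """Turn each gene (row) into a string with one symbol per pair of
    consecutive time points: U (up), D (down) or N (no change).
    The alphabet and the threshold are illustrative assumptions."""
    patterns = []
    for row in matrix:
        symbols = []
        for prev, curr in zip(row, row[1:]):
            change = curr - prev
            if change > threshold:
                symbols.append("U")
            elif change < -threshold:
                symbols.append("D")
            else:
                symbols.append("N")
        patterns.append("".join(symbols))
    return patterns


def contiguous_biclusters(patterns, min_genes=2, min_cols=2):
    """Naively enumerate every contiguous window [start, end) of the
    discretized columns and group the genes that share exactly the same
    symbols in that window. A suffix tree would avoid this enumeration."""
    n_cols = len(patterns[0]) if patterns else 0
    found = []
    for start in range(n_cols):
        for end in range(start + min_cols, n_cols + 1):
            groups = defaultdict(list)
            for gene, pattern in enumerate(patterns):
                groups[pattern[start:end]].append(gene)
            for shared, genes in groups.items():
                if len(genes) >= min_genes:
                    found.append((genes, start, end, shared))
    return found


# Hypothetical toy matrix: 3 genes (rows) measured at 5 time points (columns).
expression = [
    [1.0, 2.0, 3.0, 1.0, 1.0],
    [0.0, 1.0, 2.0, 2.0, 3.0],
    [5.0, 4.0, 5.0, 3.0, 3.1],
]
patterns = discretize(expression)
for genes, start, end, shared in contiguous_biclusters(patterns):
    print(genes, (start, end), shared)
```

On this toy matrix, genes 0 and 2 share the pattern `UDN` over the last three transitions, which is the kind of group the contiguous-column restriction is meant to expose. Each window checked here costs one pass over all genes, which is the repeated work that a suffix-tree approach is designed to avoid.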
The talk will give a short introduction to biclustering and suffix trees, present the biclustering algorithm, and show results on synthetic data together with preliminary results on a real biological data set from yeast that demonstrate the effectiveness of the approach.
Date: 2005-Feb-24 Time: 16:30:00 Room: 336