VersionClimber: an algorithm and system for package evolution in data science
Prof. Dennis Shasha,
Courant Institute of New York University
Abstract:
Imagine you are a data scientist (as many of us are/have become).
Systems you build typically require many data sources and many packages
(machine learning/data mining, data management, and visualization) to run.
Your working configuration consists of a set of packages, each at a particular version.
You want to update some packages (software or data) to the most recent versions possible, but you want your system to still run after the upgrades,
thus perhaps entailing changes to the versions of other packages.
One approach is to hope the latest versions of all packages work. If that fails, the fallback is manual trial and error, but that quickly ends in frustration.
We advocate a provenance-style approach in which tools like ptrace
enable us to identify version combinations of different packages.
Then package and version management tools like pip, GitHub, and VirtualEnv enable us to fetch particular versions of packages and try them in a sandbox-like environment.
Because the space of versions to explore grows exponentially with the number of packages, we have developed a memoizing algorithm that avoids exponential search while still finding an optimum version combination.
Heuristics combined with certain empirical facts about packages (e.g. local upward compatibility) improve performance further still.
We present experimental results on well known packages used in data science to illustrate the effectiveness of our approach.
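To make the idea of a memoized version search concrete, here is a minimal Python sketch. It is not the VersionClimber algorithm itself, which the abstract does not spell out. It is a simple hill climb that raises one package at a time and caches every test result, so no configuration is tested twice. The package names (libA, libB), the version lists, and the toy_works oracle are all hypothetical; in practice the oracle would install the candidate versions in a sandbox and run the system.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, str]


def climb(
    versions: Dict[str, List[str]],   # per package: versions ordered oldest -> newest
    start: Config,                    # a configuration already known to work
    works: Callable[[Config], bool],  # oracle: does this configuration run?
) -> Tuple[Config, int]:
    """Greedy, memoized climb toward newer versions (illustrative only)."""
    cache: Dict[Tuple[Tuple[str, str], ...], bool] = {}

    def check(cfg: Config) -> bool:
        # Memoize on a canonical key so each configuration is tested once.
        key = tuple(sorted(cfg.items()))
        if key not in cache:
            cache[key] = works(cfg)
        return cache[key]

    current = dict(start)
    improved = True
    while improved:
        improved = False
        for pkg, ordered in versions.items():
            cur_idx = ordered.index(current[pkg])
            # Try versions newer than the current one, newest first.
            for idx in range(len(ordered) - 1, cur_idx, -1):
                candidate = {**current, pkg: ordered[idx]}
                if check(candidate):
                    current = candidate
                    improved = True
                    break
    return current, len(cache)


# Hypothetical packages and compatibility table, for illustration only.
VERSIONS = {"libA": ["1.0", "2.0", "3.0"], "libB": ["1.0", "2.0"]}
COMPATIBLE = {("1.0", "1.0"), ("2.0", "1.0"), ("2.0", "2.0"), ("3.0", "2.0")}


def toy_works(cfg: Config) -> bool:
    # Stand-in for "install these versions in a sandbox and run the tests".
    return (cfg["libA"], cfg["libB"]) in COMPATIBLE


best, tested = climb(VERSIONS, {"libA": "1.0", "libB": "1.0"}, toy_works)
print(best, tested)  # {'libA': '3.0', 'libB': '2.0'} 4
```

In this toy case the climb reaches the newest version of both packages after testing 4 of the 6 possible combinations. A greedy climb like this can stop at a local optimum in general, which is one reason a more careful algorithm is needed to guarantee an optimum version combination.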
Bio
Dennis Shasha is a professor of computer science at the Courant Institute of New York University and an Associate Director of NYU Wireless.
He works with biologists on pattern discovery for network inference; with computational chemists on algorithms for protein design; with physicists and financial people on algorithms for time series; on clocked computation for DNA computing; and on computational reproducibility.
Other areas of interest include database tuning as well as tree and graph matching.
Because he likes to type, he has written six books of puzzles about a mathematical detective named Dr. Ecco, a biography about great computer scientists, and a book about the future of computing.
He has also written five technical books about database tuning, biological pattern recognition, time series, DNA computing, resampling statistics, and causal inference in molecular networks.
He has co-authored over eighty journal papers, seventy conference papers, and twenty-five patents. He has written the puzzle column for various publications including Scientific American, Dr. Dobb’s Journal, and the Communications of the ACM.
He is a fellow of the ACM and an INRIA International Chair.
Host
Helena Isabel de Jesus Galhardas
Venue:
IST – anfiteatro VA1
Upcoming Events
INESC Brussels HUB Winter Meeting 2023

This edition of the HUB Winter Meeting will be co-organised with Science Business and will take place on 30 and 31 January, in Lisbon, at Instituto Superior Técnico, Department of Computer Science and Engineering.
Please see below a summary of the agenda. It will be updated regularly on the INESC Brussels HUB website (confirmed speakers and other relevant info). Places for onsite participation are limited, so registration is mandatory. On 27 January, online participants will be sent a Zoom link for each specific session.
INESC Brussels HUB website: https://hub.inesc.pt/
Monday, 30 January
a) Digital Europe Programme & Chips Act: state of play and possibilities for INESC.
9h to 10h30 GMT
(Exclusive for INESC researchers and administrators).
b) Science Business: how INESC can tap into the Science Business network, activities and communication tools.
(Exclusive for INESC researchers and administrators).
c) Networking Lunch (for all onsite participants).
d) Roundtable: From rhetoric to reality – Embedding international strategy in the DNA of research organisations.
(Closed-door, roundtable workshop, Chatham House rules, open to INESC researchers and administrators, external participants by invitation only).
e) Networking Dinner
(By invitation only – INESC researchers participating onsite in the event are eligible to join).
Tuesday, 31 January
f) Workshop: How they did it? Strategic positioning for structural success in Horizon Europe: a discussion of best practices.
(Exclusive for INESC researchers, administrators and international invited speakers).
g) The public consultation on European R&I Programmes: Towards FP10.
(Closed-door, roundtable workshop, Chatham House rules, open to INESC researchers and administrators, external participants by invitation only).
h) Networking Lunch (for all onsite participants).
i) Management Committee meeting (Directors and POB members)
The HUB Winter Meeting aims at bringing together researchers and administrators from the 5 INESC institutes, affiliated higher education institutions in Portugal and abroad, with key European and global players, to:
– Discuss key research and innovation issues at EU level.
– Inform institutional policy and strategy.
– Exchange best practices about R&I management, career development and policy positioning.
– Promote, discuss and deliver vision, visibility, networking and impactful communication.
– Create, identify and deepen partnerships and collaboration opportunities for collaborative R&I.
INESC-ID ESR Talks – February 2023

If you are a masters/PhD student or a postdoctoral fellow, come and present your work in an informal and friendly environment – and savour some tasty snacks!
Individual talks will be 10-15 minutes plus time for feedback. Enroll on your selected date by emailing pedro.ferreira[at]inesc-id.pt.
Happening monthly on a Wednesday (4pm-5pm):
- 11 January (Alves Redol, Room 9)
- 15 February (Alves Redol, Room 9)
- 15 March (Alves Redol, Room 9)
- 12 April (Alves Redol, Room 9)
- 10 May (Alves Redol, Room 9)
- 14 June (Alves Redol, Room 9)
- 12 July (Alves Redol, Room 9)
We hope to see you there!