Elastic and Fault-Tolerant Stream Processing in the Cloud
Prof. Peter Pietzuch,
Department of Computing, Imperial College London
As users of “big data” applications want fresh processing results, we
witness a new breed of stream processing systems that are designed to
scale to large numbers of cloud-hosted machines. Such systems face new
challenges: (i) to benefit from the “pay-as-you-go” model of cloud
computing, they must scale out on demand; (ii) with deployments on
hundreds of virtual machines (VMs), failures are common — systems
must therefore be fault-tolerant with fast recovery times. An open
question is how to achieve these two goals when stream queries include
stateful operators whose state may depend on the complete history of the stream.
In this talk, I describe an integrated approach for dynamic scale out
and recovery of stateful stream processing operators. The idea is to
expose internal operator state explicitly to the stream processing
system through a set of state management primitives. Externalised
operator state is checkpointed periodically and backed up by the
system. In addition, the system identifies operator bottlenecks and
automatically scales them out by allocating new VMs. We evaluate this
approach as part of the SEEP experimental stream processing system on
the Amazon EC2 cloud platform and show that it can scale
automatically, while recovering quickly from failures.
(This talk is based on work published at ACM SIGMOD’13 and USENIX ATC’14.)
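The idea of exposing operator state through explicit primitives can be illustrated with a minimal sketch. It is not SEEP's actual API: the class WordCountOperator, its methods checkpoint, restore and partition, and the helper route are hypothetical names chosen for illustration, and the sketch assumes that tuples received after the last checkpoint are buffered upstream and can be replayed.

```python
import copy
from collections import defaultdict


def route(key, n):
    # Hypothetical routing function: deterministic across processes,
    # unlike Python's built-in hash() for strings.
    return sum(key.encode("utf-8")) % n


class WordCountOperator:
    """Hypothetical stateful operator whose state is explicitly externalised."""

    def __init__(self, state=None):
        # All processing state lives in one dictionary the system can inspect.
        self.state = defaultdict(int, state or {})

    def process(self, word):
        # Update the running count for this word and emit it.
        self.state[word] += 1
        return word, self.state[word]

    def checkpoint(self):
        # Return an independent copy that could be backed up on another VM.
        return copy.deepcopy(dict(self.state))

    @classmethod
    def restore(cls, snapshot):
        # Rebuild an operator from a previously taken checkpoint.
        return cls(snapshot)

    def partition(self, n):
        # Split the state by key so that n operator instances each own one part.
        parts = [{} for _ in range(n)]
        for key, count in self.state.items():
            parts[route(key, n)][key] = count
        return [WordCountOperator(part) for part in parts]


# Normal processing, followed by a periodic checkpoint.
op = WordCountOperator()
for word in ["a", "b", "a"]:
    op.process(word)
snapshot = op.checkpoint()

# Tuples that arrive after the checkpoint (assumed to be buffered upstream).
buffered = ["b", "c"]
for word in buffered:
    op.process(word)

# Recovery: restore the checkpoint on a new instance, then replay the buffer.
recovered = WordCountOperator.restore(snapshot)
for word in buffered:
    recovered.process(word)

# Scale out: split the state across two instances; new tuples would be
# routed with the same route() function.
left, right = op.partition(2)
```

In this sketch, recovery and scale out reuse the same externalised state: recovery restores a checkpoint and replays buffered tuples, while scale out partitions the state across new instances.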
Peter Pietzuch is a Senior Lecturer (Associate Professor) at Imperial
College London, leading the Large-scale Distributed Systems (LSDS)
group in the Department of Computing. His research focuses on the
design and engineering of scalable, reliable and secure large-scale
software systems, with a particular interest in data management and
networking issues. He has published over sixty research papers in
international venues, including USENIX ATC, NSDI, SIGMOD, VLDB, ICDE,
ICDCS, Middleware and DEBS. He has co-authored a book on Distributed
Event-based Systems published by Springer. Before joining Imperial
College, he was a post-doctoral fellow at Harvard University. He holds
PhD and MA degrees from the University of Cambridge.
Paulo Jorge Pires Ferreira
IST, room QA1.3 (south tower)
Mathematics, Physics & Machine Learning Seminar Series (Online)
The Mathematics, Physics & Machine Learning seminar series started in October 2020 and runs until March 2021.
The seminars aim to bring together mathematicians and physicists interested in machine learning (ML) with ML and AI experts interested in mathematics and physics, with the goal of introducing innovative Mathematics and Physics-inspired techniques in Machine Learning and, reciprocally, applying Machine Learning to problems in Mathematics and Physics.
Attendance is free but registration is required.
More information is available here.
International European Conference on Parallel and Distributed Computing
The 27th International European Conference on Parallel and Distributed Computing (Euro-Par 2021) will take place from August 30 to September 3, 2021, in Lisbon.
Euro-Par is the prime European conference covering all aspects of parallel and distributed processing, ranging from theory to practice, from small to the largest parallel and distributed systems and infrastructures, from fundamental computational problems to full-fledged applications, from architecture, compiler, language and interface design and implementation, to tools, support infrastructures, and application performance aspects.
The 2021 edition of Euro-Par will be organized as a collaboration between INESC-ID and Instituto Superior Técnico (IST).
– Abstract Submission: February 5, 2021
– Paper Submission Deadline: February 12, 2021
– Author Notification: April 30, 2021
– Camera-Ready Papers: June 6, 2021
More information is available here.