Prof. Peter Pietzuch,

Department of Computing, Imperial College London

Abstract:

As users of “big data” applications want fresh processing results, we
witness a new breed of stream processing systems that are designed to
scale to large numbers of cloud-hosted machines. Such systems face new
challenges: (i) to benefit from the “pay-as-you-go” model of cloud
computing, they must scale out on demand; (ii) with deployments on
hundreds of virtual machines (VMs), failures are common, so systems
must be fault-tolerant with fast recovery times. An open
question is how to achieve these two goals when stream queries include
stateful operators whose state may depend on the complete history of
the stream.

In this talk, I describe an integrated approach for dynamic scale-out
and recovery of stateful stream processing operators. The idea is to
expose internal operator state explicitly to the stream processing
system through a set of state management primitives. Externalised
operator state is checkpointed periodically and backed up by the
system. In addition, the system identifies operator bottlenecks and
automatically scales them out by allocating new VMs. We evaluate this
approach as part of the SEEP experimental stream processing system on
the Amazon EC2 cloud platform and show that it can scale
automatically, while recovering quickly from failures.
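
To make the idea of externalised operator state concrete, here is a
minimal Python sketch. It is not SEEP's actual API: the class
WordCountOperator, the primitive names checkpoint, restore and
partition, and the helper route are all hypothetical, chosen only to
illustrate how state exposed to the system can serve both recovery and
scale-out.

```python
import copy
import zlib
from collections import Counter


def route(key, n):
    """Hypothetical router: map a key to one of n partitions.

    CRC32 is used instead of hash() so routing is stable across processes.
    """
    return zlib.crc32(key.encode("utf-8")) % n


class WordCountOperator:
    """Toy stateful operator whose state is a per-word counter."""

    def __init__(self, state=None):
        self.state = Counter(state or {})

    def process(self, word):
        self.state[word] += 1

    # --- hypothetical state management primitives ---

    def checkpoint(self):
        """Return a snapshot of the state that the system can back up."""
        return copy.deepcopy(dict(self.state))

    @classmethod
    def restore(cls, snapshot):
        """Rebuild an operator from a backed-up snapshot."""
        return cls(snapshot)

    def partition(self, n):
        """Split the state by key so that n new operators can share the load."""
        parts = [{} for _ in range(n)]
        for key, count in self.state.items():
            parts[route(key, n)][key] = count
        return parts


# Recovery: restore from the last checkpoint, then replay later tuples.
op = WordCountOperator()
for word in ["a", "b", "a"]:
    op.process(word)
snapshot = op.checkpoint()
op.process("c")                      # arrives after the checkpoint

recovered = WordCountOperator.restore(snapshot)
recovered.process("c")               # replayed by the (assumed) upstream buffer

# Scale-out: split the state across two new operator instances.
scaled = [WordCountOperator.restore(p) for p in op.partition(2)]
```

In this sketch, recovery assumes that tuples received after the last
checkpoint are buffered upstream and can be replayed, and scale-out
assumes that the state can be split by key, so that each new instance
receives only the tuples that route to its partition.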

(This talk is based on work published at ACM SIGMOD’13 and USENIX ATC’14.)

Bio:

Peter Pietzuch is a Senior Lecturer (Associate Professor) at Imperial
College London, leading the Large-scale Distributed Systems (LSDS)
group in the Department of Computing. His research focuses on the
design and engineering of scalable, reliable and secure large-scale
software systems, with a particular interest in data management and
networking issues. He has published over sixty research papers in
international venues, including USENIX ATC, NSDI, SIGMOD, VLDB, ICDE,
ICDCS, Middleware and DEBS. He has co-authored a book on Distributed
Event-based Systems published by Springer. Before joining Imperial
College, he was a post-doctoral fellow at Harvard University. He holds
PhD and MA degrees from the University of Cambridge.

Host:

Paulo Jorge Pires Ferreira

Venue:

IST, room QA1.3 (south tower)