Scaling Distributed Machine Learning with In-Network Aggregation (Seminar)

Professor Marco Canini, KAUST (King Abdullah University of Science and Technology)

Abstract:

Training complex machine learning models in parallel is an increasingly important workload. We accelerate distributed parallel training by designing a communication primitive that uses a programmable switch dataplane to execute a key step of the training process. Our approach reduces the volume of exchanged data by aggregating the model updates from multiple workers in the network. We co-design the switch processing with the end-host protocols and ML frameworks to provide a robust, efficient solution that speeds up training by up to 310%, and by at least 20% in most cases, across a number of real-world benchmark models.
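As a rough illustration of the aggregation idea described in the abstract, the sketch below simulates a single aggregation point that sums fixed-size chunks of model updates from several workers and releases each sum once every worker has contributed. It is a toy model under stated assumptions, not the design presented in the talk: the names `AggregationSlot` and `aggregate_in_network` are hypothetical, updates are plain integer lists, and the switch is simulated in ordinary Python.

```python
class AggregationSlot:
    """Hypothetical stand-in for one switch slot that sums a chunk of updates."""

    def __init__(self, size, num_workers):
        self.size = size
        self.num_workers = num_workers
        self.total = [0] * size
        self.seen = set()

    def add(self, worker_id, chunk):
        """Fold in one worker's chunk; return the sum once all workers contributed."""
        if len(chunk) != self.size:
            raise ValueError("chunk length does not match slot size")
        if worker_id in self.seen:
            # Assumption: a repeated chunk from the same worker is ignored.
            return None
        self.seen.add(worker_id)
        self.total = [t + c for t, c in zip(self.total, chunk)]
        if len(self.seen) < self.num_workers:
            return None
        result = self.total
        # Reset the slot so it can be reused for a later chunk.
        self.total = [0] * self.size
        self.seen = set()
        return result


def aggregate_in_network(updates, chunk_size):
    """Stream every worker's update through a slot, one chunk at a time."""
    num_workers = len(updates)
    length = len(updates[0])
    aggregated = []
    for start in range(0, length, chunk_size):
        end = min(start + chunk_size, length)
        slot = AggregationSlot(size=end - start, num_workers=num_workers)
        result = None
        for worker_id, update in enumerate(updates):
            result = slot.add(worker_id, update[start:end])
        # After the last worker's chunk arrives, the slot returns the full sum.
        aggregated.extend(result)
    return aggregated


if __name__ == "__main__":
    worker_updates = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
    print(aggregate_in_network(worker_updates, chunk_size=2))
```

In this simplified picture each worker sends its update once and receives a single aggregated result, instead of exchanging updates with every other worker, which is where the reduction in exchanged data would come from.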

Bio:

Marco does not know what the next big thing will be. But he’s sure that our next-gen computing and networking infrastructure must be a viable platform for it and avoid stifling innovation. Marco’s research spans cloud computing, distributed systems and networking. His current interest is in designing better systems support for AI/ML and providing practical implementations deployable in the real world.
Marco is an associate professor in Computer Science at KAUST. He obtained his Ph.D. in computer science and engineering from the University of Genoa in 2009, after spending the last year as a visiting student at the University of Cambridge Computer Laboratory. He was a postdoctoral researcher at EPFL from 2009 to 2012 and then a senior research scientist for one year at Deutsche Telekom Innovation Labs & TU Berlin. Before joining KAUST, he was an assistant professor at UCLouvain. He has also held positions at Intel, Microsoft and Google.


About INESC-ID

INESC-ID, “Instituto de Engenharia de Sistemas e Computadores: Investigação e Desenvolvimento em Lisboa”, is a Research and Development and Innovation Organization (R&D+i) in the fields of Computer Science and Electrical and Computer Engineering. INESC-ID’s mission is to produce added value for people and society, supporting the response of public policies to scientific, health, environmental, cultural, social, economic and political challenges. INESC-ID promotes cooperation between academia and industry by addressing research on daily life issues, such as healthcare, space, mobility, agri-food, industry 4.0, and smart grids. This high level of knowledge transfer is achieved through both competitive research projects and direct contracted research. Public and private entities therefore have access to a pool of knowledge, resources and services provided through the unique competencies available at the institution.

 


