4-7 December 2023
Skukuza
Africa/Johannesburg timezone

Distributed Deep Learning in HPC: Challenges and Opportunities

5 Dec 2023, 14:30
20m
1-1-2+4 - Ndau + Nari (Skukuza)

110
Talk Cognitive Computing and Machine Learning Special

Speaker

Dr Albert Kahira (Julich Supercomputing Center)

Description

Large-scale training of Deep Learning (DL) models on High Performance Computing (HPC) systems has become increasingly common, both to shorten training time for larger models and datasets and to alleviate memory constraints. Training DL models on these systems can cut weeks or even months of training to mere hours, and it facilitates faster prototyping and research in DL. Importantly, some of the largest models can only be trained on these large-scale machines. This talk will provide participants with a foundational understanding of the concepts and techniques involved in DL on HPC systems, as well as the challenges and opportunities for research in the area.

Student or Postdoc? Post-Doctoral

Primary author

Dr Albert Kahira (Julich Supercomputing Center)
