30 November 2025 to 3 December 2025
Century City Conference Centre
Africa/Johannesburg timezone
The conference programme and timetable are now live.

Optimization and Placement Patterns for training and inference workloads

1 Dec 2025, 14:10
20m
1/1-8+9 - Room 8+9 (Century City Conference Centre)

80

Speaker

Dr Lerato Mohapi (Eclipse Holdings)

Description

The paradigm in HPC for AI has shifted from simple data parallelism to multi-dimensional (3D) parallelism and heterogeneous co-execution. In this talk, we discuss optimization and placement patterns for training and inference workloads. For training, we focus on topology-aware placement that minimizes inter-node communication latency under 3D parallelism. For inference, we focus on maximizing throughput per watt and on using "stranded" capacity through hybrid CPU-GPU pipelining and dynamic model partitioning (e.g., Multi-Instance GPU, or MIG). We then demonstrate how these placement strategies, built on 3D parallelism and heterogeneous co-execution, bring HPC resources to bear on AI workloads.
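
As a rough illustration of what topology-aware placement under 3D parallelism can look like, the sketch below maps global ranks onto a (data, pipeline, tensor) grid and checks whether every tensor-parallel group stays inside a single node, where interconnect bandwidth is usually highest. It is not taken from the talk: the function names, the rank ordering (tensor index varying fastest, then pipeline, then data), and the assumption that consecutive ranks fill a node before moving to the next are all hypothetical choices made for this example.

```python
from itertools import product


def rank_to_coords(rank, tp, pp, dp):
    """Map a global rank to (dp_idx, pp_idx, tp_idx); tp varies fastest (assumed)."""
    if not 0 <= rank < tp * pp * dp:
        raise ValueError("rank out of range for the given grid")
    tp_idx = rank % tp
    pp_idx = (rank // tp) % pp
    dp_idx = rank // (tp * pp)
    return dp_idx, pp_idx, tp_idx


def node_of(rank, gpus_per_node):
    """Node index of a rank, assuming ranks fill nodes consecutively."""
    return rank // gpus_per_node


def tensor_groups(tp, pp, dp):
    """Return the list of ranks in every tensor-parallel group."""
    return [
        [(d * pp + p) * tp + t for t in range(tp)]
        for d, p in product(range(dp), range(pp))
    ]


def groups_within_one_node(groups, gpus_per_node):
    """True if no group spans more than one node."""
    return all(
        len({node_of(r, gpus_per_node) for r in group}) == 1
        for group in groups
    )


if __name__ == "__main__":
    # 16 GPUs on 2 nodes of 8: tensor groups of 4 fit inside a node.
    print(groups_within_one_node(tensor_groups(4, 2, 2), gpus_per_node=8))
    # Tensor groups of 8 on nodes of 4 GPUs must cross node boundaries.
    print(groups_within_one_node(tensor_groups(8, 1, 2), gpus_per_node=4))
```

Under these assumptions, keeping the tensor-parallel degree at or below the number of GPUs per node (and a divisor of it) keeps the most communication-heavy groups on intra-node links, leaving pipeline and data-parallel traffic to cross nodes.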

Primary author

Dr Lerato Mohapi (Eclipse Holdings)
