Speaker
Description
South Africa’s rich linguistic diversity poses unique challenges for artificial intelligence systems, particularly in automatic speech recognition (ASR), where multilingual speakers frequently switch languages mid-conversation. This study proposes a robust ASR pipeline tailored to code-switched speech in health settings, addressing practical issues such as overlapping dialogue, background noise, and inconsistent language usage. The pipeline will integrate language-specific preprocessing techniques with multilingual acoustic models trained on a standardised dataset of South African languages including isiZulu, Sepedi, and English.
By focusing on pipeline design, dataset standardisation, and multilingual integration, this work aims to show how AI can be built to understand South African voices rather than ignore them. Structured and reproducible approaches to code-switched data lay the foundation for inclusive, fair, and context-aware AI that represents local language communities, and they highlight the broader opportunities for using multilingual data responsibly.
| Institute | CSIR, MRatsoma@csir.co.za |
|---|---|
| Presenting Author | Mahlatse Mbooi |