2026 Annual DIRISA National Research Data Workshop

Name: 2026 Annual DIRISA National Research Data Workshop
Start: 2026-07-01T07:30:00+02:00
End: 2026-07-02T17:00:00+02:00
Location: CSIR ICC

1-2 July 2026

Africa/Johannesburg timezone

Provisional programme now available.

What influences language resource reuse? The SADiLaR repository as a case study

Not scheduled

20m

ICC (CSIR ICC)

Talk DIRISA

Speaker: Dr Benito Trollip (South African Centre for Digital Language Resources (SADiLaR), North-West University)

The reuse of language resources is a cornerstone of responsible, sustainable, and FAIR-aligned research in linguistics and digital humanities. The South African Centre for Digital Language Resources (SADiLaR), a specialist repository for language-specific resources, serves as the case study for this presentation. We demonstrate that repository infrastructure alone, including persistent identifiers, rich metadata, and long-term preservation, does not automatically ensure visible or measurable reuse. Researchers and other potential users may be unaware of datasets, may cite secondary publications rather than the underlying resources, or may reuse data in ways that leave no explicit trace. As a result, valuable language resources can appear underutilised despite playing an important role in research processes.

To explore this further, we conducted a manual, exploratory search for references to SADiLaR-hosted resources in academic publications and analysed web-based user activity. The labour-intensive and imperfect nature of this process highlights the absence of systematic mechanisms for identifying and documenting dataset reuse.

Improving the visibility of the reuse of language resources, especially those for South(ern) African languages, is important not only for research transparency but also for demonstrating the broader societal value of language infrastructures. This matters even more in multilingual contexts, where accessible digital resources support language development, education, and inclusive knowledge production. Unfortunately, tracing the reuse of language resources remains a challenge and is often insufficiently supported by practices such as consistent citation. In this presentation, we argue that current approaches to measuring language resource reuse underestimate the extent to which such resources are utilised by a broader user community.

Primary author: Dr Benito Trollip (South African Centre for Digital Language Resources (SADiLaR), North-West University)

Co-author: Dr Michelle White (South African Centre for Digital Language Resources (SADiLaR), North-West University)
