Presentation supporting a paper published in the Proceedings of the first workshop on Resources for African Indigenous Languages (RAIL), held on May 16, 2020 as part of the Language Resources and Evaluation Conference (LREC) 2020, which was organised virtually. The presentation was updated for the 50th Colloquium of African Languages and Linguistics on Wednesday, September 2, 2020, with more discussion of curating place names.
The ǂKhomani San | Hugh Brody Collection features the voices and history of indigenous hunter-gatherer descendants in three endangered languages, namely Nǀuu, Kora and Khoekhoe, as well as a regional dialect of Afrikaans. A large component of this collection consists of audiovisual (legacy media) recordings of interviews that Hugh Brody and his colleagues conducted with members of the community between 1997 and 2012, and which refer as far back as the 1800s. The Digital Library Services team at the University of Cape Town aims to showcase the collection digitally on Ibali, the UCT-wide Digital Collections platform, which runs on Omeka S. In this presentation we highlight the importance of such a collection in the context of South Africa, and the steps that were taken to prepare the transcripts generated from the audiovisual material for publication. We outline our development process in preparing the collection for a linked data online showcase website, from digitisation to repository publishing, and present some of the challenges in data clean-up, the curation of legacy media, multilingual support, and site organisation.
TOC:
00:00 | Welcome and Intro
05:30 | Overview of Collection and Transcription Process
09:38 | Digital Curation
18:13 | Conclusion
19:12 | References
As both conferences were held online, the presentation was pre-recorded in each case.