In the century between the 1870s and the 1970s, hundreds of Malay-language periodicals circulated around the Malay-speaking world. These periodicals chronicle a fascinating era and have been the focus of intense study by scholars such as William Roff and Ian Proudfoot. Many of these periodicals have been digitized, and comprehensive collections are held at the National Library of Singapore, as well as in other libraries and archives.
Given the availability and size of the collections, the time is ripe for systematic digital analysis. Projects elsewhere in the world have demonstrated the power of analysing historical newspapers at scale using computational methods. Examples include "Living With Machines" (a partnership between the British Library and several universities in the UK) and "Oceanic Exchanges" (a partnership among research teams in Finland, Germany, Mexico, the Netherlands, the United Kingdom, and the United States).
What is holding back a similar study of Malay-language newspapers?
– The main obstacle is the script. The majority of these periodicals were published in Jawi, an adaptation of the Perso-Arabic script for the Malay language, which poses significant challenges for digital processing. For one thing, typical Optical Character Recognition (OCR) pipelines don't work well for Jawi.
– Another challenge is that most contemporary Malay readers, including many historians who would be interested in these collections, are less familiar with Jawi than with Rumi (the Romanized version of Malay most commonly used today). Automatic transliteration from Jawi to Rumi is also a complex task, as vowels are often not marked in Jawi.
– In addition to this, spelling conventions have changed, and there are many different approaches for transliterating the same word.
To address these challenges, the "Computational Heritage" research group at the National University of Singapore has developed specialized AI models for both Jawi OCR and Jawi-to-Rumi transliteration. In this talk, I will describe our progress so far, the challenges we still face and the future directions of our work.
Institutions
Prof. Øyvind Breivik (University of Bergen)
Recent advances in ML or AI modelling of the atmosphere and the ocean have upended decades of conventional wisdom, namely that the way forward is higher resolution and better parameterizations of what remains unresolved. Here I will present a handful of examples of how forecasting and modelling the atmosphere and the ocean can be done using graph neural networks and more traditional convolutional neural networks. The big question is then whether we are headed toward a future where models in the traditional sense become obsolete. I will argue that, on the contrary, we need the models to guide (supervise) machine learning and artificial intelligence. However, the current use of numerical models is not fit for purpose, and we need to rethink what type of numerical models we use for training. We also need to be aware of the common pitfalls in machine learning, perhaps most importantly how ML models handle previously "unseen" cases, whether these come in the form of extreme weather events or in modelling a future climate very different from what the models have been trained on.
Institutions
Universität Hamburg
Adeline Scharfenberg