EVENTS

Our events in the areas of Big Data and Research Innovation include a diverse set of topics such as Future, Strategy, Technology, Applications, and Management.

If you feel that your event or event series should be part of this event calendar, just contact us!


Wednesday, November 11th, 2023 | 17:00 - 18:30

OCR4all – Open-Source Text Recognition from Mass Processing of Prints to High-Quality Transcription of Manuscripts

via Zoom

Examining historical sources in the form of printed and manuscript textual material is a crucial component of the work done by scholars in the humanities, as well as in the cultural and human sciences. These sources are often accessible only as scans, which severely limits their usefulness because automatic indexing techniques such as full-text search or quantitative analysis methods cannot be applied. For this purpose, the machine-processable full text must first be extracted from the digitized data, and methods for automatic text recognition of prints (OCR) or manuscripts (HTR) are becoming increasingly important in this process. Old prints and manuscripts, in particular, can still be exceedingly difficult to work with for a variety of reasons. Fortunately, historical OCR/HTR has made significant strides in recent years, leading to the development of some high-performance solutions.

OCR4all, a freely downloadable open-source program created by the University of Würzburg's Center for Philology and Digitization (ZPD), seeks to make it possible for users of all skill levels to independently and accurately index complex printed materials and manuscripts. OCR4all is a single application that includes the whole text recognition workflow as well as all necessary tools. It is simple to install and use thanks to its user-friendly graphical interface.

In addition to introducing OCR4all and its features through a live demonstration, the lecture covers the fundamentals of automatic text recognition. It will also show the program's performance and application on various materials and give an overview of recent work as well as an outlook on future developments.

Speaker: Christian Reul

This event is in the series "Digital Humanities – How does it work?" of the Department for Digital Scholarship Services.

Institutions

Referat für Digitale Forschungsdienste, State and University Library Hamburg Carl von Ossietzky

Tags: digital humanities, lecture, htr, ocr4all

Universität Hamburg
Adeline Scharfenberg