Tuesday 14 January 2025 – Postgraduate Panel: Lu Liu (UCL) and Brad Scott (QMUL)
This seminar will be live online via Zoom at https://zoom.us/j/98773145835, and later posted to our YouTube channel.
The IHR Digital History Postgraduate Panels showcase historical research using digital methods that is taking place in the postgraduate community. A series of short papers will be followed by a question and answer session.
Lu Liu: ‘Gold Standard Corpus: a Shared Standard for Assessing the Capabilities of NER in Historical Documents’. Abstract: Named Entity Recognition (NER) is a Natural Language Processing technology that identifies and classifies named entities of common interest in text, such as persons, places and organisations. In Digital History, NER contributes to more efficient analysis of textual data, opening up new possibilities for researchers to explore cultural and historical materials in innovative ways. However, the application of NER across various domains, languages and historical periods poses a challenge in systematically comparing different NER approaches. This hampers the use of the technology, the comparability of outputs and the transferability of techniques in the broader field of historical NER. Therefore, my research aims to create a shared annotated corpus with the objective of standardising the assessment and performance measurement of various historical NER systems. By establishing the corpus, my study can drive the advancement of historical NER technology, and foster collaboration and innovation at the intersection of history and digital methods.
Brad Scott: ‘Using TEI to model the plant collections of Hans Sloane’. Abstract: Hans Sloane (1660–1753) amassed over 120,000 specimens of pressed plants over the course of his life, the vast majority of which were acquired, bequeathed or bought from someone else. As such, this ‘collection of collections’ is a remarkable archive of global encounters, transactions and exchanges which can potentially be used to inform our knowledge of the history of plant collecting before Linnaeus. This paper will describe how the XML schema of the Text Encoding Initiative (TEI) has been utilised to provide several complementary models for describing the herbarium, which enable various analyses at scale. However, reading against the grain of the data reveals the ambiguities, gaps and silences that such models produce. Constructing a data set in parallel with work on the material culture and paper archives has enabled new insights into the making of the collection and of botanical knowledge production in the late seventeenth and eighteenth centuries.