Venue: John S Cohen Room 203, 2nd floor, IHR, North block, Senate House
In this talk, Ryan Cordell will draw from the Viral Texts project at Northeastern University to demonstrate how reprinting, excerpting, and related textual practices shaped popular ideas about science and mechanics in the mid-nineteenth century, both in the US and internationally. In widely circulated advertisements from the 1840s, 50s, and 60s, the publishers of Scientific American lauded the paper’s “interesting, valuable, and useful information” for readers. Many nineteenth-century editors agreed, and columns from Scientific American were among the most widely reprinted in the period, along with a plethora of related recipes, household tips, listicles, and columns of practical knowledge that promised to be of immediate use to readers. While individually such pieces might seem ephemeral to modern readers, when considered as a corpus, and tracked across space and time, they contribute to a more comprehensive understanding of everyday reading and writing during the nineteenth century. Computationally derived bibliographies of “information literature” allow us to ask what kinds of scientific knowledge “went viral” (to borrow a modern term) among nineteenth-century readers, and what these pieces might tell us about the priorities of readers and editors. What “information literature” spread beyond national borders? How did nineteenth-century newspaper exchanges foster a more diffuse (but possibly less robust) understanding of science and technology among the public?
Convenor’s Response: Are historical newspapers terrible historical sources?
By James Baker
18 May 2016
As a historian I know my sources are always problematic. I know they don’t provide an unmediated window into the past. As a historian of long-eighteenth-century Britain I know that the newspaper sources I use are no different. I know they don’t provide an unmediated window into the past and I know that the format by which they are mediated – the news sheet – may closely resemble the newspaper I buy on a Saturday morning but that it is, in reality, only the same in so far as it appears every day (if it is a daily, excluding Sunday editions) and that it has columns of type that contain a combination of current affairs, reportage, advertisements, gossip, stocks, and shares. Of course even the columns of type are barely comparable: my Saturday morning newspaper contains little poetry (unless Carol Ann Duffy has been tasked to mark an occasion), few short stories (flurries of Hilary Mantel aside), and no notices of takings from subscription lists for the relief of persons wrongly accused of riot. I’ve published on the latter and to find these notices – and other material relating to the Covent Garden Old Price riots – I browsed through six months of digitised newspaper pages. I know that those digitised newspaper sources are problematic as well. I know that they don’t provide an unmediated window into the past, not least because of the digitisation bias of memory institutions created by the circumstances under which both collection and digitisation take place: that is, who decided what was important, why they decided it, and the fact that both the position of the chooser and the choices they made are culturally constructed by temporal, spatial, gender, social, class, racial, and economic factors. In spite of all of this I invest some authority in the newspapers I find in archives, physical and digital. I qualify that authority. I try to find sources that corroborate what they are saying and thereby their authority.
And whether I find that corroboration or not I still invest some authority in what the (largely) anonymous authors of those newspapers have to say, for otherwise I – and plenty of other historians – wouldn’t get much done.
Last night Ryan Cordell tore through that authority. Or, more precisely, he underscored the nagging doubts about the authority of those newspapers that I – and no doubt many of my fellow historians – have always had; those same doubts that lead us to pause and qualify and cringe when anyone takes historic newspapers as an unproblematic neutral record of historical fact. What Ryan told us – alongside many other things (see my live, partial, CC notes) – is that historical newspapers, in particular newspapers published in nineteenth-century North America, did not necessarily deal in truths. They took things from other newspapers and printed them without citation. They changed those things along the way, and because those changes were made – in most cases – to satisfy restrictions on space, those who made them weren’t always as careful as they might have been with details: descriptions, figures, names were all liable to shift. Even authors could change. At its most extreme the author of a poem could change from Charles Dickinson to Charles Dickens with potentially far-reaching consequences for the history of literature (or, given that the poem in question was clearly not a Dickens, the reception history of literature).
Ryan and the Viral Texts team – of which he is co-PI – did all this with a source base of millions of newspaper pages, billions of words, clustered and queried by computational methods. And they did this in spite of the thing that historians whinge and moan about most when it comes to digitised historical newspapers: that is, Optical Character Recognised text (or, OCR for short). OCR text is the best that human-directed software at a single moment in history could do to make sense of the human-readable contents of a series of digital images of historic pages of text. When those pages of text are in a poor condition – as most 100-250 year old historic newspaper pages designed to last only a few days are at the time of digitisation – then the OCR is ‘poor’: that is, there is a gap between what the software ‘outputs’ when reading the page and what I, the human, ‘output’ when reading the page. Irrespective of this gap, the Viral Texts team have waded through this OCR to find patterns. Specifically, they have found instances where strings of the same five words appear in multiple texts. And when more than one of those five-word strings appears within sufficient proximity to one another and with sufficient frequency, they have deemed the corresponding texts a match. And when a match is found, they have – in most cases – an instance where one newspaper editor has reprinted text found in another newspaper, usually edited to fit the column inches they have to play with. And this text, this reprint, more often than not contains those shifts in descriptions, figures, and names: uncertainties in the stuff that historians like me use to construct our histories, to build our sense of – in lieu of an actual window – the past phenomena we care about.
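For readers who want a feel for how five-word matching works, here is a minimal Python sketch of the idea as described above. It is not the Viral Texts project's own code: the function names, the tokenisation rule, and the thresholds `min_shared` and `max_gap` are all assumptions made for illustration, and the real pipeline works at a far larger scale.

```python
import re
from collections import defaultdict
from itertools import combinations


def five_grams(text, n=5):
    """Map each n-word string in `text` to the word positions where it starts."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    positions = defaultdict(list)
    for i in range(len(words) - n + 1):
        positions[tuple(words[i:i + n])].append(i)
    return positions


def is_match(text_a, text_b, n=5, min_shared=2, max_gap=20):
    """Guess whether two texts share a reprinted passage.

    Hypothetical rule: the texts match if at least `min_shared` shared
    n-word strings start within `max_gap` words of one another in text_a.
    Proximity is only checked in text_a, which keeps the sketch short.
    """
    grams_a = five_grams(text_a, n)
    grams_b = five_grams(text_b, n)
    shared_starts = sorted(
        pos
        for gram, starts in grams_a.items()
        if gram in grams_b
        for pos in starts
    )
    # Length of the longest run of shared strings that sit close together.
    best = run = 1 if shared_starts else 0
    for prev, cur in zip(shared_starts, shared_starts[1:]):
        run = run + 1 if cur - prev <= max_gap else 1
        best = max(best, run)
    return best >= min_shared


def find_reprints(papers, **thresholds):
    """Return every pair of titles in `papers` whose texts match."""
    return [
        (a, b)
        for a, b in combinations(sorted(papers), 2)
        if is_match(papers[a], papers[b], **thresholds)
    ]


# Invented example texts and newspaper titles, for illustration only.
papers = {
    "Gazette": "A simple remedy for burns is to apply a paste of common "
               "baking soda and cold water to the skin at once",
    "Herald": "Useful hint. A simple remedy for burns is to apply a paste "
              "of baking soda and water to the skin",
    "Courier": "The price of wheat rose sharply in the Chicago market on "
               "Tuesday last",
}
print(find_reprints(papers))
```

Because the comparison works on short overlapping strings rather than whole texts, a reprint that has been trimmed or lightly reworded, or whose OCR has garbled a few words, can still be caught so long as some five-word runs survive intact.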
Viral Texts has found over one million reprints in the Chronicling America corpus alone. This reprint culture is not new to historians of eighteenth- and nineteenth-century newspapers. Neither is the trouble with equating newspapers then with newspapers now; pace and with apologies to Adrian Johns, they know we need to forget what we think we know about the historic newspaper in order to know the historic newspaper. Nor are these historians of newspapers unfamiliar with the biases inherent to the collection and digitisation of historic newspapers. And none of this will be new to you, the historian, sagely nodding along. We know the newspaper sources we use are always problematic. We know that, and yet we persist in using them. Indeed, driven by their digitisation, we are using them more than we ever have before. But given all that we know, given their mediated, casually reprinted, and fungible character, and given the massive scale of reprinting Viral Texts has uncovered, are these newspapers not far from the sources of ephemeral record we like to think they are, and in fact the most terrible of historical sources? And if not, if I’m being dramatic, what is it about them that we can salvage from the wreckage of their contemporary, subsequent, and modern editing, repackaging, unpicking, and repacking? Which aspects of these most fertile, voluminous, and vital sources of the pasts that we care about can we confidently invest some authority in?