DHQ: Digital Humanities Quarterly

2021 15.4

Articles

[en] A Named Entity Recognition Model for Medieval Latin Charters

Pierre Chastang, UVSQ-Université Paris-Saclay; Sergio Torres Aguilar, UVSQ-Université Paris-Saclay; Xavier Tannier, Sorbonne Université


[en] Modernism and Gender at the Limits of Stylometry

Sean Weidman, Pennsylvania State University; Aaren Pastor, Pennsylvania State University


[en] Compounded Mediation: A Data Archaeology of the Newspaper Navigator Dataset

Benjamin Lee, The Library of Congress & The University of Washington

Abstract [en] The increasing roles of machine learning and artificial intelligence in the construction of cultural heritage and humanities datasets necessitate critical examination of the myriad biases introduced by machines, algorithms, and the humans who build and deploy them. From image classification to optical character recognition, the effects of decisions ostensibly made by machines compound through the digitization pipeline and redouble in each step, mediating our interactions with digitally-rendered artifacts through the search and discovery process. As a result, scholars within the digital humanities community have begun advocating for the proper contextualization of cultural heritage datasets within the socio-technical systems in which they are created and utilized. One such approach to this contextualization is the data archaeology, a form of humanistic excavation of a dataset that Paul Fyfe defines as “recover[ing] and reconstitut[ing] media objects within their changing ecologies”. Within critical data studies, this excavation of a dataset, including its construction and mediation via machine learning, has proven to be a capacious approach. However, the data archaeology has yet to be adopted as standard practice among cultural heritage practitioners who produce such datasets with machine learning. In this article, I present a data archaeology of the Library of Congress’s Newspaper Navigator dataset, which I created as part of the Library of Congress’s Innovator in Residence program. The dataset consists of visual content extracted from 16 million historic newspaper pages in the Chronicling America database using machine learning techniques.
In this case study, I examine the manifold ways in which a Chronicling America newspaper page is transmuted and decontextualized during its journey from a physical artifact to a series of probabilistic photographs, illustrations, maps, comics, cartoons, headlines, and advertisements in the Newspaper Navigator dataset. Accordingly, I draw from fields of scholarship including media archaeology, critical data studies, science and technology studies, and the autoethnography throughout. To excavate the Newspaper Navigator dataset, I consider the digitization journeys of four different pages in Black newspapers included in Chronicling America, all of which reproduce the same photograph of W.E.B. Du Bois in an article announcing the launch of The Crisis, the official magazine of the NAACP. In tracing the newspaper pages’ journeys, I unpack how each step in the Chronicling America and Newspaper Navigator pipelines, such as the imaging process and the construction of training data, not only imprints bias on the resulting Newspaper Navigator dataset but also propagates the bias through the pipeline via the machine learning algorithms employed. Along the way, I investigate the limitations of the Newspaper Navigator dataset and machine learning techniques more generally as they relate to cultural heritage, with a particular focus on marginalization and erasure via algorithmic bias, which implicitly rewrites the archive itself. In presenting this case study, I argue for the value of the data archaeology as a mechanism for contextualizing and critically examining cultural heritage datasets within the communities that create, release, and utilize them. I offer this autoethnographic investigation of the Newspaper Navigator dataset in the hope that it will be considered not only by users of this dataset in particular but also by digital humanities practitioners and end users of cultural heritage datasets writ large.

[en] Classifying and Contextualizing Edits in Variants with Coleto: Three Versions of Andy Weir’s The Martian

Erik Ketzan, Trinity College Dublin; Christof Schöch, University of Trier


[en] Character Recognition Of Seventeenth-Century Spanish American Notary Records Using Deep Learning

Nouf Alrasheed, Department of Computer Science & Electrical Engineering, University of Missouri-Kansas City; Praveen Rao, Department of Health Management & Informatics, Department of Electrical Engineering & Computer Science, University of Missouri-Columbia; Viviana Grieco, Department of History, University of Missouri-Kansas City


[en] Finding Narratives in News Flows: The Temporal Dimension of News Stories

Blanca Calvo Figueras, University of Groningen; Tommaso Caselli, University of Groningen; Marcel Broersma, Centre for Media and Journalism Studies, University of Groningen


[en] Innovation Through Collaboration in Humanities Research

Maria Bonn, School of Information Sciences, University of Illinois Urbana-Champaign; Harriett Green, Washington University in St. Louis; Angela Courtney, Indiana University Bloomington; Megan Senseney, University of Arizona


Reviews

[en] The Age Old Question: A Review of What is Digital History? by Hannu Salmi

Tracy L. Barnett, University of Georgia


Author Biographies

URL: http://www.digitalhumanities.org/dhq/vol/15/4/index.html
Affiliated with: Digital Scholarship in the Humanities
DHQ has been made possible in part by the National Endowment for the Humanities.
Copyright © 2005 -

Unless otherwise noted, the DHQ web site and all DHQ published content are published under a Creative Commons Attribution-NoDerivatives 4.0 International License. Individual articles may carry a more permissive license, as described in the footer for the individual article, and in the article’s metadata.
