During the last ten years the use of web archives as primary sources for historical research has attracted the attention of researchers in several different fields. From the web archiving community (Foot et al., 2003; Hockx-Yu, 2014) to Internet studies scholars (Brügger, 2009; Ankerson, 2012), from web scientists (Huurdeman et al., 2013; Hale et al., 2014) to STS researchers (Rogers, 2013; Schafer, 2013), several case studies have been presented. All these different projects have been designed to highlight the potential of born digital sources to offer a new and more complete perspective on our recent past. More recently, traditionally trained and digital historians have been directly engaged in the debate, raising both methodological and theoretical questions (Milligan, 2012; Webster, 2015). However, within Digital Humanities, the web as a source and as an object of study has not been widely debated.
For these reasons, the proposed panel brings together different researchers whose work focuses on the use of born digital materials as historical sources and who have developed, employed, criticized or rejected the use of computational/quantitative methods in order to extract useful pieces of information from them.
The purpose of this panel is twofold. First, we aim to discuss how the toolkit available to new generations of historians will necessarily have to combine a vast series of new skills: from accessing and dealing with web archive objects to analyzing networks of hyperlinks, from employing text mining approaches to re-discussing the reliability of a primary source when it is born digital.
Second, our intention is to discuss with the community how this new field of study could be recognized as a relevant example of a digital humanities practice. While the number of born digital sources is increasing rapidly, a broad discussion within the Digital Humanities community is needed, in order to reflect on how to prepare the first generation of digital historians that will work with materials that do not have an analogue counterpart, of which web archives are just one.
Anat Ben-David is a lecturer in the department of Sociology, Political Science and Communication, the Open University of Israel. Her research focuses on Internet geopolitics, web historiography and digital methods for Web research. Ben-David's contribution presents her recent work on the reconstruction of portions of the Web's deleted pasts. In particular, her presentation argues that the use of the Web as a primary source for studying the history of nations is conditioned by the structural ties between sovereignty and the Internet Protocol, and by a temporal proximity between live and archived websites. The argument is illustrated with an archival reconstruction of the history of the top-level domain of former Yugoslavia, .yu, which had operated on the Web since 1989 and was discontinued in 2010. This reconstruction serves to assess and conceptualise the Web's limits as an appropriate source for telling its own history.
Niels Brügger is Professor and head of the Centre for Internet Studies as well as of the internet research infrastructure NetLab within the Danish Digital Humanities Lab, Aarhus University, Denmark. His research interests are web historiography, web archiving, and media theory. Within these fields he has published monographs, edited books, and book chapters at international publishers, as well as articles in international peer reviewed journals. He has participated in a number of large research projects, in Denmark and in the UK. Niels Brügger is cofounder and now coordinator of RESAW, a Research Infrastructure for the Study of Archived Web Materials (resaw.eu).
Meghan Dougherty is an assistant professor of digital communication at Loyola University Chicago. Her research focuses on methodological challenges in answering questions about how we move through networks of digital culture to create knowledge, form memory, and make history. The questions that guide her inquiry explore how knowledge production, ranging from identity formation and group membership to scholarly knowledge, is shaped by digital communication infrastructure, and vice versa. Dougherty's contribution to the panel stems from her current book project, under contract at University of Toronto Press, Virtual Digs: Excavating, preserving, archiving, and curating the Web, in which she describes a common field of Web Archaeology drawn together from experiments in Web archiving, digital preservation, and curating that are commonly found in Internet research, digital humanities, and information science. She argues that the nature of scholarly evidence and interpretation must be reconsidered in the new media ecology.
Ian Milligan, an assistant professor of digital and Canadian history at the University of Waterloo (Ontario, Canada), will talk about a specific web archiving project that he has been involved with. His contribution, "WebArchives.ca: Enabling Access to Canadian Political Party Web Archives," explores the development, deployment, and reception of http://webarchives.ca. He argues that we have been collecting web archives since 1996; now we need to put them to good use. However, accessing and making sense of results requires computational skills. Milligan's case study, a 2005-2015 assemblage of political parties and political interest groups within Canada, should arguably have been used far more than it has been, given pivotal shifts in the Canadian political milieu during the period it covers. These collections were underused, however, due to access restrictions and a lack of technical knowledge. His contribution to this roundtable will then briefly explore various access methods, from content analysis to metadata parsing, using his case study as a reference point throughout.
Federico Nanni is a PhD student in Science, Technology and Society at the Centre for the History of Universities and Science of the University of Bologna and a visiting researcher at the Data and Web Science Group of the University of Mannheim. His research is focused on understanding how to combine methodologies from different fields of study in order to face both the scarcity and the abundance of born digital sources related to the recent history of Italian universities. In particular, he employed oral histories and traditional hermeneutic practices to reconstruct the evolution of the University of Bologna website, which was excluded from the Wayback Machine of the Internet Archive. Later, he applied text mining methods in order to study the collected data. Finally, he compared the changes in the academic input and output of this institution during the last two decades by analysing the course descriptions presented on the website and the dissertation abstracts available in the digital library.
Jane Winters is Professor of Digital History at the Institute of Historical Research, University of London. Her research interests include the ways in which researchers in the humanities can work with born digital big data, including the archived web. She was Principal Investigator of a big data project funded by the Arts and Humanities Research Council, ‘Big UK Domain Data for the Arts and Humanities’, which sought to develop a theoretical and methodological framework for the study of web archives over time. Drawing on the lessons of this project she will discuss the challenges faced by researchers wishing to work with the archive(s) of UK web space, and suggest ways in which archiving institutions and researchers can collaborate to overcome them. She will address the fractured and diverse nature of the available web archives - from the open and comprehensive UK Government Web Archive to the dataset derived from the Internet Archive for 1996-2013, from a focused institutional collection such as the Parliamentary Web Archive to the ongoing domain crawl undertaken by the British Library since 2013 - and consider the barriers for researchers in the arts and humanities who choose to use this material, whether as a key source for study or simply as one primary source among many.
Together, these six papers all point towards the growing significance of web archives within contemporary historical practice. To encourage discussion, each speaker will briefly introduce his or her work (5 min), then another speaker will offer a general comment (5 min), followed by an open discussion (5 min).[1]
The building blocks of the field are in place: what is needed is a discussion both within the historical community and with the broader field of digital humanities practice. Our discussion will look for commonalities both within our own work and across the broader DH 2016 conference.
[1] A similar format has already been successfully employed at the conference “Web Archives as Scholarly Sources: Issues, Practices and Perspectives”, Aarhus, 2015.