Measuring the Impact of DH Conference Abstracts: The Case of DH2016

We are still toying around with the proceedings of ADHO’s Digital Humanities conferences. This time we are interested in the measurability of the scientific impact of published abstracts.

Long story short, since no other meaningful metrics are at hand (or are there?), we have chosen Google Scholar’s citation counts. As a test case we decided to go with the abstracts from DH2016 in Kraków, because 1.) they are well covered in Google Scholar and 2.) it has been three and a half years since the conference, so enough time has passed for them to have an impact and attract citations.

Our starting point was the fabulous DBLP, a giant computer science bibliography founded at the University of Trier. Fortunately, it has also been covering ADHO conferences for some time now, so here are all contributions to DH2016 in a concise and interoperable form: https://dblp1.uni-trier.de/db/conf/dihu/dh2016.html.

Before we dive into the data, some known issues:

Google Scholar is “notoriously hard to mine”: there’s no API and access is limited (if you don’t want to deal with CAPTCHAs and “We’re sorry, but your computer network may be sending automated queries”, you will have to apply some IP address magic).
Google Scholar works pretty well, but there might be some incorrectly assigned citations – since their search algorithm is proprietary, we can’t really know why things go wrong if they do.
It is also unclear what exactly Google Scholar indexes, so some citations are definitely missing.
When counting and ranking citations, we don’t check whether they are mainly self-citations.
Given the above points, please take the following with a grain of salt.

Out of 435 accepted abstracts at DH2016 (plenary lectures, panels, long and short papers, posters, pre-conference workshops and tutorials), 158 are cited at least once according to Google Scholar, as of 15 February 2020 (= 3.5 years after the conference).
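
To put these two numbers in proportion, here is a minimal Python sketch that turns them into a share of cited abstracts. The two counts come from the paragraph above; the function and variable names are just illustrative.

```python
# Counts taken from the text above (Google Scholar, as of 15 February 2020).
accepted = 435  # accepted abstracts at DH2016
cited = 158     # abstracts cited at least once


def cited_share(cited_count: int, accepted_count: int) -> float:
    """Return the fraction of accepted abstracts that were cited at least once."""
    return cited_count / accepted_count


share = cited_share(cited, accepted)
uncited = accepted - cited
print(f"{share:.1%} of abstracts cited, {uncited} without any citation")
```

So roughly a third of the abstracts picked up at least one citation, with all the caveats about Google Scholar listed above.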

Here are our resulting data sets in case you wanna explore them yourself (CSV format):

DH2016 citations according to Google Scholar (158 abstracts)
DSH Special DH2016 Issue citations according to Google Scholar (15 papers)
DH2016 follow-up papers published elsewhere (selection) (3 papers)
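
If you want to play with the CSV files yourself, a ranking like the ones below could be produced roughly as follows. This is only a sketch: the column names (`title`, `type`, `citations`) are assumptions and may not match the actual files, and the three sample rows are copied from the ranking in this post rather than read from the real data set.

```python
import csv
import io

# Hypothetical layout: the real CSV files may use different column names.
SAMPLE_CSV = """title,type,citations
CORE - A Contextual Reader based on Linked Data,Long Paper,15
Textual Communities,Poster,5
Visualising the Dynamics of Character Networks,Long Paper,8
"""


def rank_by_citations(csv_text, min_citations=1):
    """Parse CSV text and return rows sorted by citation count, highest first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    kept = [row for row in rows if int(row["citations"]) >= min_citations]
    return sorted(kept, key=lambda row: int(row["citations"]), reverse=True)


ranking = rank_by_citations(SAMPLE_CSV, min_citations=4)
for row in ranking:
    print(row["citations"], row["title"])
```

To use it with a downloaded file, read the file's contents into a string (or adapt the function to take a file object) and adjust the column names as needed.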

What follows are some rankings based on these data sets (plus some general thoughts at the bottom of this blog post):

1. Most cited abstracts from DH2016 (as of 15 February 2020)

There are 28 abstracts with at least 4 citations collected by Google Scholar:

Eetu Mäkelä, Thea Lindquist, Eero Hyvönen:
CORE - A Contextual Reader based on Linked Data – Long Paper – 15 citations
(DBLP) (Google Scholar)
Esko Ikkala, Jouni Tuominen, Eero Hyvönen:
Contextualizing Historical Places in a Gazetteer by Using Historical Maps and Linked Data – Short Paper – 11 citations
(DBLP) (Google Scholar)
Terhi Nurmikko-Fuller, Jacob Jett, Timothy W. Cole, Chris Maden, Kevin R. Page, J. Stephen Downie:
A Comparative Analysis of Bibliographic Ontologies: Implications for Digital Humanities – Short Paper – 10 citations
(DBLP) (Google Scholar)
Peer Trilcke, Frank Fischer, Mathias Göbel, Dario Kampkaspar:
Theatre Plays as ‘Small Worlds’? Network Data on the History and Typology of German Drama, 1730-1930 – Long Paper – 8 citations
(DBLP) (Google Scholar)
Aris Xanthos, Isaac Pante, Yannick Rochat, Martin Grandjean:
Visualising the Dynamics of Character Networks – Long Paper – 8 citations
(DBLP) (Google Scholar)
Manuel Burghardt, Lukas Lamm, David Lechler, Matthias Schneider, Tobias Semmelmann:
Tool-based Identification of Melodic Patterns in MusicXML Documents – Short Paper – 8 citations
(DBLP) (Google Scholar)
Fotis Jannidis, Isabella Reger, Markus Krug, Lukas Weimer, Luisa Macharowsky, Frank Puppe:
Comparison of Methods for the Identification of Main Characters in German Novels – Short Paper – 8 citations
(DBLP) (Google Scholar)
Roman Klinger, Surayya Samat Suliya, Nils Reiter:
Automatic Emotion Detection for Quantitative Literary Studies – Poster – 8 citations
(DBLP) (Google Scholar)
Maciej Maryl, Maciej Piasecki, Ksenia Mlynarczyk:
Where Close and Distant Readings Meet: Text Clustering Methods in Literary Analysis of Weblog Genres – Long Paper – 7 citations
(DBLP) (Google Scholar)
Manuel Burghardt, Michael Kao, Christian Wolff:
Beyond Shot Lengths - Using Language Data and Color Information as Additional Parameters for Quantitative Movie Analysis – Poster – 7 citations
(DBLP) (Google Scholar)
Christof Schöch, Daniel Schlör, Stefanie Popp, Annelen Brunner, Ulrike Henny, José Calvo Tello:
Straight Talk! Automatic Recognition of Direct Speech in Nineteenth-Century French Novels – Long Paper – 6 citations
(DBLP) (Google Scholar)
Johannes Hellrich, Udo Hahn:
Measuring the Dynamics of Lexico-Semantic Change Since the German Romantic Period – Short Paper – 6 citations
(DBLP) (Google Scholar)
Fahad Khan, Francesca Frontini, Federico Boschetti, Monica Monachini:
Converting the Liddell Scott Greek-English Lexicon into Linked Open Data using lemon – Short Paper – 6 citations
(DBLP) (Google Scholar)
Federico Nanni, Pablo Ruiz Fabo:
Entities as topic labels: improving topic interpretability and evaluability combining Entity Linking and Labeled LDA – Short Paper – 6 citations
(DBLP) (Google Scholar)
Susan Brown, Tanya E. Clement, Laura Mandell, Deb Verhoeven, Jacque Wernimont:
Creating Feminist Infrastructure in the Digital Humanities – Panel – 5 citations
(DBLP) (Google Scholar)
Alexander Czmiel:
Sustainable publishing - Standardization possibilities for Digital Scholarly Edition technology – Long Paper – 5 citations
(DBLP) (Google Scholar)
Isabella di Lenardo, Benoit Seguin, Frédéric Kaplan:
Visual Patterns Discovery in Large Databases of Paintings – Long Paper – 5 citations
(DBLP) (Google Scholar)
Francesca Frontini, Carmen Brando, Jean-Gabriel Ganascia:
REDEN ONLINE: Disambiguation, Linking and Visualisation of References in TEI Digital Editions – Long Paper – 5 citations
(DBLP) (Google Scholar)
Kim Jautze, Andreas van Cranenburgh, Corina Koolen:
Topic Modeling Literary Quality – Long Paper – 5 citations
(DBLP) (Google Scholar)
Mikko Tolonen, Niko Ilomäki, Hege Roivainen, Leo Lahti:
Printing in a Periphery: a Quantitative Study of Finnish Knowledge Production, 1640-1828 – Long Paper – 5 citations
(DBLP) (Google Scholar)
Peter Robinson, Barbara Bordalejo:
Textual Communities – Poster – 5 citations
(DBLP) (Google Scholar)
Alexander Dunst, Rita Hartel, Sven Hohenstein, Jochen Laubrock:
Corpus Analyses of Multimodal Narrative: The Example of Graphic Novels – Long Paper – 4 citations
(DBLP) (Google Scholar)
Maciej Eder, Jan Rybicki:
Go Set A Watchman while we Kill the Mockingbird in Cold Blood, with Cats and Other People – Long Paper – 4 citations
(DBLP) (Google Scholar)
Chao-Lin Liu:
Quantitative Analyses of Chinese Poetry of Tang and Song Dynasties: Using Changing Colors and Innovative Terms as Examples – Long Paper – 4 citations
(DBLP) (Google Scholar)
Alexandre Rigal, Dario Rodighiero, Loup Cellard:
The Trajectories Tool: Amplifying Network Visualization Complexity – Long Paper – 4 citations
(DBLP) (Google Scholar)
Valérie Beaudouin, Zeynep Pehlivan:
The Great War on the Web: the Making of Citing and Referencing by Amateurs – Short Paper – 4 citations
(DBLP) (Google Scholar)
Angelo Mario Del Grosso, Davide Albanesi, Emiliano Giovannetti, Simone Marchi:
Defining the Core Entities of an Environment for Textual Processing in Literary Computing – Poster – 4 citations
(DBLP) (Google Scholar)
Martin Reckziegel, Stefan Jänicke, Gerik Scheuermann:
CTRaCE: Canonical Text Reader and Citation Exporter – Poster – 4 citations
(DBLP) (Google Scholar)
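
For a quick look at which contribution types made it into this ranking, the entries can be tallied with a few lines of Python. The sketch below assumes each entry follows the "Title – Type – N citations" pattern used above (separated by en dashes); the four sample lines are copied from the list, and the helper names are made up for illustration.

```python
import re
from collections import Counter

# Pattern for lines such as "Some Title – Long Paper – 15 citations".
# The separators are en dashes; plain hyphens inside titles are left alone.
ENTRY = re.compile(r"(?P<title>.+) – (?P<type>[^–]+) – (?P<count>\d+) citations?")

SAMPLE_LINES = [
    "CORE - A Contextual Reader based on Linked Data – Long Paper – 15 citations",
    "Automatic Emotion Detection for Quantitative Literary Studies – Poster – 8 citations",
    "Creating Feminist Infrastructure in the Digital Humanities – Panel – 5 citations",
    "Topic Modeling Literary Quality – Long Paper – 5 citations",
]


def parse_entry(line):
    """Split one ranking line into (title, contribution type, citation count)."""
    match = ENTRY.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unexpected line format: {line!r}")
    return match["title"], match["type"], int(match["count"])


entries = [parse_entry(line) for line in SAMPLE_LINES]
type_counts = Counter(kind for _, kind, _ in entries)
print(type_counts.most_common())
```

Applied to all 28 entries above, this tally should come out as 14 long papers, 8 short papers, 5 posters and 1 panel.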

2. Citation counts for follow-up articles published in the DSH special conference issue

EADH’s journal “Digital Scholarship in the Humanities” (DSH) published a special issue with 15 selected articles from the DH2016 conference (Volume 32, Issue suppl_2, December 2017). Here are their citation counts (titles sometimes differ from the original abstracts):

Stefan Evert, Thomas Proisl, Fotis Jannidis, Isabella Reger, Steffen Pielström, Christof Schöch, Thorsten Vitt:
Understanding and explaining Delta measures for authorship attribution – 22 citations (original abstract: 2)
(Google Scholar)
Heather Richards-Rissetto:
An iterative 3D GIS analysis of the role of visibility in ancient Maya landscapes: A case study from Copan, Honduras – 10 (2)
(Google Scholar)
Stefan Jänicke, David Joseph Wrisley:
Visualizing Mouvance: Toward a visual analysis of variant medieval text traditions – 9 (3)
(Google Scholar)
Grace Muzny, Mark Algee-Hewitt, Dan Jurafsky:
Dialogism in the novel: A computational model of the dialogic nature of narration and quotations – 8 (2)
(Google Scholar)
Elena González-Blanco, Clara Martínez Cantón, Gimena del Rio Riande, Salvador Ros, Rafael Pastor, Antonio Robles-Gómez, Agustín Caminero, María Luisa Díez Platas, Álvaro del Olmo, Miguel Urízar:
EVI-LINHD, a virtual research environment for the Spanish-speaking community – 5 (2)
(Google Scholar)
Arianna Ciula:
Digital palaeography: What is digital about it? – 4 (0)
(Google Scholar)
Katarzyna Bazarnik, Jakub Wróblewski:
First We Feel Then We Fall: James Joyce’s Finnegans Wake as an interactive video application – 4 (0)
(Google Scholar)
David L Hoover:
The microanalysis of style variation – 3 (0)
(Google Scholar)
James O Gawley, A Caitlin Diddams:
Comparing the intertextuality of multiple authors using Tesserae: A new technique for normalization – 3 (0)
(Google Scholar)
Marine Riguet, Suzanne Mpouli:
At the crossroads between the scientific and the literary discourse: Comparison as a figure of dialogism – 3 (3)
(Google Scholar)
Martijn Kleppe, Marco Otte:
Analysing and understanding news consumption patterns by tracking online user behaviour with a multimodal research design – 3 (0)
(Google Scholar)
Claire Warwick:
Beauty is truth: Multi-sensory input and the challenge of designing aesthetically pleasing digital resources – 1 (0)
(Google Scholar)
Taylor Arnold, Peter Leonard, Lauren Tilton:
Knowledge creation through recommender systems – 1 (0)
(Google Scholar)
Rui Hu, Carlos Pallán Gayol, Jean-Marc Odobez, Daniel Gatica-Perez:
Analyzing and visualizing ancient Maya hieroglyphics using shape: From computer vision to Digital Humanities – 1 (2)
(Google Scholar)
Joris J van Zundert, Tara L Andrews:
Qu’est-ce qu’un texte numérique?—A new rationale for the digital representation of text – 0 (0)
(Google Scholar)
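
To see at a glance how the journal versions fare against their original abstracts, the 15 pairs of counts listed above can be compared with a short Python sketch. The numbers are copied from this section in the same order; the variable names are just illustrative.

```python
# (journal article citations, original abstract citations), in the order listed above.
pairs = [
    (22, 2), (10, 2), (9, 3), (8, 2), (5, 2),
    (4, 0), (4, 0), (3, 0), (3, 0), (3, 3),
    (3, 0), (1, 0), (1, 0), (1, 2), (0, 0),
]

# Count how often the article is cited more, equally often, or less than its abstract.
more = sum(article > abstract for article, abstract in pairs)
equal = sum(article == abstract for article, abstract in pairs)
fewer = sum(article < abstract for article, abstract in pairs)

total_article = sum(article for article, _ in pairs)
total_abstract = sum(abstract for _, abstract in pairs)

print(f"article cited more often: {more}, equally often: {equal}, less often: {fewer}")
print(f"total citations: {total_article} (articles) vs. {total_abstract} (abstracts)")
```

For this small sample, 12 of the 15 articles have more citations than their abstracts, 2 are tied and 1 has fewer, with 77 citations in total for the articles against 16 for the abstracts.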

3. Articles published elsewhere

As it is not always possible to say whether a full paper is really the elaborated version of a conference abstract, we can’t provide an exhaustive list. We only list here the three papers that go by the same titles as the original abstracts:

Melissa Terras, James Baker, James Hetherington, David Beavan, Martin Zaltz Austwick, Anne Welsh, Helen O’Neill, Will Finley, Oliver Duke-Williams, Adam Farquhar:
Enabling complex analysis of large-scale digital collections: humanities research, high-performance computing, and transforming access to British Library digital collections (DSH 33.2, June 2018) – 12 citations (original abstract: 0)
(Google Scholar)
Yuta Hashimoto, Yoichi Iikura, Yukio Hisada, SungKook Kang, Tomoyo Arisawa, Daniel Kobayashi-Better:
The Kuzushiji Project: Developing a Mobile Learning Application for Reading Early Modern Japanese Books (DHQ 11.1, 2017) – 8 (0)
(Google Scholar)
Pelle Snickars, Roger Mähler:
SpotiBot-Turing testing Spotify (DHQ 12.1, 2018) – 7 (0)
(Google Scholar)

Some Thoughts

ADHO conference abstracts do have a measurable scientific impact in the community.
Publishing a reworked conference abstract in a journal can (but doesn’t have to) increase your impact.
It would be nice to have a more stable, sustainable, citable format for ADHO conference proceedings (think DOIs maybe?).
Case in point: the proceedings of DH2015 in Sydney seem to have vanished into the internet ether; only a bunch of unrendered XML files are left somewhere, not really citable.
DOIs for books of abstracts are already a step in the right direction (see, e.g., DOI:10.5281/zenodo.2596095 for DHd2019).
Working with Google Scholar is still tedious. There are some Python libraries that are supposed to make this easier, but they don’t cover the full range of Google Scholar functions, and the proprietary mechanism can change at any time without warning and render these libraries unusable.

Anyway, so much for this little experiment. It’s a beautiful day in Moscow, time to visit the ice rink in Gorky Park! ❄️

weltliteratur.net

A Black Market for the Digital Humanities

Frank Fischer