Tools Mentioned in DH2020 Abstracts

by Frank Fischer and Yoann Moranville

Building on our previous attempts to extract digital tools mentioned in abstracts of Digital Humanities conferences, we conducted a similar exercise for this week’s virtual DH2020. We again used our simple string-matching programme ToolXtractor and an updated list of tools in the TAPoR database.
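The general idea behind this kind of string matching can be sketched in a few lines of Python. This is not ToolXtractor's actual code: the function name `count_tool_mentions`, the sample abstracts, and the choice of case-insensitive whole-word matching are all assumptions made for illustration.

```python
import re
from collections import Counter


def count_tool_mentions(abstracts, tools):
    """Return a Counter mapping each tool name to the number of
    abstracts that mention it at least once."""
    counts = Counter()
    for tool in tools:
        # Assumption: match the name as a whole token, ignoring case,
        # so that a short name does not fire inside a longer word.
        pattern = re.compile(
            r"(?<!\w)" + re.escape(tool) + r"(?!\w)", re.IGNORECASE
        )
        for text in abstracts:
            if pattern.search(text):
                counts[tool] += 1
    return counts


# Made-up abstracts and a tiny tool list, for illustration only.
abstracts = [
    "We parse the corpus with spaCy and visualise the network in Gephi.",
    "Entities were extracted using spaCy.",
    "The maps were drawn by hand.",
]
tools = ["spaCy", "Gephi", "Voyant Tools"]

counts = count_tool_mentions(abstracts, tools)

# Keep only tools mentioned in more than one abstract.
repeated = {tool: n for tool, n in counts.items() if n > 1}
print(repeated)
```

Each tool is counted once per abstract, however often it occurs in that abstract, which matches the "mentioned in N abstracts" counts reported in this post.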

So here is an overview of tools mentioned in more than one abstract (out of a total of 475 abstracts for all conference formats), including links to the actual texts – followed by some observations at the bottom of this blog post:

spaCy (5)

word2vec (5)

Gephi (4)

Natural Language Toolkit (4)

Voyant Tools (4)

Drupal (3)

Google Maps (3)

Tesseract (3)

ArcGIS (2)

Cytoscape (2)


Neo4j (2)

Omeka (2)

OpenStreetMap (2)

Some Acknowledgements and Observations

  • Programming languages were excluded from this list (for a reality check: Python is mentioned in 26 abstracts, JavaScript in 11, MySQL in 4). Also, no full-text, data or code repositories like HathiTrust, Google Books, GitHub, Dataverse or Islandora were included.
  • Disambiguation was needed: the stylo R package didn’t make it into the list, although the term ‘stylo’ is mentioned in two abstracts. In the second abstract, however, it referred not to the well-known stylometry library but to another tool.
  • So, hm, only 14 tools were explicitly mentioned more than once. Our guess is still that not all tools that contributed to a research project or workshop were mentioned, which makes it more difficult to understand how things were done. One reason for this could be the limited writing space for conference abstracts, but there is definitely room for improvement in terms of specifically mentioning the tools used.
  • The TAPoR list of tools, even if updated, still isn’t (and never will be) exhaustive, so as usual, take this with a grain of salt.
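The disambiguation step mentioned above can be supported by printing each match with a bit of surrounding text, so that a human can decide which tool is meant. The following is a minimal sketch under stated assumptions, not a description of how the check was actually carried out: the helper `contexts`, the `width` parameter, and the three sample abstracts are invented for illustration.

```python
import re


def contexts(term, abstracts, width=30):
    """Yield (abstract_index, snippet) for every whole-word,
    case-insensitive occurrence of `term`."""
    pattern = re.compile(
        r"(?<!\w)" + re.escape(term) + r"(?!\w)", re.IGNORECASE
    )
    for i, text in enumerate(abstracts):
        for m in pattern.finditer(text):
            # Cut a window of `width` characters on either side of the match.
            start = max(0, m.start() - width)
            end = min(len(text), m.end() + width)
            yield i, text[start:end]


# Made-up abstracts; the editor named "Stylo" here is hypothetical.
abstracts = [
    "Authorship was tested with the stylo package for R.",
    "Our stylometric study uses a custom pipeline.",
    "We drafted the paper in a hypothetical editor called Stylo.",
]

hits = list(contexts("stylo", abstracts))
for index, snippet in hits:
    print(index, "...", snippet, "...")
```

The whole-word pattern skips "stylometric" in the second abstract, but it cannot tell the two remaining hits apart. That decision still has to be made by reading the snippets.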