Testing Wikibase with Mapping the Scottish Reformation

One of the goals for our National Endowment for the Humanities grant in 2019/20 was to explore ways we could capture the complexity of early modern clerical careers in a digital format. We needed a system robust enough to handle large volumes of textual, temporal and geographic data, and one that offered ways to visualise that data (either natively or via export). In January 2020, we started compiling our dataset on Wikidata, a linked open data repository.

All the data that sits behind our first public website is available on Wikidata. We have outlined how we use Wikidata in previous posts on this blog. Of course, because Wikidata is an open repository, using it carries risks around data vandalism, sloppy editing (something of which I have been guilty myself) and commercial reuse of data. As we look to the future of Mapping the Scottish Reformation, we have started exploring other ways of managing our data that tick the same boxes as Wikidata but offer features we might need as our dataset grows to encompass different types of material (some of which may not be appropriate for fully open-access platforms like Wikidata). We’ve been working on our own private laboratory version of Mapping the Scottish Reformation: this will allow us to test new ideas, such as data ontologies or visualisations, without worrying about affecting Wikidata. Enter MSR Test on Wikibase.


Wikibase is open-source software for creating knowledge bases. It is effectively the backend of Wikidata, allowing information to be entered and linked. Unlike Wikidata, Wikibase lets you create your own instance to hold a dataset. This can be hosted online on a new service called Wikibase Cloud, or it can be managed privately on a personal or commercial server. Several galleries, libraries, archives and museums have used Wikibase to better understand their collections (see the excellent work of Koninklijke Bibliotheek Nederland and the National Library of Wales).

To get started, one has to register for a Wikibase Cloud account (at the moment, this is invite-only) or download a local version of Wikibase. For the purposes of our test, we have chosen to use the cloud version. Like most Wikimedia projects, there is an active community of users, many of whom offer guides to help get you started (we relied on the sage advice of Jason Evans at the National Library of Wales!). Once you’ve set up your Wikibase instance, you are greeted by a completely empty database. Unlike Wikidata, with its 106.86m items, your new Wikibase instance contains nothing. Zero. If you’re used to Wikidata, this experience is rather jarring. The first step to populating the database is creating each property you need: think of these as the headings in a spreadsheet. Here’s a selection of some of the properties we created:

A screenshot showing some of the properties we created

Once these properties are in place, key items in the dataset are needed. For us, these are key concepts, such as ‘minister’ or ‘presbytery’, that help us categorise other items. Together, these properties and items are the pieces we need to build our data structure: the framework to capture information about clergy careers.
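We created our properties and items by hand, but the same step can be scripted. What follows is a minimal sketch, not our actual workflow: it builds (without sending) the parameters for MediaWiki's `wbeditentity` API call, which Wikibase uses to create new entities. The property labels and datatypes listed are illustrative assumptions rather than the properties defined in MSR Test, and the token value is a placeholder.

```python
import json

# Hypothetical examples only: these labels and datatypes are illustrative,
# not the properties actually defined in MSR Test.
EXAMPLE_PROPERTIES = [
    ("instance of", "wikibase-item"),
    ("coordinate location", "globe-coordinate"),
    ("start time", "time"),
]
EXAMPLE_ITEMS = ["minister", "presbytery"]


def build_entity_request(label, entity_type, datatype=None, language="en"):
    """Build the POST parameters for one `wbeditentity` call (nothing is sent)."""
    data = {"labels": {language: {"language": language, "value": label}}}
    if entity_type == "property":
        # Unlike items, every property must declare the kind of value it holds.
        if datatype is None:
            raise ValueError("a property needs a datatype")
        data["datatype"] = datatype
    return {
        "action": "wbeditentity",
        "new": entity_type,
        "data": json.dumps(data),
        "format": "json",
        # Placeholder: a real request needs a CSRF token from a logged-in session.
        "token": "<csrf-token-goes-here>",
    }


requests_to_send = [
    build_entity_request(label, "property", datatype)
    for label, datatype in EXAMPLE_PROPERTIES
] + [build_entity_request(label, "item") for label in EXAMPLE_ITEMS]

for params in requests_to_send:
    print(params["new"], params["data"])
```

Each dictionary would be posted to the instance's `api.php` endpoint by an authenticated session. The sketch stops short of that because logging in and fetching a token depend on how your own instance is configured.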

Because Wikibase is structurally so similar to Wikidata, we can use bulk upload tools like QuickStatements to import data into the dataset. This allowed us to bulk upload the names and coordinates of every presbytery region in early modern Scotland. For the same reason, Wikibase can be searched using its own Query Service. Results can be downloaded in CSV, TSV or JSON formats or visualised as maps, tables and network diagrams. Here’s a visualisation of most of those presbyteries, made with the same query service software that Wikidata uses but drawing on data held in our own Wikibase instance:

An early visualisation of Scottish Presbyteries made using Wikibase
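For readers curious what the bulk upload and the map query might look like, here is a rough sketch rather than the commands we actually ran. It generates QuickStatements (V1, tab-separated) commands from a list of names and coordinates, then builds a SPARQL query that the query service could draw as a map. The property and item IDs (`P1`, `P2`, `Q1`), the entity URI layout and the two sample rows are all assumptions for illustration; check the real values in your own instance.

```python
# Hypothetical IDs: look up the real ones in your own Wikibase instance.
INSTANCE_OF = "P1"   # assumed property: "instance of"
COORDINATES = "P2"   # assumed property: "coordinate location"
PRESBYTERY = "Q1"    # assumed item: "presbytery"
BASE = "https://msrtest.wikibase.cloud"  # the URI layout used below is assumed

# Illustrative rows only: approximate town-centre coordinates, not project data.
rows = [
    ("Presbytery of Edinburgh", 55.95, -3.19),
    ("Presbytery of Haddington", 55.96, -2.78),
]


def to_quickstatements(rows):
    """Return QuickStatements V1 commands that create one item per row."""
    lines = []
    for name, lat, lon in rows:
        lines.append("CREATE")                                # start a new item
        lines.append(f'LAST\tLen\t"{name}"')                  # English label
        lines.append(f"LAST\t{INSTANCE_OF}\t{PRESBYTERY}")    # classify the item
        lines.append(f"LAST\t{COORDINATES}\t@{lat}/{lon}")    # @latitude/longitude
    return "\n".join(lines)


def build_map_query():
    """Return a SPARQL query listing every presbytery that has coordinates."""
    return f"""#defaultView:Map
PREFIX wd: <{BASE}/entity/>
PREFIX wdt: <{BASE}/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?presbytery ?label ?coords WHERE {{
  ?presbytery wdt:{INSTANCE_OF} wd:{PRESBYTERY} ;
              wdt:{COORDINATES} ?coords ;
              rdfs:label ?label .
  FILTER(LANG(?label) = "en")
}}"""


print(to_quickstatements(rows))
print(build_map_query())
```

The first output can be pasted into the QuickStatements tool attached to an instance, and the second into its query service, where the `#defaultView:Map` comment asks for the results to be plotted on a map.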

While the Wikibase instance doesn’t include many ministers yet, we hope you can see some of the advantages of this system. All of the data you have seen in this blog post exists outside of Wikidata. Moreover, while users can see the data here, only members of the project team can edit it, meaning we have greater editorial control. Finally, as we move forward with the project, Wikibase may allow us to offer different levels of open access when compared to Wikidata. For example, all of our data on Wikibase is made available under a CC BY-NC 4.0 licence, meaning it is free to reuse with attribution for non-commercial purposes, but cannot be reused commercially. This may be an important feature if we move into archival collections that prohibit commercial usage.

Because it is so easy to set up, Wikibase may offer humanities scholars an easy-to-learn alternative to resource-heavy database software. Moreover, Wikibase gives us the freedom to structure data in different ways or to test new approaches, without worrying about affecting Wikidata’s other users. MSR Test on Wikibase serves as a laboratory in which we can develop the next stages of the project. If you think Wikibase might be of use to you and your projects, get in touch with us.

You can keep track of our Wikibase progress at https://msrtest.wikibase.cloud/

Chris R. Langley

The Open University