Entering Stage Two: Mapping Parishes in Lothian and Tweeddale, 1560-1689

With Stage One of Mapping the Scottish Reformation nearing completion, we now have a large dataset of ministers, detailing their movements and the key moments of their careers. We have parsed over 9,000 pages of manuscripts from presbyteries across the region of Lothian and Tweeddale. With this significant task complete, we can now turn our attention to how we intend to present this data in Stage Two of the project.

Stage Two of Mapping the Scottish Reformation is generously funded by the Strathmartine Trust and will see us explore our user interfaces for the first time. Up to now, we have deposited our data onto Wikidata and used its built-in tools to test our material, to identify gaps, and to create quick mock-up visuals that we think users might find useful. You can read more about our use of Wikidata mapping tools and how we structure our dataset here.

A screen grab of the mapping tools built into the Wikidata Query Service.

Followers of the project will have seen some of the demos we have been able to quickly put together that visualise the breadth of our data and hint at some of the ways we can put it to work. Critically, Stage Two of Mapping the Scottish Reformation will use even more powerful mapping technologies to create visualisations that load faster, run more smoothly and show even more data. Before we formally embark on this process, we have spent the last couple of weeks dipping our toes into some of these more powerful mapping technologies.

The Wikidata Query Service — and the SPARQL queries we write to ask questions of our data — sits at the heart of our project. It allows our data to be open for other researchers to use (and build on) in the future, but it can also be quickly exported and patched into other programmes/services. The first stage of testing more powerful maps is to take the result of a SPARQL query, add a few modifications to the code, and export it into a TSV (tab-separated values) file.

The Wikidata Query Service allows the results of queries to be exported into various file formats

We exported a SPARQL query that shows all of the parishes in the Synod of Lothian and Tweeddale between 1560 and 1689, as well as a label showing in which presbytery each parish sat. This has the fewest values of any of our datasets, so we thought it would be an easy place to start! The resulting TSV file is effectively a huge spreadsheet, and as much as I like spreadsheets, I think it would be a stretch to call it attractive or user friendly. The important thing is that we have the latitude and longitude data in separate columns, alongside the key bits of information we want to display to users. Our test file included around 120 lines.

The resulting TSV file is functional, if unappealing
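For anyone who wants to try this at home, here is a minimal sketch of requesting tab-separated results from the Wikidata Query Service. The query is illustrative only: the item number and property paths are placeholders rather than the exact modelling our project uses, and the real query also splits each coordinate into separate latitude and longitude columns.

```javascript
// Minimal sketch: fetch SPARQL results from the Wikidata Query Service as TSV.
// The Q-number and property paths below are placeholders, not the exact
// modelling used by Mapping the Scottish Reformation.
const endpoint = "https://query.wikidata.org/sparql";

const query = `
  SELECT ?parish ?parishLabel ?presbyteryLabel ?coord WHERE {
    ?parish wdt:P31 wd:Q00000000 .    # hypothetical "parish" class
    ?parish wdt:P625 ?coord .         # P625 = coordinate location
    ?parish wdt:P131 ?presbytery .    # hypothetical link from parish to presbytery
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  }`;

async function fetchParishTSV() {
  const response = await fetch(endpoint + "?query=" + encodeURIComponent(query), {
    // Asking for TSV mirrors the "download as TSV" option in the query service UI.
    headers: { Accept: "text/tab-separated-values" },
  });
  return response.text(); // a header row, then one row per parish
}

fetchParishTSV().then((tsv) => console.log(tsv.split("\n").length, "rows"));
```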

Formatted correctly, a TSV file like this one can be imported into GeoJSON, an open-format mapping service that allows users to input geographical data and display it on maps (note: have you noticed our constant use of open-source and open-access services?! It's no coincidence!). Users can either add points manually or, critically for us, add geo-referenced locations in bulk. Once the file is uploaded, the result is a much more appealing map that includes more attractive and comprehensive icons and the ability to select different mapping layers. We can even add different mapping tiles, using a service like MapTiler, enabling us to test different backgrounds.
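For the curious, the sketch below shows roughly what that bulk conversion amounts to: each TSV row, with its separate latitude and longitude columns, becomes one GeoJSON point feature. The column names are hypothetical, and in practice the mapping service does this work for us.

```javascript
// Rough sketch: convert TSV rows into a GeoJSON FeatureCollection.
// Column names are hypothetical examples.
function tsvToGeoJSON(tsv) {
  const [headerLine, ...rows] = tsv.trim().split("\n");
  const headers = headerLine.split("\t");

  const features = rows.map((row) => {
    const record = Object.fromEntries(
      row.split("\t").map((value, i) => [headers[i], value])
    );
    return {
      type: "Feature",
      geometry: {
        type: "Point",
        // GeoJSON expects [longitude, latitude], in that order.
        coordinates: [Number(record.longitude), Number(record.latitude)],
      },
      // Everything else becomes properties we can surface in a popup later.
      properties: {
        parish: record.parishLabel,
        presbytery: record.presbyteryLabel,
      },
    };
  });

  return { type: "FeatureCollection", features };
}
```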

The beauty of GeoJSON is that it transforms that ugly TSV file into something more machine-readable. Unfortunately, GeoJSON doesn't allow you to automatically export your map or embed it into a website like this one. This is where Leaflet.js comes in.

Leaflet is quietly taking over the world of internet mapping applications, but its power comes at a significant technical cost: we aren't in the world of drag-and-drop or 'what you see is what you get' editing anymore. The benefits of a little perseverance, however, are huge.

Leaflet demands an understanding of HTML, CSS and JavaScript, or at least an understanding of what to swap into lines of code cribbed from GitHub, and when. This process was made infinitely easier by Leaflet's own tutorials and, in particular, by this superb tutorial on YouTube by Eduonix. The key here is to take the code generated by GeoJSON and copy it into our HTML file (shown below in the Sublime Text editor). You can see how the data from GeoJSON appears just below the various lines of header code.
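To give a sense of the structure, here is a sketch of what such an HTML file might look like; the Leaflet version number, file names and placeholder data are illustrative rather than our actual code.

```html
<!-- Sketch of the page skeleton: Leaflet's stylesheet and script, a div to hold
     the map, and the GeoJSON pasted in as a plain JavaScript variable.
     Version numbers and file names are illustrative. -->
<!DOCTYPE html>
<html>
<head>
  <link rel="stylesheet" href="https://unpkg.com/leaflet@1.9.4/dist/leaflet.css" />
  <script src="https://unpkg.com/leaflet@1.9.4/dist/leaflet.js"></script>
  <style>
    #map { height: 600px; } /* the map div needs an explicit height to be visible */
  </style>
</head>
<body>
  <div id="map"></div>
  <script>
    // The FeatureCollection exported from GeoJSON, pasted directly into the page
    // so that no external data request is needed when the page loads.
    var parishes = { "type": "FeatureCollection", "features": [ /* ... */ ] };
  </script>
  <script src="map.js"></script> <!-- the Leaflet code sketched below -->
</body>
</html>
```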

After generating our map, a few fairly simple lines of code allow Leaflet to take the data from GeoJSON and display it, as well as to add a custom mapping layer and popup menus that are, in theory, infinitely customisable. The resulting map can be exported to HTML and embedded into a website. And because the database values were all pasted directly into Leaflet, at least for the moment, Leaflet doesn't have to request the data each time the page loads. The result is that the embedded map loads almost immediately.
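Those few lines look something like the sketch below. The centre point, tile URL, attribution and popup fields are placeholders for illustration; our working map swaps in a historical tile layer, as described below.

```javascript
// Sketch of the Leaflet code described above (map.js). The centre point,
// tile URL and popup fields are illustrative placeholders.
var map = L.map("map").setView([55.9, -3.2], 9); // roughly centred on Lothian

// A background tile layer; our working map swaps this for an NLS historical layer.
L.tileLayer("https://tile.openstreetmap.org/{z}/{x}/{y}.png", {
  maxZoom: 18,
  attribution: "&copy; OpenStreetMap contributors",
}).addTo(map);

// Display the pasted GeoJSON and attach a popup to each parish marker.
L.geoJSON(parishes, {
  onEachFeature: function (feature, layer) {
    layer.bindPopup(
      "<strong>" + feature.properties.parish + "</strong><br/>" +
      "Presbytery: " + feature.properties.presbytery
    );
  },
}).addTo(map);
```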

You can play with this simple demo, showing all of the parishes of Lothian and Tweeddale between 1560 and 1689, below.

Notice that we have made use of the NLS Historical Maps API to plot the points on a historical map. The background dynamically switches to a different historical map depending on how far a user zooms in or out.
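Switching to that historical background is simply a matter of pointing the tile layer at the NLS service. The URL template below is a placeholder: the real template, attribution text and any API key come from the NLS Historical Maps API documentation.

```javascript
// Sketch only: the real tile URL template and attribution are supplied by the
// NLS Historical Maps API documentation; the URL below is a placeholder.
var nlsHistoric = L.tileLayer("https://nls.example/historic/{z}/{x}/{y}.jpg", {
  attribution: "National Library of Scotland Historic Maps",
  maxZoom: 18,
});

// The NLS service serves different historical map series at different zoom
// levels, which is why the background changes as users zoom in and out.
nlsHistoric.addTo(map);
```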

If this seems like a tremendous amount of effort to go to in order to embed a map, then I suppose you're right. What's important here is that we have demonstrated that the data we manually took from manuscripts within the National Records of Scotland, passed into Wikidata, and then queried using SPARQL and the Wikidata Query Service, can be exported, customised and presented in a way that is as visually friendly as we can make it!

This is just a test, but it reflects the process we will go through during Stage Two of Mapping the Scottish Reformation, with colleagues from our international Advisory Board and our technical friends and colleagues at the University of Edinburgh. Ultimately, this process will allow us to create a number of interactive visualisations that will distil the months we have spent looking at handwritten archival material and make it more accessible. So while we've been recording, storing and querying the Scottish Reformation up to now, Stage Two of this project will allow us to start intricately mapping the Scottish Reformation.

Chris R. Langley

Working with Imperfect Manuscripts: Image Manipulation in MSR

MSR uses images from documents produced by Church of Scotland presbyteries between 1560 and 1689 to gather data on clerical careers. Indeed, at the time of writing, we have read over two and a half thousand pages of such material. And we aren’t finished!

Our friends at the National Records of Scotland provide us with bulk deposits of images of these manuscripts, presbytery by presbytery. These images were taken between 2003 and 2005, during which time around five million document images were snapped and digitised. These images form a terrific resource that can be accessed via the Virtual Volumes system at the National Records of Scotland reading rooms in Edinburgh. Due to bandwidth constraints in the early 2000s, these images were taken at a resolution of 2174 × 1655 pixels, or around 3.6 million pixels per image. By contrast, images snapped on a current mobile phone camera can hit around 12.1 million pixels per image. These technical constraints, together with the fact that our manuscripts were produced four centuries ago and survive in varying states of decay, mean that our source base can be quite difficult to read. This blog post will take you through some of the methods we use to increase the amount of data we can recover from our source base.

The following image is from a particularly faded section of the records of Dalkeith Presbytery from 1614 (NRS, CH2/424/1). One can observe how the marginal annotation appears in a much deeper ink on the left-hand side, but the main entry to the right is faded. Moreover, some of the ink from the other side of the manuscript is just starting to bleed through, clouding our vision further. While this isn't the worst manuscript pre-modern historians are likely to see (!), this is a common trait of some of the volumes with which we work.

Any illegible part of a manuscript is irksome, but it is particularly annoying in projects like ours that rely on parsing large numbers of manuscripts each day. Moreover, the early part of the seventeenth century was when Dalkeith Presbytery became far more active in recording details of clerical careers (among other things), so it is essential that we capture this data. 

Fortunately, there are a number of simple techniques that historians can use to improve their chances of seeing through this sort of haze. In particular, we can manipulate the colour balance of the image to bring the text into greater relief using quite basic computer software. Here's one example with the contrast boosted and the exposure increased. I think this setting allows manuscripts to appear as my family expect them to look – suitably old – but it also makes the ink far clearer to the naked eye.

The next example saps most colours from the image to bring darker colours into greater relief. Such methods can also produce the dreaded white-on-black images that you might remember from older microfilm scans. Nevertheless, this approach can reveal obscured letter forms and even obliterated text.

With higher-resolution images than ours, the extra detail captured by a more modern camera's sensor would allow for potentially better results. But at 3.6 megapixels, I'm quite happy with the results we've obtained here.
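For anyone who prefers to script these adjustments rather than nudge sliders, here is a small browser-based sketch using the canvas element's filter property. The filter values are illustrative starting points rather than the settings we actually used, and pageImage is assumed to be an already loaded image element.

```javascript
// Sketch: apply contrast/brightness (or desaturation) adjustments to a scanned
// page in the browser. Filter values are illustrative starting points only.
// Assumes pageImage is an <img> element that has already finished loading.
function enhanceScan(img, filter) {
  const canvas = document.createElement("canvas");
  canvas.width = img.naturalWidth;
  canvas.height = img.naturalHeight;

  const ctx = canvas.getContext("2d");
  ctx.filter = filter; // CSS filter syntax, applied to the draw call below
  ctx.drawImage(img, 0, 0);
  return canvas;
}

// Boost contrast and exposure, roughly like the first example above.
const boosted = enhanceScan(pageImage, "contrast(160%) brightness(125%)");

// Sap the colour to bring darker ink into relief, like the second example.
const drained = enhanceScan(pageImage, "grayscale(100%) contrast(180%)");

document.body.append(boosted, drained);
```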

There are a number of different methods scholars can use to get these results, some more computationally intensive than others. The first is to use the built-in image editing software that comes bundled with most consumer computers. For example, the Preview app in macOS has an 'Adjust Colour' feature in its 'Tools' menu. Similar tools are available in the Photos app on Windows.

The key options here are the contrast and exposure sliders, which allow you to adjust the image accordingly. The sliders at the top of the menu allow manual adjustments so that you can emphasise particular colours.

More specialist software packages offer more powerful tools that can be used to target certain problematic areas of a manuscript image, rather than affecting the entire image. Software packages like Adobe Photoshop and the cheaper Pixelmator are understandably associated with commercial work, but they can be used fruitfully by scholars to improve the visibility of problematic manuscripts. In particular, these packages offer tools that will metaphorically 'burn' areas of the manuscript in order to raise faded text into a darker, more readable form. Here's a video of our manuscript sample from Dalkeith Presbytery again, this time being 'burned' in Pixelmator:

The more times the user passes the cursor over the chosen area, the deeper the darkening effect will become. Changing the ‘Exposure’ (or ‘Opacity’) setting (at the top of the screen in the video) allows the user to adjust the strength of the effect. While this is a time-consuming process, it can serve to reveal details in manuscripts that would have been too faded to enter into our analysis. It is an ideal approach for small-scale repairs to areas of the source base.
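The same local darkening can be approximated in code. The sketch below builds on the earlier canvas example: it clips to a chosen region and multiplies the original image back over it at reduced opacity, so that repeated calls deepen the effect in much the way repeated passes of the burn tool do. The coordinates and strength are, of course, placeholders.

```javascript
// Sketch: approximate a gentle 'burn' pass over a faded region by multiplying
// the source image back over that region at reduced opacity.
// Coordinates and strength are illustrative placeholders.
function burnRegion(canvas, img, x, y, width, height, strength = 0.3) {
  const ctx = canvas.getContext("2d");
  ctx.save();

  // Restrict the effect to the chosen region, as the burn tool's brush does.
  ctx.beginPath();
  ctx.rect(x, y, width, height);
  ctx.clip();

  // 'multiply' darkens wherever ink is present; lowering globalAlpha weakens
  // each pass, much like the Exposure/Opacity setting mentioned above.
  ctx.globalCompositeOperation = "multiply";
  ctx.globalAlpha = strength;
  ctx.drawImage(img, 0, 0);

  ctx.restore();
  return canvas;
}

// Two passes over a faded entry deepen the darkening, like repeated cursor strokes.
burnRegion(boosted, pageImage, 400, 250, 900, 300);
burnRegion(boosted, pageImage, 400, 250, 900, 300);
```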

Online tools can also process images in equally powerful ways. The website Retro Reveal runs a number of image-processing algorithms that are tuned to bring the sorts of text one might find in manuscripts into greater relief. While Retro Reveal is more suited to looking for very specific details in manuscripts, it can prove useful for generating alternative versions of large manuscript images, too.

These techniques are part of MSR's daily toolbox to help us navigate the world of Church of Scotland presbytery manuscripts from between 1560 and 1689. We wanted to share our experience because these approaches will be of interest to other scholars working with manuscript images, but they will also be largely hidden when our dataset is released into the wild. Viewed on its own, MSR's dataset is effectively naked, extracted from the physical context of the manuscripts in which it exists. It is easy to forget that each entry in our database involves numerous steps of discovery, manipulation and manuscript analysis.

Chris R. Langley