Being Open: Raw Data & Mapping the Scottish Reformation

In version 1.1 of our project website, we introduced a new feature: a button on each map that allows users to download the results of their chosen query in JSON format. While this seems like a small and rather specialist gesture, this change to our website underlined a fundamental aim of Mapping the Scottish Reformation: the desire for our data to be open for future researchers to use, without our input.

The download button that launched a thousand…datapoints?

We all know that digital projects in the humanities are often labour intensive. The initial effort to gather information, in our case from manuscripts, is time consuming and requires considerable amounts of skill (reading the handwriting, knowing what you’re looking for, being able to parse lots of information, etc). Then there is the encoding of that data into structures that a machine can read and query. Up until relatively recently, a great many digital projects in the humanities have developed custom methods to deal with these problems, resulting in beautifully crafted projects. However, these methods often (though not always) have one flaw: they usually don’t play nice with other projects. The data you can extract on the website or app is often limited to the types of questions that were originally envisaged by the project’s designers. These projects aren’t easily interoperable.

Think about it this way: the last time you wanted information from your favourite online humanities resource, such as a historical database, you most likely copied the data you found by hand or into a document in Microsoft Word or equivalent. You might have taken a screen grab. This is your record. That is fine if you need a relatively small piece of information, but what if you want to leverage a significant proportion of that website's entire dataset? You cannot simply take it: you either have to laboriously (and manually) write down the data you need, or you have to contact the creators of the resource and ask for permission to access the raw dataset. The creators (if they're still active) will then need to reframe their data so that it works for your purposes.

It is our contention that the creators of digital projects in the humanities should think about interoperability when considering the legacy of their projects. Legacy is much more than just the longevity of your website or resource (“we need x years of hosting costs” or “we need to ensure that this website doesn’t break whenever there’s a major iOS or Android update”). It needs to be understood as the life and utility of the data at the heart of your project, long after the research questions that created the project have become obsolete.

This process of siloing data in custom repositories is not helpful to developing humanities projects. As part of our talk for the Northern Early Modern Network in September 2021, we showed how considering three key features can help make a project more sustainable by encouraging interoperability.

Interoperability slide from our presentation ‘Using Digital Tools to Explore Early Modernity’

First is to be open about data structures. Scholars building digital resources have spent hours considering how to encode information from manuscripts into databases: this is essentially a process of establishing how to make the complexities of these records machine readable. Unfortunately, projects have rarely shared how they structure their data collection, seeing it as something largely internal to the project. We’ve created standardised ways to capture complex data that will work in any project, right down to the references, and we have written about this process on this blog.

Second, offer your data openly online. While most data from digital projects is available for free and not behind a paywall, it usually cannot be scraped and extracted from the resource, at least not easily. For us, Wikidata does the heavy lifting: there is no need to create a custom database, it allows quick prototyping of visualisations, and it interfaces nicely with other platforms and programs. Everything is uploaded under Creative Commons Zero (CC0), meaning it is free to reuse without restriction. The raw data of our research can be used by scholars for free long into the future and without the need to enter into lengthy negotiations with us about how to use it. Projects should explore similarly open data standards wherever possible.

Third, data should be exportable. People searching through your digital resource should be given the option to download their results. Why? For one thing, many archives have poor or non-existent wifi, meaning scholars may need quick access to information offline. More important, at least for the purposes of this post, is that data available for download can immediately be moved into different applications. Is there a visualisation one of our users wants to see but that we haven’t created (perhaps for a teaching resource, presentation, or PhD thesis)? Fine: they can now create it with the raw data. Want to use a new technology that we couldn’t incorporate into our website? The exportable raw data can be repurposed for that new or emergent technology. Data that can be downloaded in a way that suits the user empowers them to build a broad range of new projects that we could never have envisaged.

Data should be exportable. Here is a downloaded search from our website in JSON format.
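
To give a sense of what this makes possible, here is a minimal sketch of how such a download might be repurposed, in this case converting the JSON into a CSV for use in a spreadsheet. The field names (minister, parish, year) and the filenames are hypothetical; the real structure of the export depends on the query a user runs.

    import csv
    import json

    # Load a search result downloaded from the website (the filename is illustrative,
    # and the export is assumed here to be a list of objects).
    with open("msr_search_results.json", encoding="utf-8") as f:
        records = json.load(f)

    # Write the same records out as CSV for use in a spreadsheet or another tool.
    # These field names are hypothetical and would need to match the real export.
    fieldnames = ["minister", "parish", "year"]
    with open("msr_search_results.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        for record in records:
            writer.writerow(record)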

We still have a way to go before we achieve all of these lofty goals as well as we would like. For example, our current website only allows results to be downloaded as JSON, which limits its use to certain applications (TSV or CSV may suit a wider range of uses, and a simple HTML list would be much easier for non-specialist users). Similarly, we could do more to publicise our methods of structuring data on Wikidata to promote the availability of our dataset. Nor are we the first to consider these issues: colleagues at the University of Edinburgh revived data from the Survey of Scottish Witchcraft (gathered between 2000 and 2003) and repurposed it using many of these techniques (we owe a great deal to them).

Open standards come with some risks, but we would argue that the benefits outweigh them. We must maintain robust editorial standards for our work, which means quality control of our data. Similarly, acknowledging the original labour that went into gathering the data becomes more difficult the further one moves away from traditional citation techniques like footnotes or endnotes. Nevertheless, sharing knowledge about data structuring and giving access to raw data in some form can enhance the interoperability of a project.

Ultimately, considering interoperability at the start of your project may extend the longevity of all of that effort in collecting data in the first place.

Chris R. Langley

Geoshapes, Wikimedia Commons, and the Early Modern Church of Scotland

Earlier this year, we published a blog post explaining how we mapped the ecclesiastical regions of early modern Scotland. We made the data available to view on our website and the resulting files can be downloaded from GitHub. In this follow-up post, I want to explain how we can broaden the audience for this geographic data even further by using Wikimedia Commons.

Jan Ainali and Albin Larsson recently showcased how users of Wikidata can query data in a way that highlights areas on a map, rather than just single points. An underused feature of Wikidata is the ability to include geoshapes stored on Wikimedia Commons as properties of Wikidata items. A typical example of this functionality can be seen in the Wikidata items for national parks in the UK:

Wikidata item Q15052206 – Brecon Beacons — showing both a coordinate location and a file path for a geoshape

The Wikidata property ‘geoshape’ (P3896) can link directly to a file stored in Wikimedia Commons. Here is the map of the Brecon Beacons:

Wikimedia Commons data source: geoshape for the Brecon Beacons

Rather than just showing a point on a map, the geoshape allows users to see the precise boundaries of a given area.

Following Ainali and Larsson’s excellent tutorial, a quick SPARQL query can bring up all of the national parks in the UK, along with other bits of data, like when they were established as national parks:

Output of Wikidata query for all national parks in the UK
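
For readers who want to reproduce this themselves, a minimal sketch of such a query run from Python against the Wikidata Query Service is below; the same query can be rendered as a map view in the Query Service interface, as Ainali and Larsson demonstrate. It assumes Q46169 (‘national park’), Q145 (United Kingdom), P17 (country), P3896 (geoshape) and P571 (inception).

    import requests

    # SPARQL query for UK national parks that have a geoshape, plus their inception dates.
    # Assumed identifiers: Q46169 = national park, Q145 = United Kingdom,
    # P31 = instance of, P17 = country, P3896 = geoshape, P571 = inception.
    QUERY = """
    SELECT ?park ?parkLabel ?geoshape ?inception WHERE {
      ?park wdt:P31 wd:Q46169 ;
            wdt:P17 wd:Q145 ;
            wdt:P3896 ?geoshape .
      OPTIONAL { ?park wdt:P571 ?inception . }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    """

    response = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": QUERY, "format": "json"},
        headers={"User-Agent": "msr-blog-example/0.1"},
    )
    response.raise_for_status()

    for row in response.json()["results"]["bindings"]:
        print(row["parkLabel"]["value"], row.get("inception", {}).get("value", ""))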

Being able to link Wikidata and Wikimedia Commons, and to visualize areas rather than just points, has a huge range of potential applications. For our purposes, I decided to upload to Wikimedia Commons all of the synod shape files we built earlier this year. The remainder of the post details some of the steps involved in that process.

I started by taking each of the geoJSON shape files that we had created earlier this year. These roughly mark out the different synod regions of the early modern Church of Scotland (minus Argyll). I then tidied these files using Mapster’s ‘Right Hand Rule’ tool, which brings the geoJSON into line with the format Wikimedia Commons will accept. You can then upload this into Wikimedia Commons using the data tool. Simply point your web browser to the page where you want to upload your data:

https://commons.wikimedia.org/wiki/Data:[insert your file name here].map

You can then click ‘create this page’ to add your data.

Creating a new data file on Wikimedia Commons

You can then create your data file by pasting in the text from the geoJSON. Wikimedia Commons also requires you to add several lines to the geoJSON detailing the data source and copyright details. There is even a little debugger at the bottom left of the text panel to show you whether there are any errors in your text:

The data entry panel in Wikimedia Commons. Note the debugging icons on the bottom left
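
For anyone preparing several files at once, the wrapper that Commons expects can also be generated with a short script rather than typed by hand. This is a sketch only: the field names below (license, description, sources, data) follow the Commons map-data format as we understand it, and the filenames and description are illustrative, so check the current Commons documentation before uploading.

    import json

    # Wrap an existing GeoJSON file in the metadata a Commons Data:*.map page expects.
    with open("synod_perth_and_stirling.geojson", encoding="utf-8") as f:
        geojson = json.load(f)

    page = {
        "license": "CC0-1.0",  # Commons map data must be released under CC0
        "description": {"en": "Approximate boundary of the Synod of Perth and Stirling, 1560-1689"},
        "sources": "Traced from Aaron Arrowsmith's 1825 ecclesiastical map of Scotland",
        "data": geojson,
    }

    # The resulting text can be pasted into the Data: page editor on Wikimedia Commons.
    with open("synod_perth_and_stirling.map.json", "w", encoding="utf-8") as f:
        json.dump(page, f, ensure_ascii=False, indent=2)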

Publishing the data will make it available, openly, to anyone on the internet. Also, Wikimedia Commons is clever enough to render your datafile into a shape overlaid on a map:

Wikimedia Commons displaying map data for the Synod of Perth and Stirling

Now that this data is safely deposited on Wikimedia Commons, we can add a new property — ‘geoshape’ (P3896) — to our Wikidata entry for the Synod of Perth and Stirling. We populate this property with the file path shown on Wikimedia Commons:

Adding the Wikimedia Commons file path to the ‘geoshape’ property (P3896) on Wikidata

I repeated this process for all of the synod region shape files that we created earlier this year. I then modified the query we built earlier so that, rather than searching for national parks, it searches for synod regions of the early modern Church of Scotland. Here is the result:

This takes the geoshapes we built earlier this year and makes them more widely available by contributing them to Wikimedia Commons. And by linking these items to Wikidata, we can capture more information about ecclesiastical synod regions in the sixteenth and seventeenth centuries, making our queries more detailed and allowing for a new set of visualizations.

Perhaps more profoundly, this process shows how our commitment to open data enhances our project: any errors in the geoshape files can be corrected on Wikimedia Commons, and further information about the synods can be added to each item on Wikidata (e.g. dates of establishment, names of moderators, changes in geographic scope, etc). And, finally, all of this stems from the supportive and open community around Wikimedia: building on the foundations laid by Ainali, Larsson, and so many others.

Chris R. Langley

Newman University, Birmingham, UK

Mapping Scotland’s Synods, 1560-1689

As we step up our preparations for Stage 3 — venturing out from Lothian and Tweeddale and into the rest of Scotland — we are busy scoping out the extent of the work before us (spoiler: there is a lot). We have already entered presbytery data into Wikidata. Now we turn our attention to Scotland’s provincial synods.

Arrowsmith’s ecclesiastical map of Scotland (1825)

Many of our readers will know about Aaron Arrowsmith’s 1825 ecclesiastical map of Scotland. A scanned version is available on the National Library of Scotland website here. I’ve spent a lot of time looking at this map, but there are two problems with it, at least from our perspective: it shows the shape of the nineteenth-century Kirk of Scotland and, unlike so many of the terrific maps on the NLS website, it is not georeferenced.

Fortunately, the NLS has adopted the International Image Interoperability Framework (IIIF) for all of its scanned map images. This means that the image can be exported to various software viewers (ArcGIS, CanvasPanel, Mirador, Recogito, etc). Using the free website georeferencer.com, we imported Arrowsmith’s map and started the georeferencing process.

Georeferencer.com tied Arrowsmith’s map to modern map data

By using georeferencer.com, we were able to overlay Arrowsmith’s representation onto a modern map, pinpoint known places and features, and allow the website to adjust the nineteenth-century map to accord with modern-day cartography.

Georeferencer has an extra trick up its sleeve: the georeferenced map can be taken directly to MapTiler, a service that allows users to create vector shapes on top of georeferenced maps. Translation: we could now trace the boundaries of provincial synods shown on Arrowsmith’s map and use them in modern projects.

Tracking the Aberdeenshire coast on MapTiler

This is fiddly work: you start with a pretty generalised polygon and have to edit it, by hand, to accord with coastal features and the like. There is the added complication that Arrowsmith’s knowledge of certain parts of Scotland was less-than-stellar: thankfully, MapTiler allows users to alter the opacity of each layer so one can see how the modern and historical maps compare. Of course, there are editorial judgements to be made here, especially in areas where Arrowsmith’s map lacked key detail.

We’re using Arrowsmith’s nineteenth-century map here and our project covers the period 1560 to 1689. Unsurprisingly, a number of features had changed between the end of the period covered by our project and the moment of Arrowsmith’s composition. The Synod of Ross, for example, was established in 1707, so it is shown on Arrowsmith’s map but would not have been recognised in the period covered by our project. Similarly, the Synod of Glenelg appears on Arrowsmith’s map, but was not established until 1724. Such challenges meant that it would be remarkably difficult for us to represent the area covered by the Synod of Argyll, in particular, in a single shape file. Then there are changes within the period covered by our project: the Synod of Lothian and Tweeddale includes the region covered by Biggar Presbytery, an area disjoined from the Synod of Glasgow and Ayr in 1644. To capture the complexity of these changes, we would have to create new shape files for each significant revision to the Kirk’s ecclesiastical boundaries. Our shapefiles are no less an interpretation of provincial synod boundaries than Arrowsmith’s nineteenth-century work. Nevertheless, they offer a useful indication of ecclesiastical structures.

Synod boundaries as geoJSON

Once complete, each shape file can be edited further, exported as GeoJSON (and then possibly converted into other shape file formats) and imported into any mapping platform. Using the workflow we detailed here, we took these datafiles and created a map in Leaflet showing most of Scotland’s provincial synods between 1560 and 1689. For the reasons stated above, we felt it best not to map the Synods of Argyll and of Caithness and Sutherland until we obtain further details on their precise extent in the period 1560 to 1689.

Click the thumbnail to see the shapefiles in action

For those interested in making use of these rough shapefiles for their own projects, each file can be found at one of the following links on GitHub:

Aberdeen

Angus and Mearns

Dumfries

Fife

Galloway

Glasgow and Ayr

Lothian and Tweeddale

Merse and Teviotdale

Moray

Perth and Stirling

These shapefiles are far from perfect — a result of Arrowsmith’s inaccuracies and less-than-perfect drawing on my part — but they represent a start in visually understanding the organisation of the Church of Scotland. And while this represents only a snapshot in time — a more fluid picture would require multiple shapefiles of each synod (especially in Argyll) — such an overview of Scotland’s provincial synods, 1560-1689, shows how scanned images can be georeferenced and then opened up in such a way as to make them machine readable. Tools like Mapping the Scottish Reformation will offer more ways to interpret this messy data.

Chris R. Langley

Editor’s Note: Several months after the writing of this post, we uploaded these shapefiles to Wikimedia Commons (July 2021). To find out more, read this blog post.

Scotland’s Presbyteries, 1560-1689

As we move into Stage 3 of Mapping the Scottish Reformation, we are shifting our attentions beyond the Synod of Lothian and Tweeddale and to the other regions of Scotland.

We have been laying the groundwork for our data collection in Stage 3 by roughly mapping the presbyteries and synods across Scotland that were active between 1560 and 1689 and logging them in Wikidata, our repository for structured data. We imported all 65 presbyteries into Wikidata with QuickStatements, using existing Wikidata properties. While the way we have structured this data may change as our work progresses, tracking the different synods and presbyteries is essential work as we build our dataset beyond Lothian and Tweeddale.
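
For those curious about what a bulk import like this involves, the sketch below generates QuickStatements (version 1) commands for a couple of presbyteries. Everything in it is illustrative: the labels, description, coordinates and, above all, the class QID (written here as QXXXXXX) are placeholders rather than the statements we actually used.

    # Generate QuickStatements V1 commands to create presbytery items in bulk.
    # All values below are illustrative; QXXXXXX stands in for the class QID used
    # as the target of P31 (instance of), and P625 is coordinate location.
    presbyteries = [
        ("Presbytery of Biggar", 55.62, -3.52),
        ("Presbytery of Dalkeith", 55.89, -3.07),
    ]

    lines = []
    for label, lat, lon in presbyteries:
        lines.append("CREATE")
        lines.append(f'LAST\tLen\t"{label}"')
        lines.append('LAST\tDen\t"presbytery of the Church of Scotland"')
        lines.append("LAST\tP31\tQXXXXXX")
        lines.append(f"LAST\tP625\t@{lat}/{lon}")

    # Paste the output into the QuickStatements tool to run the batch.
    print("\n".join(lines))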

We originally intended for this activity to be of internal use only — giving us a way to track our progress — but we soon realised it could also offer users a resource to see the administrative structures of the Church of Scotland in a clear (and somewhat interactive) way.

The layers button at the top right of the map offers the ability to filter different synod regions. Each colour denotes the presbyteries within a particular synod province (please note that these colours are chosen at random and will change when you reload the page).

Some health warnings: first, the locations of the presbyteries — denoted by the dots — only relate loosely to where a presbytery usually met or was centred. This is to provide a broad idea of where a presbytery was active. The process of mapping the changing boundaries of a synod region or presbytery is something we have tested elsewhere. Secondly, the presbyteries shown are a snapshot in time and, due to the parameters of our project, do not include presbyteries or synods formed after 1689. Equally, presbyteries that were newly created (like Biggar) during our survey period appear alongside the rest of Scotland’s ecclesiastical courts with no note.

Using the workflow we developed in Stage 2, we have also created this simplified Leaflet map by exporting our information from Wikidata. You can read more about the process here.

Click the screen grab to access the map

We have written about the distribution of parishes before on this blog, but seeing provincial assemblies and regional presbyteries mapped, even in this rough way, is of use. This map reflects the geographical distribution of people as well as ecclesiastical power in early modern Scotland. It also represents how far our project must travel to track religious change on a truly national scale.

Chris R. Langley

Visualization Matters: Collaborating to Show Religious Change

In December 2019, Uta Hinrichs, Stefania Forlini, and Bridget Moynihan published a reflective piece entitled ‘In defence of sandcastles’ in Digital Scholarship in the Humanities. In the article, the authors argued that visualizations in humanities projects are not only tools for end users to search through complex data, but are themselves critical research in their own right. Like much humanities scholarship, digital visualizations of historical data are the result of a set of editorial decisions. What makes these decisions interesting is that they must cross different disciplines — bringing together humanities scholars and technical specialists (data modellers, UI/UX designers, web developers, etc). In what follows, I want to outline the rationale for some of the decisions behind the visualizations on the Mapping the Scottish Reformation website and to underline how the maps users can play with on the site are the product of deep collaboration between different disciplines.

I want to focus on the Journeys map on the MSR website. This will keep me on track, but I have also chosen it because this particular map presented a number of editorial and technical challenges that made us reflect critically on what constituted a clerical career path in early modern Scotland.

TL;DR: The interdisciplinary discussions around the creation of the Journeys map — a process that was part of a modern humanities project — made us reflect on the experience of being a cleric in early modern Scotland. 

For context, our current Journeys map contains the career paths of 654 ministers and includes 935 separate appointments made across Lothian and Tweeddale between 1560 and the end of the seventeenth century. At the outset, the historians on the project team had some ideas for what such a map should achieve:

  • To show clerical migration patterns
  • To be able to identify typical clerical careers
  • To understand the distances a minister might cover in his professional life

These questions all relate to longstanding discussions within Reformation scholarship and, as such, they represented our own assumptions about a) what users might find useful and b) what we think constitutes a clerical career.

The discussions over how to visualize these journeys forced us to reflect on these assumptions. Our original idea was that parishes in which a minister served would simply be connected by lines that would be clickable by users. These lines would be searchable in some manner. Here is an early example we built on Wikidata:

The example above had four properties:

  1. Places and years: ‘Athelstaneford (1682), Bathgate (1665)’
  2. Total moves: ‘2’
  3. Name: ‘Walter Rigg’
  4. Coordinates: [to plot him on the map]

Once the data was presented on our test website, however, it quickly became apparent that these lines would not be particularly useful for visitors (for example, simply drawing lines from one place to the next offered no temporal context), and also that, from a technical perspective, applying filters to huge lines with no differentiation would be very difficult: we had exported our data as strings of text, so places and years could not be read by separate filters. The string of text also showed a minister’s journey in no particular order (notice how Rigg’s career in the image above is recorded in reverse sequence). The technical limitations of our data, and what this data was saying about a minister’s career, forced us to go back to the database and extract a different set of values (a rough sketch of the revised structure follows the list):

  1. Name
  2. Total moves
  3. Place ranking sequence
  4. Separate year for each move
  5. Year when a minister left parish
  6. Coordinates 
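
Here is a rough sketch of what one record in this revised structure might look like; the field names are illustrative rather than our actual export schema, and the coordinates and dates of departure are approximate.

    from dataclasses import dataclass
    from typing import Optional

    # One record per appointment, rather than one text string per minister.
    # Field names are illustrative, not the project's actual schema.
    @dataclass
    class Appointment:
        minister: str
        total_moves: int
        sequence: int              # position of this parish in the career (1 = first charge)
        parish: str
        year_started: int
        year_left: Optional[int]   # None if unknown or if the minister died in post
        latitude: float
        longitude: float

    # With one record per appointment, filters can read places and years separately,
    # and arrowheads can be drawn from appointment n to appointment n + 1.
    walter_rigg = [
        Appointment("Walter Rigg", 2, 1, "Bathgate", 1665, 1682, 55.90, -3.64),
        Appointment("Walter Rigg", 2, 2, "Athelstaneford", 1682, None, 55.99, -2.68),
    ]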

This change to the data model gave us a much more flexible structure for our filters to work with, and it also allowed us to convey the direction a minister’s career took. We could now add arrowheads to each move in a cleric’s career, and red and green dots denoting the start and end points of (what could be very lengthy) careers:

This was a significant technical and usability fix, but our approach revealed one huge assumption that we, as historians on the team, had made: our decisions to this point had privileged clerical mobility. In a project about mapping the Scottish Reformation, we had fetishized the idea of movement.

In our passion to convey to our developer partners that we wanted to explore movement, we had missed a huge chunk of our dataset: those ministers who had only served one parish — those who did not move. Our data modelling allowed for these individuals, but our ideas about visualization gave them no place. Here is a screengrab of an early prototype showing James French, a minister who served at only one parish (Penicuik):

Without a conversation with our developer partners, who spotted this issue, we might have shipped a build of our website that did not show these individuals (who, incidentally, represent the largest proportion of ministers). But their diligence also raised the question of how, in a website about mobility, we could give due credit to those individuals who dedicated their lives to just one place. Historical assumptions meet visualization questions.

A technical solution solved the historical question: our developer partners implemented a system whereby ministers with a parish count of ‘one’ would be listed within their parish. This allows these critical figures to be displayed and their service to one parish to be specifically marked: in other words, their lack of mobility is marked by a specific UI element and defined by the parish they served:

These kinds of fruitful discussions between developers, UI/UX specialists, and historians can be cyclical. For example, having seen the linestrings and prompted by feedback online, we reflected on the extent to which the point to point linestrings reflected contemporary travel routes, roads, and communication networks. Using a modern route planner, we cobbled together an example for our developers:

Unfortunately, there were both technical and historical problems with this solution. First, this route planner used modern road networks; only with the completion of projects like this one will scholars be able to follow contemporary travel routes. Second, the route plug-ins for our maps platform only allowed one route to be animated at a time, raising questions about how to implement this at scale. Finally, there was an issue of interpretation in animating lines like this: were we, again as historians, privileging mobility in clerical careers with cool UI features that were not actually representative of most clerical careers? This is an example of where editorial discussions between developers, historians, and UI designers forced us to reconsider what we were conveying with these visualizations.

There are many more examples in our discussions over our first website where expertise in web development, historical knowledge, and UI/UX design brought about significant innovations in how we treated historical material, modelled the data, and built the website. This is just one of them. The process of iterating different versions of our maps forced us to ask questions of what we were seeing: not merely from the perspective of ‘how might a user use this’, but also fundamentally questioning what we thought we knew about the data. Our developer partners posed questions to the historians on the team, which in turn forced reflection, which then prompted further questions about visualisation.

At their best — and I cannot emphasise this enough — these conversations are utterly energising. While the historians on the team have already been outed for their sheer enthusiasm during these meetings, the level of collaboration implicit in this process is very real and exciting. I remember several project meetings where the collaboration between the team was so fluent that we were flying through new ideas, facing interpretative or technical challenges, devising alternatives, and then watching our developer partners design and revise the website code live on a screen share on Zoom. We were watching the results of our editorial conversations between different disciplines come to life in real time.

This communication process isn’t always easy. But with a degree of candour, an understanding of the domain expertise each discipline brings to the project, and an ability to reflect openly on the assumptions you bring to the table, projects like ours can build data structures and visualizations that are products of collaboration in their own right, as well as tools for other researchers.

Chris R. Langley

Entering Stage Two: Mapping Parishes in Lothian and Tweeddale, 1560-1689

With Stage One of Mapping the Scottish Reformation nearing completion, we now have a large dataset of ministers, detailing their movements and important moments in their careers. We have parsed over 9,000 pages of manuscripts from presbyteries across the region of Lothian and Tweeddale. With this significant task complete, we can now turn our attention to how we intend to present this data in Stage Two of the project.

Stage Two of Mapping the Scottish Reformation is generously funded by the Strathmartine Trust and will see us explore our user interfaces for the first time. Up to now, we have deposited our data into Wikidata and used its built-in tools to test our material, to identify gaps, and to create quick mock-up visuals that we think users might find useful. You can read more about our use of Wikidata mapping tools and how we structure our dataset here.

A screen grab of using the mapping tools built into the Wikidata Query Service.

Followers of the project will have seen some of the demos we have been able to quickly put together that visualise the breadth of our data and hint at some of the ways we can put it to work. Critically, Stage Two of Mapping the Scottish Reformation will use even more powerful mapping technologies to create visualisations that load faster, run more smoothly and show even more data. Before we formally embark on this process, we have spent the last couple of weeks dipping our toes into some of these more powerful mapping technologies.

The Wikidata Query Service — and the SPARQL queries we write to ask questions of our data — sits at the heart of our project. It allows our data to be open for other researchers to use (and build on) in the future, but the data can also be quickly exported and patched into other programmes and services. The first stage of testing more powerful maps is to take the result of a SPARQL query, add a few modifications to the code, and export it into a TSV (tab-separated values) file.

The Wikidata Query Service allows the results of queries to be exported into various file formats

We exported a SPARQL query that shows all of the parishes in the Synod of Lothian and Tweeddale between 1560 and 1689, along with a label showing the presbytery in which each parish sat. This has the fewest values of any of our datasets, so we thought it would be an easy place to start! The resulting TSV file is effectively a huge spreadsheet: as much as I like spreadsheets, it would be a stretch to call it attractive or user friendly. The key thing is that the latitude and longitude data sit in separate columns alongside the key bits of information we want to display to users. Our test file included around 120 lines.

The resulting TSV file is functional, if unappealing
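
For anyone who prefers to script this step, the query service endpoint will also return TSV directly. A minimal sketch follows, with a placeholder query standing in for our real parish query (the class QID is written as QXXXXXX).

    import requests

    # Fetch SPARQL results directly as TSV from the Wikidata Query Service, the same
    # export the web interface offers. The query below is a placeholder; the real
    # query selects parishes in the Synod of Lothian and Tweeddale with their
    # presbyteries and coordinates.
    QUERY = """
    SELECT ?parish ?parishLabel ?coords WHERE {
      ?parish wdt:P31 wd:QXXXXXX ;     # instance of: placeholder class QID
              wdt:P625 ?coords .       # coordinate location
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    """

    response = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": QUERY},
        headers={
            "Accept": "text/tab-separated-values",
            "User-Agent": "msr-blog-example/0.1",
        },
    )
    response.raise_for_status()

    with open("parishes.tsv", "w", encoding="utf-8") as f:
        f.write(response.text)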

Formatted correctly, a TSV file like this one can be imported into a free online GeoJSON editor: GeoJSON is an open format for encoding geographical data so that it can be shown on maps (note: have you noticed our constant use of open formats and services? It’s no coincidence!). Users can either add points manually or, critically for us, add geo-referenced locations in bulk. Having uploaded the file, the result is a much more appealing map that includes more attractive and comprehensive icons and the ability to select different mapping layers. We can even add different mapping tiles, using a service like MapTiler, enabling us to test different backgrounds.
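
The same conversion can also be done with a short script rather than through the web editor; here is a minimal sketch that turns the exported TSV into a GeoJSON FeatureCollection. The column names (parishLabel, presbyteryLabel, lat, lon) are illustrative and depend on how the SPARQL query names its variables.

    import csv
    import json

    # Convert an exported TSV of parishes into a GeoJSON FeatureCollection.
    # Column names are illustrative and must match the SPARQL query's output.
    features = []
    with open("parishes.tsv", encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            features.append({
                "type": "Feature",
                "geometry": {
                    "type": "Point",
                    # GeoJSON expects [longitude, latitude]
                    "coordinates": [float(row["lon"]), float(row["lat"])],
                },
                "properties": {
                    "parish": row["parishLabel"],
                    "presbytery": row["presbyteryLabel"],
                },
            })

    with open("parishes.geojson", "w", encoding="utf-8") as f:
        json.dump({"type": "FeatureCollection", "features": features}, f, indent=2)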

The beauty of GeoJSON is that it turns that ugly TSV file into something far more machine readable. Unfortunately, the editor doesn’t allow you to automatically export your map or embed it into a website like this one. This is where Leaflet.js comes in.

Leaflet is quietly taking over the world of internet mapping applications, but its huge functionality comes at a significant technical cost: we aren’t in the world of drag and drop or ‘what you see is what you get’ editing anymore. The benefits of a little perseverance, however, are huge.

Leaflet demands an understanding of HTML, CSS, and JavaScript, or at least an understanding of what to swap into lines of code cribbed from GitHub, and when. This process was made infinitely easier by Leaflet’s own tutorials and, in particular, by this superb tutorial on YouTube by Eduonix. The key here is to take the GeoJSON generated by the editor and copy it into our HTML file (shown below in the Sublime Text editor). You can see how the data from the GeoJSON sits just below the various lines of code for headers and so on.

After generating our map, a few fairly simple lines of code allow Leaflet to take the data from the GeoJSON and display it, as well as adding a custom mapping layer and popup menus that are, in theory, infinitely customisable. The resulting map can be exported to HTML and embedded into a website. And because the database values were all pasted into Leaflet, at least for the moment, Leaflet doesn’t have to request the information each time the page loads. The result is that the embedded map loads almost immediately.
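
For readers more comfortable in Python than in hand-written HTML and JavaScript, folium (a Python wrapper around Leaflet) can produce a broadly similar embeddable map from the same GeoJSON. This is not the approach described above, just a quicker way to prototype: the filenames and tooltip fields are illustrative, and the default tiles stand in for the NLS Historical Maps layer we actually use.

    import folium

    # Build a Leaflet map via folium and save it as a self-contained HTML file.
    m = folium.Map(location=[55.9, -3.2], zoom_start=9, tiles="OpenStreetMap")

    # Add the GeoJSON produced earlier; tooltip fields are illustrative.
    folium.GeoJson(
        "parishes.geojson",
        name="Parishes of Lothian and Tweeddale, 1560-1689",
        tooltip=folium.GeoJsonTooltip(fields=["parish", "presbytery"]),
    ).add_to(m)

    folium.LayerControl().add_to(m)
    m.save("parishes_map.html")   # embed or open this file in a browser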

You can play with this simple demo, showing all of the parishes of Lothian and Tweeddale between 1560 and 1689, below.

Notice that we have made use of the NLS Historical Maps API to plot the points on a historical map. This dynamically adjusts to a different background map depending on how far a user zooms in or out of the map.

If this seems like a tremendous amount of effort to go to in order to embed a map, then I suppose you’re right. What’s important here is that we have demonstrated that the data we manually took from manuscripts in the National Records of Scotland, passed into Wikidata, and then queried using SPARQL and the Wikidata Query Service can be exported, customised and presented in a way that is as visually friendly as we can make it!

This is just a test, but it reflects the process we will go through during Stage Two of Mapping the Scottish Reformation, with colleagues from our international Advisory Board and our technical friends and colleagues at the University of Edinburgh. Ultimately, this process will allow us to create a number of interactive visualisations that will distill the months we have spent looking at handwritten archival material and make it more accessible. So while we’ve been recording, storing and querying the Scottish Reformation up to now, Stage Two of this project will allow us to start intricately mapping the Scottish Reformation.

Chris R. Langley

Working with Imperfect Manuscripts: Image Manipulation in MSR

MSR uses images from documents produced by Church of Scotland presbyteries between 1560 and 1689 to gather data on clerical careers. Indeed, at the time of writing, we have read over two and a half thousand pages of such material. And we aren’t finished!

Our friends at the National Records of Scotland provide us with bulk deposits of images of these manuscripts, presbytery by presbytery. These images were taken between 2003 and 2005, during which time around five million document images were snapped and digitised. These images form a terrific resource that can be accessed via the Virtual Volumes system at the National Records of Scotland reading rooms in Edinburgh. Due to bandwidth constraints in the early 2000s, these images were taken at a resolution of 2174 × 1655 pixels, or just over 3.5 million pixels per image. By contrast, images snapped on a current mobile phone camera can hit around 12.1 million pixels per image. These technical constraints, and the fact that our manuscripts were produced four centuries ago and survive in varying states of decay, mean that our source base can be quite difficult to read. This blog post will take you through some of the methodologies we use to enhance the amount of data we can recover from our source base.

The following image is from a particularly faded section of the records of Dalkeith Presbytery from 1614 (NRS, CH2/424/1). One can observe how the marginal annotation appears in a much deeper ink on the left-hand side, while the main entry to the right is faded. Moreover, some of the ink from the other side of the leaf is just starting to bleed through, clouding our vision further. While this isn’t the worst manuscript pre-modern historians are likely to see (!), this is a common trait of some of the volumes with which we work.

Any illegible part of a manuscript is irksome, but it is particularly annoying in projects like ours that rely on parsing large numbers of manuscripts each day. Moreover, the early part of the seventeenth century was when Dalkeith Presbytery became far more active in recording details of clerical careers (among other things), so it is essential that we capture this data. 

Fortunately, there are a number of simple techniques that historians can use to improve their chances of seeing through this sort of haze. In particular, we can manipulate the colour balance of the image to bring the text into greater relief using quite basic computer software. Here’s one example with the image contrast boosted and the exposure increased. I think this setting allows manuscripts to appear as my family expect them (suitably old), but it also allows the ink to become far clearer to the naked eye.

The next example saps most colours from the image to bring darker colours into greater relief. Such methods can also produce the dreaded white-on-black images that you might remember from older microfilm scans. Nevertheless, this approach can reveal obscured letter forms and even obliterated text.
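
For those who want to batch these adjustments rather than work image by image, the same effects can be approximated in Python with the Pillow library. A minimal sketch follows; the filename and enhancement factors are illustrative starting points and usually need a little trial and error.

    from PIL import Image, ImageEnhance, ImageOps

    img = Image.open("dalkeith_ch2_424_1_page.jpg")   # illustrative filename

    # Boost contrast and exposure (brightness) to lift faded ink from the page.
    boosted = ImageEnhance.Contrast(img).enhance(1.8)
    boosted = ImageEnhance.Brightness(boosted).enhance(1.2)
    boosted.save("dalkeith_boosted.jpg")

    # Strip the colour out and invert, producing the white-on-black effect familiar
    # from older microfilm scans, which can make obscured letter forms easier to read.
    inverted = ImageOps.invert(ImageOps.grayscale(img))
    inverted.save("dalkeith_inverted.jpg")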

With higher resolution images than ours, the amount of detail captured by a more modern camera’s sensor will allow for potentially better results. But at 3.6 megapixels, I’m quite happy with the results we’ve obtained here.

There are a number of different methods scholars can use to get these results, some more computationally intensive than others. The first is to use the in-built image editing software that comes bundled with most consumer computers. For example, the Preview app in macOS has an ‘Adjust Colour’ feature in its ‘Tools’ menu. Similar tools are available in the Photos app on Windows.

The key options here are the contrast and exposure sliders, which allow you to adjust the image accordingly. The sliders at the top of the menu allow manual adjustments so you can emphasise particular colours.

More specialist software packages offer more powerful tools that can be used to target problematic areas of a manuscript image, rather than affecting the entire image. Packages like Adobe Photoshop and the cheaper Pixelmator are understandably associated with commercial work, but they can be used fruitfully by scholars to improve the legibility of problematic manuscripts. In particular, these packages offer tools that will metaphorically ‘burn’ areas of the manuscript in order to raise faded text into a darker, more readable form. Here’s a video of our manuscript sample from Dalkeith Presbytery again, this time being ‘burned’ in Pixelmator:

The more times the user passes the cursor over the chosen area, the deeper the darkening effect will become. Changing the ‘Exposure’ (or ‘Opacity’) setting (at the top of the screen in the video) allows the user to adjust the strength of the effect. While this is a time-consuming process, it can serve to reveal details in manuscripts that would have been too faded to enter into our analysis. It is an ideal approach for small-scale repairs to areas of the source base.
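
A very rough, non-interactive approximation of this effect is also possible in Pillow: crop the faded region, darken only that region, and paste it back. The box coordinates and darkening factor below are illustrative; a real workflow would choose them by eye.

    from PIL import Image, ImageEnhance

    img = Image.open("dalkeith_ch2_424_1_page.jpg")   # illustrative filename

    # Darken only a selected region, leaving the rest of the page untouched.
    box = (400, 900, 1400, 1200)   # (left, upper, right, lower), in pixels
    region = img.crop(box)
    region = ImageEnhance.Brightness(region).enhance(0.7)
    img.paste(region, box)

    img.save("dalkeith_burned.jpg")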

There are also online tools that can process images in equally powerful ways. The website Retro Reveal runs a number of image processing algorithms that are tuned to bring the sorts of text one might find in manuscripts into greater relief. While Retro Reveal is more suited to looking for very specific details in manuscripts, it can prove useful for generating alternative versions of large manuscript images, too.

These techniques are part of MSR’s daily toolbox as we navigate the world of Church of Scotland presbytery manuscripts from between 1560 and 1689. We wanted to share our experience because these approaches will be of interest to other scholars working with manuscript images, but also because they will be largely hidden when our dataset is released into the wild. Viewed on its own, MSR’s dataset is effectively naked, extracted from the physical context of the manuscripts in which it originated. It is easy to forget that each entry in our database involves numerous steps of discovery, manipulation and manuscript analysis.

Chris R. Langley