On the Scope and Methods of Transcription

The majority of this post was contributed by one of our anonymous volunteers, who had been doing the fiddly but essential job of going over the data and trying to spot and correct issues, as well as add in the historic counties in which accidents took place. We’re extremely grateful to the volunteer, both for all their work on this set of data (and others) and for writing this blog post. It reminds us that it’s important to be aware of how our archive has been constructed, as well as the practical challenges of doing this kind of project.

This post is about how handwritten records are digitised and so become more widely and easily available to anyone with an internet connection. Much of this post will be general advice and information about transcribing handwritten records into digital formats, but it is based on this project’s experience with the records of the Amalgamated Society of Railway Servants (ASRS), which was the subject of “Transcription Tuesday” in 2019. This event, run by Who Do You Think You Are? Magazine, was a great success: together, members of the public transcribed around 3,800 accident cases. This post discusses some of the issues encountered in getting from handwritten records to a useable database.

Records: Who, why, what, where and when?

The records under consideration here are 120 pages of the ASRS record book detailing accidents and incidents involving members of the Society between 1901 and 1905. The ASRS was an early attempt at unionising railway workers and members joined from all trades and grades within the railway industry. The ASRS supported its members in various ways following accidents: sometimes it gave financial support until they could return to work, sometimes it provided legal advice and representation, and in cases involving fatalities it would provide for the widows and dependants. The records cover all the railway companies operating at that time, including companies in Ireland; please remember that, at that time, the whole of the island of Ireland was part of the United Kingdom.

Reading 19th century handwriting and the ASRS records in particular

Reading 19th century documents is a fascinating exercise because of the infinite variety of styles of handwriting that were in common usage. (I know that the records under discussion were written in the early 1900s, but the writers would all have learned to write in the 19th century – please bear with me here!) Pens that were dipped into an inkwell were the norm at that time and it is surprising that there are so few ink blots in the ASRS records, probably a sign of a careful hand at work. That is not to say that there are no errors in the records; there are several duplicate entries (probably by different people) and the numbering system also has some errors. Capital letters were rather floral in design and differentiating between a “T” and an “F” is difficult, as is the difference between a “G” and an “S”; this is made more difficult for the reader if the two letters were written by different people.

Understanding how people wrote at that time makes the transcription process easier. For example, the letter “z” usually looks like an “m” with a long tail, and sometimes is transcribed as “my”. Determining whether a squiggle is an “m” or a “w” is hard; the same applies to “a” and “o”; occasionally, there are problems with “r” and “n”.

Just to make life even more complicated, different writers in the ASRS record books would write place names using local abbreviations – and there is no list of these to refer to!

There is one last point of difficulty about these ASRS records. As well as different people’s handwriting to contend with, there is a problem with reading different coloured inks. In some of the later records, a pale blue ink was used. This may have been a stronger blue originally and it has faded over time; black ink generally does not fade as much, and is easier to read.

Interpretation of words that are not known by the transcriber

The “Transcription Tuesday” process invited people from all over the world to tackle the transcription of records. This may have involved people whose first language is not English, which would make reading the records more difficult. In addition, there may well be words that are unfamiliar, even to native speakers of English. It is well known that individual industries usually have a lexicon all of their own, and it is not unusual for words in common usage, with a generally understood meaning, to take on a different meaning when used in a specific context within an industry. For example, the phrase “light engine” has nothing to do with the weight of the engine; it means the engine is moving without any carriages or wagons attached to it.

On top of all this, words that are no longer in use would have been common parlance in the early 1900s; this applies in particular to job titles as many jobs that were everyday activities in 1900 are no longer performed or needed. The railway industry itself had a number of very specific words for specific jobs, some of which were regionalised. For example, “Platelayer” would be used across the railway industry to describe a man who laid the tracks on which the trains ran (nothing to do with plates that you eat off!). On the other hand, a “Rulleyman” would only have been seen or heard in the North of England; elsewhere he would be a “Carter” or a “Carman”. (A “Rulleyman” was the driver of a flat wagon or cart which was used in the delivery of heavy items/objects such as barrels, quarried stone etc. The Rulley was the name of the type of wagon/cart used).

It is also normal for a transcriber, faced with an unknown word, to try to make sense of the letters staring up from the page (or screen). In this case, transcribed versions of “Rulleyman” included “Rulerman”, “Bullyman” and “Pulleyman”.
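One way to catch this kind of mis-reading is to compare each transcribed job title against a list of known titles and suggest the nearest match for a human checker to confirm. The sketch below is only an illustration, not the method the project used: the list KNOWN_JOB_TITLES, the function name suggest_job_title and the similarity cutoff of 0.7 are all assumptions made for the example.

```python
import difflib

# Hypothetical reference list, built from the job titles mentioned in this post.
KNOWN_JOB_TITLES = ["Rulleyman", "Platelayer", "Carter", "Carman"]


def suggest_job_title(transcribed, cutoff=0.7):
    """Return the closest known job title, or None if nothing is similar enough.

    The cutoff is an assumed similarity threshold between 0 and 1.
    """
    # Compare in lower case so that capitalisation does not affect the score.
    lookup = {title.lower(): title for title in KNOWN_JOB_TITLES}
    matches = difflib.get_close_matches(
        transcribed.lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


for reading in ["Rulerman", "Bullyman", "Pulleyman"]:
    print(reading, "->", suggest_job_title(reading))
```

A suggestion like this should only ever prompt a check against the original page, because a legitimate but unlisted job title would otherwise be "corrected" into something it never was.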

Spelling and its variations

The English language is constantly evolving and transcribers of early 20th Century records are faced with a number of challenges, not least that of spelling of place names. The naming of places provides variation in spelling, sometimes dependent upon regional usage; the name of “London” has been noted as “Lunnon” in some records. Of course, there is also the difficulty of sorting out the “-burgh” from the “-borough” and the “-brough” where the sound of the spoken place name could lead the writer to spell the name differently. Aside from the additional problem of awkward spellings of unfamiliar places in Wales, Scotland and Ireland, individuals’ names can also be spelled differently. If you are transcribing “Mr Smith” for instance, the variants could include “Smyth” and “Smythe”; “Mr Taylor” transcription variants might include “Tailor” or “Tailyour”. When someone is using the database for genealogical purposes, having the correct spelling of a surname/forename is very important.

Checking the transcription

Places: Many of the places referred to in the records were easily identifiable and therefore checking them was straightforward. However, some of the more “out-of-the-way” places did require a little more research to ensure that the transcriber had typed what was written in the record. Some of the handwriting made this part of the process more difficult for the transcribers, but the checking process was rigorous and place names were tied in with railway companies and their sphere of operations.

Names of members: As mentioned earlier, the correct spelling of people’s names is essential for the future use of the database. The names were checked against the original handwritten records and in most cases the transcribers had done a good job in transferring the names. Sometimes, the flowery handwriting caused the odd mis-spelling, but it is to be hoped that all the names are now correct.

Dates: Once again, the correct date of an incident (or a death) is vital to those who are using the database. Many of the records are not exactly in date order, mainly because there might have been a period of time between the incident happening and the member reporting the incident to his Branch Secretary. In addition, Branch Secretaries may have been tardy in sending reports in to the Head Office; therefore incidents would be recorded in the order in which they were received, but not necessarily in the date order of occurrence.

Railway Companies: The checking of the railway companies’ names has been interesting in this ASRS exercise. Many of the transcribers were familiar with many of the names, but some were not. The writers of the ASRS records were very familiar with the railway companies of the time and tended to use initials rather than the full name. Occasionally, a guess at the initials written in the record would throw up an unexpected company name which, on further study, did not exist. Sometimes the words “Western” or “Southern” would be shortened to “West” or “South” and would have to be corrected.
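The company-name check described above can be thought of as a lookup against a list of companies known to have existed. The sketch below is a hedged illustration rather than the project's actual procedure: the three companies listed are real ones from the period, but the table itself, the names COMPANY_INITIALS, SHORTENED_WORDS and normalise_company, and the made-up initials "X.Y.R." are assumptions for the example.

```python
# Illustrative lookup only; the project's full list of companies is not given in this post.
COMPANY_INITIALS = {
    "GWR": "Great Western Railway",
    "LNWR": "London and North Western Railway",
    "NER": "North Eastern Railway",
}
KNOWN_NAMES = set(COMPANY_INITIALS.values())

# Shortened words mentioned in the post, mapped to their full forms.
SHORTENED_WORDS = {"West": "Western", "South": "Southern"}


def normalise_company(raw):
    """Return (company_name, needs_review) for a transcribed company entry."""
    # Strip full stops, ampersands and spaces so "L. & N.W.R." becomes "LNWR".
    cleaned = raw.replace(".", "").replace("&", "").replace(" ", "").upper()
    if cleaned in COMPANY_INITIALS:
        return COMPANY_INITIALS[cleaned], False
    # Expand "West"/"South", but accept the result only if it is a known company;
    # a blind replacement would wrongly alter names such as "South Eastern".
    expanded = " ".join(SHORTENED_WORDS.get(word, word) for word in raw.split())
    if expanded in KNOWN_NAMES:
        return expanded, False
    # Anything else is left untouched and flagged for a human to check.
    return raw.strip(), True


print(normalise_company("L. & N.W.R."))
print(normalise_company("Great West Railway"))
print(normalise_company("X.Y.R."))
```

Flagging unknown entries instead of guessing reflects the experience described above, where a guess at the initials could produce a company that never existed.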

Counties: The major issue with Counties is making sure that the County boundaries as used in the period are correct. In the United Kingdom, County boundary changes have been made several times in the 20th Century and, for accuracy, older maps and old names for the Counties have been used. As a simple example, take the county of Monmouthshire, which starts on the West side of the River Severn and is bounded by the River Wye. In the early 1900s Monmouthshire was often treated as part of England rather than Wales, where it is now; it became part of Gwent, which was established in 1974.

Making the digitised records available

Until this point, we’d been dealing with a series of 12 spreadsheets, each holding 10 pages of transcribed records. To make it easier to search (and simply tidier) we combined all of the sheets into a single book. That took a little bit of playing, and a lot of formatting, but once done it allowed further tidying (thank goodness we spotted the record which placed Newcastle in Scotland – that might have upset lots of people!) and standardisation, so that you can search it with confidence.
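For anyone attempting something similar, the combining and tidying steps can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the way we actually did it: it assumes each sheet has been exported as CSV text with identical column headings, and the column names "place" and "country", the reference table EXPECTED_COUNTRY and both function names are hypothetical.

```python
import csv
import io


def combine_sheets(sheets):
    """Merge several CSV texts with identical headers into one list of row dicts.

    Each row is tagged with the number of the sheet it came from,
    so that any problem can be traced back to its source.
    """
    combined = []
    for sheet_number, text in enumerate(sheets, start=1):
        for row in csv.DictReader(io.StringIO(text)):
            row["source_sheet"] = sheet_number
            combined.append(row)
    return combined


# Hypothetical reference table: place name -> country it should appear under.
EXPECTED_COUNTRY = {"Newcastle upon Tyne": "England"}


def find_country_mismatches(rows):
    """Return rows whose country disagrees with the reference table."""
    return [
        row
        for row in rows
        if row["place"] in EXPECTED_COUNTRY
        and row["country"] != EXPECTED_COUNTRY[row["place"]]
    ]


# Two tiny made-up sheets standing in for the real spreadsheets.
sheet_a = "place,country\nNewcastle upon Tyne,Scotland\nCrewe,England\n"
sheet_b = "place,country\nNewcastle upon Tyne,England\n"
rows = combine_sheets([sheet_a, sheet_b])
for row in find_country_mismatches(rows):
    print("Check sheet", row["source_sheet"], ":", row["place"], "listed in", row["country"])
```

Keeping the source sheet number on every row means a flagged record can be compared against the right page of the original record book.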

Placing the new spreadsheet online was relatively straightforward – more complex was ensuring people know about it and are using it. Several blog posts, a lot of Tweets, some emails about it to interest organisations … and trying to keep up with responding to people, too! That last point is crucial. People are interested in our project work, with all sorts of different takes on it – and it’s wonderful that you’re letting us know. That’s really important in terms of keeping the project going: the more we can show what you’re liking about our work and how you’re using it, the stronger the case we can make that this work should carry on.

We’ve had a great response so far, with a huge spike in traffic on the website and an increase in downloads of the database. You’ve also been very kind in your praise and support, with plenty of you helping spread the word. This only reinforces what we’ve found already: that there’s a great sense of community around all of us involved in this, whether transcribing or using the data.

Finally, some might say we were gluttons for punishment, as there’s not much time to rest: we’ve got lots of data waiting to go through the cleaning and validation processes. So – after Christmas at least – it’ll be on to that!
