The more we write about scanning and digitization projects, the more we hear about them. And given that tomorrow is Veterans Day, a project to digitize World War I diaries from British soldiers feels particularly relevant.
The UK National Archives currently has 1.5 million pages of handwritten diaries that were kept by soldiers during World War I. These accounts, which are not personal diaries but official war documents, are some of the most requested documents in the National Archives reading room, according to Smithsonian.
“The purpose of the War Diary was to create a record of the operations of the unit on active service,” explains the Great War website, which is devoted to World War I history. “It would record the part it was playing in a battle and would usually list the number of men who went into action and the number of casualties when the unit came out of the action. The information in a War Diary would be used by senior commanders for intelligence about the enemy opposite their units and as a historical record for future planning.”
In honor of the 100th anniversary of the start of World War I, the UK National Archives put the scanned documents online this year (where they can be downloaded for free) and started a crowdsourced metadata project.
The project, Operation War Diary, comes from a partnership among the National Archives, the citizen science initiative Zooniverse, and the Imperial War Museum (IWM) in the UK. Data gathered from the project will serve three main purposes:
- To enrich The National Archives’ catalogue descriptions for the unit war diaries
- To provide evidence about the experience of named individuals in IWM’s Lives of the First World War project
- To present academics with large amounts of accurate data to help them gain a better understanding of how the war was fought
Once a diary is scanned and posted online, volunteers, known as “citizen historians,” tag it, collecting metadata such as the date of the entry, whether the entry lists casualties, which people it mentions, and whether it has a map. The diaries are not yet being transcribed, but a great deal of data is collected through the tagging process.
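To make the tagging idea concrete, here is a minimal sketch of what the metadata for one tagged page might look like. It is only an illustration: the `PageTags` class, its field names, the `pages_mentioning` helper, and the sample pages are all invented for this post and are not taken from Operation War Diary or Zooniverse.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class PageTags:
    """Hypothetical record of the tags attached to one scanned diary page."""
    page_id: str                       # invented identifier for the scanned page
    entry_date: Optional[date] = None  # date of the entry, if one was tagged
    lists_casualties: bool = False     # whether the entry lists casualties
    people: List[str] = field(default_factory=list)  # names tagged on the page
    has_map: bool = False              # whether the page includes a map


def pages_mentioning(pages: List[PageTags], name: str) -> List[str]:
    """Return the IDs of pages whose people tags include the given name."""
    return [p.page_id for p in pages if name in p.people]


# Invented sample data, for illustration only.
example_pages = [
    PageTags("page-001", date(1914, 9, 14), True, ["Capt. A. Example"]),
    PageTags("page-002", date(1914, 9, 15), has_map=True),
]

print(pages_mentioning(example_pages, "Capt. A. Example"))
```

Even without a transcription, a structure like this would be enough to find every page that names a particular person.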
Crowdsourcing this project has been much more productive than hiring professionals. Just eight weeks after the project started, the amount of volunteer effort put in was equivalent to one person working 40 hours a week for four years, reports Smithsonian. This included more than 260,000 tags relating to named individuals, more than 332,000 tags relating to places, and almost 300,000 tags relating to activities.
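For a rough sense of scale, the “one person working 40 hours a week for four years” comparison can be turned into hours. The sketch below assumes 52 working weeks per year with no holidays, which is our assumption and not something Smithsonian states, and the `person_hours` function is named here only for illustration.

```python
WEEKS_PER_YEAR = 52  # assumption: no holidays or leave


def person_hours(hours_per_week: int, years: int) -> int:
    """Total hours worked by one person at a steady weekly schedule."""
    return hours_per_week * WEEKS_PER_YEAR * years


# 40 hours a week for four years, under the assumption above.
print(person_hours(40, 4))
```

Under that assumption, the comparison works out to 8,320 hours of volunteer effort in the first eight weeks.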
Like the Smithsonian Institution, which offered specific types of guidance to its volunteers for different kinds of documents, the UK National Archives divides the documents into several types: blank pages, cover pages, diary pages, orders, signal pads, reports, and that all-important “other.” For example, some of the pages are on official forms intended for use as diaries, while others are on plain paper because the forms weren’t available, the project explains. Each type of page also has explanations of the type of data that could be found and tagged on it, ranging from more obvious things such as date and time to descriptions of unit activity and everyday army life.
“Little time was actually spent in combat, so what was life like on the Western Front when you were not fighting?” asks the website. “How did the men live? How often did they eat hot food, have a bath or change their clothes? When they were resting how did they entertain themselves? Tag mentions of concert parties, sporting events, religious services, food, and hygiene and help us learn more about life beyond the fighting.”
The project is even asking volunteers to collect data about the weather. “In popular culture, the Western Front was a morass of mud for four long years,” explains the website. “But the geology of the battlefield varied dramatically from the low lying fields of Flanders where the water table was barely below the surface to the chalk downs of Picardy. The war was fought in all seasons and the army had to function in all weather conditions. But what effect did the weather have on daily life and on the conduct of the conflict?”
In particular, the project is looking for names so it can track the participants of the war. “Names are central to Operation War Diary—they are what makes all the other information we’re collecting real, the visual reminder that it relates to the daily experiences of people just like you,” writes the project’s blog.
“So far, we’ve identified over 50,000 unique names. Many of these belong to officers, but there are a great number of Other Ranks too, many of them only ever mentioned once in all the millions of pages we have to tag. That’s what makes the work of our Citizen Historians so important—if that person isn’t tagged, we may never find the reference to them again, yet by tagging it we can make it visible and accessible to others who come after us.”
Like the Smithsonian Institution, the War Diary project encourages collaboration and team-building among its volunteers by providing a discussion page, a Facebook page, and a Twitter feed, which let the volunteers chat with each other about what they’re reading. In addition, the page gives volunteers the opportunity to pass on information—such as a definitive list of tags they can use for consistency—or anything particularly cool they come across, such as sketches of a battle.
As we prepare to commemorate what was originally called Armistice Day, which marked the end of World War I, please remember to honor a veteran today.
Simplicity 2.0 is where we examine the intricate and transitory world of technology through a Laserfiche lens. By keeping an eye on larger trends, we aim to make software that’s relevant to modern-day workers, rather than build technology for technology’s sake.