If you’ve ever read the legal notices in the newspaper, you know they can be pretty dry. But gather more than a century’s worth in one place, make them searchable, and they become a treasure trove for historians.
That’s why the National Endowment for the Humanities recently awarded $260,000 to New York University’s Tandon School of Engineering. The project’s goal is to develop a searchable online portal for 120 years of the City Record, from 1873 to 1998, when the paper went online. In the process, the project will scan and digitize 1,723 volumes (more than a million pages) of New York City records.
The City Record, which includes both a print and an online edition, began publishing on June 24, 1873, in response to the Tweed Ring scandal. Much as government agencies use the Internet today, it was intended to provide transparency. As events such as Sunshine Week remind us even now, government transparency (particularly through searchable electronic records) makes it easier for citizens to keep track of what their government is doing.
The paper is produced by the NYC Department of Citywide Administrative Services each business day and publishes weekly reports and regulations from every department of city government. It lists every payment, contract, appointment to office, and infrastructure project, as well as vital statistics such as weekly reports on contagious diseases. It also includes material such as public hearings and meetings, public auctions and sales, solicitations and awards, and official rules proposed and adopted by city agencies.
The City Record itself began publishing all its material in a searchable database in August 2015. Now, material dating back to the paper’s founding will be available in the same way. The project will be led by Jonathan Soffer, Professor of History and Chair, Department of Technology, Culture & Society at Tandon.
To digitize the records, researchers will first scan microfilm of the City Record to create PDFs. (Ironically, the microfilm editions exist because of a similar historic preservation project, awarded to Columbia University in the 1990s.) Then, researchers will take the extra step of running optical character recognition (OCR) on the scans to make the words searchable, writes Kayla Nick-Kearney in StateScoop.
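To give a rough sense of what "making the words searchable" can mean once OCR has produced plain text, here is a minimal sketch of an inverted index in Python. It is not the project's actual software: the page identifiers, the sample text, and the function names `build_index` and `search` are all made up for illustration, and it assumes the OCR text is already available as strings.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page ids whose text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the sorted page ids that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


# Hypothetical OCR output keyed by invented page ids.
pages = {
    "1873-v1-p12": "Proposals for paving Broadway will be received",
    "1873-v1-p13": "Department of Docks: contract awarded for paving",
}
index = build_index(pages)
print(search(index, "paving broadway"))
```

A real portal would also have to cope with OCR misreadings, which is one reason exact word matching like this is only a starting point.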
The school is also writing programs to tag certain kinds of data, Soffer told Nick-Kearney. In addition, researchers hope to create maps of the recorded addresses and transactions, she writes. (A similar project from the New York Public Library is digitizing insurance atlases and using crowdsourcing to fix errors in the maps.)
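The article doesn't say how the school's tagging programs work, so the following is only an illustrative guess at the general idea: scanning OCR text for patterns that look like a particular kind of data. This Python sketch tags dollar amounts and full dates using regular expressions. The patterns, the labels, the sample sentence, and the function name `tag_text` are assumptions for the example, not details from the project.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Illustrative patterns only; real records would need many more cases.
PATTERNS = {
    "amount": re.compile(r"\$\d{1,3}(?:,\d{3})*(?:\.\d{2})?"),
    "date": re.compile(rf"(?:{MONTHS}) \d{{1,2}}, \d{{4}}"),
}


def tag_text(text):
    """Return (label, matched text) pairs in order of appearance."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), label, match.group(0)))
    found.sort()  # order by position in the text
    return [(label, value) for _, label, value in found]


# Invented sample sentence in the style of a payment notice.
sample = "On June 24, 1873, the Comptroller paid $1,250.00 to J. Smith."
for label, value in tag_text(sample):
    print(label, value)
```

Tags like these could then feed the kind of mapping or quantitative analysis the researchers describe, though matching addresses and names reliably is considerably harder than matching dates and amounts.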
“These volumes contain copious data on every aspect of the city’s politics, society, economy, real estate and infrastructure development, employment, and expenditures, and will aid scholars studying the city because of the depth and breadth of the data it contains, offering digitized resources unmatched by any other city,” writes the school in a press release. “The digitized City Record Project will alter both the quantitative and qualitative study of post-Civil War New York.”
The database will provide a treasure trove for public users such as students, bankers, home buyers, family historians, urban planners and journalists who are investigating the historical and financial development of New York property and infrastructure, or researching family history, the school writes. The project is particularly tantalizing because no other city of New York’s size and importance has a comparable historical database, Soffer said.
Examples of data from the paper include:
- weekly reports on mortality and health and meteorological data
- documents on the construction of major historic buildings, such as the New York Public Library at 42d Street and the American Museum of Natural History
- official canvasses of elections, down to the election district level, from 1878 to 1940, which can be compared with manuscript censuses
- lists of who worked for the city in minor jobs, what they did, how much money they earned, and who was promoted, disciplined, or fired, and why
- reports on the Municipal Lodging House, including the weekly enumerations of the ethnicities of the homeless and the length of time they had been in the city
- lists of property left by prisoners and the dead, yet unclaimed
It may not be the stuff of a Frank Sinatra song, but it will be music to the ears of historians.