Political Meetings Mapper, British Library Labs Symposium, 2 November 2015
TRANSCRIPT
A British Library Labs project by
Dr Katrina Navickas, University of Hertfordshire
[email protected] @katrinanavickas
with Ben O’Steen & Mahendra Mahey
Chartism was the biggest popular movement for democracy in 19th-century British history. The Chartists campaigned for the vote for all men.
http://www.bl.uk/learning/histcitizen/21cc/struggle/chartists1/historicalsources/source6/kenningtoncommon.html
The Chartists advertised these meetings in the Northern Star newspaper, from 1838 to 1850
But how many meetings?
Digitising the academic historian
https://upload.wikimedia.org/wikipedia/commons/9/9e/2004_microfilm_reader_1117365851.jpg
Days of searching the images and mapping by hand … or just under 2.30 mins of Python running? Take your choice!
Mission:
• Find out how many meetings
• Map where the meetings were held
• Identify the meetings reports in the digitized newspapers
Sources:
BL digitized 19th century newspapers
BL geo-referenced historic maps
BL playbills collection
How did we do it?
• Redo the OCR of original image files using ABBYY FineReader 12
OCR
No crowd-sourced transcription needed!
• Python code to extract place names & geo-code places using a gazetteer
Geo-code
• Python code with regex to extract dates
• Basic NLP to calculate the dates of words like ‘tomorrow’
Date
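The geo-code and date steps above could be sketched roughly as follows. This is only an illustration, not the project's actual code: the three-entry gazetteer (with rounded, approximate coordinates), the regex patterns, the function names, the sample notice and the issue date are all assumptions made up for the example.

```python
import re
from datetime import date, timedelta

# Hypothetical mini-gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "Manchester": (53.48, -2.24),
    "Leeds": (53.80, -1.55),
    "Halifax": (53.72, -1.86),
}
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday",
            "saturday", "sunday"]

# Assumed patterns; real newspaper text would need many more variants.
PLACE_RE = re.compile(r"\b(" + "|".join(map(re.escape, GAZETTEER)) + r")\b")
DATE_RE = re.compile(
    r"\b(\d{1,2})(?:st|nd|rd|th)?\s+(?:of\s+)?(" + "|".join(MONTHS) + r")\b")
WEEKDAY_RE = re.compile(r"\b(?:on|next)\s+(" + "|".join(WEEKDAYS) + r")\b")


def find_places(text):
    """Return (name, (lat, lon)) for every gazetteer place named in text."""
    return [(name, GAZETTEER[name]) for name in PLACE_RE.findall(text)]


def find_explicit_dates(text, year):
    """Return dates written like '12th of July' or '12 July' in the given year."""
    found = []
    for day, month in DATE_RE.findall(text):
        try:
            found.append(date(year, MONTHS.index(month) + 1, int(day)))
        except ValueError:
            pass  # impossible date such as '31 February', e.g. from an OCR error
    return found


def resolve_relative_dates(text, issue_date):
    """Turn words like 'to-morrow' or 'next Monday' into dates,
    counted forward from the date of the newspaper issue."""
    lowered = text.lower()
    found = []
    if re.search(r"\bto-?morrow\b", lowered):
        found.append(issue_date + timedelta(days=1))
    if re.search(r"\b(?:to-?day|this evening)\b", lowered):
        found.append(issue_date)
    for match in WEEKDAY_RE.finditer(lowered):
        target = WEEKDAYS.index(match.group(1))
        # Next occurrence of that weekday, strictly after the issue date.
        days_ahead = (target - issue_date.weekday()) % 7 or 7
        found.append(issue_date + timedelta(days=days_ahead))
    return found


# Made-up notice and issue date, for illustration only.
notice = ("A public meeting will be held at Halifax to-morrow evening, "
          "and at Leeds on the 12th of July.")
issue = date(1841, 7, 3)

print(find_places(notice))
print(find_explicit_dates(notice, issue.year))
print(resolve_relative_dates(notice, issue))
```

Relative words are resolved against the issue date because a notice saying ‘tomorrow’ only makes sense relative to the day the paper came out.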
Results!
For 1841-44: 5519 meetings and counting…
In 462 towns & villages and counting…
Results!
200+ lecture tours by Chartist lecturers paid to travel around their regions
politicalmeetingsmapper.co.uk on the Omeka platform
London venues
Machine Learning!!!
Using IPython Notebook, we made a classifier to try to distinguish meetings texts from other types of text
It worked!
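The slides do not say which kind of classifier or which library was used, so the following is only a stand-in: a tiny naive Bayes text classifier written with the Python standard library. The class name, the labels and the four training snippets are invented for the example and are not taken from the Northern Star.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Minimal multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of texts
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # Log prior plus the summed log likelihood of each word.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training snippets, for illustration only.
texts = [
    "A public meeting of the Chartists will be held at the Association Room on Monday evening",
    "Mr Smith will lecture at the Chartist meeting room on Sunday evening",
    "Wheat prices at the corn market were steady this week",
    "For sale a quantity of superior boots and shoes at low prices",
]
labels = ["meeting", "meeting", "other", "other"]

model = NaiveBayes().fit(texts, labels)
print(model.predict("The Chartists will hold a meeting on Tuesday evening"))
print(model.predict("Corn and wheat prices at the market"))
```

A real run would need far more labelled examples, plus a held-out set of texts to check how often the classifier gets it right.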
Chartist tour of London, 12 September 2015
Where next?
Feeding in more data!
• Using NLP for parsing more dates and other data
• Connecting ‘forthcoming meetings’ to reports of the meetings in the next issue of the newspaper
More Machine Learning
• Identifying columns and types of texts in the unreconstructed XML of the newspapers in the BL digital collections
Space Syntax project with UCL Space Syntax Lab
• Analysing spatial patterns of meetings’ frequency
Credits
Ben O’Steen (BL Labs) – technical assistance
Mahendra Mahey (BL Labs) – management
Dr James Baker (formerly of this parish) – finding stuff in the system
BL maps department – maps
NLS Map Library – maps
OCR checking – Samantha Walkden & Megan Dibble (UH History graduates)
Videography – Adam Lloyd Jones
Dr Simon Webster – helping me write Python code (while watching Countryfile or Michael Portillo’s Great Railway Journeys on Sunday evenings…)
I have a new book out on 1 December … please buy it
https://about.me/katrina.navickas