Enhancing and Extending the Digital Study of Intertextuality (pt. 2) Matteo Romanello (DAI/KCL) @mr56k “Making Meaning from Data” DCA Panel @ SCS annual meeting, New Orleans–January 11, 2015



Digital Study of Intertextuality

Enhancing and Extending the Digital Study of Intertextuality:

1. finding new possible parallel passages

2. creating a systematic index of already studied parallels


The Classicist’s Toolkit

Tool                          Accuracy   Granularity   Coverage
Indexes Locorum               +          +             –
Library Catalogues            +          –             –
Full-text Search              –          –             +
Automatic Citation Indexing   +/–        +             +

Citation Extraction, Step 1: Named Entity Recognition

Current accuracy (F1-score): 73.88%
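The F1-score quoted for each extraction step is conventionally the harmonic mean of precision and recall. A minimal sketch of that computation, assuming counts of true positives, false positives and false negatives are available; the function name and the numbers are illustrative, not taken from the slides:

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts."""
    predicted = true_pos + false_pos   # everything the system extracted
    expected = true_pos + false_neg    # everything it should have extracted
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / expected if expected else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 correct entities, 2 spurious, 2 missed.
print(f"{f1_score(8, 2, 2):.2%}")
```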

Citation Extraction, Step 2: Relation Detection

Current accuracy (F1-score): 92.60%

Citation Extraction, Step 3: Disambiguation

Current accuracy (F1-score): 73.05%
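To make the three steps concrete, here is a deliberately simplified, rule-based sketch. It is not the method behind the figures above: the lookup table, the regular expressions and the function names are all assumptions made for illustration, and the scope is appended to the URN verbatim rather than normalised.

```python
import re

# Hypothetical lookup table: work abbreviation -> CTS-style URN (illustrative).
KNOWN_WORKS = {
    "Hom. Il.": "urn:cts:greekLit:tlg0012.tlg001",
    "Verg. Aen.": "urn:cts:latinLit:phi0690.phi003",
}
WORK_RE = re.compile("|".join(re.escape(abbr) for abbr in KNOWN_WORKS))
SCOPE_RE = re.compile(r"\b\d+\.\d+(?:-\d+)?")  # e.g. "1.1-10" or "6.146"


def recognize_entities(text):
    """Step 1: find work abbreviations and scopes, ordered by position."""
    found = [("work", m.start(), m.group()) for m in WORK_RE.finditer(text)]
    found += [("scope", m.start(), m.group()) for m in SCOPE_RE.finditer(text)]
    return sorted(found, key=lambda entity: entity[1])


def detect_relations(entities):
    """Step 2: attach each scope to the closest preceding work mention."""
    relations, current_work = [], None
    for kind, _, surface in entities:
        if kind == "work":
            current_work = surface
        elif current_work is not None:
            relations.append((current_work, surface))
    return relations


def disambiguate(relations):
    """Step 3: resolve each (work, scope) pair to an identifier."""
    return [f"{KNOWN_WORKS[work]}:{scope}" for work, scope in relations]


text = "Cf. Hom. Il. 1.1-10 and 6.146; see also Verg. Aen. 4.1"
for urn in disambiguate(detect_relations(recognize_entities(text))):
    print(urn)
```

Splitting the task this way lets each step be evaluated separately, which is how the per-step F1-scores are reported.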

Mining Citations from APh and JSTOR

- APh
  - 80 volumes
  - 8% of vol. 75 (2004)
    - 366 abstracts
    - 26k tokens
    - 380 citations

- JSTOR
  - Classics
    - 1,456 journals
    - 171k articles
    - 327m tokens
  - Materiali e Discussioni (29 yrs: 1978-2006)
    - 669 articles
    - 5.6m tokens
    - 40k citations

From Index to Network

From Texts to Network
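One plausible way to turn extracted citations into a network is to treat each citation as a weighted edge between a citing publication and a cited work, and then to derive links between works that are cited together. The sketch below assumes that representation; the data and function names are hypothetical, and the slides do not specify how the networks were actually built.

```python
from collections import Counter
from itertools import combinations

# Hypothetical extracted citations: (citing publication, cited work).
citations = [
    ("article_A", "Hom. Il."),
    ("article_A", "Verg. Aen."),
    ("article_A", "Hom. Il."),
    ("article_B", "Hom. Il."),
    ("article_B", "Verg. Aen."),
    ("article_B", "Hom. Od."),
]


def citation_edges(citations):
    """Weighted edges publication -> work; weight = number of citations."""
    return Counter(citations)


def cocitation_edges(citations):
    """Weighted edges work <-> work; weight = publications citing both."""
    works_by_publication = {}
    for publication, work in citations:
        works_by_publication.setdefault(publication, set()).add(work)
    pairs = Counter()
    for works in works_by_publication.values():
        for first, second in combinations(sorted(works), 2):
            pairs[(first, second)] += 1
    return pairs


print(citation_edges(citations).most_common(1))
print(cocitation_edges(citations).most_common(1))
```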

APh: Micro-level

APh: Macro-level

APh: Meso-level

JSTOR: diachronic trends in MD (1978-2006)

Diachronic trends of 5 most cited authors in MD (1978-2006)

Digital Study of Intertextuality: Future Work

- Combination of:
  - discovery of new possible parallel passages
  - systematic index of already studied parallels

- Scenario:
  - use of the parallel frequency for ranking automatically identified candidates
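The scenario could be sketched as follows, assuming each automatically identified candidate carries a detection score and that the citation index yields a count of how often two passages are cited together. The passage labels, scores, counts and the tie-breaking rule are all hypothetical; the slides only state the general idea.

```python
def rank_candidates(candidates, parallel_frequency):
    """Order candidates by parallel frequency, then by detection score."""
    def sort_key(candidate):
        pair = frozenset(candidate["pair"])  # order of the two passages is irrelevant
        return (parallel_frequency.get(pair, 0), candidate["score"])
    return sorted(candidates, key=sort_key, reverse=True)


# Hypothetical candidates from an automatic parallel-detection tool.
candidates = [
    {"pair": ("Verg. Aen. 1.1", "Hom. Il. 1.1"), "score": 0.6},
    {"pair": ("Verg. Aen. 4.1", "Hom. Od. 5.1"), "score": 0.9},
    {"pair": ("Verg. Aen. 6.1", "Hom. Od. 11.1"), "score": 0.7},
]

# Hypothetical counts of how often scholarship cites the two passages together.
parallel_frequency = {
    frozenset(("Hom. Il. 1.1", "Verg. Aen. 1.1")): 12,
}

for candidate in rank_candidates(candidates, parallel_frequency):
    print(candidate["pair"], candidate["score"])
```

With these inputs, the pair already attested in the index is ranked first despite its lower detection score; the unattested candidates follow in order of score.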

Thank you for your attention!

Questions?

- [email protected] https://twitter.com/mr56k

Some Links:

- http://phd.mr56k.info/data/viz/macro.html
- http://phd.mr56k.info/data/viz/meso.html
- http://phd.mr56k.info/data/viz/micro.html
- https://github.com/mromanello/APh_Corpus
- https://github.com/mromanello/CRefEx