eTRACES-subproject Text re-use in Literature Christian Kötteritzsch / Gerhard Lauer / Annette Geßner

Post on 26-Dec-2015


eTRACES-subproject

Text re-use in Literature

Christian Kötteritzsch / Gerhard Lauer / Annette Geßner


Team eTRACES Göttingen

Christian Kötteritzsch: GCDH & Uni Leipzig, ASV

Gerhard Lauer: GCDH & Uni Göttingen, German studies

Annette Geßner: GCDH, Classics


Central question

Analysis of text re-use in German literature

- to understand better how literature makes use of other texts

- to understand better specific re-use of given texts in a large corpus of literature

- to understand better specific types of intertextuality

- to facilitate the identification of (indirect) quotations for editorial purposes


Corpus

zeno.org-corpus (http://www.textgrid.de/en/digitale-bibliothek.html)

includes

fictional texts from Luther to Kafka

Preprocessing of the XML files

through a toolchain to extract and format XML-based corpora

thanks to Frederik Baumgardt (3 person-months)
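
The extraction step of such a preprocessing toolchain can be illustrated with a minimal sketch. It is not the project's actual toolchain: the function name `extract_plain_text`, the element names in the sample, and the sample sentence are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET


def extract_plain_text(xml_string: str) -> str:
    """Return the text content of an XML document with all markup removed."""
    root = ET.fromstring(xml_string)
    # itertext() yields every text node in document order.
    pieces = (chunk.strip() for chunk in root.itertext())
    # Joining with single spaces keeps headings and paragraphs apart,
    # but would also split a word that is only partly wrapped in a tag.
    return " ".join(piece for piece in pieces if piece)


# Hypothetical sample; real corpus files are larger and use their own schema.
sample = (
    "<text><head>Erstes Kapitel</head>"
    "<p>Es war <hi>einmal</hi> ein Müller.</p></text>"
)
print(extract_plain_text(sample))
```

A real toolchain would additionally have to handle namespaces, metadata headers, and editorial notes, which this sketch ignores.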


A first idea:

A long term analysis of the emerging autonomous aesthetic in German literature,

especially novels

Text Re-use


- text mining depends on genre and text re-use styles

- to look for text re-use only within a German corpus would miss the many foreign quotations

- looking for a simpler starting point:

one book in thousands

but

Text Re-use


Objectives

Test case: re-use of the Bible in German literature

- find biblical quotations and allusions

- offer a web-based text re-use tool

- online working environment to create a digital edition
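
To make the first objective concrete, here is a minimal sketch of one common way to spot verbatim re-use: comparing word n-grams (shingles) of a Bible verse with a literary passage. This is an illustrative technique chosen for the example, not a description of how the Tracer software works. All function names are hypothetical, and the literary passage is invented for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def shingles(tokens, n=3):
    """Return the set of all word n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngrams(source, target, n=3):
    """Return the word n-grams that occur in both texts."""
    return shingles(tokenize(source), n) & shingles(tokenize(target), n)


def reuse_score(source, target, n=3):
    """Fraction of the source's n-grams that reappear in the target."""
    src = shingles(tokenize(source), n)
    if not src:
        return 0.0
    return len(src & shingles(tokenize(target), n)) / len(src)


verse = "Der Herr ist mein Hirte, mir wird nichts mangeln."
# Invented passage, used only to exercise the functions.
passage = "Er murmelte: Der Herr ist mein Hirte, und schwieg."

print(sorted(shared_ngrams(verse, passage)))
print(round(reuse_score(verse, passage), 2))
```

Exact n-gram overlap only catches near-verbatim quotations. Allusions, paraphrases, and quotations from a different Bible translation would need fuzzier matching, for example on lemmas or synonyms.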


Re-use Style

Identify types of biblical re-use by hand

Design a table of quotation styles

Categorize types of "Re-use Style"

Schiller's "Die Räuber" (77 entries)

Fontane's "Effi Briest" (11 entries)

...
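
A table of quotation styles like the one described above could be represented as simple records and tallied by category. The sketch below is only an assumption about a possible layout: the category names, the field names, and the placeholder rows are hypothetical and are not taken from the project's hand-made table.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ReuseStyle(Enum):
    # Hypothetical categories; the project's own typology may differ.
    VERBATIM = "verbatim quotation"
    PARAPHRASE = "paraphrase"
    ALLUSION = "allusion"


@dataclass(frozen=True)
class ReuseEntry:
    work: str             # literary work in which the re-use occurs
    passage: str          # the re-using text span
    bible_reference: str  # the re-used Bible passage
    style: ReuseStyle     # manually assigned category


def count_styles(entries):
    """Count how often each re-use style was assigned."""
    return Counter(entry.style for entry in entries)


# Placeholder rows only; they do not reproduce any real annotation.
entries = [
    ReuseEntry("Work A", "placeholder passage 1", "reference 1", ReuseStyle.VERBATIM),
    ReuseEntry("Work A", "placeholder passage 2", "reference 2", ReuseStyle.ALLUSION),
    ReuseEntry("Work B", "placeholder passage 3", "reference 3", ReuseStyle.ALLUSION),
]

for style, count in count_styles(entries).items():
    print(f"{style.value}: {count}")
```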


Natural language processing

+ Analysis: Tracer-software (ASV, Leipzig)

+ Server: Virtual machine (Gesellschaft für Wissenschaftliche Datenverarbeitung Göttingen [GWDG])

+ Frontend: Google Web Toolkit Framework


Front-End (Mock-Ups)


Possible extensions

+ more texts (and Bibles)

+ more own texts

+ more features

? more crowd editing or more personal edition

? more distant reading: text statistics

--> Any suggestions?


Next milestones

May and June 2012:

- collect different Bible versions

(Zürcher, Allioli, Keppler etc., revisions)

- integrate into the text re-use tool

- clarify server issues with GWDG

- determine re-use style by analysing more genres, historical and other specific re-use styles


Next milestones

Summer 2012:

- first run on folder 'Romane' of zeno.org-corpus

- evaluation and rerun

Until end of 2012:

- develop a statistics sub-tool

- version 1.0 of front end online


Next milestones

Beginning of 2013:

- internal evaluation and optimization

April to July 2013:

- teach a seminar on text re-use (and let students evaluate tool)

- invite editors for test cases


Next milestones

Till end of 2013:

- optimizing and bugfixing, finish tool

- enlarge the corpus of literature and of Bibles

- do statistical research cases

- community workshop with editors and text analysts

Until end of project (2014):

- write final report